Project

General

Profile

Actions

Feature #16950

closed

[cli] add costanalyzer to arvados-client

Added by Ward Vandewege about 4 years ago. Updated over 3 years ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
-
Target version:
Story points:
-
Release relationship:
Auto

Description

The workflow costanalyzer script has lived out of tree for a long time. Add it as an arvados-client command.


Subtasks 1 (0 open1 closed)

Task #16954: Review 16950-add-costanalyzerResolvedWard Vandewege11/03/2020Actions

Related issues 1 (1 open0 closed)

Related to Arvados - Feature #17148: add logging middleware to lib/cmdNewActions
Actions #1

Updated by Ward Vandewege about 4 years ago

  • Status changed from New to In Progress
Actions #2

Updated by Ward Vandewege about 4 years ago

  • Description updated (diff)
Actions #3

Updated by Peter Amstutz about 4 years ago

  • Target version changed from 2020-10-07 Sprint to 2020-10-21 Sprint
Actions #4

Updated by Ward Vandewege about 4 years ago

  • Target version changed from 2020-10-21 Sprint to 2020-11-04 Sprint
Actions #5

Updated by Ward Vandewege about 4 years ago

Ready for review at 9adb8ea50fcf555c7ec73cb0924c869af2345f86 on branch 16950-add-costanalyzer. Tests passed at developer-run-tests: #2163

Actions #6

Updated by Peter Amstutz about 4 years ago

  • Target version changed from 2020-11-04 Sprint to 2020-11-18
Actions #7

Updated by Tom Clegg about 4 years ago

NoPrefixFormatter looks pretty similar to &logrus.TextFormatter{DisableTimestamp: true}, except that it loses the structured-logging fields and just prints the message. That could be a feature, but I'm inclined to minimize the logging customization unless/until we're ready to put it somewhere like cmd/arvados-client to standardize across all the subcommands.

In tests: prefer to avoid the old Go SDK interface with arvadosclient.Dict... e.g., to get a CR, you can use

ac.RequestAndDecodeContext(ctx, &cr, "GET", "arvados/v1/container_requests/"+crUUID, nil, nil)

"Dict is a helper type so we don't have to write out 'map[string]interface{}' every time" ... is an antipattern that did not work out that well for us in the arvadosclient package.

You could do stuff with interfaces and embedded/wrapped types to handle the different kinds of nodeinfo without resorting to interface{} func params, but since you define both types right here, I think the simplest approach would be to make a single type that has both fields in it.

type nodeInfo struct {
  // Legacy
  Properties struct {
    CloudNode struct {
      Price float64
      Size  string
    } `json:"cloud_node"`
  }
  // Modern
  ProviderType string
  Price        float64
}

(You don't really need the json struct tags here for fields like Properties where the only difference is case: "properties" will decode into "Properties"... and you don't really need to define the fields like VCPUs that aren't being used.)

You can skip the bytes.Buffer and decode straight from the file: err = json.NewDecoder(f).Decode(&nodeInfo)

Should un-pyramid this:

if _, ok := itemMap["log_uuid"]; ok {
        if itemMap["log_uuid"] == nil {
                ...
        }
        // ...rest of func
}

...along these lines...

logUUID, ok := itemMap["log_uuid"]
if !ok {
        return
}
// ...rest of func

It seems like you could be using a cr arvados.ContainerRequest instead of an itemMap map[string]interface{} for the container request record, which would clean up some type casts/checks.

time.Parse("2006-01-02T15:04:05.000000000Z", ...) -- isn't this time.RFC3339Nano?

costanalyzer() and generateCrCsv()should return fmt.Errorf(...) instead of os.Exit(1) (except that generateCrCsv()

Could use backticks in flag descriptions -- e.g., description "output `directory` for the CSV reports" will render as "-output directory" instead of "-output string"

Actions #8

Updated by Tom Clegg about 4 years ago

One more thing: I found the behavior of creating a bunch of files and dirs in ./results/ a little surprising. The help text mentions that output csv files go there, but doesn't say anything about the cache behavior. I wonder if it would be a bit more friendly if the default output dir was somewhere under $HOME?

Actions #9

Updated by Ward Vandewege about 4 years ago

Tom Clegg wrote:

NoPrefixFormatter looks pretty similar to &logrus.TextFormatter{DisableTimestamp: true}, except that it loses the structured-logging fields and just prints the message. That could be a feature, but I'm inclined to minimize the logging customization unless/until we're ready to put it somewhere like cmd/arvados-client to standardize across all the subcommands.

Yeah, the idea is that we're logging to the console, and there's a human there, and structured logging in that context doesn't make a whole lot of sense. It's interesting, &logrus.TextFormatter{DisableTimestamp: true} is close but still starts each line with the (truncated) level, and I don't see a 'DisableLevel' option in logrus:

INFO Collecting child containers for container request ce8i5-xvhdp-9k61uz8l8eap5tr 
INFO  done                                        
INFO                                              
INFO Uuid report in results/ce8i5-dz642-f6zze094riq01ku.csv 
INFO 
Aggregate cost accounting for all supplied uuids in results/2020-11-11-11-37-13-aggregate-costaccounting.csv 

This is the second time we do this, NoPrefixFormatter is also in deduplicationreport. So, how about we standardize it? I made #17148 for that purpose.

In tests: prefer to avoid the old Go SDK interface with arvadosclient.Dict... e.g., to get a CR, you can use

Thanks, fixed. I also replaced a few of those uses in the code.

"Dict is a helper type so we don't have to write out 'map[string]interface{}' every time" ... is an antipattern that did not work out that well for us in the arvadosclient package.

OK, I had been wondering about that. I've just replaced it with map[string]interface{}.

You could do stuff with interfaces and embedded/wrapped types to handle the different kinds of nodeinfo without resorting to interface{} func params, but since you define both types right here, I think the simplest approach would be to make a single type that has both fields in it.

Thanks, done.

(You don't really need the json struct tags here for fields like Properties where the only difference is case: "properties" will decode into "Properties"... and you don't really need to define the fields like VCPUs that aren't being used.)

Simpler, thanks.

You can skip the bytes.Buffer and decode straight from the file: err = json.NewDecoder(f).Decode(&nodeInfo)

That one takes the cake for shortening the code. Thanks!

Should un-pyramid this:

Done.

It seems like you could be using a cr arvados.ContainerRequest instead of an itemMap map[string]interface{} for the container request record, which would clean up some type casts/checks.

Done, that was also a very nice cleanup, thanks!

time.Parse("2006-01-02T15:04:05.000000000Z", ...) -- isn't this time.RFC3339Nano?

Not really, time.RFC3339Nano has a trailing time: "2006-01-02T15:04:05.999999999Z07:00". In any case, it's moot now; the time.Parse calls are gone after the switch to using the struct types from the SDK.

costanalyzer() and generateCrCsv()should return fmt.Errorf(...) instead of os.Exit(1) (except that generateCrCsv()

Done.

Could use backticks in flag descriptions -- e.g., description "output `directory` for the CSV reports" will render as "-output directory" instead of "-output string"

Oh! I didn't know that. Neat, thanks.

One more thing: I found the behavior of creating a bunch of files and dirs in ./results/ a little surprising. The help text mentions that output csv files go there, but doesn't say anything about the cache behavior. I wonder if it would be a bit more friendly if the default output dir was somewhere under $HOME?

I changed it so that the cache goes into ~/.cache/arvados/costanalyzer. I also added a flag to disable the cache (and use it in the tests). For the results directory, I'm not sure what the best default is. It's ./results now, sounds like you would prefer it to be $HOME/something ?

Ready for another look in e1937e57fe2c0e99b6b636049142cc7598f80231

Actions #10

Updated by Ward Vandewege about 4 years ago

I changed the output directory behavior, removing the default. It has to be set explicitly now. See 6790e4990fa54d81decdd555beee337144d1fab1 on branch 16950-add-costanalyzer.

Actions #11

Updated by Peter Amstutz about 4 years ago

  • Target version changed from 2020-11-18 to 2020-12-02 Sprint
Actions #12

Updated by Ward Vandewege about 4 years ago

Actions #13

Updated by Tom Clegg about 4 years ago

Probably better to use os.UserHomeDir() instead of user.Current().HomeDir (this reminds me of how annoying it was in arvados-server boot when I wanted to install passenger in a specific dir but it insisted on putting stuff in the current user's home dir, even if I set $HOME to something else).

logger.SetOutput(stdout) seems surprising -- why not stderr?

Rest LGTM, thanks

Actions #14

Updated by Ward Vandewege about 4 years ago

Tom Clegg wrote:

Probably better to use os.UserHomeDir() instead of user.Current().HomeDir (this reminds me of how annoying it was in arvados-server boot when I wanted to install passenger in a specific dir but it insisted on putting stuff in the current user's home dir, even if I set $HOME to something else).

logger.SetOutput(stdout) seems surprising -- why not stderr?

Rest LGTM, thanks

OK, I've fixed those 2 things in b39f7a6141ecd5c53531b7705c0496623b4df9e9

Actions #15

Updated by Ward Vandewege about 4 years ago

  • % Done changed from 0 to 100
  • Status changed from In Progress to Resolved
Actions #16

Updated by Peter Amstutz over 3 years ago

  • Release set to 38
Actions

Also available in: Atom PDF