Idea #13429
closed[API] [arvados-cwl-runner] Save workflow outputs to desired storage classes
Description
arvados-cwl-runner should accept a command line option that determines the storage classes used to save the workflow output.
The minimal implementation is to update the storage_classes_desired field of the output collection. It's OK that this wastes storage resources by first writing the output data to the default class, then copying it to the desired class, and then (after blobSignatureTTL) deleting it from default.
Updated by Tom Morris over 6 years ago
- Related to Feature #11184: [Keep] Support multiple storage classes added
Updated by Tom Clegg over 6 years ago
- Subject changed from arvados-cwl-runner support storage tiers to [API] [arvados-cwl-runner] Save workflow outputs to desired storage classes
- Description updated (diff)
Updated by Tom Clegg over 6 years ago
- Related to deleted (Feature #11184: [Keep] Support multiple storage classes)
Updated by Tom Clegg over 6 years ago
- Blocks Feature #11184: [Keep] Support multiple storage classes added
Updated by Tom Morris over 6 years ago
- Target version changed from To Be Groomed to Arvados Future Sprints
- Story points set to 1.0
Updated by Tom Morris over 6 years ago
- Target version changed from Arvados Future Sprints to 2018-06-06 Sprint
Updated by Lucas Di Pentima over 6 years ago
Reviewing 95c01d22571043d7b6405f928538204cf930d453
- File
sdk/cwl/arvados_cwl/__init__.py
- Line 604: I think the default storage classes value could be set on the
parser.add_argument()
call - Line 606: On 13430 one of the requirements was to accept just one storage class for the moment, it that also desirable on this story?
- Line 727: Typo on help message: 'wortkflow'
- Line 604: I think the default storage classes value could be set on the
- I think some tests are needed for cases when the required storage classes is not the default value, and to check that it errors out when passed multiple storage classes.
Updated by Fuad Muhic over 6 years ago
I added check and tests for multiple storage classes. I think it would be nice to have a test that checks if output collection
contains storage_classes_desired field set to proper value defined by --storage-classes argument. However I'm really not sure how to write that test, I tried but I guess I'm not experienced enough with cwl-runner.
Updated by Fuad Muhic over 6 years ago
- Target version changed from 2018-06-06 Sprint to 2018-06-20 Sprint
Updated by Lucas Di Pentima over 6 years ago
Reviewing b01b480adb45c3bbfcfab13e343e08c16854dedc
- File
sdk/cwl/tests/test_submit.py
* Line 338:capture_stdout
is not being used after themain()
call so I suppose it can avoided or read from it to assure that the correct error message is being printed? - File
sdk/cwl/arvados_cwl/__init__.py
* Line 789: On error I think the way to go is to log it and return 1 (as in lines 804 & 824) - I think a test that assures an output collection has correct correct
storage_classes_desired
field is not necessary as the Python SDK Collection’s test already proves that. OTOH I think having a test that confirms that default & non-default—storage-classes
values are passed correctly tomake_output_collection()
will be very useful. I think this can be done using a similar approach as yourtest_error_when_multiple_storage_classes_specified
test and mockingmake_output_collection
to check how it was called. - I realized that the storage classes argument is being used only when
—submit
is not added. When it is used, a-c-r submits a container that runs another a-c-r with all necessary params (see test attest_submit.py
line 591, for example), so we’ll need a similar test that assures that the correct@ —storage-classes@ param is being passed along. I’m not sure where this needs to be added, please talk to Peter if you don’t find it. - Also, remember to merge master into this branch so Jenkins test don't fail because of
ciso8601
Updated by Fuad Muhic over 6 years ago
tests: https://ci.curoverse.com/job/developer-run-tests/759/ (with merged master)
- Added test which check that storage_classes parameters is propagated correctly to make_output_collection
- In --submit mode, storage_classes are passes to submitted container. (I opted to pass --storage-classes parameter only if user specified if in the first place. If user don't specify --storage-classes, submitted container will not have --storage-classes=default. Everything still works correctly and we don't have to add 100+ lines to test_submit.py). I added tests for this.
Updated by Fuad Muhic over 6 years ago
- Status changed from In Progress to Resolved
- % Done changed from 0 to 100
Applied in changeset arvados|3fbabc1b5236b6667f453a2949849d96f6e683df.