Idea #13430
closed[arv-put] [Python] Allow caller to specify storage classes when writing data to Keep
Description
- Command line (
arv-put --storage-classes=archive ...
) - Create a new empty Collection and set desired storage classes (default being "default")
- Load an existing Collection, write some new data, and save
- Load an existing Collection, change its desired storage classes, write some new data, and save
If the caller asks for multiple storage classes, error out early, instead of writing all the data and then hitting an error when saving the collection record.
Related issues
Updated by Tom Morris over 6 years ago
- Related to Feature #11184: [Keep] Support multiple storage classes added
Updated by Tom Clegg over 6 years ago
- Subject changed from arv-put supports storage tiers to [arv-put] [Python] Allow caller to specify storage classes when writing data to Keep
- Description updated (diff)
Updated by Tom Clegg over 6 years ago
- Related to deleted (Feature #11184: [Keep] Support multiple storage classes)
Updated by Tom Clegg over 6 years ago
- Blocks Feature #11184: [Keep] Support multiple storage classes added
Updated by Peter Amstutz over 6 years ago
- Related to Feature #13382: [keepstore] Write new blocks to appropriate storage class added
Updated by Tom Morris over 6 years ago
- Target version changed from To Be Groomed to Arvados Future Sprints
Updated by Tom Morris over 6 years ago
- Target version changed from Arvados Future Sprints to 2018-05-23 Sprint
Updated by Fuad Muhic over 6 years ago
Quick notes:
Storage classes are separated with commas instead of space:
arv-put --storage-classes foo,bar,baz ./myFile.txt
If specifying them as space separated list of strings is better I can change it.
(In that case storage classes will have to be specified as last parameter)
Updating storage_classes_desired field always happens at the end when collection is saved (by calling save and save_new methods) which means only one API call is made.
That means that running following commands one after another:
arv-put --update-collection uuid --storage-classes hot ./myFile.txt
arv-put --update-collection uuid --storage-classes cold ./myFile.txt
will not change collections storage_classes_desired to cold, if myFile.txt didn't change, because manifest didn't change and Collection's save method will not make API call. (If this is not what we want I can change it but in that case updating collection while specifying storage classes will usually require 2 API calls.)
Updated by Lucas Di Pentima over 6 years ago
- Non-related: I think you should set up your git so that your Veritas account appears on the commits.
- Maybe some Python SDK tests are needed to cover the additions on
save()
andsave_new()
methods. - Adding docstring documentation for
storage_classes
on both methods would also be helpful - If
storage_classes
is not None onsave()
&save_new()
, maybe it would be convenient to validate that a list of strings was passed and error out early, similar to what the story ask on the multiple storage case. - Regarding updating the storage class when updating a collection that didn’t change: Maybe we can add a new behavior on the
save()
method that updates the collection whenever thestorage_classes
argument is not None, even if self.committed() == True.
Updated by Lucas Di Pentima over 6 years ago
Reviewing 287d269fa0ea5740715bfa6d3e93b44bea3b8584
Test run: https://ci.curoverse.com/job/developer-run-tests/722/
storage_classes
validation onsdk/python/arvados/collection.py
lines 1474 & 1487 could be condensed to just one check at the beginning of thesave()
method.- Would also be nice to have a test that proves that saving a committed collection updates the
storage_classes_desired
attribute. - With that, it LGTM.
Updated by Fuad Muhic over 6 years ago
- Status changed from In Progress to Resolved