[arv-put] [Python] Allow caller to specify storage classes when writing data to Keep
- Command line (arv-put --storage-classes=archive ...)
- Create a new empty Collection and set desired storage classes (default being "default")
- Load an existing Collection, write some new data, and save
- Load an existing Collection, change its desired storage classes, write some new data, and save
If the caller asks for multiple storage classes, error out early, instead of writing all the data and then hitting an error when saving the collection record.
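One way to get the early failure is to check the requested classes before any block is written. This is a minimal sketch: `check_storage_classes`, `put`, and the `upload` callable are hypothetical names, not part of the Python SDK, and the single-class limit is taken from this story's description only.

```python
def check_storage_classes(classes):
    """Reject requests that would otherwise fail only when the record is saved.

    Hypothetical helper: the one-class limit mirrors the story text and is
    not a documented SDK constant.
    """
    if classes is None:
        return ["default"]
    if len(classes) != 1:
        raise ValueError(
            "exactly one storage class is supported, got %r" % (classes,))
    return list(classes)


def put(paths, classes, upload):
    """Validate first, then hand each path to `upload` (a stand-in for the Keep write)."""
    classes = check_storage_classes(classes)  # fails before any data is written
    for path in paths:
        upload(path, classes)
    return classes
```

Because the check runs before the loop, a rejected request never reaches `upload`.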
#13 Updated by Fuad Muhic over 2 years ago
Storage classes are separated by commas rather than spaces:
arv-put --storage-classes foo,bar,baz ./myFile.txt
If specifying them as a space-separated list of strings is better, I can change it.
(In that case the storage classes will have to be specified as the last parameter.)
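The comma-separated form can be handled with a small `argparse` type function. The sketch below is illustrative only: `storage_classes_arg` and the `arv-put-sketch` parser are hypothetical and do not reproduce the actual arv-put argument parser.

```python
import argparse


def storage_classes_arg(value):
    """Split 'foo,bar,baz' into ['foo', 'bar', 'baz'] (hypothetical helper)."""
    classes = [name.strip() for name in value.split(",")]
    if not all(classes):
        # Catches inputs such as "foo,,bar" or a trailing comma.
        raise argparse.ArgumentTypeError(
            "empty storage class name in %r" % value)
    return classes


parser = argparse.ArgumentParser(prog="arv-put-sketch")
parser.add_argument("--storage-classes", type=storage_classes_arg, default=None)
parser.add_argument("paths", nargs="*")

args = parser.parse_args(["--storage-classes", "foo,bar,baz", "./myFile.txt"])
print(args.storage_classes, args.paths)
```

Since the option takes exactly one value, the parser does not need to guess where the class list ends and the file paths begin.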
The storage_classes_desired field is always updated at the end, when the collection is saved (by calling the save and save_new methods), which means only one API call is made.
That means that running the following commands one after another:
arv-put --update-collection uuid --storage-classes hot ./myFile.txt
arv-put --update-collection uuid --storage-classes cold ./myFile.txt
will not change the collection's storage_classes_desired to cold if myFile.txt didn't change, because the manifest didn't change and the Collection's save method will not make an API call. (If this is not what we want I can change it, but in that case updating a collection while specifying storage classes will usually require 2 API calls.)
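The effect can be reproduced with a toy model. `ToyCollection` below is a hypothetical stand-in written for this comment, not the SDK's Collection class. It only mimics the rule described above: save() returns without an API call when the manifest is unchanged.

```python
import hashlib


class ToyCollection:
    """Hypothetical stand-in for a collection record; not the SDK class."""

    def __init__(self):
        self.manifest = {}        # file name -> content hash
        self.saved_manifest = {}  # what the "API server" last received
        self.storage_classes_desired = ["default"]
        self.api_calls = 0

    def write(self, name, data):
        self.manifest[name] = hashlib.sha256(data).hexdigest()

    def committed(self):
        return self.manifest == self.saved_manifest

    def save(self, storage_classes=None):
        if self.committed():
            return  # unchanged manifest: no API call, requested classes ignored
        if storage_classes is not None:
            self.storage_classes_desired = list(storage_classes)
        self.saved_manifest = dict(self.manifest)
        self.api_calls += 1


coll = ToyCollection()
coll.write("myFile.txt", b"same bytes")
coll.save(storage_classes=["hot"])
coll.write("myFile.txt", b"same bytes")  # the file did not change
coll.save(storage_classes=["cold"])
print(coll.storage_classes_desired, coll.api_calls)  # ['hot'] 1
```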
#14 Updated by Lucas Di Pentima over 2 years ago
- Non-related: I think you should set up your git so that your Veritas account appears on the commits.
- Maybe some Python SDK tests are needed to cover the additions on
- Adding docstring documentation for storage_classes on both methods would also be helpful.
- When storage_classes is not None on save_new(), maybe it would be convenient to validate that a list of strings was passed and error out early, similar to what the story asks for in the multiple storage classes case.
- Regarding updating the storage class when updating a collection that didn’t change: Maybe we can add a new behavior on the save() method that updates the collection whenever the storage_classes argument is not None, even if self.committed() == True.
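The last two suggestions (validate the argument early, and save whenever storage_classes is given) could look roughly like the sketch below. `validate_storage_classes` and `plan_save` are hypothetical names used for illustration, and the manifest value is a placeholder. This is not the code in collection.py.

```python
def validate_storage_classes(storage_classes):
    """Return a list of class names, or None if the caller passed None."""
    if storage_classes is None:
        return None
    # A bare string such as "cold" is rejected: a list of strings is expected.
    if not isinstance(storage_classes, (list, tuple)):
        raise TypeError("storage_classes must be a list of strings")
    if not all(isinstance(c, str) and c for c in storage_classes):
        raise TypeError("storage_classes must contain non-empty strings")
    return list(storage_classes)


def plan_save(committed, storage_classes=None):
    """Return the fields a save() call would send, or None to skip the call."""
    classes = validate_storage_classes(storage_classes)  # error out early
    if committed and classes is None:
        return None  # nothing changed and nothing requested
    body = {}
    if not committed:
        body["manifest_text"] = "<current manifest>"  # placeholder value
    if classes is not None:
        body["storage_classes_desired"] = classes
    return body
```

With this rule, an unchanged collection still produces one update when a class list is passed, and a plain save() on a committed collection remains a no-op.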
#15 Updated by Lucas Di Pentima over 2 years ago
- sdk/python/arvados/collection.py lines 1474 & 1487 could be condensed to just one check at the beginning of the
- Would also be nice to have a test that proves that saving a committed collection updates the
- With that, it LGTM.