Idea #13430
closed
[arv-put] [Python] Allow caller to specify storage classes when writing data to Keep
Added by Tom Morris over 6 years ago.
Updated over 6 years ago.
Release relationship:
Auto
Description
Examples
- Command line (arv-put --storage-classes=archive ...)
- Create a new empty Collection and set desired storage classes (default being "default")
- Load an existing Collection, write some new data, and save
- Load an existing Collection, change its desired storage classes, write some new data, and save
If the caller asks for multiple storage classes, error out early, instead of writing all the data and then hitting an error when saving the collection record.
- Related to Feature #11184: [Keep] Support multiple storage classes added
- Target version set to To Be Groomed
- Subject changed from arv-put supports storage tiers to [arv-put] [Python] Allow caller to specify storage classes when writing data to Keep
- Description updated (diff)
- Related to deleted (Feature #11184: [Keep] Support multiple storage classes)
- Description updated (diff)
- Related to Feature #13382: [keepstore] Write new blocks to appropriate storage class added
- Target version changed from To Be Groomed to Arvados Future Sprints
- Target version changed from Arvados Future Sprints to 2018-05-23 Sprint
- Assigned To set to Fuad Muhic
- Status changed from New to In Progress
Quick notes:
Storage classes are separated by commas instead of spaces:
arv-put --storage-classes foo,bar,baz ./myFile.txt
If specifying them as a space-separated list of strings is better, I can change it.
(In that case, the storage classes would have to be specified as the last parameter.)
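For illustration, here is a minimal sketch of how a comma-separated option could be parsed with argparse. It is not the actual arv-put code: the parser, the storage_classes_arg helper, and the "arv-put-sketch" program name are hypothetical. It also shows why the comma form is convenient: a space-separated option (nargs="+") would consume the file paths that follow it, which is why it would have to come last.

```python
import argparse


def storage_classes_arg(value):
    """Split a comma-separated string into a list of non-empty class names."""
    classes = [name.strip() for name in value.split(",")]
    if not all(classes):
        # Rejects inputs such as "" or "foo,,bar".
        raise argparse.ArgumentTypeError(
            "storage classes must be a comma-separated list of non-empty names")
    return classes


# Hypothetical parser; the real arv-put defines many more options.
parser = argparse.ArgumentParser(prog="arv-put-sketch")
parser.add_argument("--storage-classes", type=storage_classes_arg, default=None)
parser.add_argument("paths", nargs="*")

args = parser.parse_args(["--storage-classes", "foo,bar,baz", "./myFile.txt"])
print(args.storage_classes, args.paths)
```

Because the whole list is a single argument, the option can appear anywhere on the command line without swallowing the positional paths.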
Updating the storage_classes_desired field always happens at the end, when the collection is saved (by calling the save and save_new methods), which means only one API call is made.
That means that running the following commands one after another:
arv-put --update-collection uuid --storage-classes hot ./myFile.txt
arv-put --update-collection uuid --storage-classes cold ./myFile.txt
will not change the collection's storage_classes_desired to cold if myFile.txt didn't change, because the manifest didn't change and the Collection's save method will not make an API call. (If this is not what we want, I can change it, but in that case updating a collection while specifying storage classes will usually require 2 API calls.)
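The behavior described above can be modeled with a small toy class. This is only a sketch: FakeCollection, its files dictionary, and its api_calls counter are hypothetical stand-ins, not the Arvados Python SDK. It assumes, as the note says, that save does nothing when the content is unchanged.

```python
class FakeCollection:
    """Hypothetical stand-in for an already-saved collection (not the SDK class)."""

    def __init__(self):
        self.files = {}
        self.storage_classes_desired = ["default"]
        self._committed = True  # loaded from the API, nothing pending
        self.api_calls = 0

    def write(self, name, data):
        # Only mark the collection dirty when the content actually changes.
        if self.files.get(name) != data:
            self.files[name] = data
            self._committed = False

    def save(self, storage_classes=None):
        if self._committed:
            return  # unchanged manifest: no API call, storage classes ignored
        if storage_classes is not None:
            self.storage_classes_desired = list(storage_classes)
        self.api_calls += 1  # one call updates manifest and storage classes
        self._committed = True


col = FakeCollection()

# First run: the file is new, so the save goes through with "hot".
col.write("myFile.txt", b"data")
col.save(storage_classes=["hot"])

# Second run: same file content, so save() returns early and "cold" is dropped.
col.write("myFile.txt", b"data")
col.save(storage_classes=["cold"])

print(col.storage_classes_desired, col.api_calls)
```

In this model the second run leaves storage_classes_desired at hot and makes no further API call, which mirrors the two arv-put commands above.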
- Non-related: I think you should set up your git so that your Veritas account appears on the commits.
- Maybe some Python SDK tests are needed to cover the additions to the save() and save_new() methods.
- Adding docstring documentation for storage_classes on both methods would also be helpful.
- If storage_classes is not None on save() & save_new(), maybe it would be convenient to validate that a list of strings was passed and error out early, similar to what the story asks for in the multiple storage classes case.
- Regarding updating the storage class when updating a collection that didn't change: Maybe we can add a new behavior to the save() method that updates the collection whenever the storage_classes argument is not None, even if self.committed() == True.
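The last two suggestions could look roughly like the sketch below. Both helper names (validate_storage_classes and needs_save) are hypothetical and do not exist in the SDK. The sketch assumes that save() receives storage_classes as a list of strings or None, and that the committed state is available as a boolean.

```python
def validate_storage_classes(storage_classes):
    """Return a clean list of class names, or raise before any data is written."""
    if storage_classes is None:
        return None
    # A bare string such as "hot" is a common mistake; reject it explicitly.
    if not isinstance(storage_classes, (list, tuple)):
        raise TypeError("storage_classes must be a list of strings")
    if not storage_classes:
        raise ValueError("storage_classes must not be empty")
    if not all(isinstance(name, str) and name for name in storage_classes):
        raise ValueError("storage_classes must contain only non-empty strings")
    return list(storage_classes)


def needs_save(committed, storage_classes):
    """Decide whether save() should call the API.

    Proposed rule: save when there are uncommitted changes, or whenever the
    caller passed storage_classes, even if the collection is committed.
    """
    return (not committed) or storage_classes is not None


print(validate_storage_classes(["hot", "cold"]))
print(needs_save(committed=True, storage_classes=["cold"]))
print(needs_save(committed=True, storage_classes=None))
```

With this rule, a second arv-put run that only changes the storage classes would still trigger one API call, while a plain save() on an unchanged collection stays a no-op.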
- Status changed from In Progress to Resolved