Feature #17351
closed
[arv-put] Storage classes
Description
- to define the command line arguments for specifying storage classes
- to implement all the expected behaviour in arv-put
- to document this behaviour
- to add the necessary tests to make sure we comply with this behaviour
- shall we migrate arv-put to Go as part of this, or is that future work?
Current command line arguments (arvados 2.1.0):
arv-put --help
usage: arv-put [-h] [--version] [--normalize | --dry-run]
               [--as-stream | --stream | --as-manifest | --in-manifest | --manifest | --as-raw | --raw]
               [--update-collection UUID] [--use-filename FILENAME]
               [--filename FILENAME] [--portable-data-hash]
               [--replication N] [--storage-classes STORAGE_CLASSES]
               [--threads N] [--exclude PATTERN]
               [--follow-links | --no-follow-links]
               [--trash-at YYYY-MM-DDTHH:MM | --trash-after DAYS]
               [--project-uuid UUID] [--name NAME]
               [--progress | --no-progress | --batch-progress] [--silent]
               [--resume | --no-resume] [--cache | --no-cache]
               [--retries RETRIES]
               [path [path ...]]
(..)
  --replication N       Set the replication level for the new collection: how
                        many different physical storage devices (e.g., disks)
                        should have a copy of each data block. Default is to
                        use the server-provided default (if any) or 2.
  --storage-classes STORAGE_CLASSES
                        Specify comma separated list of storage classes to be
                        used when saving data to Keep.
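For discussion, here is a minimal sketch of how a comma separated --storage-classes value could be turned into a list. It is not the actual arv-put code: the helper parse_storage_classes, the program name arv-put-sketch, and the class names "hot" and "archive" are made up for illustration, and rejecting an empty list is an assumption rather than documented behaviour.

```python
import argparse


def parse_storage_classes(value):
    """Split a comma separated string into a list of class names, dropping blanks."""
    classes = [name.strip() for name in value.split(",") if name.strip()]
    if not classes:
        # Assumption: an empty list is treated as a usage error.
        raise argparse.ArgumentTypeError("at least one storage class name is required")
    return classes


# Hypothetical parser that mirrors only the options relevant to this ticket.
parser = argparse.ArgumentParser(prog="arv-put-sketch")
parser.add_argument("--replication", type=int, metavar="N", default=None)
parser.add_argument("--storage-classes", type=parse_storage_classes, default=None)
parser.add_argument("paths", nargs="*")

args = parser.parse_args(
    ["--replication", "2", "--storage-classes", "hot, archive", "directory"]
)
print(args.storage_classes)
```

Leaving the default as None keeps "option not given" distinguishable from any explicit list, which matters for the --update-collection case below.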
base case
arv-put --replication N --storage-classes STORAGE_CLASSES directory
Expected behaviour: ...
updating an existing collection
arv-put --replication N --storage-classes STORAGE_CLASSES directory --update-collection zzzzz-4zz18-xxxxxxxxxxxxxxx
Expected behaviour: ...
giving conflicting options for resume transaction
arv-put --replication N --storage-classes STORAGE_CLASSES directory
arv-put --replication M --storage-classes DIFFERENT_STORAGE_CLASSES directory --resume
Expected behaviour: ...
Related issues
Updated by Nico César over 3 years ago
- Related to Idea #16107: Storage classes added
Updated by Nico César over 3 years ago
- Target version set to To Be Groomed
- Category set to Keep
Updated by Nico César over 3 years ago
- Description updated (diff)
- Subject changed from [arv-put] [and other keep clients] Storage tiers design to [arv-put] Storage tiers design
Updated by Nico César over 3 years ago
- Subject changed from [arv-put] Storage tiers design to [arv-put] Storage classes revisit
Updated by Nico César over 3 years ago
- Subject changed from [arv-put] Storage classes revisit to [arv-put] Storage tiers design
Updated by Nico César over 3 years ago
- Subject changed from [arv-put] Storage tiers design to [arv-put] Storage classes revisit
Updated by Lucas Di Pentima over 3 years ago
- Target version changed from To Be Groomed to 2021-04-14 sprint
Updated by Peter Amstutz over 3 years ago
- Target version changed from 2021-04-14 sprint to 2021-05-26 sprint
Updated by Peter Amstutz over 3 years ago
- Subject changed from [arv-put] Storage classes revisit to [arv-put] Storage classes
Updated by Peter Amstutz over 3 years ago
- Assigned To set to Lucas Di Pentima
- Subject changed from [arv-put] Storage classes to [arv-put] Storage classes
Updated by Lucas Di Pentima over 3 years ago
- Target version changed from 2021-05-26 sprint to 2021-06-09 sprint
Updated by Lucas Di Pentima over 3 years ago
- Status changed from New to In Progress
Updated by Lucas Di Pentima over 3 years ago
Updates at a0fcd46cb - branch 17351-arvput-keepclient-storage-support
Test run: developer-run-tests: #2508
- Removes the limitation of no more than one storage class. (I couldn't find the reason for that limitation, introduced in #13430)
- Passes storage classes data at Collection instantiation time instead of passing it to the .save() or .save_new() methods. As a result, the Keep client used to upload files writes directly to the specified classes.
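To illustrate why the timing matters, here is a toy model, not the Python SDK. ToyKeepClient and ToyCollection are hypothetical stand-ins, the class names "default" and "archive" are placeholders, and the assumption is that blocks are uploaded as data is written, before the record is saved.

```python
class ToyKeepClient:
    """Hypothetical stand-in for a Keep client; records the classes used per block."""

    def __init__(self):
        self.writes = []

    def put(self, data, classes):
        self.writes.append((data, tuple(classes)))


class ToyCollection:
    """Hypothetical stand-in for a collection that uploads blocks as data is added."""

    def __init__(self, keep_client, storage_classes=None):
        self.keep_client = keep_client
        # Known from construction, so every block write can use it.
        self.storage_classes = storage_classes or ["default"]
        self.record = None

    def write_file(self, data):
        # Blocks go to Keep here, before save_new() is ever called.
        self.keep_client.put(data, self.storage_classes)

    def save_new(self, storage_classes=None):
        # Classes given only now reach the record, not the blocks already written.
        if storage_classes is not None:
            self.storage_classes = storage_classes
        self.record = {"storage_classes": list(self.storage_classes)}
        return self.record


keep = ToyKeepClient()

late = ToyCollection(keep)  # classes supplied only at save time
late.write_file(b"block-1")
late.save_new(storage_classes=["archive"])

early = ToyCollection(keep, storage_classes=["archive"])  # classes supplied up front
early.write_file(b"block-2")
early.save_new()

print(keep.writes)
```

In this model the first block lands on the placeholder default class even though its record asks for "archive", while the second block is written to "archive" from the start.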
Updated by Tom Clegg over 3 years ago
- Related to Idea #17465: Support writing blocks to correct storage classes in Python SDK added
Updated by Lucas Di Pentima over 3 years ago
Rebased to the latest #17465 changes at 57a26e5
Test run: developer-run-tests: #2510
Updated by Lucas Di Pentima over 3 years ago
- Target version changed from 2021-06-09 sprint to 2021-06-23 sprint
Updated by Lucas Di Pentima over 3 years ago
While working on #17572 I realized that making arv-put honor a previously created collection's desired_storage_classes field may produce surprising results for the user, for example:
1. The user creates an empty collection via the CLI tools, assigning a desired_storage_classes list with nonexistent classes.
2. The RailsAPI will be OK with that, so it gets created.
3. Then, the user executes arv-put without any --storage-classes argument but using --update-collection UUID with the previously created collection's UUID.
4. The user will get an error from arv-put because Keep returns 503 (I think) when a non-valid class is specified. This is because the command will honor the storage classes set on the collection record if none are specified.
If our priority is making sure that Keep writes are done on the correct classes or not at all, I think the solution would be to make RailsAPI or controller return an error when a non-valid class is requested. WDYT?
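The fallback described above can be sketched roughly as follows. This is an illustration under assumptions, not the arv-put implementation: effective_storage_classes and check_writable are hypothetical helpers, the record is a plain dict using the field name as written in this comment, and the set of writable classes is supplied by hand instead of being discovered from the cluster.

```python
def effective_storage_classes(cli_classes, collection_record):
    """Pick the classes to write to: explicit CLI value first, then the collection's own."""
    if cli_classes:
        return list(cli_classes)
    if collection_record:
        return list(collection_record.get("desired_storage_classes") or [])
    return []  # empty: leave the choice to the server-side default


def check_writable(classes, writable_classes):
    """Raise a readable error if any requested class has no writable volume."""
    missing = [c for c in classes if c not in writable_classes]
    if missing:
        raise ValueError(
            "no writable Keep volume for storage class(es): " + ", ".join(missing)
        )


# Scenario from the steps above: existing collection, no --storage-classes given.
record = {"desired_storage_classes": ["nonexistent"]}
chosen = effective_storage_classes(None, record)

message = None
try:
    check_writable(chosen, writable_classes={"default"})
except ValueError as err:
    message = str(err)
    print(message)
```

The same check would also fail for a class that was valid when the collection was created but has since lost all of its writable volumes, so create-time validation alone would not remove this error path.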
Updated by Tom Clegg over 3 years ago
I think the "error because existing collection has unwritable classes" outcome is acceptable. Even if we validate classes when creating/saving a collection, this same condition can happen if all volumes with a given class become read-only, temporarily unreachable, or full.
We should probably check that the error message in such cases is not too confusing, though.
Updated by Lucas Di Pentima over 3 years ago
- % Done changed from 0 to 100
- Status changed from In Progress to Resolved
Applied in changeset arvados|523d1c2a9963edc25becf7958e024992ed8a6e66.