Project

General

Profile

Actions

Feature #17696

closed

Exported config has default storage class(es), SDKs use the configured default storage class if not overridden

Added by Peter Amstutz almost 3 years ago. Updated over 2 years ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
Keep
Target version:
Story points:
3.0
Release relationship:
Auto

Description

Use case: the default storage classes should be configurable in the config file, and there can be more than one.

  • api/controller reads Collections/DefaultStorageClasses map[string]bool. The default value should be {"default":true} (aka one storage class named "default").
  • out of the box, if Collections/DefaultStorageClasses is empty, and Volumes/XXXX/StorageClasses is empty, the implied value for both of those is "default". This should be done as part of config load. Using config-dump will show these "default" values. The config on the website will just have this listed in the comments, not in the sample config file.
  • when new collections are stored (and no storage classes are requested) the default values should be added to the desired storage classes
  • update those clients that send "default" as the storage class to send nothing instead
  • need a migration? for existing collections with storage class "default". Add a python recipe to the cookbook to change a particular storage class (e.g. "default") to something else.
  • add some tests
  • validation on the config-check:
    - Make sure the default class(es) are implemented by a volume.
    - If the config mentions an explicit storage class on a volume, all volumes must have storage classes defined (warning or error?) and Collections/DefaultStorageClasses must be set explicitly.

This gives us a short configuration (nothing set) that implies "default" in the simple case, but as soon as someone adds an additional storage class, we protect them from mistakes by forcing them to be explicit about all the storage classes in use.


Subtasks 1 (0 open1 closed)

Task #17979: Review 17696-gosdk-default-storage-classResolvedPeter Amstutz08/19/2021Actions

Related issues

Related to Arvados Epics - Idea #16107: Storage classesResolved03/01/202109/30/2021Actions
Blocked by Arvados - Task #17978: Add storage class default/priority fields to config + exported configResolvedTom Clegg08/05/2021Actions
Blocked by Arvados - Feature #17967: Prioritize reads from different storage classesResolvedTom Clegg08/05/2021Actions
Actions #1

Updated by Peter Amstutz almost 3 years ago

Actions #2

Updated by Peter Amstutz almost 3 years ago

  • Target version set to 2021-08-18 sprint
Actions #3

Updated by Ward Vandewege over 2 years ago

  • Description updated (diff)
Actions #4

Updated by Ward Vandewege over 2 years ago

  • Story points set to 3.0
  • Description updated (diff)
Actions #5

Updated by Peter Amstutz over 2 years ago

  • Tracker changed from Idea to Feature
Actions #6

Updated by Lucas Di Pentima over 2 years ago

  • Assigned To set to Lucas Di Pentima
Actions #7

Updated by Lucas Di Pentima over 2 years ago

  • Blocked by Task #17978: Add storage class default/priority fields to config + exported config added
Actions #8

Updated by Peter Amstutz over 2 years ago

  • Blocked by Feature #17967: Prioritize reads from different storage classes added
Actions #9

Updated by Tom Clegg over 2 years ago

With #17967, the exported config has defaults, but keepstore and apiserver both use the default storage classes if the caller doesn't specify anything -- so the SDKs probably don't even need to look up the defaults.

Actions #10

Updated by Lucas Di Pentima over 2 years ago

  • Status changed from New to In Progress
Actions #11

Updated by Peter Amstutz over 2 years ago

Tom Clegg wrote:

With #17967, the exported config has defaults, but keepstore and apiserver both use the default storage classes if the caller doesn't specify anything -- so the SDKs probably don't even need to look up the defaults.

Let's think this through.

  1. The client picks a random server (rendezvous hashed) to write to
  2. The server sees there are no storage classes specified, so it assumes the client wants the default ones
  3. The server doesn't offer any of the default storage classes
  4. The servers sends back an error (what error?)
  5. The client tries the next server in the list
  6. The next server offers one default storage class but not the other
  7. It sends back success with replication=2 (because that's what the storage class's replication is)
  8. The client declares success (but didn't write to both storage classes)

So I think it would be better than nothing for backwards compatibility, but in some situations it would not do the right thing.

Actions #12

Updated by Tom Clegg over 2 years ago

Peter Amstutz wrote:

  1. The client declares success (but didn't write to both storage classes)

Yeah, good call. The client does need to track storage classes after all.

Actions #13

Updated by Lucas Di Pentima over 2 years ago

  • Target version changed from 2021-08-18 sprint to 2021-09-01 sprint
Actions #14

Updated by Lucas Di Pentima over 2 years ago

Updates at 5ac636b5a - branch 17696-pysdk-default-storage-class
Test run: developer-run-tests: #2644

  • Updates Python SDK to request the configured storage classes when none are explicitly requested.
  • Updates the documentation on the collection's API section.
  • Adds & updates tests.
Actions #15

Updated by Lucas Di Pentima over 2 years ago

Pending

  • Migration cookbook recipe: Depends on #17994 -- Maybe this task should be on its own story blocked by it.
  • Go SDK (in progress)
Actions #16

Updated by Lucas Di Pentima over 2 years ago

Update on PySDK branch

Updates at 742be53
Test run: developer-run-tests: #2645

  • Dropped the commit about changing the StorageClasses.SAMPLE.Default value to false because it made some tests fail.
Actions #17

Updated by Lucas Di Pentima over 2 years ago

Updates at 3b492e126 - branch 17696-gosdk-default-storage-class
Test run: developer-run-tests: #2647

  • Adds ClusterConfig() function to arvadosclient.
  • Loads configured default storage classes when creating a new KeepClient.
  • Adds tests.
Actions #18

Updated by Peter Amstutz over 2 years ago

Lucas Di Pentima wrote:

Updates at 5ac636b5a - branch 17696-pysdk-default-storage-class
Test run: developer-run-tests: #2644

  • Updates Python SDK to request the configured storage classes when none are explicitly requested.
  • Updates the documentation on the collection's API section.
  • Adds & updates tests.

I think we also need the Collection class to take its storage classes from the default. In the current code, a new Collection record will write blocks to the default storage class from the config, but the record will be written to the API with empty storage classes, and we assume the API server will update it to the same default storage classes that the blocks were written to. If, for some reason, the API server behaved differently in setting the default, all the blocks would all have been written to the wrong place.

If the Collection object gets its default storage classes from the config (to be used when not explicitly specified) then both the keep client and Collection object will use the same storage classes.

Actions #19

Updated by Peter Amstutz over 2 years ago

Lucas Di Pentima wrote:

Updates at 3b492e126 - branch 17696-gosdk-default-storage-class
Test run: developer-run-tests: #2647

  • Adds ClusterConfig() function to arvadosclient.
  • Loads configured default storage classes when creating a new KeepClient.
  • Adds tests.

It hurts a bit that we're adding new features to arvadosclient instead of consolidating it with arvados.Client, maybe someday we'll find time to fix that.

Same comment goes for the Go SDK as for the Python SDK, the arvados.Collection.FileSystem() method should set c.StorageClassesDesired to the config default instead of assuming that keepclient and API server will agree.

Actions #20

Updated by Lucas Di Pentima over 2 years ago

Peter Amstutz wrote:

I think we also need the Collection class to take its storage classes from the default. In the current code, a new Collection record will write blocks to the default storage class from the config, but the record will be written to the API with empty storage classes, and we assume the API server will update it to the same default storage classes that the blocks were written to. If, for some reason, the API server behaved differently in setting the default, all the blocks would all have been written to the wrong place.
If the Collection object gets its default storage classes from the config (to be used when not explicitly specified) then both the keep client and Collection object will use the same storage classes.

Updates at 99e0417c6 (branch 17696-pysdk-default-storage-class)
Test run: developer-run-tests: #2654

  • Adds default storage classes usage on Collection class.
  • Adds test.
Actions #22

Updated by Lucas Di Pentima over 2 years ago

After discussion with the team, we decided that the default storage classes handling will be left to just the keep clients so I've dropped the latest commits and rebase to main.

PySDK side at 5209c88 - branch 17696-pysdk-default-storage-class
Test run: developer-run-tests: #2655

Actions #23

Updated by Lucas Di Pentima over 2 years ago

Updates at 913c7638c (GoSDK branch)
Test run: developer-run-tests: #2659

  • Adds integration test that proves that a Collection is saved with the same storage_classes_desired list as the KeepClient uses.
  • In doing the above, I realized that the kc.loadDefaultSorageClasses() call should be done on keepclient.New().
  • Updates previously added test.
Actions #24

Updated by Peter Amstutz over 2 years ago

Lucas Di Pentima wrote:

Updates at 913c7638c (GoSDK branch)
Test run: developer-run-tests: #2659

  • Adds integration test that proves that a Collection is saved with the same storage_classes_desired list as the KeepClient uses.
  • In doing the above, I realized that the kc.loadDefaultSorageClasses() call should be done on keepclient.New().
  • Updates previously added test.

Is KeepClient.loadDefaultClasses() backwards compatible? I see it prints a warning in that case, and DefaultStorageClasses will be nil. Is that handled properly? Glancing at BlockWrite, I'm not confident that it is. Does it need to make sure to set [default] ?

Actions #25

Updated by Peter Amstutz over 2 years ago

Lucas Di Pentima wrote:

After discussion with the team, we decided that the default storage classes handling will be left to just the keep clients so I've dropped the latest commits and rebase to main.

PySDK side at 5209c88 - branch 17696-pysdk-default-storage-class
Test run: developer-run-tests: #2655

LGTM

Actions #26

Updated by Lucas Di Pentima over 2 years ago

Peter Amstutz wrote:

Is KeepClient.loadDefaultClasses() backwards compatible? I see it prints a warning in that case, and DefaultStorageClasses will be nil. Is that handled properly? Glancing at BlockWrite, I'm not confident that it is. Does it need to make sure to set [default] ?

Updates at 6610ed6 (rebased)
Test run: developer-run-tests: #2666
Test re-run: developer-run-tests-remainder: #2772

  • Updated (& fixed) tests that simulate the case when talking to an old cluster that doesn't have storage classes keep support & exported config defaults.
Actions #27

Updated by Lucas Di Pentima over 2 years ago

  • Target version changed from 2021-09-01 sprint to 2021-09-15 sprint
Actions #28

Updated by Peter Amstutz over 2 years ago

Lucas Di Pentima wrote:

Peter Amstutz wrote:

Is KeepClient.loadDefaultClasses() backwards compatible? I see it prints a warning in that case, and DefaultStorageClasses will be nil. Is that handled properly? Glancing at BlockWrite, I'm not confident that it is. Does it need to make sure to set [default] ?

Updates at 6610ed6 (rebased)
Test run: developer-run-tests: #2666
Test re-run: developer-run-tests-remainder: #2772

  • Updated (& fixed) tests that simulate the case when talking to an old cluster that doesn't have storage classes keep support & exported config defaults.

LGTM

Actions #29

Updated by Lucas Di Pentima over 2 years ago

  • Status changed from In Progress to Resolved
Actions #30

Updated by Peter Amstutz over 2 years ago

  • Release set to 42
Actions

Also available in: Atom PDF