Feature #17696

Exported config has default storage class(es), SDKs use the configured default storage class if not overridden

Added by Peter Amstutz 4 months ago. Updated 17 days ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
Keep
Target version:
Start date:
08/19/2021
Due date:
% Done:

100%

Estimated time:
(Total: 0.00 h)
Story points:
3.0

Description

Use case: the default storage classes should be configurable in the config file, and there can be more than one.

  • api/controller reads Collections/DefaultStorageClasses map[string]bool. The default value should be {"default":true} (aka one storage class named "default").
  • out of the box, if Collections/DefaultStorageClasses is empty, and Volumes/XXXX/StorageClasses is empty, the implied value for both of those is "default". This should be done as part of config load. Using config-dump will show these "default" values. The config on the website will just have this listed in the comments, not in the sample config file.
  • when new collections are stored (and no storage classes are requested) the default values should be added to the desired storage classes
  • update those clients that send "default" as the storage class to send nothing instead
  • need a migration? for existing collections with storage class "default". Add a python recipe to the cookbook to change a particular storage class (e.g. "default") to something else.
  • add some tests
  • validation on the config-check:
    - Make sure the default class(es) are implemented by a volume.
    - If the config mentions an explicit storage class on a volume, all volumes must have storage classes defined (warning or error?) and Collections/DefaultStorageClasses must be set explicitly.

This gives us a short configuration (nothing set) that implies "default" in the simple case, but as soon as someone adds an additional storage class, we protect them from mistakes by forcing them to be explicit about all the storage classes in use.


Subtasks

Task #17979: Review 17696-gosdk-default-storage-classResolvedPeter Amstutz


Related issues

Related to Arvados Epics - Story #16107: Storage classesNew03/01/202109/30/2021

Blocked by Arvados - Task #17978: Add storage class default/priority fields to config + exported configResolved08/05/2021

Blocked by Arvados - Feature #17967: Prioritize reads from different storage classesResolved08/05/2021

Associated revisions

Revision 168d3fe8
Added by Lucas Di Pentima 23 days ago

Merge branch '17696-pysdk-default-storage-class' into main. Refs #17696

Arvados-DCO-1.1-Signed-off-by: Lucas Di Pentima <>

Revision dc1c6914
Added by Lucas Di Pentima 18 days ago

Merge branch '17696-gosdk-default-storage-class' into main. Closes #17696

Arvados-DCO-1.1-Signed-off-by: Lucas Di Pentima <>

History

#1 Updated by Peter Amstutz 4 months ago

#2 Updated by Peter Amstutz 2 months ago

  • Target version set to 2021-08-18 sprint

#3 Updated by Ward Vandewege about 2 months ago

  • Description updated (diff)

#4 Updated by Ward Vandewege about 2 months ago

  • Story points set to 3.0
  • Description updated (diff)

#5 Updated by Peter Amstutz about 2 months ago

  • Tracker changed from Story to Feature

#6 Updated by Lucas Di Pentima about 2 months ago

  • Assigned To set to Lucas Di Pentima

#7 Updated by Lucas Di Pentima about 2 months ago

  • Blocked by Task #17978: Add storage class default/priority fields to config + exported config added

#8 Updated by Peter Amstutz about 2 months ago

  • Blocked by Feature #17967: Prioritize reads from different storage classes added

#9 Updated by Tom Clegg about 1 month ago

With #17967, the exported config has defaults, but keepstore and apiserver both use the default storage classes if the caller doesn't specify anything -- so the SDKs probably don't even need to look up the defaults.

#10 Updated by Lucas Di Pentima about 1 month ago

  • Status changed from New to In Progress

#11 Updated by Peter Amstutz about 1 month ago

Tom Clegg wrote:

With #17967, the exported config has defaults, but keepstore and apiserver both use the default storage classes if the caller doesn't specify anything -- so the SDKs probably don't even need to look up the defaults.

Let's think this through.

  1. The client picks a random server (rendezvous hashed) to write to
  2. The server sees there are no storage classes specified, so it assumes the client wants the default ones
  3. The server doesn't offer any of the default storage classes
  4. The servers sends back an error (what error?)
  5. The client tries the next server in the list
  6. The next server offers one default storage class but not the other
  7. It sends back success with replication=2 (because that's what the storage class's replication is)
  8. The client declares success (but didn't write to both storage classes)

So I think it would be better than nothing for backwards compatibility, but in some situations it would not do the right thing.

#12 Updated by Tom Clegg about 1 month ago

Peter Amstutz wrote:

  1. The client declares success (but didn't write to both storage classes)

Yeah, good call. The client does need to track storage classes after all.

#13 Updated by Lucas Di Pentima about 1 month ago

  • Target version changed from 2021-08-18 sprint to 2021-09-01 sprint

#14 Updated by Lucas Di Pentima about 1 month ago

Updates at 5ac636b5a - branch 17696-pysdk-default-storage-class
Test run: https://ci.arvados.org/job/developer-run-tests/2644/

  • Updates Python SDK to request the configured storage classes when none are explicitly requested.
  • Updates the documentation on the collection's API section.
  • Adds & updates tests.

#15 Updated by Lucas Di Pentima about 1 month ago

Pending

  • Migration cookbook recipe: Depends on #17994 -- Maybe this task should be on its own story blocked by it.
  • Go SDK (in progress)

#16 Updated by Lucas Di Pentima about 1 month ago

Update on PySDK branch

Updates at 742be53
Test run: https://ci.arvados.org/job/developer-run-tests/2645/

  • Dropped the commit about changing the StorageClasses.SAMPLE.Default value to false because it made some tests fail.

#17 Updated by Lucas Di Pentima 27 days ago

Updates at 3b492e126 - branch 17696-gosdk-default-storage-class
Test run: https://ci.arvados.org/job/developer-run-tests/2647/

  • Adds ClusterConfig() function to arvadosclient.
  • Loads configured default storage classes when creating a new KeepClient.
  • Adds tests.

#18 Updated by Peter Amstutz 26 days ago

Lucas Di Pentima wrote:

Updates at 5ac636b5a - branch 17696-pysdk-default-storage-class
Test run: https://ci.arvados.org/job/developer-run-tests/2644/

  • Updates Python SDK to request the configured storage classes when none are explicitly requested.
  • Updates the documentation on the collection's API section.
  • Adds & updates tests.

I think we also need the Collection class to take its storage classes from the default. In the current code, a new Collection record will write blocks to the default storage class from the config, but the record will be written to the API with empty storage classes, and we assume the API server will update it to the same default storage classes that the blocks were written to. If, for some reason, the API server behaved differently in setting the default, all the blocks would all have been written to the wrong place.

If the Collection object gets its default storage classes from the config (to be used when not explicitly specified) then both the keep client and Collection object will use the same storage classes.

#19 Updated by Peter Amstutz 26 days ago

Lucas Di Pentima wrote:

Updates at 3b492e126 - branch 17696-gosdk-default-storage-class
Test run: https://ci.arvados.org/job/developer-run-tests/2647/

  • Adds ClusterConfig() function to arvadosclient.
  • Loads configured default storage classes when creating a new KeepClient.
  • Adds tests.

It hurts a bit that we're adding new features to arvadosclient instead of consolidating it with arvados.Client, maybe someday we'll find time to fix that.

Same comment goes for the Go SDK as for the Python SDK, the arvados.Collection.FileSystem() method should set c.StorageClassesDesired to the config default instead of assuming that keepclient and API server will agree.

#20 Updated by Lucas Di Pentima 25 days ago

Peter Amstutz wrote:

I think we also need the Collection class to take its storage classes from the default. In the current code, a new Collection record will write blocks to the default storage class from the config, but the record will be written to the API with empty storage classes, and we assume the API server will update it to the same default storage classes that the blocks were written to. If, for some reason, the API server behaved differently in setting the default, all the blocks would all have been written to the wrong place.
If the Collection object gets its default storage classes from the config (to be used when not explicitly specified) then both the keep client and Collection object will use the same storage classes.

Updates at 99e0417c6 (branch 17696-pysdk-default-storage-class)
Test run: https://ci.arvados.org/job/developer-run-tests/2654/

  • Adds default storage classes usage on Collection class.
  • Adds test.

#22 Updated by Lucas Di Pentima 24 days ago

After discussion with the team, we decided that the default storage classes handling will be left to just the keep clients so I've dropped the latest commits and rebase to main.

PySDK side at 5209c88 - branch 17696-pysdk-default-storage-class
Test run: https://ci.arvados.org/job/developer-run-tests/2655/

#23 Updated by Lucas Di Pentima 24 days ago

Updates at 913c7638c (GoSDK branch)
Test run: https://ci.arvados.org/job/developer-run-tests/2659/

  • Adds integration test that proves that a Collection is saved with the same storage_classes_desired list as the KeepClient uses.
  • In doing the above, I realized that the kc.loadDefaultSorageClasses() call should be done on keepclient.New().
  • Updates previously added test.

#24 Updated by Peter Amstutz 23 days ago

Lucas Di Pentima wrote:

Updates at 913c7638c (GoSDK branch)
Test run: https://ci.arvados.org/job/developer-run-tests/2659/

  • Adds integration test that proves that a Collection is saved with the same storage_classes_desired list as the KeepClient uses.
  • In doing the above, I realized that the kc.loadDefaultSorageClasses() call should be done on keepclient.New().
  • Updates previously added test.

Is KeepClient.loadDefaultClasses() backwards compatible? I see it prints a warning in that case, and DefaultStorageClasses will be nil. Is that handled properly? Glancing at BlockWrite, I'm not confident that it is. Does it need to make sure to set [default] ?

#25 Updated by Peter Amstutz 23 days ago

Lucas Di Pentima wrote:

After discussion with the team, we decided that the default storage classes handling will be left to just the keep clients so I've dropped the latest commits and rebase to main.

PySDK side at 5209c88 - branch 17696-pysdk-default-storage-class
Test run: https://ci.arvados.org/job/developer-run-tests/2655/

LGTM

#26 Updated by Lucas Di Pentima 20 days ago

Peter Amstutz wrote:

Is KeepClient.loadDefaultClasses() backwards compatible? I see it prints a warning in that case, and DefaultStorageClasses will be nil. Is that handled properly? Glancing at BlockWrite, I'm not confident that it is. Does it need to make sure to set [default] ?

Updates at 6610ed6 (rebased)
Test run: https://ci.arvados.org/job/developer-run-tests/2666/
Test re-run: https://ci.arvados.org/job/developer-run-tests-remainder/2772/

  • Updated (& fixed) tests that simulate the case when talking to an old cluster that doesn't have storage classes keep support & exported config defaults.

#27 Updated by Lucas Di Pentima 18 days ago

  • Target version changed from 2021-09-01 sprint to 2021-09-15 sprint

#28 Updated by Peter Amstutz 18 days ago

Lucas Di Pentima wrote:

Peter Amstutz wrote:

Is KeepClient.loadDefaultClasses() backwards compatible? I see it prints a warning in that case, and DefaultStorageClasses will be nil. Is that handled properly? Glancing at BlockWrite, I'm not confident that it is. Does it need to make sure to set [default] ?

Updates at 6610ed6 (rebased)
Test run: https://ci.arvados.org/job/developer-run-tests/2666/
Test re-run: https://ci.arvados.org/job/developer-run-tests-remainder/2772/

  • Updated (& fixed) tests that simulate the case when talking to an old cluster that doesn't have storage classes keep support & exported config defaults.

LGTM

#29 Updated by Lucas Di Pentima 17 days ago

  • Status changed from In Progress to Resolved

Also available in: Atom PDF