Idea #17465
closedSupport writing blocks to correct storage classes in Python SDK
Description
Depends on #13382
The client is responsible for keeping track of how many replicas have been written for each storage class. If one storage class is satisfied but not another, the satisfied one can be removed from the X-Keep-Storage-Classes list. The overall write is a success if all storage classes were satisfied, and a failure if any of the storage classes could not be satisfied.
Add as an option to Python keepclient and Collection class.
"We are not expecting to skip servers, we just need to send the storage classes and know when we are done"
Keepclient code in Go and Python SDK support:
Setting storage classes to use when uploading a block
Passing storage classes in the X-Keep-Storage-Classes header
The client is responsible for keeping track of how many replicas have been written for each storage class. If one storage class is satisfied but not another, the satisfied one can be removed from the X-Keep-Storage-Classes list. The overall write is a success if all storage classes were satisfied, and a failure if any of the storage classes could not be satisfied.
- Collection instance knows the storage classes
- Storage classes are passed to the block manager instance
- Block manager passes storage classes to the put() call via a new optional parameter
Related issues
Updated by Peter Amstutz over 3 years ago
- Related to Idea #16107: Storage classes added
Updated by Peter Amstutz over 3 years ago
- Target version set to 2021-04-14 sprint
Updated by Lucas Di Pentima over 3 years ago
- Assigned To set to Lucas Di Pentima
Updated by Peter Amstutz over 3 years ago
- Target version changed from 2021-04-14 sprint to 2021-05-12 sprint
- Assigned To deleted (
Lucas Di Pentima)
Updated by Lucas Di Pentima over 3 years ago
- Blocked by Feature #13382: [keepstore] Write new blocks to appropriate storage class added
Updated by Peter Amstutz over 3 years ago
- Blocks Feature #17572: arv-mount understands storage classes added
Updated by Lucas Di Pentima over 3 years ago
- Target version changed from 2021-05-12 sprint to 2021-05-26 sprint
Updated by Lucas Di Pentima over 3 years ago
- Status changed from New to In Progress
Updated by Lucas Di Pentima over 3 years ago
- Target version changed from 2021-05-26 sprint to 2021-06-09 sprint
Updated by Lucas Di Pentima over 3 years ago
Status update:
Updates at a641967 - branch 17465-pysdk-storage-classes-support
Test run: developer-run-tests: #2499
KeepClient.put()
supports storage classes.- When talking to a older cluster, it falls back to the previous behavior, only checking for the number of copies.
- Added tests.
Note: Some of the new tests make KeepWriterQueue
to freeze. I'm not quite sure what's going on yet, but I avoided the problem my adding 1 to the number of tries (see line 574 at sdk/python/arvados/keep.py
)
Pending: Add storage classes support to arvados.Collection
Updated by Lucas Di Pentima over 3 years ago
Updates at aeb518534
Test run: developer-run-tests: #2500
- Adds
KeepClient
storage classes support toCollection
class throughBlockManager
. - Collections being loaded from a manifest locator now also load their
storage_classes_desired
attribute. - Added tests.
Updated by Lucas Di Pentima over 3 years ago
While reading through arv-put
code to see how to tackle #17351, I'm realizing I did something wrong in regards to the precedence of the storage classes being used when calling Collection.save()
or Collection.save_new()
and passing storage classes there: The current proposal disregards those passed arguments if the Collection
instance was created with other storage classes (something that can happen when the code is called with those arguments or when loading a Collection
from the API with storage classes desired attributes already set).
I'll fix that and update the ticket when ready.
Updated by Lucas Di Pentima over 3 years ago
Updates at 5ab34436e
Test run: developer-run-tests: #2507
- Makes
Collection.save()
andCollection.save_new()
to set the storage classes desired data as aCollection
instance variable, so it can be overwritten if already set at instantiation time. - Makes
BlockManager
to access storage classes desired data from itsCollection
instead of having an internal copy of the same information. This will allow to immediately start writing blocks on a different set of classes if requested. - Adds one more test.
Updated by Tom Clegg over 3 years ago
I'm not sure this is a great solution, but I think it explains/avoids the deadlock you were hitting, where pending_tries reaches 0 because #copies reaches the target, but there are still unsatisfied classes. With this patch, I found tests pass but generate a lot of noise from a queue.Empty exception in testutil -- ISTR the test strategy involves adding extra mocked keep responses in situations like this.
diff --git a/sdk/python/arvados/keep.py b/sdk/python/arvados/keep.py
index 2f20132ae..8fe05f347 100644
--- a/sdk/python/arvados/keep.py
+++ b/sdk/python/arvados/keep.py
@@ -571,7 +571,7 @@ class KeepClient(object):
self.response = None
self.storage_classes_tracking = True
self.queue_data_lock = threading.Lock()
- self.pending_tries = max(copies, len(classes))+1
+ self.pending_tries = max(copies, len(classes))
self.pending_tries_notification = threading.Condition()
def write_success(self, response, replicas_nr, classes_confirmed):
@@ -585,6 +585,11 @@ class KeepClient(object):
self.confirmed_storage_classes[st_class] += st_copies
except KeyError:
self.confirmed_storage_classes[st_class] = st_copies
+ pending_classes = 0
+ for st_class in self.wanted_storage_classes:
+ if st_class not in self.confirmed_storage_classes or self.confirmed_storage_classes[st_class] < self.wanted_copies:
+ pending_classes += 1
+ self.pending_tries = max(self.wanted_copies - self.successful_copies, pending_classes)
self.response = response
with self.pending_tries_notification:
self.pending_tries_notification.notify_all()
source:sdk/python/arvados/collection.py L1427 ("classes = ...") seems to be unused/unneeded:
if self._block_manager is None: copies = (self.replication_desired or self._my_api()._rootDesc.get('defaultCollectionReplication', 2)) classes = self.storage_classes_desired or [] self._block_manager = _BlockManager(self._my_keep(), copies=copies, put_threads=self.put_threads, num_retries=self.num_retries, storage_classes_func=self.storage_classes_desired)
source:sdk/python/arvados/keep.py L1256 seems like it will print this warning once per block on a big upload. Maybe use an "already warned" flag on the KeepClient object or something like that?
_logger.warning("X-Keep-Storage-Classes header not supported by the cluster")
Updated by Tom Clegg over 3 years ago
- Related to Feature #17351: [arv-put] Storage classes added
Updated by Lucas Di Pentima over 3 years ago
Thanks for your suggestion, I applied it with some tweaks to avoid code duplication.
Updates at 205aecd08
Test run: developer-run-tests: #2509
- Applied suggested fix to recompute pending tries.
- Avoids a "unsupported feature" warning logging for every block upload.
- Fixed tests.
Updated by Lucas Di Pentima over 3 years ago
- % Done changed from 0 to 100
- Status changed from In Progress to Resolved
Applied in changeset arvados|8227b8a1be943fbbf3adb23a3d549dec28efbbba.