Story #9446

Updated by Peter Amstutz almost 5 years ago

Currently, @arvados.keep.KeepClient.put@ creates one KeepWriterThread per Keep server and then relies on a complex (and error-prone) locking strategy, implemented in ThreadLimiter, to ensure that only certain threads perform uploads and that they do so in a certain order.

Refactor this code to use the following alternate strategy:

* For N wanted copies create N upload threads
* Start a new upload to the next server in sorted_roots when an upload fails

You may want to use a Queue (instead of explicit locks) to communicate between the main thread and the upload threads.
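A minimal sketch of the strategy above, assuming a hypothetical @upload(root)@ callable that returns True on success (it stands in for the real HTTP PUT and is not part of the existing SDK). Here the Queue serves as the ordered source of servers: each of the N threads takes the next root, and on failure moves on to the next one.

```python
import queue
import threading


def put_with_workers(sorted_roots, wanted_copies, upload):
    """Try servers in preference order until wanted_copies uploads succeed.

    upload(root) -> bool is a hypothetical stand-in for the real upload call.
    Returns the list of roots that accepted the block.
    """
    # A FIFO queue preserves the preference order of sorted_roots.
    pending = queue.Queue()
    for root in sorted_roots:
        pending.put(root)

    succeeded = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                root = pending.get_nowait()
            except queue.Empty:
                return  # no servers left to try
            if upload(root):
                with lock:
                    succeeded.append(root)
                return  # this thread has delivered its one copy
            # Upload failed: loop around and take the next server.

    # One thread per wanted copy, as proposed in the story.
    threads = [threading.Thread(target=worker) for _ in range(wanted_copies)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return succeeded
```

Because each thread stops after one success, at most N uploads are in flight at once and later servers are only contacted after an earlier one fails.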

Consider setting up a thread pool attached to the keep client object and dispatching work instead of spawning new threads.
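One possible shape for the thread pool idea, sketched with the standard library's @concurrent.futures.ThreadPoolExecutor@. The class name @PooledClient@, the @upload@ callable and the @max_workers@ default are all hypothetical. This variant dispatches uploads in batches rather than reacting to each failure individually, so it is a simplification of the strategy described above.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


class PooledClient:
    """Hypothetical stand-in for a Keep client that owns a reusable pool."""

    def __init__(self, upload, max_workers=4):
        self._upload = upload  # assumed: upload(root) -> bool
        self._pool = ThreadPoolExecutor(max_workers=max_workers)

    def put(self, sorted_roots, wanted_copies):
        remaining = iter(sorted_roots)
        succeeded = []
        while len(succeeded) < wanted_copies:
            # Dispatch only as many uploads as copies are still needed,
            # always taking the most preferred untried servers first.
            batch = list(islice(remaining, wanted_copies - len(succeeded)))
            if not batch:
                break  # ran out of servers
            # map() returns results in the same order as the batch.
            results = self._pool.map(self._upload, batch)
            succeeded.extend(r for r, ok in zip(batch, results) if ok)
        return succeeded

    def close(self):
        self._pool.shutdown(wait=True)
```

The pool is created once per client, so repeated @put@ calls reuse the same worker threads instead of spawning new ones.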

Two things to keep in mind:

* The order in which it tries to upload to each server matters (because the earlier servers are preferred over the later ones)
* If it goes through the entire list without uploading sufficient replicas, it should try again. However, when it does so, it should (a) remember how many replicas were already uploaded (so if it wanted 3 and got 2 on the first pass, it only needs 1 more) and (b) not try to upload to services that already accepted an upload in a previous round.
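The retry bookkeeping in the second point could look roughly like this. The sketch is sequential for clarity; @upload(root)@ and the @max_rounds@ limit are assumptions, not names from the existing code.

```python
def put_with_retries(sorted_roots, wanted_copies, upload, max_rounds=3):
    """Retry over sorted_roots, keeping state between rounds.

    upload(root) -> bool is a hypothetical stand-in for the real upload call.
    Returns the set of roots that hold a replica.
    """
    done = set()  # services that already accepted the block
    for _ in range(max_rounds):
        for root in sorted_roots:  # preferred servers are tried first
            if len(done) >= wanted_copies:
                break
            if root in done:
                continue  # (b) skip services that already succeeded
            if upload(root):
                done.add(root)
        if len(done) >= wanted_copies:
            break  # (a) successes from earlier rounds count toward the total
    return done
```

With 3 wanted copies and 2 successes in the first round, the second round skips the two successful services and needs only one more upload.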
