Idea #9180

closed

[PySDK] Avoid overreplication in KeepClient

Added by Brett Smith almost 8 years ago. Updated almost 8 years ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: -
Target version:
Start date: 06/08/2016
Due date:
Story points: -

Description

When you upload data to Keep with the Python KeepClient, it starts a thread for each Keep service to do the uploads. It lets N threads run at a time, where N is the desired replication of the block, until N of those threads succeed or it has to give up trying.

This frequently causes KeepClient to overreplicate even when things are running smoothly, because a thread that starts before the last needed success still runs to completion. Imagine a client uploads a block with desired replication 2 while 4 Keep services are available. This can happen:

Start upload thread 1
Start upload thread 2
Upload thread 1 succeeds
Start upload thread 3
Upload thread 2 succeeds
Don't start any more threads, because there have been two successes
Upload thread 3 succeeds
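
The timeline above can be replayed with a small simulation. This is only an illustrative sketch, not KeepClient code: the function name `copies_written` and the upload durations are made up for the example, and every upload is assumed to succeed.

```python
import heapq

def copies_written(durations, replication):
    """Replay the 'N threads at a time' policy, assuming every upload succeeds.

    durations[i] is the (hypothetical) time upload thread i+1 takes.
    Returns how many copies of the block end up stored.
    """
    pending = list(enumerate(durations, start=1))  # (thread number, duration)
    running = []  # min-heap of (finish time, thread number)
    now = 0
    successes = 0
    while True:
        # Fill free slots as long as we still want more successes.
        while pending and len(running) < replication and successes < replication:
            number, duration = pending.pop(0)
            heapq.heappush(running, (now + duration, number))
        if not running:
            break
        # The next thread to finish succeeds and frees its slot.
        now, number = heapq.heappop(running)
        successes += 1
    return successes

# Desired replication 2, 4 services: thread 3 starts when thread 1 finishes,
# and still completes after thread 2 has delivered the second copy.
print(copies_written([1, 2, 2, 1], replication=2))
```

With these durations the fourth thread never starts, but three copies are written for a desired replication of 2.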

Adjust the Python SDK logic so it doesn't let more threads run than are necessary to achieve the desired replication. The simplest possible change would be to adjust the thread limiter so that, rather than simply allowing N threads to run, it only lets (N - successes achieved) threads run.
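
A minimal sketch of such a limiter follows. The names `ReplicationLimiter` and `upload_worker` are hypothetical and do not come from the SDK; the sketch only illustrates the "(N - successes achieved)" rule, under the assumption that each worker thread asks the limiter for permission before uploading.

```python
import threading

class ReplicationLimiter:
    """Hypothetical limiter: admits at most (wanted - successes) uploads at once."""

    def __init__(self, wanted):
        self._wanted = wanted
        self._successes = 0
        self._running = 0
        self._cond = threading.Condition()

    @property
    def successes(self):
        with self._cond:
            return self._successes

    def acquire(self):
        """Block until an upload may start. Return False if no more are needed."""
        with self._cond:
            while True:
                if self._successes >= self._wanted:
                    return False
                if self._running < self._wanted - self._successes:
                    self._running += 1
                    return True
                self._cond.wait()

    def release(self, success):
        """Record the outcome of an upload and wake any waiting threads."""
        with self._cond:
            self._running -= 1
            if success:
                self._successes += 1
            self._cond.notify_all()

def upload_worker(limiter, upload):
    """Run one upload attempt (a callable returning True on success)."""
    if not limiter.acquire():
        return  # desired replication already reached
    ok = False
    try:
        ok = upload()
    finally:
        limiter.release(ok)
```

Because the number of running uploads plus recorded successes never exceeds the desired replication, the third thread in the example above would wait rather than start, and would exit without uploading once the second success is recorded. A failed upload frees its slot, so a waiting thread can take over.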


Subtasks 1 (0 open, 1 closed)

Task #9387: Review 9180-avoid-overreplication-keepclient (Resolved, Lucas Di Pentima, 06/08/2016)

Related issues

Related to Arvados - Idea #9446: [SDK] Refactor keep parallel write strategy (Resolved, Lucas Di Pentima, 06/27/2016)