Feature #8457

open

[Keep] Shuffle top N keep servers to balance reads

Added by Peter Amstutz about 8 years ago. Updated about 2 months ago.

Status: New
Priority: Normal
Assigned To: -
Category: Keep
Target version:
Story points: -
Release:
Release relationship: Auto

Description

Currently, the Keep client always orders the server list the same way for a given block. If many compute nodes request the same file at once (a common event when starting a large run), this concentrates load on the keepstore at the top of the list for each block, while other servers holding the same block go unused.

Since blocks are typically replicated, we could shuffle the top N services, where N is the greater of the block's replication count and the number of Keep readers we're willing to run simultaneously. In a properly replicated and balanced cluster this spreads out the load, because different clients will use slightly different priority orders when requesting blocks.
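A minimal sketch of the shuffling step, assuming the client already has the per-block ordered service list. The names `shuffle_top_n` and `read_order` and their parameters are hypothetical and are not part of any existing SDK:

```python
import random


def shuffle_top_n(services, n, rng=random):
    """Return a copy of `services` with only the first `n` entries shuffled.

    The tail of the list keeps its original per-block order, so servers
    that are unlikely to hold the block are still tried last.
    """
    n = max(0, min(n, len(services)))  # clamp n to the list length
    head = list(services[:n])
    rng.shuffle(head)
    return head + list(services[n:])


def read_order(services, replication, max_readers, rng=random):
    """Pick N as the greater of the replication count and the reader limit,
    then shuffle the top N services (hypothetical helper)."""
    n = max(replication, max_readers)
    return shuffle_top_n(services, n, rng)
```

With five services, a replication count of 2 and 3 simultaneous readers, only the first three entries change position between clients; the last two stay where the per-block ordering put them.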

Each SDK should have a function that returns "the maximum number of simultaneous workers, based on the desired replication level and the characteristics of the underlying Keep services." (The Python SDK has this code inside ThreadLimiter.__init__; it can be refactored out independently.) The result of that function should also be used to determine how many services to shuffle for this story.
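One possible shape for such a function, as a sketch only. The ticket does not specify the formula, so the rule used here is an assumption: one worker when a single service can hold every requested copy or when the per-service limit is unknown, and otherwise enough workers to reach the desired replication. The name `max_simultaneous_workers` and the parameter `max_service_replicas` are hypothetical:

```python
import math


def max_simultaneous_workers(copies, max_service_replicas=None):
    """Estimate how many workers may run at once (assumed rule, see above).

    copies: desired replication level for the block.
    max_service_replicas: most replicas a single Keep service can store
        per request, or None if unknown (hypothetical parameter).
    """
    if copies < 1:
        raise ValueError("copies must be at least 1")
    # Unknown limit, or one service can satisfy the whole request:
    # running more than one worker gains nothing.
    if not max_service_replicas or max_service_replicas >= copies:
        return 1
    # Otherwise, run enough workers to reach the replication level.
    return math.ceil(copies / max_service_replicas)
```

The value returned here would then feed into the choice of how many services to shuffle.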
