Idea #15641
Updated by Tom Clegg over 5 years ago
When distributing data across multiple volumes in a cloud environment, rendezvous hashing should be based on volume ID:
* all keepstore servers access all volumes
* client/proxy uses rendezvous hash to sort/choose from the volume(s) in cluster config, and connects to the keepstore server(s) that have access to the chosen volumes
* keepstore uses rendezvous hash to sort/choose from the volumes it has access to
* keep-balance uses rendezvous hash to choose the preferred volume(s) where a blob should be stored; when pulling/trashing, it chooses an arbitrary keepstore from the ones that have write access to the relevant volume
(Current code uses rendezvous hashing to select a server, and sorts/chooses volumes in random/arbitrary order. This causes unnecessary bottlenecks between clients and buckets when there is one writable bucket per server or one writing server per bucket, and excessive keepstore-to-backend probing when there are multiple writable buckets per server.)
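
To illustrate the proposal above, here is a minimal sketch of rendezvous hashing keyed on volume ID. It is not the existing keepstore or SDK code. The function names, the MD5-of-concatenation weight, and the volume IDs and server URLs in the usage note are assumptions made for the example.

```python
import hashlib
import random


def rendezvous_order(block_hash, volume_ids):
    """Return volume_ids sorted by descending rendezvous weight for block_hash.

    Assumption: the weight is the MD5 hex digest of block_hash + volume_id.
    Any stable hash of the (block, volume) pair would serve the same purpose.
    """
    def weight(volume_id):
        return hashlib.md5((block_hash + volume_id).encode()).hexdigest()

    return sorted(volume_ids, key=weight, reverse=True)


def servers_for_block(block_hash, volume_servers, replicas=1):
    """Client/proxy side: pick the preferred volumes, then the servers behind them.

    volume_servers maps volume ID -> list of keepstore URLs with access to it
    (hypothetical shape standing in for the cluster config).
    """
    chosen = rendezvous_order(block_hash, list(volume_servers))[:replicas]
    return [(vol, volume_servers[vol]) for vol in chosen]


def pick_writer(block_hash, volume_writers):
    """keep-balance side: preferred volume, then an arbitrary server that can write to it.

    volume_writers maps volume ID -> list of keepstore URLs with write access.
    """
    volume = rendezvous_order(block_hash, list(volume_writers))[0]
    return volume, random.choice(volume_writers[volume])
```

For example, with hypothetical volumes `vol-a`, `vol-b`, `vol-c` all reachable from every keepstore, `servers_for_block(h, {"vol-a": urls, "vol-b": urls, "vol-c": urls}, replicas=2)` returns the same two volumes for a given block hash `h` no matter which client computes it or in what order the volumes are listed. Because clients, keepstore, and keep-balance all rank volumes the same way, a reader can go straight to the volume where the writer placed the block instead of probing every backend.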