Multi-pass mode to reduce keep-balance memory footprint
Background: Currently keep-balance's RAM footprint increases with the number of stored blocks. On a large site, even the largest available machine might not have enough RAM to complete a keep-balance cycle. Scalability will be vastly improved if we can:
- run keep-balance without keeping the entire list of stored blocks in memory at once (addressed below)
- distribute the keep-balance work across multiple hosts or compute nodes (future work)
Proposal: On large clusters, balance in N passes, where each pass considers 1/N of the possible block locators. For example, if N=16, the first pass considers blocks whose locators begin with "0", the next pass those beginning with "1", and so on. For simplicity, N must be a power of 16.
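To make the prefix partitioning concrete, here is a minimal Go sketch of how the N prefixes could be enumerated and matched against locators; `passPrefixes` is a hypothetical helper for illustration, not part of keep-balance:

```go
package main

import (
	"fmt"
	"strings"
)

// passPrefixes returns the N hex prefixes covering the locator space,
// e.g. N=16 yields "0".."f" and N=256 yields "00".."ff".
// Assumes N is a power of 16 (and at least 16).
func passPrefixes(n int) []string {
	digits := 0
	for p := 1; p < n; p *= 16 {
		digits++
	}
	prefixes := make([]string, 0, n)
	for i := 0; i < n; i++ {
		prefixes = append(prefixes, fmt.Sprintf("%0*x", digits, i))
	}
	return prefixes
}

func main() {
	// Each pass considers only the block locators matching its prefix.
	for _, prefix := range passPrefixes(16) {
		fmt.Println(prefix, strings.HasPrefix("0f2c3a...", prefix))
	}
}
```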
New behaviors relating to trash/pull lists:
- When starting a new sweep, clear trash lists (no change to existing behavior)
- [keepstore] When posting a new trash/pull list, check the `X-Keep-List-Prefix` header, and don't clear existing entries that have a different prefix (see the sketch after this list)
- [keep-balance] When posting a new trash/pull list, set the `X-Keep-List-Prefix` header so keepstore knows which entries to clear
- [keep-balance] Run a pass for each prefix, then merge the resulting statistics to produce a full-cluster summary
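As a rough illustration of the keepstore side, the sketch below shows one way prefix-aware clearing could work; `TrashRequest` and `replaceTrashList` are assumed names for illustration, not keepstore's actual types:

```go
package main

import (
	"fmt"
	"strings"
)

// TrashRequest stands in for one entry on keepstore's trash list.
type TrashRequest struct {
	Locator    string // block locator (hash)
	BlockMtime int64
}

// replaceTrashList applies a newly posted trash list: entries whose
// locators match the X-Keep-List-Prefix value are replaced by the
// incoming list; entries with a different prefix are kept. An empty
// prefix reproduces the old behavior of clearing the whole list.
func replaceTrashList(existing, incoming []TrashRequest, prefix string) []TrashRequest {
	merged := make([]TrashRequest, 0, len(existing)+len(incoming))
	for _, tr := range existing {
		if prefix != "" && !strings.HasPrefix(tr.Locator, prefix) {
			merged = append(merged, tr)
		}
	}
	return append(merged, incoming...)
}

func main() {
	existing := []TrashRequest{{Locator: "0aaa"}, {Locator: "1bbb"}}
	incoming := []TrashRequest{{Locator: "0ccc"}}
	// Posting a list with prefix "0" leaves the "1bbb" entry alone.
	fmt.Println(replaceTrashList(existing, incoming, "0"))
}
```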
New behaviors relating to `replication_confirmed` (see the sketch after this list):
- [rails] add collections column `replication_confirmed_partial`, default null
- [rails] reset `replication_confirmed_partial=null` when updating a collection (just like the existing behavior of `replication_confirmed`)
- [keep-balance] when starting a multi-pass sweep, clear `replication_confirmed_partial`:
  `update collections set replication_confirmed_partial=NULL`
- [keep-balance] after each pass (single prefix), set or reduce `replication_confirmed_partial`:
  `update collections set replication_confirmed_partial=least($1, coalesce(replication_confirmed_partial, $1)) where portable_data_hash=$2`
- [keep-balance] after all passes (prefixes) are done, copy `replication_confirmed_partial` to `replication_confirmed`:
  `update collections set replication_confirmed=replication_confirmed_partial, replication_confirmed_at=$1 where replication_confirmed_partial is not NULL`
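A minimal sketch of this SQL sequence, assuming keep-balance issues the statements through Go's database/sql with a PostgreSQL driver (the function names are illustrative; `least()` is PostgreSQL's scalar two-argument minimum):

```go
package balance

import (
	"database/sql"
	"time"

	_ "github.com/lib/pq" // assumed PostgreSQL driver
)

// startSweep runs before the first pass: discard any partial results
// left over from a previous (possibly interrupted) sweep.
func startSweep(db *sql.DB) error {
	_, err := db.Exec(`update collections set replication_confirmed_partial=NULL`)
	return err
}

// recordPassResult runs after each pass. A collection's confirmed
// replication is the minimum over all passes, since each pass only
// verifies the blocks matching one prefix.
func recordPassResult(db *sql.DB, pdh string, confirmed int) error {
	_, err := db.Exec(`update collections
		set replication_confirmed_partial=least($1, coalesce(replication_confirmed_partial, $1))
		where portable_data_hash=$2`, confirmed, pdh)
	return err
}

// finishSweep runs after the last pass: promote the partial values to
// replication_confirmed in a single statement.
func finishSweep(db *sql.DB) error {
	_, err := db.Exec(`update collections
		set replication_confirmed=replication_confirmed_partial, replication_confirmed_at=$1
		where replication_confirmed_partial is not NULL`, time.Now())
	return err
}
```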
- For now, a single keep-balance process will perform the N passes serially, then merge the results (see the sketch after this list).
- In future, we should allow multiple keep-balance processes on different nodes to run passes concurrently. This will require further coordination: a single "coordinator" process merges the statistics produced by "worker" processes and updates `replication_confirmed` when all workers are finished. Ideally, the workers can be dispatched automatically as containers on cloud/HPC nodes.
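A minimal sketch of the serial flow, using a hypothetical `Statistics` type and `runPass` callback rather than keep-balance's real API:

```go
package balance

// Statistics stands in for keep-balance's per-sweep counters.
type Statistics struct {
	BlocksChecked int
	BytesTrashed  int64
}

// Merge folds one pass's counters into the cluster-wide totals. Each
// block locator matches exactly one prefix, so totals simply add up.
func (s *Statistics) Merge(pass Statistics) {
	s.BlocksChecked += pass.BlocksChecked
	s.BytesTrashed += pass.BytesTrashed
}

// runSweep performs the N passes serially; peak memory is roughly 1/N
// of a single full-index sweep, because each pass only loads the block
// index entries matching its prefix.
func runSweep(prefixes []string, runPass func(prefix string) Statistics) Statistics {
	var total Statistics
	for _, prefix := range prefixes {
		total.Merge(runPass(prefix))
	}
	return total
}
```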