Bug #7167

Updated by Tom Clegg over 9 years ago

This is a script aimed at system administrators who are migrating a cluster from one installation to another. It copies Keep data from the old cluster to the new one in a way that's efficient both for the migration itself and for accessing data on the destination cluster (in other words, blocks live on services early in their rendezvous hash order).

 h2. Functional requirements 

 * The script dynamically finds all blocks available on the source cluster. _This can only be done by getting the "index" from each keepstore on the source side._ 
 * Get each block from the source cluster exactly once, and write it to the destination cluster, using standard Keep APIs and algorithms (e.g., rendezvous hashing, checksum validation). _This can be done with the existing Keep SDKs._ 
 * Include a checkpointing mechanism so that if the process is interrupted, it has a record of which blocks have already been copied and doesn't re-send them. _In the implementation below, the Keep block index on the destination side serves as the checkpoint mechanism._ 
 * When writing a block on the destination side, use the destination cluster's default replication level, as given in the discovery document. 
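
The first three requirements amount to a set difference: the union of the source indexes, minus whatever the destination indexes already list. Below is a minimal sketch of that idea in Go, not the actual keep-rsync code. The function names @addIndex@ and @missingBlocks@ are hypothetical, and the sketch assumes each non-empty index line begins with a block locator followed by whitespace and a timestamp.

```go
package main

import (
	"bufio"
	"fmt"
	"sort"
	"strings"
)

// addIndex adds every locator listed in one keepstore index response to
// the given set. Calling it once per keepstore yields the union, so a
// block stored on several servers is still recorded only once.
// ASSUMPTION: each non-empty line is "<locator> <timestamp>".
func addIndex(blocks map[string]bool, indexText string) {
	scanner := bufio.NewScanner(strings.NewReader(indexText))
	for scanner.Scan() {
		fields := strings.Fields(scanner.Text())
		if len(fields) == 0 {
			continue // skip blank lines, e.g. an end-of-index marker
		}
		blocks[fields[0]] = true
	}
}

// missingBlocks returns, in sorted order, the locators present in src
// but absent from dst. The destination index acts as the checkpoint:
// anything already listed there is not copied again.
func missingBlocks(src, dst map[string]bool) []string {
	var todo []string
	for locator := range src {
		if !dst[locator] {
			todo = append(todo, locator)
		}
	}
	sort.Strings(todo)
	return todo
}

func main() {
	src := map[string]bool{}
	// Two hypothetical source keepstores that both hold the first block.
	addIndex(src, "acbd18db4cc2f85cedef654fccc4a4d8+3 1441000000\n")
	addIndex(src, "acbd18db4cc2f85cedef654fccc4a4d8+3 1441000005\n37b51d194a7513e45b56f6524f2d51f2+3 1441000001\n")

	dst := map[string]bool{}
	// The destination already has one block from an interrupted run.
	addIndex(dst, "acbd18db4cc2f85cedef654fccc4a4d8+3 1441000100\n")

	for _, locator := range missingBlocks(src, dst) {
		fmt.Println("to copy:", locator)
	}
}
```

Because the to-do list is recomputed from the two indexes on every run, restarting after an interruption needs no separate checkpoint file.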

 Possible future work (specifically excluded from the requirements here): 
 * Determine the desired replication level for each block by reading all collection records from the source cluster, and write to the destination cluster based on that information. (Until then, keep-rsync will use the destination cluster's default replication level, leaving further adjustments to the destination cluster's Data Manager after the database has been migrated.) 
 * Verify integrity of blocks that (according to the checkpoint/index data on the destination side) already exist on the destination side. For now, we assume that some other mechanism is responsible for ensuring corrupt blocks aren't listed in keepstore index responses. 

 h2. Implementation 

 keep-rsync will be written in Go. Source code will live in source:services/keep-rsync. Debian/RedHat packages, and the binaries they install, will be called keep-rsync. 
 * Accepts "src" and "dst" arguments and reads settings/conf files just like arv-copy. 
 * Accepts command line arguments for (or reads from settings files) source and destination data manager key and blob signing key. These are necessary to get all indexes and data blocks respectively. 
 * Accepts replication argument (default to whatever is advertised in "destination" discovery doc). 
 * Accepts a "prefix" argument that passes through to index requests on both sides. This makes it possible to divide the work into (e.g.) 16 independent jobs that can run concurrently, one for each leading hex digit. 
 * Gets indexes from the source and destination keepstores[1]. 
 * Gets data from source keepstores/keepproxy, stores in destination using configured replication level. 
 * Uses regular SDK functions to get and put blocks. 
 * Displays progress. 
 ** "getting indexes: 10... 9... [...]" (counts down the number of indexes still to fetch) 
 ** "copying data block 1 of 1234 (0% done, ETA 2m3s): acbd18db4cc2f85cedef654fccc4a4d8+3" 
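
Two of the items above lend themselves to a short illustration: splitting the work by prefix, and the progress message. The following is a sketch under stated assumptions, not the keep-rsync implementation. The helper names @hexPrefixes@ and @progressLine@ are hypothetical, and the ETA is a simple linear extrapolation from the average time per block copied so far, which the spec does not prescribe.

```go
package main

import (
	"fmt"
	"time"
)

// hexPrefixes returns the 16 one-digit hex prefixes "0" through "f".
// Running one job per prefix partitions the block space, because every
// locator starts with exactly one of these digits.
func hexPrefixes() []string {
	prefixes := make([]string, 0, 16)
	for _, c := range "0123456789abcdef" {
		prefixes = append(prefixes, string(c))
	}
	return prefixes
}

// progressLine formats a status message in the style shown in the spec.
// n is the 1-based number of the block about to be copied, and elapsed
// is the time spent copying the n-1 blocks before it.
// ASSUMPTION: the ETA is extrapolated from the average time per block,
// and is reported as "unknown" until at least one block has been copied.
func progressLine(n, total int, elapsed time.Duration, locator string) string {
	done := n - 1
	eta := "unknown"
	if done > 0 {
		perBlock := elapsed / time.Duration(done)
		remaining := perBlock * time.Duration(total-done)
		eta = remaining.Round(time.Second).String()
	}
	percent := 0
	if total > 0 {
		percent = 100 * done / total
	}
	return fmt.Sprintf("copying data block %d of %d (%d%% done, ETA %s): %s",
		n, total, percent, eta, locator)
}

func main() {
	fmt.Println("prefixes:", hexPrefixes())
	fmt.Println(progressLine(1, 1234, 0, "acbd18db4cc2f85cedef654fccc4a4d8+3"))
	fmt.Println(progressLine(618, 1234, 5*time.Minute, "37b51d194a7513e45b56f6524f2d51f2+3"))
}
```

Each per-prefix job would report progress against its own share of the index, so the totals shown by separate jobs are independent of one another.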

 h2. Usage example 

 How to use in a migration: 
 * Turn off data manager on destination cluster. 
 * Run keep-rsync. 
 * Disable access to source cluster. 
 * Dump database and restore to destination cluster. 
 * Run keep-rsync again. 
