Idea #8724

Updated by Radhika Chippada almost 8 years ago

Write a script to verify that the Keep services on a cluster have a good copy of every block in a given list. The use case: generate a list of existing md5sums before running keep-rsync, then run this script after keep-rsync is done to verify that everything was copied over correctly.

 The script takes as input: 

 * a file with md5sums to check, one per line 
 * a file that contains the Keep signing key 
 * API host+token (read from a specified file, as keep-rsync does) 
 * a JSON file in which the admin can specify the Keep services to check (as keep-rsync does) 
 * an optional prefix argument 

 For each block named in the list of md5sums, the script generates its own signature for the block and sends HEAD requests to the appropriate (e.g., non-proxy) Keep services in rendezvous hash order. If no Keep service returns 200 OK for a given block, the script reports that as an error, including the hash of the missing block. It should do this for every missing block (i.e., it must not abort at the first missing block). It exits nonzero if any blocks were not found. 
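 The check loop described above can be sketched roughly as follows. This is only an illustration under stated assumptions: @probe@ is a hypothetical callable that would send the signed HEAD request and return the HTTP status code, @report@ is a hypothetical error sink, and the weighting in @rendezvous_order@ (md5 of block hash plus service UUID, highest first) is a generic rendezvous-hash stand-in, not necessarily the exact ordering the Keep clients use.

```python
import hashlib
import sys


def rendezvous_order(block_hash, service_uuids):
    """Order services by descending md5(block_hash + uuid) weight.

    Assumption: a generic highest-random-weight scheme used for illustration.
    """
    def weight(uuid):
        return hashlib.md5((block_hash + uuid).encode()).hexdigest()

    return sorted(service_uuids, key=weight, reverse=True)


def check_blocks(block_hashes, service_uuids, probe, report=None):
    """Probe every block and return the hashes that no service confirmed.

    probe(uuid, block_hash) is a hypothetical callable returning an HTTP
    status code; only 200 counts as "found".
    """
    if report is None:
        # Default: write errors to stderr so stdout stays empty on success.
        report = lambda message: print(message, file=sys.stderr)

    missing = []
    for block_hash in block_hashes:
        # any() stops at the first service that answers 200 OK.
        found = any(
            probe(uuid, block_hash) == 200
            for uuid in rendezvous_order(block_hash, service_uuids)
        )
        if not found:
            # Keep going: every missing block is reported, not just the first.
            report(f"block not found: {block_hash}")
            missing.append(block_hash)
    return missing
```

 A caller would then finish with something like @sys.exit(1 if missing else 0)@, so the exit status is nonzero whenever at least one block was not found.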

 If a prefix argument is provided, the script only checks blocks that begin with that prefix, a la keep-rsync. 
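 Reading the md5sum list and applying the optional prefix could look something like the sketch below. The function name @load_block_hashes@ is hypothetical, and the choices to skip blank lines, lowercase the input, and reject anything that is not exactly 32 hex digits are assumptions rather than stated requirements.

```python
import re

# An md5sum is exactly 32 hexadecimal digits.
_MD5_RE = re.compile(r"[0-9a-f]{32}")


def load_block_hashes(lines, prefix=""):
    """Return the md5 hashes from `lines` that begin with `prefix`.

    `lines` is any iterable of strings, e.g. an open file with one
    md5sum per line. An empty prefix selects every block.
    """
    hashes = []
    for raw in lines:
        line = raw.strip().lower()
        if not line:
            continue  # assumption: blank lines are ignored
        if _MD5_RE.fullmatch(line) is None:
            raise ValueError(f"not an md5sum: {line!r}")
        if line.startswith(prefix):
            hashes.append(line)
    return hashes
```

 Because it accepts any iterable of lines, the function can be called as @load_block_hashes(open(path), prefix)@ in the tool and with a plain list of strings in tests.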

 The tool is called keep-block-check. It lives under @tools/@. For the switches that this tool accepts that keep-rsync also accepts (at least the signing key file and the prefix), this tool should use the same name for the switch that keep-rsync does. 

 Test that: 

 * It exits zero and writes nothing to stdout when it finds all the blocks. 
 * When the input includes a block that cannot be found, the tool exits nonzero and writes an error that mentions that the block cannot be found. Test this for cases where Keepstore returns: 
 ** 401 
 ** 404 
 ** 500 
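
 One way to exercise these cases is to point the check at a stub server that answers every HEAD request with a fixed status. The sketch below uses only the Python standard library; @make_stub@, @head_status@, @block_found@, and @run_case@ are hypothetical helper names, and the @/<hash>@ URL layout is an assumption about how a Keepstore is addressed. Request signing is left out for brevity.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


def make_stub(status):
    """Start a local server that answers every HEAD request with `status`."""
    class Handler(BaseHTTPRequestHandler):
        def do_HEAD(self):
            self.send_response(status)
            self.end_headers()

        def log_message(self, *args):
            pass  # keep test output quiet

    server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def head_status(url):
    """Send a HEAD request and return the HTTP status code."""
    # Bypass any proxy configured in the environment for the local stub.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener.open(request, timeout=5) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # 4xx and 5xx responses arrive as exceptions


def block_found(base_url, block_hash):
    """A block counts as found only when the service returns 200 OK."""
    return head_status(f"{base_url}/{block_hash}") == 200


def run_case(status, block_hash="d41d8cd98f00b204e9800998ecf8427e"):
    """Run one check against a stub that always returns `status`."""
    server = make_stub(status)
    try:
        host, port = server.server_address
        return block_found(f"http://{host}:{port}", block_hash)
    finally:
        server.shutdown()
        server.server_close()
```

 With these helpers, a test can expect @run_case(200)@ to report the block as found and @run_case(401)@, @run_case(404)@, and @run_case(500)@ to report it as missing.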
