Recovering lost data

Tom Clegg, 04/26/2019 08:08 PM

Untrashing lost blocks

In some cases it is possible to recover data blocks that have been trashed by keep-balance (due to a bug like #15148, or an install/config error).

If you suspect blocks have been trashed erroneously, you should immediately:
  1. On all keepstore servers: set EmptyTrashInterval to a long duration such as 2400h, so that trashed blocks are not deleted while you investigate
  2. On all keepstore servers: restart keepstore
  3. Stop the keep-balance service
When you think you have corrected the underlying problem, you should:
  1. Set LostBlocksFile to a suitable value (perhaps "/tmp/keep-balance-lost-blocks.txt") in your keep-balance config
  2. Start keep-balance

After keep-balance completes its first sweep, inspect /tmp/keep-balance-lost-blocks.txt (or whichever path you set as LostBlocksFile). If it is not empty, you can ask every keepstore server to untrash any blocks that are still recoverable with a script like this:

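set -e

# see Client.AuthToken in /etc/arvados/keep-balance/keep-balance.yml
# (set token to that value before running; the script aborts if it is unset)
: "${token:?set token to the Client.AuthToken value}"

# all keep server hostnames
hosts=(keep0 keep1 keep2 keep3 keep4 keep5)

# the first field on each line is the block hash; the rest is ignored here
while read -r hash pdhs; do
    echo "${hash}"
    for h in "${hosts[@]}"; do
        # -f makes curl return non-zero on an HTTP error, so "ok" is printed
        # only for servers that accepted the untrash request
        if curl -fgs -H "Authorization: Bearer $token" -X PUT "http://${h}:25107/untrash/$hash"; then
            echo "${hash} ok ${h}"
        fi
    done
done < /tmp/keep-balance-lost-blocks.txt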
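
Before running an untrash pass, it can help to see how much keep-balance reported as lost. The following is a minimal sketch, not part of Arvados: summarize_lost_blocks is a hypothetical helper name, and it assumes each line of the lost-blocks file holds a block hash followed by the portable data hashes of the collections that reference it (the layout suggested by "read hash pdhs" in the script above).

```shell
# Hypothetical helper: summarize a keep-balance lost-blocks file.
# Assumed line format: "<block hash> <pdh> <pdh> ..."
summarize_lost_blocks() {
    local file="$1"
    local blocks collections

    # Count non-empty lines: one lost block per line.
    blocks=$(awk 'NF > 0 { n++ } END { print n + 0 }' "$file")

    # Count distinct values in fields 2..NF: the collections affected.
    collections=$(awk '
        { for (i = 2; i <= NF; i++) seen[$i] = 1 }
        END { n = 0; for (k in seen) n++; print n }
    ' "$file")

    echo "lost blocks: ${blocks}, affected collections: ${collections}"
}
```

For example, running summarize_lost_blocks /tmp/keep-balance-lost-blocks.txt before and after an untrash pass (each time after keep-balance has completed a fresh sweep) gives a rough indication of how many blocks were recovered.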
