Feature #15125
[keep-balance] [keepstore] Procedure to halt/reverse/investigate a suspected data loss incident
Start date:
Due date:
% Done: 0%
Estimated time:
Story points: -
Description
A site admin who suspects keep-balance is erroneously trashing data should be able to:
- act quickly to minimize the impact, and
- characterize the damage, if any
- immediately prevent keepstore from trashing or deleting any blocks while investigation/recovery proceeds
- untrash any blocks that might have been trashed erroneously (this may enable affected workflows to resume)
- get a list of missing block IDs
- get a list of collections that reference missing blocks (including uuid, pdh, name, project uuid, project name)
- report version in metrics (e.g., version{program="keep-balance", version="1.3.1"} = 1)
- report the number and size of trashed blocks in metrics
- keepstore "untrash all" management API
- keep-balance reporting option to get debug info for a list of specific collection IDs and block IDs (without getting the entire debug dump, which is huge)
- keep-block-check --collection=uuid_or_pdh
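
To illustrate the "list of collections that reference missing blocks" item, here is a minimal sketch in Go. It is not existing keep-balance or keepstore code. It assumes the missing block hashes and each collection's manifest text have already been obtained by some other means, and the type and function names (Collection, Affected, blockHashes, affectedCollections) are hypothetical. It relies only on the manifest format: each stream line is a stream name, then block locators (32-hex-digit MD5, "+size", optional "+hint" parts), then file tokens.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strings"
)

// Collection carries the fields the ticket asks to report.
// Hypothetical type: a real tool would fill it from the API server.
type Collection struct {
	UUID         string
	PDH          string
	Name         string
	ProjectUUID  string
	ProjectName  string
	ManifestText string
}

// Affected pairs a collection with the missing block hashes it references.
type Affected struct {
	Collection Collection
	Missing    []string
}

// locatorRe matches a block locator token: MD5 hash, "+size", optional hints.
var locatorRe = regexp.MustCompile(`^([0-9a-f]{32})\+\d+(\+\S+)?$`)

// blockHashes returns the set of block hashes referenced by a manifest.
func blockHashes(manifest string) map[string]bool {
	hashes := map[string]bool{}
	for _, line := range strings.Split(manifest, "\n") {
		for i, tok := range strings.Fields(line) {
			if i == 0 {
				continue // first token on a line is the stream name
			}
			if m := locatorRe.FindStringSubmatch(tok); m != nil {
				hashes[m[1]] = true
			}
		}
	}
	return hashes
}

// affectedCollections returns the collections that reference at least one
// hash in missing, with the matching hashes sorted for stable output.
func affectedCollections(colls []Collection, missing map[string]bool) []Affected {
	var out []Affected
	for _, c := range colls {
		var hits []string
		for h := range blockHashes(c.ManifestText) {
			if missing[h] {
				hits = append(hits, h)
			}
		}
		if len(hits) > 0 {
			sort.Strings(hits)
			out = append(out, Affected{Collection: c, Missing: hits})
		}
	}
	return out
}

func main() {
	// Placeholder data for illustration only; UUIDs and names are made up.
	colls := []Collection{
		{UUID: "zzzzz-4zz18-aaaaaaaaaaaaaaa", Name: "intact",
			ManifestText: ". acbd18db4cc2f85cedef654fccc4a4d8+3 0:3:foo.txt\n"},
		{UUID: "zzzzz-4zz18-bbbbbbbbbbbbbbb", Name: "damaged",
			ManifestText: ". 37b51d194a7513e45b56f6524f2d51f2+3 0:3:bar.txt\n"},
	}
	missing := map[string]bool{"37b51d194a7513e45b56f6524f2d51f2": true}
	for _, a := range affectedCollections(colls, missing) {
		fmt.Printf("%s %q missing=%v\n", a.Collection.UUID, a.Collection.Name, a.Missing)
	}
}
```

Matching on the hash alone, ignoring size and hints, means a signed or hinted locator still matches an entry in the missing list. Reading every manifest into memory may not scale to a large site, so a real tool would probably stream collections page by page.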
Related issues
History
#2
Updated by Tom Clegg almost 2 years ago
- Description updated (diff)
#3
Updated by Tom Morris almost 2 years ago
- Target version set to To Be Groomed
#4
Updated by Ward Vandewege 10 months ago
- Related to Story #16514: Actionable insight into keep usage added