Feature #9364

open

[keep-balance] "Expedited delete" tool: perform garbage collection on some specific (recently deleted) collections, bypassing usual GC race protections

Added by Tom Clegg almost 8 years ago. Updated about 2 months ago.

Status: New
Priority: Normal
Assigned To: -
Category: Keep
Target version:
Story points: 3.0
Release:
Release relationship: Auto

Description

See Expedited delete

Given some PDHes of trashed collections...

For each PDH:
  • Check for an un-trashed collection with this PDH. If one exists, show a warning: this means "expedited delete" for this collection is a no-op.
  • Get the manifest text for this PDH from the trash.
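The per-PDH steps above can be sketched in Go roughly as follows. This is only an illustration under stated assumptions: the `Collection` struct, the `lookupFunc` hook, and `gatherTrashedManifests` are hypothetical names, not part of keep-balance or the Arvados SDK, and a real tool would query the API server instead of an in-memory map.

```go
package main

import "fmt"

// Collection is a pared-down, hypothetical stand-in for an Arvados
// collection record; only the fields this sketch needs are included.
type Collection struct {
	PDH          string
	ManifestText string
	IsTrashed    bool
}

// lookupFunc is a hypothetical hook that returns every collection
// (trashed or not) with the given portable data hash.
type lookupFunc func(pdh string) []Collection

// gatherTrashedManifests returns the manifest text found in the trash for
// each PDH, plus one warning per PDH that still has an un-trashed copy.
func gatherTrashedManifests(lookup lookupFunc, pdhs []string) (map[string]string, []string) {
	manifests := make(map[string]string)
	var warnings []string
	for _, pdh := range pdhs {
		warned := false
		for _, c := range lookup(pdh) {
			switch {
			case !c.IsTrashed && !warned:
				warnings = append(warnings, fmt.Sprintf(
					"%s: an un-trashed collection exists; expedited delete is a no-op", pdh))
				warned = true
			case c.IsTrashed:
				manifests[pdh] = c.ManifestText
			}
		}
	}
	return manifests, warnings
}

func main() {
	// Made-up sample data; the PDH strings are placeholders, not real hashes.
	manifest := ". acbd18db4cc2f85cedef654fccc4a4d8+3 0:3:foo.txt\n"
	db := map[string][]Collection{
		"pdh-a": {{PDH: "pdh-a", ManifestText: manifest, IsTrashed: true}},
		"pdh-b": {
			{PDH: "pdh-b", ManifestText: manifest, IsTrashed: true},
			{PDH: "pdh-b", ManifestText: manifest, IsTrashed: false},
		},
	}
	lookup := func(pdh string) []Collection { return db[pdh] }
	manifests, warnings := gatherTrashedManifests(lookup, []string{"pdh-a", "pdh-b"})
	fmt.Println(len(manifests), "manifests retrieved")
	for _, w := range warnings {
		fmt.Println("warning:", w)
	}
}
```

The sketch keeps going after a warning rather than aborting, since the description only asks for a warning to be shown.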
Then perform a keep-balance operation, with the following changes:
  • Don't compute changes for all blocks -- only the ones appearing in the manifests retrieved above.
  • When checking trashed collections for recently referenced blocks (see #9363), skip collections with any of the supplied PDHes.
  • When deciding whether to delete blocks, use the most recent timestamp of the collections being deleted rather than {now minus signatureTTL} as the race window threshold.
  • Use the (new) "synchronous delete, ignoring timestamp" feature of keepstore instead of sending a trash list.
  • Don't process any "pull" operations.
  • If any blocks in the deleted collections are still referenced by other collections (either trashed or un-trashed), log the PDHes of the collections that prevent the data from being fully deleted.
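As a rough sketch of the decision logic in the list above (restricting work to the listed blocks, skipping collections with the supplied PDHes, using the newest collection timestamp as the race threshold, and reporting blockers), something like the following Go code could apply. All names here (`Collection`, `Plan`, `planExpeditedDelete`, `blockHashes`) are hypothetical, the choice of `ModifiedAt` as "the timestamp" is an assumption, and the manifest parsing is deliberately simplified. The keepstore "synchronous delete" call is not shown because that feature does not exist yet.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strings"
	"time"
)

// Collection is a hypothetical, pared-down collection record.
// ModifiedAt stands in for whichever timestamp the tool ends up using.
type Collection struct {
	PDH          string
	ManifestText string
	ModifiedAt   time.Time
}

// Plan lists block hashes to delete and, for blocks that must be kept,
// the PDHes of the collections that still reference them.
type Plan struct {
	Delete   []string
	Blockers map[string][]string
}

// locatorRe matches manifest tokens that look like block locators
// (32 hex digits, "+", size). Simplified; hints after the size are ignored.
var locatorRe = regexp.MustCompile(`^[0-9a-f]{32}\+\d+`)

// blockHashes returns the distinct block hashes named in a manifest.
func blockHashes(manifest string) []string {
	seen := map[string]bool{}
	var out []string
	for _, tok := range strings.Fields(manifest) {
		if locatorRe.MatchString(tok) && !seen[tok[:32]] {
			seen[tok[:32]] = true
			out = append(out, tok[:32])
		}
	}
	return out
}

func planExpeditedDelete(targets, others []Collection, blockMtime map[string]time.Time) Plan {
	var threshold time.Time // newest timestamp among the collections being deleted
	candidates := map[string]bool{}
	targetPDH := map[string]bool{}
	for _, c := range targets {
		targetPDH[c.PDH] = true
		if c.ModifiedAt.After(threshold) {
			threshold = c.ModifiedAt
		}
		for _, h := range blockHashes(c.ManifestText) {
			candidates[h] = true
		}
	}
	plan := Plan{Blockers: map[string][]string{}}
	for _, c := range others {
		if targetPDH[c.PDH] {
			continue // collections with a supplied PDH do not protect blocks
		}
		for _, h := range blockHashes(c.ManifestText) {
			if candidates[h] {
				plan.Blockers[h] = append(plan.Blockers[h], c.PDH)
			}
		}
	}
	for h := range candidates {
		mtime, known := blockMtime[h]
		// Delete only unreferenced blocks last written at or before the threshold.
		if known && len(plan.Blockers[h]) == 0 && !mtime.After(threshold) {
			plan.Delete = append(plan.Delete, h)
		}
	}
	sort.Strings(plan.Delete)
	return plan
}

// samplePlan builds a plan from made-up data: one trashed target with two
// blocks, one of which is still referenced by another collection.
func samplePlan() Plan {
	const blockFoo = "acbd18db4cc2f85cedef654fccc4a4d8"
	const blockBar = "37b51d194a7513e45b56f6524f2d51f2"
	ts := time.Date(2016, 6, 1, 12, 0, 0, 0, time.UTC)
	targets := []Collection{{PDH: "pdh-target", ModifiedAt: ts,
		ManifestText: ". " + blockFoo + "+3 " + blockBar + "+3 0:6:foobar.txt\n"}}
	others := []Collection{{PDH: "pdh-other", ModifiedAt: ts.Add(-24 * time.Hour),
		ManifestText: ". " + blockBar + "+3 0:3:bar.txt\n"}}
	mtimes := map[string]time.Time{blockFoo: ts.Add(-time.Hour), blockBar: ts.Add(-time.Hour)}
	return planExpeditedDelete(targets, others, mtimes)
}

func main() {
	plan := samplePlan()
	fmt.Println("delete:", plan.Delete)
	for h, pdhs := range plan.Blockers {
		fmt.Printf("kept %s: still referenced by %v\n", h, pdhs)
	}
}
```

In the sample data, the block shared with the other collection is held back and its referencing PDH is recorded for logging, while the unshared block is scheduled for deletion. No "pull" operations appear anywhere in the plan, in line with the list above.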

This could be presented as a separate command ("keep-force-delete"?) if keep-balance is refactored into a module. Alternatively, it could be a runtime option for keep-balance. It will use the same configuration file as keep-balance (along with other options).


Related issues

Related to Arvados - Idea #9278: [Crunch2] Document/fix handling of collections with non-nil expires_at field (In Progress)