Bug #9363

open

[keep-balance] Avoid deleting recently-referenced blocks

Added by Tom Clegg over 8 years ago. Updated 10 months ago.

Status: New
Priority: Normal
Assigned To: -
Category: Keep
Target version:
Story points: 1.0
Release:
Release relationship: Auto

Description

The following sequence can break the invariant Keep relies on for safe garbage collection [1]:
  1. Write data blocks
  2. Save collection
  3. Wait for original blob signature TTLs (from step 1) to expire
  4. Retrieve the collection (setting aside a copy in memory)
  5. Update the collection to remove references to some (or all) blocks
  6. Wait for keep-balance to do a garbage-collection sweep
  7. Update the collection (or save a new one) using the blocks/signatures set aside in step 4

If the only reference to a block is the one in this collection, the block may be deleted during step 6, resulting in a collection being saved successfully in step 7 even though some of its data is missing.
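The sequence above can be replayed against a toy model. This is only a sketch: `ToyKeep`, `BLOB_SIGNATURE_TTL = 100`, and the integer clock are hypothetical stand-ins, not the real keepstore or keep-balance logic. It assumes a sweep deletes any block that is both unreferenced and older than the TTL, and that saving a collection does not re-check that its blocks still exist.

```python
from dataclasses import dataclass, field

BLOB_SIGNATURE_TTL = 100  # hypothetical TTL, in arbitrary time units


@dataclass
class ToyKeep:
    """Toy stand-in for a Keep cluster (assumption: not the real implementation)."""
    block_mtime: dict = field(default_factory=dict)  # block hash -> last write time
    collections: dict = field(default_factory=dict)  # collection id -> set of block hashes

    def write_block(self, block, now):
        self.block_mtime[block] = now

    def save_collection(self, coll_id, blocks):
        # Assumption: saving trusts the client's signatures and does not
        # verify that the referenced blocks are still stored.
        self.collections[coll_id] = set(blocks)

    def gc_sweep(self, now):
        # Delete blocks that no collection references and that are older than the TTL.
        referenced = set().union(*self.collections.values())
        for block, mtime in list(self.block_mtime.items()):
            if block not in referenced and now - mtime >= BLOB_SIGNATURE_TTL:
                del self.block_mtime[block]

    def missing_blocks(self, coll_id):
        return {b for b in self.collections[coll_id] if b not in self.block_mtime}


def run_scenario():
    keep = ToyKeep()
    keep.write_block("b1", now=0)             # step 1: write data blocks
    keep.save_collection("c1", {"b1"})        # step 2: save collection
    now = BLOB_SIGNATURE_TTL + 1              # step 3: original signatures have expired
    set_aside = set(keep.collections["c1"])   # step 4: retrieve, keep a copy in memory
    keep.save_collection("c1", set())         # step 5: remove the references
    keep.gc_sweep(now)                        # step 6: garbage-collection sweep
    keep.save_collection("c1", set_aside)     # step 7: save again from the copy
    return keep.missing_blocks("c1")
```

In this model `run_scenario()` reports `b1` as missing: the save in step 7 succeeds even though the sweep in step 6 already deleted the block.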

Proposed solution:

When updating a collection, detect block references that are being dropped [2]; if there are any, save a temporary collection (trashed, readable only by admin) referencing all of the dropped blocks. These temporary collections will be cleaned up when they expire; in the meantime keep-balance will see them, which protects the dropped blocks until even their new signatures (issued in step 4) expire.
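The detection step could look roughly like the following. This is an illustrative Python sketch, not the API server's own code. It assumes block locators in a manifest have the form `<32 hex digits>+<size>` followed by optional `+hint` parts. The function names, the `dropped_blocks` file name, and the returned dictionary are hypothetical.

```python
import re

# Assumed locator shape: md5 hash, "+size", then optional "+hint" parts.
LOCATOR_RE = re.compile(r"^([0-9a-f]{32})\+(\d+)(?:\+\S+)?$")


def unsigned_locators(manifest_text):
    """Map block hash -> 'hash+size' for every locator token in a manifest."""
    found = {}
    for token in manifest_text.split():
        m = LOCATOR_RE.match(token)
        if m:
            found[m.group(1)] = f"{m.group(1)}+{m.group(2)}"
    return found


def placeholder_for_dropped(old_manifest, new_manifest):
    """Describe a temporary collection covering blocks dropped by an update.

    Returns None when the update drops no block references.
    """
    old = unsigned_locators(old_manifest)
    new = unsigned_locators(new_manifest)
    dropped = [old[h] for h in sorted(old.keys() - new.keys())]
    if not dropped:
        return None
    total = sum(int(loc.split("+")[1]) for loc in dropped)
    return {
        # One stream holding every dropped block as a single hypothetical file.
        "manifest_text": f". {' '.join(dropped)} 0:{total}:dropped_blocks\n",
        # Hypothetical attribute: the proposal calls for a trashed collection.
        "is_trashed": True,
    }
```

Blocks are compared by hash only, so a locator whose signature was merely refreshed does not count as dropped.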

[1] If any client has a signed block locator with a future timestamp, that block either (i) is newer than BlobSignatureTTL according to the timestamp stored on the backend, and therefore won't be deleted; or (ii) appears in a collection record where keep-balance can see it, and therefore won't be deleted.
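The invariant can also be written as a small predicate. This is a sketch under assumptions: the signature hint is taken to look like `+A<hex signature>@<hex expiry timestamp>`, and the helper names and parameters (`block_mtime`, `ttl`, `referenced`) are hypothetical inputs that a real check would have to obtain from the storage backend and the collections table.

```python
import re

# Assumed signature hint shape: "+A<hex signature>@<hex Unix expiry time>".
SIG_HINT_RE = re.compile(r"\+A[0-9a-f]+@([0-9a-f]+)")


def signature_expiry(locator):
    """Return the expiry timestamp encoded in a signed locator, or None if unsigned."""
    m = SIG_HINT_RE.search(locator)
    return int(m.group(1), 16) if m else None


def invariant_holds(locator, now, block_mtime, ttl, referenced):
    """Check the invariant for one locator held by a client.

    block_mtime: timestamp stored on the backend for the block.
    ttl:         BlobSignatureTTL, in the same units as the timestamps.
    referenced:  True if a collection visible to keep-balance contains the block.
    """
    expiry = signature_expiry(locator)
    if expiry is None or expiry <= now:
        return True  # no live signature, so the invariant promises nothing
    # A live signature requires case (i) or case (ii) to hold.
    return (now - block_mtime) < ttl or referenced
```

In the failing sequence, the locator set aside in step 4 still has a future expiry after step 5, yet the block is both old and unreferenced, so the predicate returns `False`.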

[2] This might be done during the (existing) strip_signatures_and_update_replication_confirmed hook where we detect whether references to new blocks have been added, so we know whether to reset the replication_confirmed columns.


Related issues 1 (1 open, 0 closed)

Related to Arvados - Feature #4650: [API] API method and CLI shortcut for refreshing the signatures on some block locators (without creating a collection) (New)
#1

Updated by Tom Clegg over 5 years ago

  • Subject changed from [keep-balance] Avoid deleting recently-referenced blocks (based on data in logs table) to [keep-balance] Avoid deleting recently-referenced blocks
  • Description updated (diff)
  • Category set to Keep
#2

Updated by Tom Clegg over 5 years ago

  • Description updated (diff)
#3

Updated by Tom Clegg over 5 years ago

  • Target version set to To Be Groomed
#4

Updated by Tom Morris over 5 years ago

What current client(s) exercise(s) this scenario? I can't think of a circumstance in which it would occur.

#5

Updated by Tom Morris over 5 years ago

  • Target version changed from To Be Groomed to Arvados Future Sprints
#6

Updated by Peter Amstutz over 3 years ago

  • Target version deleted (Arvados Future Sprints)
#7

Updated by Peter Amstutz almost 2 years ago

  • Release set to 60
#8

Updated by Peter Amstutz 10 months ago

  • Target version set to Future