Bug #9967

[keep-balance] Do not delete blocks referenced by collections with replication_desired=0

Added by Tom Clegg over 4 years ago. Updated about 4 years ago.

Status: New
Priority: Normal
Assigned To: Tom Morris
Category: Keep
Target version:
Start date:
Due date:
% Done: 0%
Estimated time:
Story points: -

Description

Background

Currently, the API server provides permission signatures for collections that have replication_desired=0. This means a client can:
  1. Write some data
  2. Create a collection "A" with replication_desired=0
  3. Wait until the blocks are old enough to be deleted by keep-balance
  4. Retrieve collection "A" and create a new collection "B" with the same manifest
  5. Change replication_desired on collection "A" to 2

After this, collections "A" and "B" both appear to hold safely stored data, yet they refer to blocks that keep-balance was allowed to delete.

(As long as the underlying storage devices don't fail, it should never be possible for a client to obtain a signed locator for a block that doesn't exist.)
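The sequence above can be sketched with a toy model. Everything here is hypothetical and only for illustration: the Collection class, the desired_replication and balance helpers, and the block name "blk1" are not Arvados code, block age is ignored, and collection "B" is assumed to be created with replication_desired=2.

```python
# Toy model only: these classes and names are hypothetical, not Arvados code.
from dataclasses import dataclass


@dataclass
class Collection:
    name: str
    blocks: frozenset
    replication_desired: int


def desired_replication(block, collections):
    """Highest replication_desired among collections referencing the block."""
    return max(
        (c.replication_desired for c in collections if block in c.blocks),
        default=0,
    )


def balance(stored, collections):
    """Return the blocks that survive a sweep (block age is ignored here)."""
    return {b for b in stored if desired_replication(b, collections) > 0}


stored = {"blk1"}                              # step 1: write some data
a = Collection("A", frozenset({"blk1"}), 0)    # step 2: replication_desired=0
collections = [a]
stored = balance(stored, collections)          # step 3: the block is deleted
b = Collection("B", a.blocks, 2)               # step 4: same manifest as "A"
collections.append(b)
a.replication_desired = 2                      # step 5

# Blocks that some collection references but that are no longer stored.
missing = {blk for c in collections for blk in c.blocks} - stored
```

In this model, missing ends up containing "blk1" even though both collections now ask for two replicas.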

Collections with replication_desired=0 might be useful, but proper support will include:
  • improving clients so they don't try to retrieve data from these collections
  • improving the API so it doesn't provide locator signatures for these collections

In the meantime, we should avoid situations where data seems to be safe but isn't.

Proposed fix

In keep-balance, when a collection has replication_desired=0, pretend it's 1.
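A minimal sketch of that clamp, assuming a block's desired replication is the highest value among the collections that reference it. The function names are hypothetical and this is not the actual keep-balance code.

```python
# Hypothetical helpers illustrating the proposed fix; not keep-balance source.
def effective_replication(replication_desired: int) -> int:
    """Treat replication_desired=0 as 1 so a referenced block is never deletable."""
    return max(replication_desired, 1)


def block_replication(collection_replications) -> int:
    """Desired replication for one block.

    collection_replications holds the replication_desired value of every
    collection that references the block. An unreferenced block gets 0.
    """
    return max(
        (effective_replication(r) for r in collection_replications),
        default=0,
    )
```

With this clamp, a block referenced only by a replication_desired=0 collection gets a desired replication of 1, while a block that no collection references still gets 0 and remains eligible for deletion.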

History

#1 Updated by Tom Clegg over 4 years ago

  • Description updated
  • Category set to Keep

#2 Updated by Tom Morris about 4 years ago

  • Assigned To set to Tom Morris
  • Target version set to Arvados Future Sprints
