Bug #9967

[keep-balance] Do not delete blocks referenced by collections with replication_desired=0

Added by Tom Clegg over 4 years ago. Updated about 4 years ago.

Assigned To: Tom Morris

Currently, the API server provides permission signatures for collections that have replication_desired=0. This means a client can:
  1. Write some data
  2. Create a collection "A" with replication_desired=0
  3. Wait until the blocks are old enough to be deleted by keep-balance
  4. Retrieve collection "A" and create a new collection "B" with the same manifest
  5. Change replication_desired on collection "A" to 2

After this, collections "A" and "B" both appear to be stored safely, but they refer to blocks which keep-balance was allowed to delete.

(As long as the underlying storage devices don't fail, it should never be possible for a client to obtain a signed locator for a block that doesn't exist.)
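
The sequence above can be reproduced with a toy model. This is only an illustrative Python sketch, not keep-balance code: the `Collection` class, the `balance` function, the block names, and the replication level chosen for collection "B" are all hypothetical, and block age is ignored (every block is assumed to be old enough to delete).

```python
from dataclasses import dataclass


@dataclass
class Collection:
    # Hypothetical stand-in for a collection record.
    name: str
    blocks: frozenset
    replication_desired: int


def desired_replication(block, collections):
    """Highest replication_desired among collections referencing the block."""
    return max(
        (c.replication_desired for c in collections if block in c.blocks),
        default=0,
    )


def balance(stored_blocks, collections):
    """Toy balancing pass: keep only blocks that some collection wants stored."""
    return {b for b in stored_blocks if desired_replication(b, collections) > 0}


# Steps 1-2: write data, create collection "A" with replication_desired=0.
stored = {"block1", "block2"}
a = Collection("A", frozenset(stored), replication_desired=0)
collections = [a]

# Step 3: a balancing pass runs once the blocks are old enough to delete.
stored = balance(stored, collections)

# Step 4: copy the manifest of "A" into a new collection "B"
# (replication_desired=2 for "B" is an assumption made for this sketch).
b = Collection("B", a.blocks, replication_desired=2)
collections.append(b)

# Step 5: raise replication_desired on "A".
a.replication_desired = 2

# Both collections now ask for 2 replicas of blocks that no longer exist.
missing = (a.blocks | b.blocks) - stored
print(sorted(missing))  # ['block1', 'block2']
```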

Collections with replication_desired=0 might be useful, but proper support will include:
  • improving clients so they don't try to retrieve data from these collections
  • improving API so it doesn't provide locator signatures for these collections

In the meantime, we should avoid situations where data seems to be safe but isn't.

Proposed fix

In keep-balance, when a collection has replication_desired=0, pretend it's 1.
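
A minimal sketch of that rule, written in Python purely for illustration (it is not the keep-balance implementation, and the function names `effective_replication` and `block_replication` are hypothetical). It assumes a block's desired replication is the maximum over all collections that reference it, and that blocks referenced by no collection remain eligible for deletion.

```python
def effective_replication(replication_desired: int) -> int:
    """Treat replication_desired=0 as 1, as in the proposed fix."""
    return 1 if replication_desired == 0 else replication_desired


def block_replication(referencing_levels):
    """Desired replication for one block.

    referencing_levels holds the replication_desired value of every
    collection that references the block (empty if unreferenced).
    """
    return max((effective_replication(r) for r in referencing_levels), default=0)


print(block_replication([0]))     # 1: referenced only by a replication_desired=0 collection, kept
print(block_replication([]))      # 0: unreferenced, still eligible for deletion
print(block_replication([0, 3]))  # 3: the highest requested level wins
```

With this rule, a block is never dropped to zero replicas while any collection still references it, so the signed locators handed out for such a collection keep pointing at data that exists.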


#1 Updated by Tom Clegg over 4 years ago

  • Description updated (diff)
  • Category set to Keep

#2 Updated by Tom Morris about 4 years ago

  • Assigned To set to Tom Morris
  • Target version set to Arvados Future Sprints
