Tom Clegg, 06/01/2016 08:07 PM
Expiring collections
Overview
Deleting a collection is not an instantaneous operation. Rather, a collection can be set to expire at some future time. Until that time arrives, its data blocks are still considered valuable: a client can "recover from trash" by clearing the expiry flag.
This addresses (at least) three desirable features:
A client should be able to undo a "delete collection" operation that was done by a different client. For example, it should be possible to delete a collection using arv-mount, then recover it using Workbench.
Automated processes need temp/scratch space: a mechanism to protect data temporarily from the garbage collector, without cluttering any user's account. Arvados should not require applications to do things like make "temp" subprojects and set timers to clean up old data.
It should not be possible to do a series of collection operations that results in "lost" blocks. Example:
- Get old collection A (with signed manifest)
- Delete old collection A
- (garbage collector runs now)
- Create new collection B (using the signed manifest from collection A)
Background: existing race window
Keep's garbage collection strategy relies on a "race window": new unreferenced data cannot be deleted, because there is necessarily a time interval between getting a signature from a Keep server (by writing the data) and using that signature to add the block to a collection.
A timestamp signature from a keepstore server means "this data will not be deleted until the given timestamp": before giving out a signature, keepstore updates the mtime of the block on disk, and (even if asked by datamanager/keep-balance) refuses to delete blocks that are too new. This means the API server can safely store a collection without checking whether the referenced data blocks actually exist: if the timestamps are current, the blocks can't have been garbage-collected.
The expires_at behavior described here should help the API server offer a similar guarantee ("a signature expiring at time T means the data will not be deleted until T").
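As a rough illustration of the keepstore side of this guarantee, the check below refuses to delete a block while any signature issued at or before its last mtime update could still be valid. This is a minimal sketch, not keepstore's actual code: the function names, the two-week TTL value, and the reading of "too new" as "mtime plus blobSignatureTTL has not yet passed" are all assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical value; the real TTL is a cluster configuration setting.
BLOB_SIGNATURE_TTL = timedelta(weeks=2)


def signature_expiry(issued_at: datetime) -> datetime:
    """Expiry timestamp carried by a signature issued at `issued_at`."""
    return issued_at + BLOB_SIGNATURE_TTL


def may_delete_block(block_mtime: datetime, now: datetime) -> bool:
    """Allow deletion only once every signature that could have been issued
    at or before the block's last mtime update has expired (assumed rule)."""
    return now >= signature_expiry(block_mtime)
```

Because the mtime is refreshed before each signature is handed out, a block whose signature is still current can never pass this check.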
Interpreting expires_at
Each collection has an expires_at field.
| expires_at | significance | get (pdh) | get (uuid) | appears in default list | can appear in list when filtering by expires_at |
| null       | persistent   | yes       | yes        | yes                     | yes                                             |
| >now       | expiring     | yes(*)    | yes(*)     | no(**)                  | yes                                             |
| <=now      | expired      | no        | no         | no                      | no                                              |
(*) If expires_at is not null, any signatures given in a get/list response must expire before expires_at.
(**) Change to "yes" after updating clients (arv-mount and Workbench) to behave appropriately, i.e., either use an expires_at filter when requesting collection lists, or skip over them in default views.
Expired collections are effectively deleted (whether/when the system deletes the rows from the underlying database table is an implementation detail).
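The table and footnote (*) can be restated as a few small predicates. This is a sketch under assumptions: the function names are made up for the example, and "must expire before expires_at" is interpreted here as "no later than expires_at".

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def expiry_state(expires_at: Optional[datetime], now: datetime) -> str:
    """Classify a collection according to the expires_at table."""
    if expires_at is None:
        return "persistent"
    return "expiring" if expires_at > now else "expired"


def is_readable(expires_at: Optional[datetime], now: datetime) -> bool:
    """get (pdh) and get (uuid) succeed unless the collection has expired."""
    return expiry_state(expires_at, now) != "expired"


def appears_in_default_list(expires_at: Optional[datetime], now: datetime) -> bool:
    """Only persistent collections appear when no expires_at filter is given."""
    return expiry_state(expires_at, now) == "persistent"


def signature_expiry(expires_at: Optional[datetime], now: datetime,
                     blob_signature_ttl: timedelta) -> datetime:
    """Footnote (*): never sign past the collection's own expiry time."""
    wanted = now + blob_signature_ttl
    return wanted if expires_at is None else min(wanted, expires_at)
```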
Updating expires_at
When a client makes a DELETE request, the collection should not be deleted outright. Instead, its expires_at time should be set to (now + max(defaultExpiryWindow, blobSignatureTTL)).
[...] expires_at [...] and sooner than now+blobSignatureTTL.
- It might be worth having a convenient way for clients to say "now()+defaultExpiryWindow" and "as soon as possible".
- Should "expires_at is too soon" be an error, or should we just use the earliest allowed time in that case?
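The DELETE rule and the two candidate answers to the "too soon" question can be sketched as follows. The function names and the `strict` flag are hypothetical, and the sketch assumes that now+blobSignatureTTL is the earliest expiry time a client may request.

```python
from datetime import datetime, timedelta, timezone


def expiry_for_delete(now: datetime, default_expiry_window: timedelta,
                      blob_signature_ttl: timedelta) -> datetime:
    """expires_at assigned when a client makes a DELETE request."""
    return now + max(default_expiry_window, blob_signature_ttl)


def resolve_requested_expiry(requested: datetime, now: datetime,
                             blob_signature_ttl: timedelta,
                             strict: bool) -> datetime:
    """Handle a client-supplied expires_at.

    strict=True  -> treat "too soon" as an error.
    strict=False -> silently use the earliest allowed time instead.
    """
    earliest = now + blob_signature_ttl
    if requested >= earliest:
        return requested
    if strict:
        raise ValueError("expires_at is too soon")
    return earliest
```

With the non-strict option, "as soon as possible" could be expressed by requesting any time up to and including now.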
Unique name index
After deleting a collection named "foo", it must be possible to create a new collection named "foo" in the same project without a name collision.
Two possible approaches:
- When expiring a collection, stash the original name somewhere and change its name to something unique (e.g., incorporating uuid and timestamp).
- Convert the database index to a partial index, so names only have to be unique among non-deleted items. (Disadvantage: arv-mount will not (always) be able to use the "name" field of an expiring collection as its filename in a trash directory.)
- It may help here to add the "ensure_unique_name" feature to the "update" method (currently it is only available in "create").
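The first approach (rename on expiry) might look something like the sketch below. The name format, the `original_name` key used to stash the old name, and the function name are all invented for illustration; the only requirement taken from the text is that the new name incorporates the uuid and a timestamp.

```python
from datetime import datetime, timezone
from typing import Dict, Tuple


def rename_for_trash(name: str, uuid: str,
                     trashed_at: datetime) -> Tuple[str, Dict[str, str]]:
    """Return a unique replacement name plus a stash holding the original.

    `trashed_at` is expected to be a timezone-aware datetime.
    """
    stamp = trashed_at.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    unique_name = f"{name} [{uuid} {stamp}]"
    return unique_name, {"original_name": name}
```

Since the uuid is part of the new name, two trashed collections that were both called "foo" cannot collide with each other or with a newly created "foo".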
Client behavior
Workbench should not normally display collections with (expires_at is not null). A "view trash" feature would be useful, though.
arv-mount should not normally list collections with (expires_at is not null). A "trash directory" feature would be useful, though.
datamanager/keep-balance must not delete data blocks that are referenced by any collection with (expires_at is null or expires_at>now).
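The datamanager/keep-balance rule amounts to a set difference. The sketch below models collections as plain records holding a set of block hashes, which is a simplification made for the example (real collections carry manifests), and it deliberately ignores the race window and replication_desired.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import FrozenSet, Iterable, Optional, Set


@dataclass(frozen=True)
class CollectionRecord:
    """Hypothetical, simplified view of a collection row."""
    blocks: FrozenSet[str]
    expires_at: Optional[datetime] = None


def protects_blocks(c: CollectionRecord, now: datetime) -> bool:
    """True for persistent and expiring collections, false for expired ones."""
    return c.expires_at is None or c.expires_at > now


def deletable_blocks(all_blocks: Set[str],
                     collections: Iterable[CollectionRecord],
                     now: datetime) -> Set[str]:
    """Blocks not referenced by any persistent or expiring collection."""
    protected: Set[str] = set()
    for c in collections:
        if protects_blocks(c, now):
            protected |= c.blocks
    return all_blocks - protected
```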
Collection modifications vs. consistency
In order to guarantee "permission signature timestamp T == no garbage collection until T", garbage collection must take into account blocks that were recently referenced by collections.
This guarantee is fundamentally at odds with an important admin feature, "expedited delete": an admin should have a mechanism to accelerate garbage collection. Ideally, this action can be restricted to the blocks from a specific deleted collection.
Two possible approaches:
Approach 1: Make an expiring collection whenever a collection is modified such that a block disappears from its manifest.
An expiring collection created this way:
- ...does not need to include all blocks from the original manifest -- only the ones that are disappearing from the original.
- ...is not necessarily visible to the user -- it just has to be visible to datamanager/keep-balance. (If it is also visible to the user, it could support an "undo modification" feature.)
- ...should have expires_at set to the modified collection's original expires_at -- or (now()+blobSignatureTTL) if that is null.
Here, the definition of "disappear" must include all blocks in a collection whose replication_desired is changing from non-zero to zero.
"Expedited delete" could be accomplished by deleting the placeholder collection.
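Approach 1 can be sketched as a function that decides, for each update, whether a placeholder is needed and what it should contain. Collections are again reduced to sets of block hashes, replication_desired is treated as a plain integer, and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import FrozenSet, Iterable, Optional, Tuple


def disappearing_blocks(old_blocks: Iterable[str], new_blocks: Iterable[str],
                        old_replication: int,
                        new_replication: int) -> FrozenSet[str]:
    """Blocks that lose this collection's protection as a result of an update."""
    if old_replication != 0 and new_replication == 0:
        # replication_desired going non-zero -> zero drops every block.
        return frozenset(old_blocks)
    return frozenset(old_blocks) - frozenset(new_blocks)


def placeholder_for_update(old_blocks: Iterable[str], new_blocks: Iterable[str],
                           old_expires_at: Optional[datetime],
                           old_replication: int, new_replication: int,
                           now: datetime, blob_signature_ttl: timedelta
                           ) -> Optional[Tuple[FrozenSet[str], datetime]]:
    """Return (blocks, expires_at) for the placeholder, or None if not needed."""
    gone = disappearing_blocks(old_blocks, new_blocks,
                               old_replication, new_replication)
    if not gone:
        return None
    if old_expires_at is not None:
        return gone, old_expires_at
    return gone, now + blob_signature_ttl
```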
Approach 2: Datamanager/keep-balance uses arvados.v1.logs.index to get the "old" version of each manifest that has been changed or deleted recently (<= blobSignatureTTL seconds ago).
The virtue of this approach is that it avoids storing yet another copy of each collection manifest.
However, this might make it more awkward to support "expedited delete". The admin/tool would have to either:
- modify the logs table to backdate/remove the evidence, or
- set a datamanager/keep-balance flag to ignore [log entries for] specific collection UUIDs, or
- reduce the blobSignatureTTL for the whole system (this causes operations in progress to fail, but is the only solution that prevents the creation of collections that reference blocks that have already been deleted).
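Approach 2, together with the "ignore specific collection UUIDs" option for expedited delete, might be sketched like this. The `LogEntry` shape is a made-up stand-in for whatever the logs index returns (it is not the real log record format), and old manifests are reduced to sets of block hashes.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import FrozenSet, Iterable, Set


@dataclass(frozen=True)
class LogEntry:
    """Hypothetical summary of one update/delete log record."""
    collection_uuid: str
    event_at: datetime
    old_blocks: FrozenSet[str]


def recently_referenced_blocks(entries: Iterable[LogEntry], now: datetime,
                               blob_signature_ttl: timedelta,
                               ignored_uuids: FrozenSet[str] = frozenset()
                               ) -> Set[str]:
    """Blocks referenced by "old" manifests changed within the TTL window."""
    cutoff = now - blob_signature_ttl
    protected: Set[str] = set()
    for entry in entries:
        if entry.event_at < cutoff:
            continue  # changed more than blobSignatureTTL ago
        if entry.collection_uuid in ignored_uuids:
            continue  # expedited delete: the admin asked to skip this one
        protected |= entry.old_blocks
    return protected
```

The garbage collector would add the returned set to the blocks protected by current collections before deciding what to delete.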
Related: replication_desired=0
A collection with replication_desired=0 does not protect its data from garbage collection. In this sense, replication_desired=0 is similar to expires_at<=now.
However, replication_desired=0 does not mean the collection record itself should be hidden. It means the collection metadata (filenames, sizes, data hashes, collection PDH) are valuable enough to keep on hand, but the data itself isn't. For example, if we delete intermediate data generated by a workflow, and find later that the same workflow now produces a different result, it would be helpful to see which of the intermediate outputs differed.