Idea #8997

closed

Keep: rethink role of "signature tokens"

Added by Peter Grandi almost 8 years ago. Updated about 4 years ago.

Status: Closed
Priority: Normal
Assigned To: -
Category: -
Target version: -
Story points: -

Description

Last night around 4 AM I woke up and suddenly understood (or I think I did) the role of "signature tokens" described in:

https://dev.arvados.org/projects/arvados/wiki/Keep_server#Permission

especially in relation to block lifetimes, as per issues:

https://dev.arvados.org/issues/8993
https://dev.arvados.org/issues/8878
https://dev.arvados.org/issues/8867

So my current understanding is...

In Keep, block liveness is reachability from either a server-side manifest or a client-side signature token. This is similar to UNIX, where directory entries plus inodes play the role of manifests and open file descriptors play the role of signature tokens: "authorization" in the form of holding a file descriptor implies liveness.

But permission tokens are client-side, persistent capabilities, even if time-limited, unlike file descriptors, which are server-side capabilities that disappear on reboot (a crude form of garbage collection).

The Data Manager can trace only manifests, not signature tokens, which may be anywhere. It must therefore make worst-case assumptions about tokens, both as to their existence and as to their expiry times. That implies that the block signature TTL must be monotonically increasing (shortening it would leave older, longer-lived tokens unaccounted for), which is hard to ensure.
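
To make the worst-case assumption concrete, here is a minimal sketch in Python. It is not the Data Manager's actual code: `Block`, `collectable` and `max_ttl_ever` are hypothetical names, and it assumes a token could have been issued at the moment a block was last written.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    digest: str        # content hash of the block
    written_at: float  # time of the last write, in seconds (hypothetical field)


def collectable(blocks, referenced, max_ttl_ever, now):
    """Return the digests that may be deleted under worst-case assumptions.

    referenced   -- set of digests reachable from server-side manifests
    max_ttl_ever -- the longest signature TTL that was ever in force
    """
    safe = []
    for block in blocks:
        if block.digest in referenced:
            continue  # reachable from a manifest, so live
        # Not in any manifest: a client may still hold a token issued at
        # write time with the longest TTL ever used, so wait that long.
        if now - block.written_at > max_ttl_ever:
            safe.append(block.digest)
    return safe
```

The collector has to use the longest TTL that was ever configured, not the current one, which is why shortening the TTL does not make it safe to collect any sooner.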

One could have Keep record server-side which signature tokens have been issued (block and lifetime), and also let a single signature token cover multiple blocks, but then tokens essentially become temporary collections ("partial manifests" IIRC).

Also, it is pointless for Keep to issue read-permission tokens to 'arv-put' when it uploads a block, as 'arv-put' does not need to read the block back.

All that 'arv-put' needs to know is that, if registering a manifest succeeded, all block hashes mentioned in it are live.

So what should actually happen is that the API server, on registering a manifest, verifies that it has all the blocks for the hashes in the manifest, and otherwise returns a list of the blocks it does not have. Signature tokens should then be just about permissions and not necessarily imply liveness, because they should not have the dual role that file descriptors have.

That's because registration then becomes a compare-and-swap style, server-side sequence of atomic transactions. In a distributed setup the best that can be hoped for is eventual or even merely potential convergence.
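
The proposed registration step could look roughly like the sketch below. This is only an illustration of the idea, not the API server's interface: `register_manifest`, `manifests` and `stored_blocks` are hypothetical, and a plain dict and set stand in for the server-side state.

```python
def register_manifest(manifests, stored_blocks, name, digests):
    """Register `name` only if every block it mentions is present.

    Returns (True, []) on success, or (False, missing) where `missing`
    lists the digests the server does not have. Nothing is registered
    in the failure case, so the caller can re-upload and retry.
    """
    missing = [d for d in digests if d not in stored_blocks]
    if missing:
        return False, missing
    manifests[name] = list(digests)
    return True, []
```

A client such as 'arv-put' would then loop: register, re-upload whatever is reported missing, and register again until it succeeds. No read token is needed at any point.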

The verification can be based on first checking whether the hashes in the new manifest are already present in other manifests (and "locking" them for the duration), and then asking all the Keep servers to check; a non-persistent TTL guarantee for the result of that check may be given at that point.
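
A sketch of that two-stage check, under loud assumptions: `Verifier` is a hypothetical class, each Keep server is modelled as a plain set of digests, and the "lock" is an in-memory pin count that a collector would have to respect. The pin is the non-persistent guarantee; it vanishes if the process restarts.

```python
import threading
from contextlib import contextmanager


class Verifier:
    def __init__(self, manifests, keep_servers):
        self.manifests = manifests        # name -> list of digests
        self.keep_servers = keep_servers  # list of sets, stand-ins for Keep servers
        self.pinned = {}                  # digest -> pin count, in memory only
        self._mutex = threading.Lock()

    @contextmanager
    def pin(self, digests):
        """Mark digests as not collectable while a check is in flight."""
        with self._mutex:
            for d in digests:
                self.pinned[d] = self.pinned.get(d, 0) + 1
        try:
            yield
        finally:
            with self._mutex:
                for d in digests:
                    self.pinned[d] -= 1
                    if self.pinned[d] == 0:
                        del self.pinned[d]

    def register(self, name, digests):
        with self.pin(digests):
            with self._mutex:
                # Stage 1: hashes already present in other manifests are live.
                known = {d for m in self.manifests.values() for d in m}
            unknown = [d for d in digests if d not in known]
            # Stage 2: ask every Keep server about the remaining hashes.
            missing = [d for d in unknown
                       if not any(d in server for server in self.keep_servers)]
            if missing:
                return False, missing
            with self._mutex:
                self.manifests[name] = list(digests)
            return True, []
```

Pinning before the check, rather than after, is what closes the window in which a block could be found and then collected before the manifest is recorded.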

Everything else is an optimization, for example:

  • Having 'arv-put' effectively do that check during the upload, e.g. by registering temporary partial collection manifests, then issuing the final full manifest at the end and deleting the temporary partial ones afterwards.
  • Maybe having every Keepstore server keep a persistent hint-list of the blocks it has, and perhaps having the API server keep a persistent hint-list of blocks recently known to be live and of the servers holding them.
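
The first optimization might be sketched as follows. This is not how 'arv-put' is implemented: `put_with_partials`, the `.partialN` naming and the `batch` parameter are made up for illustration, dicts stand in for Keep and for the API server, and MD5 is used only as a convenient content hash.

```python
import hashlib


def put_with_partials(chunks, store, manifests, name, batch=2):
    """Upload chunks, keeping them live through temporary partial manifests."""
    digests, pending, partials = [], [], []

    def checkpoint():
        # Register the blocks uploaded since the last checkpoint.
        pname = f"{name}.partial{len(partials)}"
        manifests[pname] = list(pending)
        partials.append(pname)
        pending.clear()

    for chunk in chunks:
        digest = hashlib.md5(chunk).hexdigest()
        store[digest] = chunk  # the "upload"
        digests.append(digest)
        pending.append(digest)
        if len(pending) == batch:
            checkpoint()
    if pending:
        checkpoint()

    # Final full manifest first, then drop the temporary partial ones,
    # so every block stays reachable from some manifest throughout.
    manifests[name] = list(digests)
    for pname in partials:
        del manifests[pname]
    return digests
```

The ordering at the end matters: deleting the partial manifests before the full one is registered would reopen the window in which blocks are unreachable.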

PS: Computery-sciency stuff that may be related: Dijkstra's parallel garbage collector with "white", "black", and "grey" (being uploaded) states. Also P. Bishop's distributed parallel garbage collector, MIT TR-178 (and successors).
