Version 1 (Peter Amstutz, 08/01/2018 05:53 PM) → Version 2/9 (Peter Amstutz, 08/01/2018 06:58 PM)

h1. Federated collections

* Fetch collection record by uuid
** Use the federated record retrieval strategy that has already been developed.
* Fetch collection record by PDH
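** A PDH carries no location hint, so the request must be sent to all federated clusters.
** Read-only: only the GET operation needs to be supported.
* Results can be cached by PDH, since the PDH (portable data hash) is derived from the collection content.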

The record will have a manifest with signed blocks. However, these blocks will be signed for the origin cluster.

The client needs to be able to fetch blocks from the remote cluster.

arvados-controller could add block hints, using an existing feature in the Python and Go SDKs:

* Blocks in a manifest can include a hint in the form "+K@zzzzz". The Python SDK will attempt to fetch the block from "https://keep.zzzzz.arvadosapi.com/"
** Must conform to a particular DNS naming scheme.
** Could be generalized by looking up the cluster in "remote_hosts" and using the "keep_services.accessible" API.
** Every block will be requested from the remote cluster every time, because the client contacts the remote server directly. This leaves limited opportunity for edge caching.
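
As a rough illustration of the "+K@zzzzz" hint, the sketch below extracts cluster hints from a block locator and maps them to the URL form quoted above. The function name @remote_keep_urls@ and the regular expression are assumptions made for this example, not the SDK's actual code; the example also assumes a cluster ID is five lowercase alphanumeric characters.

```python
import re
from typing import List

# Assumed hint shape: "K@" followed by a five-character cluster ID.
CLUSTER_HINT = re.compile(r"^K@([a-z0-9]{5})$")


def remote_keep_urls(locator: str) -> List[str]:
    """Return candidate remote Keep URLs for the +K@ hints in a locator."""
    # A locator is "<hash>+<hint>+<hint>..."; drop the hash, keep the hints.
    hints = locator.split("+")[1:]
    urls = []
    for hint in hints:
        match = CLUSTER_HINT.match(hint)
        if match:
            # Relies on the fixed DNS naming scheme noted above.
            urls.append("https://keep.%s.arvadosapi.com/" % match.group(1))
    return urls
```

For example, @remote_keep_urls("acbd18db4cc2f85cedef654fccc4a4d8+3+K@zzzzz")@ yields a single URL for cluster "zzzzz", while a locator with no "+K@" hint yields an empty list.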

* The hint can also be the uuid of a "local gateway service". This instructs the client to use a specific service from the keep_services table (indicated by a "service_type" of "gateway:")
** Directs requests through a specific service.
** Does not encode which remote cluster to pull a block from.
** Gateway service could search for blocks by sending a request to every federated cluster.
** Gateway service can cache blocks so they don't need to be re-fetched from remote.
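
The gateway behaviour outlined above (search every federated cluster, then cache) might look roughly like this. It is a sketch under stated assumptions: @CachingGateway@ and the @fetch_remote@ callable are hypothetical names, the cache is an unbounded in-memory dict, and signature handling is left out entirely. The integrity check relies on Keep block locators beginning with the MD5 hash of the block data.

```python
import hashlib
from typing import Callable, Dict, Iterable, Optional

# Hypothetical fetcher type: given (cluster_id, block_hash), return the block
# data, or None if that cluster cannot supply the block.
RemoteFetcher = Callable[[str, str], Optional[bytes]]


class CachingGateway:
    """Serve blocks from a local cache, falling back to federated clusters."""

    def __init__(self, cluster_ids: Iterable[str], fetch_remote: RemoteFetcher):
        self.cluster_ids = list(cluster_ids)
        self.fetch_remote = fetch_remote
        self.cache: Dict[str, bytes] = {}

    def get_block(self, locator: str) -> Optional[bytes]:
        # The part before the first "+" is the MD5 hash of the block.
        block_hash = locator.split("+", 1)[0]
        if block_hash in self.cache:
            return self.cache[block_hash]
        # The hint does not say which cluster holds the block: try them all.
        for cluster_id in self.cluster_ids:
            data = self.fetch_remote(cluster_id, block_hash)
            # Only accept (and cache) data that matches the requested hash.
            if data is not None and hashlib.md5(data).hexdigest() == block_hash:
                self.cache[block_hash] = data
                return data
        return None
```

Because blocks are content-addressed, a cached block never goes stale, so the only cache policy a real gateway would need is eviction for space.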