Peter Amstutz, 08/01/2018 06:58 PM
h1. Federated collections
* Fetch collection record by uuid
** Use the federated record retrieval strategy, which is already developed.
* Fetch collection record by PDH
** No location hint is available, so the request must be sent to all federated clusters.
** Read-only: only the GET operation needs to be supported.
* Can cache result by PDH.
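
The PDH lookup described above (no location hint, fan out to every federated cluster, cache the result) could be sketched roughly as follows. This is only an illustration: the class name @PDHCollectionCache@ and the per-cluster fetcher callables are hypothetical and are not part of arvados-controller or any SDK. Each fetcher is assumed to perform a GET against one cluster and return the collection record, or @None@ if that cluster does not have it. Caching by PDH is reasonable because a PDH is derived from the manifest content, so a given PDH always refers to the same content.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


class PDHCollectionCache:
    """Hypothetical helper: look up a collection by PDH on all clusters."""

    def __init__(self, fetchers):
        # fetchers: dict mapping cluster id -> callable(pdh) that returns a
        # collection record (dict) or None. These callables are assumptions
        # standing in for a real GET request to each federated cluster.
        self._fetchers = fetchers
        self._cache = {}

    def get(self, pdh):
        # Serve repeated lookups from the cache, keyed by PDH.
        if pdh in self._cache:
            return self._cache[pdh]

        # No location hint exists, so ask every federated cluster.
        workers = max(1, len(self._fetchers))
        with ThreadPoolExecutor(max_workers=workers) as pool:
            futures = [pool.submit(fetch, pdh)
                       for fetch in self._fetchers.values()]
            for future in as_completed(futures):
                try:
                    record = future.result()
                except Exception:
                    # An unreachable or failing cluster is treated as a miss.
                    continue
                if record is not None:
                    # Only successful lookups are cached.
                    self._cache[pdh] = record
                    return record
        return None
```

Misses are deliberately not cached in this sketch, since a collection that is absent now may appear on a remote cluster later.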
The record will have a manifest with signed blocks. However, these blocks will be signed for the origin cluster.
The client needs to be able to fetch blocks from the remote cluster.
arvados-controller could add block hints, using an existing feature in the Python and Go SDKs:
* Blocks in a manifest can include a hint in the form "+K@zzzzz". The Python SDK will then attempt to fetch the block from "https://keep.zzzzz.arvadosapi.com/".
** The remote cluster must conform to this particular DNS naming scheme.
** Could be generalized by looking up the cluster in "remote_hosts" and using the "keep_services.accessible" API.
** Every block will be requested from the remote cluster every time, because the client contacts the remote server directly. This leaves limited opportunity for edge caching.
* The hint can also be the uuid of a "local gateway service". This instructs the client to use a specific service from the keep_services table (indicated by a "service_type" of "gateway:").
** Directs requests through a specific service.
** Does not encode which remote cluster to pull a block from.
** Gateway service could search for blocks by sending a request to every federated cluster.
** Gateway service can cache blocks so they don't need to be re-fetched from remote.
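
The two hint forms above can be illustrated with a small sketch. It is not the SDK implementation: the function names @add_cluster_hint@ and @resolve_hint@ and the @gateway_roots@ mapping are hypothetical. It assumes that a block locator is a 32-digit hex hash followed by "+size" and optional "+hint" fields, that a five-character value after "K@" is a cluster id, and that any other value is the uuid of a gateway entry in the keep_services table.

```python
import re

# Assumed locator shape: md5 hash, "+size", then zero or more "+hint" fields.
LOCATOR_RE = re.compile(r"^[0-9a-f]{32}\+\d+(\+\S+)*$")


def add_cluster_hint(manifest_text, cluster_id):
    """Append "+K@<cluster_id>" to every block locator in a manifest."""
    out_lines = []
    for line in manifest_text.splitlines():
        tokens = line.split(" ")
        # Stream names and file tokens do not match LOCATOR_RE and are kept.
        tokens = [t + "+K@" + cluster_id if LOCATOR_RE.match(t) else t
                  for t in tokens]
        out_lines.append(" ".join(tokens))
    trailing = "\n" if manifest_text.endswith("\n") else ""
    return "\n".join(out_lines) + trailing


def resolve_hint(locator, gateway_roots):
    """Return the service URL named by a "+K@" hint, or None.

    gateway_roots is a hypothetical dict mapping the uuid of a local
    gateway service to its base URL.
    """
    for hint in locator.split("+")[2:]:
        if not hint.startswith("K@"):
            continue
        target = hint[2:]
        if len(target) == 5:
            # Cluster id form: relies on the fixed DNS naming scheme.
            return "https://keep.%s.arvadosapi.com/" % target
        if target in gateway_roots:
            # uuid form: direct the request through the named gateway.
            return gateway_roots[target]
    return None
```

For example, @add_cluster_hint(manifest, "zzzzz")@ leaves existing signature hints in place and only appends the "+K@zzzzz" field. Whether the origin cluster's signature should be kept or stripped at that point is left open here.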