
Peter Amstutz, 08/01/2018 06:58 PM


Federated collections

  • Fetch collection record by uuid
    • Use the federated record retrieval strategy, which is already developed.
  • Fetch collection record by PDH
    • A PDH carries no location hint, so the request must be sent to all federated clusters.
    • Read-only; only the GET operation needs to be supported.
  • Can cache result by PDH.
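
The PDH lookup described above could be sketched roughly as follows. This is only an illustration, not the arvados-controller implementation: `fetch_by_pdh`, the `fetchers` mapping (cluster id to a callable that performs the GET against that cluster) and the dict used as a cache are all hypothetical names. It assumes that caching by PDH is safe because a PDH is derived from the manifest content.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch_by_pdh(pdh, fetchers, cache):
    """Return a collection record for `pdh`, or None if no cluster has it.

    fetchers: hypothetical mapping of cluster id -> callable(pdh) that
              returns a record dict, returns None, or raises on failure.
    cache:    any dict-like object keyed by PDH.
    """
    if pdh in cache:
        return cache[pdh]

    # No location hint is available, so ask every federated cluster.
    with ThreadPoolExecutor(max_workers=max(1, len(fetchers))) as pool:
        futures = [pool.submit(fn, pdh) for fn in fetchers.values()]
        for future in as_completed(futures):
            try:
                record = future.result()
            except Exception:
                # One unreachable cluster should not fail the whole lookup.
                continue
            if record is not None:
                cache[pdh] = record
                return record
    return None
```

The first cluster to answer with a record wins; later calls for the same PDH are served from the cache without contacting any cluster.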

The record will have a manifest with signed blocks. However, these blocks will be signed for the origin cluster.

The client needs to be able to fetch blocks from the remote cluster.

arvados-controller could add block hints, using an existing feature in the Python and Go SDKs:

  • Blocks in a manifest can include a hint in the form "+K@zzzzz". The Python SDK will attempt to fetch the block from "https://keep.zzzzz.arvadosapi.com/"
    • Must conform to a particular DNS naming scheme.
    • Could be generalized by looking up the cluster in "remote_hosts" and using the "keep_services.accessible" API.
    • Every block will be requested from the remote cluster every time, because the client contacts the remote server directly. This leaves limited opportunity for edge caching.
  • Hint can also be a uuid of a "local gateway service". This instructs the client to use a specific service from the keep_services table (indicated by a "service_type" of "gateway:")
    • Directs requests through a specific service.
    • Does not encode which remote cluster to pull a block from.
    • Gateway service could search for blocks by sending a request to every federated cluster.
    • Gateway service can cache blocks so they don't need to be re-fetched from remote.
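
To make the two hint styles above concrete, here is a minimal sketch of how a controller might append a cluster hint to each block locator, and how a client might turn "+K@" hints into service URLs. It is not the SDK code: `add_cluster_hint`, `resolve_keep_hints` and the `gateway_services` mapping (service uuid to base URL) are hypothetical. Telling a cluster id from a gateway uuid by shape (five characters versus a 5-5-15 uuid) is an assumption of this sketch. The "https://keep.zzzzz.arvadosapi.com/" pattern is the one quoted above.

```python
import re

# Assumed shapes: hash+size followed by optional +hints; 5-char cluster id;
# 5-5-15 character service uuid.
LOCATOR = re.compile(r"^[0-9a-f]{32}\+\d+(\+\S+)*$")
CLUSTER_ID = re.compile(r"^[a-z0-9]{5}$")
SERVICE_UUID = re.compile(r"^[a-z0-9]{5}-[a-z0-9]{5}-[a-z0-9]{15}$")


def add_cluster_hint(manifest_text, cluster_id):
    """Append +K@<cluster_id> to every block locator that lacks it."""
    hint = "K@" + cluster_id
    lines = []
    for line in manifest_text.splitlines():
        tokens = []
        for token in line.split(" "):
            if LOCATOR.match(token) and hint not in token.split("+"):
                token = token + "+" + hint
            tokens.append(token)
        lines.append(" ".join(tokens) + "\n")
    return "".join(lines)


def resolve_keep_hints(locator, gateway_services):
    """Return the base URLs suggested by the +K@ hints of one locator."""
    roots = []
    for hint in locator.split("+")[2:]:  # skip the hash and size fields
        if not hint.startswith("K@"):
            continue  # e.g. a signature hint
        target = hint[2:]
        if CLUSTER_ID.match(target):
            # Remote cluster hint: relies on the DNS naming scheme.
            roots.append("https://keep.%s.arvadosapi.com/" % target)
        elif SERVICE_UUID.match(target) and target in gateway_services:
            # Local gateway hint: looked up among known gateway services.
            roots.append(gateway_services[target])
    return roots
```

With a cluster hint the client goes straight to the remote cluster. With a gateway uuid the request goes to a local service, which is the place where cached blocks could be served from.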