Peter Amstutz, 08/01/2018 08:28 PM


Federated collections

  • Fetch collection record by uuid
    • use federated record retrieval strategy, already developed.
  • Fetch collection record by PDH
    • No location hint. Distribute the request to all federated clusters and return one of the responses.
    • Read-only, only need to support GET operation
  • Can cache result by PDH.

The record will have a manifest with signed blocks. However, these blocks will be signed for the origin cluster.

The client needs to be able to fetch blocks from the remote cluster.

arvados-controller could add block hints, using an existing feature in the Python and Go SDKs:

  • Blocks in a manifest can include a hint in the form "+K@zzzzz". Python SDK will attempt to fetch the block from "https://keep.zzzzz.arvadosapi.com/"
    • Must conform to a particular DNS naming scheme.
    • Could be generalized by looking up in "remote_hosts" and using the "keep_services.accessible" API.
    • Every block will be requested from the remote cluster every time, because the client contacts the remote server directly. This leaves limited opportunity for edge caching.
  • Hint can also be the uuid of a "local gateway service". This instructs the client to use a specific service from the keep_services table (indicated by a "service_type" of "gateway:")
    • Direct requests through a specific service
    • Does not encode which remote cluster to pull a block from.
    • Gateway service could search for blocks by sending request to every federated cluster
    • Gateway service can cache blocks so they don't need to be re-fetched from remote.
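
To make the two hint forms concrete, here is a minimal sketch of how a client might tell them apart. The locator shape ("hash+size" followed by "+hint" parts), the five-character cluster id and the uuid pattern are assumptions for illustration, and parse_locator is a hypothetical helper, not a function from the Python or Go SDK.

```python
import re

# Assumed locator shape: "<md5 hex>+<size>" followed by zero or more "+hint" parts.
_LOCATOR_RE = re.compile(r"^([0-9a-f]{32})\+(\d+)((?:\+[^+\s]+)*)$")
# Assumed shapes for the two kinds of "+K@" target described above.
_CLUSTER_RE = re.compile(r"^[a-z0-9]{5}$")
_UUID_RE = re.compile(r"^[a-z0-9]{5}-[a-z0-9]{5}-[a-z0-9]{15}$")


def parse_locator(locator):
    """Split a block locator into hash, size, and "+K@" routing information."""
    m = _LOCATOR_RE.match(locator)
    if m is None:
        raise ValueError("not a block locator: %r" % (locator,))
    result = {
        "hash": m.group(1),
        "size": int(m.group(2)),
        "remote_cluster": None,  # set for "+K@zzzzz"
        "gateway_uuid": None,    # set for "+K@<keep service uuid>"
        "other_hints": [],       # e.g. the signature hint, left untouched
    }
    for hint in m.group(3).split("+"):
        if not hint:
            continue
        if hint.startswith("K@"):
            target = hint[2:]
            if _CLUSTER_RE.match(target):
                result["remote_cluster"] = target
            elif _UUID_RE.match(target):
                result["gateway_uuid"] = target
            else:
                raise ValueError("unrecognized +K@ hint: %r" % (hint,))
        else:
            result["other_hints"].append(hint)
    return result
```

With these assumptions, a locator ending in "+K@zzzzz" yields remote_cluster "zzzzz", while one ending in a keep service uuid yields gateway_uuid and no cluster.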

Both "hint" schemes are slightly inelegant because they require repeating the "+K@" hint for every block in the manifest.

We probably want an architecture that makes block caching possible, even if the first pass implementation doesn't support it. That implies a gateway / proxy service rather than contacting the remote cluster directly. Architecturally, this is also more in line with the arvados-controller design of acting as an intermediary, as opposed to adding federation features to the client.

Proposal

Arvados-controller decorates blocks with "+K@zzzzz" hints, but the implementation changes so that instead of contacting the remote host, the client contacts the local gateway service and requests the block with the cluster hint and the block signature (which is returned by the remote cluster).

The local gateway service requests the block from the appropriate cluster and returns the result.

A simple caching strategy would be to copy the block to local keep storage, and maintain a mapping from the remote signature(s) to a local signature. If a request comes in for a block which has recently been fetched, the gateway can issue a HEAD request to the remote cluster to verify the signature and then remember that signature.

Fetching collection flow:

  1. Running on cluster aaaaa
  2. Client sends request to arvados-controller by PDH
  3. arvados-controller searches local database and comes up empty.
  4. arvados-controller sends request for collection by PDH (with salted token) out to federated clusters bbbbb and ccccc
  5. ccccc returns result
  6. arvados-controller decorates the return record with "+K@ccccc" block hints
  7. return record to client
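
The collection flow above can be sketched roughly as follows. This is an illustration under stated assumptions, not the arvados-controller implementation: the lookup callables, the record layout with a "manifest_text" field, and both function names are hypothetical, remote clusters are tried one after another rather than concurrently, and token salting is left out.

```python
import re

# Assumed shape of a block locator token inside a manifest line.
_BLOCK_RE = re.compile(r"^[0-9a-f]{32}\+\d+(\+\S+)*$")


def decorate_manifest(manifest_text, cluster_id):
    """Append "+K@<cluster_id>" to every block locator token in the manifest."""
    out = []
    for line in manifest_text.splitlines():
        tokens = [
            tok + "+K@" + cluster_id if _BLOCK_RE.match(tok) else tok
            for tok in line.split(" ")
        ]
        out.append(" ".join(tokens) + "\n")
    return "".join(out)


def fetch_collection_by_pdh(pdh, local_lookup, remote_lookups):
    """Return a collection record for pdh, or None if no cluster has it.

    local_lookup(pdh) and each remote_lookups[cluster_id](pdh) are
    hypothetical callables returning a dict with "manifest_text", or None.
    """
    record = local_lookup(pdh)
    if record is not None:
        return record  # found locally, no hints needed
    for cluster_id, lookup in remote_lookups.items():
        record = lookup(pdh)
        if record is not None:
            decorated = dict(record)
            decorated["manifest_text"] = decorate_manifest(
                record["manifest_text"], cluster_id
            )
            return decorated
    return None
```

Only tokens that look like block locators are decorated, so stream names and file segments such as "0:3:foo.txt" pass through unchanged.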

Fetching block flow:

  1. client wishes to read a file
  2. client has signed block locator with "+K@ccccc" hint
  3. client sends request to "gateway" Keep service
  4. gateway keep service contacts keepproxy on cluster ccccc and requests block
  5. keepproxy on ccccc returns block content to gateway
  6. gateway returns block content to client

Fetching block, with caching:

  1. client wishes to read a file
  2. client has signed block locator with "+K@ccccc" hint
  3. client sends request to "gateway" Keep service
  4. gateway service looks up block in memory / local database
    1. if found, check if the block signature is cached
    2. if block signature isn't cached, send HEAD request to ccccc
    3. if the signature checks out, fetch the block from aaaaa local keepstore and return that
    4. else fail (because the HEAD request must have failed)
  5. if not found, gateway keep service contacts keepproxy on cluster ccccc and requests block
  6. keepproxy on ccccc returns block content to gateway
  7. gateway saves block to aaaaa local keep, records mapping of remote block+signature to local block+signature (could be in memory, or local database such as sqlite)
  8. gateway returns block content to client
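
The caching flow above could look something like this in outline. It is a minimal in-memory sketch, assuming the gateway is handed four hypothetical callables for talking to the remote keepproxy and to local keep. CachingGateway and its method names are illustrative only and do not correspond to existing Arvados code.

```python
class CachingGateway:
    """Sketch of a gateway that caches remote blocks in local keep storage."""

    def __init__(self, remote_get, remote_head, local_put, local_get):
        # All four callables are assumptions made for this sketch:
        #   remote_get(cluster, locator) -> bytes   (GET via remote keepproxy)
        #   remote_head(cluster, locator) -> bool   (HEAD, True if signature is valid)
        #   local_put(data) -> local signed locator
        #   local_get(local_locator) -> bytes
        self._remote_get = remote_get
        self._remote_head = remote_head
        self._local_put = local_put
        self._local_get = local_get
        self._local_by_hash = {}  # block hash -> local signed locator
        self._verified = set()    # remote signed locators known to be valid

    @staticmethod
    def _split(locator):
        """Return (block hash, cluster id taken from the "+K@" hint)."""
        parts = locator.split("+")
        clusters = [p[2:] for p in parts[1:] if p.startswith("K@")]
        if not clusters:
            raise ValueError("locator has no +K@ hint: %r" % (locator,))
        return parts[0], clusters[0]

    def get(self, locator):
        block_hash, cluster = self._split(locator)
        local = self._local_by_hash.get(block_hash)
        if local is not None:
            # Cached block: make sure this remote signature is acceptable.
            if locator not in self._verified:
                if not self._remote_head(cluster, locator):
                    raise PermissionError("remote cluster rejected signature")
                self._verified.add(locator)
            return self._local_get(local)
        # Not cached: fetch from the remote cluster and keep a local copy.
        data = self._remote_get(cluster, locator)
        self._local_by_hash[block_hash] = self._local_put(data)
        self._verified.add(locator)
        return data
```

A second request with the same signed locator is served from local keep without contacting ccccc. A request for the same block with a different signature costs one HEAD request but no block transfer.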

Development tasks

  • arvados-controller support for fetching collection records by UUID and PDH
  • arvados-gateway to fetch keep blocks from remote clusters
  • Update keep client in Python and Go SDK to use arvados-gateway