Routing multi-cluster requests

Concept

The goal of federation is to present an interface that fuses multiple clusters into a single view.

The role of arvados-controller is to determine which cluster(s) a request should go to.

Examples

My "home cluster" is qr1hi. I have a token qr1hi-secretsecretsecret.

I want to read a collection on c97qk using the Python SDK.

c = CollectionReader("c97qk-...")
  1. The CollectionReader sends a request to arvados-controller.
  2. arvados-controller examines the prefix c97qk and contacts c97qk.arvadosapi.com.
  3. The request router uses the "salted" token hmac(c97qk, qr1hi-secretsecretsecret) → qr1hi-secretsecretc97qk
  4. c97qk gets the token and notices the qr1hi prefix.
  5. c97qk contacts qr1hi to determine if the token is valid and what user is associated with the token.
  6. c97qk caches the token and sets current_user. The request proceeds as normal.
  7. The response is returned to arvados-controller.
  8. The manifest_text is updated by arvados-controller to transform the block signatures from "+A..." to "+Rc97qk-..." to indicate that the signatures are valid for c97qk.
  9. The response is returned to CollectionReader.
  10. The CollectionReader sends a block read request to a qr1hi keepstore with the +Rc97qk signature.
  11. The keepstore recognizes that it is a remote signature and contacts the remote cluster to fetch the block. The signature is transformed from a remote signature back to a regular one.
  12. The block is returned to the client.
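Steps 2, 3 and 8 above can be sketched in a few lines of Python. This is only an illustration, not the arvados-controller implementation. The function names are hypothetical. The choice of HMAC-SHA1 keyed by the token is an assumption, as is keeping the home cluster prefix on the salted token. The signature hint is assumed to have the shape "+A<hex>@<hex>".

```python
import hashlib
import hmac
import re


def cluster_prefix(uuid: str) -> str:
    """Return the cluster ID that prefixes a UUID, e.g. "c97qk"."""
    prefix, _, rest = uuid.partition("-")
    if len(prefix) != 5 or not rest:
        raise ValueError(f"not a cluster-prefixed UUID: {uuid!r}")
    return prefix


def salt_token(token: str, remote_cluster: str) -> str:
    """Derive a token that is only usable on remote_cluster.

    Assumption: HMAC-SHA1 keyed by the original token, with the remote
    cluster ID as the message. The home prefix is kept so the remote
    cluster knows which cluster to ask for validation.
    """
    home = cluster_prefix(token)
    digest = hmac.new(token.encode(), remote_cluster.encode(), hashlib.sha1)
    return f"{home}-{digest.hexdigest()}"


# Assumed shapes: a block locator starts with a 32-digit hex hash, and a
# local signature hint looks like "+A<hex>@<hex>".
LOCATOR = re.compile(r"^[0-9a-f]{32}(\+\S+)*$")
SIGNATURE = re.compile(r"\+A([0-9a-f]+@[0-9a-f]+)")


def mark_remote_signatures(manifest_text: str, cluster: str) -> str:
    """Rewrite "+A..." hints on block locators to "+R<cluster>-..."."""
    out_lines = []
    for line in manifest_text.split("\n"):
        tokens = [
            SIGNATURE.sub(lambda m: f"+R{cluster}-{m.group(1)}", tok)
            if LOCATOR.match(tok) else tok
            for tok in line.split(" ")
        ]
        out_lines.append(" ".join(tokens))
    return "\n".join(out_lines)
```

Only tokens that look like block locators are rewritten, so a file name that happens to contain "+A" is left alone.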

I want to search for a collection across clusters.

c = router.collections().list(filters=[["name", "like", "sample-1234%"]]).execute()
  1. arvados-controller has a "search list" of clusters (where does this come from??? maybe an attribute of the primary user account on qr1hi?)
  2. arvados-controller sends the request to each cluster in parallel using federated identity / salted token described above.
  3. arvados-controller gathers the results.
  4. arvados-controller collates the results (will need to understand "order" option to do this properly)
  5. Collated results are returned
  6. Paging - ??? likely need to keep track of some state locally to be able to issue correct follow-up requests to each cluster. Can have consistent ordering within a page but not across pages unless all pages are fetched first.
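The collation in step 4 amounts to a k-way merge, provided every cluster returns its results already sorted by the requested "order" field. A minimal sketch, assuming each result is a dict and the order is a single field; the function name, its parameters and the sample records are hypothetical:

```python
import heapq
from itertools import islice
from operator import itemgetter


def collate(per_cluster_results, order_field="name", descending=False, limit=None):
    """Merge result lists that are each already sorted by order_field.

    per_cluster_results: one list of dicts per cluster.
    limit: maximum number of merged items to return (None means all).
    """
    merged = heapq.merge(
        *per_cluster_results,
        key=itemgetter(order_field),
        reverse=descending,  # inputs must then be sorted descending too
    )
    return list(islice(merged, limit))


# Hypothetical sample data standing in for two clusters' responses.
qr1hi_items = [
    {"uuid": "qr1hi-a", "name": "sample-1234-a"},
    {"uuid": "qr1hi-c", "name": "sample-1234-c"},
]
c97qk_items = [{"uuid": "c97qk-b", "name": "sample-1234-b"}]

first_page = collate([qr1hi_items, c97qk_items], limit=2)
```

Because only the first "limit" merged items are returned, a follow-up request would need to remember how many items were consumed from each cluster. That is the local state the paging note refers to.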

Another case: I want to list the contents of a project across clusters. Same query process.

c = router.collections().list(owner_uuid="qr1hi-....").execute()

I want to create a collection on another cluster.

Provide the "owner_uuid" of a project or group on a foreign cluster.

router.collections().create(body={"owner_uuid": "c97qk-...."}).execute()
  1. arvados-controller examines the prefix c97qk and contacts c97qk.arvadosapi.com using the federated identity / salted token described above.
  2. The cluster determines if the user has write access to the group or project and validates the create request as normal.
  3. The newly created record is returned.

No "owner_uuid" means creating the object on the "home" cluster.
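The routing rule for create requests is small enough to state as code. A sketch only; create_target is a hypothetical helper, not part of the SDK or of arvados-controller:

```python
def create_target(body: dict, home_cluster: str) -> str:
    """Pick the cluster that should handle a create request.

    If the body names an owner, the owner's cluster prefix decides;
    otherwise the object is created on the home cluster.
    """
    owner_uuid = body.get("owner_uuid")
    if not owner_uuid:
        return home_cluster
    # The cluster ID is the part of the UUID before the first "-".
    return owner_uuid.split("-", 1)[0]
```

For example, create_target({"owner_uuid": "c97qk-...."}, "qr1hi") selects c97qk, while an empty body selects qr1hi.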

I want to update an object on another cluster.

router.collections().update(uuid="c97qk-....", body={....}).execute()
  1. arvados-controller examines the prefix c97qk and contacts c97qk.arvadosapi.com using the federated identity / salted token described above.
  2. The cluster determines if the user has write access to the object and validates the update request as normal.
  3. The updated record is returned.

I want to change the ownership of a remote object to a project on my home cluster.

The object is located on c97qk and currently owned by me; I'd like to make it owned by a project qr1hi-...

  1. Route an "update" request that changes "owner_uuid" to c97qk, as described above.
  2. c97qk contacts qr1hi and asks if the user has write access to the project.
  3. The object is updated and returned to the user.

(This suggests I can only share things with groups on the same home cluster as me. hmm.)

I want to change the ownership of an object on my home cluster to a project on a remote cluster.

  1. Route the "update" as described above to qr1hi.
  2. qr1hi contacts c97qk using the c97qk salted token and asks if the user has write access to the project.
  3. The object is updated and returned to the user.

I want to change the ownership of an object from one remote project (c97qk) to another (4xphq).

Can't be done directly (???) because c97qk and 4xphq don't talk to each other directly. (The token given to c97qk is not valid for accessing 4xphq, and vice versa.) Could be done as a two-step process where ownership is assigned from c97qk to qr1hi, then from qr1hi to 4xphq.