Revision 1 (Peter Amstutz, 08/01/2018 05:53 PM) → Revision 2/9 (Peter Amstutz, 08/01/2018 06:58 PM)

h1. Federated collections 

 * Fetch collection record by uuid
 ** Use the federated record retrieval strategy, which is already developed.
 * Fetch collection record by PDH
 ** No location hint, so the request must be sent to all federated clusters.
 ** Read-only: only the GET operation needs to be supported.
 * Results can be cached by PDH.
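
 The PDH lookup above could be sketched roughly as follows. This is only an illustration, not the controller's implementation: @PDHLookup@ and the @fetch@ callback are hypothetical names, and the sketch queries clusters one at a time, whereas a real implementation would more likely send the requests concurrently. Caching by PDH is safe because a PDH identifies the manifest content.

```python
from typing import Callable, Dict, Iterable, Optional

# Hypothetical callback: ask one cluster for a collection by PDH.
# Returns the record as a dict, or None if that cluster does not have it.
Fetcher = Callable[[str, str], Optional[dict]]


class PDHLookup:
    """Look up a collection by PDH across federated clusters, with a cache."""

    def __init__(self, cluster_ids: Iterable[str], fetch: Fetcher):
        self.cluster_ids = list(cluster_ids)
        self.fetch = fetch
        self.cache: Dict[str, dict] = {}

    def get(self, pdh: str) -> Optional[dict]:
        # A cached record can be returned without contacting any cluster.
        if pdh in self.cache:
            return self.cache[pdh]
        # There is no location hint, so every cluster is a candidate.
        for cluster_id in self.cluster_ids:
            record = self.fetch(cluster_id, pdh)
            if record is not None:
                self.cache[pdh] = record
                return record
        return None
```

 Only a read path is shown, matching the GET-only requirement.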

 The record will have a manifest with signed blocks. However, these blocks will be signed for the origin cluster.

 The client needs to be able to fetch blocks from the remote cluster.

 arvados-controller could add block hints, using an existing feature in the Python and Go SDKs:

 * Blocks in a manifest can include a hint of the form "+K@zzzzz". The Python SDK will attempt to fetch the block from "https://keep.zzzzz.arvadosapi.com/"
 ** The remote cluster must conform to a particular DNS naming scheme.
 ** Could be generalized by looking up the cluster in "remote_hosts" and using the "keep_services.accessible" API.
 ** Every block will be requested from the remote cluster every time, because the client contacts the remote server directly. This leaves limited opportunity for edge caching.
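
 As an illustration of the "+K@zzzzz" hint, the sketch below maps a block locator to the remote Keep URL described above. The function name @remote_keep_url@ is hypothetical, and the pattern assumes a cluster id is five lowercase alphanumeric characters. It does not reproduce the SDK's actual code.

```python
import re
from typing import Optional

# Assumption: a cluster id is exactly five lowercase alphanumeric characters.
CLUSTER_HINT = re.compile(r"^K@([a-z0-9]{5})$")


def remote_keep_url(locator: str) -> Optional[str]:
    """Return the remote Keep URL for a locator carrying a +K@<cluster> hint."""
    # A locator looks like "<md5>+<size>+<hint>+<hint>...".
    # Skip the hash and check each remaining field for a cluster hint.
    for field in locator.split("+")[1:]:
        match = CLUSTER_HINT.match(field)
        if match:
            return "https://keep.{}.arvadosapi.com/".format(match.group(1))
    return None
```

 A locator without such a hint yields @None@, in which case the client would fall back to its usual local Keep services.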

 * The hint can also be the uuid of a "local gateway service". This instructs the client to use a specific service from the keep_services table (indicated by a "service_type" of "gateway:")
 ** Directs requests through a specific service.
 ** Does not encode which remote cluster to pull a block from.
 ** The gateway service could search for blocks by sending a request to every federated cluster.
 ** The gateway service can cache blocks so they don't need to be re-fetched from the remote cluster.
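
 A gateway that searches every federated cluster and caches what it finds might look roughly like the sketch below. @GatewayBlockCache@ and the @fetch_block@ callback are hypothetical, and the in-memory dict stands in for whatever storage a real gateway would use. The sketch checks each response against the block's MD5 hash before caching it, since Keep blocks are addressed by MD5.

```python
import hashlib
from typing import Callable, Dict, Iterable, Optional

# Hypothetical callback: ask one cluster for a block by its MD5 hash.
# Returns the block data, or None if that cluster does not have it.
BlockFetcher = Callable[[str, str], Optional[bytes]]


class GatewayBlockCache:
    """Search federated clusters for a block and cache verified results."""

    def __init__(self, cluster_ids: Iterable[str], fetch_block: BlockFetcher):
        self.cluster_ids = list(cluster_ids)
        self.fetch_block = fetch_block
        self.blocks: Dict[str, bytes] = {}  # in-memory stand-in for real storage

    def get_block(self, md5: str) -> Optional[bytes]:
        # Serve from cache so the block is not re-fetched from a remote cluster.
        if md5 in self.blocks:
            return self.blocks[md5]
        # The hint does not name a cluster, so try each one in turn.
        for cluster_id in self.cluster_ids:
            data = self.fetch_block(cluster_id, md5)
            # Accept only data whose content matches the requested hash.
            if data is not None and hashlib.md5(data).hexdigest() == md5:
                self.blocks[md5] = data
                return data
        return None
```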