Idea #21936
Updated by Peter Amstutz 10 months ago
* Manifest format extended to support a link to an external resource as a block "hint"
* Keepstore gets an API which takes an external resource URL (s3://), verifies that the object is accessible, fetches its metadata, generates the MD5, and returns a manifest stream fragment
* Python SDK gets a method which takes an external resource URL and calls keepstore to get a manifest stream fragment
* Keepstore supports fetching blocks that have an external resource hint
* Python and Go SDK handle blocks with external resource hints, where the MD5 corresponds to a hash of the locator hint and not the content itself
* arvados-cwl-runner supports S3 object inputs by using this API to create a collection with links to external resources
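
A rough Python sketch of what a manifest stream fragment with an external resource hint might look like, and how an SDK could recognize such a block. Nothing here is a settled design: the hint letter "X", the percent-encoding of the URL, and the function names manifest_fragment and external_url_from_locator are all hypothetical, chosen only for illustration. The sketch assumes the MD5 is taken over the hint text rather than the object content, and that the whole object is represented as a single block.

```python
import hashlib
from urllib.parse import urlparse, quote, unquote

# Hypothetical hint prefix; the real hint syntax is not defined in this ticket.
HINT_PREFIX = "X"


def manifest_fragment(url: str, size: int, stream: str = ".") -> str:
    """Build a one-file manifest stream fragment pointing at an S3 object."""
    parsed = urlparse(url)
    key = parsed.path.lstrip("/")
    if parsed.scheme != "s3" or not parsed.netloc or not key or key.endswith("/"):
        raise ValueError(f"not an s3:// object URL: {url!r}")
    if size < 0:
        raise ValueError("size must be non-negative")

    # Percent-encode the URL so the hint contains no '+' or whitespace
    # (an assumption made here so the locator stays easy to split).
    hint = HINT_PREFIX + quote(url, safe="")
    # Per the proposal, the MD5 is a hash of the hint, not of the content.
    digest = hashlib.md5(hint.encode("utf-8")).hexdigest()
    # Manifest file names escape spaces as \040.
    filename = key.rsplit("/", 1)[-1].replace(" ", "\\040")
    return f"{stream} {digest}+{size}+{hint} 0:{size}:{filename}\n"


def external_url_from_locator(locator: str):
    """Return the external URL carried by a block locator, or None."""
    for part in locator.split("+")[2:]:  # skip the hash and size fields
        if part.startswith(HINT_PREFIX):
            return unquote(part[len(HINT_PREFIX):])
    return None


if __name__ == "__main__":
    fragment = manifest_fragment("s3://example-bucket/runs/sample 1.fastq", 2048)
    print(fragment, end="")
    print(external_url_from_locator(fragment.split()[1]))
```

In this sketch a client that sees the hint would fetch the bytes from the external URL (directly or through keepstore) instead of looking the block up by hash. Objects larger than the Keep block size would need more than one locator, which is left out here.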
Assumptions:
* Keepstore and compute nodes have permission, via IAM instance roles, to read the S3 buckets where the resources are located
Possibly required, TBD:
* Store the credentials associated with S3 buckets in the Arvados config.yml, for keepstore to use when IAM instance roles are not available.