Idea #16360
closed
Keep-web supports S3 compatible interface
Added by Peter Amstutz over 4 years ago.
Updated over 3 years ago.
Release relationship: Auto
Description
Background: Applications want to access data in Arvados through a web API. Keep-web currently supports WebDAV. Although WebDAV is a standard, the third-party applications used by customers do not support it. Many of them do, however, support reading and writing to S3-compatible object storage. Keep-web should support an S3-compatible API so that these applications can easily access Arvados.
The initial implementation should support the "list objects", "get object", and "put object" APIs, using this mapping:
- object name = file path within collection
- bucket name = collection UUID
- secret key = Arvados API token
- endpoint = ExternalURL of keep-web service
...with enough S3 compatibility to test with a common S3 client library like https://github.com/minio/minio-go.
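The mapping above can be sketched as a small helper that turns a collection UUID and a file path into an S3 request URL. This is only an illustration under stated assumptions: the function name `object_url`, the example endpoint, and the example UUID are hypothetical, and path-style addressing (bucket as the first path segment) is an assumption that the description does not spell out.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def object_url(endpoint: str, collection_uuid: str, file_path: str) -> str:
    """Build a path-style S3 URL: <endpoint>/<bucket>/<object key>.

    Mapping taken from the description:
      bucket name = collection UUID
      object name = file path within the collection
      endpoint    = ExternalURL of the keep-web service
    """
    parts = urlsplit(endpoint)
    # Drop any leading slash so the key is relative to the collection root.
    key = file_path.lstrip("/")
    path = "/".join([
        parts.path.rstrip("/"),
        quote(collection_uuid, safe=""),
        quote(key, safe="/"),  # keep "/" so subdirectories stay visible
    ])
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


if __name__ == "__main__":
    # Hypothetical endpoint and collection UUID, for illustration only.
    print(object_url(
        "https://collections.example.com",
        "zzzzz-4zz18-0123456789abcde",
        "sample1/reads 1.fastq",
    ))
```

A real client would also pass the Arvados API token as the secret key in its credentials; request signing is left out here because the description does not define it.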
- Start date set to 06/01/2021
- Due date set to 08/31/2021
- Description updated (diff)
Seems like this also needs to work for "bucket name = project UUID"? I imagine a case where a 3rd party service or scientific instrument is writing data to Arvados that consists of a number of different samples, in subdirectories, that belong in separate collections. We should ask potential users what they want to do with it.
I expect there are uses for "bucket name = project UUID" (and perhaps an "everything" bucket), but we might not want to jump in unless it's really needed right away. The semantics are a bit weird: creating an object called "foo" would never work, but creating "foo/bar" would (assuming "foo" is a collection in the project, or if we allow collections to be implicitly created that way). Implementing an efficient "list objects" that iterates over all subprojects and collections with paging/cursors might be an interesting challenge.
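To make the "foo" versus "foo/bar" point concrete, here is a minimal sketch of how an object key might be split when the bucket is a project. The names `split_project_key` and `InvalidProjectKey` are hypothetical, and this is not keep-web's implementation; it only assumes that the first path component of the key names a collection in the project.

```python
from typing import Tuple


class InvalidProjectKey(ValueError):
    """Raised when a key cannot name a file inside a collection."""


def split_project_key(key: str) -> Tuple[str, str]:
    """Split an object key into (collection name, path within collection).

    Assumes the first path component names a collection in the project.
    """
    collection, sep, path = key.lstrip("/").partition("/")
    if not collection or not sep or not path:
        # "foo" alone would name a collection, not a file inside one.
        raise InvalidProjectKey(f"key {key!r} has no path inside a collection")
    return collection, path


if __name__ == "__main__":
    print(split_project_key("foo/bar"))
    print(split_project_key("foo/sub/dir/bar"))
    try:
        split_project_key("foo")
    except InvalidProjectKey as err:
        print("rejected:", err)
```

Whether a missing collection should be created implicitly or rejected is a separate policy question that this sketch does not decide.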
- Start date changed from 06/01/2021 to 07/01/2020
- Due date changed from 08/31/2021 to 10/01/2020
- Related to Idea #16535: [keep-web] Minimal implementation of S3 API added
- Related to Feature #16744: [keep-web] Support more S3 write APIs: DeleteObjects, POST object added
- Related to Feature #16745: [keep-web] Improve performance of S3 APIs using server-side cache added
- Status changed from New to In Progress
- Related to Idea #16809: [keep-web] Check V4 signature on S3 requests, don't require sending entire Arvados token as AccessKey added
- Due date changed from 10/01/2020 to 10/30/2020
- Due date changed from 10/30/2020 to 10/31/2020
- Related to Bug #16830: [keep-web] S3 PutObject response should have content MD5 added
- Related to Bug #16842: [keepstore] S3 driver should handle sub-second precision in timestamps added
- Related to deleted (Bug #16842: [keepstore] S3 driver should handle sub-second precision in timestamps)
- Related to Bug #16850: [keep-web] Add KeyCount field to ListObjects response added
- Related to Bug #16790: [keep-web] S3 ListObjects response should not have empty NextMarker field added
- Related to Feature #17009: [keep-web] S3 API should accept bucket name as first component of domain name added
- Due date changed from 10/31/2020 to 11/18/2020
- Related to Feature #17119: Virtual folder in FUSE/S3/WebDAV with contents defined by a query added
- Due date changed from 11/18/2020 to 12/31/2020
- Related to deleted (Feature #16745: [keep-web] Improve performance of S3 APIs using server-side cache)
- Due date changed from 12/31/2020 to 02/28/2021
- Due date changed from 02/28/2021 to 03/31/2021
- Related to Bug #17507: [keep-web] [s3] Implement NextContinuationToken per ListObjectsV2 API added
- Due date changed from 03/31/2021 to 04/30/2021
- Status changed from In Progress to Resolved