Idea #16360

closed

Keep-web supports S3 compatible interface

Added by Peter Amstutz over 4 years ago. Updated over 3 years ago.

Status: Resolved
Priority: Normal
Assigned To: -
Target version: -
Start date: 07/01/2020
Due date: 04/30/2021
Story points: -
Release:
Release relationship: Auto

Description

Background: Applications want to access data in Arvados through a web API. Keep-web currently supports WebDAV. Although WebDAV is a standard, third-party applications used by customers do not support it. Instead, many third-party applications support reading from and writing to S3-compatible object storage. Keep-web should support an S3-compatible API so that these applications can easily access Arvados.

The initial implementation should support the "list objects", "get object", and "put object" APIs using:
  • object name = file path within collection
  • bucket name = collection UUID
  • secret key = Arvados API token
  • endpoint = ExternalURL of keep-web service

...with enough S3 compatibility to test with a common S3 client library like https://github.com/minio/minio-go.
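
As a rough illustration of the mapping above, the following sketch builds the URL a client might request for one file in a collection. It assumes path-style addressing (endpoint/bucket/key), which the description does not specify. The function name `ObjectURL`, the host name, and the UUID are hypothetical placeholders, and the Arvados API token (the secret key) is deliberately left out because it would be supplied through the S3 client's credentials rather than the URL.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// ObjectURL is a hypothetical helper (not part of keep-web or any S3 client).
// It applies the mapping from the description, assuming path-style addressing:
//   endpoint    = ExternalURL of the keep-web service
//   bucket name = collection UUID
//   object name = file path within the collection
func ObjectURL(endpoint, collectionUUID, filePath string) string {
	// Escape each path segment separately so "/" keeps its meaning
	// as the directory separator inside the collection.
	segments := strings.Split(filePath, "/")
	for i, s := range segments {
		segments[i] = url.PathEscape(s)
	}
	return strings.TrimRight(endpoint, "/") + "/" + collectionUUID + "/" + strings.Join(segments, "/")
}

func main() {
	// Placeholder endpoint and UUID, for illustration only.
	fmt.Println(ObjectURL("https://keep-web.example/", "zzzzz-4zz18-0123456789abcde", "dir/my file.txt"))
}
```

A real client library such as minio-go would build and sign these requests itself; the sketch only shows which Arvados value goes in which S3 slot.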


Related issues: 10 (2 open, 8 closed)

Related to Arvados - Idea #16535: [keep-web] Minimal implementation of S3 API (Resolved, Tom Clegg, 07/21/2020)
Related to Arvados - Feature #16744: [keep-web] Support more S3 write APIs: DeleteObjects, POST object (New)
Related to Arvados - Idea #16809: [keep-web] Check V4 signature on S3 requests, don't require sending entire Arvados token as AccessKey (Resolved, Tom Clegg, 09/22/2020)
Related to Arvados - Bug #16830: [keep-web] S3 PutObject response should have content MD5 (New)
Related to Arvados - Bug #16850: [keep-web] Add KeyCount field to ListObjects response (Resolved, Tom Clegg, 09/21/2020)
Related to Arvados - Bug #16790: [keep-web] S3 ListObjects response should not have empty NextMarker field (Resolved, Tom Clegg, 09/01/2020)
Related to Arvados - Feature #17009: [keep-web] S3 API should accept bucket name as first component of domain name (Resolved, Tom Clegg, 11/19/2020)
Related to Arvados - Feature #17119: Virtual folder in FUSE/S3/WebDAV with contents defined by a query (Resolved, Ward Vandewege, 02/12/2021)
Related to Arvados - Feature #16669: Accept OpenID Connect access token (Resolved, Tom Clegg, 09/24/2020)
Related to Arvados - Bug #17507: [keep-web] [s3] Implement NextContinuationToken per ListObjectsV2 API (Resolved, Tom Clegg, 04/23/2021)
Actions #1

Updated by Peter Amstutz over 4 years ago

  • Start date set to 06/01/2021
  • Due date set to 08/31/2021
Actions #3

Updated by Peter Amstutz over 4 years ago

  • Description updated (diff)
Actions #4

Updated by Tom Clegg over 4 years ago

  • Description updated (diff)
Actions #5

Updated by Peter Amstutz over 4 years ago

Seems like this also needs to work for "bucket name = project UUID"? I imagine a case where a third-party service or scientific instrument writes data to Arvados that consists of a number of different samples, in subdirectories, that belong in separate collections. We should ask potential users what they want to do with it.

Actions #6

Updated by Tom Clegg over 4 years ago

I expect there are uses for "bucket name = project UUID" (and perhaps an "everything" bucket), but we might not want to jump in unless it's really needed right away. Semantics are a bit weird: creating an object called "foo" would never work, but creating "foo/bar" would (assuming "foo" is a collection in the project, or if we allow collections to be implicitly created that way). Implementing an efficient "list objects", iterating over all subprojects and collections with paging/cursors, might be an interesting challenge.
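
To make the "foo" versus "foo/bar" point concrete, here is a minimal sketch of how an object key in a project-level bucket could be split into a collection name and a file path. This is only an illustration of the semantics discussed above, not how keep-web behaves. `SplitProjectKey` and `ErrNoCollection` are hypothetical names, and the sketch does not check whether the collection exists or should be created implicitly.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// ErrNoCollection is a hypothetical error for keys that cannot be mapped
// to a file inside a collection.
var ErrNoCollection = errors.New("object key must have the form collection/path")

// SplitProjectKey is a hypothetical helper: in a "bucket name = project UUID"
// scheme, the first key component would name a collection in the project and
// the remainder would be the file path within that collection.
func SplitProjectKey(key string) (string, string, error) {
	collection, filePath, found := strings.Cut(key, "/")
	if !found || collection == "" || filePath == "" {
		// A bare name like "foo" has nowhere to live: it would have to be
		// a file directly inside the project, which is not possible.
		return "", "", ErrNoCollection
	}
	return collection, filePath, nil
}

func main() {
	for _, key := range []string{"foo", "foo/bar", "foo/sub/dir/file.txt"} {
		collection, filePath, err := SplitProjectKey(key)
		if err != nil {
			fmt.Printf("%q: rejected (%v)\n", key, err)
			continue
		}
		fmt.Printf("%q: collection %q, file path %q\n", key, collection, filePath)
	}
}
```

The harder part raised above, an efficient paged "list objects" across all subprojects and collections, is not addressed by this sketch.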

Actions #7

Updated by Peter Amstutz over 4 years ago

  • Start date changed from 06/01/2021 to 07/01/2020
  • Due date changed from 08/31/2021 to 10/01/2020
Actions #8

Updated by Tom Clegg over 4 years ago

  • Related to Idea #16535: [keep-web] Minimal implementation of S3 API added
Actions #9

Updated by Tom Clegg over 4 years ago

  • Related to Feature #16744: [keep-web] Support more S3 write APIs: DeleteObjects, POST object added
Actions #10

Updated by Tom Clegg over 4 years ago

  • Related to Feature #16745: [keep-web] Improve performance of S3 APIs using server-side cache added
Actions #11

Updated by Peter Amstutz over 4 years ago

  • Status changed from New to In Progress
Actions #12

Updated by Tom Clegg over 4 years ago

  • Related to Idea #16809: [keep-web] Check V4 signature on S3 requests, don't require sending entire Arvados token as AccessKey added
Actions #13

Updated by Peter Amstutz over 4 years ago

  • Due date changed from 10/01/2020 to 10/30/2020
Actions #14

Updated by Peter Amstutz over 4 years ago

  • Due date changed from 10/30/2020 to 10/31/2020
Actions #15

Updated by Tom Clegg over 4 years ago

  • Related to Bug #16830: [keep-web] S3 PutObject response should have content MD5 added
Actions #16

Updated by Tom Clegg over 4 years ago

  • Related to Bug #16842: [keepstore] S3 driver should handle sub-second precision in timestamps added
Actions #17

Updated by Tom Clegg over 4 years ago

  • Related to deleted (Bug #16842: [keepstore] S3 driver should handle sub-second precision in timestamps)
Actions #18

Updated by Tom Clegg over 4 years ago

  • Related to Bug #16850: [keep-web] Add KeyCount field to ListObjects response added
Actions #20

Updated by Tom Clegg over 4 years ago

  • Related to Bug #16790: [keep-web] S3 ListObjects response should not have empty NextMarker field added
Actions #21

Updated by Tom Clegg about 4 years ago

  • Related to Feature #17009: [keep-web] S3 API should accept bucket name as first component of domain name added
Actions #22

Updated by Peter Amstutz about 4 years ago

  • Due date changed from 10/31/2020 to 11/18/2020
Actions #23

Updated by Peter Amstutz about 4 years ago

  • Related to Feature #17119: Virtual folder in FUSE/S3/WebDAV with contents defined by a query added
Actions #24

Updated by Peter Amstutz about 4 years ago

  • Due date changed from 11/18/2020 to 12/31/2020
Actions #25

Updated by Peter Amstutz about 4 years ago

Actions #26

Updated by Peter Amstutz about 4 years ago

  • Related to deleted (Feature #16745: [keep-web] Improve performance of S3 APIs using server-side cache)
Actions #27

Updated by Peter Amstutz almost 4 years ago

  • Due date changed from 12/31/2020 to 02/28/2021
Actions #28

Updated by Peter Amstutz almost 4 years ago

  • Due date changed from 02/28/2021 to 03/31/2021
Actions #29

Updated by Peter Amstutz over 3 years ago

  • Related to Bug #17507: [keep-web] [s3] Implement NextContinuationToken per ListObjectsV2 API added
Actions #30

Updated by Peter Amstutz over 3 years ago

  • Due date changed from 03/31/2021 to 04/30/2021
Actions #31

Updated by Peter Amstutz over 3 years ago

  • Status changed from In Progress to Resolved