Feature #17009

[keep-web] S3 API should accept bucket name as first component of domain name

Added by Tom Clegg about 2 months ago. Updated 5 days ago.

Status: Feedback
Priority: Normal
Assigned To:
Category: Keep
Target version:
Start date: 11/19/2020
Due date:
% Done: 50%
Estimated time: (Total: 0.00 h)
Story points: 1.0

Description

Currently the keep-web S3 API only accepts the bucket name in the request path. It should be easy enough to also accept the bucket name as the first component of the domain name, as keep-web already does for non-S3 requests.
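
To illustrate the two request styles, here is a minimal sketch in Python. It is not keep-web's actual code (keep-web is a separate service with its own routing); the function name `split_bucket`, the `base_domain` parameter, and the example hostnames are all hypothetical.

```python
from typing import Tuple


def split_bucket(host: str, path: str, base_domain: str) -> Tuple[str, str]:
    """Return (bucket, key) for a path-style or virtual-host-style request.

    base_domain is a hypothetical S3 endpoint name, for example
    "collections.example.com".
    """
    host = host.split(":", 1)[0].lower()  # drop an optional port
    base = base_domain.lower()
    if host != base and host.endswith("." + base):
        # Virtual-host style: the bucket is the leading part of the domain name.
        bucket = host[: -len(base) - 1]
        return bucket, path.lstrip("/")
    # Path style: the bucket is the first path segment.
    bucket, _, key = path.lstrip("/").partition("/")
    return bucket, key
```

With this sketch, a request for `/foo/bar.txt` on host `mybucket.collections.example.com` and a request for `/mybucket/foo/bar.txt` on host `collections.example.com` both resolve to bucket `mybucket` and key `foo/bar.txt`.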


Subtasks

Task #17143: Review 17009-s3-bucket-vhost (Resolved, Peter Amstutz)

Task #17169: Get cyberduck to work (New)


Related issues

Related to Arvados Epics - Story #16360: Keep-web supports S3 compatible interface (In Progress, 07/01/2020, 12/31/2020)

Blocked by Arvados - Feature #17011: Add keep-web wildcard DNS to salt (In Progress, 11/25/2020)

Associated revisions

Revision 40a4776f
Added by Tom Clegg 11 days ago

Merge branch '17009-s3-bucket-vhost'

closes #17009

Arvados-DCO-1.1-Signed-off-by: Tom Clegg <>

Revision 0c5e55d6
Added by Tom Clegg 5 days ago

17009: Fix bucket-level ops using virtual host-style requests.

refs #17009

Arvados-DCO-1.1-Signed-off-by: Tom Clegg <>

History

#1 Updated by Tom Clegg about 2 months ago

  • Related to Story #16360: Keep-web supports S3 compatible interface added

#2 Updated by Peter Amstutz 18 days ago

I'm trying to use the command-line version of Cyberduck from https://duck.sh/

I'm trying to list the contents of a bucket:

duck -l s3://download.ce8i5.arvadosapi.com/ce8i5-j7d0g-g6r8w0853s32ged/

This doesn't work because it is connecting to

ce8i5-j7d0g-g6r8w0853s32ged.download.ce8i5.arvadosapi.com

From debugging, I see something about:

s3service.disable-dns-buckets=false

This seems to be a configuration option of the jets3t Java library used by Duck. I don't know how to set it, though.
Creating ~/.duck/jets3t.properties didn't seem to work.
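
For reference, the difference between the two addressing styles on the client side can be sketched as below. This is only an illustration of the behavior observed above, not jets3t's actual logic; `request_target` and its `dns_buckets` flag are hypothetical names, with `dns_buckets=True` loosely standing in for `s3service.disable-dns-buckets=false`.

```python
from typing import Tuple


def request_target(endpoint: str, bucket: str, key: str = "",
                   dns_buckets: bool = True) -> Tuple[str, str]:
    """Return the (Host header, request path) a client would use.

    dns_buckets is a hypothetical flag: True puts the bucket in the
    hostname, False keeps it as the first path segment.
    """
    if dns_buckets:
        # Virtual-host style: bucket becomes the first domain component.
        return f"{bucket}.{endpoint}", "/" + key
    # Path style: bucket stays in the path.
    return endpoint, f"/{bucket}/{key}"
```

Called with endpoint `download.ce8i5.arvadosapi.com` and bucket `ce8i5-j7d0g-g6r8w0853s32ged`, the default branch yields the host `ce8i5-j7d0g-g6r8w0853s32ged.download.ce8i5.arvadosapi.com`, which matches the connection attempt described above.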

#3 Updated by Peter Amstutz 18 days ago

  • Target version set to 2020-12-02 Sprint

#4 Updated by Peter Amstutz 18 days ago

#5 Updated by Peter Amstutz 13 days ago

  • Story points set to 1.0

#6 Updated by Peter Amstutz 13 days ago

  • Assigned To set to Tom Clegg

#8 Updated by Tom Clegg 12 days ago

  • Status changed from New to In Progress

#9 Updated by Tom Clegg 12 days ago

Worth adding a note to that keep-web install page along these lines? "The *.collections.ClusterID.example.com option is preferred if you plan to access Keep using third-party S3 client software."

(Some clients can be configured to use a different pattern like {bucket}--collections.example.com but even for them it's probably less effort overall to use the default pattern.)
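
A rough sketch of what the two hostname patterns expand to, assuming a client that substitutes the bucket name into a `{bucket}` placeholder. The helper `bucket_hostname` is hypothetical and does not correspond to any particular client's configuration syntax.

```python
def bucket_hostname(pattern: str, bucket: str) -> str:
    """Substitute the bucket name into a hostname pattern.

    The "{bucket}" placeholder syntax is an assumption made for this sketch.
    """
    return pattern.replace("{bucket}", bucket)


# Default pattern: the bucket is a separate domain component.
default_host = bucket_hostname("{bucket}.collections.example.com", "mybucket")

# Alternative pattern: the bucket is joined to the first component with "--".
alt_host = bucket_hostname("{bucket}--collections.example.com", "mybucket")
```

Here `default_host` is `mybucket.collections.example.com` and `alt_host` is `mybucket--collections.example.com`.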

#10 Updated by Peter Amstutz 11 days ago

17009-s3-bucket-vhost @ baeef76a2b3b60fb3613d01b1df2916397e8c589

Well, that was easy.

We'll want to do some manual testing when the wildcard certificates get set up on one of the dev clusters.

Otherwise, this LGTM.

Tom Clegg wrote:

Worth adding a note to that keep-web install page along these lines? "The *.collections.ClusterID.example.com option is preferred if you plan to access Keep using third-party S3 client software."

(Some clients can be configured to use a different pattern like {bucket}--collections.example.com but even for them it's probably less effort overall to use the default pattern.)

Yes, it should be recommended. Also, the introduction on that page should mention support for the S3 API.

#11 Updated by Tom Clegg 11 days ago

Install doc updates:

17009-s3-bucket-vhost @ 2c3df643bc9effb76a26d56c6b4881856003c053

#12 Updated by Anonymous 11 days ago

  • Status changed from In Progress to Resolved

#13 Updated by Peter Amstutz 6 days ago

  • Status changed from Resolved to Feedback

#14 Updated by Peter Amstutz 5 days ago

Cyberduck still doesn't quite work. keep-web is supposed to return a list of bucket contents, but for virtual host-style requests it returns an empty application/x-directory object instead.

$ duck -v -l  s3://collections.ce8i5.arvadosapi.com/ce8i5-4zz18-ohp73xy8om7aipj
Listing directory ce8i5-4zz18-ohp73xy8om7aipj…
Login collections.ce8i5.arvadosapi.com. Login collections.ce8i5.arvadosapi.com – S3 with username and password. No login credentials could be found in the Keychain.
Access Key ID (peter): ce8i5-gj3su-02f1ov5mgblpf5b
Login as ce8i5-gj3su-02f1ov5mgblpf5b
Secret Access Key: 
WARNING! Passwords are stored in plain text in ~/.duck/credentials.
Save password (y/n): y
Authenticating as ce8i5-gj3su-02f1ov5mgblpf5b…
> GET / HTTP/1.1
> Date: Wed, 25 Nov 2020 16:14:18 GMT
> x-amz-content-sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
> Host: collections.ce8i5.arvadosapi.com
> x-amz-date: 20201125T161418Z
> Authorization: ********
> Connection: Keep-Alive
> User-Agent: Cyberduck/7.7.0.33744 (Linux/4.19.0-10-amd64) (amd64)

< HTTP/1.1 200 OK
< Server: nginx/1.14.0 (Ubuntu)
< Date: Wed, 25 Nov 2020 16:14:18 GMT
< Content-Type: application/xml
< Content-Length: 271
< Connection: keep-alive
< Strict-Transport-Security: max-age=63072000

> GET /?encoding-type=url&max-keys=1000&prefix&delimiter=%2F HTTP/1.1
> Date: Wed, 25 Nov 2020 16:14:18 GMT
> x-amz-content-sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
> Host: collections.ce8i5.arvadosapi.com
> x-amz-date: 20201125T161418Z
> Authorization: ********
> Connection: Keep-Alive
> User-Agent: Cyberduck/7.7.0.33744 (Linux/4.19.0-10-amd64) (amd64)

< HTTP/1.1 200 OK
< Server: nginx/1.14.0 (Ubuntu)
< Date: Wed, 25 Nov 2020 16:14:18 GMT
< Content-Type: application/xml
< Content-Length: 272
< Connection: keep-alive

Login successful…

> GET /?versioning HTTP/1.1
> Date: Wed, 25 Nov 2020 16:14:18 GMT
> x-amz-content-sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
> Host: ce8i5-4zz18-ohp73xy8om7aipj.collections.ce8i5.arvadosapi.com
> x-amz-date: 20201125T161418Z
> Authorization: ********
> Connection: Keep-Alive
> User-Agent: Cyberduck/7.7.0.33744 (Linux/4.19.0-10-amd64) (amd64)

< HTTP/1.1 200 OK
< Server: nginx/1.14.0 (Ubuntu)
< Date: Wed, 25 Nov 2020 16:14:19 GMT
< Content-Type: application/x-directory
< Content-Length: 0
< Connection: keep-alive
< Strict-Transport-Security: max-age=63072000

> GET /?encoding-type=url&max-keys=1000&prefix&delimiter=%2F HTTP/1.1
> Date: Wed, 25 Nov 2020 16:14:19 GMT
> x-amz-content-sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
> Host: ce8i5-4zz18-ohp73xy8om7aipj.collections.ce8i5.arvadosapi.com
> x-amz-date: 20201125T161419Z
> Authorization: ********
> Connection: Keep-Alive
> User-Agent: Cyberduck/7.7.0.33744 (Linux/4.19.0-10-amd64) (amd64)

< HTTP/1.1 200 OK
< Server: nginx/1.14.0 (Ubuntu)
< Date: Wed, 25 Nov 2020 16:14:19 GMT
< Content-Type: application/x-directory
< Content-Length: 0
< Connection: keep-alive
< Strict-Transport-Security: max-age=63072000
Listing directory ce8i5-4zz18-ohp73xy8om7aipj failed. Failed to parse XML document with handler class org.jets3t.service.impl.rest.XmlResponsesSaxParser$ListBucketHandler. Please contact your web hosting service provider for assistance.
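
The parse failure is consistent with the empty bodies in the last two responses: a list request should come back as an XML document whose root element is `ListBucketResult`. A minimal check along those lines, as a sketch only (the helper name `is_list_bucket_result` and the sample XML are made up for illustration):

```python
import xml.etree.ElementTree as ET


def is_list_bucket_result(body: bytes) -> bool:
    """Return True if body parses as XML with a ListBucketResult root."""
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        # Empty or malformed bodies end up here.
        return False
    # Strip an optional "{namespace}" prefix from the root tag.
    return root.tag.rsplit("}", 1)[-1] == "ListBucketResult"


# Hypothetical sample of a well-formed (and empty) listing.
sample = (
    b'<?xml version="1.0" encoding="UTF-8"?>'
    b'<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">'
    b"<Name>example</Name><IsTruncated>false</IsTruncated>"
    b"</ListBucketResult>"
)
```

A zero-length body like the ones logged above (`Content-Length: 0`) fails this check, while `sample` passes it.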
