Bug #21636
closedkeepstore S3 driver gets TokenExpiry errors when using automatic credentials / IAMRole feature
Description
We should be using the ExpiryWindow feature in the AWS SDK to avoid this.
We should also return 5xx instead of 404 if we get a TokenExpiry error while trying to read/write a block, so the client can retry.
Updated by Tom Clegg about 1 year ago
21636-s3-token-expiry @ 4ec8e3b32cd11141f9d639f651b721feff66c437 -- developer-run-tests: #4112
21636-s3-token-expiry-2.7 @ 26fcb9e9e22526a11badb61b6d782876914008e8 -- 2.7-developer-run-tests: #16
Updated by Brett Smith about 1 year ago
I don't want to bikeshed this, but I am curious what the rationale is for setting ExpiryWindow to 1 minute. Given that the current behavior we're seeing suggests the expiry is ~24 hours, I probably would've set it on the order of 10-30 minutes: long enough to provide confidence that even painfully slow cloud API calls will have a valid token for their entire lifecycle, without excessive token cycling. But I'm not going to insist on that, especially if you've got basically any explanation for this value.
Am I following right that the reason the main branch only gets a test for returning 5xx, and no corresponding code changes, is because the keepstore rewrite already did that?
Thanks.
Updated by Tom Clegg about 1 year ago
Brett Smith wrote in #note-2:
...rationale is for setting ExpiryWindow to 1 minute...
The SDK examples and defaults in other SDKs were 10-15 seconds and I thought that seemed too short. I've now looked further and found that the minimum session time is 15 minutes, and new credentials are documented to be available "at least 5 minutes" before the old ones expire, so I've updated to 5 minutes and added that doc link in a comment.
Am I following right that the reason the main branch only gets a test for returning 5xx, and no corresponding code changes, is because the keepstore rewrite already did that?
Yes, that's right.
Updated by Tom Clegg about 1 year ago
- Status changed from In Progress to Resolved
Applied in changeset arvados|a15ba265e59898fb59fa85fac01c0a9efc4d064f.
Updated by Peter Amstutz about 1 year ago
Tom Clegg wrote in #note-3:
The SDK examples and defaults in other SDKs were 10-15 seconds and I thought that seemed too short. I've now looked further and found that the minimum session time is 15 minutes, and new credentials are documented to be available "at least 5 minutes" before the old ones expire, so I've updated to 5 minutes and added that doc link in a comment.
Hopefully it doesn't introduce a new race condition where credentials are available exactly at the 5 minute mark and the code tries to refresh it too soon...
Updated by Tom Clegg about 1 year ago
Based on my reading of the aws-sdk-go-v2 code, if we refresh too soon, that just means we get credentials that expire in 5 minutes, which therefore get used only once for the current call, and then we refresh again next time.
Updated by Peter Amstutz about 1 year ago
Tom Clegg wrote in #note-6:
Based on my reading of the aws-sdk-go-v2 code, if we refresh too soon, that just means we get credentials that expire in 5 minutes, which therefore get used only once for the current call, and then we refresh again next time.
So we probably get the old credentials back in that case and we end up having to refresh again anyway but it is unlikely to break?
Updated by Tom Clegg about 1 year ago
Peter Amstutz wrote in #note-7:
So we probably get the old credentials back in that case and we end up having to refresh again anyway but it is unlikely to break?
Yes.