Bug #17529
closed[a-d-c] AWS/EC2 driver should return a RateLimitError to dispatcher if the upstream error is RequestLimitExceeded
Description
Current code results in error logs like this:
{"InstanceType":"c5large.spot","PID":2231,"error":"RequestLimitExceeded: Request limit exceeded.\n\tstatus code: 503, request id: 778d5b22-90e4-4a48-8f09-6946e8edcb2c","level":"error","msg":"create failed","time":"2021-03-30T13:38:38.925565910Z"}
...but the returned error does not implement the lib/cloud.RateLimitError interface, so the dispatcher doesn't back off.
Updated by Tom Clegg almost 4 years ago
- Status changed from New to In Progress
17529-ec2-rate-limit @ ff11506c916cb2cd8abd1905e16c4d4f5ddd4240 -- developer-run-tests: #2412
Updated by Ward Vandewege almost 4 years ago
Tom Clegg wrote:
17529-ec2-rate-limit @ ff11506c916cb2cd8abd1905e16c4d4f5ddd4240 -- developer-run-tests: #2412
Nit: in `func newEC2InstanceSet` the last "else" can be dropped and the block outdented
This looks good for the "RunInstances" call; should we also add this for the other function interfaces that ec2Interface exposes?
Otherwise, LGTM, thanks.
Updated by Tom Clegg almost 4 years ago
Ward Vandewege wrote:
Nit: in `func newEC2InstanceSet` the last "else" can be dropped and the block outdented
Done with refactor.
This looks good for the "RunInstances" call; should we also add this for the other function interfaces that ec2Interface exposes?
Create and Instances() are currently the only methods whose caller (in a-d-c) will actually notice the don't-retry-until advice. Updated branch so Instances() is covered too.
We could also handle SetTags and Destroy (perhaps even reducing MaxCloudOpsPerSecond on the fly as well as logging something) ... but that would be a somewhat bigger fix, and (I suspect) not quite as important practically speaking.
That said, I noticed SetTags() wasn't even obeying MaxCloudOpsPerSecond in a-d-c, and that was trivial to fix.
17529-ec2-rate-limit @ 47cbcd47789be77f3a1c44ba605853f50c448e8a -- developer-run-tests: #2419
Updated by Ward Vandewege almost 4 years ago
Tom Clegg wrote:
Ward Vandewege wrote:
Nit: in `func newEC2InstanceSet` the last "else" can be dropped and the block outdented
Done with refactor.
This looks good for the "RunInstances" call; should we also add this for the other function interfaces that ec2Interface exposes?
Create and Instances() are currently the only methods whose caller (in a-d-c) will actually notice the don't-retry-until advice. Updated branch so Instances() is covered too.
We could also handle SetTags and Destroy (perhaps even reducing MaxCloudOpsPerSecond on the fly as well as logging something) ... but that would be a somewhat bigger fix, and (I suspect) not quite as important practically speaking.
That said, I noticed SetTags() wasn't even obeying MaxCloudOpsPerSecond in a-d-c, and that was trivial to fix.
17529-ec2-rate-limit @ 47cbcd47789be77f3a1c44ba605853f50c448e8a -- developer-run-tests: #2419
Cool, thanks for that additional fix in SetTags(). This LGTM; should we create a follow-on ticket to make a-d-c notice the don't-retry-until behavior for SetTags() and Destroy(), too (and the corresponding changes in the cloud drivers) ?
Updated by Tom Clegg almost 4 years ago
- Related to Bug #17561: [arvados-dispatch-cloud] inst.SetTags() and inst.Destroy() should respect rate-limiting responses from cloud provider added
Updated by Tom Clegg almost 4 years ago
- % Done changed from 0 to 100
- Status changed from In Progress to Resolved
Applied in changeset arvados|e96a1b2bef7762c2896273af5d15625f4a845c2e.