Bug #21287
Binning and throttling incoming and outgoing requests
Description
Originally from:
https://dev.arvados.org/issues/21285#note-2
In order to service a request, controller can do a number of things:
- Forward it to the local Rails API server
- Handle it entirely within controller (by querying the local database itself)
- Query another service (keep-web, or a crunch-run process on a compute node)
- Query another Arvados instance (federated queries)
In the third and fourth cases, we don't have full control over what the other service is going to do. However, we have existing patterns in the keep-web and federated cases where the remote service queries back to our controller in order to verify an API token, retrieve a user record, or get other data.
We've specifically observed this with keep-web, where:
- the Workbench 2 process page sends requests for all the log collection files at once
- this hits controller's request limit
- keep-web sends a request back to verify a token
- the token verification request is stuck behind the outstanding requests that were proxied to keep-web, which are waiting on keep-web, which is in turn waiting on the token verification
- the system is deadlocked until something times out
The current fix is to make sure the minimum request limit is high enough that we don't do this to ourselves.
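The condition can be sketched with a single counting semaphore shared by proxied requests and the callbacks they trigger. This is a minimal illustration, not controller's actual code: `limiter`, `tryAcquire`, and `simulate` are hypothetical names, and the limits are made-up numbers.

```go
package main

import "fmt"

// limiter is a hypothetical counting semaphore standing in for
// controller's single request limit.
type limiter chan struct{}

func newLimiter(n int) limiter { return make(limiter, n) }

// tryAcquire takes a slot without blocking and reports whether it succeeded.
func (l limiter) tryAcquire() bool {
	select {
	case l <- struct{}{}:
		return true
	default:
		return false
	}
}

// simulate occupies up to `proxied` slots with requests that are waiting
// on keep-web, then reports whether a token verification callback could
// still get a slot from the same limiter.
func simulate(limit, proxied int) bool {
	l := newLimiter(limit)
	for i := 0; i < proxied; i++ {
		if !l.tryAcquire() {
			break
		}
	}
	return l.tryAcquire()
}

func main() {
	fmt.Println(simulate(4, 4)) // false: the callback has no slot, nothing can finish
	fmt.Println(simulate(8, 4)) // true: a higher limit leaves room for the callback
}
```

Raising the limit only helps while the number of simultaneous proxied requests stays below it, which is why a separate allowance for callbacks is attractive.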
We could get into a similar situation with federation, but an even simpler problem is one where the remote service is in a slow, broken, or malicious state and acts as a tar pit that causes queries to hang for a long time. If the queue fills with outstanding requests, the system becomes unusable. (Of course, this is also possible with slow Rails/database requests, but the sysadmin has more control over those.)
Proposed solution
Limit both incoming and outgoing requests.
- determine request priority and timestamp for priority queue order
- start handling up to MaxConcurrentRequests incoming requests in priority order, with throttling
- when a request handler is going to make an outgoing request to Rails, acquire another throttled lock (up to MaxConcurrentRailsRequests) for that category of outgoing request
- the request acquires the rails lock in priority order
- also want to bin requests into categories, e.g.
  - requests that get information about the token, e.g. current user or current token
  - requests that proxy to keep-web
  - container gateway requests (already implemented)
  - everything else
Updated by Peter Amstutz about 1 year ago
- Status changed from New to In Progress
Updated by Peter Amstutz about 1 year ago
- Status changed from In Progress to New
Updated by Peter Amstutz about 1 year ago
- Subject changed from MaxExternalRequests config to MaxForwardedRequests config
Updated by Peter Amstutz about 1 year ago
- Subject changed from MaxForwardedRequests config to MaxProxiedRequests config
Updated by Peter Amstutz about 1 year ago
- Description updated
- Subject changed from MaxProxiedRequests config to MaxProxiedRequests config
Updated by Peter Amstutz 12 months ago
- Target version changed from Development 2024-01-17 sprint to Development 2024-01-31 sprint
Updated by Peter Amstutz 12 months ago
- Description updated
- Subject changed from MaxProxiedRequests config to Throttle both incoming and outgoing requests
Updated by Peter Amstutz 12 months ago
- Subject changed from Throttle both incoming and outgoing requests to Binning and throttling incoming and outgoing requests
Updated by Peter Amstutz 11 months ago
- Target version changed from Development 2024-01-31 sprint to Development 2024-02-14 sprint
Updated by Peter Amstutz 11 months ago
- Target version changed from Development 2024-02-14 sprint to Development 2024-02-28 sprint
Updated by Peter Amstutz 10 months ago
- Target version changed from Development 2024-02-28 sprint to Development 2024-03-13 sprint
Updated by Peter Amstutz 10 months ago
- Target version changed from Development 2024-03-13 sprint to Development 2024-03-27 sprint
Updated by Peter Amstutz 10 months ago
- Target version changed from Development 2024-03-27 sprint to Development 2024-04-24 sprint
Updated by Peter Amstutz 9 months ago
- Target version changed from Development 2024-04-24 sprint to Development 2024-04-10 sprint
Updated by Peter Amstutz 9 months ago
- Target version changed from Development 2024-04-10 sprint to Development 2024-04-24 sprint
Updated by Peter Amstutz 9 months ago
- Target version changed from Development 2024-04-24 sprint to Development 2024-05-08 sprint
Updated by Peter Amstutz 9 months ago
- Target version changed from Development 2024-05-08 sprint to Development 2024-06-05 sprint
Updated by Peter Amstutz 8 months ago
- Target version changed from Development 2024-06-05 sprint to 439
Updated by Peter Amstutz 7 months ago
- Target version changed from 439 to Development 2024-07-03 sprint
Updated by Peter Amstutz 6 months ago
- Target version changed from Development 2024-07-03 sprint to Development 2024-08-07 sprint
Updated by Peter Amstutz 6 months ago
- Target version changed from Development 2024-08-07 sprint to Future
Updated by Peter Amstutz about 22 hours ago
- Related to Bug #22414: MaxConcurrentRailsRequests default is too small added