Bug #11561
closed
[API] Limit number of lock/unlock cycles for a given container
Added by Tom Clegg almost 8 years ago.
Updated about 6 years ago.
Release relationship:
Auto
Description
Currently, if a container cannot be started because of an infrastructure problem (whether or not the problem is related to that specific container), it is retried indefinitely.
Proposed solution:
Add a site config knob (analogous to num_retries) that limits the number of times a container can be unlocked (moved from Locked to Queued state) before being automatically cancelled.
Add:
- Config key max_container_dispatch_attempts (default 5)
- DB column "lock_count" (do not include in API response)
- Increment lock_count during lock()
- When unlocking a container, if lock_count >= Rails.configuration.max_container_dispatch_attempts, change state to Cancelled instead of Queued and update runtime_status[error] with an error message. The unlock API should still respond 200 in this case.
Write tests and update documentation.
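The lock/unlock rule proposed above can be sketched in plain Ruby. This is only an illustrative in-memory model, not the actual API server code: the class name ContainerSketch, the MAX_CONTAINER_DISPATCH_ATTEMPTS constant (standing in for Rails.configuration.max_container_dispatch_attempts), the string key used for runtime_status, and the wording of the error message are all assumptions made for the example.

```ruby
# Hypothetical stand-in for Rails.configuration.max_container_dispatch_attempts.
MAX_CONTAINER_DISPATCH_ATTEMPTS = 5

# Simplified in-memory model of a container (assumption: not the real model class).
class ContainerSketch
  attr_reader :state, :lock_count, :runtime_status

  def initialize(max_attempts: MAX_CONTAINER_DISPATCH_ATTEMPTS)
    @state = "Queued"
    @lock_count = 0          # would be a DB column, hidden from API responses
    @runtime_status = {}
    @max_attempts = max_attempts
  end

  # Queued -> Locked. Every lock counts as one dispatch attempt.
  def lock
    raise "cannot lock a container in state #{@state}" unless @state == "Queued"
    @lock_count += 1
    @state = "Locked"
  end

  # Locked -> Queued, or Locked -> Cancelled once the attempt limit is reached.
  # Returns normally in both cases, mirroring "the unlock API should still respond 200".
  def unlock
    raise "cannot unlock a container in state #{@state}" unless @state == "Locked"
    if @lock_count >= @max_attempts
      # Hypothetical message text; the proposal only says "an error message".
      @runtime_status["error"] = "Failed to start container after #{@lock_count} dispatch attempts"
      @state = "Cancelled"
    else
      @state = "Queued"
    end
  end
end

# Simulate a dispatcher that can never start the container.
c = ContainerSketch.new
while c.state == "Queued"
  c.lock
  c.unlock
end
puts "#{c.state} after #{c.lock_count} attempts"
```

Because the count is incremented in lock and checked in unlock, the container is cancelled on the unlock that follows its fifth lock with the default limit, rather than being refused a lock on a sixth attempt.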
- Description updated (diff)
- Target version set to Arvados Future Sprints
- Related to Bug #9688: [Crunch2] Limit number of dispatch attempts per container added
- Target version changed from Arvados Future Sprints to To Be Groomed
This is a near duplicate of #9688. We should probably just merge the two.
- Related to Bug #14540: [API] Limit number of container lock/unlock cycles added
- Related to deleted (Bug #14540: [API] Limit number of container lock/unlock cycles)
- Has duplicate Bug #14540: [API] Limit number of container lock/unlock cycles added
- Related to deleted (Bug #9688: [Crunch2] Limit number of dispatch attempts per container)
- Is duplicate of Bug #9688: [Crunch2] Limit number of dispatch attempts per container added
- Status changed from New to Duplicate
- Status changed from Duplicate to New
- Priority changed from Normal to High
- Description updated (diff)
- Description updated (diff)
- Target version changed from To Be Groomed to 2019-02-27 Sprint
- Assigned To set to Peter Amstutz
- Status changed from New to Resolved
- Related to Bug #18102: max dispatch attempts error added