Bug #11561

[API] Limit number of lock/unlock cycles for a given container

Added by Tom Clegg over 1 year ago. Updated over 1 year ago.

Status: New
Priority: Normal
Assigned To: -
Category: API
Target version:
Start date: 04/26/2017
Due date:
% Done: 0%
Estimated time:
Story points: -

Description

Currently, if a container cannot be started because of an infrastructure problem (whether or not the problem is related to that specific container), it is retried indefinitely.

Proposed solution:

Add a site config knob (analogous to num_retries) that limits the number of times a container can be unlocked (moved from Locked to Queued state) before being automatically cancelled.

Add:
  • Config key max_container_dispatch_attempts (default 5)
  • DB column "lock_count"
  • API response field "lock_count"
  • Increment lock_count during lock()
  • When unlocking a container, if lock_count >= Rails.configuration.max_container_dispatch_attempts, change state to Cancelled instead of Queued (the unlock API should still respond 200 in this case)
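
The proposed lock/unlock behavior can be sketched in plain Ruby as follows. This is only an illustration of the state transitions, not the actual API server code: the class name ContainerSketch, the max_dispatch_attempts keyword (standing in for Rails.configuration.max_container_dispatch_attempts) and the simulate_failed_dispatch helper are all hypothetical.

```ruby
# Hypothetical stand-in for the real Container model; not Arvados code.
class ContainerSketch
  QUEUED = "Queued"
  LOCKED = "Locked"
  CANCELLED = "Cancelled"

  attr_reader :state, :lock_count

  # max_dispatch_attempts stands in for
  # Rails.configuration.max_container_dispatch_attempts (proposed default 5).
  def initialize(max_dispatch_attempts: 5)
    @max_dispatch_attempts = max_dispatch_attempts
    @state = QUEUED
    @lock_count = 0
  end

  # Queued -> Locked; every successful lock counts as one dispatch attempt.
  def lock
    raise ArgumentError, "cannot lock when #{@state}" unless @state == QUEUED
    @lock_count += 1
    @state = LOCKED
    self
  end

  # Locked -> Queued, or Locked -> Cancelled once the limit is reached.
  # The call succeeds either way, mirroring the "still respond 200" rule.
  def unlock
    raise ArgumentError, "cannot unlock when #{@state}" unless @state == LOCKED
    @state = @lock_count >= @max_dispatch_attempts ? CANCELLED : QUEUED
    self
  end
end

# Simulate a dispatcher that can never start the container:
# it keeps locking and unlocking until the container leaves the queue.
def simulate_failed_dispatch(container)
  container.lock.unlock while container.state == ContainerSketch::QUEUED
  container
end
```

With the proposed default of 5, a container that can never be started is locked five times; the fifth unlock moves it to Cancelled instead of back to Queued, so the retry loop ends.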

Related issues

Related to Arvados - Bug #11190: Containers seem to run more than once, which isn't supposed to happen (status: Resolved; 2017-03-01)

History

#1 Updated by Tom Clegg over 1 year ago

  • Description updated

#2 Updated by Tom Morris over 1 year ago

  • Target version set to Arvados Future Sprints
