Bug #11561

[API] Limit number of lock/unlock cycles for a given container

Added by Tom Clegg over 2 years ago. Updated 8 months ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: API
Target version:
Start date: 04/26/2017
Due date:
% Done: 100%
Estimated time: (Total: 0.00 h)
Story points: 2.0
Release relationship: Auto

Description

Currently, if a container cannot be started due to some infrastructure problem (whether or not it's related to the specific container), it will be retried forever.

Proposed solution:

Add a site config knob (analogous to num_retries) that limits the number of times a container can be unlocked (moved from Locked to Queued state) before being automatically cancelled.

Add:
  • Config key max_container_dispatch_attempts (default 5)
  • DB column "lock_count" (do not include in API response)
  • Increment lock_count during lock()
  • When unlocking a container, if lock_count >= Rails.configuration.max_container_dispatch_attempts, change state to Cancelled instead of Queued (the unlock API should still respond 200 in this case) and update runtime_status[error] with an error message.
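A minimal sketch of the lock/unlock behaviour listed above, written as plain Ruby rather than the actual ActiveRecord model. The class name SketchContainer, the max_attempts keyword, the constant standing in for Rails.configuration.max_container_dispatch_attempts, and the wording of the error message are all assumptions made for illustration; only lock_count, the state names, and the default of 5 come from the description.

```ruby
# Hypothetical stand-in for Rails.configuration.max_container_dispatch_attempts.
MAX_CONTAINER_DISPATCH_ATTEMPTS = 5

# In-memory model of a container (assumption: the real model is a database-backed class).
class SketchContainer
  attr_reader :state, :lock_count, :runtime_status

  def initialize(max_attempts: MAX_CONTAINER_DISPATCH_ATTEMPTS)
    @state = "Queued"
    @lock_count = 0
    @runtime_status = {}
    @max_attempts = max_attempts
  end

  # Queued -> Locked. Each successful lock counts as one dispatch attempt.
  def lock
    raise "cannot lock when #{@state}" unless @state == "Queued"
    @lock_count += 1
    @state = "Locked"
  end

  # Locked -> Queued, or Locked -> Cancelled once the attempt limit is reached.
  # The call succeeds in both cases, mirroring the "still respond 200" requirement.
  def unlock
    raise "cannot unlock when #{@state}" unless @state == "Locked"
    if @lock_count >= @max_attempts
      @state = "Cancelled"
      # Hypothetical message text; the description only asks for "an error message".
      @runtime_status["error"] = "Failed to start container after #{@lock_count} dispatch attempts"
    else
      @state = "Queued"
    end
    @state
  end
end
```

With max_attempts set to 3, the first two lock/unlock cycles return the container to Queued, and the third unlock moves it to Cancelled and sets runtime_status["error"].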

Write tests and update documentation.


Subtasks

Task #14837: Review 11561-limit-container-locks (Resolved, Peter Amstutz)


Related issues

Related to Arvados - Bug #11190: Containers seem to run more than once, which isn't supposed to happen (Resolved, 03/01/2017)

Has duplicate Arvados - Bug #14540: [API] Limit number of container lock/unlock cycles (Duplicate)

Is duplicate of Arvados - Bug #9688: [Crunch2] Limit number of dispatch attempts per container (Duplicate, 08/02/2016)

Associated revisions

Revision a4efc217
Added by Peter Amstutz 8 months ago

Merge branch '11561-limit-container-locks' refs #11561

Arvados-DCO-1.1-Signed-off-by: Peter Amstutz <>

History

#1 Updated by Tom Clegg over 2 years ago

  • Description updated (diff)

#2 Updated by Tom Morris about 2 years ago

  • Target version set to Arvados Future Sprints

#3 Updated by Peter Amstutz 9 months ago

  • Related to Bug #9688: [Crunch2] Limit number of dispatch attempts per container added

#4 Updated by Peter Amstutz 9 months ago

  • Target version changed from Arvados Future Sprints to To Be Groomed

#5 Updated by Tom Morris 9 months ago

This is a near duplicate of #9688. We should probably just merge the two.

#6 Updated by Tom Morris 9 months ago

  • Related to Bug #14540: [API] Limit number of container lock/unlock cycles added

#7 Updated by Peter Amstutz 9 months ago

  • Related to deleted (Bug #14540: [API] Limit number of container lock/unlock cycles)

#8 Updated by Peter Amstutz 9 months ago

  • Has duplicate Bug #14540: [API] Limit number of container lock/unlock cycles added

#9 Updated by Peter Amstutz 9 months ago

  • Related to deleted (Bug #9688: [Crunch2] Limit number of dispatch attempts per container)

#10 Updated by Peter Amstutz 9 months ago

  • Is duplicate of Bug #9688: [Crunch2] Limit number of dispatch attempts per container added

#11 Updated by Peter Amstutz 9 months ago

  • Status changed from New to Duplicate

#12 Updated by Peter Amstutz 9 months ago

  • Status changed from Duplicate to New
  • Priority changed from Normal to High

#14 Updated by Peter Amstutz 9 months ago

  • Description updated (diff)

#15 Updated by Peter Amstutz 9 months ago

  • Description updated (diff)

#16 Updated by Peter Amstutz 9 months ago

  • Story points set to 2.0

#17 Updated by Tom Morris 8 months ago

  • Target version changed from To Be Groomed to 2019-02-27 Sprint

#18 Updated by Peter Amstutz 8 months ago

  • Assigned To set to Peter Amstutz

#19 Updated by Peter Amstutz 8 months ago

11561-limit-container-locks @ 0f14b3456d2d3bdf95b78b65a1a41280a7416928

https://ci.curoverse.com/view/Developer/job/developer-run-tests/1074/

Added lock_count + migration

Updated lock/unlock

Added test

Added configuration parameter (and added it to the new cluster config design doc as well)

#20 Updated by Lucas Di Pentima 8 months ago

This LGTM, thanks!

#21 Updated by Peter Amstutz 8 months ago

  • Status changed from New to Resolved

#22 Updated by Tom Morris 8 months ago

  • Release set to 15
