Bug #11561

Updated by Tom Clegg over 7 years ago

Currently, if a container cannot be started due to some infrastructure problem (whether or not it's related to the specific container), it will be retried repeatedly forever.

 Proposed solution: 

 Add a site config knob, analogous to num_retries, that limits the number of times a container can be unlocked (moved from Locked to Queued state) due to an infrastructure problem before being automatically cancelled.

 Add: 
 * Config key max_container_dispatch_attempts (default 5) 
 * DB column "lock_count" 
 * API response field "lock_count" 
 * Increment lock_count during lock() 
 * When unlocking a container, if lock_count >= Rails.configuration.max_container_dispatch_attempts, change state to Cancelled instead of Queued (the unlock API should still respond 200 in this case) 
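
 The proposed lock/unlock behaviour could be sketched roughly as follows. This is a minimal plain-Ruby illustration, not the actual API server code: the Container class here is a hypothetical stand-in for the real model, and the constant MAX_CONTAINER_DISPATCH_ATTEMPTS stands in for Rails.configuration.max_container_dispatch_attempts so the example runs without Rails.

```ruby
# Hypothetical stand-in for Rails.configuration.max_container_dispatch_attempts.
MAX_CONTAINER_DISPATCH_ATTEMPTS = 5

# Simplified, hypothetical model of a container; only the states
# relevant to this proposal are represented.
class Container
  QUEUED    = "Queued"
  LOCKED    = "Locked"
  CANCELLED = "Cancelled"

  attr_reader :state, :lock_count

  def initialize
    @state = QUEUED
    @lock_count = 0
  end

  # Queued -> Locked; every successful lock counts as a dispatch attempt.
  def lock
    raise "cannot lock from state #{@state}" unless @state == QUEUED
    @lock_count += 1
    @state = LOCKED
  end

  # Locked -> Queued, or Locked -> Cancelled once the attempt limit is reached.
  # The call succeeds either way, mirroring the "still respond 200" requirement.
  def unlock
    raise "cannot unlock from state #{@state}" unless @state == LOCKED
    @state = if @lock_count >= MAX_CONTAINER_DISPATCH_ATTEMPTS
               CANCELLED
             else
               QUEUED
             end
  end
end
```

 With the default limit of 5, a container survives four lock/unlock cycles in the Queued state and is cancelled on the fifth unlock.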
