Revision 16 (Tom Clegg, 12/23/2015 09:43 PM) → Revision 17/26 (Tom Clegg, 01/06/2016 08:46 PM)

h1. Container dispatch 

 {{toc}} 

 h2. Summary 

 A dispatcher uses available compute resources to execute queued containers. 

 Dispatch is meant to be a small, simple component rather than a pluggable framework: e.g., "slurm dispatch" can be a small standalone program rather than a plugin for a big generic dispatch program. 

 h2. Pseudocode 

 * Notice there is a queued container 
 * Decide whether the required resources are available to run the container 
 * Lock the container (this avoids races with other dispatch processes) 
 * Translate the container's runtime constraints and priority to instructions for the lower-level scheduler, if any 
 * Invoke the "crunch2 run" executor 
 * When the priority changes on a container taken by this dispatch process, update the lower-level scheduler accordingly (cancel if priority is zero) 
 * If the lower-level scheduler indicates the container is finished or abandoned, but the Container record is locked by this dispatcher and has state=Running, fail the container 
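The steps above can be sketched with in-memory stand-ins. This is only an illustration under stated assumptions: the @Container@ and @Dispatcher@ classes, the @ram_mb@ constraint, and the @outcome@ field are hypothetical names, and how a failed or cancelled container is actually recorded is not specified on this page.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Container:
    uuid: str
    priority: int
    ram_mb: int                      # hypothetical runtime constraint
    state: str = "Queued"
    locked_by: Optional[str] = None
    outcome: Optional[str] = None    # hypothetical: "failed" or "cancelled"


@dataclass
class Dispatcher:
    name: str
    free_ram_mb: int
    running: dict = field(default_factory=dict)  # uuid -> Container

    def try_dispatch(self, c: Container) -> bool:
        """Lock and start one queued container if resources allow."""
        if c.state != "Queued" or c.ram_mb > self.free_ram_mb:
            return False  # leave it queued; another dispatcher may take it
        c.state, c.locked_by = "Locked", self.name  # lock before doing work
        self.free_ram_mb -= c.ram_mb
        self.running[c.uuid] = c
        self.start_executor(c)
        return True

    def start_executor(self, c: Container) -> None:
        # Stand-in for invoking the executor, which marks the container Running.
        c.state = "Running"

    def on_priority_change(self, c: Container, priority: int) -> None:
        c.priority = priority
        # A real dispatcher would also update the lower-level scheduler here.
        if priority == 0 and c.uuid in self.running:
            self.finish(c, "cancelled")

    def on_scheduler_exit(self, c: Container) -> None:
        # The scheduler says the job is gone, but the record still says
        # Running and is locked by us: fail the container.
        if c.locked_by == self.name and c.state == "Running":
            self.finish(c, "failed")

    def finish(self, c: Container, outcome: str) -> None:
        c.state, c.outcome = "Complete", outcome
        self.free_ram_mb += c.ram_mb
        self.running.pop(c.uuid, None)
```

Note that @try_dispatch@ returns without touching a container it cannot run, so the container stays queued for other dispatchers.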

 h2. Examples 

 slurm batch mode 
 * Use "sinfo" to determine whether it is possible to run the container 
 * Submit a batch job to the queue: "echo crunch-run --job {uuid} | sbatch -N1" 
 * When container priority changes, use scontrol and scancel to propagate changes to slurm 
 * Use strigger to run a cleanup script when a container exits 
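As a rough sketch of the submission step, the helper below builds an sbatch command line and a batch script for one container. The function name, the @vcpus@ and @ram_mb@ parameters, and the choice to use the container UUID as the slurm job name are assumptions for illustration; the executor command is taken from the example above.

```python
import shlex


def sbatch_request(uuid, vcpus=1, ram_mb=None, executor=("crunch-run", "--job")):
    """Return (argv, script) for submitting one container as a slurm batch job."""
    argv = ["sbatch", "-N1", f"--job-name={uuid}", f"--cpus-per-task={vcpus}"]
    if ram_mb is not None:
        argv.append(f"--mem={ram_mb}")  # sbatch reads a bare number as megabytes
    # sbatch expects the script on stdin to start with an interpreter line.
    script = "#!/bin/sh\nexec " + shlex.join([*executor, uuid]) + "\n"
    return argv, script
```

A dispatcher could then submit with something like @subprocess.run(argv, input=script, text=True, check=True)@. Naming the job after the container is one way to find it again when propagating priority changes.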

 standalone worker 
 * Inspect /proc/meminfo, /proc/cpuinfo, "docker ps", etc. to determine local capacity 
 * Invoke crunch-run as a child process (or perhaps a detached daemon process) 
 * Signal crunch-run to stop if container priority changes to zero 
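For the local capacity check, a minimal sketch might parse the text of /proc/meminfo as shown below. The function names and the @reserve_kb@ safety margin are hypothetical; a real worker would also look at CPU count and running Docker containers.

```python
def mem_available_kb(meminfo_text):
    """Return MemAvailable in kB, falling back to MemFree if it is absent."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts:
            fields[key.strip()] = int(parts[0])  # first token is the number
    return fields.get("MemAvailable", fields.get("MemFree", 0))


def can_run_locally(ram_kb_needed, meminfo_text, reserve_kb=0):
    """True if the container's RAM request fits in currently available memory."""
    return mem_available_kb(meminfo_text) - reserve_kb >= ram_kb_needed
```

In use, the text would come from reading /proc/meminfo on the worker, e.g. @can_run_locally(needed, open("/proc/meminfo").read())@.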

 h2. Arvados API support 

 Each dispatch process has an Arvados API token that allows it to see queued containers. 
 * No two dispatch processes can run at the same time with the same token. One way to achieve this is to make a user record for each dispatch service. 

 Container APIs relevant to a dispatch program: 
 * List Queued containers (might be a subset of Queued containers) 
 * List containers with state=Locked or state=Running associated with current token 
 * Receive event when container is created or modified and state is Queued (it might become runnable) 
 * Change state Queued->Locked 
 * Change state Locked->Queued 
 * Change state Locked->Running 
 * Change state Running->Complete 
 * Receive event when priority changes 
 * Receive event when state changes to Complete 
 * Create a unique API token to pass to crunch-run (expires when the container stops) 
 * Create events/logs 
 ** Decided not to run this container (e.g., no node with those resources) 
 ** Decided to run this container 
 ** Lock failed 
 ** Dispatched to crunch-run 
 ** Cleaned up crashed crunch-run (lower-level scheduler indicates the job finished, but crunch-run didn't leave the container in a final state) 
 ** Cleaned up abandoned container (container belongs to this process, but dispatch and lower-level scheduler don't know about it) 
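The state changes listed above form a small state machine. The sketch below shows one way the rules could be checked, assuming a container record is a plain dict with hypothetical @state@ and @locked_by@ keys, and assuming that only the token holding the lock may move a container out of Locked or Running. It is not the API server's implementation.

```python
ALLOWED = {
    ("Queued", "Locked"),
    ("Locked", "Queued"),
    ("Locked", "Running"),
    ("Running", "Complete"),
}


class TransitionError(Exception):
    """Raised when a state change is not permitted."""


def change_state(record, new_state, token):
    """Apply one state change to record in place, or raise TransitionError."""
    old = record["state"]
    if (old, new_state) not in ALLOWED:
        raise TransitionError(f"{old} -> {new_state} is not allowed")
    if old == "Queued":
        record["locked_by"] = token          # taking the lock
    elif record.get("locked_by") != token:
        raise TransitionError("container is locked by another dispatcher")
    record["state"] = new_state
    if new_state == "Queued":
        record["locked_by"] = None           # releasing the lock
    return record
```

A second dispatcher that tries Queued->Locked on an already locked record fails the first check, which is what lets locking avoid races between dispatch processes.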

 h2. Non-responsibilities 

 Dispatch doesn't retry failed containers. If something needs to be reattempted, a new container will appear in the queue. 

 Dispatch doesn't fail a container that it can't run. It doesn't know whether other dispatchers will be able to run it. 

 

 h2. Additional notes 

 (see also #6429 and #6518 and #8028) 

 Using websockets to listen for container events (new containers added, priority changes) will benefit from some Go SDK support.