Feature #6518

Updated by Tom Clegg over 5 years ago

When containers appear in the queue, use SLURM to execute them on worker nodes.

For now, the queue is arvados.v1.containers.queue (much like the Crunch1 job queue).

From [[Crunch2 dispatch]]:

SLURM batch mode
* Use "sinfo" to determine whether it is possible to run the container
* Submit a batch job to the queue: "echo crunch-run --job {uuid} | sbatch -N1"
* When a container's priority changes, use scontrol and scancel to propagate the change to SLURM
* Use strigger to run a cleanup script when a container exits
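The steps above could be sketched as command builders, shown here only to make the intended SLURM calls concrete. This is a rough sketch under stated assumptions, not the dispatcher's implementation: the function names are hypothetical, naming the SLURM job after the container UUID is an assumption (it lets scontrol and scancel find the job later), the priority-to-nice mapping is invented for illustration, and treating priority 0 as "cancel" is assumed. The crunch-run invocation is wrapped in a small shell script because sbatch expects a script beginning with an interpreter line.

```python
import shlex


def sbatch_command(uuid):
    """Return (argv, stdin_script) for submitting one container to SLURM.

    Assumption: the SLURM job is named after the container UUID.
    """
    # sbatch expects a script starting with "#!", so wrap crunch-run in one.
    script = "#!/bin/sh\nexec crunch-run --job {}\n".format(shlex.quote(uuid))
    argv = ["sbatch", "-N1", "--job-name=" + uuid]
    return argv, script


def priority_command(uuid, priority):
    """Return the argv that propagates a priority change to SLURM.

    Assumptions: priority <= 0 means "cancel", and a higher container
    priority maps to a lower nice value. The 1000 offset is hypothetical.
    """
    if priority <= 0:
        return ["scancel", "--name=" + uuid]
    nice = max(0, 1000 - priority)
    return ["scontrol", "update", "JobName=" + uuid, "Nice={}".format(nice)]


def cleanup_trigger_command(slurm_job_id, program):
    """Return the argv that registers a cleanup program to run when the job finishes."""
    return [
        "strigger",
        "--set",
        "--jobid={}".format(slurm_job_id),
        "--fini",
        "--program=" + program,
    ]
```

A dispatcher could pass the returned argv lists to subprocess.run, feeding the script text to sbatch on standard input. Keeping command construction separate from execution makes the mapping easy to check without a SLURM cluster.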

The cleanup script only has to handle cases such as the node dying before crunch-run has a chance to update the container record to state="Complete".
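One possible shape for the cleanup decision, as a hedged sketch: when the SLURM job ends, look at the container record and act only if crunch-run never reached a final state. The function name, the set of final states, and the choice to mark such containers "Cancelled" are assumptions made for illustration; the ticket does not specify what the script should write.

```python
# Assumption: these are the states in which no cleanup is needed.
FINAL_STATES = {"Complete", "Cancelled"}


def cleanup_action(container_state):
    """Decide what the cleanup script should do for one finished SLURM job.

    Returns None if crunch-run already finalized the record, otherwise the
    state the record should be moved to (hypothetical policy: "Cancelled").
    """
    if container_state in FINAL_STATES:
        return None
    return "Cancelled"
```

With this policy, a container still recorded as running after its node died would be moved to "Cancelled", while a container that crunch-run already marked "Complete" is left untouched.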