Feature #19263

Updated by Peter Amstutz over 1 year ago

HPC systems can support low-priority (preemptible) jobs, similar to cloud providers. One way to do this is with a separate "preemptible" queue.

Proposal: similar to GPU support, if a job is marked "preemptible", the Arvados configuration can supply a command line fragment that submits the job to a site-specific "preemptible" queue.

The dispatcher must also recognize when a job has been preempted and behave appropriately (e.g. not try to re-submit the job). For example, when LSF preempts a job, it can be configured to suspend it with SIGSTOP and keep it in memory to resume later. We don't want Arvados to think a suspended job has failed and try to cancel it.
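One possible way to express this is to map the job state reported by LSF to a dispatcher action, treating suspended states as "still alive". The sketch below is only an assumption about how that could look, not the actual dispatcher logic: the `action` type and the function `actionForLSFState` are hypothetical, while the state names (`PEND`, `RUN`, `PSUSP`, `USUSP`, `SSUSP`, `DONE`, `EXIT`) are the ones shown in the `STAT` column of `bjobs`.

```go
package main

import "fmt"

// action is a hypothetical description of what the dispatcher does next.
type action string

const (
	actionWait     action = "wait"     // leave the job alone and poll again later
	actionTrack    action = "track"    // job is running; keep monitoring it
	actionFinalize action = "finalize" // job has ended; record the outcome
)

// actionForLSFState maps an LSF job state to a dispatcher action.
// Suspended states are deliberately treated like pending jobs so that
// a preempted job is neither cancelled nor re-submitted.
func actionForLSFState(stat string) action {
	switch stat {
	case "PEND", "PSUSP", "USUSP", "SSUSP":
		return actionWait
	case "RUN":
		return actionTrack
	case "DONE", "EXIT":
		return actionFinalize
	default:
		// Unrecognized states: take no destructive action in this sketch.
		return actionWait
	}
}

func main() {
	for _, stat := range []string{"RUN", "SSUSP", "EXIT"} {
		fmt.Printf("%s -> %s\n", stat, actionForLSFState(stat))
	}
}
```

The key design choice here is that a suspended job is never mapped to cancel or re-submit; whether unrecognized states should also default to waiting is a judgment call left open in this sketch.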