Idea #8574

Updated by Peter Amstutz about 8 years ago

Put SLURM in control of deciding when nodes should be allocated or destroyed on the cloud. This is necessary to avoid second-guessing SLURM when multiple jobs can share a node. This can be implemented in a way that is compatible with both current crunch and crunch v2, and is likely to improve the stability of current crunch.

 Proposals: 

1) The Arvados nodes table has a static list of entries (0..N) for each available compute node size, listing the resources of each node (CPUs, memory), the state from sinfo (idle/alloc/down), the state from node manager (booting/running/shutdown), and whether we want the node to be up or down (an example entry is sketched after this list).

2) Generate a partial SLURM configuration from the nodes table with nodes marked as "cloud" type; see https://dev.arvados.org/issues/6520#note-5 for details. ResumeProgram and SuspendProgram contact the API server and adjust the "desired state up/down" flag (see the configuration sketch below).

3) Change the architecture of node manager to get rid of the "wishlist" and stop monitoring the Arvados job queue; instead, compare the Arvados node list and the cloud node list and decide which nodes to start and stop based on the "up/down" flag in each node record (see the reconciliation sketch below).

4) Remove the code from crunch-dispatch that explicitly selects nodes (#nodes_available_for_job); instead, run salloc with the job's runtime constraints translated into the salloc parameters --nodes, --mincpus, and --mem. Remove the --immediate flag from salloc so that the request is queued (which will cause SLURM to request more nodes if no idle nodes are available). An added benefit is that salloc fails immediately if a job requests resources that cannot possibly be fulfilled by the current configuration, and this can be communicated to the user (see the salloc sketch below).
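
For proposal 1, a minimal sketch (in Python) of what a static node-table entry might look like; the field names here are illustrative, not the actual Arvados schema:

<pre>
from dataclasses import dataclass

@dataclass
class ComputeNodeEntry:
    slot: int               # stable slot number, 0..N per node size
    size: str               # cloud node size, e.g. "m4.xlarge"
    cpus: int               # CPUs to advertise to SLURM
    ram_mb: int             # memory to advertise to SLURM
    slurm_state: str        # from sinfo: "idle" / "alloc" / "down"
    nodemanager_state: str  # "booting" / "running" / "shutdown"
    desired_state: str      # "up" or "down", flipped by ResumeProgram/SuspendProgram

# Example: three pre-allocated slots of one node size, all powered off.
nodes_table = [
    ComputeNodeEntry(i, "m4.xlarge", 4, 16384, "down", "shutdown", "down")
    for i in range(3)
]
</pre>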
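
For proposal 2, a rough sketch of generating the partial SLURM configuration from that table. NodeName with State=CLOUD, ResumeProgram, SuspendProgram and SuspendTime are standard SLURM power-saving parameters; the node naming scheme and the script paths are assumptions, and the resume/suspend scripts would simply ask the API server to flip the desired-state flag.

<pre>
def slurm_config_fragment(nodes_table):
    # Hypothetical scripts that contact the API server and set
    # desired_state to "up" or "down" for the named nodes.
    lines = [
        "SuspendProgram=/usr/local/bin/arvados-slurm-suspend",
        "ResumeProgram=/usr/local/bin/arvados-slurm-resume",
        "SuspendTime=300",
    ]
    for i, n in enumerate(nodes_table):
        # State=CLOUD tells SLURM the node may not exist until it is resumed.
        lines.append("NodeName=compute%d CPUs=%d RealMemory=%d State=CLOUD"
                     % (i, n.cpus, n.ram_mb))
    lines.append("PartitionName=compute Nodes=ALL Default=YES State=UP")
    return "\n".join(lines)
</pre>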
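
For proposal 3, a simplified sketch of the reconciliation loop that would replace the wishlist: it only compares the desired-state flags against what is actually running in the cloud. The cloud driver interface (list_nodes/create_node/destroy_node keyed by slot) is assumed for illustration.

<pre>
def reconcile(nodes_table, cloud):
    # Index cloud nodes by the slot they were booted for.
    running = {cn.slot: cn for cn in cloud.list_nodes()}
    for entry in nodes_table:
        if entry.desired_state == "up" and entry.slot not in running:
            # SLURM asked for this node (via ResumeProgram): boot it.
            cloud.create_node(slot=entry.slot, size=entry.size)
        elif entry.desired_state == "down" and entry.slot in running:
            # SLURM released this node (via SuspendProgram): shut it down.
            cloud.destroy_node(running[entry.slot])
</pre>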
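
For proposal 4, a sketch of translating a job's runtime constraints into an salloc command line. --nodes, --mincpus and --mem are real salloc options; the constraint key names and the dispatch wrapper are illustrative, and crunch-dispatch itself is written in Ruby rather than Python.

<pre>
import subprocess

def salloc_args(rc):
    # Map runtime constraints onto salloc options; no --immediate, so the
    # request queues and SLURM can power up cloud nodes to satisfy it.
    return [
        "salloc",
        "--nodes=%d" % rc.get("min_nodes", 1),
        "--mincpus=%d" % rc.get("min_cores_per_node", 1),
        "--mem=%d" % rc.get("min_ram_mb_per_node", 1024),
    ]

def dispatch(job_uuid, runtime_constraints, crunch_command):
    cmd = salloc_args(runtime_constraints) + ["--job-name=" + job_uuid] + crunch_command
    # salloc exits with an error right away if the request can never be
    # satisfied by any configured node, which can be reported to the user.
    return subprocess.call(cmd)
</pre>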
