Idea #9780

[Crunch2] crunch-dispatch-slurm only dispatches containers that meet specific runtime constraint criteria

Added by Brett Smith over 7 years ago. Updated about 4 years ago.

Status: Closed
Priority: Normal
Assigned To: -
Category: -
Target version: -
Start date: 08/12/2016
Due date:
Story points: -

Description

Cluster administrators want to be able to dispatch work to SLURM with different options based on the size of the work. For example, very-high-RAM jobs should go to a particular SLURM partition.

Implementation plan: run multiple dispatchers that are configured to accept different sets of containers based on their runtime constraints. Administrators can then configure those dispatchers with different SbatchArguments to dispatch jobs differently based on the runtime constraints.

Add configuration options to crunch-dispatch-slurm that encode a series of conditions on container runtime_constraints. The dispatcher only tries to dispatch containers that meet all of the configured conditions.
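The plan above could be expressed as a per-dispatcher configuration along these lines. `SbatchArguments` is an existing crunch-dispatch-slurm option; the `MinRAMRequest` key is an illustrative name for the runtime-constraint filter this ticket proposes, not a shipped option:

```yaml
# Sketch: a second crunch-dispatch-slurm instance that only handles
# high-RAM containers and sends them to a dedicated SLURM partition.
# "MinRAMRequest" is a hypothetical name for the proposed filter.
Client:
  APIHost: zzzzz.arvadosapi.com
  AuthToken: example-dispatcher-token
SbatchArguments:
  - "--partition=highmem"
# Proposed: only dispatch containers whose runtime_constraints.ram
# is at least this many bytes; other dispatchers skip them.
MinRAMRequest: 200000000000
```

A second dispatcher with the complementary condition (and no `--partition` override) would pick up everything else.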

#1

Updated by Brett Smith over 7 years ago

  • Subject changed from [ to [Crunch2] crunch-dispatch-slurm only dispatches containers that meet specific runtime constraint criteria
  • Description updated (diff)

I'll confess that I'm a little surprised that this is apparently needed. I thought you could encode all this policy in SLURM directly, as long as we tell SLURM what the hardware requirements are (which we already do)? But if I'm wrong this seems like a good implementation.

#2

Updated by Nico César over 7 years ago

Brett Smith wrote:

I'll confess that I'm a little surprised that this is apparently needed. I thought you could encode all this policy in SLURM directly, as long as we tell SLURM what the hardware requirements are (which we already do)? But if I'm wrong this seems like a good implementation.

As far as I know, if you have a few big-RAM nodes and several smaller-RAM nodes, chances are the big-RAM nodes will get allocated to small jobs, so when a big job arrives in the queue its resources aren't immediately available and it has to wait. To avoid that, you can create 2 partitions and specify the --partition parameter when you run sbatch.
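The two-partition setup described here might look like this in slurm.conf (node names and memory sizes are made up for illustration):

```ini
# slurm.conf excerpt: keep small jobs off the scarce high-memory nodes
# by putting those nodes in their own partition.
NodeName=compute[0-9] RealMemory=64000 State=UNKNOWN
NodeName=bigmem[0-1] RealMemory=1024000 State=UNKNOWN
PartitionName=compute Nodes=compute[0-9] Default=YES State=UP
PartitionName=highmem Nodes=bigmem[0-1] Default=NO State=UP
```

A dispatcher configured for large containers would then pass `--partition=highmem` to sbatch, while everything else lands in the default partition.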

#3

Updated by Tom Morris over 7 years ago

  • Project changed from 42 to Arvados
#4

Updated by Peter Amstutz about 4 years ago

  • Status changed from New to Closed