Feature #14021

[crunch-dispatch-slurm] option to set job priority directly instead of using nice values

Added by Tom Clegg 4 months ago. Updated 4 months ago.

Status: New
Priority: Normal
Assigned To: -
Category: -
Target version:
Start date:
Due date:
% Done: 0%
Estimated time:
Story points: -

Description

Background: crunch-dispatch-slurm (c-d-s) uses slurm's "nice" feature to make the priority order of its slurm jobs match the priority order of their corresponding Arvados containers. Unlike setting slurm job priorities directly, this does not require c-d-s to have slurm administrator privileges. However, on older versions of slurm, its effectiveness is limited by slurm's maximum nice value of 10000.

crunch-dispatch-slurm could offer a configuration flag that causes it to set slurm job priority directly ("scontrol update jobid=X priority=Y") instead of restricting itself to non-privileged "nice" operations.
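A minimal sketch of how such a flag might choose between the two kinds of update. The flag name set_priority_directly and the helper scontrol_args are hypothetical, and the "nice=" form is assumed to mirror the "priority=" form quoted above; this is an illustration, not c-d-s's actual code.

```python
def scontrol_args(job_id, value, set_priority_directly=False):
    """Build the scontrol argv for one priority update.

    set_priority_directly is a hypothetical config flag. When True, the
    job's priority is set outright (needs slurm administrator privileges);
    otherwise the non-privileged nice adjustment is used.
    """
    field = "priority" if set_priority_directly else "nice"
    return ["scontrol", "update", f"jobid={job_id}", f"{field}={value}"]
```

The returned list could be passed to subprocess.run(..., check=True), so that a rejected update (for example, one made without the required privileges) surfaces as an error.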

This configuration would be useful on systems running slurm 15 where it is acceptable to give slurm administrator privileges to crunch-dispatch-slurm.

Possible implementation strategies:
  1. Assume Arvados jobs are the only ones running. Choose a high priority like 2^32-3 (slurm's usual "TOP" priority) for the highest-priority Arvados job, and count down from there using the existing PrioritySpread logic.
  2. Wait for each submitted job to appear in the slurm queue, and note its automatically assigned priority. When updating priorities, start at the highest auto-assigned priority of all jobs still shown in squeue, and count down from there using the existing PrioritySpread logic.
    • Caveat: on a cloud installation, jobs frequently drop into "admin hold" state (priority=0), so it might be some time before c-d-s gets a chance to see a given job's automatically assigned priority. The resulting behavior might be confusing.
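Both strategies amount to picking a starting priority and counting down from it. The sketch below illustrates that idea under stated assumptions: assign_priorities and observed_top are hypothetical helpers, "spread" is a simple fixed step standing in for the existing PrioritySpread logic (which may behave differently), and the inputs are plain dictionaries rather than real squeue output.

```python
TOP_PRIORITY = 2**32 - 3  # strategy 1 starting point, as suggested above


def assign_priorities(container_priorities, spread, top=TOP_PRIORITY):
    """Map job id -> slurm priority, counting down from `top`.

    container_priorities: dict of job id -> Arvados container priority
    (higher runs first). `spread` is a fixed step standing in for
    PrioritySpread; the real logic may differ.
    """
    ordered = sorted(container_priorities,
                     key=container_priorities.get, reverse=True)
    # Floor at 1, because priority 0 would put the job on hold.
    return {job: max(top - i * spread, 1) for i, job in enumerate(ordered)}


def observed_top(squeue_priorities):
    """Strategy 2 starting point: highest auto-assigned priority seen so far.

    squeue_priorities: dict of job id -> priority as reported by squeue.
    Held jobs (priority 0) carry no information and are skipped; returns
    None if no usable priority has been seen yet.
    """
    seen = [p for p in squeue_priorities.values() if p > 0]
    return max(seen) if seen else None
```

For strategy 2, the result of observed_top would be passed as the top argument. A None result corresponds to the caveat above: every visible job is held, so there is no auto-assigned priority to start from yet.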

History

#2 Updated by Tom Clegg 4 months ago

  • Target version set to To Be Groomed

#3 Updated by Tom Clegg 4 months ago

  • Subject changed from "[crunch-dispatch-slurm] Config option to set job priority directly instead of using nice values" to "[crunch-dispatch-slurm] option to set job priority directly instead of using nice values"
