Bug #12199

closed

Don't schedule jobs on nodes which are too much bigger than requested

Added by Tom Morris over 6 years ago. Updated over 5 years ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: -
Target version:
Story points: 3.0
Release relationship: Auto

Description

Add cloud node size and price info to site configuration object:
  • cloud instance id
  • price
  • cores
  • ram
  • disk
Add Go package (server/dispatch_cloud) that exports a node size calculator.
  • Load cluster config
  • Given an arvados.Container record, return the appropriate instance type (not just a string, but the whole type record from the config)
  • Return a predictable error (ErrInstanceTypesNotConfigured?) if none are listed
  • Return a predictable error (ErrConstraintsNotSatisfiable?) if no instance type is big enough
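
A minimal sketch of what such a calculator could look like. The InstanceType and Constraints structs, their field names, and the "cheapest type that fits" selection rule are assumptions made for illustration; the real package would take an arvados.Container record and read the instance types from the cluster config.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// InstanceType is a hypothetical shape for one instance type entry in the
// site configuration (cloud instance id, price, cores, ram, disk).
type InstanceType struct {
	Name    string  // cloud instance id
	Price   float64 // price per hour
	VCPUs   int     // cores
	RAM     int64   // bytes
	Scratch int64   // bytes of disk
}

// Constraints is a stand-in for the resource requests carried by an
// arvados.Container record (assumed fields, not the real type).
type Constraints struct {
	VCPUs   int
	RAM     int64
	Scratch int64
}

var (
	ErrInstanceTypesNotConfigured = errors.New("site configuration does not list any instance types")
	ErrConstraintsNotSatisfiable  = errors.New("no instance type is big enough for the container")
)

// ChooseInstanceType returns the cheapest configured type that satisfies the
// request, so a small container does not land on a much bigger node.
func ChooseInstanceType(types []InstanceType, want Constraints) (InstanceType, error) {
	if len(types) == 0 {
		return InstanceType{}, ErrInstanceTypesNotConfigured
	}
	// Sort a copy by price so the caller's slice is left untouched.
	sorted := append([]InstanceType(nil), types...)
	sort.SliceStable(sorted, func(i, j int) bool { return sorted[i].Price < sorted[j].Price })
	for _, it := range sorted {
		if it.VCPUs >= want.VCPUs && it.RAM >= want.RAM && it.Scratch >= want.Scratch {
			return it, nil
		}
	}
	return InstanceType{}, ErrConstraintsNotSatisfiable
}

func main() {
	types := []InstanceType{
		{Name: "small", Price: 0.10, VCPUs: 2, RAM: 4 << 30, Scratch: 50 << 30},
		{Name: "large", Price: 0.80, VCPUs: 16, RAM: 64 << 30, Scratch: 500 << 30},
	}
	it, err := ChooseInstanceType(types, Constraints{VCPUs: 1, RAM: 2 << 30, Scratch: 10 << 30})
	fmt.Println(it.Name, err)
}
```

Returning the whole record rather than just the name lets callers read price and size without a second lookup.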

In crunch-dispatch-slurm, before submitting a slurm job, use the dispatch_cloud calculator to determine the appropriate instance type. Add it to the sbatch command line in "--constraint". Omit the constraint if the calculator returns ErrInstanceTypesNotConfigured. Cancel the container if the calculator returns ErrConstraintsNotSatisfiable.
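
The three outcomes described for crunch-dispatch-slurm could be handled roughly as follows. This is only a sketch: sbatchArgs is a hypothetical helper, the two error values are redeclared locally so the example compiles on its own, and using the instance type name directly as the slurm feature name is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// Redeclared here for a self-contained example; in the proposal these would
// be exported by the dispatch_cloud package.
var (
	ErrInstanceTypesNotConfigured = errors.New("site configuration does not list any instance types")
	ErrConstraintsNotSatisfiable  = errors.New("no instance type is big enough for the container")
)

// sbatchArgs (hypothetical) takes the base sbatch arguments plus the
// calculator's result and decides what to do:
//   - success: append --constraint=<instance type>
//   - ErrInstanceTypesNotConfigured: submit without a constraint
//   - ErrConstraintsNotSatisfiable: do not submit; the caller cancels the container
//   - any other error: pass it back to the caller
func sbatchArgs(base []string, instanceType string, calcErr error) (args []string, cancel bool, err error) {
	args = append([]string(nil), base...) // copy so base is not modified
	switch {
	case calcErr == nil:
		return append(args, "--constraint="+instanceType), false, nil
	case errors.Is(calcErr, ErrInstanceTypesNotConfigured):
		return args, false, nil
	case errors.Is(calcErr, ErrConstraintsNotSatisfiable):
		return nil, true, nil
	default:
		return nil, false, calcErr
	}
}

func main() {
	base := []string{"--job-name=example", "--mem=1024"}
	args, cancel, err := sbatchArgs(base, "small", nil)
	fmt.Println(args, cancel, err)
}
```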

In nodemanager, for each slurm job with constraints, use the indicated instance type for the wishlist.

In the node boot script, use scontrol to set the slurm feature/constraint to match whatever is in the Arvados node record.
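
As an illustration of the boot-time step, the scontrol invocation could be assembled like this. The sketch assumes the "scontrol update NodeName=... Features=..." form and uses hypothetical node and feature names; the actual boot script might well be a shell script rather than Go, and how it reads the Arvados node record is not shown.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// scontrolFeatureArgs (hypothetical helper) builds the argument list for
// setting a node's slurm feature to the instance type recorded for it.
func scontrolFeatureArgs(nodeName, instanceType string) ([]string, error) {
	if nodeName == "" || instanceType == "" {
		return nil, errors.New("node name and instance type must both be set")
	}
	return []string{"update", "NodeName=" + nodeName, "Features=" + instanceType}, nil
}

func main() {
	args, err := scontrolFeatureArgs("compute0", "small")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Print the command instead of running it, so the sketch works without slurm.
	fmt.Println("scontrol " + strings.Join(args, " "))
}
```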

Optional (or not?): In nodemanager, load the site configuration file, and use the instance types found there instead of the ones in the "sizes" section of nodemanager's own config file.


Subtasks 1 (0 open, 1 closed)

Task #12971: Review 12199-dispatch-to-node-type (Resolved, Peter Amstutz, 01/29/2018)

Related issues

Related to Arvados - Bug #12908: install docs for slurm don't allow multiple jobs per node (Resolved)
Related to Arvados - Bug #12794: [Node Manager] Behave smarter in environments where scratch space can be arbitrarily sized, like GCP and AWS (Closed)
Related to Arvados - Bug #13166: [node manager] wishlist should consist of top priority containers (Resolved, Lucas Di Pentima, 03/26/2018)
