Bug #4839

closed

[Node Manager] Should look at Arvados node's crunch_worker_state, not info['slurm_state']

Added by Brett Smith almost 10 years ago. Updated almost 10 years ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
Node Manager
Target version:
Story points:
0.5

Description

As of this writing, the Node Manager's ComputeNodeMonitorActor and friends look at the node record's info['slurm_state'] string to decide whether or not the node is eligible for shutdown.

But the API server is responsible for knowing its dispatch method and translating between that and a common string. It exposes this as crunch_worker_state, which can be one of 'busy', 'idle', or 'down'. Node Manager should use this field to make shutdown decisions instead.

Note that I'm only talking about making a change when Node Manager is looking at a node record to make shutdown decisions. Code that talks to SLURM directly, like the ComputeNodeShutdownActor in the SLURM dispatch module, doesn't need to be changed.
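As a rough illustration of the requested change, the sketch below reads crunch_worker_state from a node record instead of info['slurm_state']. The helper names (worker_state, shutdown_eligible) and the dict-shaped record are hypothetical and are not Node Manager's actual API. Treating only 'idle' as eligible for shutdown is also an assumption; this ticket does not say how 'down' or a missing value should be handled.

```python
# Hypothetical sketch: these helper names are illustrative only and are
# not taken from Node Manager's code.

# The three values the API server exposes, per the description above.
KNOWN_STATES = frozenset(['busy', 'idle', 'down'])


def worker_state(node_record):
    """Return the record's crunch_worker_state, or None if missing/unknown.

    node_record is assumed to be a dict-like Arvados node record.
    """
    state = node_record.get('crunch_worker_state')
    return state if state in KNOWN_STATES else None


def shutdown_eligible(node_record):
    """Decide shutdown eligibility from crunch_worker_state only.

    Assumption: only an 'idle' worker is eligible. info['slurm_state']
    is deliberately ignored here.
    """
    return worker_state(node_record) == 'idle'
```

For example, a record with crunch_worker_state 'busy' is not eligible under this sketch even if its info['slurm_state'] string still reads 'idle', because the dispatch-specific string is never consulted.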


Subtasks: 2 (0 open, 2 closed)

Task #5169: Heed crunch_worker_state (Resolved, Tom Clegg, 12/18/2014)
Task #5203: Review 4839-worker-state (Resolved, Tom Clegg, 12/18/2014)