Bug #4839
closed
[Node Manager] Should look at Arvados node's crunch_worker_state, not info['slurm_state']
Description
As of this writing, the Node Manager's ComputeNodeMonitorActor and friends look at the node record's info['slurm_state'] string to decide whether or not the node is eligible for shutdown.
But the API server is responsible for knowing its dispatch method and translating that method's node states into a common string. It exposes this string as crunch_worker_state, which can be one of 'busy', 'idle', or 'down'. Node Manager should use this field instead when making shutdown decisions.
Note that I'm only talking about making a change when Node Manager is looking at a node record to make shutdown decisions. Code that talks to SLURM directly, like the ComputeNodeShutdownActor in the SLURM dispatch module, doesn't need to be changed.
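To illustrate the proposed change, here is a minimal sketch of a shutdown check that reads crunch_worker_state from the node record rather than info['slurm_state']. The helper names (worker_state, shutdown_eligible) are hypothetical and not taken from the Node Manager source. Treating only 'idle' as shutdown-eligible, and treating a missing or unrecognized state as not eligible, are assumptions made for this example.

```python
# Hypothetical sketch; not the actual Node Manager implementation.

# The three values the API server documents for crunch_worker_state.
KNOWN_STATES = frozenset(['busy', 'idle', 'down'])


def worker_state(arvados_node):
    """Return the record's crunch_worker_state, or None if absent or unrecognized."""
    if arvados_node is None:
        return None
    state = arvados_node.get('crunch_worker_state')
    return state if state in KNOWN_STATES else None


def shutdown_eligible(arvados_node):
    """Decide shutdown eligibility from crunch_worker_state only.

    Assumption: only an 'idle' worker is a shutdown candidate. The
    dispatcher-specific info['slurm_state'] string is deliberately ignored.
    """
    return worker_state(arvados_node) == 'idle'


if __name__ == '__main__':
    # The API server's common string wins over the raw SLURM string.
    record = {'crunch_worker_state': 'busy', 'info': {'slurm_state': 'idle'}}
    print(shutdown_eligible(record))
```

Because the check never inspects info['slurm_state'], it keeps working unchanged if the API server is configured with a different dispatch method.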
Updated by Tom Clegg almost 10 years ago
- Target version changed from Bug Triage to Arvados Future Sprints
Updated by Tom Clegg almost 10 years ago
- Target version changed from Arvados Future Sprints to 2015-02-18 sprint
Updated by Tom Clegg almost 10 years ago
- Assigned To changed from Brett Smith to Tom Clegg
Updated by Anonymous almost 10 years ago
- Status changed from New to Resolved
- % Done changed from 50 to 100
Applied in changeset arvados|commit:91abe2648d8ca1a3a5185e94beb505ad33db9e2c.