Bug #4839

closed

[Node Manager] Should look at Arvados node's crunch_worker_state, not info['slurm_state']

Added by Brett Smith almost 10 years ago. Updated almost 10 years ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: Node Manager
Target version:
Story points: 0.5

Description

As of this writing, the Node Manager's ComputeNodeMonitorActor and friends look at the node record's info['slurm_state'] string to decide whether or not the node is eligible for shutdown.

But the API server is responsible for knowing its dispatch method and translating between that and a common string. It exposes this as crunch_worker_state, which can be one of 'busy', 'idle', or 'down'. Node Manager should use this field to make shutdown decisions instead.

Note that I'm only talking about making a change when Node Manager is looking at a node record to make shutdown decisions. Code that talks to SLURM directly, like the ComputeNodeShutdownActor in the SLURM dispatch module, doesn't need to be changed.
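
To illustrate the proposed change, here is a minimal sketch of a shutdown-eligibility check that reads crunch_worker_state from the node record and ignores info['slurm_state']. It is not the actual Node Manager code: the function names, the KNOWN_STATES constant, and the eligible_states parameter are hypothetical. Treating only 'idle' as eligible, and treating a missing or unrecognized state as not eligible, are assumptions that this ticket does not specify.

```python
# Hypothetical sketch, not the real Node Manager implementation.
# Only the field name crunch_worker_state and its three values
# ('busy', 'idle', 'down') come from the ticket description.

KNOWN_STATES = frozenset(['busy', 'idle', 'down'])


def worker_state(node_record):
    """Return the record's crunch_worker_state, or None if it is
    missing or not one of the three documented values."""
    state = node_record.get('crunch_worker_state')
    return state if state in KNOWN_STATES else None


def shutdown_eligible(node_record, eligible_states=('idle',)):
    """Decide shutdown eligibility from crunch_worker_state alone.

    Assumption: only 'idle' nodes are eligible by default. Callers
    can pass a different eligible_states tuple if their policy differs.
    info['slurm_state'] is deliberately never consulted.
    """
    return worker_state(node_record) in eligible_states
```

For example, a record with crunch_worker_state set to 'busy' is not eligible under this sketch even if its info['slurm_state'] still reads 'idle', because the dispatcher-specific string is no longer part of the decision.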


Subtasks: 2 (0 open, 2 closed)

Task #5169: Heed crunch_worker_state (Resolved, Tom Clegg, 12/18/2014)
Task #5203: Review 4839-worker-state (Resolved, Tom Clegg, 12/18/2014)
Actions #1

Updated by Tom Clegg almost 10 years ago

  • Target version changed from Bug Triage to Arvados Future Sprints
Actions #2

Updated by Tom Clegg almost 10 years ago

  • Target version changed from Arvados Future Sprints to 2015-02-18 sprint
Actions #3

Updated by Brett Smith almost 10 years ago

  • Assigned To set to Brett Smith
Actions #4

Updated by Tom Clegg almost 10 years ago

  • Assigned To changed from Brett Smith to Tom Clegg
Actions #5

Updated by Brett Smith almost 10 years ago

61fd9276 is good to merge. Thanks.

Actions #6

Updated by Anonymous almost 10 years ago

  • Status changed from New to Resolved
  • % Done changed from 50 to 100

Applied in changeset arvados|commit:91abe2648d8ca1a3a5185e94beb505ad33db9e2c.
