arvados / services / nodemanager @ 0654fa16

Name Size
.gitignore 27 Bytes 40 Bytes
README.rst 1.05 KB
agpl-3.0.txt 33.7 KB 30 Bytes 1.3 KB

Latest revisions

# Date Author Comment
0654fa16 04/07/2017 03:43 pm Peter Amstutz

Increase ping delay in WatchdogActorTest to try and reduce spurious test failures. no issue #

4bb024ae 04/04/2017 06:03 pm Peter Amstutz

11413: Use getattr() in exception handler.

a03ce405 04/04/2017 03:40 pm Peter Amstutz

11413: Wrap destroy_node with similar logic to create_node: on exception check
the node list to determine if the node was actually destroyed successfully.
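
The pattern described in this commit can be sketched roughly as follows. This is only an illustration, not the node manager's actual code: FakeDriver, its mode argument, and destroy_node_checked are hypothetical names, and the driver is a stand-in rather than the real cloud driver API.

```python
class FakeDriver:
    """Hypothetical stand-in for a cloud driver (not a real driver API)."""

    def __init__(self, node_ids, mode=None):
        self.node_ids = set(node_ids)
        # mode: None = succeed, "after" = destroy then raise,
        # "before" = raise without destroying.
        self.mode = mode

    def list_nodes(self):
        return sorted(self.node_ids)

    def destroy_node(self, node_id):
        if self.mode == "before":
            raise IOError("request rejected")
        self.node_ids.discard(node_id)
        if self.mode == "after":
            # Simulate a request that took effect but whose response was lost.
            raise IOError("connection reset")
        return True


def destroy_node_checked(driver, node_id):
    """Destroy a node; on exception, check the node list before giving up."""
    try:
        return driver.destroy_node(node_id)
    except Exception:
        # If the node is gone from the list, treat the destroy as successful.
        if node_id not in driver.list_nodes():
            return True
        # Otherwise the node still exists, so re-raise the original error.
        raise
```

With this sketch, a destroy call that raises after the node has actually gone away is reported as a success, while one that raises and leaves the node in the list still propagates the error.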

4a35e06b 04/04/2017 03:21 pm Peter Amstutz

11413: Fix issues with node manager on GCE:

  • Always override Node.size with CloudSizeWrapper
  • Get updated node record before setting metadata to minimize 'Supplied
    fingerprint does not match current metadata fingerprint.' error.
  • Use ex_set_node_metadata() instead of issuing request directly.

b8000c3c 03/23/2017 08:13 pm Peter Amstutz

11323: Don't try to offer_arvados_pair on unpaired nodes which are being shut down.

2e32ef16 03/23/2017 08:12 pm Peter Amstutz

11324: Fix crash in NodeManagerDaemonActor when receiving a node_can_shutdown
message for a node that has already been shut down.

2aef6ca0 03/23/2017 06:07 pm Peter Amstutz

11325: Remove "broken node" check. Assume if the node really isn't
functioning, it should be "down" in SLURM anyway.

Remove test_broken_node_not_counted because broken node check is removed.

da8c9048 03/17/2017 05:27 pm Peter Amstutz

11288: Slurm requires reason to put node in DOWN state.

b60a21fe 03/16/2017 08:49 pm Peter Amstutz

11254: Refactor _node_states

2c69d491 03/16/2017 08:11 pm Peter Amstutz

11254: Cloud nodes where "actor is None" are considered to be in shutdown. The
only time it should be "None" is the period between a successful shutdown and
when the node disappears from the cloud node list.
