Idea #8000
Updated by Peter Amstutz over 9 years ago
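Apparently Node Manager only shuts down nodes that are "idle" in SLURM. If they are "down", they don't get shut down? In the log below, the node is reported as "shutdown window open but node busy", while both the Arvados node record and sinfo show compute0 as "down".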
<pre>
2015-12-11_20:41:05.08909 2015-12-11 20:41:05 arvnodeman.cloud_nodes[11545] DEBUG: CloudNodeListMonitorActor (at 140548410010704) got response with 1 items
2015-12-11_20:41:05.09007 2015-12-11 20:41:05 arvnodeman.daemon[11545] INFO: Registering new cloud node /subscriptions/a731f419-596b-4b64-a278-364e76506b06/resourceGroups/c97qk/providers/Microsoft.Compute/virtualMachines/compute-tj4hwdsw3yjiyjt-c97qk
2015-12-11_20:41:05.09273 2015-12-11 20:41:05 pykka[11545] DEBUG: Registered ComputeNodeMonitorActor (urn:uuid:83697dab-e718-4fd5-8595-b6563015585c)
2015-12-11_20:41:05.09280 2015-12-11 20:41:05 pykka[11545] DEBUG: Starting ComputeNodeMonitorActor (urn:uuid:83697dab-e718-4fd5-8595-b6563015585c)
2015-12-11_20:41:05.09391 2015-12-11 20:41:05 arvnodeman.computenode[11545] DEBUG: Node /subscriptions/a731f419-596b-4b64-a278-364e76506b06/resourceGroups/c97qk/providers/Microsoft.Compute/virtualMachines/compute-tj4hwdsw3yjiyjt-c97qk suggesting shutdown.
2015-12-11_20:41:05.09584 2015-12-11 20:41:05 arvnodeman.cloud_nodes[11545] DEBUG: <pykka.proxy._CallableProxy object at 0x7fd3f81b0850> subscribed to events for '/subscriptions/a731f419-596b-4b64-a278-364e76506b06/resourceGroups/c97qk/providers/Microsoft.Compute/virtualMachines/compute-tj4hwdsw3yjiyjt-c97qk'
2015-12-11_20:41:05.09804 2015-12-11 20:41:05 arvnodeman.daemon[11545] INFO: Cloud node /subscriptions/a731f419-596b-4b64-a278-364e76506b06/resourceGroups/c97qk/providers/Microsoft.Compute/virtualMachines/compute-tj4hwdsw3yjiyjt-c97qk has associated with Arvados node c97qk-7ekkf-tj4hwdsw3yjiyjt
2015-12-11_20:41:05.09921 2015-12-11 20:41:05 arvnodeman.computenode[11545] DEBUG: Node /subscriptions/a731f419-596b-4b64-a278-364e76506b06/resourceGroups/c97qk/providers/Microsoft.Compute/virtualMachines/compute-tj4hwdsw3yjiyjt-c97qk shutdown window open but node busy.
2015-12-11_20:41:05.10064 2015-12-11 20:41:05 arvnodeman.arvados_nodes[11545] DEBUG: <pykka.proxy._CallableProxy object at 0x7fd3f8e11250> subscribed to events for 'c97qk-7ekkf-tj4hwdsw3yjiyjt'
</pre>
<pre>
$ arv node get -u c97qk-7ekkf-tj4hwdsw3yjiyjt
{
  "href":"/nodes/c97qk-7ekkf-tj4hwdsw3yjiyjt",
  "kind":"arvados#node",
  "etag":"984qlz3msed6utdnndclhuz0o",
  "uuid":"c97qk-7ekkf-tj4hwdsw3yjiyjt",
  "owner_uuid":"c97qk-tpzed-000000000000000",
  "created_at":"2015-09-09T14:26:19.832861000Z",
  "modified_by_client_uuid":null,
  "modified_by_user_uuid":"c97qk-tpzed-000000000000000",
  "modified_at":"2015-12-11T20:58:01.734010000Z",
  "hostname":"compute0",
  "domain":"c97qk.arvadosapi.com",
  "ip_address":"10.25.64.10",
  "last_ping_at":"2015-12-11T20:58:01.734010000Z",
  "slot_number":0,
  "status":"running",
  "job_uuid":null,
  "crunch_worker_state":"down",
  "properties":{
    "cloud_node":{
      "price":0,
      "size":"Standard_D1"
    },
    "total_cpu_cores":1,
    "total_ram_mb":3442,
    "total_scratch_mb":51172
  },
  "first_ping_at":"2015-12-08T02:17:01.949316000Z",
  "info":{
    "ec2_instance_id":"/subscriptions/a731f419-596b-4b64-a278-364e76506b06/resourceGroups/c97qk/providers/Microsoft.Compute/virtualMachines/compute-tj4hwdsw3yjiyjt-c97qk",
    "last_action":"Prepared by Node Manager",
    "ping_secret":"35vaizroj3kkoqzm2vad92t6fewg7hbdix8jgj0wpklh3rdo4v",
    "slurm_state":"down"
  },
  "nameservers":[
    "10.25.0.6"
  ]
}
</pre>
<pre>
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
compute*     up   infinite      2 drain* compute[2-3]
compute*     up   infinite    252  down* compute[1,4-14,16-255]
compute*     up   infinite      1   idle compute15
compute*     up   infinite      1   down compute0
</pre>
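To make the suspected behavior concrete, here is a minimal sketch of a shutdown eligibility check. This is not the actual arvnodeman code: the function names and the @treat_down_as_idle@ flag are hypothetical, and the only assumption taken from the output above is that SLURM states can carry a trailing "*" (as in "down*" and "drain*"). With the flag off, only "idle" nodes are eligible, which matches what the log seems to show. With it on, "down" nodes would be eligible too.

```python
# Hypothetical sketch of the eligibility logic discussed in this ticket.
# None of these names come from arvnodeman; they are for illustration only.

IDLE_STATES = {"idle"}
DOWN_STATES = {"down"}


def normalize_state(slurm_state):
    """Lowercase the state and drop the trailing '*' seen in sinfo output."""
    return slurm_state.rstrip("*").lower()


def shutdown_eligible(slurm_state, window_open, treat_down_as_idle=False):
    """Return True if a node in this SLURM state may be shut down.

    window_open: whether the node's shutdown window is currently open.
    treat_down_as_idle: hypothetical switch for the behavior proposed here,
    where "down" nodes are shut down just like "idle" ones.
    """
    if not window_open:
        return False
    state = normalize_state(slurm_state)
    if state in IDLE_STATES:
        return True
    # Any other state (alloc, drain, ...) is still treated as busy.
    return treat_down_as_idle and state in DOWN_STATES


if __name__ == "__main__":
    # compute0 from the sinfo output above: state "down", window open.
    print(shutdown_eligible("down", window_open=True))
    print(shutdown_eligible("down", window_open=True, treat_down_as_idle=True))
```

Whether "drain" nodes should also become eligible is a separate question. The sketch leaves them as busy.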