Bug #8913
Updated by Nico César about 8 years ago
This happened on qr2hi. (I don't know whether these exceptions are the cause of the manager being wedged.) I restarted the service and the nodes were created.
<pre>
# grep Traceback arvados-node-manager/log/main/current -A28
2016-04-08_18:00:17.44134 Traceback (most recent call last):
2016-04-08_18:00:17.44134 File "/usr/local/lib/python2.7/dist-packages/arvnodeman/launcher.py", line 128, in main
2016-04-08_18:00:17.44135 signal.pause()
2016-04-08_18:00:17.44136 File "/usr/local/lib/python2.7/dist-packages/arvnodeman/launcher.py", line 90, in shutdown_signal
2016-04-08_18:00:17.44136 node_daemon.shutdown()
2016-04-08_18:00:17.44136 File "/usr/local/lib/python2.7/dist-packages/arvnodeman/baseactor.py", line 25, in __call__
2016-04-08_18:00:17.44137 self.actor_ref.tell(message)
2016-04-08_18:00:17.44137 File "/usr/local/lib/python2.7/dist-packages/pykka/actor.py", line 398, in tell
2016-04-08_18:00:17.44137 raise ActorDeadError('%s not found' % self)
2016-04-08_18:00:17.44137 ActorDeadError: NodeManagerDaemonActor (urn:uuid:e9844486-0662-4b73-bc46-8e64f57ac168) not found
2016-04-08_18:00:17.44211 2016-04-08 18:00:17 pykka[29660] DEBUG: Unregistered ComputeNodeMonitorActor (urn:uuid:1c85ed8e-3b54-43fb-80eb-9cd3a5a9738f)
2016-04-08_18:00:17.44212 2016-04-08 18:00:17 pykka[29660] DEBUG: Stopped ComputeNodeMonitorActor (urn:uuid:1c85ed8e-3b54-43fb-80eb-9cd3a5a9738f)
2016-04-08_18:00:17.44232 2016-04-08 18:00:17 pykka[29660] DEBUG: Unregistered ComputeNodeMonitorActor (urn:uuid:2bce315f-39a6-4daa-9027-acd3850e742e)
2016-04-08_18:00:17.44239 2016-04-08 18:00:17 pykka[29660] DEBUG: Stopped ComputeNodeMonitorActor (urn:uuid:2bce315f-39a6-4daa-9027-acd3850e742e)
2016-04-08_18:00:17.44307 2016-04-08 18:00:17 pykka[29660] DEBUG: Unregistered ComputeNodeMonitorActor (urn:uuid:6ea5fdd4-6cf8-4a35-bba5-d45bb64195c7)
2016-04-08_18:00:17.44308 2016-04-08 18:00:17 pykka[29660] DEBUG: Stopped ComputeNodeMonitorActor (urn:uuid:6ea5fdd4-6cf8-4a35-bba5-d45bb64195c7)
2016-04-08_18:00:17.44328 2016-04-08 18:00:17 pykka[29660] DEBUG: Unregistered ComputeNodeMonitorActor (urn:uuid:d9b7b106-1eae-4d5d-a86d-2aac9d334035)
2016-04-08_18:00:17.44333 2016-04-08 18:00:17 pykka[29660] DEBUG: Stopped ComputeNodeMonitorActor (urn:uuid:d9b7b106-1eae-4d5d-a86d-2aac9d334035)
2016-04-08_18:00:17.44562 2016-04-08 18:00:17 pykka[29660] DEBUG: Unregistered ComputeNodeMonitorActor (urn:uuid:1b2032d8-1698-4d07-90a2-92fc166301cd)
2016-04-08_18:00:17.44563 2016-04-08 18:00:17 pykka[29660] DEBUG: Stopped ComputeNodeMonitorActor (urn:uuid:1b2032d8-1698-4d07-90a2-92fc166301cd)
2016-04-08_18:00:17.44594 2016-04-08 18:00:17 pykka[29660] DEBUG: Unregistered ComputeNodeMonitorActor (urn:uuid:a3f8ae7a-fd6d-4366-a5c0-9dfc99aa8672)
2016-04-08_18:00:17.44602 2016-04-08 18:00:17 pykka[29660] DEBUG: Stopped ComputeNodeMonitorActor (urn:uuid:a3f8ae7a-fd6d-4366-a5c0-9dfc99aa8672)
2016-04-08_18:00:17.44660 2016-04-08 18:00:17 pykka[29660] DEBUG: Unregistered ComputeNodeMonitorActor (urn:uuid:aaee2dfd-366a-4025-8799-70f82053ea68)
2016-04-08_18:00:17.44662 2016-04-08 18:00:17 pykka[29660] DEBUG: Stopped ComputeNodeMonitorActor (urn:uuid:aaee2dfd-366a-4025-8799-70f82053ea68)
2016-04-08_18:00:17.44685 2016-04-08 18:00:17 pykka[29660] DEBUG: Unregistered ComputeNodeMonitorActor (urn:uuid:ae7b61a0-a9b3-424c-98fe-be7f02f9593c)
2016-04-08_18:00:17.44694 2016-04-08 18:00:17 pykka[29660] DEBUG: Stopped ComputeNodeMonitorActor (urn:uuid:ae7b61a0-a9b3-424c-98fe-be7f02f9593c)
2016-04-08_18:00:17.44744 2016-04-08 18:00:17 pykka[29660] DEBUG: Unregistered ComputeNodeMonitorActor (urn:uuid:670df52f-e04d-46c2-96f8-ec56cf02f833)
2016-04-08_18:00:17.44745 2016-04-08 18:00:17 pykka[29660] DEBUG: Stopped ComputeNodeMonitorActor (urn:uuid:670df52f-e04d-46c2-96f8-ec56cf02f833)
2016-04-08_18:00:17.44786 2016-04-08 18:00:17 pykka[29660] DEBUG: Unregistered ComputeNodeMonitorActor (urn:uuid:8ae760f3-6394-4838-bfa6-079f2ad8643a)
--
2016-04-08_18:01:48.93823 Traceback (most recent call last):
2016-04-08_18:01:48.93823 File "/usr/local/lib/python2.7/dist-packages/arvnodeman/computenode/dispatch/__init__.py", line 281, in throttle_wrapper
2016-04-08_18:01:48.93824 result = orig_func(self, *args, **kwargs)
2016-04-08_18:01:48.93824 File "/usr/local/lib/python2.7/dist-packages/arvnodeman/computenode/dispatch/__init__.py", line 296, in sync_node
2016-04-08_18:01:48.93824 return self._cloud.sync_node(cloud_node, arvados_node)
2016-04-08_18:01:48.93825 File "/usr/local/lib/python2.7/dist-packages/arvnodeman/computenode/driver/gce.py", line 149, in sync_node
2016-04-08_18:01:48.93825 method='POST', data=metadata_req)
2016-04-08_18:01:48.93825 File "/usr/local/lib/python2.7/dist-packages/libcloud/common/base.py", line 937, in async_request
2016-04-08_18:01:48.93826 response = request(**kwargs)
2016-04-08_18:01:48.93826 File "/usr/local/lib/python2.7/dist-packages/libcloud/compute/drivers/gce.py", line 120, in request
2016-04-08_18:01:48.93826 response = super(GCEConnection, self).request(*args, **kwargs)
2016-04-08_18:01:48.93827 File "/usr/local/lib/python2.7/dist-packages/libcloud/common/google.py", line 692, in request
2016-04-08_18:01:48.93827 *args, **kwargs)
2016-04-08_18:01:48.93828 File "/usr/local/lib/python2.7/dist-packages/libcloud/common/base.py", line 799, in request
2016-04-08_18:01:48.93828 response = responseCls(**kwargs)
2016-04-08_18:01:48.93828 File "/usr/local/lib/python2.7/dist-packages/libcloud/common/base.py", line 145, in __init__
2016-04-08_18:01:48.93829 self.object = self.parse_body()
2016-04-08_18:01:48.93829 File "/usr/local/lib/python2.7/dist-packages/libcloud/common/google.py", line 253, in parse_body
2016-04-08_18:01:48.93829 raise GoogleBaseError(message, self.status, code)
2016-04-08_18:01:48.93830 GoogleBaseError: u'Supplied fingerprint does not match current metadata fingerprint.'
2016-04-08_18:01:49.63636 2016-04-08 18:01:49 JobQueueMonitorActor.38234512[17035] DEBUG: sending request
2016-04-08_18:01:49.64174 2016-04-08 18:01:49 CloudNodeListMonitorActor.30580448[17035] DEBUG: sending request
2016-04-08_18:01:49.64681 2016-04-08 18:01:49 ArvadosNodeListMonitorActor.35361056[17035] DEBUG: sending request
2016-04-08_18:01:49.75908 2016-04-08 18:01:49 JobQueueMonitorActor.38234512[17035] DEBUG: Calculated wishlist: n1-standard-8, n1-standard-8, n1-standard-8, n1-standard-8, n1-standard-8, n1-standard-8, n1-standard-8, n1-standard-8, n1-standard-8, n1-standard-8
2016-04-08_18:01:49.75920 2016-04-08 18:01:49 JobQueueMonitorActor.38234512[17035] INFO: got response with 1 items in 0.124391078949 seconds, next poll at 2016-04-08 18:01:59
2016-04-08_18:01:49.75970 2016-04-08 18:01:49 NodeManagerDaemonActor.49c466c95e79[17035] INFO: n1-highmem-32: wishlist 0, up 0 (booting 0, idle 0, busy 0), shutting down 0
2016-04-08_18:01:49.76002 2016-04-08 18:01:49 NodeManagerDaemonActor.49c466c95e79[17035] INFO: n1-standard-32: wishlist 0, up 0 (booting 0, idle 0, busy 0), shutting down 0
2016-04-08_18:01:49.76020 2016-04-08 18:01:49 NodeManagerDaemonActor.49c466c95e79[17035] INFO: n1-highmem-16: wishlist 0, up 0 (booting 0, idle 0, busy 0), shutting down 0
2016-04-08_18:01:49.76037 2016-04-08 18:01:49 NodeManagerDaemonActor.49c466c95e79[17035] INFO: n1-standard-16: wishlist 0, up 0 (booting 0, idle 0, busy 0), shutting down 0
--
</pre>
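
The second traceback looks like an optimistic-concurrency failure: GCE rejects a metadata update when the supplied fingerprint no longer matches the current metadata, which usually means something else changed the metadata between the read and the write. Whether that is related to the wedged manager is not established here. The block below is a minimal sketch of the usual remedy (re-read the metadata and retry), run against an in-memory stand-in. @FakeMetadataStore@, @FingerprintMismatch@ and @set_metadata_with_retry@ are hypothetical names for illustration, not part of arvnodeman or libcloud.

```python
import hashlib
import json


class FingerprintMismatch(Exception):
    """Stand-in for the GoogleBaseError seen in the traceback above."""


class FakeMetadataStore(object):
    """In-memory stand-in for a node's metadata (hypothetical, not the libcloud API)."""

    def __init__(self, items):
        self._items = dict(items)

    def _fingerprint(self):
        # Any change to the items yields a different fingerprint.
        blob = json.dumps(self._items, sort_keys=True).encode("utf-8")
        return hashlib.sha1(blob).hexdigest()

    def get(self):
        """Return a copy of the current items and their fingerprint."""
        return dict(self._items), self._fingerprint()

    def set(self, items, fingerprint):
        """Replace the items only if the caller read the latest version."""
        if fingerprint != self._fingerprint():
            raise FingerprintMismatch(
                "Supplied fingerprint does not match current metadata fingerprint.")
        self._items = dict(items)


def set_metadata_with_retry(store, updates, attempts=3, before_write=None):
    """Apply updates, re-reading the metadata whenever the fingerprint is stale.

    before_write is a test hook called between the read and the write.
    Returns the number of attempts that were needed.
    """
    for attempt in range(attempts):
        items, fingerprint = store.get()
        items.update(updates)
        if before_write is not None:
            before_write(attempt)
        try:
            store.set(items, fingerprint)
            return attempt + 1
        except FingerprintMismatch:
            if attempt == attempts - 1:
                raise  # give up and surface the error to the caller


# Demo: another writer changes the metadata during the first attempt only.
store = FakeMetadataStore({"hostname": "compute0"})


def concurrent_writer(attempt):
    if attempt == 0:
        current, fp = store.get()
        current["other"] = "changed"
        store.set(current, fp)


tries = set_metadata_with_retry(
    store, {"arv-ping-url": "https://example.invalid/ping"},
    before_write=concurrent_writer)
final_items, _ = store.get()
```

In the demo the first write fails on the stale fingerprint and the second succeeds after a fresh read, so the concurrent writer's change is preserved rather than overwritten.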