https://dev.arvados.org/https://dev.arvados.org/favicon.ico?15576888422016-04-08T18:09:22ZArvadosArvados - Bug #8913: [Nodemanager] On GCE: 'unicode' object has no attribute 'id', where we should have a NodeSizehttps://dev.arvados.org/issues/8913?journal_id=376122016-04-08T18:09:22ZBrett Smithbrett.smith@curii.com
<ul></ul><p>The last traceback you pasted, the one you based the subject on, is #6225.</p>
<p>The ActorDeadError above that is more interesting; that is almost always going to be a problem. It would be good to see more logs from before it.</p> Arvados - Bug #8913: [Nodemanager] On GCE: 'unicode' object has no attribute 'id', where we should have a NodeSizehttps://dev.arvados.org/issues/8913?journal_id=376132016-04-08T18:22:14ZNico César
<ul><li><strong>File</strong> <a href="/attachments/1122">@400000005707f6132670d7dc.s</a> <a class="icon-only icon-download" title="Download" href="/attachments/download/1122/@400000005707f6132670d7dc.s">@400000005707f6132670d7dc.s</a> added</li><li><strong>Project</strong> changed from <i>Arvados</i> to <i>35</i></li><li><strong>Subject</strong> changed from <i>[Nodemanager] GCE returns "Supplied fingerprint does not match current metadata fingerprint"</i> to <i>[Nodemanager] GCE returns "ActorDead"</i></li></ul> Arvados - Bug #8913: [Nodemanager] On GCE: 'unicode' object has no attribute 'id', where we should have a NodeSizehttps://dev.arvados.org/issues/8913?journal_id=376142016-04-08T18:25:13ZNico César
<ul><li><strong>Description</strong> updated (<a title="View differences" href="/journals/37614/diff?detail_id=36715">diff</a>)</li></ul><p>Yes... I guess the fingerprint is irrelevant. Probably we should not transform that traceback into a WARNING or something.</p>
<p>I added a log that has the ActorDead error. Moved this to Arvados private just because it has a log.</p> Arvados - Bug #8913: [Nodemanager] On GCE: 'unicode' object has no attribute 'id', where we should have a NodeSizehttps://dev.arvados.org/issues/8913?journal_id=376432016-04-08T21:33:12ZBrett Smithbrett.smith@curii.com
<ul><li><strong>Project</strong> changed from <i>35</i> to <i>Arvados</i></li><li><strong>Subject</strong> changed from <i>[Nodemanager] GCE returns "ActorDead"</i> to <i>[Nodemanager] 'unicode' object has no attribute 'id'</i></li></ul><p>The original error was aaallllll the way back here:</p>
<pre>2016-04-06_16:52:29.77830 2016-04-06 16:52:29 NodeManagerDaemonActor.8e64f57ac168[29660] ERROR: while calculating nodes wanted for size &lt;arvnodeman.jobqueue.CloudSizeWrapper object at 0x261ce90&gt;
2016-04-06_16:52:29.77831 Traceback (most recent call last):
2016-04-06_16:52:29.77831 File "/usr/local/lib/python2.7/dist-packages/arvnodeman/daemon.py", line 326, in update_server_wishlist
2016-04-06_16:52:29.77831 nodes_wanted = self._nodes_wanted(size)
2016-04-06_16:52:29.77831 File "/usr/local/lib/python2.7/dist-packages/arvnodeman/daemon.py", line 285, in _nodes_wanted
2016-04-06_16:52:29.77832 total_price = self._total_price()
2016-04-06_16:52:29.77833 File "/usr/local/lib/python2.7/dist-packages/arvnodeman/daemon.py", line 250, in _total_price
2016-04-06_16:52:29.77834 for i in (self.booted, self.cloud_nodes.nodes)
2016-04-06_16:52:29.77834 File "/usr/local/lib/python2.7/dist-packages/arvnodeman/daemon.py", line 251, in &lt;genexpr&gt;
2016-04-06_16:52:29.77834 for c in i.itervalues())
2016-04-06_16:52:29.77835 AttributeError: 'unicode' object has no attribute 'id'
</pre>
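<p>The failure mode in the traceback above can be reproduced in miniature. This is only a sketch: <code>NodeSize</code>, <code>CloudNode</code>, <code>total_price</code> and the price table are hypothetical stand-ins, and the summed expression is an assumption, since the traceback shows only the two <code>for</code> clauses of the generator. It uses <code>values()</code> where the Python 2.7 code uses <code>itervalues()</code>.</p>

```python
from collections import namedtuple

# Hypothetical stand-ins for the cloud driver and Node Manager objects.
NodeSize = namedtuple("NodeSize", ["id", "price"])
CloudNode = namedtuple("CloudNode", ["name", "size"])


def total_price(booted, cloud_nodes, price_by_size_id):
    """Sum node prices with the same nested-generator shape as the traceback.

    What is summed here is an assumption; only the iteration pattern
    comes from the log.
    """
    return sum(price_by_size_id[c.size.id]
               for i in (booted, cloud_nodes)
               for c in i.values())


small = NodeSize(id="small", price=0.05)
prices = {small.id: small.price}

# Every node carries a NodeSize object: the sum works.
healthy_total = total_price({"a": CloudNode("compute0", small)},
                            {"b": CloudNode("compute1", small)},
                            prices)

# One node carries only the size name as a string: attribute access fails.
failure = None
try:
    total_price({"a": CloudNode("compute0", small)},
                {"b": CloudNode("compute1", u"small")},
                prices)
except AttributeError as error:
    failure = str(error)
```

<p>A single node with a string in <code>size</code> is enough to make the whole sum raise, which is consistent with one bad record taking down the calculation for every size.</p>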
<p>From this point on, the daemon actor was dead. The traceback in the description only happened after someone tried to stop the process, and the stopping process failed because the daemon was already dead--the exception came from the signal handler.</p> Arvados - Bug #8913: [Nodemanager] On GCE: 'unicode' object has no attribute 'id', where we should have a NodeSizehttps://dev.arvados.org/issues/8913?journal_id=376452016-04-08T21:37:02ZBrett Smithbrett.smith@curii.com
<ul><li><strong>Subject</strong> changed from <i>[Nodemanager] 'unicode' object has no attribute 'id'</i> to <i>[Nodemanager] On GCE: 'unicode' object has no attribute 'id', where we should have a NodeSize</i></li></ul> Arvados - Bug #8913: [Nodemanager] On GCE: 'unicode' object has no attribute 'id', where we should have a NodeSizehttps://dev.arvados.org/issues/8913?journal_id=376482016-04-08T22:14:54ZBrett Smithbrett.smith@curii.com
<ul></ul><p>This is a bug introduced by <a class="issue tracker-1 status-3 priority-4 priority-default closed parent" title="Bug: [Node Manager] On node creation, node search fails, and unhandled exceptions cascade up to the Da... (Resolved)" href="https://dev.arvados.org/issues/8872">#8872</a>. The node returned by search_for doesn't have its size attribute fixed.</p> Arvados - Bug #8913: [Nodemanager] On GCE: 'unicode' object has no attribute 'id', where we should have a NodeSizehttps://dev.arvados.org/issues/8913?journal_id=376492016-04-08T22:15:20ZBrett Smithbrett.smith@curii.com
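<p>As a rough illustration of the root cause described above (the node returned by search_for keeps an unfixed size attribute), the sketch below normalizes the size on every lookup path. All names here are hypothetical, including <code>fix_size</code> and the <code>sizes_by_id</code> table, and keying sizes by their string identifier is an assumption; this is not the code from the actual changeset.</p>

```python
from collections import namedtuple

# Hypothetical stand-in for the driver's size object.
NodeSize = namedtuple("NodeSize", ["id", "name"])


class Node(object):
    """Hypothetical node record whose size may arrive as a bare string."""

    def __init__(self, name, size):
        self.name = name
        self.size = size


def fix_size(node, sizes_by_id):
    """Replace a string size with the matching NodeSize object, in place."""
    if not hasattr(node.size, "id"):
        node.size = sizes_by_id[node.size]
    return node


def search_for(name, nodes, sizes_by_id):
    """Return the first node with this name, with its size normalized."""
    for node in nodes:
        if node.name == name:
            return fix_size(node, sizes_by_id)
    return None


sizes_by_id = {"small": NodeSize(id="small", name="Small instance")}
nodes = [Node("compute0", u"small"),
         Node("compute1", sizes_by_id["small"])]

found = search_for("compute0", nodes, sizes_by_id)
already_fixed = search_for("compute1", nodes, sizes_by_id)
missing = search_for("compute9", nodes, sizes_by_id)
```

<p>The point of routing the search result through the same normalization as the listing path is that callers never have to care which path produced the node.</p>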
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>In Progress</i></li><li><strong>Assigned To</strong> set to <i>Brett Smith</i></li><li><strong>Target version</strong> set to <i>2016-04-13 sprint</i></li></ul> Arvados - Bug #8913: [Nodemanager] On GCE: 'unicode' object has no attribute 'id', where we should have a NodeSizehttps://dev.arvados.org/issues/8913?journal_id=377062016-04-12T13:24:16ZPeter Amstutzpeter.amstutz@curii.com
<ul></ul><p>Brett Smith wrote:</p>
<blockquote>
<p>The original error was aaallllll the way back here:</p>
<p>[...]</p>
<p>From this point on, the daemon actor was dead. The traceback in the description only happened after someone tried to stop the process, and the stopping process failed because the daemon was already dead--the exception came from the signal handler.</p>
</blockquote>
<p>Related to this, perhaps on_failure() should kill self on all unhandled exceptions and not just certain ones? Currently the policy is to handle recoverable exceptions before they get to on_failure(), so once an exception reaches on_failure() it means an actor is going to die unexpectedly, which generally leaves Node Manager wedged. (Filed a separate report, <a class="issue tracker-1 status-5 priority-4 priority-default closed" title="Bug: [Node manager] Always crash on_failure() (Closed)" href="https://dev.arvados.org/issues/8932">#8932</a>.)</p> Arvados - Bug #8913: [Nodemanager] On GCE: 'unicode' object has no attribute 'id', where we should have a NodeSizehttps://dev.arvados.org/issues/8913?journal_id=377072016-04-12T13:25:51ZPeter Amstutzpeter.amstutz@curii.com
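<p>The on_failure() suggestion in the comment above could look roughly like the sketch below. It assumes "kill self" means ending the whole process, and it uses a hypothetical stand-in class rather than the real actor base class. The hook signature and the injected <code>kill_func</code> are assumptions made so the sketch can run without terminating anything.</p>

```python
import logging
import sys


class CrashOnFailureActor(object):
    """Hypothetical stand-in for an actor base class; not Node Manager code."""

    def __init__(self, kill_func):
        # kill_func is injected for illustration. A real deployment might
        # pass something that ends the whole process, for example
        # lambda: os._exit(1), so a supervisor can restart the service.
        self._kill = kill_func
        self._logger = logging.getLogger(type(self).__name__)

    def on_failure(self, exception_type, exception_value, tb):
        """Treat every unhandled exception as fatal instead of filtering."""
        self._logger.critical(
            "unhandled exception in actor, shutting down",
            exc_info=(exception_type, exception_value, tb))
        self._kill()


killed = []
actor = CrashOnFailureActor(kill_func=lambda: killed.append(True))

try:
    raise AttributeError("'unicode' object has no attribute 'id'")
except AttributeError:
    # Simulates the framework handing an unhandled exception to the hook.
    actor.on_failure(*sys.exc_info())
```

<p>The trade-off is that a crash is loud and restartable, whereas a single dead actor leaves the rest of the process running but unable to make progress.</p>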
<ul></ul><p>The fix in 8912-node-manager-patch-nodes-wip <a class="changeset" title="8912: Node Manager search_for_now uses overridden methods. This wasn't possible in the original ..." href="https://dev.arvados.org/projects/arvados/repository/arvados/revisions/8db9ad8e3ef0221bdf430c3bf3f0a527c7bc3a55">8db9ad8</a> LGTM.</p> Arvados - Bug #8913: [Nodemanager] On GCE: 'unicode' object has no attribute 'id', where we should have a NodeSizehttps://dev.arvados.org/issues/8913?journal_id=377212016-04-12T14:55:06ZBrett Smithbrett.smith@curii.com
<ul><li><strong>Status</strong> changed from <i>In Progress</i> to <i>Resolved</i></li><li><strong>% Done</strong> changed from <i>0</i> to <i>100</i></li></ul><p>Applied in changeset arvados|commit:788b8d7247da8c4592b1f9d482fff4e1509f57f3.</p>