Arvados - Bug #11170: Stale squeue processes on c97qk caused by "crunch-dispatch --jobs" https://dev.arvados.org/issues/11170?journal_id=48728 2017-02-26T19:06:24Z Ward Vandewege ward@curii.com
<ul><li><strong>Description</strong> updated (<a title="View differences" href="/journals/48728/diff?detail_id=46887">diff</a>)</li></ul>
Arvados - Bug #11170: Stale squeue processes on c97qk caused by "crunch-dispatch --jobs" https://dev.arvados.org/issues/11170?journal_id=48731 2017-02-27T15:30:40Z Ward Vandewege ward@curii.com
<ul><li><strong>Subject</strong> changed from <i>stale squeue processes on c97qk</i> to <i>stale squeue processes on c97qk caused by crunch-dispatch --jobs</i></li></ul>
Arvados - Bug #11170: Stale squeue processes on c97qk caused by "crunch-dispatch --jobs" https://dev.arvados.org/issues/11170?journal_id=49467 2017-03-15T00:02:32Z Tom Morris tfmorris@veritasgenetics.com
<ul><li><strong>Project</strong> changed from <i>40</i> to <i>Arvados</i></li><li><strong>Subject</strong> changed from <i>stale squeue processes on c97qk caused by crunch-dispatch --jobs</i> to <i>Stale squeue processes on c97qk caused by "crunch-dispatch --jobs"</i></li><li><strong>Description</strong> updated (<a title="View differences" href="/journals/49467/diff?detail_id=47673">diff</a>)</li><li><strong>Target version</strong> set to <i>2017-03-29 sprint</i></li></ul>
Arvados - Bug #11170: Stale squeue processes on c97qk caused by "crunch-dispatch --jobs" https://dev.arvados.org/issues/11170?journal_id=49596 2017-03-15T19:52:52Z Lucas Di Pentima lucas.dipentima@curii.com
<ul><li><strong>Assigned To</strong> set to <i>Lucas Di Pentima</i></li></ul>
Arvados - Bug #11170: Stale squeue processes on c97qk caused by "crunch-dispatch --jobs" https://dev.arvados.org/issues/11170?journal_id=49746 2017-03-17T20:59:08Z Lucas Di Pentima lucas.dipentima@curii.com
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>In Progress</i></li></ul>
Arvados - Bug #11170: Stale squeue processes on c97qk caused by "crunch-dispatch --jobs" https://dev.arvados.org/issues/11170?journal_id=49778 2017-03-20T19:02:21Z Lucas Di Pentima lucas.dipentima@curii.com
<ul></ul><p>Updated at branch <code>11170-stale-squeue-procs</code> - <a class="changeset" title="11170: Set up a thread to reap the status of squeue runs so that they don't become zombie processes." href="https://dev.arvados.org/projects/arvados/repository/arvados/revisions/f31475dfeb37c0e4d6b5244cba3bbd06e323b8e8">f31475d</a><br />Test run: <a class="external" href="https://ci.curoverse.com/job/developer-run-tests/195/">https://ci.curoverse.com/job/developer-run-tests/195/</a></p>
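<p>A minimal sketch of the reaping approach named in the changeset title above, assuming a child started with <code>IO.popen</code>. It is not the actual crunch-dispatch code: the helper name <code>run_and_detach</code> is hypothetical, and a small Ruby one-liner stands in for <code>squeue</code> so the sketch runs without Slurm.</p>

```ruby
require 'rbconfig'

# Hypothetical helper (not from the Arvados code base): start a child
# process, hand its pid to Process.detach, and read its output.
def run_and_detach(cmd)
  io = IO.popen(cmd)
  # Process.detach starts a thread that waits on the child, so the exit
  # status is collected and the child does not linger as a zombie.
  reaper = Process.detach(io.pid)
  output = io.read
  [output, reaper]
end

# Stand-in for the squeue call; prints two fake job names.
fake_squeue = [RbConfig.ruby, '-e', 'puts "job1"; puts "job2"']

output, reaper = run_and_detach(fake_squeue)
status = reaper.value   # Process::Status of the finished child
puts output
puts status.success?
```

<p>Note that detaching only collects the exit status. The pipe itself stays open until it is closed or garbage-collected, which is the concern raised later in the thread.</p>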
<p>Used <code>Process::detach</code> on both <code>File.popen(...)</code> cases so that the process status gets collected by a separate thread on completion.<br />Ref: <a class="external" href="https://ruby-doc.org/core-2.1.1/Process.html#method-c-detach">https://ruby-doc.org/core-2.1.1/Process.html#method-c-detach</a></p>
Arvados - Bug #11170: Stale squeue processes on c97qk caused by "crunch-dispatch --jobs" https://dev.arvados.org/issues/11170?journal_id=49783 2017-03-20T19:32:38Z Peter Amstutz peter.amstutz@curii.com
<ul></ul><p><code>squeue_jobs</code> and <code>scancel</code> should use the block form of <code>IO.popen()</code> so that the pipe is closed automatically. See <code>stdout_s</code>.</p>
Arvados - Bug #11170: Stale squeue processes on c97qk caused by "crunch-dispatch --jobs" https://dev.arvados.org/issues/11170?journal_id=49795 2017-03-20T23:41:57Z Lucas Di Pentima lucas.dipentima@curii.com
<ul></ul><p>Updates at <a class="changeset" title="11170: Treat the squeue/scancel calls as files instead of treating them as processes. Calling clo..." href="https://dev.arvados.org/projects/arvados/repository/arvados/revisions/79e53c0eed77396cb37f60b48be0c60fe7e0ab89">79e53c0</a><br />Test run: <a class="external" href="https://ci.curoverse.com/job/developer-run-tests/196/">https://ci.curoverse.com/job/developer-run-tests/196/</a></p>
Arvados - Bug #11170: Stale squeue processes on c97qk caused by "crunch-dispatch --jobs" https://dev.arvados.org/issues/11170?journal_id=49849 2017-03-22T13:51:54Z Lucas Di Pentima lucas.dipentima@curii.com
<ul></ul><p>New updates at <a class="changeset" title="11170: Updated tests to reflect the use of IO instead of File." href="https://dev.arvados.org/projects/arvados/repository/arvados/revisions/077878d94771c25c25edfe01a98a523898916d9e">077878d</a><br />Test run: <a class="external" href="https://ci.curoverse.com/job/developer-run-tests/197/">https://ci.curoverse.com/job/developer-run-tests/197/</a></p>
<p>I've updated the tests so they stub the IO class instead of File.</p>
Arvados - Bug #11170: Stale squeue processes on c97qk caused by "crunch-dispatch --jobs" https://dev.arvados.org/issues/11170?journal_id=49852 2017-03-22T14:09:56Z Peter Amstutz peter.amstutz@curii.com
<ul></ul><p>Can we get</p>
<pre>
p = IO.popen(['squeue', '-a', '-h', '-o', '%j'])
begin
  l = p.readlines.map {|line| line.strip}
ensure
  p.close
end
</pre>
Arvados - Bug #11170: Stale squeue processes on c97qk caused by "crunch-dispatch --jobs" https://dev.arvados.org/issues/11170?journal_id=49853 2017-03-22T14:21:00Z Lucas Di Pentima lucas.dipentima@curii.com
<ul></ul><p>Done: <a class="changeset" title="11170: Calling close method from an ensure block." href="https://dev.arvados.org/projects/arvados/repository/arvados/revisions/2741b54c38ed1e32cc9f0129614a00d84f51bca8">2741b54</a></p>
Arvados - Bug #11170: Stale squeue processes on c97qk caused by "crunch-dispatch --jobs" https://dev.arvados.org/issues/11170?journal_id=49883 2017-03-23T13:31:25Z Peter Amstutz peter.amstutz@curii.com
<ul></ul><p>Lucas Di Pentima wrote:</p>
<blockquote>
<p>Done: <a class="changeset" title="11170: Calling close method from an ensure block." href="https://dev.arvados.org/projects/arvados/repository/arvados/revisions/2741b54c38ed1e32cc9f0129614a00d84f51bca8">2741b54</a></p>
</blockquote>
<p>LGTM</p>
Arvados - Bug #11170: Stale squeue processes on c97qk caused by "crunch-dispatch --jobs" https://dev.arvados.org/issues/11170?journal_id=49884 2017-03-23T13:35:07Z Lucas Di Pentima lucas.dipentima@curii.com
<ul><li><strong>Status</strong> changed from <i>In Progress</i> to <i>Resolved</i></li><li><strong>% Done</strong> changed from <i>0</i> to <i>100</i></li></ul><p>Applied in changeset arvados|commit:83203f5c739ee0b0199e76babccb60e832a0de8e.</p>
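<p>For comparison, here is a rough sketch of the two ways of closing the pipe discussed in this thread: an explicit <code>close</code> in an <code>ensure</code> block, and the block form of <code>IO.popen</code>. The method names <code>job_names_with_ensure</code> and <code>job_names_with_block</code> are hypothetical, and a Ruby one-liner stands in for <code>squeue</code>. The real crunch-dispatch code is not reproduced here.</p>

```ruby
require 'rbconfig'

# Stand-in for ['squeue', '-a', '-h', '-o', '%j'] so the sketch runs
# without Slurm; it prints two fake job names with stray whitespace.
fake_squeue = [RbConfig.ruby, '-e', 'puts " job1 "; puts "job2"']

# Explicit form: close the pipe in an ensure block, so it is closed
# even if reading raises. Closing waits for the child and sets $?.
def job_names_with_ensure(cmd)
  pipe = IO.popen(cmd)
  begin
    pipe.readlines.map(&:strip)
  ensure
    pipe.close
  end
end

# Block form: IO.popen closes the pipe itself when the block finishes
# or raises, and the block's value becomes the return value.
def job_names_with_block(cmd)
  IO.popen(cmd) { |pipe| pipe.readlines.map(&:strip) }
end

names_ensure = job_names_with_ensure(fake_squeue)
ensure_status = $?
names_block = job_names_with_block(fake_squeue)
block_status = $?

p names_ensure
p names_block
```

<p>In both variants the child's exit status has been collected by the time the method returns, so no separate reaper thread is needed.</p>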