Arvados - Bug #20378: crunch-run maximum downtime tolerance

https://dev.arvados.org/issues/20378?journal_id=114283
2023-04-18T17:39:40Z, Peter Amstutz (peter.amstutz@curii.com)
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>In Progress</i></li></ul>

https://dev.arvados.org/issues/20378?journal_id=114284
2023-04-18T17:40:57Z, Peter Amstutz (peter.amstutz@curii.com)
<ul><li><strong>Category</strong> set to <i>Crunch</i></li><li><strong>Subject</strong> changed from <i>crunch-</i> to <i>crunch-run maximum downtime tolerance</i></li></ul>

https://dev.arvados.org/issues/20378?journal_id=114285
2023-04-18T17:48:51Z, Peter Amstutz (peter.amstutz@curii.com)
<ul><li><strong>Description</strong> updated (<a title="View differences" href="/journals/114285/diff?detail_id=111104">diff</a>)</li></ul>

https://dev.arvados.org/issues/20378?journal_id=114290
2023-04-18T21:20:13Z, Peter Amstutz (peter.amstutz@curii.com)
<ul><li><strong>Status</strong> changed from <i>In Progress</i> to <i>New</i></li></ul>

https://dev.arvados.org/issues/20378?journal_id=114296
2023-04-19T13:35:09Z, Brett Smith (brett.smith@curii.com)
<p>I think downtime tolerance should err on the long side, and perhaps be a function of how long the job has run, maybe with both a minimum and a maximum cap. Having a compute node sit idle waiting for the API server to come back is expensive and annoying, but it's not nearly as annoying as losing a week's worth of compute because the API server was unreachable for a few hours at the end of a job.</p>
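<p>To make the suggestion concrete, here is a minimal sketch of a tolerance that scales with job runtime and is clamped between a minimum and a maximum. It is only an illustration: the function name <code>downtime_tolerance</code>, the 10% scaling factor, and both cap values are assumptions made for this example, not crunch-run behavior or existing Arvados configuration.</p>

```python
from datetime import timedelta

# Hypothetical illustration values; none of these are real Arvados settings.
MIN_TOLERANCE = timedelta(minutes=5)   # never give up sooner than this
MAX_TOLERANCE = timedelta(hours=12)    # never hold a node longer than this
RUNTIME_FRACTION = 0.1                 # tolerate downtime up to 10% of runtime


def downtime_tolerance(runtime: timedelta) -> timedelta:
    """Return how long to wait for an unreachable API server.

    The tolerance grows with how long the job has already run, then is
    clamped to the range [MIN_TOLERANCE, MAX_TOLERANCE].
    """
    scaled = runtime * RUNTIME_FRACTION
    return max(MIN_TOLERANCE, min(scaled, MAX_TOLERANCE))


if __name__ == "__main__":
    for runtime in (timedelta(minutes=10), timedelta(hours=10), timedelta(weeks=1)):
        print(runtime, "->", downtime_tolerance(runtime))
```

<p>With these assumed numbers, a job that has run for a week would wait up to the 12-hour maximum before giving up, while a ten-minute job would still get the 5-minute minimum. The right factor and caps are a policy question for the ticket, not something this sketch settles.</p>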