Feature #5064

[Crunch] Automatically restart jobs after internal/intermittent errors

Added by Bryan Cosca over 4 years ago. Updated over 4 years ago.

Status:
New
Priority:
Normal
Assigned To:
-
Category:
Crunch
Target version:
Start date:
01/22/2015
Due date:
% Done:

0%

Estimated time:
Story points:
-

Description

In a customer meeting, the customer asked: if a pipeline crashes halfway through because of a node failure or some other internal failure unrelated to their code, will they have to wait until the next morning to rerun the pipeline, or will it try to restart from where it failed?

Automatic restarts would save informaticians a lot of time and disappointment. They often start a job before going to sleep; if it fails overnight, they do not want to lose a whole day restarting it, and would prefer that Arvados handle the restart itself.

History

#1 Updated by Brett Smith over 4 years ago

  • Subject changed from "Automatic restart on jobs if the error was internal" to "[Crunch] Automatically restart jobs after internal/intermittent errors"
  • Category set to Crunch
  • Target version set to Arvados Future Sprints
