[DRAFT] [Crunch] User can configure task retries
For some of our long-running jobs, we keep hitting new types of transient failures that are not classified as "temporary" failures, even though they would probably succeed if retried. We should add a job field indicating a "minimum number of retries", which will be honored even for "permanent" failures.
Alternatively, an even simpler solution would be to add a flag that causes all failures to be treated as "temporary" for the purposes of retry. Question: should this behavior be the default?
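
To make the two options easier to compare, here is a rough sketch of how the retry decision might look with both in place. Everything in it is hypothetical and for discussion only: `RetryPolicy`, `should_retry`, the field names, and the default cap of 3 are assumptions, not existing job fields or APIs.

```python
from dataclasses import dataclass


@dataclass
class RetryPolicy:
    # Hypothetical: existing cap on retries for "temporary" failures.
    max_retries: int = 3
    # Proposed field: retries honored even for "permanent" failures.
    min_retries: int = 0
    # Proposed alternative: treat every failure as "temporary".
    treat_all_as_temporary: bool = False


def should_retry(policy: RetryPolicy, retries_so_far: int, is_temporary: bool) -> bool:
    """Decide whether a failed task gets another attempt."""
    # Option 1: the minimum is honored regardless of failure type.
    if retries_so_far < policy.min_retries:
        return True
    # Option 2: the flag promotes any failure to "temporary".
    if is_temporary or policy.treat_all_as_temporary:
        return retries_so_far < policy.max_retries
    # Permanent failure with the minimum already used up.
    return False
```

In this sketch the minimum only guarantees a floor for permanent failures, while the flag sends every failure through the normal temporary-failure path up to the usual cap. If the flag became the default, that would correspond to `treat_all_as_temporary: bool = True` here.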