Feature #20604

open

crunch-run retry timeout should increase for long-running containers

Added by Tom Clegg 11 months ago. Updated 11 months ago.

Status: New
Priority: Normal
Assigned To: -
Category: Crunch
Target version:
Story points: -

Description

For example, if a container finishes successfully after running for 48 hours, and crunch-run encounters transient errors while updating the container state to Complete via controller, it should surely retry for longer than the default 8 minutes before giving up.


Related issues

Related to Arvados - Bug #20540: crunch-run should sleep-and-retry after transient failures on API calls, especially when container is succeeding (Resolved, Tom Clegg, 05/30/2023)
