Bug #20432

closed

Improve CWL runner handling 503 errors

Added by Peter Amstutz over 1 year ago. Updated about 1 year ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
CWL
Story points:
-
Release relationship:
Auto

Description

  1. recoverable errors, such as a failed request for container states, are not real errors and should only be logged as warnings
  2. requesting /arvados/v1/config isn't retried
  3. requesting /discovery/v1/apis/arvados/v1/rest isn't retried
    1. "_thread._local object has no attribute 'api'": throwing and handling AttributeError is intentional, though getattr might be cleaner; in any event, it is the API object construction that is ultimately failing
  4. requesting /arvados/v1/containers/current
  5. requesting /arvados/v1/users/current
  6. everything should use 8-10 retries
  7. FUSE command.py also calls users.current without retries; there have been a few instances of FUSE failing to start due to 503 errors when fetching the discovery doc or other endpoints required for startup
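
The points above can be sketched roughly as follows. This is a minimal illustration, not the actual arvados-cwl-runner or SDK code: `ServiceUnavailable`, `call_with_retries`, `thread_api` and `build_client` are hypothetical names, and the delay values are assumptions. It shows a retry loop with exponential backoff that logs recoverable 503s as warnings, plus a `getattr`-based lookup of the per-thread API object so that construction failures surface from the constructor rather than as an AttributeError.

```python
import logging
import random
import threading
import time

logger = logging.getLogger("retry_sketch")


class ServiceUnavailable(Exception):
    """Hypothetical stand-in for an HTTP 503 response from the API server."""


def call_with_retries(request, num_retries=8, base_delay=0.5, max_delay=30.0,
                      sleep=time.sleep):
    """Call request(); on ServiceUnavailable, retry up to num_retries times.

    Each recoverable failure is logged as a warning, not an error. The
    exception propagates only after the final attempt fails.
    """
    for attempt in range(num_retries + 1):
        try:
            return request()
        except ServiceUnavailable as exc:
            if attempt == num_retries:
                raise  # out of retries: now it is a real error
            # Exponential backoff, capped, with jitter (values are assumptions).
            delay = min(max_delay, base_delay * 2 ** attempt)
            delay *= random.uniform(0.5, 1.0)
            logger.warning("503 on attempt %d of %d, retrying in %.1fs: %s",
                           attempt + 1, num_retries + 1, delay, exc)
            sleep(delay)


_local = threading.local()


def thread_api(build_client):
    """Return this thread's API object, building it (with retries) if missing.

    getattr with a default replaces the try/except AttributeError pattern.
    build_client is a hypothetical zero-argument factory.
    """
    api = getattr(_local, "api", None)
    if api is None:
        api = call_with_retries(build_client)
        _local.api = api
    return api
```

With this shape, startup requests such as fetching the config, the discovery document, or the current user or container would each be wrapped in the same helper, so they all share one retry count.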

Related issues

Related to Arvados - Bug #12684: Let user specify a retry strategy on the client object, used for all API calls (Resolved, Brett Smith, 05/09/2023)