Bug #16132

closed

ce8i5 test timeouts

Added by Peter Amstutz about 4 years ago. Updated about 4 years ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: -
Target version:
Story points: -

Description

The run-cwl-test-ce8i5 job runs the CWL test suite against the test cluster ce8i5.

Tests have been failing with timeouts. Each test has a 900s (15m) limit, so a timeout means that bringing up a node or scheduling a container is taking longer than that.

It is unclear whether the cause is a delay in Azure, a delay in arvados-dispatch-cloud, some interaction between the two (one waiting for something it shouldn't), stalled ssh connections, or something else.

One way to begin diagnosing this would be to start collecting Prometheus metrics and see whether we can capture the delay on a graph.
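
As a rough sketch of what that could look like once metrics are being scraped: the snippet below builds a request URL for Prometheus' `/api/v1/query_range` HTTP endpoint and scans a range-query response for samples above the 900s test limit. The server address and the metric name `hypothetical_container_wait_seconds` are placeholders for illustration, not names confirmed to be exported by arvados-dispatch-cloud; substitute whatever the cluster actually exposes.

```python
import json
from urllib.parse import urlencode

TEST_TIMEOUT_S = 900  # the 15-minute per-test limit from the description


def build_range_query_url(base_url, promql, start, end, step_s=30):
    """Build a URL for Prometheus' /api/v1/query_range endpoint.

    start and end are Unix timestamps; step_s is the resolution in seconds.
    """
    params = urlencode(
        {"query": promql, "start": start, "end": end, "step": step_s}
    )
    return f"{base_url.rstrip('/')}/api/v1/query_range?{params}"


def samples_over_limit(response_text, limit_s=TEST_TIMEOUT_S):
    """Return (labels, timestamp, value) for every sample above limit_s.

    response_text is the JSON body of a range query (resultType "matrix").
    """
    body = json.loads(response_text)
    if body.get("status") != "success":
        raise ValueError(f"query failed: {body.get('error', 'unknown error')}")
    hits = []
    for series in body["data"]["result"]:
        for ts, raw in series["values"]:
            value = float(raw)  # Prometheus encodes sample values as strings
            if value > limit_s:
                hits.append((series["metric"], ts, value))
    return hits


# Placeholder server and metric name (assumptions, see the note above).
url = build_range_query_url(
    "http://prometheus.example:9090/",
    "hypothetical_container_wait_seconds",
    start=0,
    end=60,
)
```

Fetching `url` and passing the response body to `samples_over_limit` would list the points in time where the measured wait exceeded the test limit, which narrows down where to look on the graph.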
