Bug #13491

closed

arvbox deadlocks on parallel usage

Added by Michael Crusoe over 6 years ago. Updated over 6 years ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: -
Target version:
Story points: -
Release:
Release relationship: Auto

Description

Running the CWL conformance tests with arvbox takes an hour, which is far too long.
Peter Amstutz suggested running them in parallel mode, but that is producing deadlock errors.

To reproduce, check out the head of the primary branch and run the following commands:

docker pull arvados/arvbox-demo:latest
sdk/cwl/test_with_arvbox.sh --config localdemo --leave-running --junit-xml=/tmp/junit.xml -j4

Note the use of `-j4`, which leads to parallel calls to arvados-cwl-runner. One of the resulting test failures:

Test failed: /tmp/cwltest/arv-cwl-containers --compute-checksum --outdir=/tmp/tmpQ_esdz --quiet v1.0/stderr-shortcut.cwl v1.0/empty.json
Test command line with stderr redirection, brief syntax
Returned non-zero
2018-05-16 07:09:21 arvados.cwl-runner ERROR: [container stderr-shortcut.cwl] got error <HttpError 422 when requesting https://localhost:8000/arvados/v1/container_requests?alt=json returned "#<PG::TRDeadlockDetected: ERROR:  deadlock detected
DETAIL:  Process 12103 waits for ExclusiveLock on relation 16511 of database 16443; blocked by process 6629.
Process 6629 waits for ExclusiveLock on relation 16498 of database 16443; blocked by process 12103.
HINT:  See server log for query details.
>">
2018-05-16 07:09:21 arvados.cwl-runner WARNING: Overall process status is permanentFail
2018-05-16 07:09:21 cwltool WARNING: Final process status is permanentFail

Full log: https://ci.commonwl.org/job/arvados-conformance/836/console
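
For anyone triaging this, the two DETAIL lines in the error above describe a two-process wait cycle. The following is a minimal sketch, not part of Arvados or cwltest, that parses those lines and reports the cycle. The helper names `parse_waits` and `find_cycle` are hypothetical, and the regular expression assumes the DETAIL wording shown in this report.

```python
import re

# The two DETAIL lines copied from the PostgreSQL error in this report.
DETAIL = """\
Process 12103 waits for ExclusiveLock on relation 16511 of database 16443; blocked by process 6629.
Process 6629 waits for ExclusiveLock on relation 16498 of database 16443; blocked by process 12103.
"""

# Assumes the message wording shown above; other lock types may be phrased differently.
WAIT_RE = re.compile(
    r"Process (\d+) waits for (\w+) on relation (\d+) of database \d+; "
    r"blocked by process (\d+)\."
)


def parse_waits(text):
    """Return {waiting_pid: (blocking_pid, lock_mode, relation_oid)}."""
    waits = {}
    for match in WAIT_RE.finditer(text):
        waiter, mode, relation, blocker = match.groups()
        waits[int(waiter)] = (int(blocker), mode, int(relation))
    return waits


def find_cycle(waits):
    """Follow 'blocked by' edges; return the first cycle as a list of pids, or None."""
    for start in waits:
        path = [start]
        current = waits[start][0]
        while current in waits and current not in path:
            path.append(current)
            current = waits[current][0]
        if current == start:
            return path
    return None


waits = parse_waits(DETAIL)
for pid, (blocker, mode, relation) in waits.items():
    print(f"pid {pid} wants {mode} on relation {relation}, blocked by pid {blocker}")
print("cycle:", find_cycle(waits))
```

Each process waits for an ExclusiveLock on a relation while being blocked by the other, which suggests two concurrent requests acquiring table locks on relations 16511 and 16498 in opposite order. The relation OIDs can be mapped to table names by running `SELECT 16511::regclass, 16498::regclass;` against the affected database.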


Related issues

Related to Arvados - Bug #13594: PG::TRDeadlockDetected when running cwl tests in parallel (Resolved, Tom Clegg)