Bug #13594

PG::TRDeadlockDetected when running cwl tests in parallel

Added by Ward Vandewege about 1 year ago. Updated about 1 year ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: -
Target version:
Start date:
Due date:
% Done: 0%
Estimated time:
Story points: -
Release:
Release relationship: Auto

Description

When I run the CWL conformance tests in parallel, I see occasional arvados-cwl-runner (a-c-r) failures.

This is on API server version 1.1.4.20180605165723:

$ time ./projects/arvados-dev/jenkins/run-cwl-test.sh -d -l vwxyz -j 14
2018-06-08 13:03:55 Loading ARVADOS_API_HOST and ARVADOS_API_TOKEN
2018-06-08 13:03:55 Running 'if [[ ! -e common-workflow-language ]]; then git clone --depth 1 https://github.com/common-workflow-language/common-workflow-language.git; fi' locally
2018-06-08 13:03:55 Running 'printf "%s\n%s\n" '#!/bin/sh' 'exec arvados-cwl-runner --compute-checksum --disable-reuse "$@"' > ~ward/arvados-cwl-runner-with-checksum.sh; chmod 755 ~ward/arvados-cwl-runner-with-checksum.sh' locally
2018-06-08 13:03:55 Running 'cd common-workflow-language; git pull; ARVADOS_API_HOST=104.198.76.112:444 ARVADOS_API_TOKEN=suppressed' locally
remote: Counting objects: 9, done.
remote: Compressing objects: 100% (6/6), done.
remote: Total 9 (delta 6), reused 5 (delta 3), pack-reused 0
Unpacking objects: 100% (9/9), done.
From https://github.com/common-workflow-language/common-workflow-language
   e0cc5bd..4fe434e  master     -> origin/master
Updating e0cc5bd..4fe434e
Fast-forward
 v1.0/conformance_test_v1.0.yaml | 259 ++++++++++++++++++++--------------------
 1 file changed, 130 insertions(+), 129 deletions(-)
--- Running conformance test v1.0 on /home/ward/arvados-cwl-runner-with-checksum.sh ---
/usr/bin/arvados-cwl-runner 1.1.4.20180604132029, arvados-python-client
1.1.4.20180507184611, cwltool 1.0.20180524215209
Test [1/131] General test of command line generation
Test [2/131] Test nested prefixes with arrays
Test [3/131] Test nested command line bindings
Test [4/131] Test command line with optional input (missing)
Test [5/131] Test command line with optional input (provided)
Test [6/131] Test InitialWorkDirRequirement ExpressionEngineRequirement.engineConfig feature
Test [7/131] Test command execution in Docker with stdout redirection
Test [8/131] Test command execution in Docker with simplified syntax stdout redirection
Test [9/131] Test command execution in Docker with stdout redirection
Test [10/131] Test command line with stderr redirection
Test [11/131] Test command line with stderr redirection, brief syntax
Test [12/131] Test command line with stderr redirection, named brief syntax
Test [13/131] Test command execution in Docker with stdin and stdout redirection
Test [14/131] Test default usage of Any in expressions.

Test [15/131] Test explicitly passing null to Any type inputs with default values.
Test [16/131] Testing the string 'null' does not trip up an Any with a default value.
Test [17/131] Test Any without defaults can be unspecified.
Test [18/131] Test explicitly passing null to Any type without a default value.
Test [19/131] Testing the string 'null' does not trip up an Any without a default value.
Test [20/131] Testing Any type compatibility in outputSource
Test [21/131] Test command execution in with stdin and stdout redirection
Test [22/131] Test ExpressionTool with Docker-based expression engine
Test [23/131] Test outputEval to transform output
Test [24/131] Test two step workflow with imported tools
Test [25/131] Test two step workflow with inline tools
Test [26/131] Test single step workflow with Scatter step
Test [27/131] Test single step workflow with Scatter step and two data links connected to
same input, default merge behavior

Test [28/131] Test single step workflow with Scatter step and two data links connected to
same input, nested merge behavior

Test [29/131] Test single step workflow with Scatter step and two data links connected to
same input, flattened merge behavior

Test [30/131] Test that no MultipleInputFeatureRequirement is necessary when
workflow step source is a single-item list

Test [31/131] Test workflow with default value for input parameter (missing)
Test [32/131] Test workflow with default value for input parameter (provided)
Test failed: /home/ward/arvados-cwl-runner-with-checksum.sh --outdir=/tmp/tmpKvFSQ1 --quiet v1.0/count-lines5-wf.cwl v1.0/wc-job.json
Test workflow with default value for input parameter (provided)
Returned non-zero
2018-06-08 13:06:41 arvados.cwl-runner ERROR: Unhandled exception running task
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/arvados_cwl/task_queue.py", line 32, in task_queue_func
    task()
  File "/usr/lib/python2.7/dist-packages/arvados_cwl/arvcontainer.py", line 463, in run
    ).execute(num_retries=self.arvrunner.num_retries)
  File "/usr/lib/python2.7/dist-packages/oauth2client/util.py", line 137, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/googleapiclient/http.py", line 840, in execute
    raise HttpError(resp, content, uri=self.uri)
ApiError: <HttpError 422 when requesting https://104.198.76.112:444/arvados/v1/container_requests?alt=json returned "#<PG::TRDeadlockDetected: ERROR:  deadlock detected
LINE 1: INSERT INTO "container_requests" ("name", "state", "containe...
                    ^
DETAIL:  Process 330 waits for RowExclusiveLock on relation 16498 of database 16387; blocked by process 328.
Process 328 waits for ExclusiveLock on relation 16511 of database 16387; blocked by process 330.
HINT:  See server log for query details.
>">
2018-06-08 13:06:41 cwltool ERROR: Workflow error, try again with --debug for more information:
Workflow did not return a result.

Test [33/131] Test that workflow defaults override tool defaults
Test failed: /home/ward/arvados-cwl-runner-with-checksum.sh --outdir=/tmp/tmpdVIBdL --quiet v1.0/count-lines5-wf.cwl v1.0/empty.json
Test workflow with default value for input parameter (missing)
Returned non-zero
2018-06-08 13:06:39 arvados.cwl-runner ERROR: Unhandled exception running task
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/arvados_cwl/task_queue.py", line 32, in task_queue_func
    task()
  File "/usr/lib/python2.7/dist-packages/arvados_cwl/arvcontainer.py", line 463, in run
    ).execute(num_retries=self.arvrunner.num_retries)
  File "/usr/lib/python2.7/dist-packages/oauth2client/util.py", line 137, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/googleapiclient/http.py", line 840, in execute
    raise HttpError(resp, content, uri=self.uri)
ApiError: <HttpError 422 when requesting https://104.198.76.112:444/arvados/v1/container_requests?alt=json returned "#<PG::TRDeadlockDetected: ERROR:  deadlock detected
DETAIL:  Process 143 waits for RowExclusiveLock on relation 16498 of database 16387; blocked by process 328.
Process 328 waits for ExclusiveLock on relation 16511 of database 16387; blocked by process 143.
HINT:  See server log for query details.
>">
2018-06-08 13:06:42 cwltool ERROR: Workflow error, try again with --debug for more information:
Workflow did not return a result.

Test [34/131] Test EnvVarRequirement
Test [35/131] Test workflow scatter with single scatter parameter
Test [36/131] Test workflow scatter with two scatter parameters and nested_crossproduct join method
Test [37/131] Test workflow scatter with two scatter parameters and flat_crossproduct join method
Test [38/131] Test workflow scatter with two scatter parameters and dotproduct join method
Test [39/131] Test workflow scatter with single empty list parameter
Test [40/131] Test workflow scatter with two scatter parameters and nested_crossproduct join method with second list empty
Test [41/131] Test workflow scatter with two scatter parameters and nested_crossproduct join method with first list empty
Test [42/131] Test workflow scatter with two scatter parameters, one of which is empty and flat_crossproduct join method
Test [43/131] Test workflow scatter with two empty scatter parameters and dotproduct join method
Test [44/131] Test Any type input parameter
Test [45/131] Test nested workflow
Test [46/131] Test requirement priority
Test [47/131] Test requirements override hints
Test [48/131] Test requirements on workflow steps
Test [49/131] Test default value on step input parameter
Test [50/131] Test use default value on step input parameter with empty source
Test failed: /home/ward/arvados-cwl-runner-with-checksum.sh --outdir=/tmp/tmpRiIVhM --quiet v1.0/env-wf3.cwl v1.0/env-job.json
Test requirements on workflow steps
Returned non-zero
2018-06-08 13:08:02 arvados.cwl-runner ERROR: Unhandled exception running task
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/arvados_cwl/task_queue.py", line 32, in task_queue_func
    task()
  File "/usr/lib/python2.7/dist-packages/arvados_cwl/arvcontainer.py", line 463, in run
    ).execute(num_retries=self.arvrunner.num_retries)
  File "/usr/lib/python2.7/dist-packages/oauth2client/util.py", line 137, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/googleapiclient/http.py", line 840, in execute
    raise HttpError(resp, content, uri=self.uri)
ApiError: <HttpError 422 when requesting https://104.198.76.112:444/arvados/v1/container_requests?alt=json returned "#<PG::TRDeadlockDetected: ERROR:  deadlock detected
DETAIL:  Process 328 waits for RowExclusiveLock on relation 16498 of database 16387; blocked by process 389.
Process 389 waits for ExclusiveLock on relation 16511 of database 16387; blocked by process 328.
HINT:  See server log for query details.
>">
2018-06-08 13:08:04 cwltool ERROR: Workflow error, try again with --debug for more information:
Workflow did not return a result.

Test [51/131] Test use default value on step input parameter with null source
Test [52/131] Test default value on step input parameter overridden by provided source
Test [53/131] Test simple workflow
Test failed: /home/ward/arvados-cwl-runner-with-checksum.sh --outdir=/tmp/tmpYRN1vc --quiet v1.0/scatter-wf1.cwl v1.0/scatter-job1.json
Test workflow scatter with single scatter parameter
Returned non-zero
2018-06-08 13:08:25 arvados.cwl-runner WARNING: Overall process status is permanentFail
2018-06-08 13:08:25 cwltool WARNING: Final process status is permanentFail

Test [54/131] Test unknown hints are ignored.
Test [55/131] Test InitialWorkDirRequirement linking input files and capturing secondaryFiles
on input and output. Also tests the use of a variety of parameter references
and expressions in the secondaryFiles field.

Test [56/131] Test InitialWorkDirRequirement with expression in filename.

Test [57/131] Test if trailing newline is present in file entry in InitialWorkDir
Test failed: /home/ward/arvados-cwl-runner-with-checksum.sh --outdir=/tmp/tmp1VmbUt --quiet v1.0/env-wf1.cwl v1.0/env-job.json
Test requirement priority
Returned non-zero
2018-06-08 13:08:30 arvados.cwl-runner WARNING: Overall process status is permanentFail
2018-06-08 13:08:31 cwltool WARNING: Final process status is permanentFail

Test [58/131] Test inline expressions

Test failed: /home/ward/arvados-cwl-runner-with-checksum.sh --outdir=/tmp/tmp81MTs9 --quiet v1.0/iwdr-entry.cwl v1.0/string-job.json
Test if trailing newline is present in file entry in InitialWorkDir
Returned non-zero
2018-06-08 13:08:35 arvados.cwl-runner ERROR: [container iwdr-entry.cwl] got error <HttpError 422 when requesting https://104.198.76.112:444/arvados/v1/container_requests?alt=json returned "#<PG::TRDeadlockDetected: ERROR:  deadlock detected at character 13
DETAIL:  Process 327 waits for RowExclusiveLock on relation 16498 of database 16387; blocked by process 330.
Process 330 waits for ExclusiveLock on relation 16511 of database 16387; blocked by process 327.
HINT:  See server log for query details.
>">
2018-06-08 13:08:35 arvados.cwl-runner WARNING: Overall process status is permanentFail
2018-06-08 13:08:35 cwltool WARNING: Final process status is permanentFail

Test [59/131] Test SchemaDefRequirement definition used in tool parameter

Test failed: /home/ward/arvados-cwl-runner-with-checksum.sh --outdir=/tmp/tmpcN9gUL --quiet v1.0/wc4-tool.cwl v1.0/wc-job.json
Test inline expressions

Returned non-zero
2018-06-08 13:08:37 arvados.cwl-runner ERROR: [container wc4-tool.cwl] got error <HttpError 422 when requesting https://104.198.76.112:444/arvados/v1/container_requests?alt=json returned "#<PG::TRDeadlockDetected: ERROR:  deadlock detected at character 13
DETAIL:  Process 143 waits for RowExclusiveLock on relation 16498 of database 16387; blocked by process 339.
Process 339 waits for ExclusiveLock on relation 16511 of database 16387; blocked by process 143.
HINT:  See server log for query details.
>">
2018-06-08 13:08:37 arvados.cwl-runner WARNING: Overall process status is permanentFail
2018-06-08 13:08:37 cwltool WARNING: Final process status is permanentFail

Test [60/131] Test SchemaDefRequirement definition used in workflow parameter

Test [61/131] Test parameter evaluation, no support for JS expressions

Test [62/131] Test parameter evaluation, with support for JS expressions

Test [63/131] Test metadata
Test [64/131] Test simple format checking.

Test [65/131] Test format checking against ontology using subclassOf.

Test [66/131] Test format checking against ontology using equivalentClass.

Test [67/131] Test optional output file and optional secondaryFile on output.

Test [68/131] Test that second expression in concatenated valueFrom is not ignored
Test [69/131] Test valueFrom on workflow step.
Test [70/131] Test valueFrom on workflow step with multiple sources
Test failed: /home/ward/arvados-cwl-runner-with-checksum.sh --outdir=/tmp/tmpa85k3O --quiet v1.0/revsort.cwl v1.0/revsort-job.json
Test simple workflow
Returned non-zero
2018-06-08 13:09:03 arvados.cwl-runner WARNING: Overall process status is permanentFail
2018-06-08 13:09:03 cwltool WARNING: Final process status is permanentFail

Test [71/131] Test valueFrom on workflow step referencing other inputs
Test [72/131] Test record type output binding.
Test [73/131] Test support for reading cwl.output.json when running in a Docker container
and just 'path' is provided.

Test [74/131] Test support for reading cwl.output.json when running in a Docker container
and just 'location' is provided.

Test [75/131] Test support for returning multiple glob patterns from expression
Test [76/131] Test workflow scatter with single scatter parameter and valueFrom on step input
Test [77/131] Test workflow scatter with two scatter parameters and nested_crossproduct join method and valueFrom on step input
Test [78/131] Test workflow scatter with two scatter parameters and flat_crossproduct join method and valueFrom on step input
Test [79/131] Test workflow scatter with two scatter parameters and dotproduct join method and valueFrom on step input
Test [80/131] Test workflow scatter with single scatter parameter and valueFrom on step input
Test [81/131] Test valueFrom eval on scattered input parameter
Test [82/131] Test workflow two input files with same name.
Test [83/131] Test directory input with parameter reference
Test [84/131] Test directory input in Docker
Test failed: /home/ward/arvados-cwl-runner-with-checksum.sh --outdir=/tmp/tmpKIKTtS --quiet v1.0/scatter-valuefrom-wf6.cwl v1.0/scatter-valuefrom-job3.json
Test valueFrom eval on scattered input parameter
Returned non-zero
2018-06-08 13:09:36 arvados.cwl-runner ERROR: Unhandled exception running task
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/arvados_cwl/task_queue.py", line 32, in task_queue_func
    task()
  File "/usr/lib/python2.7/dist-packages/arvados_cwl/arvcontainer.py", line 463, in run
    ).execute(num_retries=self.arvrunner.num_retries)
  File "/usr/lib/python2.7/dist-packages/oauth2client/util.py", line 137, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/googleapiclient/http.py", line 840, in execute
    raise HttpError(resp, content, uri=self.uri)
ApiError: <HttpError 422 when requesting https://104.198.76.112:444/arvados/v1/container_requests?alt=json returned "#<PG::TRDeadlockDetected: ERROR:  deadlock detected
DETAIL:  Process 328 waits for RowExclusiveLock on relation 16498 of database 16387; blocked by process 331.
Process 331 waits for ExclusiveLock on relation 16511 of database 16387; blocked by process 328.
HINT:  See server log for query details.
>">
2018-06-08 13:09:36 cwltool ERROR: Workflow error, try again with --debug for more information:
Workflow did not return a result.

Test [85/131] Test directory output
Test [86/131] Test directories in secondaryFiles
Test [87/131] Test dynamic initial work dir
Test [88/131] Test writable staged files.
Test [89/131] Test file literal as input
Test [90/131] Test expression in InitialWorkDir listing
Test failed: /home/ward/arvados-cwl-runner-with-checksum.sh --outdir=/tmp/tmpNHDG18 --quiet v1.0/dir5.cwl v1.0/dir-job.yml
Test dynamic initial work dir
Returned non-zero
2018-06-08 13:09:56 arvados.cwl-runner ERROR: [container dir5.cwl] got error <HttpError 422 when requesting https://104.198.76.112:444/arvados/v1/container_requests?alt=json returned "#<PG::TRDeadlockDetected: ERROR:  deadlock detected
DETAIL:  Process 327 waits for RowExclusiveLock on relation 16498 of database 16387; blocked by process 143.
Process 143 waits for ExclusiveLock on relation 16511 of database 16387; blocked by process 327.
HINT:  See server log for query details.
>">
2018-06-08 13:09:56 arvados.cwl-runner WARNING: Overall process status is permanentFail
2018-06-08 13:09:58 cwltool WARNING: Final process status is permanentFail

Test [91/131] Test nameroot/nameext expression in arguments, stdout
Test failed: /home/ward/arvados-cwl-runner-with-checksum.sh --outdir=/tmp/tmpQNLjVf --quiet v1.0/cat3-tool.cwl v1.0/file-literal.yml
Test file literal as input
Returned non-zero
2018-06-08 13:09:58 arvados.cwl-runner ERROR: [container cat3-tool.cwl] got error <HttpError 422 when requesting https://104.198.76.112:444/arvados/v1/container_requests?alt=json returned "#<PG::TRDeadlockDetected: ERROR:  deadlock detected
DETAIL:  Process 143 waits for RowExclusiveLock on relation 16498 of database 16387; blocked by process 525.
Process 525 waits for ExclusiveLock on relation 16511 of database 16387; blocked by process 143.
HINT:  See server log for query details.
>">
2018-06-08 13:09:58 arvados.cwl-runner WARNING: Overall process status is permanentFail
2018-06-08 13:10:00 cwltool WARNING: Final process status is permanentFail

Test [92/131] Test directory input with inputBinding
Test [93/131] Test command line generation of array-of-arrays
Test [94/131] Test $HOME and $TMPDIR are set correctly
Test [95/131] Test $HOME and $TMPDIR are set correctly in Docker
Test failed: /home/ward/arvados-cwl-runner-with-checksum.sh --outdir=/tmp/tmpRwpcja --quiet v1.0/nameroot.cwl v1.0/wc-job.json
Test nameroot/nameext expression in arguments, stdout
Returned non-zero
2018-06-08 13:10:08 arvados.cwl-runner ERROR: [container nameroot.cwl] got error <HttpError 422 when requesting https://104.198.76.112:444/arvados/v1/container_requests?alt=json returned "#<PG::TRDeadlockDetected: ERROR:  deadlock detected
DETAIL:  Process 327 waits for RowExclusiveLock on relation 16498 of database 16387; blocked by process 516.
Process 516 waits for ExclusiveLock on relation 16511 of database 16387; blocked by process 327.
HINT:  See server log for query details.
>">
2018-06-08 13:10:08 arvados.cwl-runner WARNING: Overall process status is permanentFail
2018-06-08 13:10:12 cwltool WARNING: Final process status is permanentFail

Test [96/131] Test that expressionLib requirement of individual tool step overrides expressionLib of workflow.
Test failed: /home/ward/arvados-cwl-runner-with-checksum.sh --outdir=/tmp/tmpNz2Ojp --quiet 'v1.0/js-expr-req-wf.cwl#wf' v1.0/empty.json
Test that expressionLib requirement of individual tool step overrides expressionLib of workflow.
Returned non-zero
2018-06-08 13:10:22 arvados.cwl-runner ERROR: Unhandled exception running task
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/arvados_cwl/task_queue.py", line 32, in task_queue_func
    task()
  File "/usr/lib/python2.7/dist-packages/arvados_cwl/arvcontainer.py", line 463, in run
    ).execute(num_retries=self.arvrunner.num_retries)
  File "/usr/lib/python2.7/dist-packages/oauth2client/util.py", line 137, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/googleapiclient/http.py", line 840, in execute
    raise HttpError(resp, content, uri=self.uri)
ApiError: <HttpError 422 when requesting https://104.198.76.112:444/arvados/v1/container_requests?alt=json returned "#<PG::TRDeadlockDetected: ERROR:  deadlock detected at character 13
DETAIL:  Process 516 waits for RowExclusiveLock on relation 16498 of database 16387; blocked by process 327.
Process 327 waits for ExclusiveLock on relation 16511 of database 16387; blocked by process 516.
HINT:  See server log for query details.
>">
2018-06-08 13:10:24 cwltool ERROR: Workflow error, try again with --debug for more information:
Workflow did not return a result.

Test [97/131] Test output of InitialWorkDir
Test [98/131] Test embedded subworkflow
Test [99/131] Test secondaryFiles on array of files.
Test failed: /home/ward/arvados-cwl-runner-with-checksum.sh --outdir=/tmp/tmpmo60IN --quiet 'v1.0/scatter-valuefrom-wf4.cwl#main' v1.0/scatter-valuefrom-job2.json
Test workflow scatter with two scatter parameters and dotproduct join method and valueFrom on step input
Returned non-zero
2018-06-08 13:10:32 arvados.cwl-runner WARNING: Overall process status is permanentFail
2018-06-08 13:10:32 cwltool WARNING: Final process status is permanentFail

Test [100/131] Test directory literal output created by ExpressionTool
Test [101/131] Test file literal output created by ExpressionTool
Test failed: /home/ward/arvados-cwl-runner-with-checksum.sh --outdir=/tmp/tmp8FlN2t --quiet v1.0/scatter-valuefrom-wf2.cwl v1.0/scatter-valuefrom-job2.json
Test workflow scatter with two scatter parameters and nested_crossproduct join method and valueFrom on step input
Returned non-zero
2018-06-08 13:10:37 arvados.cwl-runner WARNING: Overall process status is permanentFail
2018-06-08 13:10:37 cwltool WARNING: Final process status is permanentFail

Test [102/131] Test dockerOutputDirectory
Test [103/131] Test hints with $import
Test [104/131] Test warning instead of error when default path is not found
Test [105/131] Test InlineJavascriptRequirement with multiple expressions in the same tool
Test [106/131] Test if a writable input directory is recursivly copied and writable
Test [107/131] Test that missing parameters are null (not undefined) in expression
Test [108/131] Test that provided parameter is not null in expression
Test [109/131] Test compound workflow document
Test [110/131] Test that nameroot and nameext are generated from basename at execution time by the runner
Test [111/131] Test that file path in $(inputs) for initialworkdir is in $(outdir).
Test [112/131] Test single step workflow with Scatter step and two data links connected to
same input, flattened merge behavior. Workflow inputs are set as list

Test [113/131] Test step input with multiple sources with multiple types
Test [114/131] Test that shell directives are not interpreted.
Test [115/131] Test that shell directives are quoted.
Test failed: /home/ward/arvados-cwl-runner-with-checksum.sh --outdir=/tmp/tmpjFUove --quiet v1.0/sum-wf.cwl v1.0/sum-job.json
Test step input with multiple sources with multiple types
Returned non-zero
2018-06-08 13:10:59 arvados.cwl-runner ERROR: Unhandled exception running task
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/arvados_cwl/task_queue.py", line 32, in task_queue_func
    task()
  File "/usr/lib/python2.7/dist-packages/arvados_cwl/arvcontainer.py", line 463, in run
    ).execute(num_retries=self.arvrunner.num_retries)
  File "/usr/lib/python2.7/dist-packages/oauth2client/util.py", line 137, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/googleapiclient/http.py", line 840, in execute
    raise HttpError(resp, content, uri=self.uri)
ApiError: <HttpError 422 when requesting https://104.198.76.112:444/arvados/v1/container_requests?alt=json returned "#<PG::TRDeadlockDetected: ERROR:  deadlock detected
LINE 1: INSERT INTO "container_requests" ("name", "state", "containe...
                    ^
DETAIL:  Process 330 waits for RowExclusiveLock on relation 16498 of database 16387; blocked by process 331.
Process 331 waits for ExclusiveLock on relation 16511 of database 16387; blocked by process 330.
HINT:  See server log for query details.
>">
2018-06-08 13:10:59 cwltool ERROR: Workflow error, try again with --debug for more information:
Workflow did not return a result.

Test [116/131] Test empty writable dir with InitialWorkDirRequirement
Test [117/131] Test empty writable dir with InitialWorkDirRequirement inside Docker
Test [118/131] Test dynamic resource reqs referencing inputs
Test [119/131] Test file literal as input without Docker
Test [120/131] Test that OutputBinding.glob is sorted as specified by POSIX
Test [121/131] Test InitialWorkDirRequirement with a nested directory structure from another step
Test [122/131] Test that boolean flags do not appear on command line if inputBinding is empty and not null
Test [123/131] Test that expression engine does not fail to evaluate reference to self with unprovided input
Test [124/131] Test successCodes
Test [125/131] Test simple workflow with a dynamic resource requirement
Test [126/131] Test that empty array input does not add anything to command line
Test [127/131] Test that ResourceRequirement on a step level redefines requirement on the workflow level

Test [128/131] Test valueFrom with constant value overriding provided array inputs
Test [129/131] Test dynamic resource reqs referencing the size of Files inside a Directory
Test [130/131] Test that it is not an error to connect a parameter to a workflow
step, even if the parameter doesn't appear in the `run` process
inputs.

Test [131/131] Test that parameters that doesn't appear in the `run` process
inputs are not present in the input object used to run the tool.

115 tests passed, 16 failures, 0 unsupported features

1 tool tests failed
2018-06-08 13:12:12 ERROR running command locally: exit code 1
Failed ./run_test.sh -j14 RUNNER=/home/ward/arvados-cwl-runner-with-checksum.sh

real    8m17.011s
user    7m26.259s
sys    0m30.876s
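
The DETAIL lines in the tracebacks above all follow the same pattern, so they can be summarized mechanically. The following is a minimal sketch, not part of the original report: it parses a pasted sample of those lines (two of the reported deadlocks, copied from the log) and checks that each one is a two-process wait cycle. The helper names `parse_waits` and `two_process_cycles` are hypothetical and exist only for illustration.

```python
import re
from collections import Counter

# Sample DETAIL lines copied from the log above (two of the reported deadlocks).
SAMPLE = """\
DETAIL:  Process 330 waits for RowExclusiveLock on relation 16498 of database 16387; blocked by process 328.
Process 328 waits for ExclusiveLock on relation 16511 of database 16387; blocked by process 330.
DETAIL:  Process 143 waits for RowExclusiveLock on relation 16498 of database 16387; blocked by process 328.
Process 328 waits for ExclusiveLock on relation 16511 of database 16387; blocked by process 143.
"""

WAIT_RE = re.compile(
    r"Process (\d+) waits for (\w+) on relation (\d+) of database \d+; "
    r"blocked by process (\d+)\."
)


def parse_waits(text):
    """Return (waiter_pid, lock_mode, relation_oid, blocker_pid) tuples."""
    return [
        (int(waiter), mode, int(rel), int(blocker))
        for waiter, mode, rel, blocker in WAIT_RE.findall(text)
    ]


def two_process_cycles(waits):
    """Return sorted pid pairs (a, b) where a waits on b and b waits on a."""
    edges = {(waiter, blocker) for waiter, _, _, blocker in waits}
    return sorted(
        {(min(a, b), max(a, b)) for a, b in edges if (b, a) in edges}
    )


waits = parse_waits(SAMPLE)
lock_counts = Counter((mode, rel) for _, mode, rel, _ in waits)
cycles = two_process_cycles(waits)

for (mode, rel), n in sorted(lock_counts.items()):
    print(f"{n} wait(s) for {mode} on relation {rel}")
print("two-process cycles:", cycles)
```

Feeding the full log into `parse_waits` instead of `SAMPLE` would show whether every reported deadlock involves the same two relations (16498 and 16511), which is what the excerpts above suggest.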

Related issues

Related to Arvados - Bug #13164: [API] dispatch sometimes tries to run cancelled containers at the expense of pending containers (Resolved, 05/14/2018)

Related to Arvados - Bug #13500: crunch-dispatch-slurm PG::TRDeadlockDetected: ERROR: deadlock detected (Closed)

Related to Arvados - Bug #13491: arvbox deadlocks on parallel usage (Resolved)

Associated revisions

Revision 85c13201
Added by Tom Clegg about 1 year ago

13594: Remove table lock.

refs #13594

Arvados-DCO-1.1-Signed-off-by: Tom Clegg <>
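
The commit message does not say which lock was removed, but the deadlock reports in the description show one process waiting for RowExclusiveLock (the mode an INSERT takes) and the other waiting for ExclusiveLock (the mode taken by an explicit `LOCK TABLE ... IN EXCLUSIVE MODE`). As a rough illustration, the sketch below encodes only the two relevant rows of PostgreSQL's table-level lock conflict matrix, as given in the PostgreSQL explicit-locking documentation. The `conflicts` helper is hypothetical, and this is not Arvados code.

```python
# Two rows of PostgreSQL's table-level lock conflict matrix.
# Key: requested mode. Value: held modes that block the request.
CONFLICTS = {
    "ROW EXCLUSIVE": {
        "SHARE",
        "SHARE ROW EXCLUSIVE",
        "EXCLUSIVE",
        "ACCESS EXCLUSIVE",
    },
    "EXCLUSIVE": {
        "ROW SHARE",
        "ROW EXCLUSIVE",
        "SHARE UPDATE EXCLUSIVE",
        "SHARE",
        "SHARE ROW EXCLUSIVE",
        "EXCLUSIVE",
        "ACCESS EXCLUSIVE",
    },
}


def conflicts(requested, held):
    """Return True if a request for `requested` must wait while `held` is held."""
    return held in CONFLICTS[requested]


# Two concurrent INSERTs on the same table do not block each other.
print(conflicts("ROW EXCLUSIVE", "ROW EXCLUSIVE"))  # False

# An INSERT must wait while another session holds the table in EXCLUSIVE mode.
print(conflicts("ROW EXCLUSIVE", "EXCLUSIVE"))  # True

# A request for EXCLUSIVE mode must wait for any session that has already
# written to the table and still holds ROW EXCLUSIVE.
print(conflicts("EXCLUSIVE", "ROW EXCLUSIVE"))  # True
```

Under these rules, two transactions that each write to one table and then request an explicit table lock on another, in opposite order, can end up waiting on each other. Without explicit table locks, plain INSERTs take only ROW EXCLUSIVE, which does not conflict with itself, so this particular table-level cycle should not arise.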

History

#1 Updated by Ward Vandewege about 1 year ago

  • Description updated

#2 Updated by Ward Vandewege about 1 year ago

  • Related to Bug #13164: [API] dispatch sometimes tries to run cancelled containers at the expense of pending containers added

#3 Updated by Ward Vandewege about 1 year ago

  • Related to Bug #13500: crunch-dispatch-slurm PG::TRDeadlockDetected: ERROR: deadlock detected added

#4 Updated by Ward Vandewege about 1 year ago

On arvados-api-server 1.1.4.20180608190512-8:

$ time ./projects/arvados-dev/jenkins/run-cwl-test.sh -d -l vwxyz -j 14
2018-06-08 17:14:03 Loading ARVADOS_API_HOST and ARVADOS_API_TOKEN
2018-06-08 17:14:03 Running 'if [[ ! -e common-workflow-language ]]; then git clone --depth 1 https://github.com/common-workflow-language/common-workflow-language.git; fi' locally
2018-06-08 17:14:03 Running 'printf "%s\n%s\n" '#!/bin/sh' 'exec arvados-cwl-runner --compute-checksum --disable-reuse "$@"' > ~ward/arvados-cwl-runner-with-checksum.sh; chmod 755 ~ward/arvados-cwl-runner-with-checksum.sh' locally
2018-06-08 17:14:03 Running 'cd common-workflow-language; git pull; ARVADOS_API_HOST=104.198.76.112:444 ARVADOS_API_TOKEN=suppressed' locally
Already up-to-date.
--- Running conformance test v1.0 on /home/ward/arvados-cwl-runner-with-checksum.sh ---
/usr/bin/arvados-cwl-runner 1.1.4.20180604132029, arvados-python-client
1.1.4.20180507184611, cwltool 1.0.20180524215209
Test [1/131] General test of command line generation
Test [2/131] Test nested prefixes with arrays
Test [3/131] Test nested command line bindings
Test [4/131] Test command line with optional input (missing)
Test [5/131] Test command line with optional input (provided)
Test [6/131] Test InitialWorkDirRequirement ExpressionEngineRequirement.engineConfig feature
Test [8/131] Test command execution in Docker with simplified syntax stdout redirection
Test [7/131] Test command execution in Docker with stdout redirection
Test [9/131] Test command execution in Docker with stdout redirection
Test [10/131] Test command line with stderr redirection
Test [11/131] Test command line with stderr redirection, brief syntax
Test [12/131] Test command line with stderr redirection, named brief syntax
Test [13/131] Test command execution in Docker with stdin and stdout redirection
Test [14/131] Test default usage of Any in expressions.
Test [15/131] Test explicitly passing null to Any type inputs with default values.
Test [16/131] Testing the string 'null' does not trip up an Any with a default value.
Test [17/131] Test Any without defaults can be unspecified.
Test [18/131] Test explicitly passing null to Any type without a default value.
Test [19/131] Testing the string 'null' does not trip up an Any without a default value.
Test [20/131] Testing Any type compatibility in outputSource
Test [21/131] Test command execution in with stdin and stdout redirection
Test [22/131] Test ExpressionTool with Docker-based expression engine
Test [23/131] Test outputEval to transform output
Test [24/131] Test two step workflow with imported tools
Test [25/131] Test two step workflow with inline tools
Test [26/131] Test single step workflow with Scatter step
Test [27/131] Test single step workflow with Scatter step and two data links connected to
same input, default merge behavior

Test [28/131] Test single step workflow with Scatter step and two data links connected to
same input, nested merge behavior

Test [29/131] Test single step workflow with Scatter step and two data links connected to
same input, flattened merge behavior

Test [30/131] Test that no MultipleInputFeatureRequirement is necessary when
workflow step source is a single-item list

Test [31/131] Test workflow with default value for input parameter (missing)
Test [32/131] Test workflow with default value for input parameter (provided)
Test [33/131] Test that workflow defaults override tool defaults
Test [34/131] Test EnvVarRequirement
Test [35/131] Test workflow scatter with single scatter parameter
Test [36/131] Test workflow scatter with two scatter parameters and nested_crossproduct join method
Test [37/131] Test workflow scatter with two scatter parameters and flat_crossproduct join method
Test [38/131] Test workflow scatter with two scatter parameters and dotproduct join method
Test [39/131] Test workflow scatter with single empty list parameter
Test [40/131] Test workflow scatter with two scatter parameters and nested_crossproduct join method with second list empty
Test [41/131] Test workflow scatter with two scatter parameters and nested_crossproduct join method with first list empty
Test [42/131] Test workflow scatter with two scatter parameters, one of which is empty and flat_crossproduct join method
Test [43/131] Test workflow scatter with two empty scatter parameters and dotproduct join method

Test [44/131] Test Any type input parameter
Test [45/131] Test nested workflow
Test [46/131] Test requirement priority
Test [47/131] Test requirements override hints
Test [48/131] Test requirements on workflow steps
Test [49/131] Test default value on step input parameter
Test [50/131] Test use default value on step input parameter with empty source
Test [51/131] Test use default value on step input parameter with null source
Test [52/131] Test default value on step input parameter overridden by provided source
Test [53/131] Test simple workflow
Test [54/131] Test unknown hints are ignored.
Test [55/131] Test InitialWorkDirRequirement linking input files and capturing secondaryFiles
on input and output. Also tests the use of a variety of parameter references
and expressions in the secondaryFiles field.

Test [56/131] Test InitialWorkDirRequirement with expression in filename.

Test [57/131] Test if trailing newline is present in file entry in InitialWorkDir
Test [58/131] Test inline expressions

Test [59/131] Test SchemaDefRequirement definition used in tool parameter

Test [60/131] Test SchemaDefRequirement definition used in workflow parameter

Test [61/131] Test parameter evaluation, no support for JS expressions

Test [62/131] Test parameter evaluation, with support for JS expressions

Test [63/131] Test metadata
Test [64/131] Test simple format checking.

Test [65/131] Test format checking against ontology using subclassOf.

Test [66/131] Test format checking against ontology using equivalentClass.

Test [67/131] Test optional output file and optional secondaryFile on output.

Test [68/131] Test that second expression in concatenated valueFrom is not ignored
Test [69/131] Test valueFrom on workflow step.
Test [70/131] Test valueFrom on workflow step with multiple sources
Test [71/131] Test valueFrom on workflow step referencing other inputs
Test [72/131] Test record type output binding.
Test [73/131] Test support for reading cwl.output.json when running in a Docker container
and just 'path' is provided.

Test [74/131] Test support for reading cwl.output.json when running in a Docker container
and just 'location' is provided.

Test [75/131] Test support for returning multiple glob patterns from expression
Test [76/131] Test workflow scatter with single scatter parameter and valueFrom on step input
Test [77/131] Test workflow scatter with two scatter parameters and nested_crossproduct join method and valueFrom on step input
Test [78/131] Test workflow scatter with two scatter parameters and flat_crossproduct join method and valueFrom on step input
Test [79/131] Test workflow scatter with two scatter parameters and dotproduct join method and valueFrom on step input
Test [80/131] Test workflow scatter with single scatter parameter and valueFrom on step input
Test [81/131] Test valueFrom eval on scattered input parameter
Test [82/131] Test workflow two input files with same name.
Test [83/131] Test directory input with parameter reference
Test [84/131] Test directory input in Docker
Test [85/131] Test directory output
Test [86/131] Test directories in secondaryFiles
Test [87/131] Test dynamic initial work dir
Test [88/131] Test writable staged files.

Test [89/131] Test file literal as input
Test [90/131] Test expression in InitialWorkDir listing
Test [91/131] Test nameroot/nameext expression in arguments, stdout
Test [92/131] Test directory input with inputBinding
Test [93/131] Test command line generation of array-of-arrays
Test [94/131] Test $HOME and $TMPDIR are set correctly
Test [95/131] Test $HOME and $TMPDIR are set correctly in Docker
Test [96/131] Test that expressionLib requirement of individual tool step overrides expressionLib of workflow.
Test [97/131] Test output of InitialWorkDir
Test [98/131] Test embedded subworkflow
Test [99/131] Test secondaryFiles on array of files.
Test [100/131] Test directory literal output created by ExpressionTool
Test [101/131] Test file literal output created by ExpressionTool
Test [102/131] Test dockerOutputDirectory
Test [103/131] Test hints with $import
Test [104/131] Test warning instead of error when default path is not found
Test [105/131] Test InlineJavascriptRequirement with multiple expressions in the same tool
Test [106/131] Test if a writable input directory is recursivly copied and writable
Test [107/131] Test that missing parameters are null (not undefined) in expression
Test [108/131] Test that provided parameter is not null in expression
Test [109/131] Test compound workflow document
Test [110/131] Test that nameroot and nameext are generated from basename at execution time by the runner
Test [111/131] Test that file path in $(inputs) for initialworkdir is in $(outdir).
Test [112/131] Test single step workflow with Scatter step and two data links connected to
same input, flattened merge behavior. Workflow inputs are set as list

Test [113/131] Test step input with multiple sources with multiple types
Test [114/131] Test that shell directives are not interpreted.
Test [115/131] Test that shell directives are quoted.
Test [116/131] Test empty writable dir with InitialWorkDirRequirement
Test [117/131] Test empty writable dir with InitialWorkDirRequirement inside Docker
Test [118/131] Test dynamic resource reqs referencing inputs
Test [119/131] Test file literal as input without Docker
Test [120/131] Test that OutputBinding.glob is sorted as specified by POSIX
Test [121/131] Test InitialWorkDirRequirement with a nested directory structure from another step
Test [122/131] Test that boolean flags do not appear on command line if inputBinding is empty and not null
Test [123/131] Test that expression engine does not fail to evaluate reference to self with unprovided input
Test [124/131] Test successCodes
Test [125/131] Test simple workflow with a dynamic resource requirement
Test [126/131] Test that empty array input does not add anything to command line
Test [127/131] Test that ResourceRequirement on a step level redefines requirement on the workflow level
Test [128/131] Test valueFrom with constant value overriding provided array inputs
Test [129/131] Test dynamic resource reqs referencing the size of Files inside a Directory
Test [130/131] Test that it is not an error to connect a parameter to a workflow
step, even if the parameter doesn't appear in the `run` process
inputs.

Test [131/131] Test that parameters that doesn't appear in the `run` process
inputs are not present in the input object used to run the tool.

All tests passed

All tool tests succeeded
2018-06-08 17:20:51 Running 'cd arvados/sdk/cwl/tests; export ARVADOS_API_HOST=104.198.76.112:444 ARVADOS_API_TOKEN=suppressed' locally
bash: line 0: cd: arvados/sdk/cwl/tests: No such file or directory
fatal: Not a git repository (or any parent up to mount point /home)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
2018-06-08 17:20:51 ERROR running command locally: exit code 128

real    6m47.735s
user    9m2.149s
sys    0m31.261s

#5 Updated by Ward Vandewege about 1 year ago

  • Status changed from New to Resolved

#6 Updated by Ward Vandewege about 1 year ago

  • Related to Bug #13491: arvbox deadlocks on parallel usage added

#7 Updated by Tom Clegg about 1 year ago

  • Assigned To set to Tom Clegg
  • Target version set to 2018-06-20 Sprint

#8 Updated by Tom Morris about 1 year ago

  • Release set to 13
