Bug #18238

CWL integration test failing

Added by Peter Amstutz 22 days ago. Updated about 5 hours ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
CWL
Target version:
Start date:
Due date:
% Done:

0%

Estimated time:
Story points:
-
Release relationship:
Auto

Related issues

Blocks Arvados - Bug #18264: [CI] simplify the way we run the CWL testsIn Progress

History

#2 Updated by Peter Amstutz 17 days ago

Can reproduce locally, using arvbox and singularity:

Test [17/43] RunInSingleContainer handles dynamic resource requests on step
Test 17 failed: arvados-cwl-runner --disable-reuse --compute-checksum --api=containers --outdir=/tmp/tmphqdknf_c --quiet wf/runin-reqs-wf.cwl
RunInSingleContainer handles dynamic resource requests on step
Returned non-zero
ERROR [container runin-reqs-wf.cwl] (x2z00-dz642-i4adalavi61zxpd) error log:

  2021-10-04T18:47:41.019308781Z stderr   2021-10-04T18:47:30.758696399Z stderr   File "/usr/share/python2.7/dist/python-arvados-cwl-runner/lib/python2.7/site-packages/googleapiclient/_helpers.py", line 130, in positional_wrapper
  2021-10-04T18:47:41.019308781Z stderr   2021-10-04T18:47:30.758724310Z stderr     return wrapped(*args, **kwargs)
  2021-10-04T18:47:41.019308781Z stderr   2021-10-04T18:47:30.758728058Z stderr   File "/usr/share/python2.7/dist/python-arvados-cwl-runner/lib/python2.7/site-packages/googleapiclient/discovery.py", line 222, in build
  2021-10-04T18:47:41.019308781Z stderr   2021-10-04T18:47:30.758769735Z stderr     requested_url, discovery_http, cache_discovery, cache)
  2021-10-04T18:47:41.019308781Z stderr   2021-10-04T18:47:30.758774574Z stderr   File "/usr/share/python2.7/dist/python-arvados-cwl-runner/lib/python2.7/site-packages/googleapiclient/discovery.py", line 269, in _retrieve_discovery_doc
  2021-10-04T18:47:41.019308781Z stderr   2021-10-04T18:47:30.758822315Z stderr     resp, content = http.request(actual_url)
  2021-10-04T18:47:41.019308781Z stderr   2021-10-04T18:47:30.758822315Z stderr   File "/usr/share/python2.7/dist/python-arvados-cwl-runner/lib/python2.7/site-packages/arvados/api.py", line 85, in _intercept_http_request
  2021-10-04T18:47:41.019308781Z stderr   2021-10-04T18:47:30.758846194Z stderr     return self.orig_http_request(uri, method, headers=headers, **kwargs)
  2021-10-04T18:47:41.019308781Z stderr   2021-10-04T18:47:30.758848796Z stderr   File "/usr/share/python2.7/dist/python-arvados-cwl-runner/lib/python2.7/site-packages/httplib2/__init__.py", line 2135, in request
  2021-10-04T18:47:41.019308781Z stderr   2021-10-04T18:47:30.759172430Z stderr     cachekey,
  2021-10-04T18:47:41.019308781Z stderr   2021-10-04T18:47:30.759172430Z stderr   File "/usr/share/python2.7/dist/python-arvados-cwl-runner/lib/python2.7/site-packages/httplib2/__init__.py", line 1796, in _request
  2021-10-04T18:47:41.019308781Z stderr   2021-10-04T18:47:30.759431490Z stderr     conn, request_uri, method, body, headers
  2021-10-04T18:47:41.019308781Z stderr   2021-10-04T18:47:30.759431490Z stderr   File "/usr/share/python2.7/dist/python-arvados-cwl-runner/lib/python2.7/site-packages/httplib2/__init__.py", line 1701, in _conn_request
  2021-10-04T18:47:41.019308781Z stderr   2021-10-04T18:47:30.759674790Z stderr     conn.connect()
  2021-10-04T18:47:41.019308781Z stderr   2021-10-04T18:47:30.759674790Z stderr   File "/usr/share/python2.7/dist/python-arvados-cwl-runner/lib/python2.7/site-packages/httplib2/__init__.py", line 1368, in connect
  2021-10-04T18:47:41.019308781Z stderr   2021-10-04T18:47:30.759873843Z stderr     self.host,
  2021-10-04T18:47:41.019308781Z stderr   2021-10-04T18:47:30.759873843Z stderr   File "/usr/share/python2.7/dist/python-arvados-cwl-runner/lib/python2.7/site-packages/httplib2/__init__.py", line 96, in _ssl_wrap_socket
  2021-10-04T18:47:41.019308781Z stderr   2021-10-04T18:47:30.759888451Z stderr     return context.wrap_socket(sock, server_hostname=hostname)
  2021-10-04T18:47:41.019308781Z stderr   2021-10-04T18:47:30.759890986Z stderr   File "/usr/lib/python2.7/ssl.py", line 363, in wrap_socket
  2021-10-04T18:47:41.019308781Z stderr   2021-10-04T18:47:30.760945593Z stderr     _context=self)
  2021-10-04T18:47:41.019308781Z stderr   2021-10-04T18:47:30.760945593Z stderr   File "/usr/lib/python2.7/ssl.py", line 611, in __init__
  2021-10-04T18:47:41.019308781Z stderr   2021-10-04T18:47:30.761053290Z stderr     self.do_handshake()
  2021-10-04T18:47:41.019308781Z stderr   2021-10-04T18:47:30.761053290Z stderr   File "/usr/lib/python2.7/ssl.py", line 848, in do_handshake
  2021-10-04T18:47:41.019308781Z stderr   2021-10-04T18:47:30.761184705Z stderr     match_hostname(self.getpeercert(), self.server_hostname)
  2021-10-04T18:47:41.019308781Z stderr   2021-10-04T18:47:30.761184705Z stderr   File "/usr/lib/python2.7/ssl.py", line 286, in match_hostname
  2021-10-04T18:47:41.019308781Z stderr   2021-10-04T18:47:30.761241630Z stderr     % (hostname, dnsnames[0]))
  2021-10-04T18:47:41.019308781Z stderr   2021-10-04T18:47:30.761265120Z stderr ssl.CertificateError: hostname '172.17.0.2' doesn't match 'localhost'
  2021-10-04T18:47:41.019308781Z stderr   2021-10-04T18:47:30.802643152Z stderr INFO [job sleep1] Max memory used: 23MiB
  2021-10-04T18:47:41.019308781Z stderr   2021-10-04T18:47:30.803634510Z stderr WARNING [job sleep1] completed permanentFail
  2021-10-04T18:47:41.019308781Z stderr   2021-10-04T18:47:30.803830502Z stderr WARNING [step sleep1] completed permanentFail
  2021-10-04T18:47:41.019308781Z stderr   2021-10-04T18:47:30.804036901Z stderr INFO [workflow ] completed permanentFail
  2021-10-04T18:47:41.019308781Z stderr   2021-10-04T18:47:30.805157618Z stderr WARNING Final process status is permanentFail
  2021-10-04T18:47:41.019308781Z stderr   2021-10-04T18:47:30.991076640Z crunch-run copying "/cwl.output.json" (3 bytes)
  2021-10-04T18:47:41.019308781Z stderr   2021-10-04T18:47:31.070965676Z crunch-run Complete
  2021-10-04T18:47:41.089593379Z stderr WARNING [step substep] completed permanentFail
  2021-10-04T18:47:41.154552210Z stderr INFO [workflow workflow.json#main] completed permanentFail
  2021-10-04T18:47:41.154601178Z stderr ERROR Overall process status is permanentFail
  2021-10-04T18:47:41.257541997Z stderr INFO Final output collection 237233ea9b49c1486aabfc86fbb7bc74+59 "Output of main (2021-10-04T18:47:41.241Z)" (x2z00-4zz18-w7qx63bg59j5pqx)
  2021-10-04T18:47:41.376510035Z stderr WARNING Final process status is permanentFail
  2021-10-04T18:47:41.641452772Z crunch-run Complete
ERROR Overall process status is permanentFail
WARNING Final process status is permanentFail

#3 Updated by Peter Amstutz 17 days ago

18238-update-cwltool @ e91291c2959c07c9929617e3d080077477a19b33

  • Update cwltool and schema-salad deps
  • Update API server revision
  • Make backwards compatible with 2.2 by checking revision
  • Update tests to use newer arvados/jobs image which fixes the actual integration test problem

https://ci.arvados.org/view/Developer/job/developer-run-tests/2721/

#4 Updated by Peter Amstutz 14 days ago

  • Release set to 42

#5 Updated by Peter Amstutz 14 days ago

  • Status changed from New to In Progress

#6 Updated by Ward Vandewege 8 days ago

Running the 1.2 integration suite against 9tee4 in https://ci.arvados.org/view/Developer/job/developer-run-tests-cwl-suite/11/console, seeing errors like this (I re-ran it with --debug):

root@build-xknq2l:/tmp/workspace/developer-run-tests-cwl-suite/cwl-v1.2# /tmp/workspace/developer-run-tests-cwl-suite/arvados-cwl-runner-with-checksum.sh --debug tests/cat4-tool.cwl tests/cat-job.json
INFO /usr/bin/arvados-cwl-runner 2.3.0.dev20211008201118, arvados-python-client 2.3.0.dev20211006204117, cwltool 3.1.20210922203925
INFO Resolved 'tests/cat4-tool.cwl' to 'file:///tmp/workspace/developer-run-tests-cwl-suite/cwl-v1.2/tests/cat4-tool.cwl'
INFO Using cluster 9tee4 (https://workbench2.9tee4.arvadosapi.com/)
INFO Upload local files: "hello.txt" 
DEBUG {'9tee4-bi6l4-2jdc5yke951odjy': OrderedDict([('href', '/keep_services/9tee4-bi6l4-2jdc5yke951odjy'), ('kind', 'arvados#keepService'), ('etag', 'a1uo5iocrnt9njhvla0cphmeg'), ('uuid', '9tee4-bi6l4-2jdc5yke951odjy'), ('owner_uuid', '9tee4-tpzed-p0bss3u24xdoty1'), ('created_at', '2014-08-29T20:40:58.397282000Z'), ('modified_by_client_uuid', '9tee4-ozdt8-yft7aixdt2dx5ni'), ('modified_by_user_uuid', '9tee4-tpzed-p0bss3u24xdoty1'), ('modified_at', '2014-08-29T20:41:12.868731000Z'), ('service_host', 'keep.9tee4.arvadosapi.com'), ('service_port', 443), ('service_ssl_flag', True), ('service_type', 'proxy'), ('read_only', False), ('_service_root', 'https://keep.9tee4.arvadosapi.com:443/')])}
DEBUG 59ca0efa9f5633cb0371bbc0355478d8+13: ['https://keep.9tee4.arvadosapi.com:443/']
DEBUG Pool max threads is 1
DEBUG Request: PUT https://keep.9tee4.arvadosapi.com:443/59ca0efa9f5633cb0371bbc0355478d8
INFO PUT 200: 13 bytes in 318.6819553375244 msec (0.000 MiB/sec)
DEBUG KeepWriterThread <KeepWriterThread(Thread-1, started daemon 140089930639104)> succeeded 59ca0efa9f5633cb0371bbc0355478d8+13 https://keep.9tee4.arvadosapi.com:443/
INFO Using collection 6acd28dff09b88f5f550950fcb7203a3+53 (9tee4-4zz18-m86okq2yf9jctdv)
INFO ['docker', 'pull', 'docker.io/debian:stable-slim']
stable-slim: Pulling from library/debian
Digest: sha256:6d6db11a446181868d088e4638f1392ea1489ce24a77d352d0d75d8e56307909
Status: Image is up to date for debian:stable-slim
INFO Uploading Docker image docker.io/debian:stable-slim
ERROR no matching image found
ERROR Workflow error:
tests/cat4-tool.cwl:7:5: keepdocker exited with code 1
Traceback (most recent call last):
  File "/usr/share/python3/dist/python3-arvados-cwl-runner/lib/python3.7/site-packages/arvados/commands/keepdocker.py", line 391, in main
    image_hash = find_one_image_hash(args.image, args.tag)
  File "/usr/share/python3/dist/python3-arvados-cwl-runner/lib/python3.7/site-packages/arvados/commands/keepdocker.py", line 170, in find_one_image_hash
    raise DockerError("no matching image found")
arvados.commands.keepdocker.DockerError: no matching image found

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/share/python3/dist/python3-arvados-cwl-runner/lib/python3.7/site-packages/arvados_cwl/arvdocker.py", line 68, in arv_docker_get_image
    arvados.commands.keepdocker.main(args, stdout=sys.stderr, install_sig_handlers=False, api=api_client)
  File "/usr/share/python3/dist/python3-arvados-cwl-runner/lib/python3.7/site-packages/arvados/commands/keepdocker.py", line 394, in main
    sys.exit(1)
SystemExit: 1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/share/python3/dist/python3-arvados-cwl-runner/lib/python3.7/site-packages/arvados_cwl/arvdocker.py", line 72, in arv_docker_get_image
    raise WorkflowException("keepdocker exited with code %s" % e.code)
cwltool.errors.WorkflowException: keepdocker exited with code 1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/share/python3/dist/python3-arvados-cwl-runner/lib/python3.7/site-packages/cwltool/main.py", line 1248, in main
    tool, initialized_job_order_object, runtimeContext, logger=_logger
  File "/usr/share/python3/dist/python3-arvados-cwl-runner/lib/python3.7/site-packages/arvados_cwl/executor.py", line 592, in arv_executor
    merged_map = upload_workflow_deps(self, tool)
  File "/usr/share/python3/dist/python3-arvados-cwl-runner/lib/python3.7/site-packages/arvados_cwl/runner.py", line 579, in upload_workflow_deps
    upload_docker(arvrunner, tool)
  File "/usr/share/python3/dist/python3-arvados-cwl-runner/lib/python3.7/site-packages/arvados_cwl/runner.py", line 463, in upload_docker
    arvrunner.runtimeContext.tmp_outdir_prefix)
  File "/usr/share/python3/dist/python3-arvados-cwl-runner/lib/python3.7/site-packages/arvados_cwl/arvdocker.py", line 84, in arv_docker_get_image
    cached_lookups[dockerRequirement["dockerImageId"]] = pdh
  File "schema_salad/sourceline.py", line 256, in __exit__
cwltool.errors.WorkflowException: tests/cat4-tool.cwl:7:5: keepdocker exited with code 1

Docker is present and running, and the image arv-keepdocker is supposed to be uploading (debian:stable-slim) is there locally:

root@build-xknq2l:/tmp/workspace/developer-run-tests-cwl-suite/cwl-v1.2# docker --version
Docker version 18.09.1, build 4c52b90
root@build-xknq2l:/tmp/workspace/developer-run-tests-cwl-suite/cwl-v1.2# docker images
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
python              3-slim              5d47fc63aa54        35 hours ago        122MB
debian              stable-slim         11ed9f070876        36 hours ago        80.4MB
alpine              latest              14119a10abf4        6 weeks ago         5.6MB

Okay, the problem is that the cwl file specifies:

hints:
  DockerRequirement:
    dockerPull: docker.io/debian:stable-slim

So, keepdocker goes looking for docker.io/debian but the image that is downloaded as docker.io/debian:stable-slim is called debian:stable-slim, so it can not be found.

#7 Updated by Peter Amstutz 8 days ago

  • Target version changed from 2021-10-13 sprint to 2021-10-27 sprint

#8 Updated by Ward Vandewege 8 days ago

  • Blocks Bug #18264: [CI] simplify the way we run the CWL tests added

#10 Updated by Ward Vandewege 8 days ago

Peter Amstutz wrote:

https://ci.arvados.org/view/Developer/job/developer-run-tests/2731/

The wb integration tests failed but that was a spurious failure, I re-ran them and they passed (https://ci.arvados.org/job/developer-run-tests-apps-workbench-integration/2902).

18238-docker-inspect @ dcb534accf55e83d385b210cd2a4a55d6d0c295e

Just a typo in the comment, s/images/image/:

+ # Query for a Docker images with the repository and tag and return

otherwise LGTM thanks, nice and simple!

#11 Updated by Ward Vandewege 6 days ago

Getting closer. One more test failure in the CWL 1.2 suite on 9tee4, #221, see https://ci.arvados.org/view/Developer/job/developer-run-tests-cwl-suite/13/console. I re-ran the failing test with --debug on 9tee4:

ci@shell:~/cwl-v1.2$ arvados-cwl-runner --disable-reuse --compute-checksum --api=containers --outdir=/tmp/tmpwn7jhrb1 --debug tests/timelimit2-wf.cwl tests/empty.json
INFO /usr/bin/arvados-cwl-runner 2.3.0.dev20211013202728, arvados-python-client 2.3.0.dev20211013202728, cwltool 3.1.20210922203925
INFO Resolved 'tests/timelimit2-wf.cwl' to 'file:///home/ci/cwl-v1.2/tests/timelimit2-wf.cwl'
INFO Using cluster 9tee4 (https://workbench2.9tee4.arvadosapi.com/)
INFO Upload local files: "timelimit2-wf.cwl" 
DEBUG {'9tee4-bi6l4-uwgvqzzw4z97qca': OrderedDict([('href', '/keep_services/9tee4-bi6l4-uwgvqzzw4z97qca'), ('kind', 'arvados#keepService'), ('etag', 'c65xiys117q01qz9s7zvjt66i'), ('uuid', '9tee4-bi6l4-uwgvqzzw4z97qca'), ('owner_uuid', '9tee4-tpzed-ruo0j78b8s7jk57'), ('created_at', '2014-08-27T02:19:29.655913000Z'), ('modified_by_client_uuid', '9tee4-ozdt8-yft7aixdt2dx5ni'), ('modified_by_user_uuid', '9tee4-tpzed-ruo0j78b8s7jk57'), ('modified_at', '2014-08-27T02:20:21.044273000Z'), ('service_host', 'keep0.9tee4.arvadosapi.com'), ('service_port', 25107), ('service_ssl_flag', False), ('service_type', 'disk'), ('read_only', False), ('_service_root', 'http://keep0.9tee4.arvadosapi.com:25107/')]), '9tee4-bi6l4-1fum8ggyzc88g87': OrderedDict([('href', '/keep_services/9tee4-bi6l4-1fum8ggyzc88g87'), ('kind', 'arvados#keepService'), ('etag', 'dn6g7rkwx5gpwq0h053wyquax'), ('uuid', '9tee4-bi6l4-1fum8ggyzc88g87'), ('owner_uuid', '9tee4-tpzed-ruo0j78b8s7jk57'), ('created_at', '2014-08-27T02:20:27.578073000Z'), ('modified_by_client_uuid', '9tee4-ozdt8-yft7aixdt2dx5ni'), ('modified_by_user_uuid', '9tee4-tpzed-ruo0j78b8s7jk57'), ('modified_at', '2014-08-27T02:20:43.978890000Z'), ('service_host', 'keep1.9tee4.arvadosapi.com'), ('service_port', 25107), ('service_ssl_flag', False), ('service_type', 'disk'), ('read_only', False), ('_service_root', 'http://keep1.9tee4.arvadosapi.com:25107/')])}
DEBUG b558b3ee7967e6eec67bfeb707073688+803: ['http://keep0.9tee4.arvadosapi.com:25107/', 'http://keep1.9tee4.arvadosapi.com:25107/']
DEBUG Pool max threads is 2
DEBUG Request: PUT http://keep0.9tee4.arvadosapi.com:25107/b558b3ee7967e6eec67bfeb707073688
DEBUG Request: PUT http://keep1.9tee4.arvadosapi.com:25107/b558b3ee7967e6eec67bfeb707073688
INFO PUT 200: 803 bytes in 95.90554237365723 msec (0.008 MiB/sec)
DEBUG KeepWriterThread <KeepWriterThread(Thread-1, started daemon 139994888959744)> succeeded b558b3ee7967e6eec67bfeb707073688+803 http://keep0.9tee4.arvadosapi.com:25107/
INFO PUT 200: 803 bytes in 144.728422164917 msec (0.005 MiB/sec)
DEBUG KeepWriterThread <KeepWriterThread(Thread-2, started daemon 139994880567040)> succeeded b558b3ee7967e6eec67bfeb707073688+803 http://keep1.9tee4.arvadosapi.com:25107/
INFO Uploaded to dda3361138f841bd7cd47eb050192a94+63 (9tee4-4zz18-9piblqacgwgr8p4)
INFO Upload local files: "timelimit2-wf.cwl" 
DEBUG b558b3ee7967e6eec67bfeb707073688+803: ['http://keep0.9tee4.arvadosapi.com:25107/', 'http://keep1.9tee4.arvadosapi.com:25107/']
DEBUG Pool max threads is 2
DEBUG Request: PUT http://keep0.9tee4.arvadosapi.com:25107/b558b3ee7967e6eec67bfeb707073688
DEBUG Request: PUT http://keep1.9tee4.arvadosapi.com:25107/b558b3ee7967e6eec67bfeb707073688
INFO PUT 200: 803 bytes in 91.95637702941895 msec (0.008 MiB/sec)
DEBUG KeepWriterThread <KeepWriterThread(Thread-4, started daemon 139994660337408)> succeeded b558b3ee7967e6eec67bfeb707073688+803 http://keep1.9tee4.arvadosapi.com:25107/
INFO PUT 200: 803 bytes in 94.21706199645996 msec (0.008 MiB/sec)
DEBUG KeepWriterThread <KeepWriterThread(Thread-3, started daemon 139994880567040)> succeeded b558b3ee7967e6eec67bfeb707073688+803 http://keep0.9tee4.arvadosapi.com:25107/
INFO Uploaded to dda3361138f841bd7cd47eb050192a94+63 (9tee4-4zz18-ctpbaru03zespzi)
INFO Using collection cache size 256 MiB
DEBUG ENTER jobiter 1634318432.3216517
DEBUG EXIT jobiter 1634318432.3222976 0.0006458759307861328
DEBUG ENTER run 1634318432.322403
INFO [container timelimit2-wf.cwl] submitted container_request 9tee4-xvhdp-mjhn2lc18mt87jo
INFO Monitor workflow progress at https://workbench2.9tee4.arvadosapi.com/processes/9tee4-xvhdp-mjhn2lc18mt87jo
DEBUG EXIT run 1634318432.5110302 0.1886272430419922
DEBUG ENTER jobiter 1634318432.511198
DEBUG EXIT jobiter 1634318432.5112936 9.560585021972656e-05
INFO [container timelimit2-wf.cwl] 9tee4-xvhdp-mjhn2lc18mt87jo is Final
DEBUG 89a46be783b45a49229e4a9bb336ef13+63085+A5fa305f4b7476b29ae30f15737b77f2d44f763e9@617c2da8: ['http://keep1.9tee4.arvadosapi.com:25107/', 'http://keep0.9tee4.arvadosapi.com:25107/']
DEBUG Request: GET http://keep1.9tee4.arvadosapi.com:25107/89a46be783b45a49229e4a9bb336ef13+63085+A5fa305f4b7476b29ae30f15737b77f2d44f763e9@617c2da8
INFO GET 200: 63085 bytes in 2.2094249725341797 msec (27.230 MiB/sec)
ERROR [container timelimit2-wf.cwl] (9tee4-dz642-82piszqu5g7vzxm) error log:

  2021-10-15T17:21:31.858018933Z stderr DEBUG Request: GET http://keep0.9tee4.arvadosapi.com:25107/022a00c92d6a86b8ea65fe2c1d908ed1+25584+Af28ce79ff95f140a595c3bfd41aa33ddf22ec76f@617c2d9b
  2021-10-15T17:21:31.913908478Z stderr INFO GET 200: 25584 bytes in 55.72247505187988 msec (0.438 MiB/sec)
  2021-10-15T17:21:31.914288812Z stderr ERROR [container step1] (9tee4-xvhdp-oiikgtbqcgfihl0) error log:
  2021-10-15T17:21:31.914288812Z stderr
  2021-10-15T17:21:31.914288812Z stderr   2021-10-15T17:21:20.514800333Z crunch-run Not starting a gateway server (GatewayAuthSecret was not provided by dispatcher)
  2021-10-15T17:21:31.914288812Z stderr   2021-10-15T17:21:20.514997142Z crunch-run crunch-run 2.3.0~dev20211008165008 (go1.17.1) started
  2021-10-15T17:21:31.914288812Z stderr   2021-10-15T17:21:20.515020338Z crunch-run Executing container '9tee4-dz642-mtc7a3m5qokgx4a' using singularity runtime
  2021-10-15T17:21:31.914288812Z stderr   2021-10-15T17:21:20.515058644Z crunch-run Executing on host 'compute0.9tee4.arvadosapi.com'
  2021-10-15T17:21:31.914288812Z stderr   2021-10-15T17:21:20.622030412Z crunch-run container token "v2/9tee4-gj3su-eg1hp3dziyx5f5o/5in4ijb92sd5ffs25r2w3rc8932v78yiqd5ugit9o3ztribdxd/9tee4-dz642-mtc7a3m5qokgx4a" 
  2021-10-15T17:21:31.914288812Z stderr   2021-10-15T17:21:20.622260592Z crunch-run Running [arv-mount --foreground --allow-other --read-write --storage-classes default --crunchstat-interval=10 --file-cache 268435456 --mount-by-pdh by_id --mount-by-id by_uuid /tmp/crunch-run.9tee4-dz642-mtc7a3m5qokgx4a.2309731760/keep2021734447]
  2021-10-15T17:21:31.914288812Z stderr   2021-10-15T17:21:21.527915426Z crunch-run Fetching Docker image from collection 'f11b94d02052ac9552d6fe8c125eff8c+261'
  2021-10-15T17:21:31.914288812Z stderr   2021-10-15T17:21:21.584044926Z crunch-run Using Docker image id "sha256:8058c81ad6e572ac29ca7f5bf86eff40a636b1eb710567e6329132d08d1fae09" 
  2021-10-15T17:21:31.914288812Z stderr   2021-10-15T17:21:21.584102784Z crunch-run Loading Docker image from keep
  2021-10-15T17:21:31.914288812Z stderr   2021-10-15T17:21:22.025504168Z crunch-run Starting container
  2021-10-15T17:21:31.914288812Z stderr   2021-10-15T17:21:22.026570397Z crunch-run Waiting for container to finish
  2021-10-15T17:21:31.914288812Z stderr   2021-10-15T17:21:27.026871952Z crunch-run maximum run time exceeded. Stopping container.
  2021-10-15T17:21:31.914288812Z stderr   2021-10-15T17:21:27.026943332Z crunch-run stopping container
  2021-10-15T17:21:31.914288812Z stderr   2021-10-15T17:21:27.138086025Z crunch-run Cancelled
  2021-10-15T17:21:31.993604560Z stderr ERROR [step step1] Output is missing expected field file:///var/lib/cwl/workflow.json#main/step1/o
  2021-10-15T17:21:32.068456398Z stderr DEBUG [step step1] produced output {}
  2021-10-15T17:21:32.068483057Z stderr WARNING [step step1] completed permanentFail
  2021-10-15T17:21:32.146910738Z stderr DEBUG ENTER jobiter 1634318492.146714
  2021-10-15T17:21:32.147021481Z stderr DEBUG [workflow workflow.json#main] job step file:///var/lib/cwl/workflow.json#main/step2 not ready
  2021-10-15T17:21:32.147185398Z stderr INFO [workflow workflow.json#main] completed permanentFail
  2021-10-15T17:21:32.147304175Z stderr DEBUG [workflow workflow.json#main] outputs {
  2021-10-15T17:21:32.147304175Z stderr     "o": null
  2021-10-15T17:21:32.147304175Z stderr }
  2021-10-15T17:21:32.147354213Z stderr ERROR Overall process status is permanentFail
  2021-10-15T17:21:32.216245921Z stderr DEBUG EXIT jobiter 1634318492.216103 0.06938910484313965
  2021-10-15T17:21:32.217114953Z stderr DEBUG e45a9756aa5ca5ceb9b58f8f31794ccb+17: ['http://keep1.9tee4.arvadosapi.com:25107/', 'http://keep0.9tee4.arvadosapi.com:25107/']
  2021-10-15T17:21:32.217177681Z stderr DEBUG Pool max threads is 2
  2021-10-15T17:21:32.217635607Z stderr DEBUG Request: PUT http://keep1.9tee4.arvadosapi.com:25107/e45a9756aa5ca5ceb9b58f8f31794ccb
  2021-10-15T17:21:32.218043661Z stderr DEBUG Request: PUT http://keep0.9tee4.arvadosapi.com:25107/e45a9756aa5ca5ceb9b58f8f31794ccb
  2021-10-15T17:21:32.219839325Z stderr INFO PUT 200: 17 bytes in 1.9564628601074219 msec (0.008 MiB/sec)
  2021-10-15T17:21:32.219912446Z stderr DEBUG KeepWriterThread <KeepWriterThread(Thread-6, started daemon 22902328260352)> succeeded e45a9756aa5ca5ceb9b58f8f31794ccb+17 http://keep1.9tee4.arvadosapi.com:25107/
  2021-10-15T17:21:32.238384282Z stderr INFO PUT 200: 17 bytes in 20.125865936279297 msec (0.001 MiB/sec)
  2021-10-15T17:21:32.238440994Z stderr DEBUG KeepWriterThread <KeepWriterThread(Thread-7, started daemon 22902326159104)> succeeded e45a9756aa5ca5ceb9b58f8f31794ccb+17 http://keep0.9tee4.arvadosapi.com:25107/
  2021-10-15T17:21:32.281973311Z stderr INFO Final output collection 42866e7c2f47b2fd38dca68903e59c4b+59 "Output of main (2021-10-15T17:21:32.263Z)" (9tee4-4zz18-cea4i07ad1kuzuh)
  2021-10-15T17:21:32.445167570Z stderr WARNING Final process status is permanentFail
  2021-10-15T17:21:32.851152213Z crunch-run Complete
DEBUG e45a9756aa5ca5ceb9b58f8f31794ccb+17+A62ebe73d279fa40966c00382108aba16cf016aa3@617c2da8: ['http://keep1.9tee4.arvadosapi.com:25107/', 'http://keep0.9tee4.arvadosapi.com:25107/']
DEBUG Request: GET http://keep1.9tee4.arvadosapi.com:25107/e45a9756aa5ca5ceb9b58f8f31794ccb+17+A62ebe73d279fa40966c00382108aba16cf016aa3@617c2da8
INFO GET 200: 17 bytes in 1.2454986572265625 msec (0.013 MiB/sec)
ERROR Overall process status is permanentFail
INFO Final output collection 42866e7c2f47b2fd38dca68903e59c4b+59
INFO Output at https://workbench2.9tee4.arvadosapi.com/collections/42866e7c2f47b2fd38dca68903e59c4b+59
{
    "o": null
}
WARNING Final process status is permanentFail

Seeing the same failure on tordo (singularity + a-d-c), cf. https://ci.arvados.org/view/Developer/job/developer-run-tests-cwl-suite/15/console.

Not seeing the failure on ce8i5 (docker + a-d-c), cf. https://ci.arvados.org/view/Developer/job/developer-run-tests-cwl-suite/16/console

#12 Updated by Peter Amstutz about 9 hours ago

  • Status changed from In Progress to Resolved

Also available in: Atom PDF