Actions
Bug #11068
closed[Arvados-CWL-runner] need better error message when there are issues getting the docker image
Story points:
-
Description
The 9tee4 diagnostics job (e.g. 9tee4-xvhdp-yuv7hvb80fbb3lk) failed with this message:
2017-02-07T16:40:16.675203241Z Executing container '9tee4-dz642-g8msiiezegv6dye' 2017-02-07T16:40:16.675306842Z Executing on host 'compute0.9tee4.arvadosapi.com' 2017-02-07T16:40:16.701872067Z Fetching Docker image from collection 'bb5e399246e3ce463f9dca24cbe5457a+335' 2017-02-07T16:40:16.732104491Z Using Docker image id 'b273175cddfb2672685b7701653490c100884c64e65a3444db9a0c70e5d02f83' 2017-02-07T16:40:16.735103990Z Docker image is available 2017-02-07T16:40:17.093587136Z Creating Docker container 2017-02-07T16:40:17.795483495Z Attaching container streams 2017-02-07T16:40:17.870947173Z Starting Docker container id 'fde1821f59d63968fa3a27d63313d826aace767d1b1adc5d43045d94535f8375' 2017-02-07T16:40:18.134036546Z Waiting for container to finish 2017-02-07T16:40:18.097797521Z notice: reading stats from /sys/fs/cgroup/cpuacct/docker/fde1821f59d63968fa3a27d63313d826aace767d1b1adc5d43045d94535f8375/cgroup.procs 2017-02-07T16:40:18.097886671Z notice: reading stats from /sys/fs/cgroup/memory/docker/fde1821f59d63968fa3a27d63313d826aace767d1b1adc5d43045d94535f8375/memory.stat 2017-02-07T16:40:18.098155788Z mem 0 cache 0 swap 0 pgmajfault 876544 rss 2017-02-07T16:40:18.098202650Z notice: reading stats from /sys/fs/cgroup/cpuacct/docker/fde1821f59d63968fa3a27d63313d826aace767d1b1adc5d43045d94535f8375/cpuacct.stat 2017-02-07T16:40:18.098299730Z notice: reading stats from /sys/fs/cgroup/cpuset/docker/fde1821f59d63968fa3a27d63313d826aace767d1b1adc5d43045d94535f8375/cpuset.cpus 2017-02-07T16:40:18.098341572Z cpu 0.0100 user 0.0000 sys 20 cpus 2017-02-07T16:40:18.098373191Z notice: reading stats from /sys/fs/cgroup/blkio/docker/fde1821f59d63968fa3a27d63313d826aace767d1b1adc5d43045d94535f8375/blkio.io_service_bytes 2017-02-07T16:40:18.098549373Z net:eth0 0 tx 0 rx 2017-02-07T16:40:20.001184707Z 2017-02-07 16:40:20 cwltool INFO: /usr/bin/arvados-cwl-runner aaa45b09de0e9437743fce53d7c0bf8165074b5e 1.0.20170203223449, arvados-python-client 0.1.20170203223449, cwltool 1.0.20170119234115 2017-02-07T16:40:20.001544310Z 2017-02-07 16:40:20 cwltool INFO: Resolved '/var/lib/cwl/workflow.json#main' to 'file:///var/lib/cwl/workflow.json#main' 2017-02-07T16:40:20.292931320Z 2017-02-07 16:40:20 cwltool ERROR: Workflow error, try again with --debug for more information: 2017-02-07T16:40:20.292931320Z ../../lib/cwl/workflow.json:1:264: [Errno 2] No such file or directory 2017-02-07T16:40:21.890447683Z Complete
This wasn't exactly clear.
We re-ran the container request with --debug (as 9tee4-xvhdp-bfykiehawkcduh9), which showed the error path:
2017-02-07T17:48:39.192771836Z Executing container '9tee4-dz642-prcd58gh6c69s2c' 2017-02-07T17:48:39.192906547Z Executing on host 'compute0.9tee4.arvadosapi.com' 2017-02-07T17:48:39.219819476Z Fetching Docker image from collection 'bb5e399246e3ce463f9dca24cbe5457a+335' 2017-02-07T17:48:39.257291374Z Using Docker image id 'b273175cddfb2672685b7701653490c100884c64e65a3444db9a0c70e5d02f83' 2017-02-07T17:48:39.260113555Z Docker image is available 2017-02-07T17:48:39.610759818Z Creating Docker container 2017-02-07T17:48:40.411530002Z Attaching container streams 2017-02-07T17:48:40.484833355Z Starting Docker container id '3f672e5be9356276d596d07123752d2376aa6ac495c4e936b6dc7c2042e56742' 2017-02-07T17:48:40.737958719Z Waiting for container to finish 2017-02-07T17:48:40.713845501Z notice: reading stats from /sys/fs/cgroup/cpuacct/docker/3f672e5be9356276d596d07123752d2376aa6ac495c4e936b6dc7c2042e56742/cgroup.procs 2017-02-07T17:48:40.713913821Z notice: reading stats from /sys/fs/cgroup/memory/docker/3f672e5be9356276d596d07123752d2376aa6ac495c4e936b6dc7c2042e56742/memory.stat 2017-02-07T17:48:40.714180103Z mem 0 cache 0 swap 0 pgmajfault 3338240 rss 2017-02-07T17:48:40.714226835Z notice: reading stats from /sys/fs/cgroup/cpuacct/docker/3f672e5be9356276d596d07123752d2376aa6ac495c4e936b6dc7c2042e56742/cpuacct.stat 2017-02-07T17:48:40.714315071Z notice: reading stats from /sys/fs/cgroup/cpuset/docker/3f672e5be9356276d596d07123752d2376aa6ac495c4e936b6dc7c2042e56742/cpuset.cpus 2017-02-07T17:48:40.714354801Z cpu 0.0100 user 0.0100 sys 20 cpus 2017-02-07T17:48:40.714381461Z notice: reading stats from /sys/fs/cgroup/blkio/docker/3f672e5be9356276d596d07123752d2376aa6ac495c4e936b6dc7c2042e56742/blkio.io_service_bytes 2017-02-07T17:48:40.714531430Z net:eth0 90 tx 90 rx 2017-02-07T17:48:42.677454263Z 2017-02-07 17:48:42 cwltool INFO: /usr/bin/arvados-cwl-runner aaa45b09de0e9437743fce53d7c0bf8165074b5e 1.0.20170203223449, arvados-python-client 0.1.20170203223449, cwltool 1.0.20170119234115 2017-02-07T17:48:42.677946190Z 2017-02-07 17:48:42 cwltool INFO: Resolved '/var/lib/cwl/workflow.json#main' to 'file:///var/lib/cwl/workflow.json#main' 2017-02-07T17:48:42.975179073Z 2017-02-07 17:48:42 cwltool ERROR: Workflow error, try again with --debug for more information: 2017-02-07T17:48:42.975179073Z ../../lib/cwl/workflow.json:1:264: [Errno 2] No such file or directory 2017-02-07T17:48:42.975179073Z Traceback (most recent call last): 2017-02-07T17:48:42.975179073Z File "/usr/lib/python2.7/dist-packages/cwltool/main.py", line 717, in main 2017-02-07T17:48:42.975179073Z **vars(args)) 2017-02-07T17:48:42.975179073Z File "/usr/lib/python2.7/dist-packages/arvados_cwl/__init__.py", line 347, in arv_executor 2017-02-07T17:48:42.975179073Z upload_workflow_deps(self, tool) 2017-02-07T17:48:42.975179073Z File "/usr/lib/python2.7/dist-packages/arvados_cwl/runner.py", line 197, in upload_workflow_deps 2017-02-07T17:48:42.975179073Z upload_docker(arvrunner, tool) 2017-02-07T17:48:42.975179073Z File "/usr/lib/python2.7/dist-packages/arvados_cwl/runner.py", line 135, in upload_docker 2017-02-07T17:48:42.975179073Z arv_docker_get_image(arvrunner.api, docker_req, True, arvrunner.project_uuid) 2017-02-07T17:48:42.975179073Z File "/usr/lib/python2.7/dist-packages/arvados_cwl/arvdocker.py", line 71, in arv_docker_get_image 2017-02-07T17:48:42.975179073Z return pdh 2017-02-07T17:48:42.975179073Z File "/usr/lib/python2.7/dist-packages/schema_salad/sourceline.py", line 146, in __exit__ 2017-02-07T17:48:42.975179073Z raise self.makeError(unicode(exc_value)) 2017-02-07T17:48:42.975179073Z WorkflowException: ../../lib/cwl/workflow.json:1:264: [Errno 2] No such file or directory 2017-02-07T17:48:44.525795385Z Complete
I will fix the permissions problem, but the error message should be improved so re-running with --debug is not necessary.
Updated by Tom Morris about 7 years ago
- Target version set to Arvados Future Sprints
Updated by Tom Morris about 7 years ago
- Target version changed from Arvados Future Sprints to 2017-09-13 Sprint
Updated by Peter Amstutz about 7 years ago
- Target version changed from 2017-09-13 Sprint to 2017-09-27 Sprint
Updated by Peter Amstutz about 7 years ago
Improved error handling/reporting: 11068-cwl-missing-docker
Updated by Lucas Di Pentima about 7 years ago
This LGTM.
I think it would be nice to ask the cwltool project to enhance its error reporting when executing external commands on cwltool.docker.get_image()
to avoid this kind of issues in the future.
Updated by Peter Amstutz about 7 years ago
- Status changed from New to Resolved
- % Done changed from 0 to 100
Applied in changeset arvados|commit:58c6f3aa42f4f30fc4a764ca56ab1a198754b69b.
Actions