Bug #7481

closed

Docker Daemon failure or FUSE problem

Added by Bryan Cosca over 8 years ago. Updated over 8 years ago.

Status: Duplicate
Priority: Normal
Assigned To: -
Category: -
Target version: -
Story points: -

Description

https://workbench.tb05z.arvadosapi.com/collections/80c6a5e6a158508bc58969e93d5348e5+87/tb05z-8i9sb-2vmkv1gm5jvbw8a.log.txt

2015-10-07_21:18:17 tb05z-8i9sb-2vmkv1gm5jvbw8a 9201 0 stderr run-command: caught exception
2015-10-07_21:18:17 tb05z-8i9sb-2vmkv1gm5jvbw8a 9201 0 stderr Traceback (most recent call last):
2015-10-07_21:18:17 tb05z-8i9sb-2vmkv1gm5jvbw8a 9201 0 stderr File "/tmp/crunch-job/src/crunch_scripts/run-command", line 393, in <module>
2015-10-07_21:18:17 tb05z-8i9sb-2vmkv1gm5jvbw8a 9201 0 stderr (pid, status) = os.wait()
2015-10-07_21:18:17 tb05z-8i9sb-2vmkv1gm5jvbw8a 9201 0 stderr OSError: [Errno 4] Interrupted system call
2015-10-07_21:18:17 tb05z-8i9sb-2vmkv1gm5jvbw8a 9201 0 stderr run-command: the following output files will be saved to keep:
2015-10-07_21:18:17 tb05z-8i9sb-2vmkv1gm5jvbw8a 9201 0 stderr run-command: 11411 ./scatter.intervals
2015-10-07_21:18:17 tb05z-8i9sb-2vmkv1gm5jvbw8a 9201 0 stderr run-command: 0 ./.scatter.intervals.done
2015-10-07_21:18:17 tb05z-8i9sb-2vmkv1gm5jvbw8a 9201 0 stderr run-command: 0 ./24385-200_AH5G7WCCXX_S4_L004_R1_001_markdup.target.intervals.list
2015-10-07_21:18:17 tb05z-8i9sb-2vmkv1gm5jvbw8a 9201 0 stderr run-command: start writing output to keep
2015-10-07_21:18:19 tb05z-8i9sb-2vmkv1gm5jvbw8a 9201 0 stderr time="2015-10-07T21:18:19Z" level=fatal msg="Post http:///var/run/docker.sock/v1.18/containers/975e98ca99280f647ab2cda7b45eddbad95d5e3dceb3916a0f0d8bc2d4067c4a/wait: dial unix /var/run/docker.sock: no such file or directory. Are you trying to connect to a TLS-enabled daemon without TLS?"
2015-10-07_21:18:20 tb05z-8i9sb-2vmkv1gm5jvbw8a 9201 0 stderr 2015-10-07 21:18:20 arvados.arvados_fuse20316 ERROR: Unhandled exception during FUSE operation
2015-10-07_21:18:20 tb05z-8i9sb-2vmkv1gm5jvbw8a 9201 0 stderr Traceback (most recent call last):
2015-10-07_21:18:20 tb05z-8i9sb-2vmkv1gm5jvbw8a 9201 0 stderr File "/usr/local/lib/python2.7/dist-packages/arvados_fuse/__init__.py", line 276, in catch_exceptions_wrapper
2015-10-07_21:18:20 tb05z-8i9sb-2vmkv1gm5jvbw8a 9201 0 stderr return orig_func(self, *args, **kwargs)
2015-10-07_21:18:20 tb05z-8i9sb-2vmkv1gm5jvbw8a 9201 0 stderr File "/usr/local/lib/python2.7/dist-packages/arvados_fuse/__init__.py", line 461, in forget
2015-10-07_21:18:20 tb05z-8i9sb-2vmkv1gm5jvbw8a 9201 0 stderr ent = self.inodes[inode]
2015-10-07_21:18:20 tb05z-8i9sb-2vmkv1gm5jvbw8a 9201 0 stderr File "/usr/local/lib/python2.7/dist-packages/arvados_fuse/__init__.py", line 214, in getitem
2015-10-07_21:18:20 tb05z-8i9sb-2vmkv1gm5jvbw8a 9201 0 stderr return self._entries[item]
2015-10-07_21:18:20 tb05z-8i9sb-2vmkv1gm5jvbw8a 9201 0 stderr KeyError: 47L
2015-10-07_21:18:20 tb05z-8i9sb-2vmkv1gm5jvbw8a 9201 0 stderr srun: error: compute1: task 0: Terminated
2015-10-07_21:18:20 tb05z-8i9sb-2vmkv1gm5jvbw8a 9201 0 stderr srun: Force Terminated job step 215.5
2015-10-07_21:18:21 tb05z-8i9sb-2vmkv1gm5jvbw8a 9201 0 child 9378 on compute1.1 exit 15 success=false
2015-10-07_21:18:21 tb05z-8i9sb-2vmkv1gm5jvbw8a 9201 0 failure (#1, permanent) after 126 seconds
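
Note on the first traceback: `OSError: [Errno 4] Interrupted system call` from `os.wait()` is what Python 2.7 raises when a signal arrives while the call is blocked. The sketch below shows one common way to tolerate that, by retrying the call on EINTR. It is only an illustration: `wait_retrying_eintr` is a hypothetical helper, not code from run-command, and it would not address the Docker socket disappearing, which looks like the actual cause of the failure here. On Python 3.5 and later the interpreter retries interrupted system calls itself (PEP 475), so the loop mainly matters for Python 2.

```python
import errno
import os


def wait_retrying_eintr(wait=os.wait):
    """Call wait() and retry for as long as it fails with EINTR.

    Hypothetical helper for illustration. `wait` defaults to os.wait and is
    a parameter only so the retry logic can be exercised without children.
    """
    while True:
        try:
            # os.wait() returns a (pid, status) tuple for a finished child.
            return wait()
        except OSError as e:
            # EINTR means a signal interrupted the call; anything else
            # (for example ECHILD, no children left) is a real error.
            if e.errno != errno.EINTR:
                raise
```

In a script like the one in the traceback, the call site would become `(pid, status) = wait_retrying_eintr()`. Whether retrying is the right behavior depends on why the signal was sent; if the signal is a request to shut down, the handler should record that and the caller should stop waiting instead.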


Related issues

Related to Arvados - Bug #5956: [Deployment] Docker configuration changes restart Docker on compute nodes, interrupting running jobs (Status: Duplicate, 05/07/2015)
