Bug #4769

compute29 broken?

Added by Bryan Cosca almost 6 years ago. Updated almost 6 years ago.

Status: Resolved
Priority: Normal
Assigned To: -
Category: -
Target version:
Start date: 12/10/2014
Due date:
% Done: 0%
Estimated time:
Story points: 0.5

Description

pipeline_instances/qr1hi-d1hrv-h5yigp3nsdlnfs2

2014-12-10_15:56:51 qr1hi-8i9sb-tu9dk90ixn8t589 32635 check slurm allocation
2014-12-10_15:56:51 qr1hi-8i9sb-tu9dk90ixn8t589 32635 node compute29 - 8 slots
2014-12-10_15:56:51 qr1hi-8i9sb-tu9dk90ixn8t589 32635 start
2014-12-10_15:56:52 qr1hi-8i9sb-tu9dk90ixn8t589 32635 Clean work dirs
2014-12-10_15:56:53 qr1hi-8i9sb-tu9dk90ixn8t589 32635 Cleanup command exited 0
2014-12-10_15:56:53 qr1hi-8i9sb-tu9dk90ixn8t589 32635 Looking for version ac21f0d45a76294aaca0c0c0fdf06eb72d03368d from repository arvados
2014-12-10_15:56:53 qr1hi-8i9sb-tu9dk90ixn8t589 32635 Using local repository '/var/lib/arvados/internal.git'
2014-12-10_15:56:53 qr1hi-8i9sb-tu9dk90ixn8t589 32635 Version ac21f0d45a76294aaca0c0c0fdf06eb72d03368d is commit ac21f0d45a76294aaca0c0c0fdf06eb72d03368d
2014-12-10_15:56:53 qr1hi-8i9sb-tu9dk90ixn8t589 32635 Run install script on all workers
2014-12-10_15:56:53 qr1hi-8i9sb-tu9dk90ixn8t589 32635 Install script exited 0
2014-12-10_15:56:53 runtime/cgo: pthread_create failed: Resource temporarily unavailable
2014-12-10_15:56:53 SIGABRT: abort
2014-12-10_15:56:53 PC=0x9996c9
2014-12-10_15:56:53
2014-12-10_15:56:53 goroutine 0 [idle]:
2014-12-10_15:56:53
2014-12-10_15:56:53 goroutine 16 [running]:
2014-12-10_15:56:53 runtime.asmcgocall(0x407060, 0x7f0cd1a05f20)
2014-12-10_15:56:53 /usr/local/go/src/pkg/runtime/asm_amd64.s:692 +0x3a fp=0x7f0cd1a05f10 sp=0x7f0cd1a05f08
2014-12-10_15:56:53 newm(0x4205d0, 0x0)
2014-12-10_15:56:53 /usr/local/go/src/pkg/runtime/proc.c:930 +0xad fp=0x7f0cd1a05f50 sp=0x7f0cd1a05f10
2014-12-10_15:56:53 runtime.main()
2014-12-10_15:56:53 /usr/local/go/src/pkg/runtime/proc.c:219 +0x3c fp=0x7f0cd1a05fa8 sp=0x7f0cd1a05f50
2014-12-10_15:56:53 runtime.goexit()
2014-12-10_15:56:53 /usr/local/go/src/pkg/runtime/proc.c:1445 fp=0x7f0cd1a05fb0 sp=0x7f0cd1a05fa8
2014-12-10_15:56:53 created by rt0_go
2014-12-10_15:56:53 /usr/local/go/src/pkg/runtime/asm_amd64.s:97 +0x120
2014-12-10_15:56:53
2014-12-10_15:56:53 rax 0x0
2014-12-10_15:56:53 rbx 0xb
2014-12-10_15:56:53 rcx 0xffffffffffffffff
2014-12-10_15:56:53 rdx 0x6
2014-12-10_15:56:53 rdi 0x62c1
2014-12-10_15:56:53 rsi 0x62c1
2014-12-10_15:56:53 rbp 0x7fff62be6f00
2014-12-10_15:56:53 rsp 0x7fff62be6d38
2014-12-10_15:56:53 r8 0x22c7880
2014-12-10_15:56:53 r9 0x616e7520796c6972
2014-12-10_15:56:53 r10 0x8
2014-12-10_15:56:53 r11 0x202
2014-12-10_15:56:53 r12 0x22c9e00
2014-12-10_15:56:53 r13 0x941530
2014-12-10_15:56:53 r14 0x9415c0
2014-12-10_15:56:53 r15 0x0
2014-12-10_15:56:53 rip 0x9996c9
2014-12-10_15:56:53 rflags 0x202
2014-12-10_15:56:53 cs 0x33
2014-12-10_15:56:53 fs 0x0
2014-12-10_15:56:53 gs 0x0
2014-12-10_15:56:53 runtime/cgo: pthread_create failed: Resource temporarily unavailable
2014-12-10_15:56:53 SIGABRT: abort
2014-12-10_15:56:53 PC=0x9996c9
2014-12-10_15:56:53
2014-12-10_15:56:53 goroutine 0 [idle]:
2014-12-10_15:56:53
2014-12-10_15:56:53 goroutine 16 [running]:
2014-12-10_15:56:53 runtime.asmcgocall(0x407060, 0x7f9aae22ff20)
2014-12-10_15:56:53 /usr/local/go/src/pkg/runtime/asm_amd64.s:692 +0x3a fp=0x7f9aae22ff10 sp=0x7f9aae22ff08
2014-12-10_15:56:53 newm(0x4205d0, 0x0)
2014-12-10_15:56:53 /usr/local/go/src/pkg/runtime/proc.c:930 +0xad fp=0x7f9aae22ff50 sp=0x7f9aae22ff10
2014-12-10_15:56:53 runtime.main()
2014-12-10_15:56:53 /usr/local/go/src/pkg/runtime/proc.c:219 +0x3c fp=0x7f9aae22ffa8 sp=0x7f9aae22ff50
2014-12-10_15:56:53 runtime.goexit()
2014-12-10_15:56:53 /usr/local/go/src/pkg/runtime/proc.c:1445 fp=0x7f9aae22ffb0 sp=0x7f9aae22ffa8
2014-12-10_15:56:53 created by _rt0_go
2014-12-10_15:56:53 /usr/local/go/src/pkg/runtime/asm_amd64.s:97 +0x120
2014-12-10_15:56:53
2014-12-10_15:56:53 rax 0x0
2014-12-10_15:56:53 rbx 0xb
2014-12-10_15:56:53 rcx 0xffffffffffffffff
2014-12-10_15:56:53 rdx 0x6
2014-12-10_15:56:53 rdi 0x62c4
2014-12-10_15:56:53 rsi 0x62c4
2014-12-10_15:56:53 rbp 0x7fff4cd1cfe0
2014-12-10_15:56:53 rsp 0x7fff4cd1ce18
2014-12-10_15:56:53 r8 0x2555880
2014-12-10_15:56:53 r9 0x616e7520796c6972
2014-12-10_15:56:53 r10 0x8
2014-12-10_15:56:53 r11 0x202
2014-12-10_15:56:53 r12 0x2557df0
2014-12-10_15:56:53 r13 0x941530
2014-12-10_15:56:53 r14 0x9415c0
2014-12-10_15:56:53 r15 0x0
2014-12-10_15:56:53 rip 0x9996c9
2014-12-10_15:56:53 rflags 0x202
2014-12-10_15:56:53 cs 0x33
2014-12-10_15:56:53 fs 0x0
2014-12-10_15:56:53 gs 0x0
2014-12-10_15:57:05 Traceback (most recent call last):
2014-12-10_15:57:05 File "/usr/local/bin/arv-get", line 200, in <module>
2014-12-10_15:57:05 srun: error: compute29: task 0: Exited with exit code 2
2014-12-10_15:57:05 outfile.write(data)
2014-12-10_15:57:05 IOError: [Errno 32] Broken pipe
2014-12-10_15:57:05 qr1hi-8i9sb-tu9dk90ixn8t589 32635 Installing Docker image from 6b6c4eacc10099ec34b81f665f064cc9+4872 exited 2 at /usr/local/arvados/src/sdk/cli/bin/crunch-job line 603
2014-12-10_15:57:05 qr1hi-8i9sb-tu9dk90ixn8t589 32635 Freeze not implemented
2014-12-10_15:57:05 qr1hi-8i9sb-tu9dk90ixn8t589 32635 collate
2014-12-10_15:57:05 Traceback (most recent call last):
2014-12-10_15:57:05 File "/usr/local/bin/arv-put", line 4, in <module>
2014-12-10_15:57:05 main()
2014-12-10_15:57:05 File "/usr/local/lib/python2.7/dist-packages/arvados/commands/put.py", line 451, in main
2014-12-10_15:57:05 writer.finish_current_stream()
2014-12-10_15:57:05 File "/usr/local/lib/python2.7/dist-packages/arvados/collection.py", line 534, in finish_current_stream
2014-12-10_15:57:05 self.flush_data()
2014-12-10_15:57:05 File "/usr/local/lib/python2.7/dist-packages/arvados/commands/put.py", line 299, in flush_data
2014-12-10_15:57:05 super(ArvPutCollectionWriter, self).flush_data()
2014-12-10_15:57:05 File "/usr/local/lib/python2.7/dist-packages/arvados/collection.py", line 480, in flush_data
2014-12-10_15:57:05 self._my_keep().put(data_buffer[0:self.KEEP_BLOCK_SIZE]))
2014-12-10_15:57:05 File "/usr/local/lib/python2.7/dist-packages/arvados/retry.py", line 157, in num_retries_setter
2014-12-10_15:57:05 return orig_func(self, *args, **kwargs)
2014-12-10_15:57:05 File "/usr/local/lib/python2.7/dist-packages/arvados/keep.py", line 709, in put
2014-12-10_15:57:05 t.start()
2014-12-10_15:57:05 File "/usr/lib/python2.7/threading.py", line 495, in start
2014-12-10_15:57:05 _start_new_thread(self._bootstrap, ())
2014-12-10_15:57:05 thread.error: can't start new thread
2014-12-10_15:57:05 qr1hi-8i9sb-tu9dk90ixn8t589 32635 log_writer_finish: arv-put exited 1
2014-12-10_15:57:05 qr1hi-8i9sb-tu9dk90ixn8t589 32635 log manifest is
2014-12-10_15:57:05 Died at /usr/local/arvados/src/sdk/cli/bin/crunch-job line 1464, <DATA> line 1.
2014-12-10_15:57:05 salloc: Relinquishing job allocation 9048
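
The log above shows the same underlying symptom twice: the Go runtime aborts with "pthread_create failed", and arv-put later fails with "can't start new thread". A minimal sketch of how such a log could be scanned for these two signatures follows. The function name `count_thread_failures` and the signature labels are hypothetical and chosen only for this illustration; the matched strings are copied from the log.

```python
import re

# Failure strings copied from the log above. The dictionary keys are
# hypothetical labels used only in this sketch.
SIGNATURES = {
    "go_pthread_create": re.compile(r"runtime/cgo: pthread_create failed"),
    "python_thread_start": re.compile(r"thread\.error: can't start new thread"),
}


def count_thread_failures(log_text):
    """Count log lines matching each thread-creation failure signature."""
    counts = {name: 0 for name in SIGNATURES}
    for line in log_text.splitlines():
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


# Three lines taken from the log: two failures and one unrelated line.
sample = "\n".join([
    "2014-12-10_15:56:53 runtime/cgo: pthread_create failed: Resource temporarily unavailable",
    "2014-12-10_15:57:05 thread.error: can't start new thread",
    "2014-12-10_15:57:05 salloc: Relinquishing job allocation 9048",
])

if __name__ == "__main__":
    print(count_thread_failures(sample))
```

When both counters are non-zero for a single job, the failure is more likely a node-level or process-level resource limit than a bug in either tool.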

History

#1 Updated by Ward Vandewege almost 6 years ago

  • Status changed from New to Resolved
  • Target version set to 2014-12-10 sprint

This was caused by crunch-dispatch not liking a memory limit. I reverted the limit for now.

#2 Updated by Ward Vandewege almost 6 years ago

  • Story points set to 0.5
