Bug #7225
Updated by Brett Smith over 9 years ago
qr1hi-8i9sb-ozbzy8do1i217gq is a pretty typical BWA aligner job. The aligner apparently ran fine, but then run-command got stuck afterward while uploading the data. These are the last interesting lines that appear in the log:

<pre>
2015-09-07_10:06:18 qr1hi-8i9sb-ozbzy8do1i217gq 21483 0 stderr run-command: /keep/39c6f22d40001074f4200a72559ae7eb+5745/bwa completed with exit code 0 (success)
2015-09-07_10:06:18 qr1hi-8i9sb-ozbzy8do1i217gq 21483 0 stderr run-command: the following output files will be saved to keep:
2015-09-07_10:06:18 qr1hi-8i9sb-ozbzy8do1i217gq 21483 0 stderr run-command: 1455988972 ./[filename].sai
2015-09-07_10:06:18 qr1hi-8i9sb-ozbzy8do1i217gq 21483 0 stderr run-command: start writing output to keep
</pre>

After that, run-command was never heard from again. When I checked on the compute node, the run-command process was still alive, but not doing anything. strace reported it was stuck in a futex call.

I'm tentatively filing this as an SDK bug, figuring the issue is probably in PySDK collection or Keep code.
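A futex wait only says that some thread is blocked on a lock or condition. It does not say which Python code is holding things up. One possible way to narrow that down the next time this happens is to have the process dump the Python stack of every thread on demand. The sketch below is an assumption-laden illustration, not something run-command does today. It assumes a POSIX host and a Python 3 interpreter, where `faulthandler` is in the standard library. The names `install_stack_dump_handler`, `dump_stacks_to_text`, `demo`, and `stuck_worker` are hypothetical, and the stuck thread is only a stand-in for whatever is really hanging.

```python
import faulthandler
import signal
import tempfile
import threading


def install_stack_dump_handler(signum=signal.SIGUSR1):
    """On receipt of signum, write every thread's traceback to stderr."""
    faulthandler.register(signum, all_threads=True)


def dump_stacks_to_text():
    """Return the current Python tracebacks of all threads as a string."""
    # faulthandler writes straight to a file descriptor, so a real file
    # is needed here rather than an in-memory buffer.
    with tempfile.TemporaryFile() as f:
        faulthandler.dump_traceback(file=f, all_threads=True)
        f.seek(0)
        return f.read().decode("utf-8", "replace")


def demo():
    """Start a thread that blocks forever, then capture all stacks."""
    never_set = threading.Event()
    started = threading.Event()

    def stuck_worker():
        # Hypothetical stand-in for a hung uploader thread. On Linux a
        # wait like this typically shows up in strace as a futex call.
        started.set()
        never_set.wait()

    worker = threading.Thread(target=stuck_worker, daemon=True)
    worker.start()
    started.wait()
    report = dump_stacks_to_text()
    never_set.set()  # release the worker so the demo exits cleanly
    worker.join()
    return report


if __name__ == "__main__":
    install_stack_dump_handler()
    print(demo())
```

With a handler like this installed at startup, sending `kill -USR1 <pid>` to a hung process would print each thread's stack to stderr, which should show whether the blocked thread is in collection or Keep client code.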