Bug #7225

Updated by Brett Smith over 8 years ago

qr1hi-8i9sb-h5yt6xmpk8u6dps is a pretty typical BWA aligner job. The aligner apparently ran fine, but then run-command got stuck uploading the data. These are the last interesting lines that appear in the log:

 <pre>2015-09-04_15:16:52 qr1hi-8i9sb-h5yt6xmpk8u6dps 4106 0 stderr run-command: /keep/39c6f22d40001074f4200a72559ae7eb+5745/bwa completed with exit code 0 (success)
 2015-09-04_15:16:52 qr1hi-8i9sb-h5yt6xmpk8u6dps 4106 0 stderr run-command: the following output files will be saved to keep:
 2015-09-04_15:16:52 qr1hi-8i9sb-h5yt6xmpk8u6dps 4106 0 stderr run-command:     1455988972 ./[filename].sai
 2015-09-04_15:16:52 qr1hi-8i9sb-h5yt6xmpk8u6dps 4106 0 stderr run-command: start writing output to keep
 </pre> 

 After that, run-command stayed alive but was never heard from again. When I checked on the compute node, the run-command process was still alive, but not doing anything. strace reported it was stuck in a futex call.

 I'm tentatively filing this as an SDK bug, figuring the issue is probably in the PySDK collection or Keep code.
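
 A process parked in a futex call is usually a thread waiting on a lock or condition that never gets signalled, and strace alone does not say which Python code is waiting. One possible way to narrow that down, assuming run-command is a Python process that can be rerun with extra instrumentation, is to dump every thread's stack with the standard-library faulthandler module (Python 3). The sketch below is only an illustration of the technique: upload_worker is a hypothetical stand-in for whatever thread writes to Keep, not actual SDK code.

```python
import faulthandler
import tempfile
import threading


def dump_stacks_while_blocked():
    """Block a worker thread on a lock, then capture all thread stacks as text."""
    lock = threading.Lock()
    lock.acquire()  # held by the main thread, so the worker cannot proceed
    started = threading.Event()

    def upload_worker():
        # Hypothetical stand-in for a thread that hangs while writing output.
        started.set()
        with lock:  # blocks here; on Linux this typically shows up as a futex wait
            pass

    worker = threading.Thread(target=upload_worker, name="upload_worker")
    worker.start()
    started.wait()  # make sure the worker is running before dumping

    # faulthandler writes straight to the file descriptor, so a real file is needed.
    with tempfile.TemporaryFile(mode="w+") as dump_file:
        faulthandler.dump_traceback(file=dump_file, all_threads=True)
        dump_file.seek(0)
        stacks = dump_file.read()

    lock.release()  # let the worker finish so the example exits cleanly
    worker.join()
    return stacks


if __name__ == "__main__":
    print(dump_stacks_while_blocked())
```

 For a job that is already hung, the same dump can be requested from outside if the process called faulthandler.register(signal.SIGUSR1, all_threads=True) at startup; sending SIGUSR1 then writes the stacks to stderr without killing the process.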
