Bug #10808

Updated by Ward Vandewege almost 8 years ago

Job c97qk-8i9sb-bj9c3ojdng85osz appears to be unkillable via the cancel button in Workbench.

 There are several pipeline instances waiting on it: 

 <pre> 
 2017-01-04_18:27:11.93074 2017-01-04 18:27:11 +0000 -- pipeline_instance c97qk-d1hrv-n6pik83zizjk5hn 
 2017-01-04_18:27:11.93074 cwl-runner c97qk-8i9sb-bj9c3ojdng85osz {:running=>1, :done=>0, :failed=>0, :todo=>0} 
 2017-01-04_18:27:13.03719  
 2017-01-04_18:27:13.03721 2017-01-04 18:27:12 +0000 -- pipeline_instance c97qk-d1hrv-0thxn81rmpaedyo 
 2017-01-04_18:27:13.03721 cwl-runner c97qk-8i9sb-bj9c3ojdng85osz {:running=>1, :done=>0, :failed=>0, :todo=>0} 
 2017-01-04_18:27:14.53057  
 2017-01-04_18:27:14.53060 2017-01-04 18:27:14 +0000 -- pipeline_instance c97qk-d1hrv-frf2e4vls4gq22v 
 2017-01-04_18:27:14.53062 cwl-runner c97qk-8i9sb-bj9c3ojdng85osz {:running=>1, :done=>0, :failed=>0, :todo=>0} 
 2017-01-04_18:27:15.74509  
 2017-01-04_18:27:15.74511 2017-01-04 18:27:15 +0000 -- pipeline_instance c97qk-d1hrv-5dzt55sa9wlq495 
 2017-01-04_18:27:15.74512 cwl-runner c97qk-8i9sb-bj9c3ojdng85osz {:running=>1, :done=>0, :failed=>0, :todo=>0} 
 2017-01-04_18:27:16.69833  
 2017-01-04_18:27:16.69834 2017-01-04 18:27:16 +0000 -- pipeline_instance c97qk-d1hrv-1uwxdzktqgl8hr6 
 2017-01-04_18:27:16.69835 cwl-runner c97qk-8i9sb-bj9c3ojdng85osz {:running=>1, :done=>0, :failed=>0, :todo=>0} 
 2017-01-04_18:27:20.34010  
 </pre> 

 The job is not actually running. SLURM reports every compute node as down or drained, and the queue is empty:

 <pre> 
 c97qk:/etc/service# sinfo 
 PARTITION AVAIL    TIMELIMIT    NODES    STATE NODELIST 
 compute*       up     infinite        7 drain* compute[3-9] 
 compute*       up     infinite      249    down* compute[0-2,10-255] 
 crypto         up     infinite        7 drain* compute[3-9] 
 crypto         up     infinite      249    down* compute[0-2,10-255] 
 c97qk:/etc/service# squeue_long  
   JOBID PARTITION NAME       USER ST         TIME    NODES NODELIST(REASON) 
 </pre>
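
 The same check can be scripted. What follows is only a minimal sketch, assuming the six-column @sinfo@ layout shown above and that @squeue_long@ prints a single header line followed by one line per job. The helper names @nodes_by_state@ and @squeue_job_count@ are hypothetical and are not part of Arvados or SLURM.

```python
from collections import defaultdict

# Sample input copied from the sinfo output in this report.
SINFO_OUTPUT = """\
PARTITION AVAIL    TIMELIMIT    NODES    STATE NODELIST
compute*       up     infinite        7 drain* compute[3-9]
compute*       up     infinite      249    down* compute[0-2,10-255]
crypto         up     infinite        7 drain* compute[3-9]
crypto         up     infinite      249    down* compute[0-2,10-255]
"""

# Header-only output, as printed by squeue_long in this report.
SQUEUE_OUTPUT = """\
  JOBID PARTITION NAME       USER ST         TIME    NODES NODELIST(REASON)
"""


def nodes_by_state(sinfo_text):
    """Return {(partition, state): node_count} parsed from sinfo-style text.

    Assumes the column order PARTITION AVAIL TIMELIMIT NODES STATE NODELIST.
    The trailing '*' that sinfo appends to the default partition and to
    non-responding node states is stripped.
    """
    counts = defaultdict(int)
    for line in sinfo_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 6:
            continue  # ignore blank or malformed lines
        partition, _avail, _limit, nodes, state = fields[:5]
        counts[(partition.rstrip("*"), state.rstrip("*"))] += int(nodes)
    return dict(counts)


def squeue_job_count(squeue_text):
    """Count job rows in squeue-style text (non-empty lines minus the header)."""
    rows = [line for line in squeue_text.splitlines() if line.strip()]
    return max(len(rows) - 1, 0)


if __name__ == "__main__":
    for (partition, state), count in sorted(nodes_by_state(SINFO_OUTPUT).items()):
        print(f"{partition}: {count} node(s) in state {state}")
    print(f"jobs in queue: {squeue_job_count(SQUEUE_OUTPUT)}")
```

 A queue count of zero while the API server still lists the job as running is the mismatch described in this report. Note that the same nodes appear in both partitions, so the counts are kept per partition rather than summed.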
