Bug #3649

[Crunch] 'Pipeline_instances' page shows job running when it has failed

Added by Abram Connelly about 5 years ago. Updated about 5 years ago.

Status: Rejected
Priority: Normal
Assigned To: -
Category: -
Target version:
Start date: 08/21/2014
Due date:
% Done: 0%
Estimated time:
Story points: 1.0

Description

I believe I cancelled the second and last job in the pipeline qr1hi-d1hrv-aa011c9iq7twjbp (but I can't quite remember, sorry). The pipeline instances page now shows the second job as running when it has in fact failed.

Attached are screenshots of the pipeline instance page, which shows a status of 'running', and the 'show job details' page, which shows a status of 'failed'.

Here is what I hope is the relevant portion of the output log:

2014-08-21_18:52:41 qr1hi-8i9sb-2p2onyry48ioxen 22369  Job cancelled at 2014-08-21T18:52:41Z by user qr1hi-tpzed-rd7cyyspidta11c
2014-08-21_18:52:41 qr1hi-8i9sb-2p2onyry48ioxen 22369  wait for last 1 children to finish
2014-08-21_18:52:41 qr1hi-8i9sb-2p2onyry48ioxen 22369 1 sending 2x signal 2 to pid 22767
2014-08-21_18:52:41 qr1hi-8i9sb-2p2onyry48ioxen 22369 1 stderr srun: interrupt (one more within 1 sec to abort)
2014-08-21_18:52:41 qr1hi-8i9sb-2p2onyry48ioxen 22369 1 stderr srun: task 0: running
2014-08-21_18:52:41 qr1hi-8i9sb-2p2onyry48ioxen 22369 1 stderr srun: sending Ctrl-C to job 754.4
2014-08-21_18:52:41 qr1hi-8i9sb-2p2onyry48ioxen 22369 1 stderr crunchstat: caught signal:interrupt
2014-08-21_18:52:41 qr1hi-8i9sb-2p2onyry48ioxen 22369 1 stderr srun: Job step aborted: Waiting up to 2 seconds for job step to finish.
2014-08-21_18:52:41 qr1hi-8i9sb-2p2onyry48ioxen 22369 1 stderr slurmd[compute4]: error: *** STEP 754.4 KILLED AT 2014-08-21T18:52:41 WITH SIGNAL 9 ***
2014-08-21_18:52:41 qr1hi-8i9sb-2p2onyry48ioxen 22369 1 child 22767 on compute4.1 exit 9 signal 0 success=
2014-08-21_18:52:42 qr1hi-8i9sb-2p2onyry48ioxen 22369 1 failure (#1, permanent) after 410 seconds
2014-08-21_18:52:42 qr1hi-8i9sb-2p2onyry48ioxen 22369 1 output 
2014-08-21_18:52:42 qr1hi-8i9sb-2p2onyry48ioxen 22369  status: 1 done, 0 running, 0 todo
2014-08-21_18:52:42 qr1hi-8i9sb-2p2onyry48ioxen 22369  release job allocation
2014-08-21_18:52:42 qr1hi-8i9sb-2p2onyry48ioxen 22369  Freeze not implemented
2014-08-21_18:52:42 qr1hi-8i9sb-2p2onyry48ioxen 22369  collate
pip_update_issue0.png (80.8 KB) Abram Connelly, 08/21/2014 03:19 PM
pip_update_issue1.png (61.6 KB) Abram Connelly, 08/21/2014 03:19 PM

History

#1 Updated by Ward Vandewege about 5 years ago

  • Target version set to 2014-08-27 Sprint

#2 Updated by Ward Vandewege about 5 years ago

  • Subject changed from 'Pipeline_instances' page shows job running when it has failed to [Crunch] 'Pipeline_instances' page shows job running when it has failed

#3 Updated by Ward Vandewege about 5 years ago

  • Story points set to 1.0

#4 Updated by Peter Amstutz about 5 years ago

Duplicate of #3136

#5 Updated by Peter Amstutz about 5 years ago

  • Status changed from New to Rejected
