Bug #3649

closed

[Crunch] 'Pipeline_instances' page shows job running when it has failed

Added by Abram Connelly over 9 years ago. Updated over 9 years ago.

Status: Rejected
Priority: Normal
Assigned To: -
Category: -
Target version:
Story points: 1.0

Description

I believe I cancelled the second and last job (but I can't quite remember, sorry) in the pipeline qr1hi-d1hrv-aa011c9iq7twjbp, and now the pipeline instances page shows the second job as running when it has actually failed.

Attached are screenshots of the pipeline instance page, which shows a status of 'running', and of the 'show job details' page, which shows a status of 'failed'.

Here is hopefully the relevant portion of the output log:

2014-08-21_18:52:41 qr1hi-8i9sb-2p2onyry48ioxen 22369  Job cancelled at 2014-08-21T18:52:41Z by user qr1hi-tpzed-rd7cyyspidta11c
2014-08-21_18:52:41 qr1hi-8i9sb-2p2onyry48ioxen 22369  wait for last 1 children to finish
2014-08-21_18:52:41 qr1hi-8i9sb-2p2onyry48ioxen 22369 1 sending 2x signal 2 to pid 22767
2014-08-21_18:52:41 qr1hi-8i9sb-2p2onyry48ioxen 22369 1 stderr srun: interrupt (one more within 1 sec to abort)
2014-08-21_18:52:41 qr1hi-8i9sb-2p2onyry48ioxen 22369 1 stderr srun: task 0: running
2014-08-21_18:52:41 qr1hi-8i9sb-2p2onyry48ioxen 22369 1 stderr srun: sending Ctrl-C to job 754.4
2014-08-21_18:52:41 qr1hi-8i9sb-2p2onyry48ioxen 22369 1 stderr crunchstat: caught signal:interrupt
2014-08-21_18:52:41 qr1hi-8i9sb-2p2onyry48ioxen 22369 1 stderr srun: Job step aborted: Waiting up to 2 seconds for job step to finish.
2014-08-21_18:52:41 qr1hi-8i9sb-2p2onyry48ioxen 22369 1 stderr slurmd[compute4]: error: *** STEP 754.4 KILLED AT 2014-08-21T18:52:41 WITH SIGNAL 9 ***
2014-08-21_18:52:41 qr1hi-8i9sb-2p2onyry48ioxen 22369 1 child 22767 on compute4.1 exit 9 signal 0 success=
2014-08-21_18:52:42 qr1hi-8i9sb-2p2onyry48ioxen 22369 1 failure (#1, permanent) after 410 seconds
2014-08-21_18:52:42 qr1hi-8i9sb-2p2onyry48ioxen 22369 1 output 
2014-08-21_18:52:42 qr1hi-8i9sb-2p2onyry48ioxen 22369  status: 1 done, 0 running, 0 todo
2014-08-21_18:52:42 qr1hi-8i9sb-2p2onyry48ioxen 22369  release job allocation
2014-08-21_18:52:42 qr1hi-8i9sb-2p2onyry48ioxen 22369  Freeze not implemented
2014-08-21_18:52:42 qr1hi-8i9sb-2p2onyry48ioxen 22369  collate
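
For anyone triaging a similar report, the terminal state recorded in a log like the one above can be checked mechanically. The following is only a rough sketch: the line layout (timestamp, job UUID, pid, optional task number, message) is inferred from this excerpt alone, and `LINE_RE`, `summarize`, and `SAMPLE` are hypothetical names for illustration, not part of Crunch or Workbench.

```python
import re

# Assumed layout, inferred only from the excerpt in this report:
#   <timestamp> <job uuid> <pid> <task number or empty> <message>
LINE_RE = re.compile(
    r"^(?P<ts>\S+) (?P<job>\S+) (?P<pid>\d+) (?P<task>\d*) (?P<msg>.*)$"
)


def summarize(log_text):
    """Collect the cancel, task-failure, and final status lines from a log."""
    summary = {"cancelled": False, "failed_tasks": [], "status": None}
    for line in log_text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # skip anything that does not fit the assumed layout
        task, msg = m.group("task"), m.group("msg")
        if not task and msg.startswith("Job cancelled"):
            summary["cancelled"] = True
        elif task and msg.startswith("failure"):
            summary["failed_tasks"].append(int(task))
        elif not task and msg.startswith("status:"):
            summary["status"] = msg[len("status:"):].strip()
    return summary


# Three lines copied verbatim from the excerpt above.
SAMPLE = """\
2014-08-21_18:52:41 qr1hi-8i9sb-2p2onyry48ioxen 22369  Job cancelled at 2014-08-21T18:52:41Z by user qr1hi-tpzed-rd7cyyspidta11c
2014-08-21_18:52:42 qr1hi-8i9sb-2p2onyry48ioxen 22369 1 failure (#1, permanent) after 410 seconds
2014-08-21_18:52:42 qr1hi-8i9sb-2p2onyry48ioxen 22369  status: 1 done, 0 running, 0 todo
"""

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

On these lines the summary reports a cancelled job with task 1 failed and nothing left running, which matches the 'show job details' page rather than the pipeline instance page.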

Files

pip_update_issue0.png (80.8 KB) Abram Connelly, 08/21/2014 03:19 PM
pip_update_issue1.png (61.6 KB) Abram Connelly, 08/21/2014 03:19 PM
Actions #1

Updated by Ward Vandewege over 9 years ago

  • Target version set to 2014-08-27 Sprint
Actions #2

Updated by Ward Vandewege over 9 years ago

  • Subject changed from 'Pipeline_instances' page shows job running when it has failed to [Crunch] 'Pipeline_instances' page shows job running when it has failed
Actions #3

Updated by Ward Vandewege over 9 years ago

  • Story points set to 1.0
Actions #4

Updated by Peter Amstutz over 9 years ago

Duplicate of #3136

Actions #5

Updated by Peter Amstutz over 9 years ago

  • Status changed from New to Rejected