Bug #3384

[Crunch] Termination of jobs due to 'Connection timed out'?

Added by Abram Connelly about 5 years ago. Updated almost 5 years ago.

Status: Closed
Priority: Normal
Assigned To: Tim Pierce
Category: Crunch
Target version:
Start date: 07/28/2014
Due date:
% Done: 0%
Estimated time:
Story points: -

Description

Pipeline instance qr1hi-8i9sb-n1yv047kymyjtxs failed, although it had been working before. The output log collection 2482f18b2f601d248bb4fe93e296b862+87 contains these lines:

2014-07-28_14:02:15 qr1hi-8i9sb-n1yv047kymyjtxs 10767 50 stderr socket.error: [Errno 110] Connection timed out
2014-07-28_14:02:15 qr1hi-8i9sb-n1yv047kymyjtxs 10767 50 stderr srun: error: compute0: task 0: Exited with exit code 1

followed by subsequent job cancellations:

2014-07-28_14:02:16 qr1hi-8i9sb-n1yv047kymyjtxs 10767 54 stderr srun: sending Ctrl-C to job 3133.57
2014-07-28_14:02:16 qr1hi-8i9sb-n1yv047kymyjtxs 10767 54 stderr crunchstat: caught signal:interrupt

History

#1 Updated by Tom Clegg about 5 years ago

Possible solution (or at least helpful improvement):

[Crunch] API communication failure should result in recording a temporary task failure, not a permanent one.
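
To illustrate the idea, here is a minimal Python sketch of treating a timed-out API call as a temporary failure. It is not Crunch's actual code: `TemporaryTaskFailure`, `is_transient`, `call_with_retry`, and the set of errno values counted as transient are all hypothetical names and choices made for this example.

```python
import errno
import socket
import time

# Assumption: these errno values indicate transient network trouble.
# The real set Crunch should use is not specified in this ticket.
TRANSIENT_ERRNOS = {errno.ETIMEDOUT, errno.ECONNREFUSED, errno.ECONNRESET}


class TemporaryTaskFailure(Exception):
    """Hypothetical marker: the task failed for now and may be retried."""


def is_transient(exc):
    """Return True if exc looks like a temporary communication failure."""
    if isinstance(exc, socket.timeout):
        return True
    return isinstance(exc, OSError) and exc.errno in TRANSIENT_ERRNOS


def call_with_retry(api_call, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call api_call(), retrying transient errors with exponential backoff.

    Non-transient errors propagate unchanged. If every attempt fails with a
    transient error, TemporaryTaskFailure is raised so the caller can record
    a temporary failure instead of a permanent one.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return api_call()
        except OSError as exc:  # socket.error is an alias of OSError in Python 3
            if not is_transient(exc):
                raise
            last_error = exc
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise TemporaryTaskFailure(str(last_error)) from last_error
```

With this shape, an `[Errno 110] Connection timed out` like the one in the log above would surface as `TemporaryTaskFailure` after the retries are exhausted, rather than as an unhandled exception that ends the task with exit code 1.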

#2 Updated by Ward Vandewege almost 5 years ago

  • Target version set to Bug Triage

#3 Updated by Ward Vandewege almost 5 years ago

  • Project changed from Arvados to Arvados Private

#4 Updated by Tim Pierce almost 5 years ago

  • Target version changed from Bug Triage to 2014-10-08 sprint

#5 Updated by Tim Pierce almost 5 years ago

  • Subject changed from Termination of jobs due to 'Connection timed out'? to [Crunch] Termination of jobs due to 'Connection timed out'?
  • Category set to Crunch
  • Project changed from Arvados Private to Arvados
  • Assigned To set to Tim Pierce

#6 Updated by Tim Pierce almost 5 years ago

  • Status changed from New to Closed

Could not reproduce; re-running this pipeline at https://workbench.qr1hi.arvadosapi.com/pipeline_instances/qr1hi-d1hrv-ftse9e4sz35fot7 yielded success.
