Bug #8810

closed

[Crunch] `docker load` fails to connect to endpoint; srun exits 0

Added by Brett Smith about 8 years ago. Updated almost 8 years ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: -
Target version:
Story points: -

Description

2016-03-22_16:33:38 wx7k5-8i9sb-ose8gk9vuxqe9gd 48074  stderr starting: ['srun','--nodelist=compute11','/bin/bash','-o','pipefail','-ec',' if ! docker.io images -q --no-trunc --all | grep -qxF d33416e64af4370471ed15d19211e84991a8e158626199f4e4747e4310144b83; then     arv-get 17b65db74aae73465b5e286d1cdb0e23\\+798\\/d33416e64af4370471ed15d19211e84991a8e158626199f4e4747e4310144b83\\.tar | docker.io load fi ']
2016-03-22_16:33:40 wx7k5-8i9sb-ose8gk9vuxqe9gd 48074  stderr Post http:///var/run/docker.sock/v1.20/images/load: EOF.
2016-03-22_16:33:40 wx7k5-8i9sb-ose8gk9vuxqe9gd 48074  stderr * Are you trying to connect to a TLS-enabled daemon without TLS?
2016-03-22_16:33:40 wx7k5-8i9sb-ose8gk9vuxqe9gd 48074  stderr * Is your docker daemon up and running?
2016-03-22_16:41:14 wx7k5-8i9sb-ose8gk9vuxqe9gd 48074  stderr srun: error: Node failure on compute11
2016-03-22_16:41:14 wx7k5-8i9sb-ose8gk9vuxqe9gd 48074  stderr srun: Job step aborted: Waiting up to 2 seconds for job step to finish.
2016-03-22_16:41:14 wx7k5-8i9sb-ose8gk9vuxqe9gd 48074  load docker image: exit 0

From here, the job kept running and generating errors until the UID 0 check failed. Instead, crunch-job should detect this error and exit in a way that makes crunch-dispatch retry the job.
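
As a rough illustration of the requested behavior (not the actual crunch-job change), the check below treats the load step as failed when the exit code is 0 but stderr contains the messages seen in the log above, or when the image is still missing afterwards. The function name `load_step_status`, the `image_present` flag, and the use of 75 as a "retry" exit code are assumptions made for this sketch only.

```python
# Hypothetical exit status asking the dispatcher to retry the job.
# 75 is EX_TEMPFAIL in sysexits.h; crunch-job may use a different value.
EXIT_RETRY = 75

# Substrings copied from the stderr lines in the log above.
DAEMON_ERROR_MARKERS = (
    "Are you trying to connect to a TLS-enabled daemon without TLS?",
    "Is your docker daemon up and running?",
    "srun: error: Node failure",
)


def load_step_status(exit_code, stderr_text, image_present):
    """Return 0 only if the image load can be trusted, else a nonzero status.

    exit_code     -- exit status reported for the srun/docker load step
    stderr_text   -- everything the step wrote to stderr
    image_present -- result of re-checking the image list after the load
    """
    if exit_code != 0:
        # A real failure status is passed through unchanged.
        return exit_code
    if any(marker in stderr_text for marker in DAEMON_ERROR_MARKERS):
        # The step "succeeded" but Docker or srun reported a failure.
        return EXIT_RETRY
    if not image_present:
        # No error text, but the image never arrived on the node.
        return EXIT_RETRY
    return 0
```

With the stderr from this report and an exit code of 0, the function returns the retry status instead of letting the job continue.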


Subtasks 1 (0 open, 1 closed)

Task #8888: Review 8810-crunch-improve-docker-loading-wip (Resolved, Brett Smith, 04/05/2016)

Related issues

Related to Arvados - Bug #8811: [Crunch] `srun --nodes=1` reports "Unable to create job step: Required node not available (down or drained)" and exits 1 (Resolved, Brett Smith, 03/31/2016)
Related to Arvados - Bug #8869: [Crunch] Job was repeatedly retried on same bad compute node until abandoned (Closed, 03/31/2016)