Bug #11494
closed
bcbio NA12878 validation runs: failed steps due to being unable to set up container
Description
I'm running bcbio CWL validation workflows on NA12878 chr20 in preparation for the GA4GH workflow challenge:
https://github.com/bcbio/bcbio_validation_workflows
We had an almost successful run:
https://cloud.curoverse.com/pipeline_instances/qr1hi-d1hrv-xzppm82ydryksab
but it has 3 failed variant calling jobs. It looks like bcbio never actually ran, and the failure appears to be
an issue with setting up the instance:
https://cloud.curoverse.com/jobs/qr1hi-8i9sb-h8g7y5er07o8dp7#Log
https://cloud.curoverse.com/jobs/qr1hi-8i9sb-m9okvuwbhem966d#Log
https://cloud.curoverse.com/jobs/qr1hi-8i9sb-07qhvcy8jpomdn1#Log
This looks like the useful part of the log:
```
stderr starting: ['srun','--nodes=1','/bin/sh','-ec','/usr/bin/docker.io run --user=crunch a458ac1b067f2938da2860b2d3212900660905e3713906ce20caa5c353cdb45a id --user']
stderr Unable to find user crunch
stderr Error response from daemon: Cannot start container d51100ab82905a45759faeb14fa7487102141930c256381a609b69f39ed05c0f: [8] System error: Unable to find user crunch
stderr srun: error: compute56: task 0: Exited with exit code 1
check whether user 'crunch' is UID 0: exit 1
check whether user 'nobody' is UID 0: start
stderr starting: ['srun','--nodes=1','/bin/sh','-ec','/usr/bin/docker.io run --user=nobody a458ac1b067f2938da2860b2d3212900660905e3713906ce20caa5c353cdb45a id --user']
stderr srun: error: Unable to create job step: Required node not available (down or drained)
```
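Reading the excerpt, there seem to be two separate errors: the probe with `--user=crunch` fails because the image has no such user, and the fallback probe with `--user=nobody` never starts because srun reports the node as down or drained. In case it helps triage the three job logs, here is a rough sketch of how they could be scanned for those two messages. The function name `classify_log` and the labels `missing_container_user` and `node_unavailable` are my own assumptions for illustration and are not part of Arvados or bcbio; the patterns are copied from the log text above.

```python
import re

# Failure signatures copied from the log excerpt in this ticket.
# The label names are hypothetical, chosen only for this sketch.
SIGNATURES = [
    ("missing_container_user", re.compile(r"Unable to find user (\S+)")),
    ("node_unavailable",
     re.compile(r"Required node not available \(down or drained\)")),
]


def classify_log(lines):
    """Return the sorted labels of every known failure signature found."""
    found = set()
    for line in lines:
        for label, pattern in SIGNATURES:
            if pattern.search(line):
                found.add(label)
    return sorted(found)


if __name__ == "__main__":
    sample = [
        "stderr Unable to find user crunch",
        "stderr srun: error: Unable to create job step: "
        "Required node not available (down or drained)",
    ]
    print(classify_log(sample))
```

If all three jobs show `node_unavailable`, that would point at the compute node rather than at the bcbio image, but I have not confirmed this.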
Thanks for any suggestions and help debugging.
Updated by Tom Morris over 6 years ago
- Target version set to Arvados Future Sprints
Updated by Ward Vandewege almost 3 years ago
- Target version deleted (Arvados Future Sprints)