Bug #11495

bcbio NA12878 validation runs: job re-use failure with non-existent collection

Added by Brad Chapman about 7 years ago. Updated almost 3 years ago.

Status: Closed
Priority: Normal
Assigned To: -
Category: -
Target version: -
Story points: -

Description

Due to the failures reported in #11494, I tried to re-run the bcbio CWL validation pipeline with job re-use enabled, and it fails when accessing one of the collections:
```
2017-04-13 09:16:39 arvados.cwl-runner INFO: Pipeline instance qr1hi-d1hrv-7lnm3bklaagipg6
2017-04-13 09:17:02 arvados.cwl-runner INFO: [job prep_samples_to_rec] qr1hi-8i9sb-xv69ktrwccodeja is Queued
2017-04-13 09:17:03 arvados.cwl-runner INFO: [job alignment_to_rec] reused job qr1hi-8i9sb-r83xspm4oxsvm7v
2017-04-13 09:17:08 arvados.cwl-runner ERROR: Got unknown exception while collecting output for job alignment_to_rec:
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/arvados_cwl/arvjob.py", line 195, in done
    num_retries=self.arvrunner.num_retries)
  File "/usr/lib/python2.7/dist-packages/arvados/collection.py", line 1680, in __init__
    super(CollectionReader, self).__init__(manifest_locator_or_text, *args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/arvados/collection.py", line 1236, in __init__
    self._populate()
  File "/usr/lib/python2.7/dist-packages/arvados/collection.py", line 1367, in _populate
    error_via_keep))
NotFoundError: Failed to retrieve collection '9385673b342238f0cd9b7251b725a4ed+85' from either API server (<HttpError 404 when requesting https://qr1hi.arvadosapi.com/arvados/v1/collections/9385673b342238f0cd9b7251b725a4ed%2B85?alt=json returned "Path not found">) or Keep (9385673b342238f0cd9b7251b725a4ed+85 not found: http://keep23.qr1hi.arvadosapi.com:25107/ responded with 403 HTTP/1.1 403 Forbidden
; http://keep24.qr1hi.arvadosapi.com:25107/ responded with 403 HTTP/1.1 403 Forbidden
; http://keep27.qr1hi.arvadosapi.com:25107/ responded with 403 HTTP/1.1 403 Forbidden
; http://keep20.qr1hi.arvadosapi.com:25107/ responded with 403 HTTP/1.1 403 Forbidden
; http://keep21.qr1hi.arvadosapi.com:25107/ responded with 403 HTTP/1.1 403 Forbidden
; http://keep25.qr1hi.arvadosapi.com:25107/ responded with 403 HTTP/1.1 403 Forbidden
; http://keep22.qr1hi.arvadosapi.com:25107/ responded with 403 HTTP/1.1 403 Forbidden
; http://keep26.qr1hi.arvadosapi.com:25107/ responded with 403 HTTP/1.1 403 Forbidden
).
2017-04-13 09:17:08 cwltool ERROR: [step alignment_to_rec] Output is missing expected field file:///home/bchapman/runs/NA12878-platinum-chr20-workflow-arvados/main-NA12878-platinum-chr20.cwl#alignment_to_rec/alignment_rec
2017-04-13 09:17:08 cwltool WARNING: [step alignment_to_rec] completed permanentFail
2017-04-13 09:17:08 cwltool INFO: [workflow main-NA12878-platinum-chr20.cwl] outdir is $(task.outdir)
2017-04-13 09:17:08 arvados.cwl-runner WARNING: Overall process status is permanentFail
```
The portable data hash of the previous run's output is 79e25208a213b812dddda032e28eac07+223:

https://cloud.curoverse.com/jobs/qr1hi-8i9sb-cazntclejrng7dg#Status

so I'm not sure where the collection hash requested above (9385673b342238f0cd9b7251b725a4ed+85), which does not exist, comes from.
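
To line up the re-used job with the collection that could not be retrieved, the identifiers can be pulled straight out of the runner log. The following is a minimal sketch using only the Python standard library. It assumes the log line formats shown in the output above; `summarize_log` and the regular expressions are hypothetical helpers written for illustration, not part of arvados-cwl-runner or the Arvados SDK.

```python
import re

# "[job <step>] reused job <job uuid>" lines, as they appear in the log above.
REUSED_RE = re.compile(r"\[job (\S+)\] reused job (\S+)")
# NotFoundError lines quote the collection locator (32 hex digits + "+" + size).
MISSING_RE = re.compile(r"Failed to retrieve collection '([0-9a-f]{32}\+\d+)'")


def summarize_log(log_text):
    """Return (reused, missing).

    reused  maps step name -> UUID of the job that was re-used.
    missing is the set of collection locators that could not be retrieved.
    """
    reused = {}
    missing = set()
    for line in log_text.splitlines():
        match = REUSED_RE.search(line)
        if match:
            reused[match.group(1)] = match.group(2)
        match = MISSING_RE.search(line)
        if match:
            missing.add(match.group(1))
    return reused, missing


# Two lines copied (and truncated) from the log in this report.
SAMPLE_LOG = """\
2017-04-13 09:17:03 arvados.cwl-runner INFO: [job alignment_to_rec] reused job qr1hi-8i9sb-r83xspm4oxsvm7v
NotFoundError: Failed to retrieve collection '9385673b342238f0cd9b7251b725a4ed+85' from either API server
"""

if __name__ == "__main__":
    reused, missing = summarize_log(SAMPLE_LOG)
    expected = "79e25208a213b812dddda032e28eac07+223"
    print("re-used jobs:", reused)
    print("missing collections:", missing)
    print("expected output among missing:", expected in missing)
```

Run against the full log, this makes it easier to see that the job reported as re-used (qr1hi-8i9sb-r83xspm4oxsvm7v) is not the same job as the one linked above (qr1hi-8i9sb-cazntclejrng7dg), which may be why the two hashes differ.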
