Bug #8805
Updated by Sarah Guthrie about 8 years ago
Two jobs, identical in inputs, both on qr1hi, in quick succession (on the same node) failed at different points in the process of running a crunch script.

* https://workbench.qr1hi.arvadosapi.com/pipeline_instances/qr1hi-d1hrv-hu682pkl5bjfpny#
* https://workbench.qr1hi.arvadosapi.com/pipeline_instances/qr1hi-d1hrv-0agoz5tizyfudr1#

They both go through a directory, using os.walk, copying everything to a temporary directory. I used excessive logging to figure out that both jobs were failing at different points, despite having the same inputs, crunch_script version, docker image, and compute node.

This behavior was not observed when using subprocess.check_call(['cp', '-r']).
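
For anyone trying to reproduce this, the two copy strategies being compared might look roughly like the sketch below. This is not the actual crunch script: the function names `copy_tree_walk` and `copy_tree_cp`, the logger name, and the use of `shutil.copyfile` are assumptions made for illustration, and the code is written for Python 3.

```python
import logging
import os
import shutil
import subprocess

# Hypothetical logger name; the real script's logging setup is not shown in this ticket.
logger = logging.getLogger("copy_tree")


def copy_tree_walk(src, dst):
    """Copy every file under src into dst using os.walk, logging each file.

    Returns the list of copied paths relative to src, so two runs can be
    compared to see where they diverge.
    """
    copied = []
    for root, dirs, files in os.walk(src):
        rel = os.path.relpath(root, src)
        target_dir = dst if rel == "." else os.path.join(dst, rel)
        os.makedirs(target_dir, exist_ok=True)
        for name in sorted(files):
            src_path = os.path.join(root, name)
            logger.info("copying %s", src_path)
            shutil.copyfile(src_path, os.path.join(target_dir, name))
            copied.append(os.path.normpath(os.path.join(rel, name)))
    return copied


def copy_tree_cp(src, dst):
    """Copy the contents of src into an existing dst by shelling out to cp -r."""
    subprocess.check_call(["cp", "-r", os.path.join(src, "."), dst])
```

Comparing the list returned by `copy_tree_walk` (or the per-file log lines) between the two failing jobs would show the last file each one handled before failing.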