Bug #4499 (closed)
[SDKs] one_task_per_input_file should call normalize() before getting its list of files
Description
sguthrie reports that qr1hi-d1hrv-3n1gezo6gondyrv, despite running with one_task_per_input_file, spawns 300 tasks on an input collection with 60 files in it:
Job qr1hi-8i9sb-zc7ivqn134e8czx takes input collection qr1hi-4zz18-zc07fkmhzhfzriz, which has 64 files.
This job runs addPopulationMemEff.py at commit 5299b46be, which specifies:
arvados.job_setup.one_task_per_input_file(if_sequence=0, and_end_task=True, input_as_path=True)
but the job produces nearly 300 tasks:
"tasks_summary": { "done": 1, "running": 8, "failed": 0, "todo": 292 },
Updated by Tim Pierce about 10 years ago
- Description updated (diff)
It's interesting to me that this collection is kind of degenerate: each of the output files is produced from thousands or millions of chunks from the input blocks. The result is a collection with 60 streams in a 55MB manifest, with records up to a million bytes long.
Not yet sure if this is hurting one_task_per_input_file but it seems worth keeping an eye on.
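To make the degenerate-manifest concern concrete, here is a hypothetical example (the two hashes are just md5("foo") and md5("bar"), and this assumes CollectionReader accepts literal manifest text, as the SDK of this era did): an unnormalized manifest can spread one file across two lines of the same stream, in which case all_streams()/all_files() report it twice, while normalize() collapses it to a single entry:

cr = arvados.CollectionReader(
    ". acbd18db4cc2f85cedef654fccc4a4d8+3 0:3:chunk.txt\n"
    ". 37b51d194a7513e45b56f6524f2d51f2+3 0:3:chunk.txt\n")
print(len([f for s in cr.all_streams() for f in s.all_files()]))  # 2 file entries
cr.normalize()
print(len([f for s in cr.all_streams() for f in s.all_files()]))  # 1 file entry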
Updated by Tim Pierce about 10 years ago
- Subject changed from pipeline creates 300 tasks for job with 64 files to one_task_per_input_file creates 300 tasks for job with 64 files
- Category set to SDKs
Updated by Tim Pierce about 10 years ago
- Subject changed from one_task_per_input_file creates 300 tasks for job with 64 files to [SDKs] one_task_per_input_file creates 300 tasks for job with 64 files
Updated by Tim Pierce about 10 years ago
Sally reports that re-running this pipeline appears to be producing the expected results, so I recommend that we table this ticket for now and revisit it if we see it happen again.
Updated by Tom Clegg about 10 years ago
- Target version changed from Bug Triage to Arvados Future Sprints
Updated by Ward Vandewege about 10 years ago
Updated report from Sally:
- running this pipeline on 9tee4 works fine.
- re-running on qr1hi showed the same issue: qr1hi-d1hrv-j1ejo9zu0muvrw7
Updated by Sarah Guthrie about 10 years ago
Re-appeared on 9tee4 after updating the Docker image; the update re-installed the Python Arvados SDK. The change producing the errant behavior was the image update from e7134f to d11fe66:
sguthrie/numpy latest d11fe661b55e 9tee4-4zz18-yoqdj38cazjmf7l Mon Nov 17 23:51:41 2014
sguthrie/numpy latest e7134f3d7e0d 9tee4-4zz18-hbsoox45ftmw6r7 Thu Nov 13 23:40:07 2014
Problem appeared on pipeline instance: 9tee4-d1hrv-x9d8l2ntz3grvim
Updated by Tom Clegg about 10 years ago
I'm going out on a limb here and guessing that adding cr.normalize() before this part of one_task_per_input_file will fix it:
for s in cr.all_streams():
    for f in s.all_files():
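A minimal sketch of that change, assuming cr is the CollectionReader over the job input as above:

cr = arvados.CollectionReader(job_input)
cr.normalize()  # collapse duplicate stream/file entries before fanning out
for s in cr.all_streams():
    for f in s.all_files():
        pass  # create one task per (now unique) file, as before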
Updated by Sarah Guthrie about 10 years ago
Adding cr.normalize() fixed the bug:
https://workbench.qr1hi.arvadosapi.com/pipeline_instances/qr1hi-d1hrv-o0tfeiug897lht6
Updated by Tom Clegg about 10 years ago
- Target version changed from Arvados Future Sprints to 2014-12-10 sprint
Updated by Tom Clegg about 10 years ago
- Subject changed from [SDKs] one_task_per_input_file creates 300 tasks for job with 64 files to [SDKs] one_task_per_input_file should call normalize() before getting its list of files
Updated by Tim Pierce about 10 years ago
- Status changed from New to In Progress
Updated by Tim Pierce about 10 years ago
- Status changed from In Progress to Resolved
Applied in changeset arvados|commit:181e90a0934d0057202b92010a96139df934aaf1.