Bug #4499

[SDKs] one_task_per_input_file should call normalize() before getting its list of files

Added by Tim Pierce about 6 years ago. Updated almost 6 years ago.

Status: Resolved
Priority: Normal
Assigned To: Tim Pierce
Category: SDKs
Target version:
Start date: 12/09/2014
Due date:
% Done: 100%
Estimated time: (Total: 0.00 h)
Story points: 0.5

Description

sguthrie reports that qr1hi-d1hrv-3n1gezo6gondyrv, despite running with one_task_per_input_file, spawns 300 tasks on an input collection with 60 files in it:

Job qr1hi-8i9sb-zc7ivqn134e8czx takes input collection qr1hi-4zz18-zc07fkmhzhfzriz, which has 64 files.

This job runs addPopulationMemEff.py at commit 5299b46be, which specifies:

arvados.job_setup.one_task_per_input_file(if_sequence=0, and_end_task=True, input_as_path=True)

but the job produces nearly 300 tasks:

        "tasks_summary": {
          "done": 1,
          "running": 8,
          "failed": 0,
          "todo": 292
        },


Subtasks

Task #4755: Review 4499-one-task-per-input-file-normalize (Resolved, Tim Pierce)

Task #4619: call normalize before reporting file list (Resolved, Tim Pierce)

Associated revisions

Revision 181e90a0
Added by Tim Pierce almost 6 years ago

Merge branch '4499-one-task-per-input-file-normalize'

Fixes #4499.

History

#1 Updated by Tim Pierce about 6 years ago

  • Description updated (diff)

#2 Updated by Tim Pierce about 6 years ago

  • Description updated (diff)

It's interesting to me that this collection is kind of degenerate: each of the output files is produced from thousands or millions of chunks from the input blocks. The result is a collection with 60 streams in a 55MB manifest with records up to a million bytes long.

Not yet sure if this is hurting one_task_per_input_file but it seems worth keeping an eye on.

#3 Updated by Tim Pierce about 6 years ago

  • Subject changed from pipeline creates 300 tasks for job with 64 files to one_task_per_input_file creates 300 tasks for job with 64 files
  • Category set to SDKs

#4 Updated by Tim Pierce about 6 years ago

  • Subject changed from one_task_per_input_file creates 300 tasks for job with 64 files to [SDKs] one_task_per_input_file creates 300 tasks for job with 64 files

#5 Updated by Tim Pierce about 6 years ago

Sally reports that re-running this pipeline appears to be producing the expected results, so I recommend that we table this ticket for now and revisit it if we see it happen again.

#6 Updated by Tom Clegg about 6 years ago

  • Target version changed from Bug Triage to Arvados Future Sprints

#7 Updated by Tom Clegg about 6 years ago

  • Story points set to 0.5

#8 Updated by Ward Vandewege about 6 years ago

Updated report from Sally:
- running this pipeline on 9tee4 works fine.
- re-running on qr1hi showed the same issue: qr1hi-d1hrv-j1ejo9zu0muvrw7

#9 Updated by Sarah Guthrie about 6 years ago

Re-appeared on 9tee4 after updating the Docker image. The update re-installed the Python Arvados SDK. The change producing the errant behavior was from e7134f -> d11fe66:

sguthrie/numpy latest d11fe661b55e 9tee4-4zz18-yoqdj38cazjmf7l Mon Nov 17 23:51:41 2014
sguthrie/numpy latest e7134f3d7e0d 9tee4-4zz18-hbsoox45ftmw6r7 Thu Nov 13 23:40:07 2014

Problem appeared on pipeline instance: 9tee4-d1hrv-x9d8l2ntz3grvim

#10 Updated by Tom Clegg about 6 years ago

I'm going out on a limb here and guessing that adding cr.normalize() before this part of one_task_per_input_file will fix it:

        for s in cr.all_streams():
            for f in s.all_files():
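
To illustrate why normalizing first could matter, here is a minimal toy sketch. It does not use the Arvados SDK and is not its implementation. It assumes a simplified manifest in which each line is a stream name, then block locators, then `pos:size:filename` segments, and in which a non-normalized manifest may repeat a stream name or a file across lines. The function names and the sample manifest are hypothetical. Whether this is the exact mechanism behind the extra tasks in this ticket is an assumption.

```python
from collections import OrderedDict

# Hypothetical, simplified manifest: the same stream appears on several
# lines and some files are split across those lines (non-normalized).
MANIFEST = """\
. blk1+10 0:5:a.txt 5:5:b.txt
. blk2+10 0:5:a.txt 5:5:c.txt
./sub blk3+6 0:3:d.txt
./sub blk4+6 0:3:d.txt 3:3:e.txt
"""


def parse_segments(manifest_text):
    """Return a (stream_name, file_name) pair for every file segment token."""
    pairs = []
    for line in manifest_text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        stream = tokens[0]
        for tok in tokens[1:]:
            parts = tok.split(":", 2)
            # File segments look like pos:size:name; block locators do not.
            if len(parts) == 3 and parts[0].isdigit() and parts[1].isdigit():
                pairs.append((stream, parts[2]))
    return pairs


def files_without_normalize(manifest_text):
    """Walk streams and files as written: one entry per line per segment."""
    return parse_segments(manifest_text)


def files_with_normalize(manifest_text):
    """Merge repeated stream/file entries first, keeping first-seen order."""
    merged = OrderedDict()
    for pair in parse_segments(manifest_text):
        merged.setdefault(pair, None)
    return list(merged)


if __name__ == "__main__":
    print(len(files_without_normalize(MANIFEST)), "entries without normalizing")
    print(len(files_with_normalize(MANIFEST)), "entries after normalizing")
```

In this toy model, queuing one task per entry without merging would create more tasks than there are distinct files, which is the kind of mismatch reported above.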

#11 Updated by Sarah Guthrie about 6 years ago

#12 Updated by Tom Clegg about 6 years ago

  • Target version changed from Arvados Future Sprints to 2014-12-10 sprint

#13 Updated by Tim Pierce about 6 years ago

  • Assigned To set to Tim Pierce

#14 Updated by Tom Clegg about 6 years ago

  • Subject changed from [SDKs] one_task_per_input_file creates 300 tasks for job with 64 files to [SDKs] one_task_per_input_file should call normalize() before getting its list of files

#15 Updated by Tim Pierce almost 6 years ago

  • Status changed from New to In Progress

#16 Updated by Peter Amstutz almost 6 years ago

LGTM

#17 Updated by Tim Pierce almost 6 years ago

  • Status changed from In Progress to Resolved

Applied in changeset arvados|commit:181e90a0934d0057202b92010a96139df934aaf1.
