Project

General

Profile

Actions

Bug #3342

closed

[Crunch] Pipeline outputs not added to current project.

Added by Peter Amstutz over 9 years ago. Updated over 9 years ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
Crunch
Target version:
Story points:
1.0

Description

Replace the pipeline template component's output_is_persistent flag with an output_name attribute. Use the output_name to add a name link to the appropriate project (i.e., the one the pipeline instance belongs to).


Subtasks 1 (0 open1 closed)

Task #3708: Review 3342-pipeline-output-current-projectResolvedPeter Amstutz08/27/2014Actions
Actions #1

Updated by Peter Amstutz over 9 years ago

  • Subject changed from Pipeline outputs not added to current project. to [Crunch] Pipeline outputs not added to current project.
  • Category set to Crunch

The output files from a pipeline run are not added to any project. This is very inconvenient because in Workbench it is very difficult to manipulate collections that are not part of a project.

Actions #2

Updated by Radhika Chippada over 9 years ago

  • Target version set to 2014-08-27 Sprint
Actions #3

Updated by Radhika Chippada over 9 years ago

  • Story points set to 0.5
Actions #4

Updated by Brett Smith over 9 years ago

  • Assigned To set to Brett Smith
  • Story points changed from 0.5 to 1.0
Actions #5

Updated by Tom Clegg over 9 years ago

  • Description updated (diff)
Actions #6

Updated by Peter Amstutz over 9 years ago

  • Status changed from New to In Progress
  • Assigned To changed from Brett Smith to Peter Amstutz
Actions #7

Updated by Tim Pierce over 9 years ago

Reviewing at 7c8e8d37:

apps/workbench/app/models/pipeline_instance.rb:
  • Since each component has its own output_name field, different components of the pipeline may have different names -- does it really make sense for PipelineInstance.friendly_link_name to just pick one at random for the whole instance? I'm concerned especially that because PipelineInstance.components is a hash and not an array (i.e. not ordered), then friendly_link_name won't necessarily return consistent results for the same pipeline instance over time.
apps/workbench/app/views/pipeline_instances/_show_components_running.html.erb
  • I'm finding it hard to follow what the difference is between current_job and pj are -- I would have thought that each item produced by render_pipeline_jobs would be a Job, but apparently pj[:job] is the Job and pj is something else. The fact that we're replacing current_job[:output] with pj[:output_uuid] is puzzling. This isn't exactly a request - just an observation that it's hard to tell whether the change is doing the right thing.
sdk/cli/bin/crunch-job
  • If we no longer use the @stripped_manifest_lines or $stripped_manifest_text hack, can we get rid of that code entirely? That would be awesome.
Actions #8

Updated by Peter Amstutz over 9 years ago

Tim Pierce wrote:

Reviewing at 7c8e8d37:

apps/workbench/app/models/pipeline_instance.rb:
  • Since each component has its own output_name field, different components of the pipeline may have different names -- does it really make sense for PipelineInstance.friendly_link_name to just pick one at random for the whole instance? I'm concerned especially that because PipelineInstance.components is a hash and not an array (i.e. not ordered), then friendly_link_name won't necessarily return consistent results for the same pipeline instance over time.

I added this because I wanted the Collection page to be able to link back to the pipeline that produced the collection output, with sensible link text. That code is checking to see if the pipeline instance has a empty name, and if so, it instead returns the name of the template that the pipeline was generated from. It doesn't have anything to do with the addition of output_name.

There was a bug where it wasn't checking if anything was returned from PipelineTemplate.where(), that's fixed now.

apps/workbench/app/views/pipeline_instances/_show_components_running.html.erb
  • I'm finding it hard to follow what the difference is between current_job and pj are -- I would have thought that each item produced by render_pipeline_jobs would be a Job, but apparently pj[:job] is the Job and pj is something else. The fact that we're replacing current_job[:output] with pj[:output_uuid] is puzzling. This isn't exactly a request - just an observation that it's hard to tell whether the change is doing the right thing.

The job continues to store the manifest hash in output, however the pipeline instance component now stores output_uuid which is the collection created by arv-run-pipeline-instance. The change is to link to the actual collection that was created by the pipeline instance, rather than to the manifest hash which may be referenced by multiple jobs. Regarding pj, this code is a mess generally, hopefully it will be cleaned up in #3187

sdk/cli/bin/crunch-job
  • If we no longer use the @stripped_manifest_lines or $stripped_manifest_text hack, can we get rid of that code entirely? That would be awesome.

You are absolutely correct, since we don't need to provide portable_data_hash along with the manifest text any more, then we don't need to do the calculation to hash the stripped md5 sum either. Deleted.

Actions #9

Updated by Tim Pierce over 9 years ago

LGTM to merge. Thanks for this.

Actions #10

Updated by Anonymous over 9 years ago

  • Status changed from In Progress to Resolved
  • % Done changed from 0 to 100

Applied in changeset arvados|commit:0042e4b42d9f4d3900aefc68617cb28c5a61a522.

Actions

Also available in: Atom PDF