Feature #9275

[CWL] arvados-cwl-runner creates job components instead of pipeline instance

Added by Peter Amstutz about 3 years ago. Updated about 3 years ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Radhika Chippada
Category:
-
Target version:
Start date:
06/03/2016
Due date:
% Done:

100%

Estimated time:
(Total: 0.00 h)
Story points:
1.0

Subtasks

Task #9295: Review branch 9275-cwl-runner-creates-jobsResolvedPeter Amstutz

Associated revisions

Revision 6d940c5e
Added by Peter Amstutz about 3 years ago

Merge branch '9275-cwl-runner-creates-jobs' closes #9275

Revision b7db50d7 (diff)
Added by Peter Amstutz about 3 years ago

Bugfix submitting cwl jobs with arvados-cwl-runner refs #9275

History

#1 Updated by Peter Amstutz about 3 years ago

  • Release deleted (11)

#2 Updated by Brett Smith about 3 years ago

  • Target version set to 2016-06-08 sprint

#3 Updated by Brett Smith about 3 years ago

  • Assigned To set to Radhika Chippada

#4 Updated by Radhika Chippada about 3 years ago

  • Story points set to 1.0

#5 Updated by Radhika Chippada about 3 years ago

From our conversation earlier today: we would be running cwl work flows using:

  1. submit mode where runner job is submitted to server
  2. in local mode, where we are creating a pipeline instance and attaching the runner job to it
  3. from workbench, where a pipeline instance is created from the template

We also talked about work units / processes which would only be pipelines or containers. This seems to be in conflict with how we are handling the submit mode.

Should we be creating a pipeline instance in the case of submit as well?

Please clarify requirements.

#6 Updated by Peter Amstutz about 3 years ago

Radhika Chippada wrote:

From our conversation earlier today: we would be running cwl work flows using:

  1. submit mode where runner job is submitted to server
  2. in local mode, where we are creating a pipeline instance and attaching the runner job to it
  3. from workbench, where a pipeline instance is created from the template

We also talked about work units / processes which would only be pipelines or containers. This seems to be in conflict with how we are handling the submit mode.

Should we be creating a pipeline instance in the case of submit as well?

Please clarify requirements.

  1. In submit mode it should create a pipeline instance with a single component which is the cwl-runner job. the cwl-runner job should list its jobs as components of the cwl-runner job.
  2. in local mode, it should create a pipeline add the component jobs to it (the existing behavior remains unchanged)
  3. from workbench, there is nothing to do on the pipeline instance (it is created by workbench), and the cwl-runner job should list its jobs as components of the cwl-runner job.

#7 Updated by Radhika Chippada about 3 years ago

Branch 9275-cwl-runner-creates-jobs at 94d3f4ba addresses the comments from note 6.

#8 Updated by Peter Amstutz about 3 years ago

94d3f4ba358ac5fef765fb7574d6823ff8c335aa

It should create a pipeline instance when submit is true (case 1) or cwl_runner_job is not in kwargs (case 2). If cwl_runner_job is present we should assume there is already a pipeline instance.

For (1) and (3), it needs to update the 'components' field of Job record. That's missing from the branch. You can add this as an alternate behavior in ArvadosJob.update_pipeline_component. If self.arvrunner.pipeline is not None, keep the existing behavior. Otherwise, update the Job record corresponding to cwl_runner_job on API server and add to job["components"][self.name] = record["uuid"]

#9 Updated by Peter Amstutz about 3 years ago

In update_pipeline_component(self, record) starting on line 211:

Add if self.arvrunner.pipeline: because it will raise an exception when self.arvrunner.pipeline is None (it looks like that exception is being caught and discarded, but it will prevent it from getting to the 2nd part).

line 222:

                    components['component'+str(len(components))] = record["uuid"]

This should use self.name instead of 'component'+str(len(components))

Line 683:

        if kwargs.get("submit") or "cwl_runner_job" not in kwargs:

On further examination, think we can drop the kwargs.get("submit"), because the case where we don't want to create a pipeline is the case when "cwl_runner_job" in kwargs is True; it should never be the case for cwl_runner_job and submit to be True.

line 714:

                if "cwl_runner_job" in kwargs:
                    uuid = kwargs.get("cwl_runner_job").get('uuid')

This needs to be self.uuid = ...

#10 Updated by Radhika Chippada about 3 years ago

Make those corrections.

Regarding: use self.name instead of 'component'+str(len(components))

Is self.name guaranteed to be non-null? Otherwise, how about " self.name or 'component'+str(len(components)) "

Line 714: This needs to be self.uuid = ...

Ah. Thanks for noticing it.

#11 Updated by Radhika Chippada about 3 years ago

  • Status changed from New to In Progress

#12 Updated by Peter Amstutz about 3 years ago

  • Status changed from In Progress to Resolved
  • % Done changed from 0 to 100

Applied in changeset arvados|commit:6d940c5e6940a1dca97989c47e67c33d20a4d050.

Also available in: Atom PDF