Story #4035

[Sample pipelines] Proof-of-concept support for common-workflow-language tool description in Arvados

Added by Peter Amstutz over 5 years ago. Updated about 3 years ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: Sample Pipelines
Target version: -
Start date: 11/13/2014
Due date:
% Done: 100%
Estimated time: (Total: 0.00 h)
Story points: 1.0

Description

The common-workflow-language working group is developing a tool description language and pipeline framework so that pipelines can be portable across platforms such as Galaxy, Seven Bridges, and Arvados.

The first step is to implement proof-of-concept support:
  1. Review and contribute as necessary to the reference implementation and specification documents
  2. Implement a runner crunch script (similar to, but distinct from, run-command) which uses the reference implementation, accepts the tool description language and job inputs, generates a command line, runs the tool, and collects the outputs.

This should not require any changes to Arvados features.
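
The runner described in step 2 can be sketched roughly as follows. This is only an illustration of the flow (accept a tool description and job inputs, generate a command line, run the tool, collect the outputs); it does not use the reference implementation or the Arvados SDK. The dictionary keys (`baseCommand`, `arguments`, `stdout`, `outputs`, `glob`) and the function names are assumptions made up for this sketch, not the actual tool description schema.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical, simplified tool description. The command is a portable
# stand-in for `cat`: it copies the file named by its first argument to stdout.
EXAMPLE_TOOL = {
    "baseCommand": [
        sys.executable,
        "-c",
        "import sys; sys.stdout.write(open(sys.argv[1]).read())",
    ],
    "arguments": [{"input": "file1"}],
    "stdout": "output.txt",
    "outputs": {"output": {"glob": "output.txt"}},
}


def build_command(tool, job):
    """Generate an argv list from the tool description and the job inputs."""
    argv = list(tool["baseCommand"])
    for arg in tool.get("arguments", []):
        if "prefix" in arg:
            argv.append(arg["prefix"])
        argv.append(str(job["inputs"][arg["input"]]))
    return argv


def run_tool(tool, job, workdir=None):
    """Run the tool in workdir and return {output name: [matching paths]}."""
    workdir = Path(workdir or tempfile.mkdtemp(prefix="cwl-sketch-"))
    argv = build_command(tool, job)
    # Capture stdout into the file named by the tool description.
    with open(workdir / tool.get("stdout", "stdout.txt"), "wb") as out:
        subprocess.run(argv, cwd=str(workdir), stdout=out, check=True)
    # Collect outputs: files in workdir matching each declared glob pattern.
    return {
        name: sorted(str(p) for p in workdir.glob(spec["glob"]))
        for name, spec in tool.get("outputs", {}).items()
    }
```

Calling `run_tool(EXAMPLE_TOOL, {"inputs": {"file1": "/path/to/input.txt"}})` would run the command in a fresh temporary directory and return the path of the captured `output.txt`. A real crunch script would additionally need to stage inputs from Keep and store outputs as a collection, which is out of scope for this sketch.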

cwltemplate.json (382 Bytes), Peter Amstutz, 11/18/2014 04:39 PM
cat1-tool.json (663 Bytes), Peter Amstutz, 11/18/2014 04:43 PM

Subtasks

Task #4511: Work on reference implementation (Resolved, Peter Amstutz)

Task #4513: Conformance test framework (Resolved, Peter Amstutz)

Task #4604: Sandboxing for javascript expressions (Resolved, Peter Amstutz)

Task #4382: Decide what to do (Resolved, Peter Amstutz)

Task #4512: Implement support in arv-run-pipeline-instance (Resolved, Peter Amstutz)

Task #4572: Automatically build and install cwltool python package (Resolved, Ward Vandewege)

Task #4575: Review 4035-pipeline-support-cwl-tool (Resolved, Peter Amstutz)


Related issues

Related to Arvados - Support #4564: [Documentation] Document using common-workflow-language tools with Arvados pipelines (Resolved, 12/08/2014)

Related to Arvados - Story #4687: [Crunch] Support Brad Chapman to port bcbio tools and workflows to CWL (Resolved, 01/12/2015)

History

#1 Updated by Peter Amstutz over 5 years ago

  • Description updated (diff)
  • Story points set to 3.0

#2 Updated by Tom Clegg over 5 years ago

  • Subject changed from Proof-of-concept support for common-workflow-language tool description in Arvados to [DRAFT] [Sample pipelines] Proof-of-concept support for common-workflow-language tool description in Arvados

#3 Updated by Tom Clegg over 5 years ago

  • Target version set to Arvados Future Sprints

#4 Updated by Tom Clegg over 5 years ago

  • Subject changed from [DRAFT] [Sample pipelines] Proof-of-concept support for common-workflow-language tool description in Arvados to [Sample pipelines] Proof-of-concept support for common-workflow-language tool description in Arvados
  • Description updated (diff)
  • Category set to Sample Pipelines

#5 Updated by Ward Vandewege over 5 years ago

  • Target version changed from Arvados Future Sprints to 2014-11-19 sprint

#6 Updated by Ward Vandewege over 5 years ago

  • Assigned To set to Peter Amstutz

#7 Updated by Peter Amstutz over 5 years ago

  • Status changed from New to In Progress

#8 Updated by Peter Amstutz over 5 years ago

  • Description updated (diff)

#9 Updated by Peter Amstutz over 5 years ago

To test:

$ git clone https://github.com/rabix/common-workflow-language.git
$ cd common-workflow-language/reference
$ easy_install .
$ arv-put cat1-tool.json
$ arv-run-pipeline-instance --template cwltemplate.json --run-jobs-here

#10 Updated by Peter Amstutz over 5 years ago

  • Target version changed from 2014-11-19 sprint to 2014-12-10 sprint

#11 Updated by Peter Amstutz over 5 years ago

  • Story points changed from 3.0 to 1.0

#12 Updated by Tim Pierce over 5 years ago

First pass on review:

The rabix/experiments repo seems to duplicate a lot of the work in the common-workflow-language repository, but it is less thorough and hasn't been updated since August. Does common-workflow-language supersede rabix/experiments?

In the common-workflow-language repo:

I could not get the reference implementation to pass its unit tests:

hitchcock:/home/twp/common-workflow-language/reference% python setup.py test
running test
Searching for jsonschema>=2.4.0
Reading https://pypi.python.org/simple/jsonschema/
Best match: jsonschema 2.4.0
Downloading https://pypi.python.org/packages/source/j/jsonschema/jsonschema-2.4.0.zip#md5=f645c88123189976058fcf550c02e50f
Processing jsonschema-2.4.0.zip
Writing /tmp/easy_install-xZWzjY/jsonschema-2.4.0/setup.cfg
Running jsonschema-2.4.0/setup.py -q bdist_egg --dist-dir /tmp/easy_install-xZWzjY/jsonschema-2.4.0/egg-dist-tmp-DwEYCO
zip_safe flag not set; analyzing archive contents...
jsonschema.tests.test_jsonschema_test_suite: module references __file__

Installed /home/twp/common-workflow-language/reference/jsonschema-2.4.0-py2.7.egg
running egg_info
writing requirements to cwltool.egg-info/requires.txt
writing cwltool.egg-info/PKG-INFO
writing top-level names to cwltool.egg-info/top_level.txt
writing dependency_links to cwltool.egg-info/dependency_links.txt
writing entry points to cwltool.egg-info/entry_points.txt
reading manifest file 'cwltool.egg-info/SOURCES.txt'
writing manifest file 'cwltool.egg-info/SOURCES.txt'
running build_ext
test_job_order (tests.test_examples.TestExamples) ... ERROR

======================================================================
ERROR: test_job_order (tests.test_examples.TestExamples)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/twp/common-workflow-language/reference/tests/test_examples.py", line 8, in test_job_order
    job = t.job(from_url("../examples/bwa-mem-job.json"))
TypeError: job() takes at least 3 arguments (2 given)

----------------------------------------------------------------------
Ran 1 test in 0.129s

FAILED (errors=1)

The conformance suite also fails: each test fails with "No module named jsonschema.exceptions". I tried installing the reference implementation into a virtualenv and activating that, but then each test failed with "No module named requests".
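
Both import errors are consistent with the tests running under an interpreter that does not see the installed packages. One way to narrow that down is to report which interpreter is active and whether each module imports. This is a minimal sketch written for Python 3 (the log above is from Python 2.7); the module names come from the error messages, and the helper names `check_modules` and `report` are hypothetical.

```python
import importlib
import sys


def check_modules(names):
    """Return {module name: None if importable, else the ImportError text}."""
    results = {}
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError as exc:
            results[name] = str(exc)
        else:
            results[name] = None
    return results


def report(names):
    """Print the active interpreter and the import status of each module."""
    print("interpreter:", sys.executable)
    print("prefix:     ", sys.prefix)
    for name, error in check_modules(names).items():
        print("%-24s %s" % (name, "ok" if error is None else "MISSING: " + error))


if __name__ == "__main__":
    report(["jsonschema", "jsonschema.exceptions", "requests"])
```

Running this once inside the virtualenv and once outside it would show whether the two environments simply have different subsets of the dependencies installed.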

I realize that the schema and documentation are in flux, but these things jumped out at me on a first reading:

  • The tool schema #requirements/resources doesn't specify units for diskspace or mem. This probably can't be enforced at the schema level but needs to be documented formally.
  • examples/cat2-tool.json refers to a #inputs/file1/path parameter that isn't present.
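
Dangling references like the one in the second point could be caught mechanically. The sketch below assumes that a reference such as `#inputs/file1/path` is meant to be resolved by walking nested keys of the tool document from its root; that resolution rule, the document layout, and the function names are assumptions for illustration, not something taken from the specification.

```python
def resolve_fragment(doc, ref):
    """Follow a '#a/b/c' reference through nested dicts; raise KeyError if absent."""
    if not ref.startswith("#"):
        raise ValueError("not a fragment reference: %r" % ref)
    node = doc
    for part in ref[1:].strip("/").split("/"):
        if isinstance(node, dict) and part in node:
            node = node[part]
        else:
            raise KeyError(ref)
    return node


def find_dangling(doc, refs):
    """Return the subset of refs that do not resolve within doc."""
    dangling = []
    for ref in refs:
        try:
            resolve_fragment(doc, ref)
        except KeyError:
            dangling.append(ref)
    return dangling
```

Given a document that declares only `file2` under `inputs`, `find_dangling(doc, ["#inputs/file1/path"])` returns the unresolved reference, which is the kind of mismatch reported for examples/cat2-tool.json.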

#13 Updated by Peter Amstutz over 5 years ago

  • Target version changed from 2014-12-10 sprint to 2015-01-07 sprint

#14 Updated by Peter Amstutz over 5 years ago

  • Target version changed from 2015-01-07 sprint to Arvados Future Sprints

#15 Updated by Peter Amstutz almost 4 years ago

  • Status changed from In Progress to Resolved

#16 Updated by Tom Clegg about 3 years ago

  • Target version deleted (Arvados Future Sprints)
