Idea #4035

closed

[Sample pipelines] Proof-of-concept support for common-workflow-language tool description in Arvados

Added by Peter Amstutz over 9 years ago. Updated about 7 years ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: Sample Pipelines
Target version: -
Start date: 11/13/2014
Due date:
Story points: 1.0

Description

The common-workflow-language working group is developing a tool description language and pipeline framework so that pipelines can be ported across platforms such as Galaxy, Seven Bridges, and Arvados.

The first step is to implement proof of concept support:
  1. Review and contribute as necessary to the reference implementation and specification documents
  2. Implement a runner crunch script (similar to, but distinct from, run-command) which uses the reference implementation, accepts the tool description language and job inputs, generates a command line, runs the tool, and collects the outputs.

This should not require any changes to Arvados features.
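
As a rough illustration of step 2, the sketch below shows the general shape of such a runner in Python: it accepts a tool description and a job input object, generates a command line, runs the tool, and collects the outputs. The dictionary keys (`baseCommand`, `arguments`, `stdout`, `outputs`) and the function names `build_command` and `run_tool` are assumptions made for this example; they are not the actual tool description schema or the reference implementation's API.

```python
import glob
import os
import subprocess


def build_command(tool, job):
    """Generate argv from a simplified, hypothetical tool description and job inputs."""
    argv = list(tool["baseCommand"])
    # Order arguments by their declared position so the command line is deterministic.
    for arg in sorted(tool.get("arguments", []), key=lambda a: a["position"]):
        if "prefix" in arg:
            argv.append(arg["prefix"])
        argv.append(str(job[arg["input"]]))
    return argv


def run_tool(tool, job, outdir):
    """Run the tool with outdir as its working directory and collect output files."""
    argv = build_command(tool, job)
    stdout_name = tool.get("stdout")
    if stdout_name:
        # Redirect the tool's stdout into a file inside the output directory.
        with open(os.path.join(outdir, stdout_name), "wb") as out:
            subprocess.check_call(argv, cwd=outdir, stdout=out)
    else:
        subprocess.check_call(argv, cwd=outdir)
    # Collect outputs by matching each declared glob pattern against outdir.
    return {
        name: sorted(glob.glob(os.path.join(outdir, pattern)))
        for name, pattern in tool.get("outputs", {}).items()
    }
```

A real crunch script would delegate the command-line generation to the reference implementation rather than reimplement it; the point here is only the accept, generate, run, collect sequence.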


Files

cwltemplate.json (382 Bytes), Peter Amstutz, 11/18/2014 04:39 PM
cat1-tool.json (663 Bytes), Peter Amstutz, 11/18/2014 04:43 PM

Subtasks: 7 (0 open, 7 closed)

Task #4511: Work on reference implementation (Resolved, Peter Amstutz, 11/13/2014)
Task #4513: Conformance test framework (Resolved, Peter Amstutz, 11/13/2014)
Task #4604: Sandboxing for javascript expressions (Resolved, Peter Amstutz, 11/13/2014)
Task #4382: Decide what to do (Resolved, Peter Amstutz, 11/13/2014)
Task #4512: Implement support in arv-run-pipeline-instance (Resolved, Peter Amstutz, 11/13/2014)
Task #4572: Automatically build and install cwltool python package (Resolved, Ward Vandewege, 11/13/2014)
Task #4575: Review 4035-pipeline-support-cwl-tool (Resolved, Peter Amstutz, 11/13/2014)

Related issues

Related to Arvados - Support #4564: [Documentation] Document using common-workflow-language tools with Arvados pipelines (Resolved, Peter Amstutz, 12/08/2014)
Related to Arvados - Idea #4687: [Crunch] Support Brad Chapman to port bcbio tools and workflows to CWL (Resolved, Peter Amstutz, 01/12/2015)
Actions #1

Updated by Peter Amstutz over 9 years ago

  • Description updated (diff)
  • Story points set to 3.0
Actions #2

Updated by Tom Clegg over 9 years ago

  • Subject changed from Proof-of-concept support for common-workflow-language tool description in Arvados to [DRAFT] [Sample pipelines] Proof-of-concept support for common-workflow-language tool description in Arvados
Actions #3

Updated by Tom Clegg over 9 years ago

  • Target version set to Arvados Future Sprints
Actions #4

Updated by Tom Clegg over 9 years ago

  • Subject changed from [DRAFT] [Sample pipelines] Proof-of-concept support for common-workflow-language tool description in Arvados to [Sample pipelines] Proof-of-concept support for common-workflow-language tool description in Arvados
  • Description updated (diff)
  • Category set to Sample Pipelines
Actions #5

Updated by Ward Vandewege over 9 years ago

  • Target version changed from Arvados Future Sprints to 2014-11-19 sprint
Actions #6

Updated by Ward Vandewege over 9 years ago

  • Assigned To set to Peter Amstutz
Actions #7

Updated by Peter Amstutz over 9 years ago

  • Status changed from New to In Progress
Actions #8

Updated by Peter Amstutz over 9 years ago

  • Description updated (diff)
Actions #9

Updated by Peter Amstutz over 9 years ago

To test:

$ git clone https://github.com/rabix/common-workflow-language.git
$ cd common-workflow-language/reference
$ easy_install .
$ arv-put cat1-tool.json
$ arv-run-pipeline-instance --template cwltemplate.json --run-jobs-here
Actions #10

Updated by Peter Amstutz over 9 years ago

  • Target version changed from 2014-11-19 sprint to 2014-12-10 sprint
Actions #11

Updated by Peter Amstutz over 9 years ago

  • Story points changed from 3.0 to 1.0
Actions #12

Updated by Tim Pierce over 9 years ago

First pass on review:

The rabix/experiments repo seems to duplicate a lot of the work in the common-workflow-language repository, but less thoroughly, and it hasn't been updated since August. Does common-workflow-language supersede rabix/experiments?

In the common-workflow-language repo:

I could not get the reference implementation to pass its unit tests:

hitchcock:/home/twp/common-workflow-language/reference% python setup.py test
running test
Searching for jsonschema>=2.4.0
Reading https://pypi.python.org/simple/jsonschema/
Best match: jsonschema 2.4.0
Downloading https://pypi.python.org/packages/source/j/jsonschema/jsonschema-2.4.0.zip#md5=f645c88123189976058fcf550c02e50f
Processing jsonschema-2.4.0.zip
Writing /tmp/easy_install-xZWzjY/jsonschema-2.4.0/setup.cfg
Running jsonschema-2.4.0/setup.py -q bdist_egg --dist-dir /tmp/easy_install-xZWzjY/jsonschema-2.4.0/egg-dist-tmp-DwEYCO
zip_safe flag not set; analyzing archive contents...
jsonschema.tests.test_jsonschema_test_suite: module references __file__

Installed /home/twp/common-workflow-language/reference/jsonschema-2.4.0-py2.7.egg
running egg_info
writing requirements to cwltool.egg-info/requires.txt
writing cwltool.egg-info/PKG-INFO
writing top-level names to cwltool.egg-info/top_level.txt
writing dependency_links to cwltool.egg-info/dependency_links.txt
writing entry points to cwltool.egg-info/entry_points.txt
reading manifest file 'cwltool.egg-info/SOURCES.txt'
writing manifest file 'cwltool.egg-info/SOURCES.txt'
running build_ext
test_job_order (tests.test_examples.TestExamples) ... ERROR

======================================================================
ERROR: test_job_order (tests.test_examples.TestExamples)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/twp/common-workflow-language/reference/tests/test_examples.py", line 8, in test_job_order
    job = t.job(from_url("../examples/bwa-mem-job.json"))
TypeError: job() takes at least 3 arguments (2 given)

----------------------------------------------------------------------
Ran 1 test in 0.129s

FAILED (errors=1)

The conformance suite also fails: each test fails with "No module named jsonschema.exceptions". I tried installing the reference implementation into a virtualenv and activating that, but then each test failed with "No module named requests".
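
One way to narrow down failures like these is to check which dependencies the active interpreter can actually import. The following is a minimal sketch, assuming Python 3 and using the top-level module names from the two error messages above; `missing_modules` is a hypothetical helper, not part of cwltool.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found on sys.path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Module names taken from the two error messages reported above.
    print(missing_modules(["jsonschema", "requests"]))
```

Running it once outside and once inside the virtualenv would show whether each environment is missing a different dependency.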

I realize that the schema and documentation are in flux, but these things jumped out at me on a first reading:

  • The tool schema #requirements/resources doesn't specify units for diskspace or mem. This probably can't be enforced at the schema level but needs to be documented formally.
  • examples/cat2-tool.json refers to a #inputs/file1/path parameter that isn't present.
Actions #13

Updated by Peter Amstutz over 9 years ago

  • Target version changed from 2014-12-10 sprint to 2015-01-07 sprint
Actions #14

Updated by Peter Amstutz over 9 years ago

  • Target version changed from 2015-01-07 sprint to Arvados Future Sprints
Actions #15

Updated by Peter Amstutz almost 8 years ago

  • Status changed from In Progress to Resolved
Actions #16

Updated by Tom Clegg about 7 years ago

  • Target version deleted (Arvados Future Sprints)