Story #6265

[CWL] Contribute to the CWL specification

Added by Brett Smith over 6 years ago. Updated over 6 years ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
-
Target version:
Start date:
06/23/2015
Due date:
% Done:

100%

Estimated time:
(Total: 0.00 h)
Story points:
1.0

Subtasks

Task #6290: Review specificationResolvedPeter Amstutz

Task #6364: Process remaining draft 2 change requestsResolvedPeter Amstutz

History

#1 Updated by Brett Smith over 6 years ago

  • Target version changed from Bug Triage to 2015-07-08 sprint

#2 Updated by Peter Amstutz over 6 years ago

  • Assigned To set to Peter Amstutz

#3 Updated by Peter Amstutz over 6 years ago

CWL draft 2 is ready for review.

The specification is online at

https://common-workflow-language.github.io/draft-2/

It is generated from this file:

https://github.com/common-workflow-language/common-workflow-language/blob/master/schemas/draft-2/cwl-avro.yml

At this point I don't want to make any draft 2 technical changes, but any suggestions for copy editing or clarifications would be very helpful.

#4 Updated by Peter Amstutz over 6 years ago

  • Status changed from New to In Progress

#5 Updated by Brett Smith over 6 years ago

Peter Amstutz wrote:

CWL draft 2 is ready for review.

Trying to describe little copyedits in prose always seems frustrating to both parties, so I've submitted a pull request with a bunch of those. I kept that very limited to to clear fixes. Beyond that, here are points in the draft where improvement requires a little more judgment.

  • When there is a dependency, the workflow engine must execute the producer process first and wait for it to successfully complete and produce output before executing the dependent second process that will consume the output.
    I had to go over this sentence a couple of times to make sure I was following it right. Could it be simplified a bit, trimming some of the redundant description? Something like, "When there is a dependency, the workflow engine must execute the dependency process and wait for it to successfully produce output before executing the dependent process."
  • Otherwise add `prefix` and recursively add individual elements.
    I think "recursively" means to follow the same binding behavior for each item in the array. I start to wonder exactly where and when the prefix is getting added, though. If the prefix is "foo" and the binding is an array of strings ["bar", "baz"], should this be evaluated to "foofoobarfoobaz"? I assume not, but that seems like one possible interpretation of this wording. The same comment applies to the rule for object below this.
  • If provided an array, match all patterns in the array.
    I think this means "Find files that match any pattern in the array" (as opposed to "Find files that match all patterns in the array"). If so, I think saying that might be clearer.
  • Assign a sorting key for each leaf binding object by appending nested `position` fields together with the array index or map key of the data at each nesting level.
    An example here (as in step 1) might be helpful.
  • If the program does not include a path separator, search the `$PATH` variable in the runtime environment find the absolute path of the executable.
    Earlier, the specification says that the process' runtime environment must not be inherited from the parent. Are we searching the workflow runner's $PATH here, or the process' defined $PATH? Clarification might be helpful either way—if the latter, it would be nice to remind readers that $PATH is empty by default.
  • The input shall be an array consisting of exactly one entry for each input link.
    This is the only occurrence of "shall" in the document, so I wonder what it means.
  • May be skipped if `dockerPull` is specified, in which case the `dockerPull` image id will be used.
    I think it would be clearer if "will" was replaced with "may" or "must" here, as appropriate.
  • The DockerRequirement doc links to itself a lot. Referring to specific fields of the record might be more helpful in those instances. It also doesn't consistently distinguish between the case where it's specified as a hint vs. a requirement. For example, it says, "The platform must execute the tool in the container using `docker run` with the appropriate Docker image and tool command line," but I don't think this is true when the DockerRequirement is only a hint.
  • This is veering into technical territory, so it's probably a post-draft 2 thing, but I wonder how EnvironmentDef interacts with DockerRequirement. Should those variables be defined when docker run is invoked?

Thanks.

#6 Updated by Brett Smith over 6 years ago

  • Target version changed from 2015-07-08 sprint to 2015-07-22 sprint

#7 Updated by Peter Amstutz over 6 years ago

  • Status changed from In Progress to Resolved

#8 Updated by Brett Smith over 6 years ago

For the benefit of any future readers, most of the concerns above were addressed in https://github.com/common-workflow-language/common-workflow-language/commit/261d5c3f4fbe3320c352397a45ea4d1b280999a9

Also available in: Atom PDF