Feature #17054

open

Custom naming for scatter steps

Added by Peter Amstutz over 3 years ago. Updated about 2 months ago.

Status: In Progress
Priority: Normal
Assigned To: -
Category: CWL
Target version:
Story points: -
Release:
Release relationship: Auto

Description

Add an extension to cwltool that lets the user provide an expression that determines the runtime name of a workflow step or scatter step. When a new cwltool is released, update the cwltool dependency in arvados-cwl-runner.

Suggested approach

1. Add a new process requirement to cwltool/extensions-v1.1.yml

- name: StepNameHint
  type: record
  inVocab: false
  extends: cwl:ProcessRequirement
  doc: |
    Provide a hint for naming the runtime workflow step in logs or user interface.
  fields:
    - name: class
      type: string
      doc: "Always 'StepNameHint'" 
      jsonldPredicate:
        "_id": "@type" 
        "_type": "@vocab" 
    - name: stepname
      type: [string, Expression]
      doc: |
        A string or expression returning a string with the preferred name for the step.  
        If it is an expression, it is evaluated after the input object has been completely determined.

2. Update supportedProcessRequirements

Add "http://commonwl.org/cwltool#StepNameHint" to process.py supportedProcessRequirements

3. Update setup_schema() in main.py

use_custom_schema("v1.2", "http://commonwl.org/cwltool", ext11)

Also add this to the "else" branch:
use_standard_schema("v1.2")

4. Update WorkflowJobStep in workflow_job.py

Add code to the job() method that

  1. checks if the current workflow step has "http://commonwl.org/cwltool#StepNameHint" in "hints" or "requirements"
  2. If so, gets the value of "stepname"
  3. Then does self.name = expression.do_eval(stepname)
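
The three sub-steps above can be sketched in isolation as follows. This is a minimal illustration on plain dicts, not cwltool code: `find_step_name_hint`, `resolve_step_name`, and the `evaluate` callback are hypothetical stand-ins. In cwltool the evaluation would go through `expression.do_eval`, and `fake_evaluate` below only understands the simplest `$(inputs.<name>)` form.

```python
from typing import Any, Callable, Dict, Optional

STEP_NAME_HINT = "http://commonwl.org/cwltool#StepNameHint"


def find_step_name_hint(step: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return the StepNameHint entry from "requirements" or "hints", if any."""
    for section in ("requirements", "hints"):
        for entry in step.get(section, []):
            if entry.get("class") == STEP_NAME_HINT:
                return entry
    return None


def resolve_step_name(
    step: Dict[str, Any],
    inputs: Dict[str, Any],
    default_name: str,
    evaluate: Callable[[str, Dict[str, Any]], Any],
) -> str:
    """Use the evaluated "stepname" when the hint is present, else the default."""
    hint = find_step_name_hint(step)
    if hint is None:
        return default_name
    return str(evaluate(hint["stepname"], inputs))


def fake_evaluate(expr: str, inputs: Dict[str, Any]) -> Any:
    """Stand-in evaluator: handles "$(inputs.<name>)" and plain strings only."""
    prefix = "$(inputs."
    if expr.startswith(prefix) and expr.endswith(")"):
        return inputs[expr[len(prefix):-1]]
    return expr


step = {"hints": [{"class": STEP_NAME_HINT, "stepname": "$(inputs.text)"}]}
print(resolve_step_name(step, {"text": "a"}, "echo", fake_evaluate))  # a
print(resolve_step_name({}, {"text": "a"}, "echo", fake_evaluate))    # echo
```

The fallback to the default name matters because most steps will not carry the hint at all.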

5. Add tests

Write a workflow that uses the new hint with an expression that takes something from the input to set the name of the workflow step.

Write a test case that calls cwltool --enable-ext and checks that the log output uses the custom name.
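
The log check in such a test could look roughly like the sketch below. It assumes that each step start is logged in a "[<name>] start" form. The helper name `started_step_names` and the sample lines are made up for illustration and are not captured cwltool output; a real test would read the log of an actual `cwltool --enable-ext` run instead.

```python
import re
from typing import Iterable, List

# Matches "[<name>] start" anywhere in a log line (assumed format).
START_LINE = re.compile(r"\[(?P<name>[^\]]+)\] start")


def started_step_names(log_lines: Iterable[str]) -> List[str]:
    """Collect the names that appear in "[<name>] start" lines, in order."""
    names = []
    for line in log_lines:
        match = START_LINE.search(line)
        if match:
            names.append(match.group("name"))
    return names


# Synthetic log lines for three scattered jobs, plus one unrelated line.
sample_log = ["INFO [%s] start" % name for name in ("test_a", "test_b", "test_c")]
sample_log.append("INFO unrelated message")

print(started_step_names(sample_log))  # ['test_a', 'test_b', 'test_c']
```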


Files

errormsg.txt (97.9 KB) Jiayong Li, 06/08/2021 08:42 PM

Subtasks 1 (1 open, 0 closed)

Task #17456: Review (New, Peter Amstutz)

Related issues

Related to Arvados Epics - Idea #20273: More CWL runner improvements (New)
Actions #1

Updated by Peter Amstutz over 3 years ago

  • Related to Idea #16011: CWL support, docs, training, website added
Actions #2

Updated by Nico César over 3 years ago

  • Related to Feature #16462: Expand arvados-controller to expose forecast features added
Actions #3

Updated by Peter Amstutz about 3 years ago

  • Target version set to 2021-03-31 sprint
  • Assigned To set to Jiayong Li
Actions #4

Updated by Peter Amstutz about 3 years ago

  • Description updated (diff)
Actions #5

Updated by Peter Amstutz about 3 years ago

  • Description updated (diff)
Actions #6

Updated by Peter Amstutz about 3 years ago

  • Target version changed from 2021-03-31 sprint to 2021-04-14 sprint
Actions #7

Updated by Peter Amstutz about 3 years ago

  • Target version changed from 2021-04-14 sprint to 2021-04-28 bughunt sprint
Actions #8

Updated by Peter Amstutz about 3 years ago

  • Target version deleted (2021-04-28 bughunt sprint)
Actions #9

Updated by Peter Amstutz about 3 years ago

  • Target version set to 2021-04-28 bughunt sprint
Actions #10

Updated by Jiayong Li about 3 years ago

  • Status changed from New to In Progress
Actions #11

Updated by Peter Amstutz about 3 years ago

  • Target version changed from 2021-04-28 bughunt sprint to 2021-05-12 sprint
Actions #12

Updated by Jiayong Li almost 3 years ago

Working plan for changes

    def job(
        self,
        joborder: CWLObjectType,
        output_callback: Optional[OutputCallbackType],
        runtimeContext: RuntimeContext,
    ) -> JobsGeneratorType:
        runtimeContext = runtimeContext.copy()
        runtimeContext.part_of = self.name

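        # Use the custom name from StepNameHint when present; otherwise fall
        # back to the default short name.
        for hint in self.step["hints"]:
            if hint["class"] == "http://commonwl.org/cwltool#StepNameHint":
                runtimeContext.name = expression.do_eval(hint["stepname"])
                break
        else:
            # for/else: this branch runs only when the loop ends without break.
            runtimeContext.name = shortname(self.id)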

        _logger.info("[%s] start", self.name)

        yield from self.step.job(joborder, output_callback, runtimeContext)
Actions #13

Updated by Peter Amstutz almost 3 years ago

  • Target version changed from 2021-05-12 sprint to 2021-05-26 sprint
Actions #14

Updated by Peter Amstutz almost 3 years ago

  • Target version changed from 2021-05-26 sprint to 2021-06-09 sprint
Actions #15

Updated by Jiayong Li almost 3 years ago

My test command line tool, workflow, and input yml are

echo.cwl

cwlVersion: v1.1
class: CommandLineTool
inputs:
  text:
    type: string
    inputBinding: {}
outputs: []
baseCommand: echo

scatter-echo-wf.cwl

$namespaces:
  cwltool: "http://commonwl.org/cwltool#" 
cwlVersion: v1.1
class: Workflow
requirements:
  ScatterFeatureRequirement: {}
  InlineJavascriptRequirement: {}

inputs:
  texts:
    type: string[]

outputs: []

steps:
  echo:
    run: echo.cwl
    scatter: text
    hints:
      cwltool:StepNameHint:
        stepname: $(inputs.text)
    in:
      text: texts
    out: []

scatter-echo-wf.yml

texts: ["a", "b", "c"]

The code I replaced in L72 of workflow_job.py is

runtimeContext.name = expression.do_eval(hint["stepname"], joborder, self.step.requirements, None, None, {})

I'm getting an error when I run "cwltool scatter-echo-wf.cwl scatter-echo-wf.yml".

Traceback (most recent call last):
  File "/home/jiayong/Code/env/cwltool/lib/python3.6/site-packages/cwltool/sandboxjs.py", line 332, in execjs
    return cast(CWLOutputType, json.loads(stdout))
  File "/usr/lib/python3.6/json/__init__.py", line 354, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.6/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.6/json/decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/jiayong/Code/env/cwltool/lib/python3.6/site-packages/cwltool/expression.py", line 414, in do_eval
    else 2,
  File "/home/jiayong/Code/env/cwltool/lib/python3.6/site-packages/cwltool/expression.py", line 308, in interpolate
    js_console=js_console,
  File "/home/jiayong/Code/env/cwltool/lib/python3.6/site-packages/cwltool/expression.py", line 243, in evaluator
    js_console=js_console,
  File "/home/jiayong/Code/env/cwltool/lib/python3.6/site-packages/cwltool/sandboxjs.py", line 338, in execjs
    ) from err
cwltool.sandboxjs.JavascriptException: Expecting value: line 1 column 1 (char 0)
script was:
01 "use strict";
02 var inputs = {
03     "file:///home/jiayong/Code/work_scripts/cwl/echo/scatter-echo-wf.cwl#echo/text": "a" 
04 };
05 var self = null;
06 var runtime = {
07     "tmpdir": null,
08     "outdir": null
09 };
10 (function(){return ((inputs.text));})()
stdout was: 'undefined'
stderr was: ''

The three key arguments hint["stepname"], joborder, self.step.requirements are as follows

hint["stepname"]: $(inputs.text)
joborder: {'file:///home/jiayong/Code/work_scripts/cwl/echo/scatter-echo-wf.cwl#echo/text': 'a'}
self.step.requirements: [ordereddict([('class', 'InlineJavascriptRequirement')]), ordereddict([('class', 'ScatterFeatureRequirement')])]

Any idea what went wrong there?
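
One possible reading of the traceback, offered as an assumption rather than a confirmed diagnosis: the generated script defines `inputs` with the full identifier ending in `#echo/text` as its only key, so `inputs.text` evaluates to `undefined`, and `undefined` cannot be parsed as JSON. If that is the cause, the input object would need short keys before it is passed to the evaluator. The sketch below shows the idea on plain dicts; `short_key` and `shorten_keys` are hypothetical helpers written for this example (cwltool's own `shortname`, used in the working plan, serves a similar purpose).

```python
from typing import Any, Dict


def short_key(identifier: str) -> str:
    """Return the last path segment of the fragment part of an identifier."""
    fragment = identifier.split("#", 1)[-1]
    return fragment.rsplit("/", 1)[-1]


def shorten_keys(joborder: Dict[str, Any]) -> Dict[str, Any]:
    """Re-key an input object so expressions can refer to inputs.<short name>."""
    return {short_key(key): value for key, value in joborder.items()}


joborder = {
    "file:///home/jiayong/Code/work_scripts/cwl/echo/scatter-echo-wf.cwl#echo/text": "a"
}
print(shorten_keys(joborder))  # {'text': 'a'}
```

With the re-keyed object, `inputs.text` would resolve to "a" instead of `undefined`.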

Actions #16

Updated by Jiayong Li almost 3 years ago

Now the problem is that cwltool runs without --enable-ext, but errors out when the flag is turned on.

echo.cwl

cwlVersion: v1.1
class: CommandLineTool
inputs:
  text:
    type: string
    inputBinding: {}
outputs: []
baseCommand: echo

scatter-echo-wf.cwl

$namespaces:
  cwltool: "http://commonwl.org/cwltool#" 
cwlVersion: v1.1
class: Workflow
requirements:
  ScatterFeatureRequirement: {}
  InlineJavascriptRequirement: {}

inputs:
  texts:
    type: string[]

outputs: []

steps:
  echo:
    run: echo.cwl
    scatter: text
    hints:
      cwltool:StepNameHint:
        stepname: $("test_" + inputs.text.split('.')[0])
    in:
      text: texts
    out: []

scatter-echo-wf.yml

texts: ["a.vcf", "b.vcf", "c.vcf"]
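
For reference, the names this expression is meant to produce for the three inputs can be worked out by hand. The Python below mirrors the JavaScript `"test_" + inputs.text.split('.')[0]`; `expected_step_name` is a hypothetical helper for illustration and is not part of cwltool.

```python
from typing import List


def expected_step_name(text: str) -> str:
    """Mirror of the JavaScript: "test_" + text.split('.')[0]."""
    return "test_" + text.split(".")[0]


texts: List[str] = ["a.vcf", "b.vcf", "c.vcf"]
print([expected_step_name(t) for t in texts])  # ['test_a', 'test_b', 'test_c']
```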

Error message:

$ cwltool --enable-ext scatter-echo-wf.cwl scatter-echo-wf.yml 
INFO /home/jiayong/Code/env/cwltool-jiayong/bin/cwltool 3.1.20210511185845
INFO Resolved 'scatter-echo-wf.cwl' to 'file:///home/jiayong/Code/work_scripts/cwl/echo/scatter-echo-wf.cwl'
ERROR Tool definition failed validation:
http://commonwl.org/cwltool:68:3: checking object `http://commonwl.org/cwltool#StepNameHint`
http://commonwl.org/cwltool:74:3:   checking field `fields`
http://commonwl.org/cwltool:81:7:     checking object
                                      `http://commonwl.org/cwltool#StepNameHint/stepname`
                                        Field `type` references unknown identifier `Expression`,
                                        tried http://commonwl.org/cwltool#Expression

I figured out the failed validation is coming from the appended section in extensions-v1.1.yml

- name: StepNameHint
  type: record
  inVocab: false
  extends: cwl:ProcessRequirement
  doc: |
    Provide a hint for naming the runtime workflow step in logs or user interface.
  fields:
    - name: class
      type: string
      doc: "Always 'StepNameHint'" 
      jsonldPredicate:
        "_id": "@type" 
        "_type": "@vocab" 
    - name: stepname
      type: [string, Expression]
      doc: |
        A string or expression returning a string with the preferred name for the step.
        If it is an expression, it is evaluated after the input object has been completely determined.

I'm not sure how I should write the hint differently so the tool definition validation will pass.

Actions #17

Updated by Jiayong Li almost 3 years ago

Attached error message from running cwltool --enable-ext scatter-echo-wf.cwl scatter-echo-wf.yml with the following changes to extensions-v1.1.yml

- name: StepNameHint
  type: record
  inVocab: false
  extends: cwl:ProcessRequirement
  doc: |
    Provide a hint for naming the runtime workflow step in logs or user interface.
  fields:
    - name: class
      type: string
      doc: "Always 'StepNameHint'" 
      jsonldPredicate:
        "_id": "@type" 
        "_type": "@vocab" 
    - name: stepname
      type: [string, cwl:Expression]
      doc: |
        A string or expression returning a string with the preferred name for the step.
        If it is an expression, it is evaluated after the input object has been completely determined.
Actions #18

Updated by Peter Amstutz almost 3 years ago

  • Target version changed from 2021-06-09 sprint to 2021-06-23 sprint
Actions #19

Updated by Peter Amstutz almost 3 years ago

  • Target version changed from 2021-06-23 sprint to 2021-07-07 sprint
Actions #20

Updated by Peter Amstutz almost 3 years ago

Jiayong,

If you merge with the latest cwltool, the reference to cwl:Expression in extensions-v1.1.yml should no longer produce an error.

Actions #21

Updated by Peter Amstutz almost 3 years ago

  • Related to Idea #17848: CWL runner improvements added
Actions #22

Updated by Peter Amstutz almost 3 years ago

  • Related to deleted (Feature #16462: Expand arvados-controller to expose forecast features)
Actions #23

Updated by Peter Amstutz almost 3 years ago

  • Related to deleted (Idea #16011: CWL support, docs, training, website)
Actions #24

Updated by Peter Amstutz almost 3 years ago

  • Target version changed from 2021-07-07 sprint to 2021-07-21 sprint
Actions #25

Updated by Peter Amstutz almost 3 years ago

  • Target version changed from 2021-07-21 sprint to 2021-08-04 sprint
Actions #26

Updated by Jiayong Li over 2 years ago

1. In https://dev.arvados.org/issues/17054#4-Update-WorkflowJobStep-in-workflow_jobpy, you mentioned "checks if the current workflow step has "http://commonwl.org/cwltool#StepNameHint" in "hints" or "requirements"". Right now I'm only checking this under "hints", since it's called "StepNameHint", also the doc field says "provide a hint". Should I expect this to appear under "requirements" as well?

2. I wrote a unit test for custom naming, and it passed. However, some other tests have failed even though I made no changes for them.

=========================== short test summary info ============================
FAILED tests/test_context.py::test_replace_default_stdout_stderr - cwltool.er...
FAILED tests/test_examples.py::test_factory - cwltool.errors.WorkflowExceptio...
FAILED tests/test_load_tool.py::test_check_version - cwltool.errors.WorkflowE...
====== 3 failed, 376 passed, 132 skipped, 2 warnings in 184.91s (0:03:04) ======

Actions #27

Updated by Peter Amstutz over 2 years ago

Jiayong Li wrote:

1. In https://dev.arvados.org/issues/17054#4-Update-WorkflowJobStep-in-workflow_jobpy, you mentioned "checks if the current workflow step has "http://commonwl.org/cwltool#StepNameHint" in "hints" or "requirements"". Right now I'm only checking this under "hints", since it's called "StepNameHint", also the doc field says "provide a hint". Should I expect this to appear under "requirements" as well?

In general, those extensions are supposed to be available under either hints or requirements, so it is good to accept them in both places for consistency.

You should use the "get_requirement" method, which searches for a given process requirement in both "hints" and "requirements" and applies the correct precedence rules.
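
The kind of lookup described here can be illustrated with a stand-in. The function below is not cwltool's `get_requirement`; it is a simplified sketch on plain lists which assumes that an entry under "requirements" takes precedence over one under "hints", and that later entries win within each list. The `(entry, is_requirement)` return shape is likewise an assumption made for this example.

```python
from typing import Any, Dict, List, Optional, Tuple

Entry = Dict[str, Any]
STEP_NAME_HINT = "http://commonwl.org/cwltool#StepNameHint"


def lookup_requirement(
    requirements: List[Entry], hints: List[Entry], feature: str
) -> Tuple[Optional[Entry], Optional[bool]]:
    """Find `feature`, preferring "requirements" over "hints"."""
    for entry in reversed(requirements):
        if entry["class"] == feature:
            return entry, True
    for entry in reversed(hints):
        if entry["class"] == feature:
            return entry, False
    return None, None


hints = [{"class": STEP_NAME_HINT, "stepname": "from_hints"}]
requirements = [{"class": STEP_NAME_HINT, "stepname": "from_requirements"}]

entry, is_requirement = lookup_requirement(requirements, hints, STEP_NAME_HINT)
print(entry["stepname"], is_requirement)  # from_requirements True
```

Calling one lookup for both lists keeps the step-naming code from having to know where the workflow author placed the extension.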

2. I wrote a unit test for custom naming, and it passed. However, some other tests have failed even though I made no changes for them.
[...]

Where is your branch so I can review it?

As I said in standup, you should create a pull request for cwltool. In addition to the unit tests, there are other code quality tools that are very picky; you will probably need to make further changes to satisfy them.

Actions #28

Updated by Peter Amstutz over 2 years ago

  • Target version changed from 2021-08-04 sprint to 2021-08-18 sprint
Actions #30

Updated by Peter Amstutz over 2 years ago

  • Target version changed from 2021-08-18 sprint to 2021-09-01 sprint
Actions #31

Updated by Peter Amstutz over 2 years ago

  • Target version changed from 2021-09-01 sprint to 2021-09-15 sprint
Actions #32

Updated by Peter Amstutz over 2 years ago

  • Target version deleted (2021-09-15 sprint)
Actions #36

Updated by Peter Amstutz about 2 years ago

  • Target version set to 2022-03-30 Sprint
  • Assigned To deleted (Jiayong Li)
Actions #37

Updated by Peter Amstutz about 2 years ago

  • Target version changed from 2022-03-30 Sprint to 2022-04-13 Sprint
Actions #38

Updated by Peter Amstutz about 2 years ago

  • Target version changed from 2022-04-13 Sprint to 2022-04-27 Sprint
Actions #39

Updated by Peter Amstutz about 2 years ago

  • Target version changed from 2022-04-27 Sprint to 2022-05-11 sprint
Actions #40

Updated by Peter Amstutz about 2 years ago

  • Target version changed from 2022-05-11 sprint to 2022-05-25 sprint
Actions #41

Updated by Peter Amstutz almost 2 years ago

  • Target version deleted (2022-05-25 sprint)
Actions #42

Updated by Peter Amstutz about 1 year ago

  • Release set to 60
Actions #43

Updated by Peter Amstutz about 1 year ago

  • Related to Idea #20273: More CWL runner improvements added
Actions #44

Updated by Peter Amstutz about 1 year ago

  • Related to deleted (Idea #17848: CWL runner improvements)
Actions #45

Updated by Peter Amstutz about 2 months ago

  • Target version set to Future