Bug #19571

closed

arvados-cwl-runner scattering bug

Added by Tom Schoonjans 4 months ago. Updated 3 months ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: CWL
Target version:
Start date: 10/28/2022
Due date:
% Done: 100%
Estimated time: (Total: 0.00 h)
Story points: -

Description

Hi all,

The attached simple CWL workflow and inputs file return the following error when run with arvados-cwl-runner:

$ arvados-cwl-runner workflow-fixed.cwl cwl.inputs.json
INFO /usr/bin/arvados-cwl-runner 2.4.2, arvados-python-client 2.4.2, cwltool 3.1.20220623174452
INFO Resolved 'workflow-fixed.cwl' to 'file:///home/tom/temp/crunch-failure/workflow.cwl'
INFO Using cluster xxxxx (https://xxxxx.yyyyy.com/)
ERROR Input object failed validation:
identifier field '['string one', 'second string', 'three three three']' must be a string

This was surprising given that this workflow works just fine with cwltool (3.1.20220802125926).

We suspect that this has to do with us calling the same scattered step twice, but we are unsure exactly why.


Files

cwl.inputs.json (60 Bytes) Tom Schoonjans, 09/23/2022 12:34 PM
workflow-simplified.cwl (1.44 KB) Tom Schoonjans, 09/23/2022 12:34 PM
19571-ref_resolver.patch (1.1 KB) Joshua Randall, 10/28/2022 04:08 PM

Subtasks 1 (0 open, 1 closed)

Task #19611: Review 19678-job-loader, Resolved, Peter Amstutz, 10/28/2022

Related issues

Has duplicate Arvados - Bug #19678: arvados-cwl-runner: id name must be a string, Resolved, Peter Amstutz

Actions #1

Updated by Peter Amstutz 4 months ago

  • Target version set to 2022-10-12 sprint
Actions #2

Updated by Peter Amstutz 4 months ago

  • Target version changed from 2022-10-12 sprint to 2022-10-26 sprint
Actions #3

Updated by Peter Amstutz 4 months ago

  • Assigned To set to Peter Amstutz
Actions #4

Updated by Peter Amstutz 4 months ago

  • Category set to CWL
Actions #5

Updated by Peter Amstutz 3 months ago

  • Target version changed from 2022-10-26 sprint to 2022-11-09 sprint
Actions #6

Updated by Peter Amstutz 3 months ago

  • Related to Bug #19678: arvados-cwl-runner: id name must be a string added
Actions #7

Updated by Peter Amstutz 3 months ago

  • Related to deleted (Bug #19678: arvados-cwl-runner: id name must be a string)
Actions #8

Updated by Peter Amstutz 3 months ago

  • Has duplicate Bug #19678: arvados-cwl-runner: id name must be a string added
Actions #9

Updated by Peter Amstutz 3 months ago

I think this is actually the same bug that was reported again in #19678: having an input parameter named 'name' runs into trouble.

Actions #10

Updated by Joshua Randall 3 months ago

Having an input parameter with an id of 'name' does work as long as the type is "string" - the problem occurs when the type is anything other than "string".

Note also that this does not only apply to input parameters but to field names as well.

i.e. this is valid:

'''
type:
  type: record
  fields:
    - name: name
      type: string
'''

but this is rejected:
'''
type:
  type: record
  fields:
    - name: name
      type: boolean
'''

We have confirmed that removing the two places in schema_salad that explicitly raise a ValidationException when the value of a name field is not of type str solves this problem, and arvados-cwl-runner still works for all of our test cases. We have not yet investigated why the problem appears to be specific to arvados-cwl-runner and does not affect cwltool. That is a bit surprising since both use schema_salad, but perhaps they invoke it differently?
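To make the failure mode concrete, here is a minimal Python sketch of the kind of check described in this comment. It is not schema_salad's actual code: `ValidationError`, `IDENTIFIER_FIELDS`, `check_identifiers` and `try_check` are hypothetical names, and the sketch assumes only that the loader treats the key `name` as an identifier field and rejects any non-string value, which is consistent with the error message in the original report.

```python
class ValidationError(Exception):
    """Hypothetical stand-in for schema_salad's ValidationException."""


# Assumption: the loader is configured to treat these keys as identifiers.
IDENTIFIER_FIELDS = {"name"}


def check_identifiers(document: dict) -> None:
    """Raise if any identifier field holds a value that is not a str."""
    for key, value in document.items():
        if key in IDENTIFIER_FIELDS and not isinstance(value, str):
            raise ValidationError(f"identifier field '{value}' must be a string")


def try_check(document: dict) -> str:
    """Return 'ok', or the validation message if the check fails."""
    try:
        check_identifiers(document)
    except ValidationError as exc:
        return str(exc)
    return "ok"


# An input object whose 'name' parameter is a plain string passes.
print(try_check({"name": "some text"}))
# A boolean value is rejected, as in the record example above.
print(try_check({"name": True}))
# So is a list of strings, as used for the scattered step in the report.
print(try_check({"name": ["string one", "second string", "three three three"]}))
```

Under these assumptions, any input parameter or record field called 'name' is only accepted when its value happens to be a string, which matches the behaviour we observed.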

Actions #11

Updated by Joshua Randall 3 months ago

For our use-cases the attached patch (simply removing the two raise statements) completely solves this issue.

Actions #12

Updated by Peter Amstutz 3 months ago

Hey Josh! Good to hear from you.

The way that arvados-cwl-runner reads the workflow, packs it, and re-reads the packed version has occasionally turned up bugs on the second trip through the sausage machine (parsing and loading). I'll give this a look.

Actions #13

Updated by Peter Amstutz 3 months ago

19678-job-loader @ e2267bd99209651c61425f335230e515421b2ef4

  • Fix for parameters called 'name'
  • Also fix regression involving default file references appearing in
    nested processes (inline declaration of a tool within a workflow).
  • Also fixed some dependency issues preventing arvados/jobs developer
    image from working.

developer-run-tests: #3347

Actions #14

Updated by Peter Amstutz 3 months ago

  • Status changed from New to In Progress
Actions #15

Updated by Lucas Di Pentima 3 months ago

This LGTM, thanks.

Actions #16

Updated by Peter Amstutz 3 months ago

  • Status changed from In Progress to Resolved