Bug #14723
closed[CWL] File that should appear multiple times in output collection only appears once
Description
A user has reported a situation where, in producing CWL output, there are directories that should contain the same file, however the file only appears in one directory. The theory is that the source files get mapped to exactly one destination in the output collection (picked arbitrarily) instead of copied to every location in the destination collection.
Specifically, make_output_collection uses NoFollowPathMapper() to map each file to exactly one destination. It probably ought to work the other way, mapping each destination to an input file.
Updated by Peter Amstutz about 6 years ago
- Status changed from New to In Progress
Updated by Peter Amstutz about 6 years ago
- Status changed from In Progress to New
Updated by Tom Morris almost 6 years ago
- Assigned To set to Eric Biagiotti
- Target version changed from To Be Groomed to 2019-04-10 Sprint
Updated by Eric Biagiotti almost 6 years ago
- Status changed from New to In Progress
Updated by Eric Biagiotti almost 6 years ago
Am able to reproduce with the following workflow:
class: ExpressionTool
cwlVersion: v1.0
inputs:
inputfile: File
outputs:
out: Directory[]
requirements:
InlineJavascriptRequirement: {}
expression: |
${
var dirs = [];
dirs.push({"class": "Directory",
"basename": "testdir1",
"listing": [inputs.inputfile]});
dirs.push({"class": "Directory",
"basename": "testdir2",
"listing": [inputs.inputfile]});
return {"out": dirs};
}
and input file:
inputfile:
class: File
path: test.txt
Updated by Eric Biagiotti almost 6 years ago
- Target version changed from 2019-04-10 Sprint to 2019-04-24 Sprint
Updated by Eric Biagiotti almost 6 years ago
The 14723-cwl-multiple-file-targets
branch at 810fc3e7aab72aa5ae951141955b56f6d4e47cfd now maps files from their target destination instead. The exception is for literals, which store file contents instead of a resolved location.
I also added a test for this and found a bug where an ExpressionTool with a file array output of literals with the same name would not use name conflict resolution. I fixed this and also added a test.
Unit Tests: https://ci.curoverse.com/view/Developer/job/developer-run-tests/1185/
C/I Tests: https://ci.curoverse.com/view/CWL/job/arvados-cwl-conformance-tests/89/
Updated by Eric Biagiotti almost 6 years ago
Re-run of service/api tests: https://ci.curoverse.com/view/Developer/job/developer-run-tests-services-api/1210/
Updated by Peter Amstutz almost 6 years ago
Please add a comment about how StagingPathMapper, instead of internally mapping source → target, maps targets → source.
Also how mapper() is overridden to maintain the behavior of mapping source → target
Also how literals are a special case, they are still mapped by source (identifier) → target
if v.resolved == src or (v.type == "CreateFile" and k == src): return v
Let's make this logic exclusive:
if (v.type != "CreateFile" and v.resolved == src) or (v.type == "CreateFile" and k == src):
The code blocks at arvcontainer.py:150 and executor.py:444 are very similar if not identical in function, I wonder if they could be refactored to use the same logic.
Unrelated to this immediate bug, but I noticed this in executor.py:464
except IOError as e: logger.warning("While preparing output collection: %s", e)
Seems like that should be logger.error() and/or re-raise the exception.
Updated by Eric Biagiotti almost 6 years ago
Peter Amstutz wrote:
Please add a comment about how StagingPathMapper, instead of internally mapping source → target, maps targets → source.
Also how mapper() is overridden to maintain the behavior of mapping source → target
Also how literals are a special case, they are still mapped by source (identifier) → target
Done
[...]
Let's make this logic exclusive:
[...]
Fixed
The code blocks at arvcontainer.py:150 and executor.py:444 are very similar if not identical in function, I wonder if they could be refactored to use the same logic.
I don't see a huge benefit to refactoring this at the moment, at least as a part of this bug. Maybe we can add a ticket for this?
Unrelated to this immediate bug, but I noticed this in executor.py:464
[...]
Seems like that should be logger.error() and/or re-raise the exception.
Fixed
Latest at 454ee2b8f0385c542b6f1165a3baf2820425e1a3.
Unit Tests: https://ci.curoverse.com/view/Developer/job/developer-run-tests/1195/
C/I Tests: Run and passed on my machine.
Updated by Peter Amstutz almost 6 years ago
Eric Biagiotti wrote:
Latest at 454ee2b8f0385c542b6f1165a3baf2820425e1a3.
Unit Tests: https://ci.curoverse.com/view/Developer/job/developer-run-tests/1195/
C/I Tests: Run and passed on my machine.
LGTM
Updated by Eric Biagiotti almost 6 years ago
- Status changed from In Progress to Resolved