Bug #11584
closedAllow "keep:" references on command line.
Description
arvados-cwl-runner lets you provide input parameter values on the command line. However, currently it assumes they are relative paths local files and doesn't allow you to use references to keep. This should be fixed to permit references to keep.
As a secondary problem, "file not found" when checking command line parameters results in a backtrace, rather being caught and printing a friendlier error message.
Updated by Peter Amstutz over 7 years ago
- Subject changed from [CWL] Allow keep: references on command line. to [CWL] Allow "keep:" references on command line.
- Description updated (diff)
Updated by Tom Morris over 7 years ago
- Target version set to Arvados Future Sprints
Updated by Peter Amstutz almost 4 years ago
- Related to Idea #16011: CWL support, docs, training, website added
Updated by Peter Amstutz almost 4 years ago
- Target version changed from Arvados Future Sprints to 2021-03-31 sprint
- Assigned To set to Jiayong Li
Updated by Peter Amstutz almost 4 years ago
- Category set to CWL
- Subject changed from [CWL] Allow "keep:" references on command line. to Allow "keep:" references on command line.
Updated by Peter Amstutz over 3 years ago
- Target version changed from 2021-03-31 sprint to 2021-04-14 sprint
Updated by Peter Amstutz over 3 years ago
- Target version changed from 2021-04-14 sprint to 2021-04-28 bughunt sprint
Updated by Peter Amstutz over 3 years ago
- Target version deleted (
2021-04-28 bughunt sprint)
Updated by Peter Amstutz over 3 years ago
- Related to Idea #17848: CWL runner improvements added
Updated by Peter Amstutz over 3 years ago
- Related to deleted (Idea #16011: CWL support, docs, training, website)
Updated by Lucas Di Pentima almost 3 years ago
- Assigned To changed from Jiayong Li to Lucas Di Pentima
Updated by Lucas Di Pentima almost 3 years ago
I think I've tracked down the issue.
cwltool's main.py
file does some argument parsing in init_job_order()
and translates for example:
['--script', 'keep:zztop-4zz18-x9a7cx3mw6a2czo/OSC_Preprocessing.ssf', '--sessions', 'keep:3c2af8506ae4d7844701000867858f67+8593/20220113']
to
{'job_order': None, 'script': {'class': 'File', 'location': 'file:///Users/lucas/Devel/git/arvados/sdk/cwl/keep%3Azztop-4zz18-x9a7cx3mw6a2czo/OSC_Preprocessing.ssf'}, 'sessions': [{'class': 'Directory', 'location': 'file:///Users/lucas/Devel/git/arvados/sdk/cwl/keep%3A3c2af8506ae4d7844701000867858f67%2B8593/20220113'}]}
At a-c-r
's side, there're some checks for inputs starting with "keep:
" that don't get executed because of this translation.
What's not clear to me is if a-c-r
should decode the file:///full-path/keep%3Asome-uuid/some-file
URI to something understandable like keep:some-uuid/some-file
, or if it's a cwltool
bug (or feature that can be disabled)
Updated by Lucas Di Pentima almost 3 years ago
I think the problem might be here:
The location
of File
and Directory
inputs get translated to an os.path.abspath()
without even checking that what we have is a valid (and existing) relative path.
Updated by Peter Amstutz almost 3 years ago
Lucas Di Pentima wrote:
I think the problem might be here:
The
location
ofFile
andDirectory
inputs get translated to anos.path.abspath()
without even checking that what we have is a valid (and existing) relative path.
Good catch! Probably not that hard to fix either, it just needs to expand those file/URI references differently.
Updated by Lucas Di Pentima almost 3 years ago
- Status changed from New to In Progress
What I did is something like this:
diff --git a/cwltool/argparser.py b/cwltool/argparser.py
index ea885b64..ca135a5d 100644
--- a/cwltool/argparser.py
+++ b/cwltool/argparser.py
@@ -730,12 +730,16 @@ class FSAction(argparse.Action):
values: Union[AnyStr, Sequence[Any], None],
option_string: Optional[str] = None,
) -> None:
+ if os.path.exists(os.path.abspath(cast(AnyStr, values))):
+ location = file_uri(str(os.path.abspath(cast(AnyStr, values))))
+ else:
+ location = cast(AnyStr, values)
setattr(
namespace,
self.dest,
{
"class": self.objclass,
- "location": file_uri(str(os.path.abspath(cast(AnyStr, values)))),
+ "location": location,
},
)
@@ -762,10 +766,14 @@ class FSAppendAction(argparse.Action):
if not g:
g = []
setattr(namespace, self.dest, g)
+ if os.path.exists(os.path.abspath(cast(AnyStr, values))):
+ location = file_uri(str(os.path.abspath(cast(AnyStr, values))))
+ else:
+ location = cast(AnyStr, values)
g.append(
{
"class": self.objclass,
- "location": file_uri(str(os.path.abspath(cast(AnyStr, values)))),
+ "location": location,
}
)
My thought process is that as cwltool
doesn't know the keep:xxx
URIs, and it's not an existing relative path, it shouldn't translate it at all. I guess cwltool
users will get an "file not found" error somewhere downstream anyways.
I've ran the tests and 5 fail, although the same number of tests fail on main
, so I'm not sure if this change broke something.
What would be the correct procedure to submit these changes to the project? (provided that they make sense)
Updated by Peter Amstutz almost 3 years ago
There's a better way to do this, there is a urljoin method that cwltool uses, which arvados-cwl-runner hooks into to make it aware of "keep:" URIs. It should "urljoin" on the file URI of the current directory and the string from the command line, and that has the correct logic to recognize when something is a URI and when it is a relative file reference (that's how this works for all the file loading, it should have been used here and was just overlooked).
Updated by Peter Amstutz almost 3 years ago
- Target version set to 2022-02-16 sprint
Updated by Lucas Di Pentima almost 3 years ago
Fix PR at: https://github.com/common-workflow-language/cwltool/pull/1613 -- we'll have to update arvados-cwl-runner
's pin on cwltool
when this gets merged & released.
Updated by Peter Amstutz almost 3 years ago
Ticket #18723 is merged which includes the cwltool update with the bug fix, could you confirm this bug is fixed in arvados-cwl-runner?
Updated by Lucas Di Pentima almost 3 years ago
Yes, it seems to be working now. Thanks!
Updated by Peter Amstutz almost 3 years ago
- Assigned To changed from Lucas Di Pentima to Peter Amstutz
- Status changed from In Progress to Resolved