Bug #10978
open: [CWL] Avoid using "+" char in mount paths
Added by Peter Amstutz almost 8 years ago. Updated 10 months ago.
Description
The "+" character is misinterpreted by some tools such as older versions of Picard. Consider changing it to "-" in keep mount paths for improved compatibility with poorly behaved tools.
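As a rough illustration of the substitution proposed above, here is a minimal sketch. The function name `sanitize_mount_path` is hypothetical and not part of Arvados; it only shows the string transformation, not where such a change would live in the CWL runner.

```python
def sanitize_mount_path(path: str) -> str:
    """Replace every "+" in a Keep mount path with "-".

    Hypothetical helper for illustration only; not an Arvados API.
    """
    return path.replace("+", "-")


if __name__ == "__main__":
    # Example path taken from the discussion in this ticket.
    print(sanitize_mount_path("/keep/d41d8cd98f00b204e9800998ecf8427e+0"))
```

Paths that contain no "+" pass through unchanged, so the transformation is safe to apply to every mount path.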
Updated by Tom Clegg almost 8 years ago
We could also consider naming the mount dirs according to purpose instead of content -- e.g., mount at "/mnt/bamFile" instead of "/keep/d41d8cd98f00b204e9800998ecf8427e+0".
That might also make container logs easier to read: "/mnt/tumor/sample1234.bam" and "/mnt/normal/sample5678.bam" instead of "/keep/pdh78af66a/sample1234.bam" and "/keep/pdh83b2f9a/sample5678.bam".
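A minimal sketch of purpose-based naming, assuming the input parameter name is known at the point where the mount path is chosen. The helper `purpose_mount_path` and its `root` default are hypothetical, not existing Arvados code.

```python
import posixpath


def purpose_mount_path(param_name: str, file_name: str, root: str = "/mnt") -> str:
    """Build a mount path from the input parameter name rather than the
    collection's portable data hash.

    Hypothetical helper for illustration only; not an Arvados API.
    """
    return posixpath.join(root, param_name, file_name)


if __name__ == "__main__":
    print(purpose_mount_path("tumor", "sample1234.bam"))
    print(purpose_mount_path("normal", "sample5678.bam"))
```

The resulting paths match the "/mnt/tumor/sample1234.bam" and "/mnt/normal/sample5678.bam" examples above.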
Updated by Tom Morris almost 8 years ago
[Tom replied after I'd composed this reply, but before I submitted it. I like this idea, but still think the questions below are valid.]
Am I wrong in thinking that any change to the "containers API" automatically invalidates all jobs which have been run to-date from a reusability point of view?
Can you expand on the pros and cons that you perceive for such a change?
Updated by Peter Amstutz almost 8 years ago
Tom Clegg wrote:
We could also consider naming the mount dirs according to purpose instead of content -- e.g., mount at "/mnt/bamFile" instead of "/keep/d41d8cd98f00b204e9800998ecf8427e+0".
That might also make container logs easier to read: "/mnt/tumor/sample1234.bam" and "/mnt/normal/sample5678.bam" instead of "/keep/pdh78af66a/sample1234.bam" and "/keep/pdh83b2f9a/sample5678.bam".
Yes, we could take the input parameter name into account when determining the mount path. However, it would require a bit of work, since that information currently isn't easily available to the part of the code that decides where to mount things.
Updated by Peter Amstutz almost 8 years ago
Tom Morris wrote:
[Tom replied after I'd composed this reply, but before I submitted it. I like this idea, but still think the questions below are valid.]
Am I wrong in thinking that any change to the "containers API" automatically invalidates all jobs which have been run to-date from a reusability point of view?
You are not wrong. This would change mount points which would invalidate job reuse.
Can you expand on the pros and cons that you perceive for such a change?
Pros: solves a user problem with a commonly-used tool.
Cons: accommodating badly-behaved tools is a slippery slope; the latest version of the tool doesn't have the problem; reasonable workarounds (e.g. putting the code in Docker instead of Keep) are available.
Updated by Tom Clegg almost 8 years ago
Just a thought: if it's too much work under the hood to use symbolic names like "bamFile", we could consider "/mnt/d41d8cd9".
(Having "-" and "+" PDH forms in various places seems like a confusing road to go down. Keep-web is forced into it because it really needs to communicate the whole PDH, but in this case we don't actually need it to be a PDH at all: it just has to be deterministic so re-use works, and it's best if it doesn't make the logs too hard for a human to read.)
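A minimal sketch of the short-prefix idea, assuming the mount name is simply the first 8 hex characters of the portable data hash. The name `short_mount_path` and the `prefix_len` parameter are hypothetical; the sketch does not handle two collections that share the same prefix.

```python
def short_mount_path(pdh: str, root: str = "/mnt", prefix_len: int = 8) -> str:
    """Derive a deterministic, "+"-free mount path from a portable data hash.

    Hypothetical helper for illustration only; not an Arvados API.
    The same PDH always yields the same path, which is what reuse needs.
    Prefix collisions between different PDHs are NOT handled here.
    """
    # Drop the "+<size>" suffix, then keep only the leading hex digits.
    digest = pdh.split("+")[0]
    return f"{root}/{digest[:prefix_len]}"


if __name__ == "__main__":
    print(short_mount_path("d41d8cd98f00b204e9800998ecf8427e+0"))
```

Because the output depends only on the PDH, repeated runs with the same inputs get identical mount paths.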
Updated by Tom Clegg almost 8 years ago
- Subject changed from [CWL] Consider changing keep mounts to use "-" instead of "+" in containers API to [CWL] Avoid using "+" char in mount paths
Updated by Tom Morris over 7 years ago
- Target version set to Arvados Future Sprints
Updated by Ward Vandewege over 3 years ago
- Target version deleted (Arvados Future Sprints)