Bug #21943
closed
arvados-cwl-runner crashes when output includes directory listings
Added by Peter Amstutz 5 months ago.
Updated 4 months ago.
Release relationship:
Auto
Description
When constructing the final output collection, it uses the "NoFollowPathmapper" to avoid visiting the full contents of Directory objects with keep references. However, if the output includes directory listings, it may try to apply path mapping on the output object from the intermediate collection to the final collection on items in the listing that were not mapped, resulting in a crash.
(RT ticket 800)
- Description updated (diff)
- Assigned To set to Peter Amstutz
- Target version changed from Development 2024-07-24 sprint to Development 2024-08-07 sprint
- Description updated (diff)
21943-dir-output @ f9e025e91087db8dcd2adac98d27bc83d415f0f6
developer-run-tests: #4359
- All agreed upon points are implemented / addressed.
- reproduced the bug & fixed it
- Anything not implemented (discovered or discussed during work) has a follow-up story.
- Code is tested and passing, both automated and manual, what manual testing was done is described
- added new integration test
- Documentation has been updated.
- Behaves appropriately at the intended scale (describe intended scale).
- Considered backwards and forwards compatibility issues between client and server.
- Follows our coding standards and GUI style guidelines.
This fixes an edge case where the same file appears multiple times in the output object and could fail to be included in the path map, leading to a crash. This happened because it first constructs a list of all file references in the output object, the de-duplicates the list, then path maps them, but skips directory listings. However, if de-duplication selected a file in a directory listing but then it was skipped in later traversal, it would not get path mapped at all.
The solution is to trim 'listing' earlier in the process.
- Status changed from New to In Progress
The update LGTM, I just have one question:
In RT#800 it's mentioned that using CWL 1.1 or 1.2 could fix things, so I'm wondering if this is a CWL 1.0 bug that we are mitigating with this update? If that's so, and for reproducibility reasons, should we fix CWL bugs in our runner implementation?
Lucas Di Pentima wrote in #note-8:
The update LGTM, I just have one question:
In RT#800 it's mentioned that using CWL 1.1 or 1.2 could fix things, so I'm wondering if this is a CWL 1.0 bug that we are mitigating with this update? If that's so, and for reproducibility reasons, should we fix CWL bugs in our runner implementation?
The reason is CWL 1.0 requires directory listings to be loaded by default, whereas CWL 1.1 and later only loads directory listings when explicitly requested (because we learned it was expensive to include directory listings for large/deep directory trees by default when you were not going to use them). The hypothesis was that by changing it to 1.1 so it does not load the directory listing, it would avoid the bug.
- Status changed from In Progress to Resolved
Also available in: Atom
PDF