Project

General

Profile

Actions

Bug #18723

closed

arvados-cwl-runner not uploading all dependencies

Added by Peter Amstutz almost 3 years ago. Updated almost 3 years ago.

Status:
Resolved
Priority:
High
Assigned To:
Category:
-
Target version:
Story points:
-
Release relationship:
Auto
Actions #1

Updated by Peter Amstutz almost 3 years ago

  • Status changed from New to In Progress
Actions #2

Updated by Peter Amstutz almost 3 years ago

The buggy behavior is here:

process.py:1109 mergedirs()

This function was designed to clean up the contents of "listing" or "secondaryFiles" to be suitable for materialization, where there can only be one file with a given 'basename' but there could be two or more directory literals with the same name that can be safely merged into a single one.

It also handles the case of two or more file entries with the same basename that don't refer to the same file.

However, this method is both buggy, and used inappropriately.

  • It handles the case of two files that have different locations but the same basename, but does not check the location of Directories, so if you have two directories with different locations, one of them gets forgotten.
  • It should only be applied to "listing" and "secondaryFiles". scandeps applies it to every level regardless of whether enforcing unique basenames makes sense.

scandeps should selectively apply it only to secondaryFiles and listing, or possibly not at all. I'm not sure what problem this was originally intended to solve.

Actions #3

Updated by Peter Amstutz almost 3 years ago

I think I remember what this was for.

If you had two dependent files which were in subdirectories:

- class: File
  basename: file1
  location: foo/file1
- class: File
  basename: file2
  location: foo/file2

If you apply "nesting" to mimic the source directory structure, you get this:

- class: Directory
  basename: foo
  listing:
    - class: File
      basename: file1
      location: foo/file1
- class: Directory
  basename: foo
  listing:
    - class: File
      basename: file2
      location: foo/file2

mergedirs() turns it into this:

- class: Directory
  basename: foo
  listing:
    - class: File
      basename: file1
      location: foo/file1
    - class: File
      basename: file2
      location: foo/file2
Actions #4

Updated by Peter Amstutz almost 3 years ago

The bug is in the cwltool scandeps / mergedir function:

https://github.com/common-workflow-language/cwltool/pull/1615

Actions #5

Updated by Peter Amstutz almost 3 years ago

Building a new cwltool release with the bug fix.

Will have a branch that updates to the new release.

Actions #7

Updated by Peter Amstutz almost 3 years ago

  • Release set to 49
Actions #8

Updated by Peter Amstutz almost 3 years ago

18723-cwl-upload @ e2b3ed63a292537ec7bf37a904ad2925d6afa0bf

Added integration test

Actions #9

Updated by Peter Amstutz almost 3 years ago

  • Priority changed from Normal to High
  • Assigned To deleted (Peter Amstutz)
  • Status changed from In Progress to Resolved
  • Category deleted (CWL)
Actions #10

Updated by Peter Amstutz almost 3 years ago

  • Assigned To set to Peter Amstutz
Actions

Also available in: Atom PDF