Feature #19385

closed

a-c-r uploads workflow files + dependencies to a collection & executes from that instead of packed workflows

Added by Peter Amstutz over 1 year ago. Updated about 1 year ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: CWL
Target version:
Story points: 5.0
Release relationship: Auto

Subtasks 1 (0 open, 1 closed)

Task #19395: Review 19385-cwl-fast-pack, Resolved, Stephen Smith, 02/07/2023
Actions #2

Updated by Peter Amstutz over 1 year ago

  • Target version changed from 2022-08-31 sprint to 2022-09-14 sprint
Actions #3

Updated by Peter Amstutz over 1 year ago

  • Target version changed from 2022-09-14 sprint to 2022-09-28 sprint
Actions #4

Updated by Peter Amstutz over 1 year ago

  • Target version changed from 2022-09-28 sprint to 2022-10-12 sprint
Actions #5

Updated by Peter Amstutz over 1 year ago

  • Target version changed from 2022-10-12 sprint to 2022-11-09 sprint
Actions #6

Updated by Peter Amstutz over 1 year ago

  • Target version changed from 2022-11-09 sprint to 2022-11-23 sprint
Actions #7

Updated by Peter Amstutz over 1 year ago

  • Target version changed from 2022-11-23 sprint to 2022-12-07 Sprint
Actions #8

Updated by Peter Amstutz over 1 year ago

  • Target version changed from 2022-12-07 Sprint to 2022-12-21 Sprint
Actions #9

Updated by Peter Amstutz over 1 year ago

  • Target version changed from 2022-12-21 Sprint to 2023-01-18 sprint
Actions #10

Updated by Peter Amstutz over 1 year ago

  • Target version changed from 2023-01-18 sprint to 2023-02-01 sprint
Actions #11

Updated by Peter Amstutz over 1 year ago

  • Target version changed from 2023-02-01 sprint to 2023-01-18 sprint
Actions #12

Updated by Peter Amstutz over 1 year ago

  • Story points set to 5.0
  • Status changed from New to In Progress
Actions #13

Updated by Peter Amstutz over 1 year ago

  • Target version changed from 2023-01-18 sprint to 2023-02-01 sprint
Actions #14

Updated by Peter Amstutz about 1 year ago

  • Release set to 57
Actions #15

Updated by Peter Amstutz about 1 year ago

  • Target version changed from 2023-02-01 sprint to 2023-02-15 sprint
Actions #17

Updated by Peter Amstutz about 1 year ago

19385-cwl-fast-pack @ 083b86a4e748900bcc285cac8bfd2ecdd36679f6

This is a significant rewrite of the upload_workflow() method.

The purpose of this method is to bundle the workflow up into a form where it can be uploaded to Arvados for execution, with all of the workflow's external dependencies replaced with Arvados references.

The previous approach was to "pack" the workflow into a monolithic JSON document, but this approach has a couple of drawbacks.

  • The resulting "packed" file is reformatted from the original file, and not particularly human-friendly.
  • The "pack" process itself is slow.

The new approach uploads the files making up the workflow to a Collection. The files are lightly updated, but the processing is much less intensive than using pack(). The resulting files in Arvados are also much closer to the original files, or entirely unchanged.
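
As a rough illustration of what "lightly updated" could mean, here is a minimal sketch, not the actual upload_workflow() code. It assumes the workflow is already parsed into nested dicts and lists, and that only string values under "run" and "location" need to point at the uploaded Collection. The function name rewrite_references, the mapping, and the collection hash below are all hypothetical.

```python
from typing import Any, Dict


def rewrite_references(node: Any, mapping: Dict[str, str]) -> Any:
    """Return a copy of `node` with mapped "run"/"location" strings replaced.

    Hypothetical helper for illustration only. Everything else in the
    document is copied through unchanged, which is why the result stays
    close to the original file.
    """
    if isinstance(node, dict):
        out = {}
        for key, value in node.items():
            if key in ("run", "location") and isinstance(value, str):
                # Swap a local path for its Arvados reference when known.
                out[key] = mapping.get(value, value)
            else:
                out[key] = rewrite_references(value, mapping)
        return out
    if isinstance(node, list):
        return [rewrite_references(item, mapping) for item in node]
    return node


# Made-up collection identifier, used only to show the shape of a reference.
COLLECTION = "keep:99999999999999999999999999999999+99"

workflow = {
    "class": "Workflow",
    "steps": [{"id": "align", "run": "tools/align.cwl"}],
    "inputs": [
        {"id": "ref", "default": {"class": "File", "location": "data/ref.fa"}}
    ],
}

mapping = {
    "tools/align.cwl": COLLECTION + "/tools/align.cwl",
    "data/ref.fa": COLLECTION + "/data/ref.fa",
}

rewritten = rewrite_references(workflow, mapping)
```

Only the two reference strings differ between workflow and rewritten; the structure, ids, and key order are preserved, in contrast to a packed document.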

This branch also streamlines the workflow launch process by eliminating instances where it would re-load the Workflow document. That re-loading turned out to be largely redundant work that contributed significantly to the runtime.

When executing arvados-cwl-runner --create-workflow on a large customer workflow, execution time went from 8m9s on 2.5.0 to 1m16s on this branch.

This branch also adds support for the --fast-parser feature (not yet enabled by default). It uses a different code path for parsing and validating the CWL document, which is significantly more efficient and improves the runtime even further (36s in the previous example). However, there are usability issues around reporting parsing and runtime errors that are still being worked on.
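
For a sense of scale, the timings quoted in this comment work out to the following speedups. This is only arithmetic on the three reported numbers; the helper name to_seconds is made up for the example.

```python
def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a minutes/seconds timing into total seconds."""
    return minutes * 60 + seconds


baseline = to_seconds(8, 9)      # 2.5.0: 8m9s
branch = to_seconds(1, 16)       # this branch: 1m16s
fast_parser = to_seconds(0, 36)  # this branch with --fast-parser: 36s

speedup_branch = baseline / branch
speedup_fast = baseline / fast_parser

print(f"branch: {speedup_branch:.1f}x faster")        # branch: 6.4x faster
print(f"--fast-parser: {speedup_fast:.1f}x faster")   # --fast-parser: 13.6x faster
```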

Tested and passing with CWL unit tests, CWL conformance tests v1.0 - v1.2, and Arvados CWL integration tests.

There are quite a lot of commits here, so I recommend reviewing by looking at git diff main..19385-cwl-fast-pack rather than trying to follow the development history.

developer-run-tests: #3477

Actions #18

Updated by Stephen Smith about 1 year ago

This lgtm!

Actions #19

Updated by Peter Amstutz about 1 year ago

  • Status changed from In Progress to Resolved