[CWL] [Workbench] Support for copying/moving pipelines
A common practice when writing a pipeline is to first test it in a private project, e.g. Home, so that error-prone test runs don't clutter the production project. Ultimately, the successful pipeline run needs to be in the production project that's shared with customers, but as of now there's no way to copy or move the jobs and output collections of a successful test pipeline.
- I tried rerunning the successful pipeline using arvados-cwl-runner with --project-uuid PRODUCTION_PROJECT_UUID, hoping job-reuse would copy the jobs and outputs to the production project, but it doesn't do that.
- I tried the command line tool (written by Brett) https://github.com/curoverse/arvados-clients/blob/master/recursive_move.py but it only moved the top-level runner job. This tool is outdated.
- I used the command line tool (written by Bryan) https://github.com/bcosc/arvados-tools/blob/master/move_outputs_into_one_project.py to copy the output collections but it doesn't copy jobs.
We need a feature integrated in Workbench to copy/move pipelines.
#2 Updated by Bryan Cosca 4 months ago
A workaround would be to do:
arv-copy --src cluster_uuid --dst cluster_uuid --dst-git-repo any_repo pipeline_uuid
to move the pipeline instance and top level jobs.
Then clone https://github.com/bcosc/arvados-tools/blob/master/copy-cwl-pi.py and run
python copy-cwl-pi.py pi_uuid project_uuid
to move all the outputs and log collections of all the child jobs.
I'm unsure what else would need to be copied, but anything that's missing can be added to copy-cwl-pi.py.
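The two workaround steps above could be wrapped in a small helper so they are always run with matching arguments. This is only a sketch: the function name `build_workaround_commands` is hypothetical, it assumes that `pipeline_uuid` and `pi_uuid` in the commands above refer to the same pipeline instance, and it assumes copy-cwl-pi.py is in the current directory. It only builds the command lines using the flags quoted in this comment; it does not run anything.

```python
import shlex


def build_workaround_commands(cluster_uuid, git_repo, pipeline_uuid, project_uuid):
    """Return the two workaround commands as argv lists (nothing is executed).

    Hypothetical helper: all four arguments are placeholders supplied by the
    caller, and pipeline_uuid is assumed to be the same value for both steps.
    """
    # Step 1: copy the pipeline instance and top-level jobs within one cluster.
    copy_instance = [
        "arv-copy",
        "--src", cluster_uuid,
        "--dst", cluster_uuid,
        "--dst-git-repo", git_repo,
        pipeline_uuid,
    ]
    # Step 2: handle the output and log collections of the child jobs.
    copy_children = ["python", "copy-cwl-pi.py", pipeline_uuid, project_uuid]
    return [copy_instance, copy_children]


if __name__ == "__main__":
    # Print the commands for review instead of executing them.
    for argv in build_workaround_commands(
        "cluster_uuid", "any_repo", "pi_uuid", "project_uuid"
    ):
        print(" ".join(shlex.quote(arg) for arg in argv))
```

Printing the commands first makes it easy to check the UUIDs before pasting them into a shell; each argv list could also be passed to `subprocess.run(argv, check=True)` once the values look right.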