Story #3699

Updated by Tim Pierce over 5 years ago

Use case: user can copy a pipeline instance from one Arvados instance to another, in order to rerun the pipeline on another cluster and compare results with the original computation. Example:

# User runs @arv-copy 1h9kt-pipeline-uuid 1h9kt 4xphq@ to copy pipeline instance @1h9kt-pipeline-uuid@ to cluster 4xphq
# User views the new pipeline instance on 4xphq's workbench
# User clicks "run" on the copied pipeline template page (selecting an appropriate input collection, probably the input collection that was copied along with the pipeline instance and template)
# Jobs run.
# User uses "compare pipelines" on 4xphq to compare the original, copied 1h9kt pipeline instance with the new 4xphq instance that was just generated.

$ arv-copy [--recursive=true/false] [pipeline-instance-uuid] [source-arvados] [destination-arvados]
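
As an illustration only, the command line above could be parsed along the following lines. This is a minimal sketch using Python's standard @argparse@ module; the helper names @parse_bool@ and @build_parser@ are hypothetical and are not taken from any existing arv-copy code.

```python
import argparse


def parse_bool(text):
    """Interpret 'true'/'false' (case-insensitive) as a boolean."""
    value = text.strip().lower()
    if value == "true":
        return True
    if value == "false":
        return False
    raise argparse.ArgumentTypeError("expected true or false, got %r" % text)


def build_parser():
    """Build a parser for the synopsis in this story (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog="arv-copy")
    # --recursive defaults to true, as described in this story.
    parser.add_argument("--recursive", type=parse_bool, default=True,
                        help="also copy the data the pipeline instance depends on")
    parser.add_argument("pipeline_instance_uuid")
    parser.add_argument("source_arvados")
    parser.add_argument("destination_arvados")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

Accepting only the literal strings @true@ and @false@ keeps the flag consistent with the @--recursive=true/false@ form shown in the synopsis.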

By default, arv-copy exports the specified pipeline instance from the _source-arvados_ instance and imports it to _destination-arvados_. arv-copy removes the following pipeline instance fields before importing the instance to the destination Arvados:
* @owner_uuid@
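
The field removal could be sketched as below, assuming the exported pipeline instance is handled as a plain Python dictionary. @FIELDS_TO_STRIP@ and @strip_for_import@ are hypothetical names used for illustration; the only field this story names is @owner_uuid@.

```python
# Fields dropped before import. This story lists only owner_uuid;
# the tuple form simply makes it easy to extend later.
FIELDS_TO_STRIP = ("owner_uuid",)


def strip_for_import(record):
    """Return a copy of record without the fields in FIELDS_TO_STRIP.

    The input dictionary is left unmodified.
    """
    return {key: value for key, value in record.items()
            if key not in FIELDS_TO_STRIP}
```

Returning a new dictionary rather than mutating the exported record leaves the original available for later comparison.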

The @--recursive@ option, which defaults to true, also copies the following data:
* collections (copy blocks and then copy manifest_text)
* docker images (collection copy + docker specific tags)
* pipeline templates (copy name, components)
* git repository (clone entire repository; update name of repository to use in components of target pipeline template)

If @--recursive=false@, copy only the pipeline instance, but emit a warning that the user will have to fix the pipeline template UUID by hand.

arv-copy works only on pipeline instances. It returns an error if _pipeline-instance-uuid_ refers to anything other than a pipeline instance.
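
One possible way to perform this check before contacting either cluster is to inspect the UUID itself. The sketch below assumes that Arvados UUIDs have the form @<5-character cluster prefix>-<5-character type infix>-<15-character identifier>@ and that @d1hrv@ is the type infix for pipeline instances; both are assumptions to verify against the Arvados API documentation. The function name @require_pipeline_instance@ is hypothetical.

```python
import re

# Assumption: "d1hrv" is the object-type infix for pipeline instances.
PIPELINE_INSTANCE_INFIX = "d1hrv"

# Assumption: UUIDs are <cluster>-<type>-<id>, lowercase alphanumeric,
# with segment lengths 5, 5 and 15.
UUID_PATTERN = re.compile(r"^[0-9a-z]{5}-([0-9a-z]{5})-[0-9a-z]{15}$")


def require_pipeline_instance(uuid):
    """Return uuid unchanged if it looks like a pipeline instance UUID.

    Raises ValueError for malformed UUIDs and for any other object type.
    """
    match = UUID_PATTERN.match(uuid)
    if match is None:
        raise ValueError("not a well-formed UUID: %r" % uuid)
    if match.group(1) != PIPELINE_INSTANCE_INFIX:
        raise ValueError("not a pipeline instance UUID: %r" % uuid)
    return uuid
```

A check like this only looks at the shape of the UUID; whether the object actually exists on the source cluster still has to be confirmed when it is fetched.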