Bug #6968

[SDKs] arv-copy continues copying when a collection's content address is wrong, but warns and exits with a special code

Added by Brett Smith about 4 years ago. Updated about 4 years ago.


Currently, when arv-copy copies a collection by content address (e.g., from a pipeline), it assumes that the content address is correct and tries to create the collection on the destination cluster with the same content address. However, some old API servers have collections with bad content addresses, e.g., because the addresses were calculated with location hints included. When arv-copy tries to create such a collection on the destination cluster, the API server refuses it for having the wrong content address, and arv-copy then crashes from the resulting exception.
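To illustrate how a content address can go wrong, here is a minimal sketch. It assumes the content address is the MD5 of the manifest text with every locator hint except the size removed, followed by "+" and the length of that stripped text. The helper names (strip_hints, content_address) and the sample signature hint are hypothetical and are not taken from the arv-copy source.

```python
import hashlib
import re

# Assumption: a block locator is "<32 hex digits>+<size>" followed by zero or
# more "+<letter>..." hints (for example a "+A..." permission signature).
_LOCATOR_RE = re.compile(r"^([0-9a-f]{32}\+\d+)(\+[A-Za-z][^+\s]*)*$")


def strip_hints(manifest_text):
    """Return the manifest with every locator reduced to "<md5>+<size>"."""
    lines = []
    for line in manifest_text.splitlines():
        # Only tokens that look like block locators are rewritten; stream
        # names and file tokens pass through unchanged.
        tokens = [_LOCATOR_RE.sub(r"\1", tok) for tok in line.split(" ")]
        lines.append(" ".join(tokens))
    return "".join(l + "\n" for l in lines)


def content_address(manifest_text):
    """Compute a content address from the hint-free manifest (assumed rule)."""
    data = strip_hints(manifest_text).encode("utf-8")
    return "{}+{}".format(hashlib.md5(data).hexdigest(), len(data))


def naive_address(manifest_text):
    """The suspected bug: hashing the manifest with hints still in place."""
    data = manifest_text.encode("utf-8")
    return "{}+{}".format(hashlib.md5(data).hexdigest(), len(data))


signed = ". acbd18db4cc2f85cedef654fccc4a4d8+3+Afakesignature@12345678 0:3:foo.txt\n"
plain = ". acbd18db4cc2f85cedef654fccc4a4d8+3 0:3:foo.txt\n"
```

Under these assumptions, content_address(signed) and content_address(plain) agree, while naive_address(signed) differs from both, which is the kind of mismatch the destination API server would reject.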

Instead, we want arv-copy to create a collection on the destination no matter what and continue the copy as far as possible, but warn the user about the issue and exit with a dedicated exit code > 2 when this happens. The guiding principle is that arv-copy should try as hard as possible to save as much as possible from the source object being copied, but a change in content address is a serious error that should be reported to the user as clearly as possible. Details:

  • After arv-copy fetches the source collection, it should calculate the correct content address for that collection. If the two content addresses differ, the logic that checks whether the collection already exists on the destination cluster should look up both the source content address and the correct content address.
  • When arv-copy creates the destination collection, it should not specify the portable_data_hash field. If the destination collection is created with a different content address than the source collection, it should issue a warning to tell the user about the difference, and set a flag such that arv-copy exits with a special exit code > 2 at the end of execution.
  • If the source content address is referenced in another object being copied (e.g., a pipeline instance or template), the references should be updated on destination objects to use the destination's content address, just as we currently do for collection UUIDs.
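The three points above could be sketched roughly as follows. This is not the arv-copy implementation: dst_lookup, dst_create, and compute_pdh are hypothetical callables standing in for the destination API calls and the content address calculation, CopyState is an invented holder for the flag, and the exit code value 3 is only an example of "a code > 2". The sketch also warns when an existing destination collection is found under the corrected address, which is an assumption the ticket does not spell out.

```python
import logging

logger = logging.getLogger("arv-copy-sketch")

# Hypothetical value; the ticket only requires a dedicated code greater than 2.
EXIT_PDH_CHANGED = 3


class CopyState:
    """Tracks address changes seen during one copy run (illustrative only)."""

    def __init__(self):
        self.pdh_map = {}         # source content address -> destination one
        self.pdh_changed = False  # set once any address differs


def copy_collection(src, compute_pdh, dst_lookup, dst_create, state):
    src_pdh = src["portable_data_hash"]
    correct_pdh = compute_pdh(src["manifest_text"])

    # Look for an existing copy under both the source and the correct address.
    dst = None
    for pdh in dict.fromkeys([src_pdh, correct_pdh]):
        dst = dst_lookup(pdh)
        if dst is not None:
            break

    if dst is None:
        # Do not send portable_data_hash; let the destination compute it.
        body = {k: v for k, v in src.items() if k != "portable_data_hash"}
        dst = dst_create(body)

    dst_pdh = dst["portable_data_hash"]
    if dst_pdh != src_pdh:
        logger.warning("content address changed: source %s, destination %s",
                       src_pdh, dst_pdh)
        state.pdh_changed = True
    state.pdh_map[src_pdh] = dst_pdh
    return dst


def rewrite_references(obj, pdh_map):
    """Recursively replace source content addresses inside a copied object."""
    if isinstance(obj, str):
        for old, new in pdh_map.items():
            obj = obj.replace(old, new)
        return obj
    if isinstance(obj, list):
        return [rewrite_references(v, pdh_map) for v in obj]
    if isinstance(obj, dict):
        return {k: rewrite_references(v, pdh_map) for k, v in obj.items()}
    return obj


def exit_code(state):
    """Exit status to use at the end of the run."""
    return EXIT_PDH_CHANGED if state.pdh_changed else 0
```

Keeping the flag on a state object rather than exiting immediately lets the rest of the copy run to completion, which matches the goal of saving as much as possible before reporting the error.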


#1 Updated by Brett Smith about 4 years ago

  • Story points set to 2.0
