Bug #7154

Updated by Brett Smith almost 5 years ago

The Arvados API server and clients identify Docker images by @docker_image_hash@ and @docker_image_repo+tag@ links associated with collections. A collection without these links will never be recognized as a Docker image, no matter what. If links don't point to this collection, it's undiscoverable: jobs can't refer to the image by this metadata, and @arv keep docker@ won't list it. (You can still use it in a job by specifying the collection content address as your @docker_image@.)
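To make the discoverability problem concrete, here is a minimal sketch of a lookup by link, using plain in-memory records rather than the real API. The field names (@link_class@, @head_uuid@, @name@) and the sample identifiers are assumptions for illustration; this ticket doesn't spell out the link layout.

```python
# Hypothetical in-memory stand-ins for link records; field names are assumed.
DOCKER_LINK_CLASSES = {"docker_image_hash", "docker_image_repo+tag"}


def docker_image_collections(links):
    """Return the set of collection identifiers that Docker image links point to."""
    return {
        link["head_uuid"]
        for link in links
        if link["link_class"] in DOCKER_LINK_CLASSES
    }


# Links exist only for the original collection, not for a copy made later.
links = [
    {"link_class": "docker_image_hash",
     "head_uuid": "coll-original", "name": "abc123"},
    {"link_class": "docker_image_repo+tag",
     "head_uuid": "coll-original", "name": "example/image:latest"},
]

discoverable = docker_image_collections(links)
print("coll-original" in discoverable)  # True: links point at it
print("coll-copy" in discoverable)      # False: same content, but no links
```

The copy holds identical data, but nothing in a link-based lookup can find it.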

At an API level, we can't prevent users from copying Docker image collections without copying the associated metadata. However, a few high-level copy tools also lose the metadata, which is not what users expect:

* Basically all of the Workbench copy mechanisms.
** "Copy to project…" from the collection page.
** Selecting the collection in a project and copying it from the pulldown menu.
** On the collection page, selecting the single Docker image file in it and creating a new collection from that (although maybe users don't/shouldn't expect this to work).
* arv-copy by collection content address: In this case, arv-copy searches for all collections with that content address, and chooses one to copy based on which one is likely to have the best name. If it chooses a copy that's already lost the metadata links, it won't create any links on the destination either. It should probably copy over metadata from any collection with a matching content address.
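The arv-copy suggestion above could look roughly like the following. This is only a sketch over in-memory records, not arv-copy's actual code: the function name @links_for_copy@, the record fields (@uuid@, @portable_data_hash@, @link_class@, @head_uuid@, @name@), and the sample identifiers are all assumptions for illustration.

```python
# Hypothetical sketch: gather Docker metadata links from every collection that
# shares the chosen collection's content address. Field names are assumed.
DOCKER_LINK_CLASSES = {"docker_image_hash", "docker_image_repo+tag"}


def links_for_copy(chosen_uuid, collections, links, dest_uuid):
    """Build the links to create for dest_uuid, deduplicated by class and name."""
    pdh_by_uuid = {c["uuid"]: c["portable_data_hash"] for c in collections}
    target_pdh = pdh_by_uuid[chosen_uuid]
    # Every collection with the same content address, including the chosen one.
    siblings = {u for u, pdh in pdh_by_uuid.items() if pdh == target_pdh}

    seen = set()
    new_links = []
    for link in links:
        key = (link["link_class"], link["name"])
        if (link["head_uuid"] in siblings
                and link["link_class"] in DOCKER_LINK_CLASSES
                and key not in seen):
            seen.add(key)
            new_links.append({"link_class": link["link_class"],
                              "name": link["name"],
                              "head_uuid": dest_uuid})
    return new_links


collections = [
    {"uuid": "coll-with-links", "portable_data_hash": "pdh-1"},
    {"uuid": "coll-bare", "portable_data_hash": "pdh-1"},   # lost its links
    {"uuid": "coll-other", "portable_data_hash": "pdh-2"},
]
links = [
    {"link_class": "docker_image_hash",
     "head_uuid": "coll-with-links", "name": "abc123"},
    {"link_class": "docker_image_repo+tag",
     "head_uuid": "coll-with-links", "name": "example/image:latest"},
    {"link_class": "docker_image_hash",
     "head_uuid": "coll-other", "name": "def456"},
]

# Even if the copy tool picks the bare collection, the metadata is recovered.
copied = links_for_copy("coll-bare", collections, links, "coll-dest")
print(len(copied))
```

Deduplicating on class and name keeps the destination from getting one set of links per matching source collection.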

Maybe we should just fix all these individual bugs in the copy tools. But I wanted to raise the question: should we be identifying Docker images in a different way that's less likely to be lost? Maybe through properties on the collection, or something like that?