Improve container image handling
Due: 12/31/2022
We want to improve UX for common workflows like:
- Use Workbench2 (with no shell node) to run a workflow that depends on docker images available on dockerhub
- Build an image from a Dockerfile or git repo, and use that image to run Arvados containers without pushing it to dockerhub
- Build an image and share it with other users on your cluster without pushing it to dockerhub
- Share a project that contains a workflow execution plus all of the docker images needed to re-run it even after the referenced images on dockerhub/arvados have been updated or removed
Proposed features:
- Improve tagging behavior: Identify docker images by collection properties instead of tag links, to avoid inconvenient/mysterious behaviors (e.g., an image collection is visible to two users, but only one of them can use it as an image because the tag link isn't shared). #19846
- Server-side pull: Workflow runner, when running without access to a local docker daemon (e.g., inside a container submitted by Workbench), can use a container request to pull an externally hosted (dockerhub) image and refer to that image by PDH when executing workflow steps. #11724
- Server-side build: Workflow runner or other client without access to a local docker daemon (or preferring to leave a better provenance trail) can use a container request to build an image from a specified Dockerfile. #13794
- Arvados-hosted image namespace: Each cluster has a "docker images" project. Any user with username X can use arv-keepdocker (or a new arvados-client command?) to save docker images in a subproject named X. A container request with container image "arvados:X/Y" will use the image saved in the collection named Y in the subproject named X, subject to usual permissions.
- Implicit pull at runtime: A container request with container image "docker:abc/def" causes Arvados to fetch/update "abc/def" from dockerhub into an arvados collection, and use that collection PDH as the image in the resulting container.
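The two naming schemes proposed above ("arvados:X/Y" and "docker:abc/def") could be told apart from a plain PDH by a small parser. The following is only a minimal Python sketch of that idea, not an existing Arvados API: `ImageRef`, `parse_image_ref`, and the "legacy" fallback kind are hypothetical names introduced for illustration, and the sketch assumes a PDH has the usual form of 32 hex digits, "+", and a size.

```python
import re
from dataclasses import dataclass

# Assumed PDH shape: 32 hex digits, "+", manifest size in bytes.
PDH_RE = re.compile(r"^[0-9a-f]{32}\+\d+$")


@dataclass(frozen=True)
class ImageRef:
    """Hypothetical parsed form of a container request's image string."""
    kind: str        # "pdh", "arvados", "docker", or "legacy"
    value: str       # PDH, collection name Y, or external repository
    owner: str = ""  # username X for "arvados:X/Y", otherwise empty


def parse_image_ref(ref: str) -> ImageRef:
    """Classify an image string according to the proposed schemes."""
    if PDH_RE.match(ref):
        return ImageRef("pdh", ref)

    if ref.startswith("arvados:"):
        # "arvados:X/Y": collection named Y in the subproject named X.
        owner, sep, name = ref[len("arvados:"):].partition("/")
        if not (owner and sep and name):
            raise ValueError(f"expected 'arvados:X/Y', got {ref!r}")
        return ImageRef("arvados", name, owner)

    if ref.startswith("docker:"):
        # "docker:abc/def": externally hosted image to pull at runtime.
        # Caveat: an existing reference like "docker:latest" (repository
        # "docker", tag "latest") would also match this prefix; this
        # sketch does not try to resolve that ambiguity.
        repo = ref[len("docker:"):]
        if not repo:
            raise ValueError(f"expected 'docker:<repository>', got {ref!r}")
        return ImageRef("docker", repo)

    # Anything else is left to today's lookup (e.g. "repo:tag" names).
    return ImageRef("legacy", ref)
```

For example, `parse_image_ref("arvados:alice/myimage")` yields kind "arvados" with owner "alice" and value "myimage", while a string such as "arvados/jobs:latest" falls through to the "legacy" kind. Permission checks and the actual collection lookup or pull are deliberately left out.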