Bug #18486

open

Docker containers are always removed

Added by Tom Schoonjans over 2 years ago. Updated 27 days ago.

Status:
New
Priority:
Normal
Assigned To:
-
Category:
-
Target version:
Story points:
-
Release:
Release relationship:
Auto

Description

Observed in Arvados 2.3.1:

When trying to debug a CWL workflow running on the Docker container runtime, it appears that the Docker containers are automatically removed after they have finished running.

This happens regardless of whether the arvados-docker-cleaner service is running, and regardless of the RemoveStoppedContainers setting in its config file.


Related issues

Related to Arvados - Feature #12900: [Crunch2] [crunch-run] Prune old images before installing image for current container (New)
Actions #1

Updated by Peter Amstutz over 2 years ago

  • Release deleted (45)

As a process note, we use the "Release" field to designate the release in which a bug is being fixed, not the release in which the bug was found.

Actions #2

Updated by Peter Amstutz over 2 years ago

You want the (stopped) containers themselves to stick around, not just the images? In general we avoid that because you can fill up your scratch space very quickly, and users typically don't have access to compute nodes with containers anyway.

However we could add some kind of admin-level configuration option for debugging in those cases where the users do have access to the compute node.

You might also be interested in the container shell access feature:

https://doc.arvados.org/v2.3/install/container-shell-access.html

https://doc.arvados.org/v2.3/user/debugging/container-shell-access.html

Actions #3

Updated by Tom Schoonjans over 2 years ago

Peter Amstutz wrote:

You want the (stopped) containers themselves to stick around, not just the images? In general we avoid that because you can fill up your scratch space very quickly, and users typically don't have access to compute nodes with containers anyway.

However we could add some kind of admin-level configuration option for debugging in those cases where the users do have access to the compute node.

You might also be interested in the container shell access feature:

https://doc.arvados.org/v2.3/install/container-shell-access.html

https://doc.arvados.org/v2.3/user/debugging/container-shell-access.html

Yes. When we ran into trouble with the Singularity runtime last week, I gave the Docker runtime a try instead, but I couldn't debug any issues because the containers were removed immediately after they finished running. This seems to contradict the note in https://doc.arvados.org/v2.3/install/crunch2/install-compute-node-docker.html#docker-cleaner, which states that the arvados-docker-cleaner daemon is responsible for cleaning up Docker containers (and images). That would mean no containers should ever get removed if the daemon is not running, or when "RemoveStoppedContainers":"never" is added to its config file.

This is not really an issue for us, since we got the Singularity runtime up and running again after Tom's analysis of the problem and suggested fix, but I thought it would be good for you to know about it.
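
For reference, the docker-cleaner setting quoted above can be written out as a small sketch. Only the key and value mentioned in this comment are shown; any other keys a real config file needs are deliberately left out, and no file path is assumed. As the rest of the thread explains, this setting only affects docker-cleaner and does not stop crunch-run from removing the container itself.

```python
import json

# Only the key quoted in this thread is shown; a real docker-cleaner
# config file may need other keys that are not covered here.
cleaner_config = {"RemoveStoppedContainers": "never"}

# Serialize to the JSON form quoted in the comment above.
config_text = json.dumps(cleaner_config, indent=4)
print(config_text)
```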

Actions #4

Updated by Peter Amstutz over 2 years ago

When the container stops, we call ContainerRemove(). That's by design.

The docker-cleaner service is a bit of a legacy. I think you're right that the documentation is a little misleading. It's been our intention to get rid of docker-cleaner entirely and have crunch-run be responsible for cleaning up containers and container images when it starts (#12900). That would be closer to what you want: the container would stick around at least until the next container starts (or you could drain the node to prevent new jobs from being scheduled).

On the other hand, Singularity doesn't leave containers or container images around at all after it stops, and we load the Singularity image from arv-mount, so none of this applies to the Singularity case.
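
To make the two Docker cleanup policies in this comment concrete, here is a toy model in Python. It is only an illustration under stated assumptions, not Arvados or crunch-run code: `ToyNode`, `run_remove_on_stop`, and `run_prune_on_start` are hypothetical names, and the "containers" are plain strings.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNode:
    """Toy model of a compute node; hypothetical, not Arvados code."""
    # Names of stopped containers that are still present on the node.
    stopped: list = field(default_factory=list)

    def run_remove_on_stop(self, name: str) -> None:
        # Behaviour described in this thread: the container is removed
        # as soon as it stops, independently of docker-cleaner.
        self.stopped.append(name)   # container exits...
        self.stopped.remove(name)   # ...and is removed immediately

    def run_prune_on_start(self, name: str) -> None:
        # Alternative discussed under #12900: clean up leftovers when the
        # next container starts, so the most recent one stays inspectable.
        self.stopped.clear()        # prune leftovers from earlier runs
        self.stopped.append(name)   # this container remains after it stops


current = ToyNode()
current.run_remove_on_stop("step-1")

proposed = ToyNode()
proposed.run_prune_on_start("step-1")
after_first = list(proposed.stopped)
proposed.run_prune_on_start("step-2")
after_second = list(proposed.stopped)
```

In this model the remove-on-stop policy never leaves anything to inspect, while the prune-on-start policy always leaves exactly the most recent container on the node until another one starts, which is why draining the node would keep it around for debugging.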

Actions #5

Updated by Peter Amstutz over 2 years ago

  • Related to Feature #12900: [Crunch2] [crunch-run] Prune old images before installing image for current container added
Actions #6

Updated by Peter Amstutz about 1 year ago

  • Release set to 60
Actions #7

Updated by Peter Amstutz 27 days ago

  • Target version set to Future