Peter Amstutz, 10/07/2016 03:02 AM


Docker security

The fundamental Docker security issue is that a "root" (uid 0) user
inside a container is equivalent to "root" outside, unless steps are
taken to limit container permissions. We want to disallow containers
from sending data outside the private Arvados network, prevent
breakout from the container, and limit access if a breakout does
occur. We don't allow end users to invoke Docker directly, so we can
impose security measures both in the daemon configuration and the
individual container invocation.

Some of the knobs we have include:

Setting the uid/gid of pid 1 in the container

We can explicitly set the uid/gid of pid 1 inside the container so it
is not uid 0. This overrides the USER directive of the image. One
drawback is that some programs behave badly when the current uid
cannot be found in /etc/passwd.
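
As a rough sketch of the idea above, the launcher could add the --user
option when it assembles the "docker run" command line. The helper name
user_args, the image name, and the ids are placeholders for illustration,
not part of crunch-run.

```python
def user_args(uid, gid):
    """Return `docker run` arguments that start pid 1 as uid:gid."""
    if uid == 0:
        # Running as uid 0 would defeat the purpose of this knob.
        raise ValueError("refusing to run the container process as uid 0")
    return ["--user", f"{uid}:{gid}"]


# Hypothetical invocation; "example/image" is a placeholder image name.
cmd = ["docker", "run", *user_args(1000, 1000), "example/image", "id"]
print(" ".join(cmd))
```

If the image has no /etc/passwd entry for the chosen uid, the drawback
mentioned above still applies.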

User id mapping
docker daemon --userns-remap

User ids inside the container correspond to different user ids on the
host. This can map uid 0 inside the container to a non-root user outside
the container. It is unclear whether uid 0 inside the container still has
some "root powers" (like bypassing file access checks when accessing
files inside the container), or whether uid 0 is just a regular
unprivileged user who happens to have a uid of 0. (More research
necessary.)
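
To make the mapping concrete: with remapping enabled, the host uid is
taken from a subordinate id range listed in /etc/subuid (lines of the form
name:start:count), and container uid N becomes host uid start + N. The
sketch below only does that arithmetic. The "dockremap" entry and its
range are example values, and the function names are made up for
illustration.

```python
def parse_subuid(line):
    """Parse one /etc/subuid line of the form name:start:count."""
    name, start, count = line.strip().split(":")
    return name, int(start), int(count)


def host_uid(container_uid, start, count):
    """Map a uid inside the container to the uid the host sees."""
    if not 0 <= container_uid < count:
        raise ValueError("uid is outside the mapped range")
    return start + container_uid


# Example entry; the actual range on a given host may differ.
name, start, count = parse_subuid("dockremap:100000:65536")
print(host_uid(0, start, count))     # container root -> host uid 100000
print(host_uid(1000, start, count))  # container uid 1000 -> host uid 101000
```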

Dropping capabilities
docker run --cap-drop

Drop capabilities of the root user inside the container (see "man
capabilities" for the list). Dropping all capabilities effectively
neuters the root user (for example, without CAP_DAC_OVERRIDE the root
user is subject to the same file permission checks as regular users).
It is unclear whether this is still necessary when user id remapping is
in effect.
Restrict container networking
Crunch v2 communicates via arv-mount, which means most containers
don't need networking to read/write to Keep. Crunch v2 policy is that
networking is disabled by default but can be enabled with the runtime
constraint API: true (necessary for Arvados-aware containers). The
Docker network bridge should be configured with a whitelist firewall
that limits communication to essential Arvados services (API server +
Keep server).
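
A minimal sketch of the default-off policy, assuming the runtime
constraints arrive as a dictionary with an optional boolean "API" key.
The function name is made up, and the firewall on the bridge is not shown
here.

```python
def network_args(runtime_constraints):
    """Choose the `docker run` network mode from the runtime constraints."""
    if runtime_constraints.get("API") is True:
        # Arvados-aware container: attach to the (firewalled) bridge.
        return ["--network=bridge"]
    # Default: no network interfaces other than loopback.
    return ["--network=none"]


print(network_args({}))
print(network_args({"API": True}))
```

Requiring the value to be exactly True means a missing or malformed
constraint falls back to the locked-down default.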

Disable inter-container communication
docker daemon --icc=false

Our containers don't need to talk to each other.
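
The same setting can also be kept in the daemon configuration file
(usually /etc/docker/daemon.json) instead of on the command line. The
snippet below just renders that file's contents. The helper name is
hypothetical, and only the "icc" key is shown.

```python
import json


def render_daemon_config(settings):
    """Render Docker daemon settings as daemon.json text."""
    return json.dumps(settings, indent=2, sort_keys=True)


# "icc": false is the config-file form of --icc=false.
print(render_daemon_config({"icc": False}))
```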

Resource limits via cgroups

Slurm can set up a cgroup (control group) to dictate resource limits,
and crunch-run can instruct Docker to put the container in the cgroup
set up by Slurm. For this to work, we may need to invoke the
Docker daemon with this option:
--exec-opt native.cgroupdriver=cgroupfs

Further research is required to see if Slurm cgroup settings are
sufficient to prevent overloading the node or denial-of-service, or if
we need to set other limits (for example, a limit on the number of
processes inside the container to prevent fork bomb attacks).
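
As a sketch, placing the container under an existing cgroup and capping
the process count could look like this on the "docker run" side. The
cgroup path is a made-up example (the real layout depends on how Slurm is
configured), and the helper name is hypothetical.

```python
def cgroup_args(cgroup_parent, pids_limit=None):
    """Return `docker run` arguments that put the container under
    `cgroup_parent` and optionally cap the number of processes."""
    args = [f"--cgroup-parent={cgroup_parent}"]
    if pids_limit is not None:
        if pids_limit <= 0:
            raise ValueError("pids_limit must be positive")
        # Limits the number of processes, which blunts fork bombs.
        args.append(f"--pids-limit={pids_limit}")
    return args


# Hypothetical cgroup path for a Slurm job.
print(cgroup_args("/slurm/uid_1000/job_42", pids_limit=512))
```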

Resource limits via ulimit

We can also set ulimits on daemon invocation (--default-ulimit) and on
container invocation (--ulimit). ulimit has some overlap with cgroups
but the difference seems to be that most ulimit settings apply
per-process rather than to a group of processes.
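
For illustration, per-container ulimits are passed as name=soft:hard
pairs. The limits chosen below are arbitrary example values, and the
helper name is made up.

```python
def ulimit_args(limits):
    """Turn {name: (soft, hard)} into `docker run --ulimit` arguments."""
    args = []
    for name, (soft, hard) in sorted(limits.items()):
        if soft > hard:
            raise ValueError(f"{name}: soft limit exceeds hard limit")
        args += ["--ulimit", f"{name}={soft}:{hard}"]
    return args


# Example values only: cap open files and disable core dumps.
print(ulimit_args({"nofile": (1024, 2048), "core": (0, 0)}))
```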

seccomp

Seccomp filters system calls that can be made by programs inside the
container; many system calls it filters can also be blocked by
dropping capabilities.
https://docs.docker.com/engine/security/seccomp/
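
A small sketch of what a custom profile might contain. Docker's default
profile is an allow list. For brevity this example goes the other way and
only denies a few named system calls, so treat it as an illustration of
the file format rather than a recommended policy. The syscall list, the
helper name, and the file path are assumptions, and older Docker releases
may expect a slightly different JSON layout.

```python
import json


def deny_profile(syscalls):
    """Build a seccomp profile that allows everything except `syscalls`."""
    return {
        "defaultAction": "SCMP_ACT_ALLOW",
        "syscalls": [
            # Listed calls fail with an error instead of running.
            {"names": sorted(syscalls), "action": "SCMP_ACT_ERRNO"},
        ],
    }


profile = deny_profile(["mount", "ptrace", "kexec_load"])
print(json.dumps(profile, indent=2))
# The saved file would be passed as (path is a placeholder):
#   docker run --security-opt seccomp=/path/to/profile.json ...
```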

AppArmor

AppArmor can further limit what programs (including those running as
"root") inside the container can do. To be really effective, profiles
need to be tailored to specific application containers.

https://docs.docker.com/engine/security/apparmor/
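
Selecting a per-application profile at container start could be sketched
as below. This assumes the profile has already been loaded on the host.
The profile name and helper name are placeholders.

```python
def apparmor_args(profile):
    """Return `docker run` arguments that select an AppArmor profile."""
    if not profile or any(ch.isspace() for ch in profile):
        raise ValueError("profile name must be non-empty without whitespace")
    return ["--security-opt", f"apparmor={profile}"]


# "arvados-container" is a hypothetical profile name.
print(apparmor_args("arvados-container"))
```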

SELinux

docker daemon --selinux-enabled

Enable SELinux support. I don't know what that entails.

Updated by Peter Amstutz about 8 years ago · 1 revisions