Docker security » History » Version 3

Peter Amstutz, 10/07/2016 02:29 PM

h1. Docker security

The fundamental Docker security issue is that a "root" (uid 0) user inside a container is equivalent to "root" outside, unless steps are taken to limit container permissions.  We want to prevent containers from sending data outside the private Arvados network, prevent breakout from the container, and limit access if a breakout does occur.  Because end users are not allowed to invoke Docker directly, we can impose security measures both in the daemon configuration and in each individual container invocation.

Some of the knobs we have include:

h2. Setting the uid/gid of pid 1 in the container

@docker run --user@

We can explicitly set the uid/gid of pid 1 inside the container so that it is not uid 0.  This overrides the USER directive of the image.  One drawback is that some programs behave badly when the current uid cannot be found in /etc/passwd.

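As a minimal sketch of the invocation described above, the helper below assembles a command line that runs pid 1 under a non-root uid:gid.  The function name @docker_run_as@, the uid/gid values and the image name are illustrative assumptions, not part of crunch-run; the function only prints the command so it can be inspected before anything is executed.

```shell
# Print a docker command line that runs pid 1 as the given uid:gid
# instead of uid 0.  Hypothetical helper; nothing is executed here.
docker_run_as() {
    uid="$1"; gid="$2"; shift 2
    echo "docker run --user ${uid}:${gid} $*"
}

# Example: run `id` in a placeholder image as uid 1000, gid 1000.
docker_run_as 1000 1000 debian:stable id
```
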
h2. User id mapping

@docker daemon --userns-remap@

User ids inside the container correspond to different user ids on the host, so uid 0 inside the container can be mapped to a non-root user outside it.  A process has two sets of capabilities: one applies when manipulating resources inside the user namespace, the other when manipulating resources in the parent namespace.  This makes it possible for a process to be "root" inside the container but "non-root" outside the container.

http://man7.org/linux/man-pages/man7/user_namespaces.7.html

This may also be useful for working with bind-mounted directories.  Mapping "root" to the host user "crunch" would mean that files written by "root" inside the container would actually be owned by "crunch" outside the container.  Because user id mappings are 1:1, this would require always using the same uid inside the container (probably uid 0).

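The 1:1 mapping can be illustrated with a little arithmetic.  A uid_map entry as described in user_namespaces(7) names a contiguous range (inside start, outside start, length), and a uid inside the namespace translates to the outside start plus its offset in the range.  The helper name @map_uid@ and the range used below are assumptions chosen for illustration only.

```shell
# Translate a uid inside a user namespace to the host uid, given one
# uid_map entry: <inside_start> <outside_start> <length>.
# Prints nothing and returns 1 if the uid is not covered by the range.
map_uid() {
    inside_start="$1"; outside_start="$2"; length="$3"; uid="$4"
    if [ "$uid" -lt "$inside_start" ] || [ "$uid" -ge $((inside_start + length)) ]; then
        return 1
    fi
    echo $((outside_start + uid - inside_start))
}

# Illustrative range: container uids 0..65535 map to host uids 165536..231071.
map_uid 0 165536 65536 0      # container root
map_uid 0 165536 65536 1000   # an ordinary container user
```

With a range of length 1 only a single uid inside the container has a host counterpart, which is why the bind-mount scenario above implies always running as the same uid inside the container.
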
h2. Dropping capabilities

@docker run --cap-drop@

Drops capabilities of the root user inside the container (see "man capabilities" for the list).  Dropping all capabilities effectively neuters the root user; for example, without CAP_DAC_OVERRIDE the root user is subject to the same file permission checks as regular users.  This can be used to limit the scope of what the "root" user can do inside the container.

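One way to apply this is to drop everything and add back only what a given container needs.  The sketch below builds the corresponding @docker run@ arguments; the helper name @cap_args@ and the allow-list in the example are hypothetical, and which capabilities a real workload needs would have to be determined case by case.

```shell
# Build docker run arguments that drop every capability and then add
# back only the ones listed (names as in capabilities(7), e.g. CHOWN).
cap_args() {
    args="--cap-drop=ALL"
    for cap in "$@"; do
        args="$args --cap-add=$cap"
    done
    echo "$args"
}

cap_args                 # no capabilities at all
cap_args CHOWN SETUID    # hypothetical allow-list
```
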
h2. Restrict container networking

@docker run --net=none@

Crunch v2 communicates via arv-mount, which means most containers don't need networking to read from or write to Keep.  Crunch v2 policy is that networking is disabled by default but can be enabled with the runtime constraint @API: true@ (necessary for Arvados-aware containers).  The Docker network bridge should be configured with a firewall whitelist that limits communication to essential Arvados services (API server and Keep server).

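The default-off policy could be expressed roughly as follows.  This is only a sketch: the helper name @network_arg@ and the environment variable @ARVADOS_BRIDGE@ (standing in for the name of a firewalled Docker network) are assumptions, not existing crunch-run settings.

```shell
# Choose the docker network argument from the container's runtime
# constraint.  "$1" is "true" when the container sets API: true.
network_arg() {
    if [ "$1" = "true" ]; then
        # ARVADOS_BRIDGE is a placeholder for a firewalled network name;
        # falls back to Docker's default bridge when unset.
        echo "--net=${ARVADOS_BRIDGE:-bridge}"
    else
        echo "--net=none"
    fi
}

network_arg false   # networking disabled (the default policy)
network_arg true    # Arvados-aware container
```
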
h2. Disable inter-container communication

@docker daemon --icc=false@

Our containers don't need to talk to each other.

h2. Resource limits via cgroups

Slurm can set up a cgroup (control group) to dictate resource limits, and crunch-run can instruct Docker to put the container in the cgroup set up by Slurm.  For this to work, we may need to invoke the Docker daemon with this option:

@--exec-opt native.cgroupdriver=cgroupfs@

Further research is required to see whether Slurm cgroup settings are sufficient to prevent overloading the node or denial of service, or whether we need to set other limits (for example, a limit on the number of processes inside the container to prevent fork bomb attacks).

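To hand the Slurm cgroup to Docker, the launcher first has to find out which cgroup it is running in.  The sketch below assumes the cgroup v1 layout of /proc/self/cgroup (lines of the form hierarchy-ID:controller-list:cgroup-path) and turns the path for one controller into a @--cgroup-parent@ argument.  The helper names and the sample Slurm path are made up for illustration.

```shell
# Print the cgroup path for one controller, given lines in the
# /proc/<pid>/cgroup format "hierarchy-ID:controller-list:cgroup-path"
# on stdin (cgroup v1 layout).
cgroup_path_for() {
    awk -F: -v want="$1" '{
        n = split($2, ctrls, ",")
        for (i = 1; i <= n; i++) {
            if (ctrls[i] == want) {
                # Everything after the second colon is the path.
                print substr($0, length($1) + length($2) + 3)
                exit
            }
        }
    }'
}

# Print the docker argument that places a container under that cgroup.
# Prints nothing if the controller is not listed.
cgroup_parent_arg() {
    path="$(cgroup_path_for "$1")"
    if [ -n "$path" ]; then
        echo "--cgroup-parent=$path"
    fi
}

# Sample input with a made-up Slurm job path; on a compute node the
# input would be /proc/self/cgroup instead.
printf '4:memory:/slurm/uid_1000/job_42\n3:cpu,cpuacct:/slurm/uid_1000/job_42\n' |
    cgroup_parent_arg memory
```

For the process-count limit mentioned above, the @--pids-limit@ option of @docker run@ is one candidate worth evaluating.
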
h2. Resource limits via ulimit

We can also set ulimits on daemon invocation (@--default-ulimit@) and on container invocation (@--ulimit@).  ulimit has some overlap with cgroups, but the difference seems to be that most ulimit settings apply per process rather than to a group of processes.

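Both options take values of the form name=soft[:hard].  The helper below is a small sketch that formats such an argument; the name @ulimit_arg@ and the limit values are illustrative assumptions rather than recommended settings.

```shell
# Format a docker --ulimit argument: <name>=<soft>[:<hard>].
ulimit_arg() {
    name="$1"; soft="$2"; hard="${3:-}"
    if [ -n "$hard" ]; then
        echo "--ulimit ${name}=${soft}:${hard}"
    else
        echo "--ulimit ${name}=${soft}"
    fi
}

ulimit_arg nofile 1024 2048   # soft and hard limit on open files
ulimit_arg nproc 512          # single value, no separate hard limit given
```

Note that nproc is accounted per user id rather than per process or per container, so it is not by itself a per-container process limit.
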
h2. seccomp

Seccomp filters the system calls that programs inside the container can make; many of the system calls it filters can also be blocked by dropping capabilities.

https://docs.docker.com/engine/security/seccomp/

h2. AppArmor

AppArmor can further limit what programs inside the container (including those running as "root") can do.  To be really effective, profiles need to be tailored to specific application containers.

https://docs.docker.com/engine/security/apparmor/

h2. SELinux

@docker daemon --selinux-enabled@

Enables SELinux support.  I don't know what that entails.