Bug #17244

closed

Make sure cgroupsV2 works with Arvados

Added by Nico César over 3 years ago. Updated 9 months ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: Crunch
Target version:
Story points: 2.0
Release relationship: Auto

Description

Reading

https://docs.docker.com/config/containers/runmetrics/

Running Docker on cgroup v2

Docker supports cgroup v2 experimentally since Docker 20.10. Running Docker on cgroup v2 also requires the following conditions to be satisfied:

containerd: v1.4 or later
runc: v1.0.0-rc91 or later
Kernel: v4.15 or later (v5.2 or later is recommended)
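
As a rough illustration of the kernel requirement above, the following Python sketch classifies a kernel release string against the 4.15 minimum and the 5.2 recommendation. The function names and the returned labels are made up for this example, and only the leading major.minor numbers are inspected.

```python
import re

MIN_KERNEL = (4, 15)         # minimum kernel quoted above
RECOMMENDED_KERNEL = (5, 2)  # recommended kernel quoted above


def parse_kernel_release(release):
    """Return (major, minor) from a release string such as '5.15.0-91-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_status(release):
    """Classify a kernel release as 'too old', 'supported' or 'recommended'."""
    version = parse_kernel_release(release)
    if version < MIN_KERNEL:
        return "too old"
    if version < RECOMMENDED_KERNEL:
        return "supported"
    return "recommended"
```

On a Linux host, passing the value of `platform.release()` to `kernel_status` would classify the running kernel.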

Note that the cgroup v2 mode behaves slightly differently from the cgroup v1 mode:

The default cgroup driver (dockerd --exec-opt native.cgroupdriver) is “systemd” on v2, “cgroupfs” on v1.
The default cgroup namespace mode (docker run --cgroupns) is “private” on v2, “host” on v1.
The docker run flags --oom-kill-disable and --kernel-memory are discarded on v2.
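
To see which of these defaults a given daemon is actually using, one option is to inspect the output of `docker info --format '{{json .}}'`. The sketch below only parses such JSON; it assumes the `CgroupDriver` and `CgroupVersion` fields are present (the latter is not reported by Docker releases before 20.10, in which case the sketch assumes v1). The function name and the result keys are hypothetical.

```python
import json

# Defaults quoted above, keyed by cgroup version as reported by Docker.
EXPECTED_DEFAULT_DRIVER = {"1": "cgroupfs", "2": "systemd"}


def summarize_docker_cgroups(info_json):
    """Summarize cgroup settings from `docker info --format '{{json .}}'` output."""
    info = json.loads(info_json)
    # Assumption: a missing or empty CgroupVersion means an older Docker on v1.
    version = str(info.get("CgroupVersion") or "1")
    driver = info.get("CgroupDriver", "unknown")
    return {
        "cgroup_version": version,
        "cgroup_driver": driver,
        "driver_is_default": EXPECTED_DEFAULT_DRIVER.get(version) == driver,
    }
```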

With all these changes, we have to make sure that:

  1. We can run on a distro that has cgroup v2 by default (as in Fedora 2020), or with kernel parameters that boot with cgroups v2 enabled in systemd (kernel param systemd.unified_cgroup_hierarchy=1), and docker version >= 2020.04
  2. We can guide the admin through upgrading to cgroup v2 and provide a test case that makes it easy to check that Arvados will run
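
A simple check of this kind could start by reporting which cgroup layout the host booted with. The sketch below is not Arvados code; it relies on the cgroup v2 hierarchy exposing a `cgroup.controllers` file in its root, and on systemd's hybrid layout mounting the v2 hierarchy at `unified` under the cgroup root. The function name and the returned labels are hypothetical.

```python
import os


def detect_cgroup_mode(root="/sys/fs/cgroup"):
    """Return 'unified', 'hybrid', 'legacy' or 'unknown' for a cgroup mount root."""
    # A cgroup v2 hierarchy exposes cgroup.controllers in its root directory.
    if os.path.isfile(os.path.join(root, "cgroup.controllers")):
        return "unified"
    # systemd's hybrid layout keeps v1 controllers and mounts v2 at <root>/unified.
    if os.path.isfile(os.path.join(root, "unified", "cgroup.controllers")):
        return "hybrid"
    # The directory exists but shows no v2 hierarchy: treat it as v1 only.
    if os.path.isdir(root):
        return "legacy"
    return "unknown"
```

The `root` parameter exists so the function can be pointed at a temporary directory in tests instead of the real `/sys/fs/cgroup`.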

Point 2 is important because the current error is rather cryptic:

applying cgroup configuration for process caused: cannot enter cgroupv2 "/sys/fs/cgroup/docker" with domain controllers

There are also cryptic messages with a cgroups v2-enabled host and Docker 19.03.13:

Status: Downloaded newer image for hello-world:latest
docker: Error response from daemon: cgroups: cgroup mountpoint does not exist: unknown.
ERRO[0005] error waiting for container: context canceled
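
One way to make such failures less cryptic would be to map the known error strings to a short hint for the admin. The sketch below matches on substrings of the two messages quoted above; the hint wording, the table, and the function name are illustrative assumptions, not existing Arvados or Docker behavior.

```python
# Substrings taken from the error messages quoted above; the hints are
# illustrative wording, not Arvados or Docker output.
KNOWN_CGROUP_ERRORS = [
    ("cannot enter cgroupv2",
     "The host uses cgroup v2 and the runtime could not join the requested cgroup."),
    ("cgroup mountpoint does not exist",
     "The host may use cgroup v2 with a Docker release older than 20.10."),
]


def explain_cgroup_error(message):
    """Return a hint for a known cgroup error message, or None if unrecognized."""
    for needle, hint in KNOWN_CGROUP_ERRORS:
        if needle in message:
            return hint
    return None
```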

https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html

We can remove the crunchstat command-line program and debian package rather than update it.


Subtasks 3 (0 open, 3 closed)

Task #20712: Review 17244-cgroup2 | Resolved | Tom Clegg | 07/18/2023
Task #20808: Confirm works on dev clusters | Resolved | Tom Clegg | 07/18/2023
Task #20835: Update tordo compute image kernel config from "hybrid" to "unified" mode | Resolved | Lucas Di Pentima | 08/09/2023

Related issues

Related to Arvados - Bug #17270: Test for docker cgroups issues in crunch-run works on ubuntu 20.04 | Resolved | Nico César
Related to Arvados - Bug #20616: "cgroup stats files never appeared" on scale cluster | Duplicate | Lucas Di Pentima | 07/05/2023
Blocks Arvados - Feature #20756: Support crunchstat tracking and memory limits with singularity | New | Tom Clegg