Bug #12991
closed [crunch2] Propagate memory limit from runtime constraints to docker container
Description
Current behavior: When a container tries to use more memory than it asked for, it competes with system processes, and the kernel OOM-killer sometimes kills system processes instead of the container.
Desired behavior: When a container tries to allocate more memory than specified in runtime_constraints, the allocation fails and/or the container is killed. System processes (including crunch-run and slurmd) are not killed.
Explanation: We use the memory and CPU figures in container runtime_constraints to choose an appropriate node to run a container on (even taking kernel/system overhead into account), but we don't tell docker to limit the container's memory use.
Proposed solution: We have an opportunity to do this in source:services/crunch-run/crunchrun.go L918:
    Resources: dockercontainer.Resources{
        CgroupParent: runner.setCgroupParent,
    },
(dockercontainer.Resources also has Memory and NanoCPUs fields)
The container's memory size (including swap) should be limited to the number of bytes given in runtime_constraints.
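A minimal sketch of what the proposed change could look like, not the actual crunch-run patch. It assumes the byte count from runtime_constraints is already available as an int64. The Resources type is a local stand-in for the relevant dockercontainer.Resources fields, so the example compiles without the Docker client library, and limitedResources is a hypothetical helper name. In Docker, MemorySwap is the combined memory-plus-swap limit, so setting it equal to Memory caps the total at the requested size.

```go
package main

// Resources is a local stand-in for the subset of
// dockercontainer.Resources fields discussed in this ticket.
// It is an assumption made so the sketch has no external dependencies.
type Resources struct {
	CgroupParent string
	Memory       int64 // memory limit in bytes; 0 means no limit
	MemorySwap   int64 // combined memory+swap limit in bytes
}

// limitedResources is a hypothetical helper: it builds the Resources
// value for a container whose runtime constraints request ramBytes
// bytes of memory.
func limitedResources(cgroupParent string, ramBytes int64) Resources {
	res := Resources{CgroupParent: cgroupParent}
	if ramBytes > 0 {
		// Cap RAM at the requested size.
		res.Memory = ramBytes
		// MemorySwap covers memory plus swap, so making it equal to
		// Memory keeps the total (including swap) within the request.
		res.MemorySwap = ramBytes
	}
	return res
}
```

With a limit like this in place, a container that exceeds its request should be the one that fails or gets OOM-killed inside its own cgroup, rather than crunch-run or slurmd.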