Bug #3792

[Crunch] Docker daemon grows to use all RAM, then won't start new containers

Added by Brett Smith over 9 years ago. Updated almost 7 years ago.

Status: Closed
Priority: Normal
Assigned To: -
Category: Crunch
Target version: -
Story points: -

Description

One of qr1hi's compute nodes recently got into a state where it would not start any more Docker containers, because it could not allocate RAM for them. The specific error message was:

2014/09/03 14:06:48 Error response from daemon: Cannot start container HASH: fork/exec /tmp/docker/init/dockerinit-1.1.2: cannot allocate memory

free reported that >90% of RAM was free, but ps showed that the Docker daemon had a large amount of RAM reserved. Compute nodes are configured not to overcommit memory, so Linux would not offer this reserved-but-unused RAM to any other process, including the fork/exec needed to start a new container. Restarting the daemon resolved the issue.
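To spot this state on a node, it may help to compare the kernel's commit accounting rather than the free-memory figure. The following is a minimal sketch, assuming a Linux node where /proc/meminfo exposes CommitLimit and Committed_AS and /proc/sys/vm/overcommit_memory holds the overcommit mode (2 means strict, no overcommit). The function names are hypothetical and not part of Crunch.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: integer value}.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are counts.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def commit_headroom_kb(meminfo):
    """Return how much more memory (kB) can be committed before allocations fail."""
    return meminfo["CommitLimit"] - meminfo["Committed_AS"]


def main():
    # Reads the live values; only meaningful on a Linux host.
    with open("/proc/sys/vm/overcommit_memory") as f:
        mode = int(f.read().strip())
    with open("/proc/meminfo") as f:
        info = parse_meminfo(f.read())
    print("overcommit_memory mode:", mode)
    print("MemFree (kB):", info.get("MemFree"))
    print("commit headroom (kB):", commit_headroom_kb(info))


if __name__ == "__main__":
    main()
```

A node in the state described above would be expected to show a large MemFree alongside a small commit headroom, since strict overcommit counts reserved memory whether or not it is in use.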

We need to figure out a more permanent way to deal with this. One part could be to restart the Docker daemon regularly between jobs. We also may want to consider tweaks to Linux's RAM tunables on compute nodes.
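One possible shape for the restart-between-jobs idea is sketched below. It is an illustration only, under several assumptions: the daemon's PID is in /var/run/docker.pid, the daemon is managed with `service docker restart`, the check runs only when no job is active, and the 4 GiB threshold is an arbitrary placeholder. All function names are hypothetical.

```python
import subprocess


def vm_size_kb(status_text):
    """Return the VmSize value (kB) from /proc/<pid>/status-style text, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmSize:"):
            return int(line.split()[1])
    return None


def should_restart(vm_kb, limit_kb):
    """Decide whether the daemon's virtual size exceeds the allowed limit."""
    return vm_kb is not None and vm_kb > limit_kb


def restart_if_bloated(pid_file="/var/run/docker.pid", limit_kb=4 * 1024 * 1024):
    """Restart the Docker daemon if its virtual size is over limit_kb.

    Assumed to be called between jobs, when no containers are running.
    Returns True if a restart was issued.
    """
    with open(pid_file) as f:
        pid = f.read().strip()
    with open("/proc/%s/status" % pid) as f:
        size = vm_size_kb(f.read())
    if should_restart(size, limit_kb):
        # Assumed service name and init tooling; adjust for the node's setup.
        subprocess.check_call(["service", "docker", "restart"])
        return True
    return False
```

Restarting only when the daemon has grown past a threshold, rather than after every job, would avoid unnecessary restarts, at the cost of having to choose a threshold that suits the node's RAM.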
