Bug #4185
closed
[Crunch] crunchstat memory reports seem suspect for multithreaded programs
Added by Brett Smith about 10 years ago.
Updated about 10 years ago.
Description
qr1hi-8i9sb-zs7g4zo303dhdnu tries to run the GATK HaplotypeCaller with multiple threads and Java's maximum heap size set to 4g. The job ends when GATK aborts, claiming that not enough RAM is available to run the analysis.
qr1hi-8i9sb-l7vv9qqxozezv38 successfully ran the same analysis with a maximum heap size of 15g. However, the rss lines from crunchstat report that the maximum RAM used was about .5GiB (grep -F 'memory.stat rss' qr1hi-8i9sb-l7vv9qqxozezv38.log.txt | python3 -c 'import sys; print(max(int(line.split()[-1]) for line in sys.stdin))'
).
This reporting seems inconsistent with the results from the first job. I haven't dug deeply, but I'm guessing this has something to do with the way cgroups report memory use in the context of multithreaded programs. If possible, it would be good to find and use a number that more accurately reflects the job's "real" memory use.
- Target version changed from Bug Triage to Arvados Future Sprints
This could be a clue, in https://www.kernel.org/doc/Documentation/cgroups/memory.txt
Only anonymous and swap cache memory is listed as part of 'rss' stat. This should not be confused with the true 'resident set size' or the amount of physical memory used by the cgroup.
While crunchstat and cgroups are not above suspicion, I think it's possible that the "rss" figure is correct. Java allocates a large heap (as requested with -Xmx) but this is "virt" size, not resident size. AIUI it's possible for a Java program to allocate large amounts of memory from the heap without writing to much of it, and in this case Java could reach its heap limit well before Linux devotes a comparable amount of resident RAM to the process.
Watching a java program on compute0.q (qr1hi-8i9sb-brduspenge0fw63)
Wed Oct 22 15:17:18 2014 qr1hi-8i9sb-brduspenge0fw63 9565 1 stderr crunchstat: memory.stat rss 289296384
However:
cat /sys/fs/cgroup/memory/docker/68f7b9b13e4ac6c1644809d0733931f157886aa8b7a6e772461d8d2abe58182f/memory.stat
cache 3121152
rss 9482752000
rss_huge 0
mapped_file 417792
writeback 4096
swap 1263448064
pgpgin 8254441
pgpgout 5938554
pgfault 5931378
pgmajfault 488211
inactive_anon 1126043648
active_anon 8356839424
inactive_file 1638400
active_file 1482752
unevictable 0
hierarchical_memory_limit 18446744073709551615
hierarchical_memsw_limit 18446744073709551615
total_cache 3121152
total_rss 9482752000
total_rss_huge 0
total_mapped_file 417792
total_writeback 4096
total_swap 1263448064
total_pgpgin 8254441
total_pgpgout 5938554
total_pgfault 5931378
total_pgmajfault 488211
total_inactive_anon 1126043648
total_active_anon 8356839424
total_inactive_file 1638400
total_active_file 1482752
total_unevictable 0
OTOH maybe I was looking at the wrong task above, because now top says
3016 1000 20 0 12.9g 387m 1884 S 7.7 2.6 216:25.50 java
3064 1000 20 0 12.9g 534m 3712 S 6.0 3.5 47:12.57 java
3192 1000 20 0 13.0g 1.1g 2152 S 7.3 7.4 60:46.94 java
3194 1000 20 0 12.9g 2.1g 1996 S 6.3 14.3 90:26.65 java
3193 1000 20 0 12.9g 2.1g 2268 S 7.3 14.1 84:10.51 java
3191 1000 20 0 12.9g 2.4g 2164 S 6.7 16.1 123:10.97 java
3063 1000 20 0 12.9g 2.5g 1880 S 7.0 17.1 120:49.54 java
2952 1000 20 0 12.9g 2.9g 2252 S 25.9 19.5 70:22.32 java
while egrep -h ^rss\ /sys/fs/cgroup/memory/docker/*/memory.stat | sort -nk2
says
rss 406056960
rss 562262016
rss 1167360000
rss 2233044992
rss 2270339072
rss 2546380800
rss 2706317312
rss 3091529728
...which looks totally fine if you assume "g" and "m" mean 2^30 and 2^20.
- Status changed from New to In Progress
- Story points set to 0.5
Added "swap" to crunchstat memory stats, to help expose low-memory conditions. ("pgmajfault" was also added, in #3826.)
(Rss has probably been correct the whole time.)
- Status changed from In Progress to Feedback
- Target version changed from Arvados Future Sprints to 2014-11-19 sprint
- Story points deleted (
0.5)
- Assigned To set to Tom Clegg
- Status changed from Feedback to Resolved
Also available in: Atom
PDF
Report "swap" memory statistic if kernel provides it. refs #4185