Bug #4185

closed

[Crunch] crunchstat memory reports seem suspect for multithreaded programs

Added by Brett Smith about 10 years ago. Updated about 10 years ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: Crunch
Target version:
Story points: 0.5

Description

qr1hi-8i9sb-zs7g4zo303dhdnu tries to run the GATK HaplotypeCaller with multiple threads and Java's maximum heap size set to 4g. The job ends when GATK aborts, claiming that not enough RAM is available to run the analysis.

qr1hi-8i9sb-l7vv9qqxozezv38 successfully ran the same analysis with a maximum heap size of 15g. However, the rss lines from crunchstat report a maximum RAM use of only about 0.5 GiB (grep -F 'memory.stat rss' qr1hi-8i9sb-l7vv9qqxozezv38.log.txt | python3 -c 'import sys; print(max(int(line.split()[-1]) for line in sys.stdin))').
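For anyone repeating this check, here is a slightly more defensive sketch of the same one-liner. It keeps the one-liner's assumption that the rss value is the last whitespace-separated field on each matching line, and it additionally assumes the value is in bytes. The function name `max_rss_bytes` and the sample lines are hypothetical, made up for illustration; they are not real crunchstat output.

```python
from typing import Iterable, Optional

MARKER = "memory.stat rss"  # same substring that grep -F matches above


def max_rss_bytes(lines: Iterable[str]) -> Optional[int]:
    """Return the largest rss value among matching lines, or None if none match."""
    values = []
    for line in lines:
        if MARKER not in line:
            continue
        last_field = line.split()[-1]
        try:
            values.append(int(last_field))
        except ValueError:
            # Skip lines whose last field is not an integer.
            continue
    return max(values) if values else None


# Hypothetical sample lines, only to show the expected shape of the input.
sample = [
    "2014-10-14 crunchstat: memory.stat rss 268435456",
    "2014-10-14 crunchstat: memory.stat rss 536870912",
    "2014-10-14 crunchstat: memory.stat cache 1048576",
]
peak = max_rss_bytes(sample)
if peak is not None:
    print(f"{peak} bytes ({peak / 2**30:.2f} GiB)")
```

To run it against a real log, pass an open file object (for example `open("qr1hi-8i9sb-l7vv9qqxozezv38.log.txt")`) instead of `sample`.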

This reporting seems inconsistent with the results from the first job. I haven't dug deeply, but I'm guessing this has something to do with the way cgroups report memory use in the context of multithreaded programs. If possible, it would be good to find and use a number that more accurately reflects the job's "real" memory use.
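One way to start digging would be to look at rss next to the other counters in the cgroup's memory.stat file instead of rss alone. The sketch below assumes the cgroup v1 layout, where memory.stat is a flat list of "name value" lines; `parse_memory_stat` is a hypothetical helper and the sample values are invented. It does not settle which counter best reflects the job's "real" memory use; it only makes the candidates easy to compare.

```python
from typing import Dict


def parse_memory_stat(text: str) -> Dict[str, int]:
    """Parse 'name value' lines (cgroup v1 memory.stat style) into a dict."""
    stats: Dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split()
        # Keep only well-formed "name <non-negative integer>" lines.
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


# Invented sample content, for illustration only.
SAMPLE = """cache 1048576
rss 536870912
mapped_file 4096
swap 0
"""

stats = parse_memory_stat(SAMPLE)
for name in ("rss", "cache", "mapped_file", "swap"):
    print(name, stats.get(name, 0))
```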


Subtasks 1 (0 open, 1 closed)

Task #4372: Find out whether current stats make sense to users (Resolved, Tom Clegg, 10/14/2014)

Related issues 2 (0 open, 2 closed)

Related to Arvados - Task #4086: [Crunch] Not enough memory to run GATK (Resolved, Brett Smith, 10/03/2014)
Related to Arvados - Idea #3826: [Crunch] Display network activity in crunchstat (Resolved, Tom Clegg, 10/10/2014)