Bug #8798

Updated by Peter Amstutz about 8 years ago

Recently ran into the "cannot fork, out of memory" error, which required a restart of node manager. The memory profile was approximately 97 MiB resident memory size and 43 GiB virtual memory size. This suggests that #8543 was successful in eliminating the egregious memory leak, but some other behavior is causing unbounded growth in the virtual process size. This isn't quite as bad as before (it doesn't take up all the resident memory and crash other processes on the system), but it still reaches a point where the kernel won't fork the process any more (likely due to the page table growing too large).
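To track the gap between resident and virtual size over time, something like the following could be run periodically against the node manager process. This is a minimal sketch that assumes a Linux host exposing `/proc/<pid>/status`; the helper names `parse_vm_status` and `read_vm_status` are hypothetical and not part of node manager.

```python
def parse_vm_status(status_text):
    """Return the VmSize and VmRSS values (in kB) found in the text of
    a Linux /proc/<pid>/status file."""
    sizes = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key in ("VmSize", "VmRSS"):
            # The value looks like "   45088768 kB"; keep only the number.
            sizes[key] = int(value.split()[0])
    return sizes


def read_vm_status(pid="self"):
    """Read and parse /proc/<pid>/status (Linux only; an assumption here)."""
    with open("/proc/{}/status".format(pid)) as f:
        return parse_vm_status(f.read())
```

Logging `read_vm_status(pid)` at intervals would show whether VmSize keeps climbing while VmRSS stays flat, which is the pattern described above.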

Requires further investigation. One possible suspect is threading: node manager creates and discards a huge number of threads, and if each one bumps up the virtual size by a little bit, it would add up. If this seems to be the case, consider a thread pooling solution to re-use threads.
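As a rough illustration of the thread pooling idea, the sketch below runs many short tasks on a fixed set of worker threads using Python's standard `concurrent.futures.ThreadPoolExecutor`. It is not node manager's actual code: `run_with_pool`, the `max_workers` value, and the bookkeeping of thread names are assumptions made only to show that the same threads get re-used.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_with_pool(tasks, max_workers=4):
    """Run zero-argument callables on a fixed-size pool.

    Returns (results, used), where results are in task order and used is
    the set of worker thread names that actually ran a task.
    """
    used = set()
    lock = threading.Lock()

    def wrapped(task):
        # Record which worker picked up this task (illustration only).
        with lock:
            used.add(threading.current_thread().name)
        return task()

    # The executor creates at most max_workers threads and re-uses them,
    # instead of creating and discarding one thread per task.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(wrapped, tasks))
    return results, used
```

With 100 tasks and `max_workers=4`, the returned set of thread names has at most four entries, so the number of threads ever created stays bounded no matter how many tasks are submitted.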
