Idea #4305
closed
[Node Manager] Investigate and try to address RAM use
Added by Brett Smith over 9 years ago.
Updated over 6 years ago.
Description
Node Manager uses ~1GiB of RAM on our production clusters. Investigate why it's so large, and try to bring it down.
When I run it on my desktop in local testing mode with the dummy driver, it takes ~200MiB. This makes me think apache-libcloud's EC2 driver is responsible for a good chunk of this. But that's just a theory that needs proving.
Another idea that occurred to me: arvnodeman.computenode.ec2 loads information about node sizes, images, security groups, etc. by listing them all and finding the one with the matching name. It shouldn't be holding on to the full list, but maybe the underlying ec2 driver is caching it, or maybe it's just hanging around without being garbage collected.
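One way to test both theories would be to compare allocation snapshots taken before and after a listing call, and see which source files account for the growth. The following is only a sketch using the standard library's tracemalloc module, so it assumes a Python 3 interpreter. `fake_listing` is a hypothetical stand-in for a real driver call such as `list_sizes()` or `list_images()`, and `top_allocators` is a helper name invented for this example.

```python
import tracemalloc


def top_allocators(workload, limit=5):
    """Run workload() and report which source files allocated the most."""
    tracemalloc.start()
    before = tracemalloc.take_snapshot()
    # Keep the return value so its allocations are still live in the
    # second snapshot.
    result = workload()
    after = tracemalloc.take_snapshot()
    tracemalloc.stop()
    # Group allocations by source file; the largest changes come first.
    stats = after.compare_to(before, "filename")
    return result, stats[:limit]


def fake_listing():
    # Hypothetical stand-in for a cloud driver listing call.
    return [{"id": i, "name": "item-%d" % i} for i in range(50000)]


if __name__ == "__main__":
    _, stats = top_allocators(fake_listing)
    for stat in stats:
        print(stat.traceback[0].filename, stat.size_diff // 1024, "KiB")
```

If files under the libcloud package dominated the report after a real listing call, that would support the driver theory. Running the same comparison again after dropping the result and calling `gc.collect()` would show whether the list is still being held somewhere.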
This approach is illustrated in the apache-libcloud tutorials (literally on the project's front page), so I figured it was the best practice, but I didn't investigate very deeply. If there's a way to look up these items directly by their ID, that might make a noticeable dent in RAM use.
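To make the difference concrete, here is a rough sketch of the two lookup styles. It assumes the driver's `list_images()` accepts an `ex_image_ids` argument, as libcloud's EC2 driver does in the versions I'm aware of; that should be checked against the version actually deployed. `Image`, `FakeDriver`, and both `find_image_*` helpers are hypothetical names used only for illustration, not Node Manager code.

```python
from collections import namedtuple

# Hypothetical stand-in for libcloud's image objects.
Image = namedtuple("Image", ["id", "name"])


class FakeDriver:
    """Hypothetical driver that counts how many records it hands back."""

    def __init__(self, images):
        self._images = list(images)
        self.returned = 0

    def list_images(self, ex_image_ids=None):
        if ex_image_ids is None:
            found = list(self._images)
        else:
            found = [i for i in self._images if i.id in ex_image_ids]
        self.returned += len(found)
        return found


def find_image_by_listing(driver, image_id):
    """List everything, then pick out the one match."""
    for image in driver.list_images():
        if image.id == image_id:
            return image
    raise LookupError(image_id)


def find_image_by_id(driver, image_id):
    """Ask the provider for a single image (assumes ex_image_ids support)."""
    images = driver.list_images(ex_image_ids=[image_id])
    if not images:
        raise LookupError(image_id)
    return images[0]


if __name__ == "__main__":
    driver = FakeDriver(Image("ami-%d" % n, "image %d" % n) for n in range(1000))
    find_image_by_listing(driver, "ami-7")
    print("records returned by full listing:", driver.returned)
    driver.returned = 0
    find_image_by_id(driver, "ami-7")
    print("records returned by ID lookup:", driver.returned)
```

Both helpers return the same object; the difference is how many records the driver has to parse and build along the way, which is where any memory saving would come from.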
Node Manager running on Azure consumes significantly less memory. This suggests that maybe the libcloud EC2 driver is doing something silly, or that we are using it in a silly way.
- Status changed from New to Closed
- Target version deleted (Arvados Future Sprints)