Bug #9223

[Node manager] Uses huge amount of RAM on AWS

Added by Peter Amstutz over 1 year ago. Updated 19 days ago.

Status: In Progress
Start date:
Priority: Normal
Due date:
Assignee: -
% Done: 0%
Category: Node Manager
Target version: -
Story points: -
Velocity based estimate: -

Description

When node manager starts on AWS, it quickly uses a huge amount of RAM (2 GiB) despite not doing anything.

I've tracked this down to arvnodeman.computenode.driver.ec2.ComputeNodeDriver._init_image_id which calls libcloud.compute.drivers.ec2.BaseEC2NodeDriver.list_images. It seems that the problem is (a) it is listing a huge number of images (many thousands) and (b) it is retaining a lot of memory even for image records that should be GC'd.

We are currently using libcloud 0.16 on AWS, which is a very out-of-date version.


Related issues

Related to Arvados - Bug #12055: [node manager] ec2 set tags on create (Resolved, 08/16/2017)

Associated revisions

Revision 1ba39510
Added by Lucas Di Pentima 23 days ago

12055: Avoid RAM exhaustion on bootup by asking AWS only the AMI
list owned by 'self'. refs #9223 #12163

Arvados-DCO-1.1-Signed-off-by: Lucas Di Pentima <>
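
The idea in the commit message above can be sketched roughly as follows. This is not the actual change from revision 1ba39510: FakeImage, FakeEC2Driver and find_image are hypothetical names made up for illustration, and the sketch assumes that libcloud's EC2 list_images accepts an ex_owner argument the same way the stub does. The point is only that restricting the listing to images owned by 'self' keeps thousands of public image records from ever being built.

```python
class FakeImage:
    """Stand-in for a libcloud image record (hypothetical, illustration only)."""
    def __init__(self, id, owner):
        self.id = id
        self.owner = owner


class FakeEC2Driver:
    """Hypothetical stub that mimics the list_images call; not libcloud itself."""
    def __init__(self, images):
        self._images = images

    def list_images(self, ex_owner=None):
        # With no owner filter, every image visible to the account is returned.
        if ex_owner is None:
            return list(self._images)
        # With a filter, only images belonging to that owner are returned.
        return [img for img in self._images if img.owner == ex_owner]


def find_image(driver, image_id):
    """Look up one image while listing only images owned by the caller."""
    for image in driver.list_images(ex_owner="self"):
        if image.id == image_id:
            return image
    return None


# Simulate a region with many public images and a single image of our own.
catalog = [FakeImage("ami-%05d" % n, "public") for n in range(10000)]
catalog.append(FakeImage("ami-mine", "self"))
driver = FakeEC2Driver(catalog)

everything = driver.list_images()            # unfiltered: the whole catalog
owned = driver.list_images(ex_owner="self")  # filtered: only our own image
found = find_image(driver, "ami-mine")
```

One consequence of this filter is that an image ID owned by another account will not be found, so it only fits deployments where the configured image belongs to the account running node manager.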

History

#1 Updated by Peter Amstutz over 1 year ago

  • Description updated (diff)
  • Category set to Node Manager

#2 Updated by Peter Amstutz over 1 year ago

As a first try I suggest updating libcloud to the latest version we have packaged (0.20 I think?)

#3 Updated by Tom Morris 12 months ago

The current version of libcloud is 1.2.1. We should definitely not be using a <1.0 release when 1.0+ is available.

https://pypi.python.org/pypi/apache-libcloud/1.2.1

#4 Updated by Peter Amstutz 12 months ago

The complication is that we're using a fork of libcloud which adds the Azure support that we need. We should try to get that merged upstream, but that requires allocating some engineering time to move it through the process.

#5 Updated by Nico César 19 days ago

  • Status changed from New to In Progress

resolved in #12163
