Bug #5095

[FUSE] arv-mount takes up too much memory and occasionally crashes when listing large 'home' directory

Added by Abram Connelly over 4 years ago. Updated over 4 years ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: FUSE
Target version:
Start date: 02/16/2015
Due date:
% Done: 100%
Estimated time: (Total: 0.00 h)
Story points: 2.0

Description

On lightning-dev2 (qr1hi), in the home directory of my keep mount (/home/abram/keep/home), doing an 'ls' causes arv-mount to balloon to 95% memory usage and sometimes crash. The 'ls' command itself sometimes takes 5-15 minutes to complete.

~$ cd keep/home
~/keep/home$ ls

(this could take more than 5 minutes or so).

If it completes without crashing arv-mount (a crash means I need to remount the keep mount point), then subsequent directory listings run without much trouble. I think arv-mount also settles down to a more reasonable memory usage (right now it's at 50% or so).

File access in the keep mount, by going to ~/keep/by_id/<PDH> for example, while the 'ls' is running works fine.

I believe I have around 1.7k elements in my 'home' project.


Subtasks

Task #5216: Review 5095-fuse-ls-takes-forever (Resolved, Peter Amstutz)


Related issues

Related to Arvados - Bug #4464: [Workbench] Collections tab loads forever on a specific project (Resolved, 02/05/2015)

Associated revisions

Revision 8b90f80e
Added by Peter Amstutz over 4 years ago

Merge branch '5095-fuse-ls-takes-forever' closes #5095

History

#1 Updated by Brett Smith over 4 years ago

  • Subject changed from arv-mount takes up too much memory and occasionally crashes when listing the 'home' directory in my keep mount to [FUSE] arv-mount takes up too much memory and occasionally crashes when listing large 'home' directory
  • Category set to Keep

#2 Updated by Tom Clegg over 4 years ago

Possible problems/solutions:
  • arv-mount is retrieving the manifest_text for all those collections, even though it doesn't need to do that in order to show a list of files.
  • arv-mount isn't retrieving the manifest_text, but apiserver is spending a lot of time preprocessing each collection/page anyway.
  • arv-mount is fetching smaller pages than necessary. (Perhaps limit=1000 would be better if you know you're just going to keep asking for more pages anyway?)

Worth up to 1.0 points for investigating and reporting how much these (and other) factors contribute, even if the corresponding fixes aren't trivial.

Worth 2.0 points if the second point is a significant improvement.
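
To make the first and third possibilities concrete, here is a rough sketch of a paged listing loop that asks only for the fields a directory listing needs and uses a larger page size. It is not arv-mount's actual code: `list_all`, `make_fake_api`, and the in-memory fake endpoint are hypothetical names invented for illustration, and the `limit`/`offset`/`select`/`items_available` shape is an assumption modeled on a typical paged list API.

```python
def list_all(fetch_page, page_size=1000,
             select=("uuid", "name", "modified_at")):
    """Collect every item by requesting pages until the reported total is reached."""
    items = []
    offset = 0
    while True:
        # Ask only for the selected fields, so manifest_text is never transferred.
        page = fetch_page(offset=offset, limit=page_size, select=list(select))
        items.extend(page["items"])
        offset += len(page["items"])
        if not page["items"] or offset >= page["items_available"]:
            break
    return items


def make_fake_api(total):
    """Hypothetical in-memory stand-in for a paged list endpoint; records each request."""
    records = [{"uuid": "fake-%05d" % i,
                "name": "collection %d" % i,
                "modified_at": "2015-02-16T00:00:00Z",
                "manifest_text": "x" * 100}
               for i in range(total)]
    calls = []

    def fetch_page(offset, limit, select):
        calls.append((offset, limit))
        rows = records[offset:offset + limit]
        return {"items": [{key: row[key] for key in select} for row in rows],
                "items_available": total}

    return fetch_page, calls


if __name__ == "__main__":
    for size in (100, 1000):
        fetch_page, calls = make_fake_api(1700)
        items = list_all(fetch_page, page_size=size)
        print("page_size=%d: %d items in %d requests" % (size, len(items), len(calls)))
```

With about 1.7k elements, a page size of 100 needs 17 round trips in this sketch while a page size of 1000 needs 2. How much that matters relative to server-side processing time is exactly what the investigation would need to measure.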

#3 Updated by Tom Clegg over 4 years ago

  • Category changed from Keep to FUSE
  • Story points set to 2.0

#4 Updated by Tom Clegg over 4 years ago

  • Target version changed from Bug Triage to 2015-02-18 sprint

#5 Updated by Peter Amstutz over 4 years ago

  • Assigned To set to Peter Amstutz

#6 Updated by Peter Amstutz over 4 years ago

Related to #4464?

#7 Updated by Brett Smith over 4 years ago

Reviewing 4106786.

  • Directory uses the current time as the default mtime. CollectionDirectory overrides this with 0 for API collections that don't provide a modified_at value, but leaves Directory's default in place for collections retrieved from Keep. Isn't this kind of inconsistent?
  • Sort of following along with the theme of this branch, wouldn't it save both time and code to update self._mtime in CollectionDirectory.new_collection? Then CollectionDirectory wouldn't need to override the mtime method at all, it could just use Directory's.

Thanks.

#8 Updated by Peter Amstutz over 4 years ago

  • Status changed from New to In Progress

#9 Updated by Peter Amstutz over 4 years ago

Brett Smith wrote:

Reviewing 4106786.

  • Directory uses the current time as the default mtime. CollectionDirectory overrides this with 0 for API collections that don't provide a modified_at value, but leaves Directory's default in place for collections retrieved from Keep. Isn't this kind of inconsistent?

I had not thought of that. It now defaults to 0 in both cases.

  • Sort of following along with the theme of this branch, wouldn't it save both time and code to update self._mtime in CollectionDirectory.new_collection? Then CollectionDirectory wouldn't need to override the mtime method at all, it could just use Directory's.

Yes. Fixed.

Now at ef4e4a3
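
For readers following the review, the shape of the change discussed above is roughly as follows. This is a simplified sketch, not the code in ef4e4a3: the real classes have many more members, and the timestamp format and the dict-style `record` argument are assumptions made for illustration.

```python
import calendar
import time


class Directory(object):
    """Simplified stand-in for the FUSE directory base class."""

    def __init__(self):
        # Default to 0 rather than the current time when no mtime is known.
        self._mtime = 0

    def mtime(self):
        return self._mtime


class CollectionDirectory(Directory):
    """Simplified stand-in: sets _mtime once instead of overriding mtime()."""

    def new_collection(self, record):
        # 'record' is assumed to be a dict that may carry a modified_at string.
        modified_at = record.get("modified_at")
        if modified_at:
            parsed = time.strptime(modified_at, "%Y-%m-%dT%H:%M:%SZ")
            self._mtime = calendar.timegm(parsed)  # interpret as UTC
        else:
            self._mtime = 0


if __name__ == "__main__":
    d = CollectionDirectory()
    d.new_collection({"modified_at": "2015-02-16T00:00:00Z"})
    print(d.mtime())
    d.new_collection({})
    print(d.mtime())
```

Because the timestamp is parsed once in new_collection, mtime() stays a plain attribute read and the subclass needs no override.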

#10 Updated by Brett Smith over 4 years ago

Peter Amstutz wrote:

Now at ef4e4a3

This is good to merge. Thanks.

#11 Updated by Peter Amstutz over 4 years ago

  • Status changed from In Progress to Resolved

Applied in changeset arvados|commit:8b90f80efca772efd2697ffc70d7809c32564171.
