Bug #5095
closed
[FUSE] arv-mount takes up too much memory and occasionally crashes when listing large 'home' directory
Added by Abram Connelly almost 10 years ago.
Updated almost 10 years ago.
Description
On lightning-dev2 (qr1hi), in the home directory of my keep mount (/home/abram/keep/home), doing an 'ls' causes arv-mount to balloon to 95% memory usage and sometimes crash. The 'ls' command itself sometimes takes 5 to 15 minutes to complete.
~$ cd keep/home
~/keep/home$ ls
(This can take 5 minutes or more.)
If it completes without crashing arv-mount (a crash means I need to remount the keep mount point), then subsequent directory listings run without much trouble. I think arv-mount also settles down to a more reasonable memory usage (right now it's at 50% or so).
File access in the keep mount, by going to ~/keep/by_id/<PDH> for example, while the 'ls' is running works fine.
I believe I have around 1.7k elements in my 'home' project.
- Subject changed from arv-mount takes up too much memory and occasionally crashes when listing the 'home' directory in my keep mount to [FUSE] arv-mount takes up too much memory and occasionally crashes when listing large 'home' directory
- Category set to Keep
Possible problems/solutions:
- arv-mount is retrieving the manifest_text for all those collections, even though it doesn't need to do that in order to show a list of files.
- arv-mount isn't retrieving the manifest_text, but apiserver is spending a lot of time preprocessing each collection/page anyway.
- arv-mount is fetching smaller pages than necessary. (Perhaps limit=1000 would be better if you know you're just going to keep asking for more pages anyway?)
Worth up to 1.0 points for investigating and reporting how much these (and other) factors contribute, even if the corresponding fixes aren't trivial.
Worth 2.0 points if the second point is a significant improvement.
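To illustrate the first and third possibilities above, here is a minimal sketch of paging through a listing with a larger page size while requesting only the fields needed to show names. It does not use the Arvados SDK: fake_list, FAKE_COLLECTIONS, and list_all are hypothetical stand-ins, and the offset/limit/select parameters and the items/items_available response keys are assumptions about how such a paged API might look.

```python
# Hypothetical in-memory stand-in for a paged "list collections" call.
# This is NOT the Arvados SDK; it only mimics offset/limit/select paging.
FAKE_COLLECTIONS = [
    {"uuid": "fake-%04d" % i, "name": "collection %d" % i,
     "manifest_text": "x" * 1000}
    for i in range(1700)
]


def fake_list(offset, limit, select):
    """Return one page of records, keeping only the requested fields."""
    page = FAKE_COLLECTIONS[offset:offset + limit]
    items = [{key: item[key] for key in select} for item in page]
    return {"items": items, "items_available": len(FAKE_COLLECTIONS)}


def list_all(list_fn, limit=1000, select=("uuid", "name")):
    """Collect every item page by page; also count the requests made."""
    items = []
    requests = 0
    while True:
        page = list_fn(offset=len(items), limit=limit, select=select)
        requests += 1
        items.extend(page["items"])
        # Stop on an empty page or once everything available is fetched.
        if not page["items"] or len(items) >= page["items_available"]:
            break
    return items, requests
```

With 1700 fake records, list_all(fake_list, limit=100) makes 17 requests while limit=1000 makes 2, and none of the returned records carry manifest_text. Whether either factor matters in practice is exactly what the investigation would need to measure.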
- Category changed from Keep to FUSE
- Story points set to 2.0
- Target version changed from Bug Triage to 2015-02-18 sprint
- Assigned To set to Peter Amstutz
Reviewing 4106786.
- Directory uses the current time as the default mtime. CollectionDirectory overrides this with 0 for API collections that don't provide a modified_at value, but leaves Directory's default in place for collections retrieved from Keep. Isn't this kind of inconsistent?
- Sort of following along with the theme of this branch, wouldn't it save both time and code to update self._mtime in CollectionDirectory.new_collection? Then CollectionDirectory wouldn't need to override the mtime method at all, it could just use Directory's.
Thanks.
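The second suggestion could be sketched roughly as follows. This is not the code from the branch under review: the class bodies are simplified guesses, the record argument and its modified_at key are assumed shapes, and the timestamp format is an assumption. The point is only that setting self._mtime in new_collection lets the subclass inherit mtime unchanged.

```python
import calendar
import time


class Directory(object):
    """Simplified stand-in for the base class; defaults mtime to 0."""

    def __init__(self):
        self._mtime = 0

    def mtime(self):
        return self._mtime


class CollectionDirectory(Directory):
    """Updates _mtime when a collection record arrives; no mtime override."""

    def new_collection(self, record):
        modified_at = record.get("modified_at")
        if modified_at:
            # Assumed timestamp format, e.g. "2015-02-10T15:30:00Z" (UTC).
            parsed = time.strptime(modified_at, "%Y-%m-%dT%H:%M:%SZ")
            self._mtime = calendar.timegm(parsed)
        else:
            # No modified_at available: fall back to 0.
            self._mtime = 0
```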
- Status changed from New to In Progress
Brett Smith wrote:
Reviewing 4106786.
- Directory uses the current time as the default mtime. CollectionDirectory overrides this with 0 for API collections that don't provide a modified_at value, but leaves Directory's default in place for collections retrieved from Keep. Isn't this kind of inconsistent?
I had not thought of that. It now defaults to 0 in both cases.
- Sort of following along with the theme of this branch, wouldn't it save both time and code to update self._mtime in CollectionDirectory.new_collection? Then CollectionDirectory wouldn't need to override the mtime method at all, it could just use Directory's.
Yes. Fixed.
Now at ef4e4a3
Peter Amstutz wrote:
Now at ef4e4a3
This is good to merge. Thanks.
- Status changed from In Progress to Resolved
Applied in changeset arvados|commit:8b90f80efca772efd2697ffc70d7809c32564171.