Bug #10629


[FUSE] slow enumerating files by collection uuid

Added by Tom Morris almost 7 years ago. Updated over 6 years ago.


It takes 38 minutes (!!) to enumerate 242K files. During this time the arv-mount process is pegged at 100% CPU and there is no network traffic. The manifest text totals just 9 MB (9243914 chars), so it's taking over 4 min / MB to parse this.

$ time find keep/by_id/e51c5-4zz18-l3dq8bw20uwz0qd -print | wc -l
real 38m0.969s
user 0m0.224s
sys 0m0.300s

$ wc *.manifest
2497 252893 9243914 e51c5-4zz18-l3dq8bw20uwz0qd.manifest

Subtasks 1 (0 open, 1 closed)

Task #11114: Review 10629-fuse-listing-perf (Resolved, Peter Amstutz, 11/24/2016)


Related issues

Related to Arvados - Bug #6019: [FUSE] `ls` in a large home directory (4400+ items) takes too long (Closed, 05/13/2015)

Related to Arvados - Bug #9732: [FUSE] performance issues stat() in python (New, 08/04/2016)

Actions #1

Updated by Tom Clegg almost 7 years ago

Suggested places to start
  • debug trace - which / how many fuse operations per file
  • make a test manifest big enough to exhibit slowness (10K files?), and try
    • squishing dir hierarchy (is slowness related to dir depth?)
    • all files in one dir (is slowness related to # files per dir?)
  • double the # files and see how that affects timing (is it O(N) or O(N^2)?)
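One way to set up the experiments suggested above is to generate synthetic manifests of a chosen size and shape, then compare listing times as the file count doubles. The sketch below is a minimal illustration, not part of the Arvados SDK: `make_manifest`, `count_files`, and `growth_exponent` are hypothetical helper names, and the manifest layout (stream name, one block locator, then `position:size:filename` tokens) is assumed from the usual Arvados manifest format. Loading the manifest into a collection and mounting it is not shown.

```python
import hashlib
import math

# Locator of a zero-length block (md5 of the empty string, size 0).
EMPTY_BLOCK = hashlib.md5(b"").hexdigest() + "+0"


def make_manifest(num_files, files_per_dir=None, depth=0):
    """Build manifest text with num_files empty files (hypothetical helper).

    files_per_dir=None puts every file in a single stream; depth adds
    extra nested "sub" directories to each stream name.
    """
    if num_files <= 0:
        return ""
    files_per_dir = files_per_dir or num_files
    streams = []
    for dir_index, start in enumerate(range(0, num_files, files_per_dir)):
        count = min(files_per_dir, num_files - start)
        parts = ["."]
        if files_per_dir < num_files:
            parts.append("dir%06d" % dir_index)
        parts.extend(["sub"] * depth)
        files = " ".join("0:0:file%07d" % (start + i) for i in range(count))
        streams.append("%s %s %s\n" % ("/".join(parts), EMPTY_BLOCK, files))
    return "".join(streams)


def count_files(manifest_text):
    """Count file tokens: each stream line is name + one locator + files."""
    return sum(len(line.split()) - 2 for line in manifest_text.splitlines())


def growth_exponent(n1, t1, n2, t2):
    """Estimate k in t ~ N**k from two (file count, seconds) measurements."""
    return math.log(t2 / t1) / math.log(n2 / n1)
```

With timings from `time find ... | wc -l` at N and 2N files, an exponent near 1 suggests O(N) behavior and an exponent near 2 suggests O(N^2). Varying `files_per_dir` and `depth` at a fixed N separates the effect of directory fan-out from the effect of nesting.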
Actions #2

Updated by Tom Clegg almost 7 years ago

  • Story points set to 1.0
Actions #3

Updated by Tom Morris over 6 years ago

  • Assigned To set to Peter Amstutz
  • Target version changed from Arvados Future Sprints to 2017-03-01 sprint
Actions #4

Updated by Peter Amstutz over 6 years ago

peteramstutz@shell:~$ time find keep/by_id/83325435ac6cf1a851f4e1aadf4df0e3+8675570 -print | wc -l

real    0m3.723s
user    0m0.124s
sys    0m0.180s

So the problem is that there is different behavior for collections accessed by UUID vs. by PDH. It seems to be doing some expensive synchronization operation which is elided for PDH (which is immutable).

Actions #5

Updated by Peter Amstutz over 6 years ago

  • Subject changed from arv-mount pathologically slow enumerating files to [FUSE] slow enumerating files by collection uuid
Actions #6

Updated by Peter Amstutz over 6 years ago

  • Status changed from New to In Progress

class Handle(object):
    """Connects a numeric file handle to a File or Directory object that has
    been opened by the client.""" 

    def flush(self):
        if self.obj.writable():
            return self.obj.flush()

Several problems here.

  1. Opendir and releasedir are only ever used to get the directory listing via readdir(). Because a directory handle isn't used to modify the directory, calling flush() is spurious.
  2. If the Operations() object was created with enable_write=False, calling flush() is spurious.
  3. The CollectionDirectory object is considered "writable" despite enable_write=False
  4. Finally, computing committed() (to decide whether to actually send an updated manifest to the server) checks the _committed flag on every object. When there are 240000 files, that is expensive, especially because it makes a function call and increments/decrements a recursive mutex at each node.


Proposed fixes:

  1. Fix set_committed() to accept True or False and propagate the flag up or down accordingly. Change committed() to only test the local flag.
  2. Don't flush directory handles at all.
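
The flag-propagation idea can be sketched as follows. This is an illustrative model only, not the code from the changeset: `Node` is a hypothetical stand-in for the collection, directory, and file objects, and locking is left out. The invariant is that an uncommitted node always has uncommitted ancestors, so checking the root's local flag is enough.

```python
class Node(object):
    """Hypothetical tree node standing in for a collection/file object."""

    def __init__(self, parent=None):
        self.parent = parent
        self.children = []
        self._committed = False
        if parent is not None:
            parent.children.append(self)
            # A new, uncommitted child makes every ancestor uncommitted.
            parent.set_committed(False)

    def committed(self):
        # O(1): only the local flag is tested, no walk over the tree.
        return self._committed

    def set_committed(self, value=True):
        if self._committed == value:
            # By the invariant there is nothing left to propagate.
            return
        self._committed = value
        if value:
            # Committing a node commits everything beneath it.
            for child in self.children:
                child.set_committed(True)
        elif self.parent is not None:
            # Modifying a node marks every ancestor as uncommitted.
            self.parent.set_committed(False)
```

The cost moves from every committed() query (a walk over all 240000 nodes) to the less frequent set_committed() calls, which touch only one root-to-leaf path when marking a change, or the whole subtree once after a successful save.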
Actions #7

Updated by Peter Amstutz over 6 years ago

10629-fuse-listing-perf ready for review.

Actions #8

Updated by Lucas Di Pentima over 6 years ago

Local sdk/python and services/fuse tests ran without issues.
Tried to do some benchmarking using arvbox but wasn't able to start it. I don't want to stall this review any longer; if you have timings after the fix, it would be nice to have the comparison here.

Actions #9

Updated by Peter Amstutz over 6 years ago

  • Status changed from In Progress to Resolved

Applied in changeset arvados|commit:9b2aa42213a7d333bbe93e040c2d152a70e9b5af.

