Bug #10629

closed

[FUSE] slow enumerating files by collection uuid

Added by Tom Morris about 8 years ago. Updated almost 8 years ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: Performance
Target version:
Story points: 1.0

Description

It takes 38 minutes (!!) to enumerate 242K files. During this time the arv-mount process is pegged at 100% CPU and there is no network traffic. The manifest text totals just 9MB (9243914 characters), so it's taking over 4 min/MB to parse this.

$ time find keep/by_id/e51c5-4zz18-l3dq8bw20uwz0qd -print | wc -l
241751
real 38m0.969s
user 0m0.224s
sys 0m0.300s

$ wc *.manifest
2497 252893 9243914 e51c5-4zz18-l3dq8bw20uwz0qd.manifest


Subtasks 1 (0 open, 1 closed)

Task #11114: Review 10629-fuse-listing-perf (Resolved, Peter Amstutz, 11/24/2016)

Related issues 2 (1 open, 1 closed)

Related to Arvados - Bug #6019: [FUSE] `ls` in a large home directory (4400+ items) takes too long (Closed, 05/13/2015)
Related to Arvados - Bug #9732: [FUSE] performance issues stat() in python (New)
Actions #1

Updated by Tom Clegg about 8 years ago

Suggested places to start
  • debug trace - which / how many fuse operations per file
  • make a test manifest big enough to exhibit slowness (10K files?), and try
    • squishing dir hierarchy (is slowness related to dir depth?)
    • all files in one dir (is slowness related to # files per dir?)
  • double the # files and see how that affects timing (is it O(N) or O(N^2)?)
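
One way to build the test manifests suggested above is to generate them with a short script. This is only a sketch: it assumes every file can be empty (so each stream references only the zero-length block, whose locator is the MD5 of the empty string plus `+0`), and the function name, directory names, and file names are made up for illustration.

```python
import hashlib

# Locator of the zero-length block: MD5 of the empty string, size 0.
EMPTY_BLOCK = hashlib.md5(b"").hexdigest() + "+0"


def make_manifest(num_files, files_per_dir):
    """Return manifest text describing num_files empty files.

    Files are spread over streams (directories) holding at most
    files_per_dir files each. If everything fits in one stream,
    the files go in the top-level stream ".".
    """
    lines = []
    for start in range(0, num_files, files_per_dir):
        if files_per_dir >= num_files:
            stream = "."
        else:
            stream = "./dir%06d" % (start // files_per_dir)
        count = min(files_per_dir, num_files - start)
        # Each file token is position:size:name; empty files are 0:0:name.
        tokens = ["0:0:file%06d.txt" % (start + i) for i in range(count)]
        lines.append(" ".join([stream, EMPTY_BLOCK] + tokens))
    return "\n".join(lines) + "\n"
```

Generating one manifest with N files and another with 2N files, saving each as a collection, and timing the same `find` command against both should show whether the listing cost grows linearly or quadratically. Varying `files_per_dir` covers the "all files in one dir" case.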
Actions #2

Updated by Tom Clegg about 8 years ago

  • Story points set to 1.0
Actions #3

Updated by Tom Morris almost 8 years ago

  • Assigned To set to Peter Amstutz
  • Target version changed from Arvados Future Sprints to 2017-03-01 sprint
Actions #4

Updated by Peter Amstutz almost 8 years ago

peteramstutz@shell:~$ time find keep/by_id/83325435ac6cf1a851f4e1aadf4df0e3+8675570 -print | wc -l
241751

real    0m3.723s
user    0m0.124s
sys    0m0.180s

So the problem is that there is different behavior for collections accessed by UUID vs. by PDH. It seems to be doing some expensive synchronization operation which is elided for PDH (which is immutable).

Actions #5

Updated by Peter Amstutz almost 8 years ago

  • Subject changed from arv-mount pathologically slow enumerating files to [FUSE] slow enumerating files by collection uuid
Actions #6

Updated by Peter Amstutz almost 8 years ago

  • Status changed from New to In Progress

class Handle(object):
    """Connects a numeric file handle to a File or Directory object that has
    been opened by the client.""" 

    def flush(self):
        if self.obj.writable():
            return self.obj.flush()

Several problems here.

  1. Opendir and releasedir are only ever used to get the directory listing via readdir(). Because a directory handle isn't used to modify the directory, calling flush() is spurious.
  2. If the Operations() object was created with enable_write=False, calling flush() is spurious.
  3. The CollectionDirectory object is considered "writable" despite enable_write=False.
  4. Finally, computing committed() (to decide whether to actually send an updated manifest to the server) checks the _committed flag on every object. When there are 240000 files, that is expensive (especially because it makes a function call and increments/decrements a recursive mutex at each node).

Todo:

  1. Fix set_committed() to accept True or False and propagate the flag up or down accordingly. Change committed() to only test the local flag.
  2. Don't flush directory handles at all.
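
The first todo item could look roughly like the sketch below. It is not the actual SDK code: the `Node` class is a hypothetical stand-in for the collection tree, locking is omitted, and the direction of propagation (False travels up to the ancestors, True travels down to the descendants) is an assumption about what "up or down accordingly" means.

```python
class Node(object):
    """Hypothetical stand-in for a file or directory in a collection tree."""

    def __init__(self, parent=None):
        self.parent = parent
        self.children = []
        self._committed = True
        if parent is not None:
            parent.children.append(self)

    def committed(self):
        # Only the local flag is tested, so this no longer walks the tree.
        return self._committed

    def set_committed(self, value=True):
        if self._committed == value:
            # Nothing to do: the invariant below already holds.
            return
        self._committed = value
        if value:
            # Committing a node commits everything beneath it.
            for child in self.children:
                child.set_committed(True)
        elif self.parent is not None:
            # A modified node makes every ancestor uncommitted.
            self.parent.set_committed(False)
```

The invariant is that a node is committed only if all of its descendants are, so the root's flag alone says whether an updated manifest needs to be sent. The cost moves from every committed() call to the comparatively rare set_committed() calls.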
Actions #7

Updated by Peter Amstutz almost 8 years ago

10629-fuse-listing-perf ready for review.

Actions #8

Updated by Lucas Di Pentima almost 8 years ago

LGTM.
Local sdk/python and services/fuse tests ran without issues.
Tried to do some benchmarking using arvbox but wasn't able to start it. I don't want to stall this review any longer; if you have timings after the fix, it would be nice to have the comparison here.

Actions #9

Updated by Peter Amstutz almost 8 years ago

  • Status changed from In Progress to Resolved

Applied in changeset arvados|commit:9b2aa42213a7d333bbe93e040c2d152a70e9b5af.
