Bug #8189

[FUSE] Listing a project directory is slow when there are many subprojects

Added by Jiayong Li almost 4 years ago. Updated 6 months ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
FUSE
Target version:
Start date:
01/11/2016
Due date:
% Done:

100%

Estimated time:
(Total: 0.00 h)
Story points:
-
Release relationship:
Auto

Description

I created 2947 sub-projects under the project '1000_genome_exome_raw_reads' with uuid su92l-j7d0g-3c6tenm6q4xn7qm on su92l. As a result, directory operations under that project are slow. For example, 'ls' takes nearly two minutes.


Subtasks

Task #8210: review 8189-handle-large-collections-better (Resolved, Tom Clegg)


Related issues

Related to Arvados - Bug #8183: [Workbench] should not look up every group/project a user has access to on every page load (Resolved, 02/12/2016)

Associated revisions

Revision 37a1505b (diff)
Added by Ward Vandewege almost 4 years ago

Make the Python SDK and workbench effectively default to the API
server's MAX_LIMIT when requesting a list of objects, in those cases
where no explicit limit is set in the client code.

closes #8189

Revision 3fb0f464
Added by Tom Clegg almost 4 years ago

Merge branch '8189-handle-large-collections-better' refs #8189

History

#1 Updated by Brett Smith almost 4 years ago

  • Subject changed from [Keep] Directory operations are slow after the creation of a large number of projects. to [FUSE] Listing a project directory is slow when there are many subprojects

#2 Updated by Tom Clegg almost 4 years ago

arvados_fuse's ProjectDirectory class uses arvados.util.list_all:

                contents = arvados.util.list_all(self.api.groups().contents,
                                                 self.num_retries, uuid=self.project_uuid)

arvados.util.list_all doesn't set a limit either, so we get the API's default limit of 100 items per page.

Suggest modifying arvados.util.list_all (in source:sdk/python/arvados/util.py#L365) to do something like

kwargs.setdefault('limit', sys.maxint)

That way, the API server's MAX_LIMIT (currently 1000) will determine the page size.

The rationale is that, once the client is in an API request loop that it won't exit until it gets all of the items, it's never a good idea for it to get fewer items per API request. Getting fewer items per page only makes sense if the client has some chance of doing something else (exiting the loop or processing a subset of results) before receiving MAX_LIMIT results.

(ArvadosResourceList#each_page in source:apps/workbench/app/models/arvados_resource_list.rb#177 needs this fix, too.)
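The pagination loop described above can be sketched as follows. This is an illustrative reconstruction, not the actual SDK code: the function and parameter names mirror arvados.util.list_all only loosely, and the retry plumbing is omitted. The point is the arithmetic: with 2947 items, the 100-item default page size costs 30 round trips, while defaulting the limit to a huge value lets the server clamp it to MAX_LIMIT (1000), cutting that to 3. (sys.maxint is Python 2; sys.maxsize is the Python 3 equivalent.)

```python
import sys

def list_all(fn, **kwargs):
    """Fetch every item from a paginated list endpoint."""
    # Default to a huge limit so the server clamps it to its own
    # MAX_LIMIT, instead of serving its small default page size.
    kwargs.setdefault('limit', sys.maxsize)
    items = []
    offset = 0
    while True:
        page = fn(offset=offset, **kwargs)
        items.extend(page['items'])
        offset += len(page['items'])
        if offset >= page['items_available']:
            break
    return items
```

Since the loop cannot exit until it has seen all items_available results anyway, there is no downside to requesting the largest page the server will serve.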

#3 Updated by Ward Vandewege almost 4 years ago

  • Assigned To set to Ward Vandewege
  • Target version set to 2016-01-20 Sprint

#4 Updated by Ward Vandewege almost 4 years ago

  • Status changed from New to Resolved
  • % Done changed from 0 to 100

Applied in changeset arvados|commit:37a1505b607bbf533512f48b47f208c5cde4c435.

#5 Updated by Tom Clegg almost 4 years ago

  • Status changed from Resolved to In Progress

#6 Updated by Jiayong Li almost 4 years ago

Right now su92l workbench is still considerably slower than qr1hi workbench.

On su92l, I also noticed a huge performance difference between a read-only mount and a writable mount (both freshly mounted to reflect recent changes).

read-only mount:
$ time ls keep/home/arvados_genomics_benchmark/1000_genome_exome_raw_reads
real 1m0.422s
user 0m0.020s
sys 0m0.060s

writable mount:
$ time ls mnt/home/arvados_genomics_benchmark/1000_genome_exome_raw_reads
real 94m40.150s
user 0m0.028s
sys 0m0.080s

#7 Updated by Brett Smith almost 4 years ago

  • Target version deleted (2016-01-20 Sprint)

#8 Updated by Brett Smith almost 4 years ago

  • Target version set to Arvados Future Sprints

#9 Updated by Jiayong Li almost 4 years ago

I tried running a pipeline on su92l, but the "Run a pipeline" button on the workbench homepage is not clickable now.

#10 Updated by Ward Vandewege 7 months ago

  • Status changed from In Progress to Resolved
  • Target version changed from Arvados Future Sprints to 2019-05-22 Sprint

This was resolved long ago; here's the performance today:


wardv@shell:~$ time ls keep/by_id/su92l-j7d0g-3c6tenm6q4xn7qm

...

real    0m8.187s
user    0m0.021s
sys    0m0.086s

#11 Updated by Tom Morris 6 months ago

  • Release set to 15
