Bug #21422

open

Pathological arv-mount performance listing the contents of one project under a large filter group

Added by Brett Smith 3 months ago. Updated 3 months ago.

Status: New
Priority: Normal
Assigned To: -
Category: FUSE
Target version:
Story points: -

Description

I created a filter group on pirca with this code:

import arvados
arv_client = arvados.api('v1')
owner_uuid = arv_client.users().current().execute()['uuid']
filters = [
    ['uuid', 'is_a', ['arvados#group', 'arvados#collection']],
    ['collections.properties.type', 'not in', ['intermediate', 'log']],
    ['groups.owner_uuid', '=', owner_uuid],
]
filter_group = arv_client.groups().create(body={
    'name': 'Filter Group',
    'owner_uuid': owner_uuid,
    'group_class': 'filter',
    'properties': {
        'filters': filters,
    },
}).execute()

Note that I have an admin account, so this filter group includes all collections on pirca, minus those with the excluded types.

Then I mounted the filter group on my laptop (not in the cloud) with arv-mount --project FILTER_GROUP_UUID "$XDG_RUNTIME_DIR/keep".

Then I ran ls "$XDG_RUNTIME_DIR/keep". This took a few minutes, which is not great but at least understandable given the number of items to list.

Then I ran ls "$XDG_RUNTIME_DIR/keep/ProjectName", where ProjectName is the name of a project in my home project, so it should pass all the filters. This has been running for two hours and has yet to return any results. This makes less sense. The project has a small number of items, even unfiltered, so listing its contents shouldn't take more than one API call. And shouldn't arv-mount already have the UUID of the project to query, since it would've gotten that during the previous ls? Even if it doesn't cache that, it seems like getting the UUID should also be a single API call: just list the filter group's contents filtering on name=basename. I don't understand why this should take hours.

The problem could be in the API calls arv-mount is making, the performance of those calls on the controller/RailsAPI end, or both.
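One way to separate those possibilities would be to time the two lookups directly through the Python SDK, bypassing arv-mount entirely. The following is only a sketch under assumptions: time_project_listing is a hypothetical helper, not part of arv-mount or the SDK; it assumes groups().contents() accepts the name and is_a filters shown; and it makes no claim that these are the calls arv-mount actually issues.

```python
import time


def time_project_listing(arv_client, filter_group_uuid, project_name):
    """Time the two API calls that `ls keep/ProjectName` would ideally need.

    Hypothetical diagnostic helper. `arv_client` is an Arvados API client,
    for example the result of arvados.api('v1').
    """
    # Call 1: resolve the project's UUID by name inside the filter group.
    # Assumption: the contents endpoint accepts these filters.
    start = time.monotonic()
    found = arv_client.groups().contents(
        uuid=filter_group_uuid,
        filters=[
            ['name', '=', project_name],
            ['uuid', 'is_a', ['arvados#group']],
        ],
        limit=1,
    ).execute()
    lookup_seconds = time.monotonic() - start

    if not found['items']:
        raise LookupError(
            'no group named %r under %s' % (project_name, filter_group_uuid))
    project_uuid = found['items'][0]['uuid']

    # Call 2: list the project's own contents (first page only).
    start = time.monotonic()
    listing = arv_client.groups().contents(uuid=project_uuid).execute()
    list_seconds = time.monotonic() - start

    return {
        'project_uuid': project_uuid,
        'lookup_seconds': lookup_seconds,
        'list_seconds': list_seconds,
        'item_count': len(listing['items']),
    }
```

Calling time_project_listing(arv_client, FILTER_GROUP_UUID, 'ProjectName') with the client from the code above would give two timings. If both come back in seconds, that would point at arv-mount rather than the controller/RailsAPI; if either one is slow, the API side is at least part of the problem.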

#1

Updated by Peter Amstutz 3 months ago

If nothing is returned after an hour, that makes me suspect that arv-mount has hit a deadlock rather than the API server responding slowly.
