Bug #12990

open

[FUSE] Access shared/ is inefficient

Added by Peter Amstutz over 4 years ago. Updated 12 months ago.

Status: In Progress
Priority: Normal
Assigned To:
Category: -
Target version: -
Start date:
Due date:
% Done: 0%
Estimated time:
Story points: -

Description

The shared/ directory of FUSE has several issues:

  1. no update lock, may start overlapping updates in separate threads
  2. no incremental lookup of individual names, always loads full list, bad for scaling
  3. fetches the full record, which may include a description or properties payload that is not used but wastes bandwidth

Related issues

Related to Arvados - Story #13146: [API] Endpoint to get projects shared with me (Resolved, Peter Amstutz, 08/15/2018)

Actions #1

Updated by Peter Amstutz over 4 years ago

  • Status changed from New to In Progress
Actions #2

Updated by Peter Amstutz over 4 years ago

  • Description updated (diff)
Actions #3

Updated by Peter Amstutz over 4 years ago

12990-fuse-shared @ 0dcf9daff8fce376f20f125c3ef867333976c18c

Addresses points (1) and (3) but not incremental lookup (this turns out to be hard due to the way the contents of shared/ are determined).

Actions #4

Updated by Tom Clegg over 4 years ago

LGTM

This looks like it should fix the "flood the apiserver with many threads of groups#list requests" issue we're seeing.

I'm not certain, but I see a couple of other issues that (if they're real) are probably worth fixing:
  • ProjectDirectory and SharedDirectory don't seem to call fresh() after updating, like CollectionDirectory does. Does this mean once they go stale, they stay stale forever, and every lookup triggers a refresh?
  • If N threads decide that self is stale, they all line up for updating_lock, and do their updates serially. But the first one should (according to the previous point, at least) set fresh(), which means the next N-1 threads will dutifully do their laborious updates even though self is already fresher than they could possibly have wanted it to be back when they decided to update. Perhaps it would be better to do one of
    • Check stale() after acquiring _updating_lock, so the last N-1 threads just wait for the update that's already in progress to finish, and don't bother doing their own.
    • Use acquire(False) to take the lock without blocking. This is a bit different in that it knowingly returns stale results, but in the case of SharedDirectory maybe this kind of race is OK, since we generally only detect staleness using a race-prone timer anyway?
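
The two locking options above can be sketched roughly as follows. This is a minimal illustration, not the arv-mount code: the class CachedListing, its loader callback, and the max_age timer are hypothetical names; only stale(), fresh() and _updating_lock echo names mentioned in this thread.

```python
import threading
import time


class CachedListing:
    """Hypothetical stand-in for a directory object; not the arv-mount class."""

    def __init__(self, loader, max_age=60.0):
        self._loader = loader            # callable that returns the full listing
        self._max_age = max_age          # seconds before the listing counts as stale
        self._updating_lock = threading.Lock()
        self._loaded_at = None
        self.entries = {}

    def stale(self):
        if self._loaded_at is None:
            return True
        return time.monotonic() - self._loaded_at > self._max_age

    def fresh(self):
        self._loaded_at = time.monotonic()

    def update_blocking(self):
        """Option 1: wait for the lock, then re-check staleness before reloading."""
        if not self.stale():
            return False
        with self._updating_lock:
            if not self.stale():
                # Another thread refreshed the listing while we were waiting.
                return False
            self.entries = self._loader()
            self.fresh()
            return True

    def update_nonblocking(self):
        """Option 2: if an update is already running, return and accept stale data."""
        if not self.stale():
            return False
        if not self._updating_lock.acquire(blocking=False):
            return False
        try:
            if not self.stale():
                return False
            self.entries = self._loader()
            self.fresh()
            return True
        finally:
            self._updating_lock.release()
```

With option 1, N threads that all see a stale listing trigger a single call to the loader; the remaining threads wait for it and then return. With option 2, the remaining threads return immediately and may read the old entries.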
Actions #5

Updated by Tom Clegg over 4 years ago

Tom Clegg wrote:

I'm not certain, but I see a couple of other issues that (if they're real) are probably worth fixing:

(from irc) Not real. merge() sets fresh flag. "Check stale() after acquiring" already happens.

Actions #7

Updated by Tom Morris over 4 years ago

  • Assigned To set to Peter Amstutz
Actions #8

Updated by Peter Amstutz over 4 years ago

Doing this more efficiently likely requires a new API endpoint. To determine what to list in "shared", arv-mount currently has to look at all projects and find the ones where owner_uuid is not another project which is visible to us (meaning: users, non-project groups, or shared subprojects where the parent is not visible). This is expensive to compute on the client, but can probably be accomplished with a single query on the API server.
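
As a rough sketch of the client-side rule described above, assuming each project is a plain dict with "uuid" and "owner_uuid" keys (the function name and the sample UUIDs are made up for illustration, and this is not the arv-mount implementation):

```python
def find_shared_roots(projects):
    """Return the projects whose owner is not itself a visible project.

    `projects` must be the complete list of projects visible to the caller.
    """
    visible = {p["uuid"] for p in projects}
    return [p for p in projects if p["owner_uuid"] not in visible]


# Hypothetical data: owners are a user, a visible project, a hidden project, a group.
projects = [
    {"uuid": "proj-a", "owner_uuid": "user-1"},
    {"uuid": "proj-b", "owner_uuid": "proj-a"},
    {"uuid": "proj-c", "owner_uuid": "proj-hidden"},
    {"uuid": "proj-d", "owner_uuid": "group-1"},
]
roots = [p["uuid"] for p in find_shared_roots(projects)]
```

The filter itself is cheap; the cost is that it needs the full list of visible projects before it can decide anything, which is why looking up a single name incrementally is hard on the client.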

Actions #9

Updated by Peter Amstutz over 4 years ago

  • Target version changed from 2018-01-31 Sprint to Arvados Future Sprints
Actions #10

Updated by Peter Amstutz over 4 years ago

  • Related to Story #13146: [API] Endpoint to get projects shared with me added
Actions #11

Updated by Peter Amstutz over 4 years ago

Discussion about API endpoint moved to #13146

Actions #12

Updated by Peter Amstutz 12 months ago

  • Target version deleted (Arvados Future Sprints)