Bug #12990
[FUSE] Access shared/ is inefficient (closed)
Added by Peter Amstutz almost 7 years ago. Updated 9 months ago.
Description
The shared/ directory of FUSE has several issues:
- no update lock, may start overlapping updates in separate threads
- no incremental lookup of individual names, always loads full list, bad for scaling
- fetches the full record, which may include a description or properties payload that is not used but wastes bandwidth
Updated by Peter Amstutz almost 7 years ago
- Status changed from New to In Progress
Updated by Peter Amstutz almost 7 years ago
12990-fuse-shared @ 0dcf9daff8fce376f20f125c3ef867333976c18c
Addresses points (1) and (3) but not incremental lookup (this turns out to be hard due to the way the contents of shared/ are determined).
Updated by Tom Clegg almost 7 years ago
LGTM
This looks like it should fix the "flood the apiserver with many threads of groups#list requests" issue we're seeing.
I'm not certain, but I see a couple of other issues that (if they're real) are probably worth fixing:
- ProjectDirectory and SharedDirectory don't seem to call fresh() after updating, like CollectionDirectory does. Does this mean once they go stale, they stay stale forever, and every lookup triggers a refresh?
- If N threads decide that self is stale, they all line up for _updating_lock, and do their updates serially. But the first one should (according to the previous point, at least) set fresh(), which means the next N-1 threads will dutifully do their laborious updates even though self is already fresher than they could possibly have wanted it to be back when they decided to update. Perhaps it would be better to do one of:
  - Check stale() after acquiring _updating_lock, so the last N-1 threads just wait for the update that's already in progress to finish, and don't bother doing their own.
  - Use acquire(False) to do a non-blocking lock. This is a bit different in that it knowingly returns stale results, but in the case of SharedDirectory maybe this kind of race is OK, since we generally only detect staleness using a race-prone timer anyway?
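For illustration, a minimal sketch of the two options above. RefreshableDirectory, its loader callback and the poll_time parameter are hypothetical stand-ins, not the actual arv-mount classes; only the stale()/fresh()/_updating_lock names come from the discussion.

```python
import threading
import time


class RefreshableDirectory:
    """Hypothetical stand-in for a directory whose listing goes stale."""

    def __init__(self, loader, poll_time=60.0):
        self._loader = loader              # callable returning the new listing
        self._poll_time = poll_time        # seconds before the listing is stale
        self._updating_lock = threading.Lock()
        self._last_update = None           # None means "never loaded"
        self.entries = {}
        self.loads = 0                     # number of reloads actually performed

    def stale(self):
        return (self._last_update is None
                or time.monotonic() - self._last_update > self._poll_time)

    def fresh(self):
        self._last_update = time.monotonic()

    def _reload(self):
        self.entries = self._loader()
        self.loads += 1
        self.fresh()

    def update_blocking(self):
        """Option 1: wait for the lock, then check stale() again."""
        if not self.stale():
            return
        with self._updating_lock:
            if not self.stale():
                return                     # another thread refreshed while we waited
            self._reload()

    def update_nonblocking(self):
        """Option 2: skip the update if another thread is already updating.

        Returns True only if this call performed the reload.
        """
        if not self.stale():
            return False
        if not self._updating_lock.acquire(blocking=False):
            return False                   # caller proceeds with possibly stale entries
        try:
            if not self.stale():
                return False
            self._reload()
            return True
        finally:
            self._updating_lock.release()
```

With option 1, N threads that all see a stale directory queue on the lock but only the first one reloads. With option 2, the other threads return immediately and may read the old listing while the reload is still in progress.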
Updated by Tom Clegg almost 7 years ago
Tom Clegg wrote:
I'm not certain, but I see a couple of other issues that (if they're real) are probably worth fixing:
(from irc) Not real. merge() sets fresh flag. "Check stale() after acquiring" already happens.
Updated by Peter Amstutz almost 7 years ago
Passed tests here https://ci.curoverse.com/job/developer-run-tests-services-fuse/564/
Updated by Peter Amstutz almost 7 years ago
Doing this more efficiently likely requires a new API endpoint. The way arv-mount currently determines what to list in "shared" requires looking at all projects and finding the ones where owner_uuid is not another project which is visible to us (meaning: users, non-project groups, or shared subprojects where the parent is not visible). This is expensive to compute on the client, but can probably be accomplished with a single query on the API server.
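A rough sketch of the client-side rule described above, assuming the full list of visible projects has already been fetched as dicts with "uuid" and "owner_uuid" keys. The function name and the sample UUIDs are made up for illustration; this is not the arv-mount implementation.

```python
def shared_roots(projects):
    """Return the projects whose owner is not itself a visible project.

    `projects` must be the complete list of projects visible to us,
    which is why this cannot be computed for a single name on its own.
    """
    visible = {p["uuid"] for p in projects}
    return [p for p in projects if p["owner_uuid"] not in visible]


# Example: proj-b is a subproject of the visible proj-a, so it is not a root.
# proj-c's parent (proj-hidden) is not visible, so proj-c is listed.
example = [
    {"uuid": "proj-a", "owner_uuid": "user-1"},
    {"uuid": "proj-b", "owner_uuid": "proj-a"},
    {"uuid": "proj-c", "owner_uuid": "proj-hidden"},
]
roots = [p["uuid"] for p in shared_roots(example)]
```

Because the `visible` set needs every project, the client has to load the whole list before it can answer even one lookup, which is the scaling problem noted in the description.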
Updated by Peter Amstutz almost 7 years ago
- Target version changed from 2018-01-31 Sprint to Arvados Future Sprints
Updated by Peter Amstutz almost 7 years ago
- Related to Idea #13146: [API] Endpoint to get projects shared with me added
Updated by Peter Amstutz almost 7 years ago
Discussion about API endpoint moved to #13146
Updated by Peter Amstutz over 3 years ago
- Target version deleted (Arvados Future Sprints)
Updated by Peter Amstutz 9 months ago
- Release deleted (60)
- Target version deleted (Future)
- Status changed from In Progress to Resolved