[FUSE] Access shared/ is inefficient
The shared/ directory of FUSE has several issues:
- no update lock, may start overlapping updates in separate threads
- no incremental lookup of individual names, always loads full list, bad for scaling
- fetches the full record, which may include a description or properties payload that is not used but wastes bandwidth
#4 Updated by Tom Clegg over 3 years ago
This looks like it should fix the "flood the apiserver with many threads of groups#list requests" issue we're seeing. I'm not certain, but I see a couple of other issues that (if they're real) are probably worth fixing:
- ProjectDirectory and SharedDirectory don't seem to call fresh() after updating, like CollectionDirectory does. Does this mean once they go stale, they stay stale forever, and every lookup triggers a refresh?
- If N threads decide that self is stale, they all line up for _updating_lock, and do their updates serially. But the first one should (according to the previous point, at least) set fresh(), which means the next N-1 threads will dutifully do their laborious updates even though self is already fresher than they could possibly have wanted it to be back when they decided to update. Perhaps it would be better to do one of:
  - Check stale() after acquiring _updating_lock, so the last N-1 threads just wait for the update that's already in progress to finish, and don't bother doing their own.
  - Use acquire(False) to do a non-blocking lock. This is a bit different in that it knowingly returns stale results, but in the case of SharedDirectory maybe this kind of race is OK, since we generally only detect staleness using a race-prone timer anyway?
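For illustration, the two options above could look roughly like this. This is a minimal sketch, not the arv-mount code: CachedListing, its fetch callable, max_age and the two checkupdate_* method names are hypothetical stand-ins, and only stale(), fresh() and _updating_lock are taken from the discussion.

```python
import threading
import time


class CachedListing:
    """Hypothetical stand-in for a directory whose listing goes stale on a timer."""

    def __init__(self, fetch, max_age=60.0):
        self._fetch = fetch                      # callable doing the expensive listing
        self._max_age = max_age                  # seconds before the listing is stale
        self._updating_lock = threading.Lock()
        self._last_update = None
        self.entries = []

    def stale(self):
        return (self._last_update is None
                or time.monotonic() - self._last_update > self._max_age)

    def fresh(self):
        self._last_update = time.monotonic()

    def checkupdate_blocking(self):
        """Option 1: wait for the lock, then re-check staleness before updating."""
        if not self.stale():
            return
        with self._updating_lock:
            if not self.stale():
                return  # another thread refreshed while we were waiting
            self.entries = self._fetch()
            self.fresh()

    def checkupdate_nonblocking(self):
        """Option 2: if an update is already running, return without waiting."""
        if not self.stale():
            return
        if not self._updating_lock.acquire(False):
            return  # knowingly leave the caller with possibly stale entries
        try:
            self.entries = self._fetch()
            self.fresh()
        finally:
            self._updating_lock.release()
```

With option 1, N threads that all saw a stale listing trigger a single fetch and the rest wait for it. With option 2, nobody waits, at the cost of sometimes serving the old listing.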
#6 Updated by Peter Amstutz over 3 years ago
Passed tests here https://ci.curoverse.com/job/developer-run-tests-services-fuse/564/
#8 Updated by Peter Amstutz over 3 years ago
Doing this more efficiently likely requires a new API endpoint. The way arv-mount currently determines what to list in "shared" requires looking at all projects and finding the ones where owner_uuid is not another project which is visible to us (meaning: users, non-project groups, or shared subprojects where the parent is not visible). This is expensive to compute on the client, but can probably be accomplished with a single query on the API server.
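As a rough sketch of the client-side rule described above (not the actual arv-mount code): given every project record visible to us, keep the ones whose owner_uuid is not itself one of those visible projects. The function name shared_roots and the sample UUID strings are hypothetical; the records are assumed to be dicts with "uuid" and "owner_uuid" keys.

```python
def shared_roots(projects):
    """Return the projects whose owner is not another visible project.

    `projects` must be the complete list of visible project records,
    which is why this is expensive to compute on the client.
    """
    visible = {p["uuid"] for p in projects}
    return [p for p in projects if p["owner_uuid"] not in visible]


# Usage with made-up identifiers:
projects = [
    {"uuid": "proj-1", "owner_uuid": "user-other"},   # owned by a user
    {"uuid": "proj-2", "owner_uuid": "proj-1"},       # parent is visible
    {"uuid": "proj-3", "owner_uuid": "proj-hidden"},  # parent is not visible
    {"uuid": "proj-4", "owner_uuid": "group-role"},   # owned by a non-project group
]
roots = [p["uuid"] for p in shared_roots(projects)]
```

Here proj-2 is dropped because its parent proj-1 is visible, while the other three appear at the top level of "shared". The filter itself is cheap; the cost is in fetching the full project list first, which is what a server-side query would avoid.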