Bug #11158
closed[FUSE] Hang on simple FUSE operation and when logging in again later
Description
I changed directory on a shell node to a ~/keep/home/... project which each had a single sub project, which each had a single collection, which each had the file I was looking for one directory down and did:
ls */*/*/SampleSheet.csv
which should have in <1 second returned the ~20 files that I was looking for. Instead, it went away for hours, and when I lost the SSH connection and tried to log in again later, the login hung trying to do the arv-mount.
Updated by Peter Amstutz almost 8 years ago
- Target version changed from 2017-03-15 sprint to 2017-03-29 sprint
Updated by Peter Amstutz almost 8 years ago
- Target version changed from 2017-03-29 sprint to 2017-04-12 sprint
Updated by Peter Amstutz over 7 years ago
- Target version changed from 2017-04-12 sprint to 2017-04-26 sprint
Updated by Tom Morris over 7 years ago
- Target version changed from 2017-04-26 sprint to 2017-05-10 sprint
Updated by Peter Amstutz over 7 years ago
- Target version changed from 2017-05-10 sprint to 2017-05-24 sprint
Updated by Peter Amstutz over 7 years ago
- Target version changed from 2017-05-24 sprint to Arvados Future Sprints
Updated by Peter Amstutz over 7 years ago
- Status changed from New to In Progress
- Target version changed from Arvados Future Sprints to 2017-07-05 sprint
Updated by Peter Amstutz over 7 years ago
New behavior for project directories. Allow 'cd' into any collection or subproject that exists on the server, even if it isn't returned in the listing. Now applies listing limit, ordered by most recent. No longer lists other object types (pipelines, links etc.)
The problem seems to be that listing the entire contents of the project via the "group contents" API gets stuck. There may be a bug in the interaction of paging via "offset" with the "contents" API.
Updated by Tom Clegg over 7 years ago
looking at 11158-fuse-projects @ 62f1bc5...
It looks like list_all(..., limit=1000)
makes list_all()
ask for 1000 items per page, instead of the apiserver's maximum page size. So, either the apiserver's limit is 1000 or less and limit=1000 makes no difference, or the apiserver's limit is higher than 1000 and limit=1000 fetches the same number of records (all of them) but uses more API calls because it gets fewer results per page. Am I following this correctly? I expect this will make arv-mount slower -- what's the advantage?
Many tests are failing for me. Most of them are "ERROR" and look like they might be unrelated "fuse is having a bad day on my machine today" things, but this looks like the sort of thing that would be caused by the "don't list_all() until someone calls readdir" change:
====================================================================== FAIL: runTest (tests.test_mount.FuseSharedTest) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/tom/src/arvados/services/fuse/tests/test_mount.py", line 244, in runTest ], fuse_user_objs) AssertionError: Lists differ: ['FUSE Test Project', 'collect... != ['FUSE Test Project', 'collect... First list contains 1 additional elements. First extra element 3: pipeline instance owned by FUSE.pipelineInstance ['FUSE Test Project', 'collection #1 owned by FUSE', - 'collection #2 owned by FUSE', ? ^ + 'collection #2 owned by FUSE'] ? ^ - 'pipeline instance owned by FUSE.pipelineInstance']
The list_all() bugfix looks good, and seems convincing as a fix for the reported issue.
Updated by Peter Amstutz over 7 years ago
- Target version changed from 2017-07-05 sprint to 2017-07-19 sprint
Updated by Peter Amstutz over 7 years ago
Sorry, that branch was a little half baked. Tests should pass now.
Updated by Tom Clegg over 7 years ago
In this version of list_all()
, the "filters" logic assumes the response is ordered by uuid, but it requests whatever order was given by the caller. Won't it miss a lot of records if called with an order other than uuid? It might also behave badly if the returned items don't include a uuid (e.g., "select" param given).
This branch seems to be taking a turn toward "improve various things in arv-mount and its tests" -- or maybe I just don't see how a lot of the changes are related to the bugfix?
If I'm following correctly:- The bug ("hang on simple FUSE operation") was caused by the groups#contents API returning the wrong (too high) number for items_available, and list_all() being unable to terminate its loop as a result.
- We can make list_all() faster, and work around the groups#contents bug at the same time, by sending count=none and terminating the loop on "returned items < returned limit" instead.
- We haven't explained or reproduced the groups#contents bug.
- There's a bunch of other work to do in arv-mount.
Assuming this is correct, could we focus on merging a list_all() improvement/workaround, and put the other changes (like lazy loading) aside for a moment? In particular I think it would be good to track down the API bug.
Updated by Peter Amstutz over 7 years ago
I'll split the Python SDK and FUSE changes into separate branches.
Updated by Peter Amstutz over 7 years ago
items_available as reported:
$ arv group contents --uuid 34t0i-tpzed-1jdzge3ezjpou1o "items_available":16023,
Paging, using [uuid > last uuid on previous page]
and stopping when getting an incomplete page:
val = arvados.util.list_all(api.groups().contents, 0, uuid="34t0i-tpzed-1jdzge3ezjpou1o") >>> len(val) 6146
Paging, using [uuid > last uuid on previous page]
and stopping when getting an empty page:
val = arvados.util.list_all(api.groups().contents, 0, uuid="34t0i-tpzed-1jdzge3ezjpou1o") >>> len(val) 6624
Paging, using offset = result["offset"] + len(result["items"])
and count=none, and stopping after getting an empty page:
>>> val = arvados.util.list_all(api.groups().contents, 0, uuid="34t0i-tpzed-1jdzge3ezjpou1o") {'count': 'none', 'offset': 0, 'limit': 9223372036854775807, 'uuid': '34t0i-tpzed-1jdzge3ezjpou1o', 'order': 'uuid asc'} {u'kind': u'arvados#objectList', u'etag': u'', u'limit': 1000, u'offset': 0, u'self_link': u'', u'items_available': 0} {'count': 'none', 'offset': 1000, 'limit': 9223372036854775807, 'uuid': '34t0i-tpzed-1jdzge3ezjpou1o', 'order': 'uuid asc'} {u'kind': u'arvados#objectList', u'etag': u'', u'limit': 1000, u'offset': 1000, u'self_link': u'', u'items_available': 0} {'count': 'none', 'offset': 2000, 'limit': 9223372036854775807, 'uuid': '34t0i-tpzed-1jdzge3ezjpou1o', 'order': 'uuid asc'} {u'kind': u'arvados#objectList', u'etag': u'', u'limit': 1000, u'offset': 2000, u'self_link': u'', u'items_available': 0} {'count': 'none', 'offset': 3000, 'limit': 9223372036854775807, 'uuid': '34t0i-tpzed-1jdzge3ezjpou1o', 'order': 'uuid asc'} {u'kind': u'arvados#objectList', u'etag': u'', u'limit': 1000, u'offset': 3000, u'self_link': u'', u'items_available': 0} {'count': 'none', 'offset': 4000, 'limit': 9223372036854775807, 'uuid': '34t0i-tpzed-1jdzge3ezjpou1o', 'order': 'uuid asc'} {u'kind': u'arvados#objectList', u'etag': u'', u'limit': 1000, u'offset': 4000, u'self_link': u'', u'items_available': 0} {'count': 'none', 'offset': 5000, 'limit': 9223372036854775807, 'uuid': '34t0i-tpzed-1jdzge3ezjpou1o', 'order': 'uuid asc'} {u'kind': u'arvados#objectList', u'etag': u'', u'limit': 1000, u'offset': 5000, u'self_link': u'', u'items_available': 0} {'count': 'none', 'offset': 6000, 'limit': 9223372036854775807, 'uuid': '34t0i-tpzed-1jdzge3ezjpou1o', 'order': 'uuid asc'} {u'kind': u'arvados#objectList', u'etag': u'', u'limit': 1000, u'offset': 6000, u'self_link': u'', u'items_available': 0} {'count': 'none', 'offset': 7000, 'limit': 9223372036854775807, 'uuid': '34t0i-tpzed-1jdzge3ezjpou1o', 'order': 'uuid asc'} {u'kind': u'arvados#objectList', u'etag': u'', u'limit': 1000, u'offset': 7000, u'self_link': u'', u'items_available': 0} {'count': 'none', 'offset': 8000, 'limit': 9223372036854775807, 'uuid': '34t0i-tpzed-1jdzge3ezjpou1o', 'order': 'uuid asc'} {u'kind': u'arvados#objectList', u'etag': u'', u'limit': 1000, u'offset': 8000, u'self_link': u'', u'items_available': 0} {'count': 'none', 'offset': 9000, 'limit': 9223372036854775807, 'uuid': '34t0i-tpzed-1jdzge3ezjpou1o', 'order': 'uuid asc'} {u'kind': u'arvados#objectList', u'etag': u'', u'limit': 1000, u'offset': 9000, u'self_link': u'', u'items_available': 0} {'count': 'none', 'offset': 10000, 'limit': 9223372036854775807, 'uuid': '34t0i-tpzed-1jdzge3ezjpou1o', 'order': 'uuid asc'} {u'kind': u'arvados#objectList', u'etag': u'', u'limit': 1000, u'offset': 10000, u'self_link': u'', u'items_available': 0} {'count': 'none', 'offset': 11000, 'limit': 9223372036854775807, 'uuid': '34t0i-tpzed-1jdzge3ezjpou1o', 'order': 'uuid asc'} {u'kind': u'arvados#objectList', u'etag': u'', u'limit': 1000, u'offset': 11000, u'self_link': u'', u'items_available': 0} {'count': 'none', 'offset': 12000, 'limit': 9223372036854775807, 'uuid': '34t0i-tpzed-1jdzge3ezjpou1o', 'order': 'uuid asc'} {u'kind': u'arvados#objectList', u'etag': u'', u'limit': 1000, u'offset': 12000, u'self_link': u'', u'items_available': 0} {'count': 'none', 'offset': 13000, 'limit': 9223372036854775807, 'uuid': '34t0i-tpzed-1jdzge3ezjpou1o', 'order': 'uuid asc'} {u'kind': u'arvados#objectList', u'etag': u'', u'limit': 1000, u'offset': 13000, u'self_link': u'', u'items_available': 0} {'count': 'none', 'offset': 14000, 'limit': 9223372036854775807, 'uuid': '34t0i-tpzed-1jdzge3ezjpou1o', 'order': 'uuid asc'} {u'kind': u'arvados#objectList', u'etag': u'', u'limit': 1000, u'offset': 14000, u'self_link': u'', u'items_available': 0} {'count': 'none', 'offset': 15000, 'limit': 9223372036854775807, 'uuid': '34t0i-tpzed-1jdzge3ezjpou1o', 'order': 'uuid asc'} {u'kind': u'arvados#objectList', u'etag': u'', u'limit': 1000, u'offset': 15000, u'self_link': u'', u'items_available': 0} {'count': 'none', 'offset': 16000, 'limit': 9223372036854775807, 'uuid': '34t0i-tpzed-1jdzge3ezjpou1o', 'order': 'uuid asc'} {u'kind': u'arvados#objectList', u'etag': u'', u'limit': 1000, u'offset': 16000, u'self_link': u'', u'items_available': 0} {'count': 'none', 'offset': 17000, 'limit': 9223372036854775807, 'uuid': '34t0i-tpzed-1jdzge3ezjpou1o', 'order': 'uuid asc'} {u'kind': u'arvados#objectList', u'etag': u'', u'limit': 1000, u'offset': 17000, u'self_link': u'', u'items_available': 0} {'count': 'none', 'offset': 18000, 'limit': 9223372036854775807, 'uuid': '34t0i-tpzed-1jdzge3ezjpou1o', 'order': 'uuid asc'} {u'kind': u'arvados#objectList', u'etag': u'', u'limit': 1000, u'offset': 18000, u'self_link': u'', u'items_available': 0} {'count': 'none', 'offset': 19000, 'limit': 9223372036854775807, 'uuid': '34t0i-tpzed-1jdzge3ezjpou1o', 'order': 'uuid asc'} {u'kind': u'arvados#objectList', u'etag': u'', u'limit': 1000, u'offset': 19000, u'self_link': u'', u'items_available': 0} {'count': 'none', 'offset': 20000, 'limit': 9223372036854775807, 'uuid': '34t0i-tpzed-1jdzge3ezjpou1o', 'order': 'uuid asc'} {u'kind': u'arvados#objectList', u'etag': u'', u'limit': 1000, u'offset': 20000, u'self_link': u'', u'items_available': 0} {'count': 'none', 'offset': 21000, 'limit': 9223372036854775807, 'uuid': '34t0i-tpzed-1jdzge3ezjpou1o', 'order': 'uuid asc'} {u'kind': u'arvados#objectList', u'etag': u'', u'limit': 1000, u'offset': 21000, u'self_link': u'', u'items_available': 0} {'count': 'none', 'offset': 22000, 'limit': 9223372036854775807, 'uuid': '34t0i-tpzed-1jdzge3ezjpou1o', 'order': 'uuid asc'} {u'kind': u'arvados#objectList', u'etag': u'', u'limit': 1000, u'offset': 22000, u'self_link': u'', u'items_available': 0} {'count': 'none', 'offset': 23000, 'limit': 9223372036854775807, 'uuid': '34t0i-tpzed-1jdzge3ezjpou1o', 'order': 'uuid asc'} {u'kind': u'arvados#objectList', u'etag': u'', u'limit': 1000, u'offset': 23000, u'self_link': u'', u'items_available': 0} {'count': 'none', 'offset': 24000, 'limit': 9223372036854775807, 'uuid': '34t0i-tpzed-1jdzge3ezjpou1o', 'order': 'uuid asc'} (continues forever)
Paging, using offset = result["offset"] + len(result["items"])
and count=exact, and stopping after getting an empty page:
>>> val = arvados.util.list_all(api.groups().contents, 0, uuid="34t0i-tpzed-1jdzge3ezjpou1o") {'count': 'exact', 'offset': 0, 'limit': 9223372036854775807, 'uuid': '34t0i-tpzed-1jdzge3ezjpou1o', 'order': 'uuid asc'} {u'kind': u'arvados#objectList', u'etag': u'', u'limit': 1000, u'offset': 0, u'self_link': u'', u'items_available': 16023} {'count': 'exact', 'offset': 1000, 'limit': 9223372036854775807, 'uuid': '34t0i-tpzed-1jdzge3ezjpou1o', 'order': 'uuid asc'} {u'kind': u'arvados#objectList', u'etag': u'', u'limit': 1000, u'offset': 1000, u'self_link': u'', u'items_available': 16023} {'count': 'exact', 'offset': 2000, 'limit': 9223372036854775807, 'uuid': '34t0i-tpzed-1jdzge3ezjpou1o', 'order': 'uuid asc'} {u'kind': u'arvados#objectList', u'etag': u'', u'limit': 1000, u'offset': 2000, u'self_link': u'', u'items_available': 16023} {'count': 'exact', 'offset': 3000, 'limit': 9223372036854775807, 'uuid': '34t0i-tpzed-1jdzge3ezjpou1o', 'order': 'uuid asc'} {u'kind': u'arvados#objectList', u'etag': u'', u'limit': 1000, u'offset': 3000, u'self_link': u'', u'items_available': 16023} {'count': 'exact', 'offset': 4000, 'limit': 9223372036854775807, 'uuid': '34t0i-tpzed-1jdzge3ezjpou1o', 'order': 'uuid asc'} {u'kind': u'arvados#objectList', u'etag': u'', u'limit': 1000, u'offset': 4000, u'self_link': u'', u'items_available': 16023} {'count': 'exact', 'offset': 5000, 'limit': 9223372036854775807, 'uuid': '34t0i-tpzed-1jdzge3ezjpou1o', 'order': 'uuid asc'} {u'kind': u'arvados#objectList', u'etag': u'', u'limit': 1000, u'offset': 5000, u'self_link': u'', u'items_available': 16023} {'count': 'exact', 'offset': 6000, 'limit': 9223372036854775807, 'uuid': '34t0i-tpzed-1jdzge3ezjpou1o', 'order': 'uuid asc'} {u'kind': u'arvados#objectList', u'etag': u'', u'limit': 1000, u'offset': 6000, u'self_link': u'', u'items_available': 16023} {'count': 'exact', 'offset': 7000, 'limit': 9223372036854775807, 'uuid': '34t0i-tpzed-1jdzge3ezjpou1o', 'order': 'uuid asc'} {u'kind': u'arvados#objectList', u'etag': u'', u'limit': 1000, u'offset': 7000, u'self_link': u'', u'items_available': 16023} {'count': 'exact', 'offset': 8000, 'limit': 9223372036854775807, 'uuid': '34t0i-tpzed-1jdzge3ezjpou1o', 'order': 'uuid asc'} {u'kind': u'arvados#objectList', u'etag': u'', u'limit': 1000, u'offset': 8000, u'self_link': u'', u'items_available': 16023} {'count': 'exact', 'offset': 9000, 'limit': 9223372036854775807, 'uuid': '34t0i-tpzed-1jdzge3ezjpou1o', 'order': 'uuid asc'} {u'kind': u'arvados#objectList', u'etag': u'', u'limit': 1000, u'offset': 9000, u'self_link': u'', u'items_available': 16023} {'count': 'exact', 'offset': 10000, 'limit': 9223372036854775807, 'uuid': '34t0i-tpzed-1jdzge3ezjpou1o', 'order': 'uuid asc'} {u'kind': u'arvados#objectList', u'etag': u'', u'limit': 1000, u'offset': 10000, u'self_link': u'', u'items_available': 16023} {'count': 'exact', 'offset': 11000, 'limit': 9223372036854775807, 'uuid': '34t0i-tpzed-1jdzge3ezjpou1o', 'order': 'uuid asc'} {u'kind': u'arvados#objectList', u'etag': u'', u'limit': 1000, u'offset': 11000, u'self_link': u'', u'items_available': 16023} {'count': 'exact', 'offset': 12000, 'limit': 9223372036854775807, 'uuid': '34t0i-tpzed-1jdzge3ezjpou1o', 'order': 'uuid asc'} {u'kind': u'arvados#objectList', u'etag': u'', u'limit': 1000, u'offset': 12000, u'self_link': u'', u'items_available': 16023} {'count': 'exact', 'offset': 13000, 'limit': 9223372036854775807, 'uuid': '34t0i-tpzed-1jdzge3ezjpou1o', 'order': 'uuid asc'} {u'kind': u'arvados#objectList', u'etag': u'', u'limit': 1000, u'offset': 13000, u'self_link': u'', u'items_available': 16023} {'count': 'exact', 'offset': 14000, 'limit': 9223372036854775807, 'uuid': '34t0i-tpzed-1jdzge3ezjpou1o', 'order': 'uuid asc'} {u'kind': u'arvados#objectList', u'etag': u'', u'limit': 1000, u'offset': 14000, u'self_link': u'', u'items_available': 16023} {'count': 'exact', 'offset': 15000, 'limit': 9223372036854775807, 'uuid': '34t0i-tpzed-1jdzge3ezjpou1o', 'order': 'uuid asc'} {u'kind': u'arvados#objectList', u'etag': u'', u'limit': 1000, u'offset': 15000, u'self_link': u'', u'items_available': 16023} {'count': 'exact', 'offset': 16000, 'limit': 9223372036854775807, 'uuid': '34t0i-tpzed-1jdzge3ezjpou1o', 'order': 'uuid asc'} {u'kind': u'arvados#objectList', u'etag': u'', u'limit': 1000, u'offset': 16000, u'self_link': u'', u'items_available': 16023} {'count': 'exact', 'offset': 16023, 'limit': 9223372036854775807, 'uuid': '34t0i-tpzed-1jdzge3ezjpou1o', 'order': 'uuid asc'} {u'kind': u'arvados#objectList', u'etag': u'', u'limit': 1000, u'offset': 16023, u'self_link': u'', u'items_available': 16023} >>> len(val) 16023
Findings:
- Paging using uuid doesn't work at all group#contents
- group#contents with offset requires count=exact
- Doing some tracing, it appears that it is actually calling list_all() over and over again.
Updated by Peter Amstutz over 7 years ago
On further examination, it seems what is actually happening is that it is getting a steady stream of events from websockets, which are handled by invalidating the whole project directory (which sets the "stale" flag). As a result, every access to the ProjectDirectory sees the "stale" flag and triggers a full reload (which takes ~60s).
Disabling event listening (--disable-event-listening) makes it possible to enter and list home directory contents.
The fix is for the ProjectDirectory class to handle incremental updates.
Updated by Tom Morris over 7 years ago
- Target version changed from 2017-07-19 sprint to 2017-08-02 sprint
Updated by Peter Amstutz over 7 years ago
11158-fuse-projects 26e095318c7c368089ea368e6e4854b62a6c7830
Based on findings in note 16:
- Reverted changes to util.py
- Implemented incremental update of project contents from on websocket events
Updated by Tom Clegg over 7 years ago
LGTM
Happened to notice "if new_name" should probably also be changed to "if new_name is not None" here in fusedir.py, just in case someone has renamed something to "". Except that I suppose renaming to "" is as good as deleting, in the context of a fuse mount.
if new_name:
- if ent:
+ if ent is not None:
self._entries[new_name] = ent
else:
self._add_entry(new_attrs, new_name)
Updated by Peter Amstutz over 7 years ago
- Status changed from In Progress to Resolved