Bug #21541

closed

arv-mount KeyError during cap_cache - Seemingly lost track of parent inode

Added by Brett Smith 10 months ago. Updated 9 months ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: FUSE
Story points: -
Release relationship: Auto

Description

A user's arv-mount process crashed with this traceback. Afterward, trying to list files in the mount root returned EIO.

2024-02-23 23:36:17 arvados.arvados_fuse[2803055] ERROR: Unhandled exception during FUSE operation
Traceback (most recent call last):
  File "venv/lib/python3.10/site-packages/arvados_fuse/__init__.py", line 327, in catch_exceptions_wrapper
    return orig_func(self, *args, **kwargs)
  File "venv/lib/python3.10/site-packages/arvados_fuse/__init__.py", line 570, in lookup
    self.inodes.touch(p)
  File "venv/lib/python3.10/site-packages/arvados_fuse/__init__.py", line 276, in touch
    self.inode_cache.touch(entry)
  File "venv/lib/python3.10/site-packages/arvados_fuse/__init__.py", line 234, in touch
    self.manage(obj)
  File "venv/lib/python3.10/site-packages/arvados_fuse/__init__.py", line 228, in manage
    self.cap_cache()
  File "venv/lib/python3.10/site-packages/arvados_fuse/__init__.py", line 212, in cap_cache
    self._remove(ent, True)
  File "venv/lib/python3.10/site-packages/arvados_fuse/__init__.py", line 186, in _remove
    obj.kernel_invalidate()
  File "venv/lib/python3.10/site-packages/arvados_fuse/fusedir.py", line 220, in kernel_invalidate
    parent = self.inodes[self.parent_inode]
  File "venv/lib/python3.10/site-packages/arvados_fuse/__init__.py", line 260, in __getitem__
    return self._entries[item]
KeyError: 865

This exact same traceback appeared seven times in one second. It's not clear whether that's multiple threads running into the same issue, or the error recurring because of different accesses.

Note that this mount is intentionally accessible to multiple users on the host, so you can assume there was concurrent access. Unfortunately, for the same reason, it's hard to know whether a specific operation caused the error.
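
For anyone reading the traceback, here is a minimal sketch of one ordering that would raise the same KeyError: a parent entry leaves the inode table before a child's kernel_invalidate() looks it up. This is a toy model, not the arvados_fuse code, and it is not a confirmed root cause. The names ToyInodes, ToyEntry, and demo are hypothetical; only the lookup inside kernel_invalidate mirrors the line shown in the traceback.

```python
# Toy model only: hypothetical classes, NOT the arvados_fuse implementation.

class ToyInodes:
    """Maps inode numbers to entries, like the table indexed in the traceback."""

    def __init__(self):
        self._entries = {}

    def __getitem__(self, item):
        return self._entries[item]

    def add(self, inode, entry):
        self._entries[inode] = entry

    def remove(self, inode):
        del self._entries[inode]


class ToyEntry:
    def __init__(self, inodes, inode, parent_inode):
        self.inodes = inodes
        self.inode = inode
        self.parent_inode = parent_inode

    def kernel_invalidate(self):
        # Same shape as the failing line: raises KeyError if the parent
        # has already been dropped from the table.
        parent = self.inodes[self.parent_inode]
        return parent.inode


def demo():
    inodes = ToyInodes()
    inodes.add(865, ToyEntry(inodes, 865, 1))
    child = ToyEntry(inodes, 900, 865)
    inodes.add(900, child)
    # Assumed ordering: the parent is removed before the child is invalidated.
    inodes.remove(865)
    try:
        child.kernel_invalidate()
    except KeyError as error:
        return error.args[0]
    return None


print(demo())
```

If something like this is what happened, the question becomes which code path let inode 865 leave the table while an entry still named it as a parent.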


Files

arv-mount-stress-test.py (5.44 KB), Brett Smith, 03/01/2024 11:54 PM
arv-mount-stress-test.py (5.93 KB), Version 3, Brett Smith, 03/22/2024 03:13 PM

Subtasks 2 (0 open, 2 closed)

Task #21555: Review 21541-arv-mount-keyerror-rebase (Resolved, Peter Amstutz, 04/02/2024)
Task #21596: Review https://github.com/arvados/python-llfuse (Resolved, Peter Amstutz, 03/18/2024)

Related issues 2 (0 open, 2 closed)

Related to Arvados - Bug #21568: arv-mount double free or corruption with many concurrent accesses (Resolved, Peter Amstutz)
Related to Arvados - Bug #21607: arv-mount memory usage grows over time (Resolved)