Idea #17849
openFUSE driver v2
Start date:
Due date:
Story points: -
Release:
Release relationship: Auto
Description
Background:
Python+llfuse was expedient and has done lots of good work for us, but it's not promising as a long term (fast+reliable+maintainable) solution.
Implementation:
- collection-backed filesystem from #12483, plus more general arvados-backed filesystem ("by_id" directory, etc, same as the one exported via webdav) from #13111
- present as fuse using a library like https://godoc.org/bazil.org/fuse or https://godoc.org/github.com/billziss-gh/cgofuse/fuse
- package as a subcommand ("mount") of the source:cmd/arvados-client program
- Approach for handling websocket "update" events
- Selectable mechanisms/options for syncing to server (fflush, fsync, close) (on a shell node, flush-on-close, flush-periodically, or flush-after-idle-time might be best; in crunch-run, flush-on-exit might be best)
- Desired behavior when updates conflict (write error? clobber? create "oops,clobbered" file?)
- Old keep block signatures don't get refreshed, so reading a collection that's been cached for too long returns an I/O error
- Not command-line compatible with arv-mount
- Logging is not great
- No docs
- No way to control overall cache size (currently collectionfs can use lots of RAM in certain non-sequential write scenarios; we need the ability to trade speed for space efficiency in memory-constrained environments)
- No warnings given when cache is thrashing
- No application level instrumentation (just optional Go pprof)
- Special .arvados#collection file is incomplete (has manifest_text but not uuid, pdh)
- No automatic flush on sigint/sigterm
- No warning given when trying to exit but filesystem can't be unmounted yet (filehandle is open, or a process's cwd is in the mount)
- Mac port has a race bug (see notes below)
- Windows port is untested
- Cross-compiling recipe for Mac/Windows ports is fragile
- chmod is a no-op (chmod 0700 succeeds, but the file mode will still be 0755)
Related issues
Updated by Peter Amstutz over 3 years ago
- Related to Bug #16727: [FUSE] [cgofuse] Refresh signatures / reload collection instead of using expired blob signatures added
Updated by Peter Amstutz over 3 years ago
- Related to Idea #12308: [FUSE] Golang-based fuse driver added
Updated by Peter Amstutz over 3 years ago
- Start date set to 01/01/2022
- Due date set to 03/31/2022
Updated by Peter Amstutz about 3 years ago
- Start date changed from 01/01/2022 to 04/01/2022
- Due date changed from 03/31/2022 to 07/31/2022
Updated by Peter Amstutz over 2 years ago
- Start date changed from 04/01/2022 to 05/01/2022
- Due date changed from 07/31/2022 to 08/31/2022
Updated by Peter Amstutz over 2 years ago
- Related to Feature #18960: Config option to make crunch-run use Go FUSE driver when all mounts are read-only added
Updated by Peter Amstutz over 2 years ago
- Related to Feature #18961: Go FileSystem / FUSE mount supports block prefetch added
Updated by Peter Amstutz over 2 years ago
- Start date changed from 05/01/2022 to 06/01/2022
- Due date changed from 08/31/2022 to 09/30/2022
Updated by Peter Amstutz over 2 years ago
- Start date changed from 06/01/2022 to 08/31/2022
- Due date changed from 09/30/2022 to 11/30/2022
Updated by Peter Amstutz over 2 years ago
- Start date changed from 08/31/2022 to 09/01/2022
Updated by Peter Amstutz over 2 years ago
- Start date changed from 09/01/2022 to 10/01/2022
- Due date changed from 11/30/2022 to 12/31/2022
Updated by Peter Amstutz about 2 years ago
- Start date changed from 10/01/2022 to 12/01/2022
- Due date changed from 12/31/2022 to 02/28/2023
Updated by Peter Amstutz almost 2 years ago
- Start date changed from 12/01/2022 to 03/01/2023
- Due date changed from 02/28/2023 to 07/31/2023
Updated by Peter Amstutz over 1 year ago
- Start date changed from 03/01/2023 to 06/01/2023
- Due date changed from 07/31/2023 to 10/31/2023
Updated by Peter Amstutz over 1 year ago
- Due date changed from 10/31/2023 to 09/30/2023
Updated by Peter Amstutz over 1 year ago
- Due date changed from 09/30/2023 to 07/31/2023
Updated by Peter Amstutz over 1 year ago
- Start date changed from 06/01/2023 to 10/01/2023
- Due date changed from 07/31/2023 to 12/31/2023
Updated by Peter Amstutz about 1 year ago
- Start date changed from 10/01/2023 to 01/01/2024
- Due date changed from 12/31/2023 to 03/31/2024
Updated by Peter Amstutz 11 months ago
- Start date changed from 01/01/2024 to 05/01/2024
- Due date changed from 03/31/2024 to 10/31/2024
Updated by Peter Amstutz 11 months ago
- Start date changed from 05/01/2024 to 04/01/2024
- Due date changed from 10/31/2024 to 08/31/2024
Updated by Brett Smith 9 months ago
Notes from recent discussions, especially the 2024-03-06 engineering meeting
Reasons to start prioritizing this:
- Heavier user of arv-mount is consistently hitting concurrency and performance issues
- python-llfuse is in maintenance mode. They did a release in November 2023 that should keep us set for a while, but the writing's on the wall.
Potential milestones:
- Go mount can serve crunch-run's purposes
- Go mount can serve users' purposes for read-only mounts
- Support as many command line options as practical (see below)
- Bounded memory use
- Disk caching of data
- Go mount adds --read-write support
Quoting lib/crunchrun/crunchrun.go, here are the exact options crunch-run can currently call arv-mount with:
- --foreground
- --read-write (I think because of --mount-tmp)
- --storage-classes
- --crunchstat-interval
- --allow-other (since the compute work may run as another user inside the container)
- --disk-cache
- --disk-cache-dir
- --file-cache
- --ram-cache
- --mount-tmp
- --mount-by-pdh
- --disable-event-listening (not totally clear why, just trying to reduce network traffic?)
- --mount-by-id
- --unmount-timeout, --unmount (cleanup after the job is done, IMO I think we could implement this with standard tools, see below)
- --version (basic "can run" check)
Implementation notes about specific options:
- Please use a GNU-style argument parser so --long-options still work. Our users use Linux, not Plan 9, so there's no good reason to force them to s/--/-/g in all their tooling.
- By the time we get to phase #2 and start offering this to users, for a combination of reasons of "it's required" or "it's too useful not to have" or "it's very low effort to implement after you have the previous options," we expect to support everything except possibly the following:
  - --read-write (you'll need to support writable tmp mounts for crunch-run, but updating projects and collections is the next phase)
  - --foreground: IMO the new mount should always run in the foreground. This should become a no-op option for compatibility, and we tell users "if you want it to daemonize, run it with systemd-run"
  - --replace, --subtype, --unmount-*: I think the main reason these options exist is because the code exists to support --exec, so we might as well expose it. If we end up reimplementing that same pattern, that's fine. But if there's no need, I think it's low risk to remove these options and tell users to use fusermount -u, findmnt, and other existing tools. I don't think we need to reimplement those just for compatibility.
- Consider this alternative, more ergonomic spelling for --exec: arvados-client mount MOUNTDIR [COMMAND ARG ...]
- Consider #19934 for --exec (start the process with the mount point as the working directory)
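To make the notes above concrete, here is a rough sketch of parsing "[options] MOUNTDIR [COMMAND ARG ...]" with --foreground accepted as a no-op, and of starting COMMAND with the mount point as its working directory (the idea in #19934). It uses only Go's standard flag package, which accepts both -name and --name but is not a full GNU-style parser; it stops at the first positional argument, which happens to suit this spelling because COMMAND's own options are left untouched. The MountArgs type, the function names, and the choice of which options to show are assumptions for illustration only.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"io"
	"os/exec"
)

// MountArgs holds the parsed command line. (Hypothetical type.)
type MountArgs struct {
	ReadWrite bool
	MountDir  string
	Command   []string // optional COMMAND ARG ... to run inside the mount
}

// ParseMountArgs parses "[options] MOUNTDIR [COMMAND ARG ...]".
func ParseMountArgs(args []string) (MountArgs, error) {
	var m MountArgs
	fs := flag.NewFlagSet("mount", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep usage text out of this example's output
	fs.BoolVar(&m.ReadWrite, "read-write", false, "allow writes")
	// Accepted for arv-mount compatibility but ignored: the process
	// always stays in the foreground in this sketch.
	fs.Bool("foreground", true, "no-op, kept for compatibility")
	if err := fs.Parse(args); err != nil {
		return m, err
	}
	rest := fs.Args() // everything from the first positional argument on
	if len(rest) == 0 {
		return m, errors.New("missing MOUNTDIR")
	}
	m.MountDir = rest[0]
	m.Command = rest[1:]
	return m, nil
}

// BuildCommand prepares COMMAND so that it starts with the mount point
// as its working directory. It returns nil if no COMMAND was given.
func BuildCommand(m MountArgs) *exec.Cmd {
	if len(m.Command) == 0 {
		return nil
	}
	cmd := exec.Command(m.Command[0], m.Command[1:]...)
	cmd.Dir = m.MountDir
	return cmd
}

func main() {
	m, err := ParseMountArgs([]string{"--read-write", "--foreground", "/mnt/arv", "ls", "-l"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(m.ReadWrite, m.MountDir, m.Command)
}
```

If options after MOUNTDIR or bundled short options turn out to matter to users, a dedicated GNU-style parsing library would be needed instead of the standard package.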
Updated by Tom Clegg 8 months ago
- Related to Feature #21578: Add debug logging option to arvados-client mount added