Feature #5778

open

[FUSE] Support efficient copy at command line

Added by Peter Amstutz over 9 years ago. Updated 9 months ago.

Status: New
Priority: Normal
Assigned To: -
Category: FUSE
Target version:
Story points: 2.0
Release:
Release relationship: Auto

Description

Keep can perform efficient copy-on-write of files and directories, but POSIX doesn't provide an API for this. We've decided not to abuse standard hardlinks: while similar (in the "fast copy" sense), hardlinks offer incompatible semantics ("two filenames refer to the same data; writes to either file are reflected in both").

Possible approaches for exposing COW capability through arv-mount:

  • Use the BTRFS clone ioctl() (requires support for handling ioctl() in llfuse). Users could then run cp --reflink.
  • Use the s3fs approach of writing a special xattr to a special location to request a COW link. Users would run a custom command to communicate with the file system.
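
For reference, here is a rough sketch of what the client side of the first approach looks like: a Python equivalent of `cp --reflink=auto`, which issues the clone ioctl and falls back to a plain byte copy if the file system rejects it. The function name `reflink_or_copy` is made up for illustration. The constant is the Linux FICLONE request number (the same value as BTRFS_IOC_CLONE). This only shows the request a file system would have to handle; it does not imply arv-mount supports it today.

```python
import fcntl
import shutil

# Linux FICLONE ioctl request number (same value as BTRFS_IOC_CLONE).
FICLONE = 0x40049409


def reflink_or_copy(src, dst):
    """Try to clone src into dst; fall back to copying bytes.

    Returns True if the file system accepted the clone request,
    False if a regular copy was made instead.
    """
    with open(src, "rb") as fsrc, open(dst, "wb") as fdst:
        try:
            # Ask the file system to share src's data blocks with dst.
            fcntl.ioctl(fdst.fileno(), FICLONE, fsrc.fileno())
            return True
        except OSError:
            # Clone not supported here (or source and target are on
            # different file systems), so copy the data the ordinary way.
            shutil.copyfileobj(fsrc, fdst)
            return False
```

On a file system without clone support the call simply returns False after copying, which is roughly how `cp --reflink=auto` behaves.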

Meanwhile, the following workaround is possible without modifying the FUSE driver (and could be provided as a "copy" CLI program):

  • Determine the source and target collections, then perform the copy using the Arvados SDK. Results show up in the target directory on refresh.
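
A minimal sketch of such a "copy" helper, under some assumptions: both paths live under the mount's `by_id` directory, and the Python SDK's `Collection.copy()` accepts a `source_collection` argument (check this against the installed SDK version). The names `split_mount_path` and `cow_copy` are hypothetical. The SDK import is deferred so the path handling can be tried without the SDK installed.

```python
import re
from pathlib import PurePosixPath

# Collection UUIDs and portable data hashes (PDH).
UUID_RE = re.compile(r"^[a-z0-9]{5}-4zz18-[a-z0-9]{15}$")
PDH_RE = re.compile(r"^[0-9a-f]{32}\+\d+$")


def split_mount_path(path, mount_root):
    """Split <mount_root>/by_id/<collection>/<rest> into (collection, rest).

    Assumes the path goes through the by_id directory of the mount.
    Raises ValueError for paths outside it.
    """
    rel = PurePosixPath(path).relative_to(PurePosixPath(mount_root) / "by_id")
    parts = rel.parts
    if not parts or not (UUID_RE.match(parts[0]) or PDH_RE.match(parts[0])):
        raise ValueError("not a collection path: %s" % path)
    return parts[0], "/".join(parts[1:])


def cow_copy(src, dst, mount_root):
    """Copy src to dst by manifest manipulation, without moving data."""
    import arvados.collection  # deferred: needs the Arvados Python SDK

    src_id, src_path = split_mount_path(src, mount_root)
    dst_id, dst_path = split_mount_path(dst, mount_root)
    if not src_path or not dst_path:
        raise ValueError("expected a path inside each collection")
    source = arvados.collection.CollectionReader(src_id)
    target = arvados.collection.Collection(dst_id)
    # Only block references are copied; file data stays where it is in Keep.
    target.copy(src_path, dst_path, source_collection=source)
    target.save()
```

The target must be addressed by UUID for `save()` to make sense, since a collection mounted by PDH is read-only.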
Actions #1

Updated by Peter Amstutz over 9 years ago

  • Description updated (diff)
Actions #2

Updated by Peter Amstutz over 9 years ago

  • Category set to FUSE
Actions #3

Updated by Tom Clegg over 9 years ago

  • Description updated (diff)
  • Target version set to Arvados Future Sprints
Actions #4

Updated by Ward Vandewege over 3 years ago

  • Target version deleted (Arvados Future Sprints)
Actions #5

Updated by Joshua Randall over 3 years ago

For a limited use-case in which you want to use arv-mount to drive the actual copying (i.e. which file(s) to copy from one collection to another), I guess the (partial) workaround might be:
- duplicate input collections using the CLI or SDK into temporary collections
- use arv-mount read-write with the input collections mounted by ID
- mv (rename) the files of interest from the input collections to the output collection
- (optionally) delete the temporary duplicated collections

Does this make sense, or is a simpler workaround possible today?

Another option that might enable this use case without the external duplication step would be a flag for arv-mount that allows renames to succeed against sources in read-only collections (i.e. when the input is specified by PDH).

Currently an attempt to do that fails with "Operation not permitted". That makes sense, as the PDH mount point is read-only even when using `--read-write`, and that is clearly the correct default behaviour. As a compromise, though, arv-mount could offer an option that lets a user opt in to having an `mv` command succeed against a fundamentally read-only source without actually modifying that source.

I guess of the other options mentioned in this story, the one that enables `cp --reflink` seems the most user-friendly. Is it possible with llfuse today?

Actions #6

Updated by Peter Amstutz over 3 years ago

It's been a rather long time since we looked into this, but the issue at the time was that the request cp --reflink sends to the file system wasn't propagated to FUSE.

I don't know if that was a limitation of the FUSE kernel interface, libfuse, or llfuse (probably not the last one). It is quite possible the situation has improved at some point in the last 5 years.

My preferred solution is still to reinterpret hard link requests as copy-on-write. A program that relies on POSIX semantics that closely is likely to run into other, more fundamental problems running on top of arv-mount before "expects modifications made to a hard-linked file to show up in both files" becomes a problem.

Do you have a use case for this?

Actions #7

Updated by Peter Amstutz almost 2 years ago

  • Release set to 60
Actions #8

Updated by Peter Amstutz 9 months ago

  • Target version set to Future