Project

General

Profile

Actions

Bug #12764

closed

[Crunch2] Support writable file staging

Added by Peter Amstutz over 6 years ago. Updated about 6 years ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
-
Target version:
Story points:
-

Description

2. Mount points underneath output_path must not use "writable":true. If any of them are set as writable, the API will refuse to create/update the container request, and crunch-run will fail the container.

Proposed alternate behavior, to support CWL writable file staging.

2. Mount points underneath output_path which have "writable":true will be copied into the target output_path and may be updated by the container.

The implementation should be easy - if a file or directory is marked writable and mounted under the output path, it should be copied into place after arv-mount is set up but before the container starts.


Subtasks 3 (0 open3 closed)

Task #12952: Review 12764-writable-fileResolvedPeter Amstutz01/12/2018Actions
Task #13017: crunch-run uses restricted permission parent directoryResolvedPeter Amstutz01/12/2018Actions
Task #13019: Review 12764-secure-tmp-parentResolvedPeter Amstutz01/12/2018Actions
Actions #1

Updated by Peter Amstutz over 6 years ago

  • Status changed from New to In Progress
Actions #2

Updated by Peter Amstutz over 6 years ago

  • Subject changed from [CWL] Attempt to mount file mounted from empty collection PDH to [CWL] writable files not supported, not detected
  • Description updated (diff)
Actions #3

Updated by Peter Amstutz over 6 years ago

  • Subject changed from [CWL] writable files not supported, not detected to [Crunch2] Support writable file staging
  • Description updated (diff)
Actions #4

Updated by Peter Amstutz over 6 years ago

  • Description updated (diff)
Actions #5

Updated by Peter Amstutz over 6 years ago

  • Assigned To set to Peter Amstutz
  • Target version set to 2017-12-20 Sprint
Actions #6

Updated by Peter Amstutz over 6 years ago

  • Status changed from In Progress to New
  • Target version changed from 2017-12-20 Sprint to Arvados Future Sprints
Actions #7

Updated by Peter Amstutz about 6 years ago

  • Status changed from New to In Progress
Actions #8

Updated by Peter Amstutz about 6 years ago

  • Target version changed from Arvados Future Sprints to 2018-01-17 Sprint
Actions #9

Updated by Peter Amstutz about 6 years ago

  • Target version changed from 2018-01-17 Sprint to 2018-01-31 Sprint
Actions #10

Updated by Tom Clegg about 6 years ago

AFAICT the current documented+enforced restriction ("must not be writable") comes from #9397 and its sole purpose was to get the then-required feature out faster by deferring the "copy" implementation.

Mount points underneath output_path which have "writable":true will be copied into the target output_path and may be updated by the container

This sounds like a sensible implementation.

Just a suggestion, we might want to rephrase that a bit for the docs so it describes the user-visible behavior rather than the implementation. The meaning of writable=true ("can modify files, but modifications are not saved") is already mentioned elsewhere but might bear repeating. I wonder if it would be better to mention the consequences of the "copied into the target" part rather than just stating how it's implemented -- e.g., if your output path is in a tmpfs, you need to make sure it's big enough to accommodate copies of the inner writable mounts.

Actions #11

Updated by Peter Amstutz about 6 years ago

Tom Clegg wrote:

Just a suggestion, we might want to rephrase that a bit for the docs so it describes the user-visible behavior rather than the implementation. The meaning of writable=true ("can modify files, but modifications are not saved") is already mentioned elsewhere but might bear repeating. I wonder if it would be better to mention the consequences of the "copied into the target" part rather than just stating how it's implemented -- e.g., if your output path is in a tmpfs, you need to make sure it's big enough to accommodate copies of the inner writable mounts.

Updated text:

"Mount points underneath output_path which have "writable":true are copied into output_path during container initialization and may be updated, renamed, or deleted by the running container. The original collection is not modified. On container completion, files remaining in the output are saved to the output collection. The mount at output_path must be big enough to accommodate copies of the inner writable mounts."

I've rebased on latest and am going to re-run tests, assuming they pass, any other comments prior to merge?

Actions #13

Updated by Tom Clegg about 6 years ago

I'm not keen on this pattern

err = something()
if err == nil {
    // happy path
    if foo {
        err = somethingelse()
    } else {
        err = somethingdifferent()
    }
}
if err != nil {
    return err
}

Probably better to stick with the Go style "don't pyramid / handle errors first", even if that means using the same error fmt string twice.

Might help debugging if the error message mentions which mount was being processed. I guess that would involve adding "bind" to the copyFile struct?

The error handling in the WalkFunc seems a bit suspect. Is it intentional that it silently ignores non-nil incoming walkerr, and non-regular (pipe/socket/device/symlink) files? I would think all of those should be fatal errors, since they are unexpected/"impossible".

Actions #14

Updated by Peter Amstutz about 6 years ago

Tom Clegg wrote:

Might help debugging if the error message mentions which mount was being processed. I guess that would involve adding "bind" to the copyFile struct?

The error handling in the WalkFunc seems a bit suspect. Is it intentional that it silently ignores non-nil incoming walkerr, and non-regular (pipe/socket/device/symlink) files? I would think all of those should be fatal errors, since they are unexpected/"impossible".

Cleaned up error handling a bit:

12764-writable-file @ c8891a5eeec540404c7aff02a200fd95f6dbe64b

Actions #15

Updated by Tom Clegg about 6 years ago

LGTM

Actions #16

Updated by Peter Amstutz about 6 years ago

Tests at https://ci.curoverse.com/job/developer-run-tests/560/

... looks like there's a couple things left to fix.

Actions #17

Updated by Tom Clegg about 6 years ago

reviewing 12764-secure-tmp-parent @ 2eb85f70168d331e49e9d38bfbd2292d586dfcc1

Looks like runner.ParentTemp makes all the runner.CleanupTempDir stuff superfluous, so we should remove it.

apart from that, LGTM

I wonder if it would be worthwhile to call the tempdir "zzzzz-dz642-asdfasdfasdfasd-123456" instead of "crunch-run123456"?

Actions #18

Updated by Peter Amstutz about 6 years ago

  • Target version changed from 2018-01-31 Sprint to 2018-02-14 Sprint
Actions #19

Updated by Peter Amstutz about 6 years ago

  • Status changed from In Progress to Resolved
  • Target version changed from 2018-02-14 Sprint to 2018-01-31 Sprint
Actions

Also available in: Atom PDF