Feature #8015
closed[Crunch2] Keep-backed mount points: tmp/output dir, read-only collection, read-only by-id dir
Description
Implement mount points in crunch-run
as described here: https://dev.arvados.org/projects/arvados/wiki/Containers_API#Mount-types
Scope¶
Support the following mount types:- kind=collection, writable=false, uuid=X
- kind=collection, writable=false, portable_data_hash=X
- kind=collection, writable=true, no uuid or portable_data_hash
- kind=tmp, device_type={disk or network} (for now this can be a plain docker data volume, although at some point we might need something like a loopback device so we can reserve/limit capacity)
Note the "kind=collection, writable=true, and uuid-or-portable_data_hash provided" feature is not included here because it requires additional support from arv-mount that isn't implemented yet.
Approach¶
- Before starting the docker container, build an arv-mount command line and run arv-mount in a new temporary directory
- Add host bind mounts to the container, mapping the arv-mount directories to the desired targets inside the container.
- Run the docker container, wait for it to finish
- If the container used one of its mount points as an output directory, record its output (i.e., get the manifest_text from the JSON object in the magic
.arvados#collection
file, create a new collection, and record the resulting PDH) - Run "fusermount -u" to unmount the arv-mount directory
- Delete temporary directories
Updated by Tom Clegg almost 9 years ago
- Subject changed from [Crunch2] Set up mount points to [Crunch2] Keep-backed mount points: tmp/output dir, read-only collection, read-only by-id dir
Updated by Tom Clegg almost 9 years ago
- Description updated (diff)
- Category set to Crunch
Updated by Peter Amstutz almost 9 years ago
Note that kind=collection, writable=false, uuid=X
will make the job impure, because the collection could change. I thought we were trying to avoid that sort of thing?
Updated by Brett Smith almost 9 years ago
- Target version deleted (
2016-02-17 Sprint) - Assigned To deleted (
Peter Amstutz)
Updated by Peter Amstutz almost 9 years ago
This is ready for review on @ 8015-crunch2-mount
Updated by Brett Smith almost 9 years ago
- Status changed from New to In Progress
- Assigned To set to Peter Amstutz
- Target version set to 2016-02-17 Sprint
Updated by Tom Clegg almost 9 years ago
Peter Amstutz wrote:
We want toNote that
kind=collection, writable=false, uuid=X
will make the job impure, because the collection could change. I thought we were trying to avoid that sort of thing?
- permit impure containers
- make it harder to accidentally and unknowingly run impure containers
Use of this kind of mount is an example of how clients like Workbench (and job re-use features) can detect/predict/guess that a container is impure. But I don't think crunch-run needs to care one way or the other...
Updated by Brett Smith almost 9 years ago
- Target version changed from 2016-02-17 Sprint to 2016-03-02 sprint
- Story points changed from 2.0 to 0.5
Updated by Peter Amstutz almost 9 years ago
A couple notes:
This branch ended up bringing #8020 (handle outputs) along for the ride, so some of the bulk is from adding the ability to upload results (in addition to using writable keep). The uploading code was adapted from the existing crunchrunner code so it's not completely new.
I after the actual arv-mount handling, most of the rest of the changes are from general improvements/refactoring around logging. I went to the Docker API and figured how to correctly handle stdout/stderr multiplexed on the single stream.
Updated by Radhika Chippada almost 9 years ago
crunchrun.go
- It does not appear that SetupSignals would return any non-nil error?
- Should runner.CrunchLog.Printf be using closeerr instead of readerr in the following? (one more like this in the next if statement as well)
if closeerr != nil {
runner.CrunchLog.Printf("While closing stdout logs: %v", readerr)
}
- AttachStreams comment says “AttachLogs”
- Line 644 (CrunchLog.Close()): Would this result in writing the log (with any errors in steps 8 and 9) to keep again ? Would this result in changes to collection such as pdh?
- Line 650: Should this be “err = waiterr” instead of runerr?
upload.go
- func ReadFrom reads from reader and writes to keep. Should it be better named to reflect this? ReadAndWrite or something like that
- The comment “CollectionWriter makes implements creating new Keep collections” seems like too many verbs makes and implements
- 6 tests are failing (it could be my env; if they are passing for you, you can ignore this comment)
Updated by Peter Amstutz almost 9 years ago
Radhika Chippada wrote:
crunchrun.go
- It does not appear that SetupSignals would return any non-nil error?
Fixed to not return anything now (because signal.Notify() doesn't return an error either).
- Should runner.CrunchLog.Printf be using closeerr instead of readerr in the following? (one more like this in the next if statement as well)
if closeerr != nil {
runner.CrunchLog.Printf("While closing stdout logs: %v", readerr)
}
Fixed.
- AttachStreams comment says “AttachLogs”
Fixed. Also fixed error handling around that code path.
- Line 644 (CrunchLog.Close()): Would this result in writing the log (with any errors in steps 8 and 9) to keep again ? Would this result in changes to collection such as pdh?
The code is correct, after committing to keep in CommitLogs() it creates a new logger (so that any late shutdown errors can be logged to the API server) but without a CollectionFileWriter attached.
- Line 650: Should this be “err = waiterr” instead of runerr?
Fixed.
upload.go
- func ReadFrom reads from reader and writes to keep. Should it be better named to reflect this? ReadAndWrite or something like that
It's named that way to fulfill the "ReaderFrom" interface (https://golang.org/pkg/io/)
- The comment “CollectionWriter makes implements creating new Keep collections” seems like too many verbs makes and implements
Fixed.
- 6 tests are failing (it could be my env; if they are passing for you, you can ignore this comment)
With arvbox restart test --only services/crunch-run
I get 1 error. Looking into it.
Updated by Radhika Chippada almost 9 years ago
I am still not using arvbox. All tests passed for me. LGTM
Updated by Peter Amstutz almost 9 years ago
- Status changed from In Progress to Resolved
- % Done changed from 67 to 100
Applied in changeset arvados|commit:0f8dc3d824f03b82b8db9f61bbcc0592b62b998f.