Feature #12483
Updated by Tom Clegg about 7 years ago
The filesystem returned by (*arvados.Collection).FileSystem() needs Create() and OpenFile() methods.
* the file object returned by Open/Create/OpenFile should be an io.Writer (in addition to io.Seeker and io.ReadCloser as it is now)
* the filesystem object returned by FileSystem() should have a Collection() method that returns a new arvados.Collection with the same UUID as the original collection, and a manifest text and PDH that reflect the modified filesystem.
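A rough sketch of the shape these requirements suggest, for discussion only. The interface names (File, FileSystem), the Collection struct fields, and the method signatures (modeled on os.Open, os.Create and os.OpenFile) are assumptions, not the SDK's actual definitions; memFile is a toy in-memory stand-in that only exercises the read/write/seek contract.

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// Collection is a hypothetical stand-in for arvados.Collection.
type Collection struct {
	UUID, PortableDataHash, ManifestText string
}

// File is a hypothetical interface for the object returned by
// Open/Create/OpenFile: the current read/seek/close behavior plus io.Writer.
type File interface {
	io.Reader
	io.Writer
	io.Seeker
	io.Closer
}

// FileSystem lists the methods requested above (assumed signatures).
type FileSystem interface {
	Open(name string) (File, error)
	Create(name string) (File, error)
	OpenFile(name string, flag int, perm os.FileMode) (File, error)
	// Collection returns a new Collection with the original UUID and a
	// manifest text and PDH reflecting the modified filesystem.
	Collection() Collection
}

// memFile is a toy in-memory File, not a Keep-backed implementation.
type memFile struct {
	data []byte
	pos  int64
}

func (f *memFile) Read(p []byte) (int, error) {
	if f.pos >= int64(len(f.data)) {
		return 0, io.EOF
	}
	n := copy(p, f.data[f.pos:])
	f.pos += int64(n)
	return n, nil
}

func (f *memFile) Write(p []byte) (int, error) {
	end := f.pos + int64(len(p))
	if end > int64(len(f.data)) {
		// Grow the file (zero-filled) so the write fits.
		f.data = append(f.data, make([]byte, end-int64(len(f.data)))...)
	}
	copy(f.data[f.pos:], p)
	f.pos = end
	return len(p), nil
}

func (f *memFile) Seek(off int64, whence int) (int64, error) {
	var base int64
	switch whence {
	case io.SeekStart:
	case io.SeekCurrent:
		base = f.pos
	case io.SeekEnd:
		base = int64(len(f.data))
	default:
		return 0, fmt.Errorf("invalid whence %d", whence)
	}
	if base+off < 0 {
		return 0, fmt.Errorf("negative position")
	}
	f.pos = base + off
	return f.pos, nil
}

func (f *memFile) Close() error { return nil }

// roundTrip writes, seeks back, overwrites part of the data, then reads it all.
func roundTrip(f File) (string, error) {
	if _, err := io.WriteString(f, "hello world"); err != nil {
		return "", err
	}
	if _, err := f.Seek(6, io.SeekStart); err != nil {
		return "", err
	}
	if _, err := io.WriteString(f, "keep!"); err != nil {
		return "", err
	}
	if _, err := f.Seek(0, io.SeekStart); err != nil {
		return "", err
	}
	buf, err := io.ReadAll(f)
	return string(buf), err
}

func main() {
	s, err := roundTrip(&memFile{})
	fmt.Println(s, err) // prints "hello keep! <nil>"
}
```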
keep-web will require some changes:
* webdav code should call OpenFile with the requested flag argument, and pass through Write calls, instead of calling Open and stubbing Write.
* the collection uuid→pdh cache needs to be sufficiently write-aware that a sequence of writes (by a single client) behaves predictably
* the collection cache needs to be sufficiently write-aware that writing to a collection and then reading from it using its old PDH does not return the modified data
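One possible way to meet the two cache requirements above, as a minimal sketch rather than keep-web's actual cache. It assumes manifests are stored under their PDH and never modified in place, so a write only adds a new PDH entry and repoints the UUID. The type and method names are hypothetical, the UUID and manifest strings are placeholders, and the PDH formula (MD5 of the manifest text, "+", manifest length) is stated here as an assumption.

```go
package main

import (
	"crypto/md5"
	"fmt"
	"sync"
)

// cache is a hypothetical write-aware collection cache.
type cache struct {
	mu            sync.Mutex
	pdhByUUID     map[string]string // uuid -> current PDH
	manifestByPDH map[string]string // PDH -> immutable manifest text
}

func newCache() *cache {
	return &cache{
		pdhByUUID:     map[string]string{},
		manifestByPDH: map[string]string{},
	}
}

// pdh computes a portable data hash (assumed format: md5 hex + "+" + length).
func pdh(manifest string) string {
	return fmt.Sprintf("%x+%d", md5.Sum([]byte(manifest)), len(manifest))
}

// Update records the manifest produced by a write. The entry for the old
// PDH is left untouched; only the uuid -> pdh mapping moves forward.
func (c *cache) Update(uuid, manifest string) string {
	c.mu.Lock()
	defer c.mu.Unlock()
	h := pdh(manifest)
	c.manifestByPDH[h] = manifest
	c.pdhByUUID[uuid] = h
	return h
}

// ByUUID returns the most recently written manifest for uuid.
func (c *cache) ByUUID(uuid string) (string, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	m, ok := c.manifestByPDH[c.pdhByUUID[uuid]]
	return m, ok
}

// ByPDH returns the manifest stored under exactly that PDH.
func (c *cache) ByPDH(h string) (string, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	m, ok := c.manifestByPDH[h]
	return m, ok
}

func main() {
	c := newCache()
	uuid := "zzzzz-4zz18-xxxxxxxxxxxxxxx" // placeholder UUID
	oldPDH := c.Update(uuid, "manifest v1\n")
	c.Update(uuid, "manifest v2\n")

	byOld, _ := c.ByPDH(oldPDH)
	byUUID, _ := c.ByUUID(uuid)
	fmt.Printf("%q %q\n", byOld, byUUID)
}
```

Reads by UUID see the latest write, while reads by the old PDH keep returning the unmodified data.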
This story does not require (and should not be held up by):
* De-duplicating existing code that writes to Keep, like crunch-run
* Optimizing generalized write performance (block packing for small files, performance when doing many short writes to many files at once)
* Optimizing keep-web performance when writing a large file using many small writes in separate webdav requests
* Avoiding lost updates when saving updated collections
This story does require:
* Good block packing in the most common/easy cases (sequential short writes to a single large file result in 64 MiB blocks)
* Correct behavior for arbitrary sequences of read/write/seek on multiple files in a single collection
* Correct behavior when multiple goroutines are concurrently updating one or more files in a collection when each goroutine has its own file object (however, callers are responsible for their own goroutine safety when sharing a single file object)
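The block-packing requirement for the easy case (sequential short writes to one file) can be illustrated with a simple buffering writer. This is a sketch under assumptions, not the proposed implementation: packer, its flush callback, and blockSizes are hypothetical names, and the demo uses a tiny block size where the real one would be 64 MiB (64 << 20). Like the file objects described above, a packer is not safe to share between goroutines.

```go
package main

import "fmt"

// packer accumulates sequential writes and hands off a block each time
// exactly blockSize bytes have been buffered.
type packer struct {
	blockSize int
	buf       []byte
	flush     func(block []byte) // hypothetical hook, e.g. "store block in Keep"
}

// Write splits data across block boundaries so every flushed block except
// possibly the last one is exactly blockSize bytes long.
func (p *packer) Write(data []byte) (int, error) {
	n := len(data)
	for len(data) > 0 {
		room := p.blockSize - len(p.buf)
		if room > len(data) {
			room = len(data)
		}
		p.buf = append(p.buf, data[:room]...)
		data = data[room:]
		if len(p.buf) == p.blockSize {
			p.flush(p.buf)
			p.buf = nil // the flushed slice now belongs to the callback
		}
	}
	return n, nil
}

// Close flushes whatever is left as a final short block.
func (p *packer) Close() error {
	if len(p.buf) > 0 {
		p.flush(p.buf)
		p.buf = nil
	}
	return nil
}

// blockSizes performs nWrites writes of writeSize bytes each and reports
// the sizes of the resulting blocks.
func blockSizes(blockSize, writeSize, nWrites int) []int {
	var sizes []int
	p := &packer{
		blockSize: blockSize,
		flush:     func(b []byte) { sizes = append(sizes, len(b)) },
	}
	chunk := make([]byte, writeSize)
	for i := 0; i < nWrites; i++ {
		p.Write(chunk)
	}
	p.Close()
	return sizes
}

func main() {
	// Seven 3-byte writes with an 8-byte block size.
	fmt.Println(blockSizes(8, 3, 7)) // prints [8 8 5]
}
```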