Feature #12483

[keep-web] writable webdav

Added by Ward Vandewege over 2 years ago. Updated over 2 years ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
-
Target version:
Start date:
10/25/2017
Due date:
% Done:

100%

Estimated time:
(Total: 0.00 h)
Story points:
0.5

Description

The filesystem returned by (*arvados.Collection)FileSystem() needs Create() and OpenFile() methods.
  • the file object returned by Open/Create/OpenFile should be an io.Writer (in addition to io.Seeker and io.ReadCloser as it is now)
  • the filesystem object returned by FileSystem() should have a Collection() method that returns a new arvados.Collection with the same UUID as the original collection's UUID, and a manifest text and PDH that reflect the modified filesystem.
  • modifying the filesystem (by opening/writing files) does not modify the original collection at all.
keep-web will require some changes:
  • webdav code should call OpenFile with the requested flag argument, and pass through Write calls, instead of calling Open and stubbing Write.)
  • the collection uuid→pdh cache needs to be sufficiently write-aware that a sequence of writes (by a single client) behaves predictably
  • the collection cache needs to be sufficiently write-aware that writing to a collection and then reading from it using its old PDH does not return the modified data
This story does not require (and should not be held up by):
  • De-duplicating existing code that writes to Keep, like crunch-run
  • Optimizing generalized write performance (block packing for small files, performance when doing many short writes to many files at once)
  • Optimizing keep-web performance when writing a large file using many small writes in separate webdav requests
  • Avoiding lost updates when saving updated collections
This story does require:
  • Good block packing in the most common/easy cases (sequential short writes to a single large file result in 64 MiB blocks)
  • Correct behavior for arbitrary sequences of read/write/seek on multiple files in a single filesystem
  • Correct behavior when multiple goroutines are concurrently updating one or more files in a filesystem when each goroutine has its own file object (however, callers are responsible for their own goroutine safety when sharing a single file object)

Subtasks

Task #12578: Review 12483-writable-fsResolvedTom Clegg


Related issues

Related to Arvados - Story #11876: [R SDK] Create a Bioconductor/R SDKClosed06/20/2017

Associated revisions

Revision 195deb7a
Added by Tom Clegg over 2 years ago

Merge branch '12483-writable-fs'

refs #12483

Arvados-DCO-1.1-Signed-off-by: Tom Clegg <>

History

#1 Updated by Ward Vandewege over 2 years ago

  • Target version set to To Be Groomed

#2 Updated by Tom Clegg over 2 years ago

  • Description updated (diff)

#3 Updated by Tom Clegg over 2 years ago

  • Description updated (diff)

#4 Updated by Tom Clegg over 2 years ago

  • Description updated (diff)

#5 Updated by Tom Morris over 2 years ago

  • Target version changed from To Be Groomed to Arvados Future Sprints
  • Story points set to 5.0

#6 Updated by Tom Morris over 2 years ago

  • Assigned To set to Tom Clegg
  • Target version changed from Arvados Future Sprints to 2017-11-22 Sprint

#7 Updated by Tom Clegg over 2 years ago

  • Status changed from New to In Progress

#8 Updated by Tom Clegg over 2 years ago

12483-writable-fs @ 9cee2680c3b1f74d2a43d9a0e2b572168e3b488d

working:
  • writable filesystem
  • webdav client can write
todo:
  • collectionfs: obey (and test) O_APPEND and O_TRUNC; fail on O_SYNC
  • collectionfs: rename
  • marshal: compress adjacent segments ("0:100:foo 100:100:foo 200:100:foo ..." is not ideal)
  • webdav: rename, remove, mkdir, rmdir
  • webdav tests: overwrite existing file
  • webdav tests: unhappy paths -- write without perm, write via PDH
expect to defer:
  • SDK convenience method for updating an existing collection (clients may want different things, none of which are very hard, and keep-web needs to coordinate this with its cache anyway)
  • locking (I think we need, at least, an atomic update operation on the API side to do this properly)

#9 Updated by Peter Amstutz over 2 years ago

Is there any chance you'd consider renaming "extent" to "segment" in your code for consistency with the Python SDK?

#10 Updated by Tom Clegg over 2 years ago

So far I've been using "segments" to mean portions of a stream (a "manifest v1" concept). Those are related to so-called extents -- especially now, because "merge adjacent segments" doesn't happen yet -- but I was hoping to make a cleaner break between the manifest v1 format and the in-memory representation. However...
  • "Extent" occurred to me merely because that's what LVM calls its sector-like things.
  • Looking again at existing code, it seems "segments" is indeed the right word and "portions of a stream" is where I should have avoided the term "segments". Keep manifest format seems to call those "filetokens" so maybe I should stick with that...

#11 Updated by Peter Amstutz over 2 years ago

Tom Clegg wrote:

So far I've been using "segments" to mean portions of a stream (a "manifest v1" concept). Those are related to so-called extents -- especially now, because "merge adjacent segments" doesn't happen yet -- but I was hoping to make a cleaner break between the manifest v1 format and the in-memory representation. However...
  • "Extent" occurred to me merely because that's what LVM calls its sector-like things.
  • Looking again at existing code, it seems "segments" is indeed the right word and "portions of a stream" is where I should have avoided the term "segments". Keep manifest format seems to call those "filetokens" so maybe I should stick with that...

In the Python SDK, the term segment is used to mean a slice of a single block which corresponds to some contiguous span of file data. Segments are what you get after you apply the file spans (file tokens?) to the stream and split on block boundaries. This already is a place where there's a break between the manifest v1 format and in-memory representation.

#12 Updated by Peter Amstutz over 2 years ago

This method needs documentation:

func (dn *dirnode) OpenFile(name string, flag int, perm os.FileMode) (*file, error) {

By my reading, it can be used to open a file, open a directory, create a file, or create a directory. The open/create behavior is based on "flag" and the file/directory behavior is based on "perm". However, if entry already exists, then it returns the actual inode for the entry and ignores "perm". It supports creating leaf directories but not creating all directories along a path (eg mkdir -p).

I see loadManifest() pulls the entire manifest into memory. That's the simplest and most obvious way to do it, unfortunately experience in the Python SDK has shown it becomes a problem with large (100,000-1,000,000 files) manifests. I'm curious if Go is sufficiently more efficient with memory that it won't be as big a problem, or we should start considering strategies for incremental loading (eg per-directory) now.

Suggest renaming type file to type filehandle.

#13 Updated by Tom Clegg over 2 years ago

OpenFile() is analogous to os.OpenFile(), which is essentially open(). (Added a comment)

I've thought about incremental loading too but yes, it seems like a safe bet that Go won't explode as quickly as Python. Some manual big-collection cpu/mem testing is a good todo, to get a sense of where we stand with the easy way.

I considered fileHandle/fileDescriptor but "file" seemed to match the rest of the Go world better (os.File). OTOH, since it's a private type that might not be a real concern.

#14 Updated by Peter Amstutz over 2 years ago

Does os.OpenFile with the Dir mode bit set actually create a directory? (If not, then the behavior is different from os.OpenFile and should be documented a such).

Maybe "pruneMemExtents" be clearer as "flushMemExtents" or "commitMemExtents"?

filenode.pruneMemExtents() and dirnode.sync() have overlapping functionality. Although it seems to be in a gray area between "DRY, should refactor" and "different behavior requires different code".

In Seek(), it looks like it clamps offsets past the end of file. This is (unfortunately) wrong. POSIX allows seeking past the end of file, and when the file pointer is past the end of file the behavior is something like:

  • read returns EOF and no data
  • write resizes the file to fit the current position + newly written data; gaps between the end of the file and the new data is logically filled with zeros (but is allowed to be stored as a "sparse file" on disk)

Since you already have support for extending the file size, this should just be a matter of adding logic to Write() to test if the file pointer is off the end of the file and extending the file to fit.

For marshalManifest() you may want a "strip tokens" flag to generate manifest text suitable for computing the PDH.

  // This is assured by locking the path from root to
  // newdir, then locking the path from root to olddir, skipping
  // any already-locked nodes.

I think this comment has it backwards, I think it is following the path from olddir to root, then newdir to root?

#15 Updated by Peter Amstutz over 2 years ago

Failing test:

FAIL: cadaver_test.go:20: IntegrationSuite.TestWebdavWithCadaver

http://127.0.0.1:40015 {path:/c=zzzzz-4zz18-foonbarfilesdir/t=3kg6k6lzmp9kj5cpkcoxie963cmvjahbt2fod9zru30k1jqdmi/ cmd:ls
 match:(?ms).*dir1 *0 .* data:[]}
http://127.0.0.1:40015 {path:/c=zzzzz-4zz18-foonbarfilesdir/t=3kg6k6lzmp9kj5cpkcoxie963cmvjahbt2fod9zru30k1jqdmi/ cmd:ls dir1
 match:(?ms).*bar *3.*foo *3 .* data:[]}
http://127.0.0.1:40015 {path:/c=zzzzz-4zz18-foonbarfilesdir/t=3kg6k6lzmp9kj5cpkcoxie963cmvjahbt2fod9zru30k1jqdmi/_/dir1 cmd:ls
 match:(?ms).*bar *3.*foo *3 .* data:[]}
http://127.0.0.1:40015 {path:/c=zzzzz-4zz18-foonbarfilesdir/t=3kg6k6lzmp9kj5cpkcoxie963cmvjahbt2fod9zru30k1jqdmi/dir1/ cmd:ls
 match:(?ms).*bar *3.*foo *3 .* data:[]}
http://127.0.0.1:40015 {path:/c=zzzzz-4zz18-ktkedzn4xxpn1ze/t=3kg6k6lzmp9kj5cpkcoxie963cmvjahbt2fod9zru30k1jqdmi/ cmd:get emptyfile '/tmp/checkfile906966746'
 match:(?ms).*Not Found.* data:[]}
http://127.0.0.1:40015 {path:/c=zzzzz-4zz18-ktkedzn4xxpn1ze/t=3kg6k6lzmp9kj5cpkcoxie963cmvjahbt2fod9zru30k1jqdmi/ cmd:put '/tmp/emptyfile414427047' emptyfile
 match:(?ms).*Uploading .* succeeded.* data:[]}
http://127.0.0.1:40015 {path:/c=zzzzz-4zz18-ktkedzn4xxpn1ze/t=3kg6k6lzmp9kj5cpkcoxie963cmvjahbt2fod9zru30k1jqdmi/ cmd:get emptyfile '/tmp/checkfile906966746'
 match:(?ms).*great success.* data:[]}
cadaver_test.go:115:
    c.Check(buf.String(), check.Matches, trial.match)
... value string = "" +
...     "dav:/c=zzzzz-4zz18-ktkedzn4xxpn1ze/t=3kg6k6lzmp9kj5cpkcoxie963cmvjahbt2fod9zru30k\rk1jqdmi/> get emptyfile '/tmp/checkfile906966746'\n" +
...     "Downloading `/c=zzzzz-4zz18-ktkedzn4xxpn1ze/t=3kg6k6lzmp9kj5cpkcoxie963cmvjahbt2fod9zru30k1jqdmi/emptyfile' to /tmp/checkfile906966746: [.] succeeded.\n" +
...     "dav:/c=zzzzz-4zz18-ktkedzn4xxpn1ze/t=3kg6k6lzmp9kj5cpkcoxie963cmvjahbt2fod9zru30k\rk1jqdmi/> \n" +
...     "Connection to `127.0.0.1' closed.\n" 
... regex string = "(?ms).*great success.*" 

http://127.0.0.1:40015 {path:/c=zzzzz-4zz18-ktkedzn4xxpn1ze/t=3kg6k6lzmp9kj5cpkcoxie963cmvjahbt2fod9zru30k1jqdmi/ cmd:put '/tmp/localfile305967464' testfile
 match:(?ms).*Uploading .* succeeded.* data:[]}
http://127.0.0.1:40015 {path:/c=zzzzz-4zz18-ktkedzn4xxpn1ze/t=3kg6k6lzmp9kj5cpkcoxie963cmvjahbt2fod9zru30k1jqdmi/ cmd:get testfile '/tmp/checkfile906966746'
 match:(?ms).*succeeded.* data:[116 104 101 32 104 117 109 97 110 32 116 114 97 103 101 100 121 32 99 111 110 115 105 115 116 115 32 105 110 32 116 104 101 32 110 101 99 101 115 115 105 116 121 32 111 102 32 108 105 118 105 110 103 32 119 105 116 104 32 116 104 101 32 99 111 110 115 101 113 117 101 110 99 101 115 32 111 102 32 97 99 116 105 111 110 115 32 112 101 114 102 111 114 109 101 100 32 117 110 100 101 114 32 116 104 101 32 112 114 101 115 115 117 114 101 32 111 102 32 99 111 109 112 117 108 115 105 111 110 115 32 119 101 32 100 111 32 110 111 116 32 117 110 100 101 114 115 116 97 110 100]}
cadaver_test.go:115:
    c.Check(buf.String(), check.Matches, trial.match)
... value string = "" +
...     "dav:/c=zzzzz-4zz18-ktkedzn4xxpn1ze/t=3kg6k6lzmp9kj5cpkcoxie963cmvjahbt2fod9zru30k\rk1jqdmi/> get testfile '/tmp/checkfile906966746'\n" +
...     "Downloading `/c=zzzzz-4zz18-ktkedzn4xxpn1ze/t=3kg6k6lzmp9kj5cpkcoxie963cmvjahbt2fod9zru30k1jqdmi/testfile' to /tmp/checkfile906966746: [.] failed:\n" +
...     "Could not read response body: connection was closed by server\n" +
...     "dav:/c=zzzzz-4zz18-ktkedzn4xxpn1ze/t=3kg6k6lzmp9kj5cpkcoxie963cmvjahbt2fod9zru30k\rk1jqdmi/> \n" +
...     "Connection to `127.0.0.1' closed.\n" 
... regex string = "(?ms).*succeeded.*" 

cadaver_test.go:124:
    c.Check(got, check.DeepEquals, trial.data)
... obtained []uint8 = []byte{}
... expected []uint8 = []byte{0x74, 0x68, 0x65, 0x20, 0x68, 0x75, 0x6d, 0x61, 0x6e, 0x20, 0x74, 0x72, 0x61, 0x67, 0x65, 0x64, 0x79, 0x20, 0x63, 0x6f, 0x6e, 0x73, 0x69, 0x73, 0x74, 0x73, 0x20, 0x69, 0x6e, 0x20, 0x74, 0x68, 0x65, 0x20, 0x6e, 0x65, 0x63, 0x65, 0x73, 0x73, 0x69, 0x74, 0x79, 0x20, 0x6f, 0x66, 0x20, 0x6c, 0x69, 0x76, 0x69, 0x6e, 0x67, 0x20, 0x77, 0x69, 0x74, 0x68, 0x20, 0x74, 0x68, 0x65, 0x20, 0x63, 0x6f, 0x6e, 0x73, 0x65, 0x71, 0x75, 0x65, 0x6e, 0x63, 0x65, 0x73, 0x20, 0x6f, 0x66, 0x20, 0x61, 0x63, 0x74, 0x69, 0x6f, 0x6e, 0x73, 0x20, 0x70, 0x65, 0x72, 0x66, 0x6f, 0x72, 0x6d, 0x65, 0x64, 0x20, 0x75, 0x6e, 0x64, 0x65, 0x72, 0x20, 0x74, 0x68, 0x65, 0x20, 0x70, 0x72, 0x65, 0x73, 0x73, 0x75, 0x72, 0x65, 0x20, 0x6f, 0x66, 0x20, 0x63, 0x6f, 0x6d, 0x70, 0x75, 0x6c, 0x73, 0x69, 0x6f, 0x6e, 0x73, 0x20, 0x77, 0x65, 0x20, 0x64, 0x6f, 0x20, 0x6e, 0x6f, 0x74, 0x20, 0x75, 0x6e, 0x64, 0x65, 0x72, 0x73, 0x74, 0x61, 0x6e, 0x64}

2017/11/17 21:38:38 "" 200 "OK" 3 "POST" "example.com" "/c=zzzzz-4zz18-fy296fx3hot09f7/foo" "" 
Sent SIGTERM to 1867 (/usr/src/arvados/tmp/keep0.pid)
[keep0] {"level":"info","msg":"caught signal: terminated","time":"2017-11-17T21:38:38.585503632Z"}
[keep0] {"level":"info","msg":"keepstore exiting, pid 1867","time":"2017-11-17T21:38:38.585571119Z"}
Sent SIGTERM to 1878 (/usr/src/arvados/tmp/keep1.pid)
[keep1] {"level":"info","msg":"caught signal: terminated","time":"2017-11-17T21:38:38.690276184Z"}
[keep1] {"level":"info","msg":"keepstore exiting, pid 1878","time":"2017-11-17T21:38:38.690341896Z"}
OOPS: 35 passed, 1 FAILED

#16 Updated by Tom Clegg over 2 years ago

Peter Amstutz wrote:

Does os.OpenFile with the Dir mode bit set actually create a directory? (If not, then the behavior is different from os.OpenFile and should be documented a such).

Yes, this is how mkdir() is implemented.

Maybe "pruneMemExtents" be clearer as "flushMemExtents" or "commitMemExtents"?

"prune" isn't sacred but I would like the name to suggest the "not everything, just some stuff" idea. Flush/commit sound to me like they store everything, like sync() does. Prune is more like "try to keep memory use reasonable" + "try to reduce the latency of the eventual call to sync() by writing out some stuff earlier".

filenode.pruneMemExtents() and dirnode.sync() have overlapping functionality. Although it seems to be in a gray area between "DRY, should refactor" and "different behavior requires different code".

Yeah, that's why I left the "TODO: share code" comment there. Ultimately I'd like to unify the writing and reading sides of the cache. For now, "prune" is just addressing the "don't boil the ocean" goal: I'm expecting it to behave well with everything webdav throws at it. It wouldn't be hard to come up with a series of seeks and writes that uses arbitrary amounts of memory, but I don't want to get hung up on that if webdav can't easily be induced to do it.

In Seek(), it looks like it clamps offsets past the end of file. This is (unfortunately) wrong. POSIX allows seeking past the end of file, and when the file pointer is past the end of file the behavior is something like:

  • read returns EOF and no data
  • write resizes the file to fit the current position + newly written data; gaps between the end of the file and the new data is logically filled with zeros (but is allowed to be stored as a "sparse file" on disk)

Good catch, let's not repeat #11510

For marshalManifest() you may want a "strip tokens" flag to generate manifest text suitable for computing the PDH.

I think that would be a feature of Collection rather than filesystem. Anyway, not needed now, right?

I think this comment has it backwards, it is following the path from olddir to root, then newdir to root.

The list of nodes is built by following leaf to root, but the locking order is root to leaf. The comment was supposed to make this less mysterious/subtle... how about this:

// This is assured by enumerating the nodes on the path from olddir to root
// and the path from newdir to root, and then locking all of those nodes in
// the reverse order, skipping any already-locked nodes.

#17 Updated by Tom Morris over 2 years ago

re: Note-12/13

If manifest "loading" is code for parsing entire text-based manifests which are 10s or 100s of MB even when we're only interested in a few snippets of info, that seems like a dangerous way to bias things. Having the amount processing be proportionate to the amount of information needed seems like a much more scalable approach. Go may be magical, but it's not that magical, and we know that we have users with very, very large manifests.

#18 Updated by Tom Clegg over 2 years ago

12483-writable-fs @ 127e916894bcd16f6978aa6488c81e79a9ca2812
  • Fix seek+write-beyond-EOF semantics
  • Support RemoveAll() and Rename() in collectionfs (but still need support/testing on the webdav side)
  • Support o_append and o_trunc; refuse o_sync
  • Compress adjacent segments (no more 0:100:foo 100:100:foo ...)

I also made some performance improvements in the collection-loading code (on dev vm: 100M+<1.5s to load a manifest with 256K files in 512K dirs; 340M+<0.5s with 256K files in one dir).

I shrank the test case a bit (256K files in 512 dirs) and adjusted the parameters of the Python "medium-size manifest" test to the same (and skipped the "read file data" part) to get a rough comparison:

python2 70s 458M
python3 55s 220M
go 1.09s 176M

#19 Updated by Tom Clegg over 2 years ago

12483-writable-fs @ dc56b929215f826fb057ee5b9b7dfa58ff5ab3ed
  • connects webdav move + delete functions
  • changes "update" behavior so modifications to a collection only happen if/when webdav sends a successful response (this is faster than saving updates on every Close() when multiple files are involved, and it avoids saving collections in unwanted states in scenarios like "deleted dest dir, but then copying/moving source dir failed".
remaining todo:
  • webdav tests: unhappy paths -- write without perm, write via PDH
  • rename extent → segment
  • rename file → filehandle
  • (maybe) combine *dirnode and *filenode into one type, replace type assertions with IsDir() method

#20 Updated by Peter Amstutz over 2 years ago

On the topic of WebDAV, I noticed this request against the current keep-web master is very slow to respond:

curl -v -X PROPFIND -H "Authorization: OAuth2 4invqy35tf70t7hmvdc83ges8ug9cklhgqq1l8gj2cjn18teuq" https://collections.4xphq.arvadosapi.com/c=4xphq-4zz18-9d5b0qm4fgijeyi/_/

We should reproduce and determine if that is fixed by this branch or not.

#21 Updated by Tom Clegg over 2 years ago

12483-writable-fs @ 2fcbfc106c807aa17d2f73ce40fe2a64ed4c5b13
  • rename extent→segment, file→filehandle
  • fixup "update" behavior some more (don't update after readonly webdav operations like PROPFIND)

#22 Updated by Tom Clegg over 2 years ago

  • Target version changed from 2017-11-22 Sprint to 2017-12-06 Sprint

#23 Updated by Tom Clegg over 2 years ago

  • Story points changed from 5.0 to 0.5

#24 Updated by Tom Clegg over 2 years ago

12483-writable-fs @ 95ec747218f048e5bbfb986ff4eaeba2d3d2f80b
  • webdav tests: write without permission, write via PDH

#25 Updated by Peter Amstutz over 2 years ago

The COPY webdav method is unimplemented:

2017-11-27_22:58:49.65804 2017/11/27 22:58:49 "192.168.5.1:51620" 405 "COPY" 0 "COPY" "192.168.5.2:9002" "/c=2tlax-4zz18-vvda5amsqnv4bho/envvar-global.yml" "" 

I don't know if it is needed but I thought I'd mention it.

#26 Updated by Tom Clegg over 2 years ago

12483-writable-fs @ 65495abc7634b2a7cdccd01f7b6b3188b80bf052
  • fix keepclient.CollectionFileReader test that relied on broken seek-past-EOF behavior
  • fix large buffer allocation for small blocks in block_cache.go ("manyblocks" test now takes 6s instead of 3m)
  • fix os.SEEK_CUR→io.SeekCurrent etc. in collection_fs and tests
  • add COPY method

#27 Updated by Peter Amstutz over 2 years ago

Notes on collection_fs.go (non-blockers)

Keeping a pointer to keepClient on every storedSegment seems to be a bit of memory bloat at the margins (increases struct size by 20% multiplied by many thousands of segments).

Would be nice to have comments describing the semantics of each field the on various struct definitons (for example, in storedSegment it's not clear what is the difference between "size" and "length").

Notes on DAV

dav:/c=2tlax-4zz18-vvda5amsqnv4bho/> propget .
Fetching properties for `.':
DAV: supportedlock = <DAV:lockentry xmlns:D='DAV:'><DAV:lockscope><DAV:exclusive></DAV:exclusive></DAV:lockscope><DAV:locktype><DAV:write></DAV:write></DAV:locktype></DAV:lockentry>
DAV: resourcetype = <DAV:collection xmlns:D='DAV:'></DAV:collection>
DAV: displayname = 
DAV: getlastmodified = Mon, 01 Jan 0001 00:00:00 GMT

getlastmodified is bogus, ought to be last_modified_at.

displayname property for collections should be the collection name.

doc.go:

// Keep-web provides read-only HTTP access to files stored in Keep. It

Should be updated for expanded WebDAV support.

Supposedly Windows can mount WebDAV resources. Have we tried it? I know customers requested something similar to arv-mount for Windows, and WebDAV could fulfill that (will probably need project browsing to satisfy that use case, though).

It works in GNOME 3 using the dav:// scheme. Gnome apps like GIMP can see it as a network drive and read/write files. Cool!

Dashes are being quoted unexpectedly. The file span 0:99:envvar-global.yml got turned into 0:99:envvar\055global.yml. It seems like the Python SDK doesn't de-quote this properly, so while it shows up correctly as "envvar-global.yml" in Workbench and WebDAV, it shows up as "envvar\055global.yml" in FUSE . While the Python SDK should be fixed, also consider limiting quoting to whitespace (\040) to avoid unnecessary backwards-incompatible behavior.

The "DELETE" method on a collection (by portable data hash) returns 204 (No content) which is interpreted as success.

Using "MOVE" on a collection by PDH returns 403 (Forbidden).

Using "COPY" on a collection by PDH returns 500 (Internal server error).

The above operations should return a consistent status code (403 or 405).

Should use Cache-control: immutable header for collections accessed by PDH (https://tools.ietf.org/html/rfc8246)

#28 Updated by Peter Amstutz over 2 years ago

Tell me if I have correct understanding of the caching and flush behavior:

When requesting from the cache, get the collection record from API and determine the collection PDH. Then check if the collection PDH exists in the cache. If not, get the manifest by PDH and cache it.

On completion of a write operation to the local Collection object, send the updated manifest to the API server. The client request does not receive a response until both the write and update operations are complete.

On successful write, record the manifest text of the updated collection from the API server response and add a new cache entry.

The cache only saves manifest text. The collection is re-parsed on each operation.

If two concurrent operations write to the same collection, the second one is likely to step on the first one.

#29 Updated by Peter Amstutz over 2 years ago

When embedding the token into the URL, only /c=ID/t=TOKEN/PATH... is allowed. There was a request to support the swapped /t=TOKEN/c=ID/PATH... form as well to make it easier to construct URLs by concatenation.

#30 Updated by Tom Clegg over 2 years ago

Peter Amstutz wrote:

Keeping a pointer to keepClient on every storedSegment seems to be a bit of memory bloat at the margins (increases struct size by 20% multiplied by many thousands of segments).

Locators are typically ~100 bytes so this might be closer to 8%, but yes, it would be more efficient to pass that in ReadAt().

Would be nice to have comments describing the semantics of each field the on various struct definitons (for example, in storedSegment it's not clear what is the difference between "size" and "length").

Added some comments here. Probably lots of other places that would benefit from comments, too.

getlastmodified is bogus, ought to be last_modified_at.

Oops, I forgot to come back to that. Fixed.

displayname property for collections should be the collection name.

It is a conscious decision in the webdav library to return "" for the name of the top-level collection. I'm sure there's some way around it, but I wonder if we can leave it alone for now and revisit when we implement a project/collection hierarchy. The issue might just go away if the top-level is a virtual dir with by_id and stuff like that.

doc.go:

Should be updated for expanded WebDAV support.

Updated.

Supposedly Windows can mount WebDAV resources. Have we tried it?

I have not tried any versions of Windows myself. That possibility certainly helped motivate this.

(will probably need project browsing to satisfy that use case, though).

Yes, many use cases (regardless of OS) will benefit from the ability to browse projects and collections.

Dashes are being quoted unexpectedly. The file span 0:99:envvar-global.yml got turned into 0:99:envvar\055global.yml. It seems like the Python SDK doesn't de-quote this properly, so while it shows up correctly as "envvar-global.yml" in Workbench and WebDAV, it shows up as "envvar\055global.yml" in FUSE . While the Python SDK should be fixed, also consider limiting quoting to whitespace (\040) to avoid unnecessary backwards-incompatible behavior.

Fixed.

The "DELETE" method on a collection (by portable data hash) returns 204 (No content) which is interpreted as success.

Using "MOVE" on a collection by PDH returns 403 (Forbidden).

Using "COPY" on a collection by PDH returns 500 (Internal server error).

The above operations should return a consistent status code (403 or 405).

Fixed and added tests. All return 405 now.

Should use Cache-control: immutable header for collections accessed by PDH (https://tools.ietf.org/html/rfc8246)

This does seem like a good idea, and seems like it should be trivial, but still... I would say it's orthogonal and non-essential here so it seems like more of a "phase 2" wish. Or a separate ticket if we also want to consider other cache optimizations, like returning and checking ETags based on locators, so If-None-Match can work on files in UUID-addressed collections in which other files have changed.

12483-writable-fs @ 1f29f3766ec760168695c8f5bc64aa5ca0c497f1

#31 Updated by Tom Clegg over 2 years ago

Peter Amstutz wrote:

Tell me if I have correct understanding of the caching and flush behavior:

When requesting from the cache, get the collection record from API and determine the collection PDH. Then check if the collection PDH exists in the cache. If not, get the manifest by PDH and cache it.

On completion of a write operation to the local Collection object, send the updated manifest to the API server. The client request does not receive a response until both the write and update operations are complete.

On successful write, record the manifest text of the updated collection from the API server response and add a new cache entry.

The cache only saves manifest text. The collection is re-parsed on each operation.

If two concurrent operations write to the same collection, the second one is likely to step on the first one.

Yes, that all sounds right. (The lost-update problem is not ideal, but "This story does not require (and should not be held up by): [...] Avoiding lost updates when saving updated collections")

Parsing the manifest repeatedly seems wasteful, but it keeps memory down, so in practice it might even be faster. For now it seems like the simplest/safest option.

#32 Updated by Peter Amstutz over 2 years ago

#12714

I noticed that keep-web is using Transfer-Encoding: chunked for index GET and PROPFIND. Maybe it should prepare the response in a buffer so that it can report Content-Length?

#33 Updated by Peter Amstutz over 2 years ago

So this slow PROPFIND problem is not intrinsic to the writable feature, so from that perspective perhaps we could investigate further in another ticket. On the other hand, the purpose of this feature is to support the R SDK accessing collections, so if it is horribly slow, it hasn't really fulfilled that requirement.

#34 Updated by Peter Amstutz over 2 years ago

Moved PROPFIND issue to #12715

#35 Updated by Peter Amstutz over 2 years ago

Alright. Just one last documentation update to doc/install/install-keep-web.html then I think it will be ready to merge.

#37 Updated by Peter Amstutz over 2 years ago

Alright. Looks good to me!

#38 Updated by Tom Clegg over 2 years ago

Summary of future work / corners of the ocean we didn't boil here:
  • Expose tree of project/collection names, e.g., keep-web-host/home/projectname/collectionname/foo.txt
  • Cache filesystem objects in keep-web, instead of rebuilding from manifest on each request → less time, more RAM
  • Reduce memory use (e.g., pass keepclient pointer into storedSegment methods instead of keeping a copy in the struct)
  • Incremental loading (don't populate a dirnode until someone accesses it)
  • Cache/evict partial data blocks
  • Test/benchmark using a variety of webdav clients (Windows, linux davfs, OSX?)

#39 Updated by Tom Clegg over 2 years ago

  • Status changed from In Progress to Resolved

Also available in: Atom PDF