Bug #19362
closedkeep-web notices when collections accessed by uuid have changed
Added by Peter Amstutz over 2 years ago. Updated about 2 years ago.
Description
Users reported issues when opening HTML reports via Browser after updating the collection content via S3. Updated HTML pages are not immediately displayed and require a forced reload.
a) if they are uploading via S3, keep-web itself should be able to know that something changed
b) it seems like it would be a good idea to have keep-web either poll for changes or listen to websockets to find out when collections change instead of waiting for the cache to expire
Updated by Peter Amstutz over 2 years ago
- Target version changed from 2022-08-31 sprint to 2022-08-17 sprint
Updated by Peter Amstutz over 2 years ago
- Assigned To set to Tom Clegg
- Category set to Keep
Updated by Tom Clegg over 2 years ago
- Status changed from New to In Progress
phase 1: Retire the old per-collection code path in favor of using the per-token session/sitefs for all keep-web requests. That will give better cache behavior, and address the case where a user uploads via S3 and then downloads via web using the same token.
phase 2: Mechanism for sitefs to invalidate cached collections on websocket events. That will also be an important piece of the Go-based fuse mount.
Updated by Peter Amstutz over 2 years ago
- Target version changed from 2022-08-17 sprint to 2022-08-31 sprint
Updated by Tom Clegg over 2 years ago
- Target version changed from 2022-08-31 sprint to 2022-09-14 sprint
Updated by Tom Clegg over 2 years ago
19362-all-webdav-via-sitefs @ 25162c18b4290f7763cd545f2d94d96650ebc9d9 -- developer-run-tests: #3287
Merges the sitefs and non-sitefs code paths, so now all requests are done with a sitefs on the backend, using the per-token session cache.
Includes some needed improvements to the filesystem code in sdk/go/arvados:- Fix deadlock in Rename (when renaming to/from root dir of a collectionfs inside a sitefs)
- Sync() in a collectionfs checks for upstream changes before trying to save local changes (if both have changes, upstream wins)
- Unexpected errors (forbidden/unauthorized) are returned to caller with HTTPStatus() interface instead of being converted to os.ErrNotExist
Happily there is a net improvement in LOC: "14 files changed, 523 insertions(+), 777 deletions(-)"
(note this doesn't actually fix/test the collection-out-of-sync issue yet)
Updated by Tom Clegg over 2 years ago
- Target version changed from 2022-09-14 sprint to 2022-09-28 sprint
Updated by Lucas Di Pentima over 2 years ago
Some minor comments & questions:
- File
sdk/go/arvados/fs_site.go
- Lines 158-169: The GET documentation at https://doc.arvados.org/v2.4/api/methods/collections.html already say that only
pdh
&manifest_text
are returned when requesting a collection by PDH, has that behavior changed? (that is, do we need to update our docs?) - Changes on
mountByID()
&mountCollection()
signatures didn't require any update on thefs_site_test.go
file, so I'm assuming we're not testing them. Especially the return error code change. Do you think it would be an excellent opportunity to do so?
- Lines 158-169: The GET documentation at https://doc.arvados.org/v2.4/api/methods/collections.html already say that only
- If we're removing
keepweb_collectioncache
prometheus counters/metrics, should we say so on the upgrading docs to warn admins that may be relying on them? - Are
Collections.WebDAVCache.MaxUUIDEntries
andCollections.WebDAVCache.MaxCollectionEntries
still relevant?
The rest LGTM
Updated by Tom Clegg over 2 years ago
Lucas Di Pentima wrote in #note-10:
- Lines 158-169: The GET documentation at https://doc.arvados.org/v2.4/api/methods/collections.html already say that only
pdh
&manifest_text
are returned when requesting a collection by PDH, has that behavior changed? (that is, do we need to update our docs?)
I think the docs are correct, but test stubs don't always behave that way (also, belt+suspenders). I've updated the comment "Even if the controller returns more fields..."
- Changes on
mountByID()
&mountCollection()
signatures didn't require any update on thefs_site_test.go
file, so I'm assuming we're not testing them. Especially the return error code change. Do you think it would be an excellent opportunity to do so?
Those are used as (*vdirnode).create methods. They're exercised in tests, but mountByID is only called from (*vdirnode)Child() via vn.create(vn, name)
, which you can't find by searching for mountByID(
.
- If we're removing
keepweb_collectioncache
prometheus counters/metrics, should we say so on the upgrading docs to warn admins that may be relying on them?
Ah, good call. Added a release note.
- Are
Collections.WebDAVCache.MaxUUIDEntries
andCollections.WebDAVCache.MaxCollectionEntries
still relevant?
Oops, no. Removed from config struct & config.default.yml, and noted that in the release note too.
19362-all-webdav-via-sitefs @ f1ec6199fbe9b6abdba3a9eba95eda7b46eb5265 -- developer-run-tests: #3296
19362-all-webdav-via-sitefs @ a88aa282a8da52589053244f4ef41283d51f629a (f1ec6199 plus api doc fix)
Updated by Tom Clegg over 2 years ago
19362-s3-webdav-sync @ 6289b570a8eb6aea964c14c8e8c5bf11443f726d -- developer-run-tests: #3301
This changes the sitefs implementation so the different paths to any given project or collection dir (e.g., by_id/$coll_uuid, by_id/$proj_uuid/$coll_name/, home/$proj_name/$coll_name) behave like hardlinks to the same collection: if you create/write/delete a file in one of those paths, it is immediately visible in all others too.
In the context of webdav/s3 this means that during a write operation, we can identify the collection that was modified, and splice the updated version into by_id/$coll_uuid in the current token's persistent readonly sitefs, so the update will be visible in a following read operation without a reload.
This is much more efficient than throwing out the entire cached session after each write, which is how we previously achieved read-after-write consistency.
Updated by Lucas Di Pentima over 2 years ago
Some comments/questions:
- File
sdk/go/arvados/fs_site.go
- Lines 161 & 164: Could those
Unlock()
calls be replaced with just one just after then
var assignment? - Lines 245, 247, 250, 263, 264: I find so much locking/unlocking calls from the same lock somewhat confusing. Not sure if it can be expressed in simpler terms, but just wanted to mention it.
- Lines 161 & 164: Could those
- I think some of the above lock/unlock ops can be removed, I'm reading that although map ops are not atomic, doing map reading operations concurrently is safe (https://go.dev/doc/faq#atomic_maps), wdyt?
The rest LGTM.
Updated by Tom Clegg over 2 years ago
- Lines 161 & 164: Could those
Unlock()
calls be replaced with just one just after then
var assignment?
Yes, good point.
- Lines 245, 247, 250, 263, 264: I find so much locking/unlocking calls from the same lock somewhat confusing. Not sure if it can be expressed in simpler terms, but just wanted to mention it.
I've cleaned up and commented these.
- I think some of the above lock/unlock ops can be removed, I'm reading that although map ops are not atomic, doing map reading operations concurrently is safe (https://go.dev/doc/faq#atomic_maps), wdyt?
That's true, but I don't think there's any opportunity to avoid locking on read ops here, because it's always possible another goroutine is writing.
19362-s3-webdav-sync @ bf6539ec9d5f207be4be363720ef118e25e7abbe -- developer-run-tests: #3302
Updated by Lucas Di Pentima over 2 years ago
Thanks for the clarification, LGTM, thanks!
Updated by Tom Clegg over 2 years ago
- % Done changed from 50 to 100
- Status changed from In Progress to Resolved
Applied in changeset arvados|a9be3117466506dffc39617be1c58406c5914e4b.