Project

General

Profile

Actions

Bug #19362

closed

keep-web notices when collections accessed by uuid have changed

Added by Peter Amstutz 6 months ago. Updated about 2 months ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
Keep
Target version:
Start date:
09/13/2022
Due date:
% Done:

100%

Estimated time:
(Total: 0.00 h)
Story points:
-
Release relationship:
Auto

Description

Users reported issues when opening HTML reports via Browser after updating the collection content via S3. Updated HTML pages are not immediately displayed and require a forced reload.

a) if they are uploading via S3, keep-web itself should be able to know that something changed

b) it seems like it would be a good idea to have keep-web either poll for changes or listen to websockets to find out when collections change instead of waiting for the cache to expire


Subtasks 2 (0 open2 closed)

Task #19394: Review 19362-all-webdav-via-sitefsResolvedLucas Di Pentima09/13/2022

Actions
Task #19568: Review 19362-s3-webdav-syncResolvedTom Clegg09/21/2022

Actions
Actions #2

Updated by Peter Amstutz 6 months ago

  • Description updated (diff)
Actions #3

Updated by Peter Amstutz 6 months ago

  • Target version changed from 2022-08-31 sprint to 2022-08-17 sprint
Actions #4

Updated by Peter Amstutz 6 months ago

  • Assigned To set to Tom Clegg
  • Category set to Keep
Actions #5

Updated by Tom Clegg 6 months ago

  • Status changed from New to In Progress

phase 1: Retire the old per-collection code path in favor of using the per-token session/sitefs for all keep-web requests. That will give better cache behavior, and address the case where a user uploads via S3 and then downloads via web using the same token.

phase 2: Mechanism for sitefs to invalidate cached collections on websocket events. That will also be an important piece of the Go-based fuse mount.

Actions #6

Updated by Peter Amstutz 6 months ago

  • Target version changed from 2022-08-17 sprint to 2022-08-31 sprint
Actions #7

Updated by Tom Clegg 5 months ago

  • Target version changed from 2022-08-31 sprint to 2022-09-14 sprint
Actions #8

Updated by Tom Clegg 5 months ago

19362-all-webdav-via-sitefs @ 25162c18b4290f7763cd545f2d94d96650ebc9d9 -- developer-run-tests: #3287

Merges the sitefs and non-sitefs code paths, so now all requests are done with a sitefs on the backend, using the per-token session cache.

Includes some needed improvements to the filesystem code in sdk/go/arvados:
  • Fix deadlock in Rename (when renaming to/from root dir of a collectionfs inside a sitefs)
  • Sync() in a collectionfs checks for upstream changes before trying to save local changes (if both have changes, upstream wins)
  • Unexpected errors (forbidden/unauthorized) are returned to caller with HTTPStatus() interface instead of being converted to os.ErrNotExist

Happily there is a net improvement in LOC: "14 files changed, 523 insertions(+), 777 deletions(-)"

(note this doesn't actually fix/test the collection-out-of-sync issue yet)

Actions #9

Updated by Tom Clegg 5 months ago

  • Target version changed from 2022-09-14 sprint to 2022-09-28 sprint
Actions #10

Updated by Lucas Di Pentima 5 months ago

Some minor comments & questions:

  • File sdk/go/arvados/fs_site.go
    • Lines 158-169: The GET documentation at https://doc.arvados.org/v2.4/api/methods/collections.html already say that only pdh & manifest_text are returned when requesting a collection by PDH, has that behavior changed? (that is, do we need to update our docs?)
    • Changes on mountByID() & mountCollection() signatures didn't require any update on the fs_site_test.go file, so I'm assuming we're not testing them. Especially the return error code change. Do you think it would be an excellent opportunity to do so?
  • If we're removing keepweb_collectioncache prometheus counters/metrics, should we say so on the upgrading docs to warn admins that may be relying on them?
  • Are Collections.WebDAVCache.MaxUUIDEntries and Collections.WebDAVCache.MaxCollectionEntries still relevant?

The rest LGTM

Actions #11

Updated by Tom Clegg 4 months ago

Lucas Di Pentima wrote in #note-10:

I think the docs are correct, but test stubs don't always behave that way (also, belt+suspenders). I've updated the comment "Even if the controller returns more fields..."

  • Changes on mountByID() & mountCollection() signatures didn't require any update on the fs_site_test.go file, so I'm assuming we're not testing them. Especially the return error code change. Do you think it would be an excellent opportunity to do so?

Those are used as (*vdirnode).create methods. They're exercised in tests, but mountByID is only called from (*vdirnode)Child() via vn.create(vn, name), which you can't find by searching for mountByID(.

  • If we're removing keepweb_collectioncache prometheus counters/metrics, should we say so on the upgrading docs to warn admins that may be relying on them?

Ah, good call. Added a release note.

  • Are Collections.WebDAVCache.MaxUUIDEntries and Collections.WebDAVCache.MaxCollectionEntries still relevant?

Oops, no. Removed from config struct & config.default.yml, and noted that in the release note too.

19362-all-webdav-via-sitefs @ f1ec6199fbe9b6abdba3a9eba95eda7b46eb5265 -- developer-run-tests: #3296

19362-all-webdav-via-sitefs @ a88aa282a8da52589053244f4ef41283d51f629a (f1ec6199 plus api doc fix)

Actions #12

Updated by Lucas Di Pentima 4 months ago

This LGTM, thanks!!

Actions #13

Updated by Tom Clegg 4 months ago

19362-s3-webdav-sync @ 6289b570a8eb6aea964c14c8e8c5bf11443f726d -- developer-run-tests: #3301

This changes the sitefs implementation so the different paths to any given project or collection dir (e.g., by_id/$coll_uuid, by_id/$proj_uuid/$coll_name/, home/$proj_name/$coll_name) behave like hardlinks to the same collection: if you create/write/delete a file in one of those paths, it is immediately visible in all others too.

In the context of webdav/s3 this means that during a write operation, we can identify the collection that was modified, and splice the updated version into by_id/$coll_uuid in the current token's persistent readonly sitefs, so the update will be visible in a following read operation without a reload.

This is much more efficient than throwing out the entire cached session after each write, which is how we previously achieved read-after-write consistency.

Actions #14

Updated by Lucas Di Pentima 4 months ago

Some comments/questions:

  • File sdk/go/arvados/fs_site.go
    • Lines 161 & 164: Could those Unlock() calls be replaced with just one just after the n var assignment?
    • Lines 245, 247, 250, 263, 264: I find so much locking/unlocking calls from the same lock somewhat confusing. Not sure if it can be expressed in simpler terms, but just wanted to mention it.
  • I think some of the above lock/unlock ops can be removed, I'm reading that although map ops are not atomic, doing map reading operations concurrently is safe (https://go.dev/doc/faq#atomic_maps), wdyt?

The rest LGTM.

Actions #15

Updated by Tom Clegg 4 months ago

  • Lines 161 & 164: Could those Unlock() calls be replaced with just one just after the n var assignment?

Yes, good point.

  • Lines 245, 247, 250, 263, 264: I find so much locking/unlocking calls from the same lock somewhat confusing. Not sure if it can be expressed in simpler terms, but just wanted to mention it.

I've cleaned up and commented these.

  • I think some of the above lock/unlock ops can be removed, I'm reading that although map ops are not atomic, doing map reading operations concurrently is safe (https://go.dev/doc/faq#atomic_maps), wdyt?

That's true, but I don't think there's any opportunity to avoid locking on read ops here, because it's always possible another goroutine is writing.

19362-s3-webdav-sync @ bf6539ec9d5f207be4be363720ef118e25e7abbe -- developer-run-tests: #3302

Actions #16

Updated by Lucas Di Pentima 4 months ago

Thanks for the clarification, LGTM, thanks!

Actions #17

Updated by Tom Clegg 4 months ago

  • % Done changed from 50 to 100
  • Status changed from In Progress to Resolved
Actions #18

Updated by Peter Amstutz about 2 months ago

  • Release set to 47
Actions

Also available in: Atom PDF