Bug #20083
closedkeep-web sometimes tries to update a collection when the user has read-only access, causing an error
Description
Problem: A user using the Java SDK reports that when they try to download files from a large collection via keep-web, theyoccasionally get this error:
sync failed: update zzzzz-4zz18-aaaaaaaaaaaaaaa: request failed: https://api.example.com/arvados/v1/collections/zzzzz-4zz18-aaaaaaaaaaaaaaa: 403 Forbidden: //railsapi.internal/arvados/v1/collections/zzzzz-4zz18-aaaaaaaaaaaaaaa: 403 Forbidden: Uuid zzzzz-4zz18-aaaaaaaaaaaaaaa is not writable by zzzzz-tpzed-bbbbbbbbbbbbbbb (req-3613d933bafr1itnlnpf)
We have not 100% tracked this down, but handler.ServeHTTP
has this code:
forceReload := false if cc := r.Header.Get("Cache-Control"); strings.Contains(cc, "no-cache") || strings.Contains(cc, "must-revalidate") { forceReload = true } … if forceReload { err := collectionDir.Sync() …
The Java SDK sends Cache-Control: no-cache
with all requests, so it can go down this path. fsCollection.Sync
can sometimes issue an update request for the underlying collection, and returns an error message in the format above when that fails, which it would if the user didn't have permission to write the collection.
One way or another, keep-web should successfully serve the request if the user has permission to read but not write it.
Updated by Tom Clegg almost 2 years ago
Keep-web "force-reload" is implemented by calling Sync(), which also writes local changes back to API if there are any. In a read operation there shouldn't be any local changes. However the way the SDK checks for changes is by generating a manifest and comparing its checksum to the last known PDH, which can return a false positive if the original manifest has non-semantic spelling differences (e.g., files listed in non-lexical order).
Here's a failing test case, confirming this can happen.
20083-sync-readonly @ c4c57180c9ee9f79a1d272710aa7b8747d4d0c38
I think the fix will be to compute the PDH locally after initializing/reloading from upstream, and use that instead of the API-provided PDH as the last-saved value when checking for local changes.
Updated by Tom Clegg almost 2 years ago
- Assigned To set to Tom Clegg
- Status changed from New to In Progress
20083-sync-readonly @ 4b1a3fdcb5065f7a34eeeda662e31f01d8224392 -- developer-run-tests: #3479 retry wb1 developer-run-tests-apps-workbench-integration: #3752
20083-backport-250 @ 42bb84a001024e466687fb579992c6cfc82c979a -- developer-run-tests: #3480
It would be much more efficient to use a "dirty" flag for this, but this less invasive change seems more suitable for a patch.
Updated by Peter Amstutz almost 2 years ago
- Target version changed from Future to 2023-02-15 sprint
Updated by Peter Amstutz almost 2 years ago
- Target version changed from 2023-02-15 sprint to 2023-03-01 sprint
Updated by Brett Smith almost 2 years ago
Tom Clegg wrote in #note-3:
20083-sync-readonly @ 4b1a3fdcb5065f7a34eeeda662e31f01d8224392 -- developer-run-tests: #3479 retry wb1 developer-run-tests-apps-workbench-integration: #3752
20083-backport-250 @ 42bb84a001024e466687fb579992c6cfc82c979a -- developer-run-tests: #3480
Both LGTM, thanks.
Updated by Tom Clegg almost 2 years ago
- % Done changed from 0 to 100
- Status changed from In Progress to Resolved
Applied in changeset arvados|94dcdefbf8fe264daa28c5b15f68b304c683e390.