Project

General

Profile

Actions

Bug #20083

closed

keep-web sometimes tries to update a collection when the user has read-only access, causing an error

Added by Brett Smith almost 2 years ago. Updated almost 2 years ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
Keep
Target version:
Story points:
-
Release relationship:
Auto

Description

Problem: A user using the Java SDK reports that when they try to download files from a large collection via keep-web, theyoccasionally get this error:

sync failed: update zzzzz-4zz18-aaaaaaaaaaaaaaa: request failed: https://api.example.com/arvados/v1/collections/zzzzz-4zz18-aaaaaaaaaaaaaaa: 403 Forbidden: //railsapi.internal/arvados/v1/collections/zzzzz-4zz18-aaaaaaaaaaaaaaa: 403 Forbidden: Uuid  zzzzz-4zz18-aaaaaaaaaaaaaaa is not writable by zzzzz-tpzed-bbbbbbbbbbbbbbb (req-3613d933bafr1itnlnpf)

We have not 100% tracked this down, but handler.ServeHTTP has this code:

    forceReload := false
    if cc := r.Header.Get("Cache-Control"); strings.Contains(cc, "no-cache") || strings.Contains(cc, "must-revalidate") {
        forceReload = true
    }
    …
    if forceReload {
        err := collectionDir.Sync()
        …

The Java SDK sends Cache-Control: no-cache with all requests, so it can go down this path. fsCollection.Sync can sometimes issue an update request for the underlying collection, and returns an error message in the format above when that fails, which it would if the user didn't have permission to write the collection.

One way or another, keep-web should successfully serve the request if the user has permission to read but not write it.


Subtasks 1 (0 open1 closed)

Task #20105: Review 20083-sync-readonlyResolvedTom Clegg02/15/2023Actions
Actions #2

Updated by Tom Clegg almost 2 years ago

Keep-web "force-reload" is implemented by calling Sync(), which also writes local changes back to API if there are any. In a read operation there shouldn't be any local changes. However the way the SDK checks for changes is by generating a manifest and comparing its checksum to the last known PDH, which can return a false positive if the original manifest has non-semantic spelling differences (e.g., files listed in non-lexical order).

Here's a failing test case, confirming this can happen.

20083-sync-readonly @ c4c57180c9ee9f79a1d272710aa7b8747d4d0c38

I think the fix will be to compute the PDH locally after initializing/reloading from upstream, and use that instead of the API-provided PDH as the last-saved value when checking for local changes.

Actions #3

Updated by Tom Clegg almost 2 years ago

  • Assigned To set to Tom Clegg
  • Status changed from New to In Progress

20083-sync-readonly @ 4b1a3fdcb5065f7a34eeeda662e31f01d8224392 -- developer-run-tests: #3479 retry wb1 developer-run-tests-apps-workbench-integration: #3752

20083-backport-250 @ 42bb84a001024e466687fb579992c6cfc82c979a -- developer-run-tests: #3480

It would be much more efficient to use a "dirty" flag for this, but this less invasive change seems more suitable for a patch.

Actions #4

Updated by Peter Amstutz almost 2 years ago

  • Target version changed from Future to 2023-02-15 sprint
Actions #5

Updated by Peter Amstutz almost 2 years ago

  • Release set to 57
Actions #6

Updated by Peter Amstutz almost 2 years ago

  • Target version changed from 2023-02-15 sprint to 2023-03-01 sprint
Actions #8

Updated by Tom Clegg almost 2 years ago

  • % Done changed from 0 to 100
  • Status changed from In Progress to Resolved
Actions

Also available in: Atom PDF