Project

General

Profile

Actions

Bug #10008

open

[SDKs] [Python] When reading data through Collection et al., signatures should refresh automatically when needed

Added by Tom Clegg over 8 years ago. Updated 10 months ago.

Status:
New
Priority:
Normal
Assigned To:
-
Category:
SDKs
Target version:
Story points:
-
Release:
Release relationship:
Auto

Description

Background

TokenExpiryTest in source:services/fuse/tests/test_mount.py is slow and unreliable because it relies on sleep(). (fixed)

There is signature-refreshing logic ("re-fetch collection from API server before signatures expire") in arv-mount. However:
  • It checks expiry times only when a file is opened. If a file is left open for a long time, reading from the file will eventually stop working.
  • Other programs using the Python SDK need the same auto-refresh behavior for the same reasons as arv-mount, but no facility is provided.

Proposed fix

Implement manifest-refreshing logic in the Python SDK. Reading from a file should always work:
  • regardless of how much time passes between instantiating the Collection, opening a file, writing, and reading.
  • regardless of whether the collection is writable or not, and regardless of whether it is referenced by PDH or UUID.

This will make the arv-mount token-refresh mechanism (added in notes 4-6 here) redundant, so that can be reverted.


Subtasks 1 (0 open1 closed)

Task #10330: Review 10008-check-token-exp-on-openResolvedTom Clegg09/12/2016Actions

Related issues 1 (0 open1 closed)

Related to Arvados - Bug #6833: [FUSE] arv-mount caches manifests too long, ends up using expired Keep signaturesResolvedPeter Amstutz07/30/2015Actions
Actions #1

Updated by Tom Clegg over 8 years ago

WIP: 10008-flaky-token-test

Unfortunately, the new test seems to indicate the token-refresh behavior is actually broken:
  • time is mocked so it appears to advance by 13 days each time open() is called
  • open() is called 3 times in a collection directory
  • API server logs show a single "get collection" request (expect 3)
Actions #2

Updated by Tom Clegg over 8 years ago

  • Status changed from New to In Progress
  • Assigned To set to Tom Clegg
  • Story points set to 0.5
Actions #3

Updated by Tom Clegg about 8 years ago

  • Target version set to 2016-10-26 sprint
Actions #4

Updated by Tom Clegg about 8 years ago

10008-check-token-exp-on-open

test b73985d8a0c9173aec57f6a81fe540b2813a5bff

Actions #5

Updated by Lucas Di Pentima about 8 years ago

  • File services/fuse/arvados_fuse/__init__.py:
    • Line 504: Could you please explain in a comment why is the parent traversal needed on open()?
    • Line 595: Is the method opendir() needing the same parent traversal to touch() all of its ancestors too?
  • Is there any test covering the case when the token is expired? If I’m understanding this test correctly, it checks that the token is refreshed if the fuse is accessed every 13 days, right?
Actions #6

Updated by Tom Clegg about 8 years ago

Lucas Di Pentima wrote:

  • File services/fuse/arvados_fuse/__init__.py:
    • Line 504: Could you please explain in a comment why is the parent traversal needed on open()?

Added.

  • Line 595: Is the method opendir() needing the same parent traversal to touch() all of its ancestors too?

The passage of time (and the resulting invalidation of the signature tokens) is the only kind of event we don't expect websocket to tell us about. So opendir() doesn't really need this.

Of course, I suppose that means this really belongs in read(): if a file is read after being left open for 2 weeks, even open() won't have a chance to refresh.

  • Is there any test covering the case when the token is expired? If I’m understanding this test correctly, it checks that the token is refreshed if the fuse is accessed every 13 days, right?

Yes, that's right. There test covers only "old enough to be nearly expired", not "actually expired". I'm not too worried about it. This would be a good integration test, but for that we'd also have to figure out how to convince keepstore to expire tokens immediately...

7bf89e17945bb6610d6339409a6a411803851434

Notes from meeting:
  • Checking in read() shouldn't wreck performance
  • Even checking in read() might not be enough
    • we will initiate an update, but will the read progress before the updated response arrives?
    • if we have a new collection when we're fulfilling the read(), but we created the collection, subdir, and file inodes back when we had the old collection, will we use the new collection's block locators, or will the cached inodes still cause us to use the old ones?
  • It would be better to fix this (re-fetch the manifest when reading data, if we notice the tokens expire soon) in the Python SDK so arv-mount wouldn't have to do anything special. However, if it's not too hard to fix arv-mount, we might as well do that in the meantime.
Actions #7

Updated by Peter Amstutz about 8 years ago

A few notes:

  1. Operations.lookup() calls p[name] which turns into Directory.__getitem__ which is wrapped by @check_update. This means when a lookup happens, it is supposed to synchronously refresh the underlying directory if it is stale.
  2. FUSE directory entry caching behavior has changed over kernel versions, so it is possible that lookup() used to always occur before open(), but now sometimes does not happen if the kernel has the directory entry in cache.
  3. CollectionDirectory.update has an early exit on self.collection_record is not None and portable_data_hash_pattern.match(self.collection_locator) which suggests that collections referenced by portable data hash never get refreshed, even if they need to be in order to refresh tokens.
  4. ArvadosFiles which are the same except for token updates are applied as TOK changes (in other words, Collection.update() should be presumed to work, and the problem is that it isn't being called when it is supposed to.)

Updating is synchronous and it will hold a lock on the collection which will prevent any other activity on the collection while it is updating, so there should not be an update race.

Actions #8

Updated by Peter Amstutz about 8 years ago

Suggested fix:

CollectionDirectory.update needs to force update when poll time is exceeded, even for read only collections.

FuseArvadosFile.readfrom() should be wrapped with @check_update. When created, FuseArvadosFile should have poll time set to match the parent.

FuseArvadosFile needs to have a reference to either the parent directory object or the inodes table (to look up parent_inode) so that it can initiate an update().

Actions #9

Updated by Tom Clegg about 8 years ago

  • Subject changed from [FUSE] [Tests] Fix flaky test TokenExpiryTest to [FUSE] [SDKs] When reading data through Collection et al., signatures should refresh automatically when needed
Actions #10

Updated by Tom Clegg about 8 years ago

  • Target version deleted (2016-10-26 sprint)
Actions #11

Updated by Tom Clegg about 8 years ago

  • Category changed from Tests to FUSE
  • Assigned To deleted (Tom Clegg)
Actions #12

Updated by Tom Clegg almost 8 years ago

  • Assigned To set to Tom Clegg
  • Target version set to 2017-03-01 sprint
Actions #13

Updated by Tom Clegg almost 8 years ago

  • Description updated (diff)
Actions #14

Updated by Tom Clegg almost 8 years ago

  • Subject changed from [FUSE] [SDKs] When reading data through Collection et al., signatures should refresh automatically when needed to [SDKs] [Python] When reading data through Collection et al., signatures should refresh automatically when needed
Actions #15

Updated by Tom Clegg almost 8 years ago

  • Category changed from FUSE to SDKs
  • Status changed from In Progress to New
  • Assigned To deleted (Tom Clegg)
  • Target version deleted (2017-03-01 sprint)
  • Story points deleted (0.5)
Actions #16

Updated by Tom Morris over 7 years ago

  • Target version set to Arvados Future Sprints
Actions #17

Updated by Ward Vandewege over 3 years ago

  • Target version deleted (Arvados Future Sprints)
Actions #18

Updated by Peter Amstutz almost 2 years ago

  • Release set to 60
Actions #19

Updated by Peter Amstutz 10 months ago

  • Target version set to Future
Actions

Also available in: Atom PDF