Project

General

Profile

Actions

Bug #11226

closed

[SDK] Python client tries to cache discovery document at the same local path regardless of user

Added by Peter Amstutz almost 8 years ago. Updated almost 8 years ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
-
Target version:
Story points:
0.5

Description

The Python SDK instructs httplib2 to cache the discovery document in ~/.cache/arvados/discovery. However, the google API client code also has its own cache logic, which caches the discovery document in /tmp/google-api-python-client-discovery-doc.cache.

This is a problem particularly on a multi-user system such as a shell node. The file gets created in a global location like /tmp, then only the original user can update the file.

From looking at the google-api-python-client code, the destination is chosen as os.path.join(tempfile.gettempdir(), FILENAME) where FILENAME = 'google-api-python-client-discovery-doc.cache'

https://github.com/google/google-api-python-client/blob/dc6c1efea8e6c89afa94df984d9291359cb04797/googleapiclient/discovery_cache/file_cache.py#L87

Proposed fix

See note-6.


Subtasks 1 (0 open1 closed)

Task #11233: ReviewResolvedTom Clegg03/10/2017Actions

Related issues 1 (0 open1 closed)

Related to Arvados - Bug #10669: googleapiclient.errors.InvalidJsonError retrieving discovery documentResolvedTom Clegg12/06/2016Actions
Actions #1

Updated by Peter Amstutz almost 8 years ago

  • Description updated (diff)
Actions #2

Updated by Peter Amstutz almost 8 years ago

  • Description updated (diff)
Actions #3

Updated by Peter Amstutz almost 8 years ago

  • Description updated (diff)
Actions #4

Updated by Tom Clegg almost 8 years ago

  • Subject changed from [SDK] Not using ~/.cache/arvados/discovery to [SDK] Python client tries to cache discovery document at the same local path regardless of user
  • Description updated (diff)
Actions #5

Updated by Tom Clegg almost 8 years ago

  • Description updated (diff)
Actions #6

Updated by Peter Amstutz almost 8 years ago

https://github.com/google/google-api-python-client/pull/127

I think we can use the cache_discovery option to build() to disable the Google API client caching.

Actions #7

Updated by Tom Clegg almost 8 years ago

  • Description updated (diff)
Actions #8

Updated by Peter Amstutz almost 8 years ago

Also want to do one of:

a) continue to use httplib2 caching, which is apparently not multithread/multiprocess safe

b) provide our own httplib2 cache which is multithread/multiprocess safe

c) provide a google api client cache object (with build(cache=XXX)) which is multithread/multiprocess safe:
https://github.com/google/google-api-python-client/blob/v1.4.2/googleapiclient/discovery_cache/base.py

Actions #9

Updated by Tom Clegg almost 8 years ago

Suggest we
  1. pass cache_discovery=False to google-api-client so we get back to what we thought we were doing, i.e., sensible http caching
  2. also fix #10669 so the http cache is safe for multiple processes to use
Actions #10

Updated by Peter Amstutz almost 8 years ago

+1

Actions #11

Updated by Peter Amstutz almost 8 years ago

  • Assigned To set to Peter Amstutz
  • Story points set to 0.5
Actions #12

Updated by Peter Amstutz almost 8 years ago

  • Target version set to 2017-03-15 sprint
Actions #13

Updated by Peter Amstutz almost 8 years ago

  • Status changed from New to Resolved
  • % Done changed from 0 to 100

Applied in changeset arvados|commit:82697fea93b1c87cdee27d2b9a76c1b7ac07497e.

Actions #14

Updated by Tom Morris almost 8 years ago

I take it this got instagroomed while I was gone.

Does this mean that #10669 is fixed (and can be closed) as well?

Actions #15

Updated by Peter Amstutz almost 8 years ago

Tom Morris wrote:

I take it this got instagroomed while I was gone.

Does this mean that #10669 is fixed (and can be closed) as well?

No, this only disables google-api-client caching so that it uses the httplib2 cache as originally intended. The bug in #10669 is that the httplib2 cache is not multi-thread/multi-process safe, and which makes it possible to get corrupted. The google-api-client cache actually does do file locking, but puts the cache file in the global /tmp and doesn't provide a way to change that except for changing the global tempfile path (which has side effects for everything using the tempfile module).

Actions

Also available in: Atom PDF