Bug #11226
closed
[SDK] Python client tries to cache discovery document at the same local path regardless of user
Description
The Python SDK instructs httplib2 to cache the discovery document in ~/.cache/arvados/discovery. However, the Google API client code also has its own cache logic, which caches the discovery document in /tmp/google-api-python-client-discovery-doc.cache.
This is a problem particularly on a multi-user system such as a shell node. The file is created in a global location (/tmp) by whichever user runs the client first, and from then on only that user can update it.
From looking at the google-api-python-client code, the destination is chosen as os.path.join(tempfile.gettempdir(), FILENAME), where FILENAME = 'google-api-python-client-discovery-doc.cache'.
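To illustrate why the path collides, here is a minimal sketch that recomputes the destination the way the description says the library does, next to the per-user location the SDK intends to use. The function names are made up for this note and are not part of either library; the `home` parameter exists only to make the comparison easy to see.

```python
import os
import tempfile

# Filename constant quoted in the description above.
FILENAME = 'google-api-python-client-discovery-doc.cache'


def discovery_cache_path():
    """Recompute the shared cache path as described above.

    The result depends on tempfile.gettempdir(), not on the calling user,
    so every user on the machine gets the same path unless something like
    TMPDIR is set differently per user.
    """
    return os.path.join(tempfile.gettempdir(), FILENAME)


def per_user_cache_path(home=None):
    """Hypothetical helper: the per-user directory the SDK intends to use."""
    home = home or os.path.expanduser('~')
    return os.path.join(home, '.cache', 'arvados', 'discovery')


if __name__ == '__main__':
    print(discovery_cache_path())
    print(per_user_cache_path('/home/alice'))
    print(per_user_cache_path('/home/bob'))
```

The first path is the same for both users, while the second differs per home directory.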
Proposed fix
See note-6.
Updated by Tom Clegg almost 8 years ago
- Subject changed from [SDK] Not using ~/.cache/arvados/discovery to [SDK] Python client tries to cache discovery document at the same local path regardless of user
- Description updated (diff)
Updated by Peter Amstutz almost 8 years ago
https://github.com/google/google-api-python-client/pull/127
I think we can use the cache_discovery option to build() to disable the Google API client caching.
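A rough sketch of what that could look like. The wrapper name and the injectable `build_fn` parameter are made up for this note (the injection only keeps the sketch runnable without the library installed); it is not the SDK's actual code.

```python
def build_without_discovery_cache(service_name, version, build_fn=None, **kwargs):
    """Hypothetical wrapper: call build() with cache_discovery=False.

    build_fn defaults to googleapiclient.discovery.build and is imported
    lazily, so a stand-in can be passed for testing.
    """
    if build_fn is None:
        from googleapiclient.discovery import build as build_fn
    # Force the library's own discovery cache off, even if the caller
    # passed a different value.
    kwargs['cache_discovery'] = False
    return build_fn(service_name, version, **kwargs)
```

Any HTTP-level caching then comes only from whatever http object is passed through in kwargs.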
Updated by Peter Amstutz almost 8 years ago
Also want to do one of:
a) continue to use httplib2 caching, which is apparently not multithread/multiprocess safe
b) provide our own httplib2 cache which is multithread/multiprocess safe
c) provide a google api client cache object (with build(cache=XXX)) which is multithread/multiprocess safe: https://github.com/google/google-api-python-client/blob/v1.4.2/googleapiclient/discovery_cache/base.py
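For option (c), a sketch of what such a cache object might look like, assuming the interface in the linked base.py is just get(url) and set(url, content). The class name, the one-file-per-URL layout, and the directory default are made up for illustration. It assumes Python 3 on a POSIX system (fcntl.flock) and has not been tried against build() itself.

```python
import fcntl
import hashlib
import os


class PerUserDiscoveryCache(object):
    """Hypothetical per-user discovery cache with get(url)/set(url, content).

    Each URL is stored in its own file under a per-user directory, and
    flock() is used so concurrent readers and writers do not see a
    half-written document.
    """

    def __init__(self, cache_dir=None):
        self.cache_dir = cache_dir or os.path.join(
            os.path.expanduser('~'), '.cache', 'arvados', 'discovery')
        os.makedirs(self.cache_dir, mode=0o700, exist_ok=True)

    def _path(self, url):
        # Hash the URL so it is always a safe filename.
        digest = hashlib.sha256(url.encode('utf-8')).hexdigest()
        return os.path.join(self.cache_dir, digest)

    def get(self, url):
        try:
            with open(self._path(url), 'r') as f:
                fcntl.flock(f, fcntl.LOCK_SH)  # shared lock while reading
                return f.read() or None
        except (IOError, OSError):
            # Missing or unreadable file is treated as a cache miss.
            return None

    def set(self, url, content):
        # Open in append mode so the file is not truncated before the
        # exclusive lock is held.
        with open(self._path(url), 'a+') as f:
            fcntl.flock(f, fcntl.LOCK_EX)
            f.seek(0)
            f.truncate()
            f.write(content)
```

The locks are released when the file is closed at the end of each with block.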
Updated by Tom Clegg almost 8 years ago
- pass cache_discovery=False to google-api-client so we get back to what we thought we were doing, i.e., sensible http caching
- also fix #10669 so the http cache is safe for multiple processes to use
Updated by Peter Amstutz almost 8 years ago
- Assigned To set to Peter Amstutz
- Story points set to 0.5
Updated by Peter Amstutz almost 8 years ago
- Target version set to 2017-03-15 sprint
Updated by Peter Amstutz almost 8 years ago
- Status changed from New to Resolved
- % Done changed from 0 to 100
Applied in changeset arvados|commit:82697fea93b1c87cdee27d2b9a76c1b7ac07497e.
Updated by Tom Morris almost 8 years ago
I take it this got instagroomed while I was gone.
Does this mean that #10669 is fixed (and can be closed) as well?
Updated by Peter Amstutz almost 8 years ago
Tom Morris wrote:
I take it this got instagroomed while I was gone.
Does this mean that #10669 is fixed (and can be closed) as well?
No, this only disables google-api-client caching so that it uses the httplib2 cache as originally intended. The bug in #10669 is that the httplib2 cache is not multi-thread/multi-process safe, which makes it possible for the cache to become corrupted. The google-api-client cache actually does do file locking, but it puts the cache file in the global /tmp and doesn't provide a way to change that other than changing the global tempfile path (which has side effects for everything using the tempfile module).
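To show the side effect concretely, here is a small sketch that overrides the module-level tempfile.tempdir and checks where an unrelated temporary file lands. The function name is made up for this note. It only demonstrates standard tempfile behaviour and does not touch the google-api-client code.

```python
import os
import tempfile

FILENAME = 'google-api-python-client-discovery-doc.cache'


def demo_global_tempdir_override(new_dir):
    """Return (cache_path, unrelated_dir) while tempfile.tempdir is overridden."""
    old = tempfile.tempdir
    tempfile.tempdir = new_dir
    try:
        # The discovery cache path would follow the override...
        cache_path = os.path.join(tempfile.gettempdir(), FILENAME)
        # ...but so does every other temp file created in this process.
        with tempfile.NamedTemporaryFile() as unrelated:
            unrelated_dir = os.path.dirname(unrelated.name)
        return cache_path, unrelated_dir
    finally:
        # Restore the previous setting.
        tempfile.tempdir = old
```

Both returned paths end up under new_dir, which is why moving the cache this way is not a clean fix.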