Project

General

Profile

Actions

Story #19686

open

arvados.api() should return ThreadSafeApiCache

Added by Peter Amstutz about 1 month ago. Updated 3 days ago.

Status:
In Progress
Priority:
Normal
Assigned To:
Category:
SDKs
Target version:
Start date:
11/29/2022
Due date:
% Done:

0%

Estimated time:
(Total: 0.00 h)
Story points:
-

Description

The python SDK method arvados.api() should return a ThreadSafeApiCache object instead of the plain API object. Benefits:

  • It is threadsafe
  • It creates a corresponding keep client object, which can be accessed by classes like Collection() that take an api client without having to manage passing an extra keep client object around.
  • ThreadSafeApiCache uses python thread-local storage, we should be aware of how items in thread-local storage get cleaned up when the thread terminates.

This is in response to users running into this issue.


Subtasks 1 (1 open0 closed)

Task #19802: Review 19686-threadsafe-api-defaultIn ProgressTom Clegg11/29/2022

Actions
Actions #1

Updated by Peter Amstutz about 1 month ago

  • Description updated (diff)
Actions #2

Updated by Peter Amstutz about 1 month ago

  • Subject changed from arvados.api() returns ThreadSafeApiCache to arvados.api() should return ThreadSafeApiCache
Actions #3

Updated by Peter Amstutz 30 days ago

  • Target version changed from 2022-11-23 sprint to 2022-12-07 Sprint
Actions #4

Updated by Peter Amstutz 23 days ago

  • Target version changed from 2022-12-07 Sprint to 2022-11-23 sprint
Actions #5

Updated by Peter Amstutz 23 days ago

  • Assigned To deleted (Peter Amstutz)
Actions #6

Updated by Peter Amstutz 23 days ago

  • Target version changed from 2022-11-23 sprint to 2022-12-07 Sprint
Actions #7

Updated by Peter Amstutz 10 days ago

  • Description updated (diff)
Actions #8

Updated by Peter Amstutz 10 days ago

  • Description updated (diff)
Actions #9

Updated by Peter Amstutz 10 days ago

  • Description updated (diff)
Actions #10

Updated by Peter Amstutz 9 days ago

  • Assigned To set to Brett Smith
Actions #11

Updated by Brett Smith 3 days ago

19686-threadsafe-api-default @ c953b61d6

New API

At standup, we discussed that we wanted to let users still have a way to build just the Resource object. I have done that by adding a new api_client function that does just that, it's the actual object-building part of the old api function. I took this approach for two reasons:

  1. IMO, having a flag like threadsafe=False that substantially changes what the method does (in this case, changing its return type) is a bad code smell.
  2. This gave me an opportunity to build a clean API for api_client. It just builds objects, and takes the parameters required for that process as arguments. It doesn't load configuration, or build the discovery URL, or any of that. Building the inputs is a job for other functions.

From here I broke out two new functions: normalize_api_kwargs is the argument-parsing part of the old api, and api_kwargs_from_config is the argument-building part of the old api_from_config. IMO these are pretty safe to make public: all they do is build dictionaries from dictionaries, and the names are pretty precise on that point and not taking up valuable short names. That said, I don't feel super strongly about keeping them public. If we'd rather they be flagged private, I could go along with that.

With these separated, it's easier to compose api, api_from_config, and ThreadSafeApiCache. That's all the first commit does: introduce the new APIs, and use them to implement the old APIs. It's admittedly a huge diff, but IMO the end result is easier to follow and nicer to use.

API compatibility notes

The second commit actually makes the requested change. I had to add some dedicated handling of the request_id attribute to ThreadSafeApiCache to keep the tests passing.

My docstrings say ThreadSafeApiCache is API-compatible, but that's only mostly true. When we build the Resource object we attach a bunch of attributes on it. Any code that writes these attributes may see different behavior when they start getting a ThreadSafeApiCache instead. It's not clear whether that's intended to be a supported part of the API or not. For now I decided to just use tested behavior as my threshold for what we need to support.

isinstance(arv_client, Resource) used to return True and will now return False now too. We could keep it returning True if we wanted, by overriding __instancecheck__. IMO this is kind of a trade-off: doing so would maintain stricter API compatibility, but the result could be really confusing in some contexts. For now I erred on the side of less code, but I'm also open to discussing this.

Actions #12

Updated by Brett Smith 3 days ago

  • Status changed from New to In Progress
Actions

Also available in: Atom PDF