arvados.api() should return ThreadSafeApiCache
The python SDK method arvados.api() should return a ThreadSafeApiCache object instead of the plain API object. Benefits:
- It is threadsafe
- It creates a corresponding keep client object, which can be accessed by classes like Collection() that take an api client without having to manage passing an extra keep client object around.
- ThreadSafeApiCache uses python thread-local storage, we should be aware of how items in thread-local storage get cleaned up when the thread terminates.
This is in response to users running into this issue.
Updated by Brett Smith 3 days ago
19686-threadsafe-api-default @ c953b61d6
At standup, we discussed that we wanted to let users still have a way to build just the
Resource object. I have done that by adding a new
api_client function that does just that, it's the actual object-building part of the old
api function. I took this approach for two reasons:
- IMO, having a flag like
threadsafe=Falsethat substantially changes what the method does (in this case, changing its return type) is a bad code smell.
- This gave me an opportunity to build a clean API for
api_client. It just builds objects, and takes the parameters required for that process as arguments. It doesn't load configuration, or build the discovery URL, or any of that. Building the inputs is a job for other functions.
From here I broke out two new functions:
normalize_api_kwargs is the argument-parsing part of the old
api_kwargs_from_config is the argument-building part of the old
api_from_config. IMO these are pretty safe to make public: all they do is build dictionaries from dictionaries, and the names are pretty precise on that point and not taking up valuable short names. That said, I don't feel super strongly about keeping them public. If we'd rather they be flagged private, I could go along with that.
With these separated, it's easier to compose
ThreadSafeApiCache. That's all the first commit does: introduce the new APIs, and use them to implement the old APIs. It's admittedly a huge diff, but IMO the end result is easier to follow and nicer to use.
API compatibility notes¶
The second commit actually makes the requested change. I had to add some dedicated handling of the
request_id attribute to
ThreadSafeApiCache to keep the tests passing.
My docstrings say
ThreadSafeApiCache is API-compatible, but that's only mostly true. When we build the
Resource object we attach a bunch of attributes on it. Any code that writes these attributes may see different behavior when they start getting a
ThreadSafeApiCache instead. It's not clear whether that's intended to be a supported part of the API or not. For now I decided to just use tested behavior as my threshold for what we need to support.
isinstance(arv_client, Resource) used to return True and will now return False now too. We could keep it returning
True if we wanted, by overriding
__instancecheck__. IMO this is kind of a trade-off: doing so would maintain stricter API compatibility, but the result could be really confusing in some contexts. For now I erred on the side of less code, but I'm also open to discussing this.