Idea #7696
closed
[SDKs] PySDK KeepClient works with all service types, applying rules for the proxy type
Added by Brett Smith about 9 years ago.
Updated about 9 years ago.
Description
The KeepClient in the Python SDK will only talk to services that have a known type: either disk, or proxy. It should connect to services of any type. For any type that isn't disk, it should follow the rules that we currently follow when connecting to proxies:
- Only allow one writer thread at a time.
- Use "proxy" timeouts.
- Respect the replication headers in the response.
Test case:
- Create some keep_services with service_type="fancynewblobstore"
- Ensure those keep services are used when reading and writing
- Ensure thread count == 1
Write the above test cases to ensure proper behavior for non-disk, non-proxy services, then make whatever code changes are necessary to make them pass. Right now, we believe this only requires adjusting a few if conditions in the code. Logic to follow the rules has been implemented, but it's being thwarted by earlier code that causes the client code to avoid using the service at all.
- Target version set to Arvados Future Sprints
- Description updated (diff)
- Description updated (diff)
- Story points set to 1.0
- Status changed from New to In Progress
- Assigned To set to Brett Smith
- Target version changed from Arvados Future Sprints to 2015-12-02 sprint
7696-pysdk-all-keep-service-types-wip is up for review. You will probably want to review this by reading each commit individually (git log -p). Adding the test to ensure writing threads == 1 regularly tickled a race condition, which I think has affected other Keep writing tests in the past. I finally tracked that down and fixed it, and ended up making several improvements to ThreadLimiter in the process. So there are three main commits:
- One to use Keep services of all types.
- One to test thread limiting, and fix the associated race condition.
- One to make various ThreadLimiter improvements that happened along the way of working on the previous commit.
I think they'll be easier to understand if you read each in isolation.
I don't think this change is as safe as it looks: self._local
is a threading.local()
, so even after setting self._local.sequence
in __init__()
, accessing self._local.sequence
will raise AttributeError if __enter__()
is called later from a different thread T without first calling set_sequence()
from thread T. Effectively this makes set_sequence()
mandatory. Granted, we don't use ThreadLimiter without it, but still...
self._local = threading.local()
+ self._local.sequence = None
def __enter__(self):
self._start_lock.acquire()
- if getattr(self._local, 'sequence', None) is not None:
+ if self._local.sequence is not None:
All other changes here LGTM.
Tom Clegg wrote:
I don't think this change is as safe as it looks: self._local
is a threading.local()
, so even after setting self._local.sequence
in __init__()
, accessing self._local.sequence
will raise AttributeError if __enter__()
is called later from a different thread T without first calling set_sequence()
from thread T. Effectively this makes set_sequence()
mandatory. Granted, we don't use ThreadLimiter without it, but still...
Ugh, of course you're right, and I even thought about this later that night after I pushed the branch. Unfortunately, I forgot again before I got back to a computer. Thanks for catching this. Fixed and merged.
- Status changed from In Progress to Resolved
- % Done changed from 0 to 100
Applied in changeset arvados|commit:be2d7e55af1036699282d06f629805b6508b6ceb.
Also available in: Atom
PDF