Bug #20507
Closed
Fix websocket service
Description
In the course of looking at #20449, it doesn't seem like workbench is receiving any events from websockets. It seems to be receiving a once-per-minute empty ping, but not any actual events.
I think this explains why most of the time the continuous-reloading behavior described in #20449 doesn't actually happen. (But sometimes it does, which means sometimes websockets work?)
We should either fix it or abandon it entirely and rely on polling.
Updated by Peter Amstutz over 1 year ago
- Category set to API
- Description updated (diff)
Updated by Peter Amstutz over 1 year ago
- Related to Bug #20449: Background refresh tasks of "all processes" issues added
Updated by Brett Smith over 1 year ago
I recently ran a toy websocket client against pirca (running 2.6.3), just subscribed to everything and dumping the records to JSON. It seemed to work, so websockets aren't completely broken.
Updated by Peter Amstutz over 1 year ago
Brett Smith wrote in #note-6:
I recently ran a toy websocket client against pirca (running 2.6.3), just subscribed to everything and dumping the records to JSON. It seemed to work, so websockets aren't completely broken.
Tom and I chatted about this a couple of weeks ago. I suspect there's a bug in Workbench 2 where it sometimes sends an empty subscription and thus doesn't receive any events. Nobody has gone in to track it down yet.
Updated by Peter Amstutz over 1 year ago
- Subject changed from Fix websocket service or get rid of it to Fix websocket service
Updated by Peter Amstutz over 1 year ago
Unfortunately, I don't yet know how to reproduce this.
Updated by Peter Amstutz over 1 year ago
- Target version changed from Future to Development 2023-09-13 sprint
Updated by Tom Clegg over 1 year ago
- Has duplicate Bug #20904: Investigate websockets issue seen by user added
Updated by Tom Clegg over 1 year ago
Our listener pings create the conditions for a deadlock issue.
Specifically: our event loop can deadlock if enough (~32) server notifications arrive after the event loop decides to call Ping (e.g., while listener.Ping() is waiting for a response from the server, or in the time.Sleep() invoked by testSlowPing).
(*ListenerConn)listenerConnLoop() doesn't see the server's ping response until it finishes sending a previous notification through its internal queue to (*Listener)listenerConnLoop(), which is blocked on sending to our Notify channel, which is blocked on waiting for the Ping response.
(The lib/pq example uses "go listener.Ping()" so it doesn't deadlock.)
Updated by Tom Clegg over 1 year ago
- Assigned To set to Tom Clegg
- Status changed from New to In Progress
Updated by Peter Amstutz over 1 year ago
- Target version changed from Development 2023-09-13 sprint to Development 2023-09-27 sprint
Updated by Tom Clegg over 1 year ago
- % Done changed from 0 to 100
- Status changed from In Progress to Resolved
Applied in changeset arvados|5e3f6c9ad492c43044c88ebdc7eea6bdff667f46.