Bug #4591
closed
[API] When websockets server runs out of memory, it should exit so it can be restarted, instead of wedging.
Description
Symptom: Pages not refreshing due to failure to connect to websockets
bcosc:
If I start a new instance, the page does not refresh to show that the job has started or is running.
Peter:
"Iceweasel can't establish a connection to the server at wss://ws.qr1hi.arvadosapi.com/websocket"
Tom:
http://ruby-doc.org/core-2.1.4/Thread.html#method-c-abort_on_exception-3D ?
Updated by Peter Amstutz over 10 years ago
- Subject changed from workbench fails to refresh at pipeline instances when jobs are "Not ready" to [OPS] Websockets not working
- Target version set to Bug Triage
Updated by Peter Amstutz over 10 years ago
- Subject changed from [OPS] Websockets not working to [OPS] Pages not refreshing due to failure to connect to websockets
Updated by Tom Clegg over 10 years ago
- Subject changed from [OPS] Pages not refreshing due to failure to connect to websockets to [API] When websockets server runs out of memory, it should exit so it can be restarted, instead of wedging.
- Description updated (diff)
- Category set to API
Updated by Brett Smith over 10 years ago
- Assigned To set to Brett Smith
- Target version changed from Bug Triage to 2014-12-10 sprint
Updated by Peter Amstutz over 10 years ago
Per Tom's comment in the description, let's try setting Thread.abort_on_exception = true and see if that breaks anything. The websockets server retains very little state, so it is best to just kill it with extreme prejudice at the first sign of trouble.
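A minimal sketch of the behaviour this setting relies on, not the actual websockets server code. It assumes a plain Ruby process in which the caller is the main thread. The names SimulatedFailure and failure_reaches_main_thread? are hypothetical and introduced only for illustration, and an ordinary exception stands in for a real out-of-memory error.

```ruby
# Hypothetical error class standing in for a real failure (e.g. out of memory).
class SimulatedFailure < StandardError; end

# Hypothetical helper: start a worker thread that raises, then report whether
# the exception was re-raised in the calling (main) thread.
def failure_reaches_main_thread?(abort_on_exception)
  previous = Thread.abort_on_exception
  Thread.abort_on_exception = abort_on_exception
  begin
    Thread.new do
      # Silence the per-thread stderr report where the setter exists
      # (newer Rubies); it does not affect abort_on_exception.
      if Thread.current.respond_to?(:report_on_exception=)
        Thread.current.report_on_exception = false
      end
      raise SimulatedFailure, "simulated worker failure"
    end
    # Give the worker time to fail. If the exception is forwarded to the
    # main thread, this sleep is interrupted and the rescue below runs.
    sleep 0.5
    false
  rescue SimulatedFailure
    true
  ensure
    # Restore the global setting so the sketch has no lasting side effects.
    Thread.abort_on_exception = previous
  end
end

puts failure_reaches_main_thread?(true)
puts failure_reaches_main_thread?(false)
```

With the flag off, the worker dies on its own and the main thread keeps running, which matches the wedged state described in this bug. With the flag on, the exception surfaces in the main thread; a server that does not rescue it there exits, so whatever supervises the process can restart it.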
Updated by Brett Smith over 10 years ago
Peter Amstutz wrote:
Per Tom's comment in the description, let's try setting Thread.abort_on_exception = true and see if that breaks anything. The websockets server retains very little state, so it is best to just kill it with extreme prejudice at the first sign of trouble.
Seems to be fine. I tested this locally with a hacked arv-ws rigged to send a non-JSON string. The server returned the "malformed request" response as expected, and a normal arv-ws could then connect without trouble. Now at 95d12ecb.
Updated by Brett Smith over 10 years ago
- Status changed from In Progress to Resolved
- % Done changed from 0 to 100
Applied in changeset arvados|commit:1af2d4f71f6a7ba4374f8490ef1b4f0b972e2dec.