Bug #4591

closed

[API] When websockets server runs out of memory, it should exit so it can be restarted, instead of wedging.

Added by Bryan Cosca over 9 years ago. Updated over 9 years ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: API
Target version:
Story points: 0.5

Description

Symptom: Pages not refreshing due to failure to connect to websockets

bcosc:
If I start a new instance, the page does not refresh to show that the job has started or is running.

Peter:
"Iceweasel can't establish a connection to the server at wss://ws.qr1hi.arvadosapi.com/websocket"

Tom:
http://ruby-doc.org/core-2.1.4/Thread.html#method-c-abort_on_exception-3D ?


Subtasks: 1 (0 open, 1 closed)

Task #4701: Review 4591-websockets-raise-oom-wip (Resolved, Brett Smith, 12/02/2014)

Related issues

Has duplicate Arvados - Bug #4623: No auto-update of pipeline times in browser (Closed, 11/20/2014)
Actions #1

Updated by Bryan Cosca over 9 years ago

also from queued to pending

Actions #2

Updated by Bryan Cosca over 9 years ago

or pending to complete for that matter

Actions #3

Updated by Peter Amstutz over 9 years ago

  • Subject changed from workbench fails to refresh at pipeline instances when jobs are "Not ready" to [OPS] Websockets not working
  • Target version set to Bug Triage
Actions #4

Updated by Peter Amstutz over 9 years ago

  • Subject changed from [OPS] Websockets not working to [OPS] Pages not refreshing due to failure to connect to websockets
Actions #5

Updated by Peter Amstutz over 9 years ago

  • Description updated (diff)
Actions #6

Updated by Tom Clegg over 9 years ago

  • Subject changed from [OPS] Pages not refreshing due to failure to connect to websockets to [API] When websockets server runs out of memory, it should exit so it can be restarted, instead of wedging.
  • Description updated (diff)
  • Category set to API
Actions #7

Updated by Brett Smith over 9 years ago

  • Assigned To set to Brett Smith
  • Target version changed from Bug Triage to 2014-12-10 sprint
Actions #8

Updated by Brett Smith over 9 years ago

  • Status changed from New to In Progress
Actions #9

Updated by Peter Amstutz over 9 years ago

Per Tom's comment in the description, let's try setting Thread.abort_on_exception = true and see if that breaks anything. The websockets server retains very little state so it is best to just kill it with extreme prejudice at the first sign of trouble.
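
For readers unfamiliar with the setting, here is a minimal sketch of what Thread.abort_on_exception changes. It is not the code from the actual changeset: the helper name run_worker_and_wait, the return symbols, and the simulated NoMemoryError are all assumptions made for illustration. The main thread rescues the error only so the effect can be observed; a real server would presumably let it propagate so the process exits and can be restarted.

```ruby
# Illustrative sketch only; run_worker_and_wait is a hypothetical helper,
# not part of the websockets server.
def run_worker_and_wait(abort:)
  previous = Thread.abort_on_exception
  Thread.abort_on_exception = abort

  Thread.new do
    # Silence the default per-thread error report to keep the output clean.
    Thread.current.report_on_exception = false
    # Stand-in for a real allocation failure inside a worker thread.
    raise NoMemoryError, "simulated allocation failure"
  end

  # Give the worker time to fail while the main thread is idle.
  sleep 0.2
  # Reached only if the worker's failure never touched the main thread.
  :main_thread_unaware
rescue NoMemoryError
  # With abort_on_exception enabled, the worker's error is re-raised here.
  :main_thread_interrupted
ensure
  # Restore the global setting so the sketch has no lasting side effects.
  Thread.abort_on_exception = previous
end

puts run_worker_and_wait(abort: false)
puts run_worker_and_wait(abort: true)
```

With the setting off, the worker dies quietly and the main thread keeps going, which matches the "wedged" symptom. With it on, the failure reaches the main thread. The sketch assumes it is called from the main thread.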

Actions #10

Updated by Brett Smith over 9 years ago

Peter Amstutz wrote:

Per Tom's comment in the description, let's try setting Thread.abort_on_exception = true and see if that breaks anything. The websockets server retains very little state so it is best to just kill it with extreme prejudice at the first sign of trouble.

Seems to be fine. Tested this locally with a hacked arv-ws that was rigged up to send a non-JSON string. That got back the "malformed request" response as expected. Then a normal arv-ws could connect without trouble. Now at 95d12ecb.

Actions #11

Updated by Peter Amstutz over 9 years ago

Great. LGTM

Actions #12

Updated by Brett Smith over 9 years ago

  • Status changed from In Progress to Resolved
  • % Done changed from 0 to 100

Applied in changeset arvados|commit:1af2d4f71f6a7ba4374f8490ef1b4f0b972e2dec.

Actions #13

Updated by Ward Vandewege over 9 years ago

  • Story points set to 0.5