Bug #21260

open

abort processing on timeout and/or client hangup in Rails API

Added by Peter Amstutz 5 months ago. Updated about 2 months ago.

Status:
New
Priority:
Normal
Assigned To:
-
Category:
API
Target version:
Story points:
-

Description

Relevant notes on #21160

https://dev.arvados.org/issues/21160#note-8

https://dev.arvados.org/issues/21160#note-10

Summary: the controller enforces the request timeout using a context. The timeout is supposed to be API.RequestTimeout, which defaults to 5 minutes, but I am seeing the controller context expire after 1 minute, which might be a separate bug.

However, Rails and Postgres don't get any signal to stop processing. As a result, the request keeps running even after the controller has cut it loose.

When controller cancels the session, the client gets 500 Internal Server Error. This is treated as a retryable response.

As a result, the client retries the expensive request which is still running, and the retry takes up a second request handler slot.

This can cascade: the retry times out because it is blocked by the first request (if there are locks involved), which results in another retry that ties up a third request handler slot, and so on.

To make the system more stable, we should have a mechanism that terminates long-running requests in Rails when they exceed a certain runtime and/or the client hangs up.
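
To illustrate the kind of mechanism meant here, below is a minimal sketch of a Rack-style middleware that gives up on a request once it exceeds a deadline. The class name RequestDeadline, the limit: parameter, and the 503 response are hypothetical choices for illustration only, not existing Arvados code. It uses Ruby's standard Timeout module, which interrupts the running code at an arbitrary point and may leave things in an inconsistent state, so this shows the idea rather than a recommended implementation.

```ruby
require "timeout"

# Hypothetical Rack-style middleware (illustration only, not Arvados code).
# Wraps the downstream app and stops waiting for it after `limit` seconds.
class RequestDeadline
  def initialize(app, limit:)
    @app = app      # any object responding to call(env), e.g. the Rails app
    @limit = limit  # maximum runtime in seconds (assumed parameter)
  end

  def call(env)
    # Timeout.timeout raises Timeout::Error inside the block when the
    # deadline passes, interrupting the request handler.
    Timeout.timeout(@limit) { @app.call(env) }
  rescue Timeout::Error
    # The status code is a placeholder; which code the client should see
    # (and whether it is retryable) is a separate decision.
    [503, { "content-type" => "text/plain" }, ["request exceeded #{@limit}s\n"]]
  end
end
```

In a Rails application a middleware like this would be added to the middleware stack. Note that this sketch only covers the runtime limit; it does not detect a client hangup.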

We might want to use this:

https://github.com/ankane/slowpoke

This specifically supports Passenger and tells Passenger to abandon the Ruby process on timeout. That is fine for us, because we use Passenger in forked multiprocess mode (threaded mode is "enterprise only").

Actions #2

Updated by Peter Amstutz 5 months ago

  • Description updated (diff)
Actions #3

Updated by Peter Amstutz 5 months ago

  • Description updated (diff)
Actions #4

Updated by Peter Amstutz 5 months ago

  • Target version changed from Future to Development 2024-01-17 sprint
Actions #5

Updated by Peter Amstutz 5 months ago

  • Target version changed from Development 2024-01-17 sprint to Development 2024-01-31 sprint
Actions #6

Updated by Peter Amstutz 3 months ago

  • Target version changed from Development 2024-01-31 sprint to Development 2024-02-14 sprint
Actions #7

Updated by Peter Amstutz 3 months ago

  • Target version changed from Development 2024-02-14 sprint to Development 2024-02-28 sprint
Actions #8

Updated by Peter Amstutz 2 months ago

  • Target version changed from Development 2024-02-28 sprint to Development 2024-03-27 sprint
Actions #9

Updated by Peter Amstutz 2 months ago

  • Target version changed from Development 2024-03-27 sprint to To be scheduled
Actions #10

Updated by Peter Amstutz about 2 months ago

  • Target version changed from To be scheduled to Future