Project

General

Profile

Actions

Bug #9068

closed

[Keep] keepstore should drop connections immediately if the client hangs up while waiting for bufferPool

Added by Tom Clegg almost 8 years ago. Updated almost 8 years ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
Keep
Target version:
Story points:
1.0

Description

Background

If a keepstore server is busy, a client can wait a long time in the queue before we start fulfilling its request. If the client times out and hangs up during this period, we waste a network socket (possibly reaching max open files) and (in the case of GET) we waste time retrieving the requested data from storage, because we don't notice that the client has given up.

Proposed fix

(from #8825)

Check CloseNotifier in GetBlockHandler

Move the bufferpool allocation step up from volume to handler. Add a MightExist() method to each volume type: return false if the block definitely doesn't exist, otherwise true (otherwise, we'll have a performance regression in that unix volumes would no longer be able to respond 404 without reserving a data buffer). With this change, we can check CloseNotifier in GetBlockHandler:
  • if the client hangs up before we get a buffer, don't keep waiting for a buffer, or do any other work on the request. Log 503.
  • if the client hangs up after we get a buffer, but before we finish reading data from disk/blob storage, we'll still waste some time. This is OK.

Check CloseNotifier in PutBlockHandler

Detecting abandoned PUT requests is easier: PutBlockHandler() has the http.Request object at the point where it waits for bufs.Get().

Something like this:

-buf := bufs.Get(int(req.ContentLength))
+var closeNotifier <-chan bool
+if resp, ok := resp.(http.CloseNotifier); ok {
+    closeNotifier = resp.CloseNotify()
+}
+var buf []byte
+bufReady := make(chan []byte)
+go func() {
+    bufReady <- bufs.Get(int(req.ContentLength))
+    close(bufReady)
+}()
+select {
+case buf = <-bufReady:
+case <-closeNotifier:
+    go func() {
+        // Even if closeNotifier happened first, we need to keep waiting
+        // for our buf so we can return it to the pool.
+        bufs.Put(<-bufReady)
+    }()
+    // Meanwhile we exit our handler to free up connection resources.
+    resp.WriteHeader(http.StatusServiceUnavailable)
+    resp.Write("Client disconnected")
+    return
+}
 _, err := io.ReadFull(req.Body, buf)

Subtasks 3 (0 open3 closed)

Task #9104: Drop abandoned GET requestsResolvedTom Clegg04/26/2016Actions
Task #9105: Drop abandoned PUT requestsResolvedTom Clegg04/26/2016Actions
Task #9099: Review 9068-drop-abandoned-connsResolvedRadhika Chippada04/26/2016Actions

Related issues

Related to Arvados - Bug #8825: [Keepstore] many connections in CLOSE_WAITResolved03/07/2017Actions
Actions

Also available in: Atom PDF