Bug #14022

[Tests] Timeouts causing build failures on jenkins

Added by Tom Clegg over 3 years ago. Updated over 3 years ago.

Status:
New
Priority:
Normal
Assigned To:
-
Category:
-
Target version:
-
Start date:
Due date:
% Done:

0%

Estimated time:
Story points:
-

Description

FUSE tests have been hanging sometimes, and workbench-integration tests have been timing out with enough frequency that the suite usually fails. These might have a common cause, like an API request that hangs forever.

History

#1 Updated by Tom Clegg over 3 years ago

Logs seem normal (e.g., there's a response for every request) near workbench timeout in https://ci.curoverse.com/job/run-tests-apps-workbench-integration/2175/

controller.log

{"RequestID":"req-69zpvtnb9hmlyjjwxe2o","level":"info","msg":"request","remoteAddr":"127.0.0.1:44078","reqBytes":46,"reqForwardedFor":"127.0.0.1","reqHost":"0.0.0.0:38915","reqMethod":"POST","reqPath":"arvados/v1/users/current","reqQuery":"","time":"2018-08-13T18:01:45.903228612Z"}
{"RequestID":"req-69zpvtnb9hmlyjjwxe2o","level":"info","msg":"response","remoteAddr":"127.0.0.1:44078","reqBytes":46,"reqForwardedFor":"127.0.0.1","reqHost":"0.0.0.0:38915","reqMethod":"POST","reqPath":"arvados/v1/users/current","reqQuery":"","respBytes":336,"respStatus":"OK","respStatusCode":200,"time":"2018-08-13T18:01:45.920873422Z","timeToStatus":0.017561,"timeTotal":0.017632,"timeWriteBody":0.000071}
{"RequestID":"req-1hnkll7y0376slxo94jl","level":"info","msg":"request","remoteAddr":"127.0.0.1:44090","reqBytes":39,"reqForwardedFor":"127.0.0.1","reqHost":"0.0.0.0:38915","reqMethod":"POST","reqPath":"database/reset","reqQuery":"","time":"2018-08-13T18:02:15.805311688Z"}
{"RequestID":"req-1hnkll7y0376slxo94jl","level":"info","msg":"response","remoteAddr":"127.0.0.1:44090","reqBytes":39,"reqForwardedFor":"127.0.0.1","reqHost":"0.0.0.0:38915","reqMethod":"POST","reqPath":"database/reset","reqQuery":"","respBytes":42,"respStatus":"OK","respStatusCode":200,"time":"2018-08-13T18:02:17.112781365Z","timeToStatus":1.307381,"timeTotal":1.307457,"timeWriteBody":0.000076}

nginx_access.log

[13/Aug/2018:18:01:45 +0000] controller 200 336 0.018 POST "https://0.0.0.0:38915/arvados/v1/users/current" 127.0.0.1:55600 "-" "HTTPClient/1.0 (2.8.3, ruby 2.3.3 (2016-11-21))" 
[13/Aug/2018:18:01:45 +0000] controller 200 336 0.019 POST "https://0.0.0.0:38915/arvados/v1/users/current" 127.0.0.1:55600 "-" "HTTPClient/1.0 (2.8.3, ruby 2.3.3 (2016-11-21))" 
[13/Aug/2018:18:02:17 +0000] controller 200 42 1.308 POST "https://0.0.0.0:38915/database/reset" 127.0.0.1:55616 "-" "HTTPClient/1.0 (2.8.3, ruby 2.3.3 (2016-11-21))" 

nginx_error.log

2018/08/13 18:01:38 [info] 1490#1490: *3140 client closed connection while waiting for request, client: 127.0.0.1, server: 0.0.0.0:46191
2018/08/13 18:01:43 [info] 1490#1490: *3139 client 127.0.0.1 closed keepalive connection
2018/08/13 18:01:46 [info] 1490#1490: *2990 client 127.0.0.1 closed keepalive connection
2018/08/13 18:01:46 [info] 1490#1490: *2992 client 127.0.0.1 closed keepalive connection
2018/08/13 18:02:17 [info] 1490#1490: *3008 client 127.0.0.1 closed keepalive connection

Also available in: Atom PDF