Bug #21285


Add MaxGatewayTunnels config, separate from MaxConcurrentRequests

Added by Tom Clegg 5 months ago. Updated 3 months ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Lucas Di Pentima
Category:
API
Story points:
2.0
Release relationship:
Auto

Description

Currently N running containers will start N gateway tunnels that occupy N of the MaxConcurrentRequests slots, even though they don't use the resources that MaxConcurrentRequests is meant to protect (mainly RailsAPI and PostgreSQL). Since each one stays open for the entire duration of the respective container, these tunnel connections can easily consume most/all of the MaxConcurrentRequests slots, leaving none for workbench2 or even other API calls from the containers themselves.

To address this, we should add a separate MaxGatewayTunnels config. Incoming gateway_tunnel requests should not occupy MaxConcurrentRequests slots. After reaching the MaxGatewayTunnels limit, additional gateway_tunnel requests should return 503 immediately rather than wait in a queue. Crunch-run should delay and retry when this happens.

Nginx and load balancers will be expected to allow {MaxConcurrentRequests + MaxQueuedRequests + MaxGatewayTunnels} concurrent requests. Documentation and installer should be updated accordingly.
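For concreteness, the settings involved might look like this in the cluster config (values are illustrative only, not recommendations; the sizing rule is the sum described above):

```yaml
Clusters:
  zzzzz:
    API:
      MaxConcurrentRequests: 64
      MaxQueuedRequests: 128
      # New, separate pool for gateway tunnel connections:
      MaxGatewayTunnels: 1000
# Upstream nginx / load balancer must then allow at least
# 64 + 128 + 1000 = 1192 concurrent connections to controller.
```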


Files

queue-names.png (28.7 KB) queue-names.png Lucas Di Pentima, 01/09/2024 10:06 PM

Subtasks 2 (0 open, 2 closed)

Task #21325: Review 21285-max-gw-tunnels (Resolved, Tom Clegg, 01/01/2024)
Task #21348: Update salt installer (21285-installer-updates) (Resolved, Lucas Di Pentima, 01/01/2024)

Related issues

Related to Arvados - Bug #21319: Avoid waiting/deadlock when a controller handler performs subrequests against the same controller (New)
#2

Updated by Peter Amstutz 5 months ago

On the topic of tying up request slots --

Requests that rely on external services other than the API server and/or local database (e.g. proxying to keep-web, or federation) should also be tracked and limited separately. There is a known pattern where certain requests go out to the other service, and that service then calls back to the API server to verify the token, fetch user info, fetch a collection manifest, etc. This can result in deadlock if all the request slots are tied up waiting on external services.

#3

Updated by Tom Clegg 4 months ago

  • Related to Bug #21319: Avoid waiting/deadlock when a controller handler performs subrequests against the same controller added
#4

Updated by Tom Clegg 4 months ago

  • Assigned To set to Tom Clegg
  • Status changed from New to In Progress

21285-max-gw-tunnels @ a5af8485c9658e13c6ef40981138731b0db35a68 -- developer-run-tests: #3978
retry wb2 developer-run-tests-services-workbench2: #118

I reworked httpserver.RequestLimiter to have the caller create a number of RequestQueue objects, and provide a function to assign each incoming request to one of them. Each RequestQueue specifies its own MaxConcurrent, MaxQueue, and MaxQueueTimeForMinPriority, as well as a label used for reporting in metrics.

This means we now have a set of metrics for each queue ("api" and "tunnel"), which can be graphed separately and/or in aggregate.

-arvados_concurrent_requests 54
-arvados_max_concurrent_requests 54
-arvados_max_queued_requests 1
-arvados_queue_delay_seconds_count{priority="normal"} 54
-arvados_queue_timeout_seconds_count{priority="normal"} 0
-arvados_queued_requests{priority="normal"} 1
+arvados_concurrent_requests{queue="api"} 24
+arvados_concurrent_requests{queue="tunnel"} 30
+arvados_max_concurrent_requests{queue="api"} 24
+arvados_max_concurrent_requests{queue="tunnel"} 30
+arvados_max_queued_requests{queue="api"} 1
+arvados_max_queued_requests{queue="tunnel"} 0
+arvados_queue_delay_seconds_count{priority="normal",queue="api"} 24
+arvados_queue_delay_seconds_count{priority="normal",queue="tunnel"} 30
+arvados_queue_timeout_seconds_count{priority="normal",queue="api"} 0
+arvados_queue_timeout_seconds_count{priority="normal",queue="tunnel"} 0
+arvados_queued_requests{priority="normal",queue="api"} 1
+arvados_queued_requests{priority="normal",queue="tunnel"} 0

The "tunnel" queue is used for container shell SSH sessions (client->controller->crunchrun) and container gateway tunnels (crunchrun->controller).

  • ✅ All agreed upon points are implemented / addressed.
    • New config is MaxGatewayTunnels
    • ✅ Crunch-run should delay and retry if tunnel setup endpoint responds 503 (we already had this - crunch-run waits 5s and retries after any error/disconnection)
  • ✅ Anything not implemented (discovered or discussed during work) has a follow-up story.
  • ✅ Code is tested and passing, both automated and manual, what manual testing was done is described
  • ✅ Documentation has been updated.
  • ✅ Behaves appropriately at the intended scale (describe intended scale).
  • ✅ Considered backwards and forwards compatibility issues between client and server.
  • ✅ Follows our coding standards and GUI style guidelines.
Notes:
  • ☐ Need to update Nginx config in salt installer?
#5

Updated by Lucas Di Pentima 4 months ago

  • If every running container requires a tunnel slot, I think the MaxGatewayTunnels config would act as an upper limit on the Containers.CloudVMs.MaxInstances setting. Or, to put it another way: in the cloud scenario both config knobs are roughly equivalent (not counting container shell sessions). Is that a correct understanding?
    • Should we check that MaxGatewayTunnels is at least equal to (or greater than) MaxInstances to avoid surprising behaviors?
    • Do you think we could simplify configuration by setting MaxGatewayTunnels's default to something related to whatever MaxInstances has?
  • The docs update doesn't include any mention in a-d-c's section, is that on purpose? Or maybe this branch is only HPC related?
    • If containers only use tunnel queue slots on HPC clusters, I think it would be convenient to also mention that in the config file.
  • For changes on the installer, I will need to wait for this branch to be merged, so that I can do some testing with dev packages.
    • I think the grafana dashboard will also need some tweaking.
#6

Updated by Tom Clegg 4 months ago

Lucas Di Pentima wrote in #note-5:

  • If every running container requires a tunnel slot, ...

Sorry, the note I put in the config file was not correct. The crunchrun-to-controller tunnel is not set up when using the cloud dispatcher, so this does only apply on HPC, where MaxInstances doesn't apply. I've updated the note in the config file:

      # Maximum number of active gateway tunnel connections. One slot
      # is consumed by each "container shell" connection. If using an
      # HPC dispatcher (LSF or Slurm), one slot is consumed by each
      # running container. These do not count toward
      # MaxConcurrentRequests.
      MaxGatewayTunnels: 1000
  • For changes on the installer, I will need to wait for this branch to be merged, so that I can do some testing with dev packages.
    • I think the grafana dashboard will also need some tweaking.

SGTM

21285-max-gw-tunnels @ f0908fd5878be533d1208c6a528459945524256b

#7

Updated by Lucas Di Pentima 4 months ago

Tom Clegg wrote in #note-6:

Sorry, the note I put in the config file was not correct. The crunchrun-to-controller tunnel is not set up when using the cloud dispatcher, so this does only apply on HPC, where MaxInstances doesn't apply. I've updated the note in the config file:
21285-max-gw-tunnels @ f0908fd5878be533d1208c6a528459945524256b

Thanks for the clarification, with that I think 21285-max-gw-tunnels LGTM.

#8

Updated by Tom Clegg 4 months ago

  • Assigned To changed from Tom Clegg to Lucas Di Pentima

Merged 21285-max-gw-tunnels.

Leaving ticket open for installer changes.

#9

Updated by Lucas Di Pentima 4 months ago

Updates at ce3c4f8 - branch 21285-installer-updates

  • Adds a new optional CONTROLLER_MAX_GATEWAY_TUNNELS variable in local.params's performance tuning section.
  • Adds API.MaxGatewayTunnels config with default value set to 1000.
  • Updates balancer & controller nginx configs to account for this new value.

Manually tested on our AWS sandbox account.

Pending: Grafana dashboard update.

#10

Updated by Lucas Di Pentima 4 months ago

Update at 9f6c2250f6

  • Adds _{{queue}} to the labels on the concurrent request graph so that new queues get identified on demand.

#11

Updated by Tom Clegg 4 months ago

My only question is about the rlimit and worker_connections calculations:

      worker_rlimit_nofile: {{ (max_reqs * 3 + max_tunnels) * controller_nr }}
      worker_rlimit_nofile: {{ max_reqs * 3 + 1 + max_tunnels }}

I think we want to add max_tunnels*3 to nofile here: one fd for the incoming connection + one for the outgoing connection to controller + one from controller back to passenger for a permission check.

Also, questioning the existing logic: does nginx really only use one fd for an incoming connection from controller to passenger/rails? Or does it use one for controller->vhost + one for vhost->upstream? Just wondering whether we should be multiplying by 4 instead of 3 here. AFAIK there is no real reason to be stingy with file descriptors once we're going to the trouble to increase the limit...?
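To make the fd budget concrete, here is the arithmetic for the two formulas under discussion (the max_reqs and max_tunnels values are made up for illustration, not recommendations):

```go
package main

import "fmt"

func main() {
	// Illustrative values only.
	maxReqs, maxTunnels := 128, 1000

	// Existing formula: 3 fds per ordinary request, 1 per tunnel.
	current := maxReqs*3 + 1 + maxTunnels

	// With 3 fds per tunnel as suggested above (incoming connection +
	// outgoing to controller + controller back to passenger for the
	// permission check).
	tunnelsTimes3 := maxReqs*3 + 1 + maxTunnels*3

	fmt.Println(current, tunnelsTimes3) // 1385 3385
}
```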

#12

Updated by Lucas Di Pentima 4 months ago

After discussing this on chat, I've applied the following diff at ba1937c21e:

diff --git a/tools/salt-install/config_examples/multi_host/aws/pillars/nginx_balancer_configuration.sls b/tools/salt-install/config_examples/multi_host/aws/pillars/nginx_balancer_configuration.sls
index 25027b8571..485cf9c12f 100644
--- a/tools/salt-install/config_examples/multi_host/aws/pillars/nginx_balancer_configuration.sls
+++ b/tools/salt-install/config_examples/multi_host/aws/pillars/nginx_balancer_configuration.sls
@@ -16,9 +16,9 @@ nginx:
   ### SERVER
   server:
     config:
-      worker_rlimit_nofile: {{ (max_reqs * 3 + max_tunnels) * controller_nr }}
+      worker_rlimit_nofile: {{ (max_reqs + max_tunnels) * 5 * controller_nr }}
       events:
-        worker_connections: {{ (max_reqs * 3 + max_tunnels) * controller_nr }}
+        worker_connections: {{ (max_reqs + max_tunnels) * 5 * controller_nr }}
       ### STREAMS
       http:
         'geo $external_client':
diff --git a/tools/salt-install/config_examples/multi_host/aws/pillars/nginx_passenger.sls b/tools/salt-install/config_examples/multi_host/aws/pillars/nginx_passenger.sls
index 0655a0db1f..0c9ef1c36e 100644
--- a/tools/salt-install/config_examples/multi_host/aws/pillars/nginx_passenger.sls
+++ b/tools/salt-install/config_examples/multi_host/aws/pillars/nginx_passenger.sls
@@ -55,9 +55,10 @@ nginx:
       # controller, then potentially 1 from controller back to
       # passenger).  Each connection consumes a file descriptor.
       # That's how we get these calculations
-      worker_rlimit_nofile: {{ max_reqs * 3 + 1 + max_tunnels }}
+      # (we're multiplying by 5 instead to be on the safe side)
+      worker_rlimit_nofile: {{ (max_reqs + max_tunnels) * 5 + 1 }}
       events:
-        worker_connections: {{ max_reqs * 3 + 1 + max_tunnels }}
+        worker_connections: {{ (max_reqs + max_tunnels) * 5 + 1 }}

   ### SITES
   servers:
#13

Updated by Tom Clegg 4 months ago

LGTM, thanks!

#14

Updated by Lucas Di Pentima 3 months ago

  • % Done changed from 50 to 100
  • Status changed from In Progress to Resolved
#15

Updated by Peter Amstutz 3 months ago

  • Release set to 69