Story #17207

open

External access to web services running in containers

Added by Tom Clegg almost 2 years ago. Updated 7 months ago.

Status: New
Priority: Normal
Assigned To: -
Target version: -
Start date: 12/01/2022
Due date: 03/31/2023 (Due in about 4 months)
% Done: 0%
Estimated time:
Story points: -
Release:
Release relationship: Auto

Description

Given
  • container request zzzzz-xvhdp-iiiiiiiiiiiiiii was submitted with runtime_constraints: API: true
  • corresponding container zzzzz-dz642-iiiiiiiiiiiiiii is running
  • process in the container has an http server listening on tcp 0.0.0.0:6061 (or some other port(s))
  • cluster’s wildcard DNS, TLS certificates, and Nginx routing are suitably configured
  • “v2/foo/bar” is a valid api token for a user who has write permission on the container request
  • token scope includes “all”

Pointing a browser to a URL like

https://zzzzz-xvhdp-iiiiiiiiiiiiiii-6061-http.zzzzz.arvadosapi.com/foo?api_token=v2/foo/bar&baz

(or a similar URL with the container UUID) will result in a redirect-with-cookie to a URL like

https://zzzzz-dz642-iiiiiiiiiiiiiii-6061-http.zzzzz.arvadosapi.com/foo?baz

which will be proxied through to the http server listening on port 6061.
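
The redirect step above can be sketched roughly as follows. This is only an illustration, not the controller's actual code: the vhost pattern, the function names (parse_vhost, split_token, redirect_target), and the assumption that the container UUID has already been looked up from the container request are all hypothetical. The caller would put the returned token in the cookie and send the returned URL as the redirect location.

```python
import re
from urllib.parse import urlsplit, urlunsplit, unquote

# Hypothetical pattern for "<uuid>-<port>-http.<cluster domain>" vhosts.
# "xvhdp" marks a container request UUID, "dz642" a container UUID.
VHOST_RE = re.compile(
    r"^(?P<uuid>[a-z0-9]{5}-(?:xvhdp|dz642)-[a-z0-9]{15})"
    r"-(?P<port>\d{1,5})-http\.(?P<domain>.+)$"
)


def parse_vhost(host):
    """Return (uuid, port, domain) extracted from the request's vhost."""
    m = VHOST_RE.match(host.lower())
    if m is None:
        raise ValueError(f"not a container web service vhost: {host!r}")
    port = int(m["port"])
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return m["uuid"], port, m["domain"]


def split_token(url):
    """Return (token, url with the api_token query parameter removed)."""
    parts = urlsplit(url)
    token = None
    kept = []
    for item in parts.query.split("&") if parts.query else []:
        name, _, value = item.partition("=")
        if name == "api_token":
            token = unquote(value)
        else:
            kept.append(item)  # keep other parameters byte-for-byte
    return token, urlunsplit(parts._replace(query="&".join(kept)))


def redirect_target(url, container_uuid):
    """Return (token, redirect URL on the container UUID's vhost)."""
    _uuid, port, domain = parse_vhost(urlsplit(url).hostname)
    token, clean = split_token(url)
    host = f"{container_uuid}-{port}-http.{domain}"
    return token, urlunsplit(urlsplit(clean)._replace(netloc=host))
```

Splitting the query string by hand rather than re-encoding it keeps a bare parameter like "baz" exactly as the browser sent it.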

Implementation phase 1

Nginx proxies *-http.example.com to controller.

Controller extracts UUID from vhost, checks token/scope permission.

Controller looks up the container’s crunch-run IP addr:port and forwards the request there.
  • don’t read/modify the authorization header (it must be passed through to the container unchanged)
  • strip the arvados_api_token cookie and other non-forwardable headers
  • set “x-arvados-target-uuid” header to the target container uuid (even if original request gave the container request uuid)
  • set “x-arvados-target-port” header to the requested port
  • set “x-arvados-request-host” header to the original request’s Host header value
  • set “x-arvados-authorization” header to “rand+timestamp hmac-sha256(systemroottoken, rand+timestamp)”
  • change target URL to crunch-run’s addr:port (in other words, don’t let the Go HTTP client see the original request’s Host, in case that influences its connection-sharing logic)
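
A rough sketch of the header handling listed above, for discussion only. The exact wire format of “x-arvados-authorization” is not pinned down here; this sketch assumes the literal string "<rand>+<timestamp>", a space, then the hex HMAC-SHA256 of that string keyed with the system root token. The helper names, the hop-by-hop header set, and the choice to drop client-supplied “x-arvados-*” headers are likewise assumptions.

```python
import hashlib
import hmac
import secrets
import time

# Headers commonly treated as hop-by-hop by HTTP proxies.
HOP_BY_HOP = {
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailer", "transfer-encoding", "upgrade",
}


def sign_request(system_root_token, now=None):
    """Build an x-arvados-authorization value (assumed format, see above)."""
    rand = secrets.token_hex(16)
    timestamp = int(time.time() if now is None else now)
    message = f"{rand}+{timestamp}"
    mac = hmac.new(system_root_token.encode(), message.encode(),
                   hashlib.sha256).hexdigest()
    return f"{message} {mac}"


def strip_cookie(cookie_header, name):
    """Remove one named cookie from a Cookie header value."""
    parts = [p.strip() for p in cookie_header.split(";")]
    return "; ".join(p for p in parts if p and p.partition("=")[0] != name)


def forwarded_headers(incoming, target_uuid, target_port,
                      system_root_token, now=None):
    """Return the headers the controller would send to crunch-run."""
    out = {}
    request_host = ""
    for name, value in incoming.items():
        lower = name.lower()
        if lower == "host":
            request_host = value  # recorded below, not forwarded as Host
            continue
        if lower in HOP_BY_HOP or lower.startswith("x-arvados-"):
            continue  # never trust client-supplied x-arvados-* headers
        if lower == "cookie":
            value = strip_cookie(value, "arvados_api_token")
            if not value:
                continue
        out[name] = value  # includes Authorization, passed through unchanged
    out["X-Arvados-Target-Uuid"] = target_uuid
    out["X-Arvados-Target-Port"] = str(target_port)
    out["X-Arvados-Request-Host"] = request_host
    out["X-Arvados-Authorization"] = sign_request(system_root_token, now)
    return out
```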

Cloud dispatcher tells crunch-run (via "env" json doc passed on crunch-run's stdin) the external IP address/hostname it is using to connect to the worker’s SSH service.

Crunch-run handles incoming https requests.
  • get worker host’s external IP address from stdin supplied by dispatcher.
  • listen on a random port.
  • save the external IP address and listening port number in the container record when changing state to Running.
  • (for now) use a non-verifiable/self-signed tls certificate.
  • check “x-arvados-authorization” header timestamp and hmac.
  • check “x-arvados-target-uuid” header is the current container uuid.
  • get target port from “x-arvados-target-port” header.
  • strip “x-arvados-*” and other non-forwardable headers.
  • proxy request to target port using plain http, using the incoming “x-arvados-request-host” header as the outgoing Host header.
  • if “x-arvados-request-host” is missing, return 502 (avoid mishandling requests from future controller versions)
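
The crunch-run checks above might look something like this sketch. It assumes the same header format as the controller side ("<rand>+<timestamp>", a space, then a hex HMAC-SHA256). The 300-second freshness window, the function names, and the 401/404/400 status codes are placeholders chosen for illustration; only the 502 for a missing “x-arvados-request-host” comes from the plan above.

```python
import hashlib
import hmac
import time

MAX_SKEW_SECONDS = 300  # hypothetical freshness window

# Headers commonly treated as hop-by-hop by HTTP proxies.
HOP_BY_HOP = {
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailer", "transfer-encoding", "upgrade",
}


def check_authorization(header, system_root_token, now=None):
    """Verify the HMAC and timestamp of an x-arvados-authorization value."""
    if not header:
        return False
    try:
        message, mac = header.split(" ")
        _rand, timestamp = message.rsplit("+", 1)
        timestamp = int(timestamp)
    except ValueError:
        return False
    expected = hmac.new(system_root_token.encode(), message.encode(),
                        hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(expected.encode(), mac.encode()):
        return False
    now = time.time() if now is None else now
    return abs(now - timestamp) <= MAX_SKEW_SECONDS


def plan_proxy(headers, my_uuid, system_root_token, now=None):
    """Return (error_status, target_port, outgoing_headers).

    error_status is None when the request should be proxied.
    """
    h = {k.lower(): v for k, v in headers.items()}
    if not check_authorization(h.get("x-arvados-authorization"),
                               system_root_token, now):
        return 401, None, None
    if h.get("x-arvados-target-uuid") != my_uuid:
        return 404, None, None
    if "x-arvados-request-host" not in h:
        return 502, None, None  # possibly a newer controller; refuse to guess
    try:
        port = int(h.get("x-arvados-target-port", ""))
    except ValueError:
        return 400, None, None
    if not 1 <= port <= 65535:
        return 400, None, None
    out = {k: v for k, v in h.items()
           if k != "host" and k not in HOP_BY_HOP
           and not k.startswith("x-arvados-")}
    out["host"] = h["x-arvados-request-host"]
    return None, port, out
```

The caller would then send the request to the returned port over plain http with the returned headers.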

Crunch-run integration test uses a real docker daemon (might not be jenkins-friendly but must be trivial for a developer to run).

Implementation phase 2+

  • Also accept scoped token if it allows “CONNECT /arvados/v1/container/{uuid}/{port}”
  • Work with a container process that serves https instead of http.
  • When handling an incoming request, if needed, use the relevant docker APIs to add a private network between container and host.
  • Periodically list all listening tcp ports in the container, and sync that list to the container record (so workbench2 and the front-end proxy can see it).
  • In Workbench2, show a “Connect to port N” item for each listening port in the context menu of any writable container/CR.
  • Option to prohibit connections to a container (details TBD).
  • Option to avoid reusing a container for other CRs if it has accepted any incoming connections (details TBD).
  • Provide docs/conveniences re configuring services to produce working self-URLs in this environment.
  • (done in #17170) certificate validation on https traffic between controller and crunch-run (worker nodes) without deploying real CA-signed certificates to the worker nodes (e.g., by passing a fresh arvados-signed cert from a-d-c to each new crunch-run).
  • Handle multiple concurrent containers on a single node without interrupting login sessions, even if the config specifies a management port number instead of using a dynamic port. It's easy enough for multiple crunch-run processes to listen on the same port using SO_REUSEPORT and direct traffic to the correct container based on request params/headers. However, if container A finishes while its crunch-run process is still providing a tunnel to container B, that process needs to release flocks, shut down tunnels to A, stop listening for new connections, etc., but stay alive to keep B's tunnels running until B also ends.
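
For the “periodically list all listening tcp ports” item above, one possible approach on Linux is to parse the kernel's /proc/net/tcp table as seen from the container's network namespace (for example via /proc/<pid>/net/tcp of a process in the container; how crunch-run would get at that is an assumption here, not something decided in this story). The function name is hypothetical. In that table, the fourth column is the socket state in hex, and 0A means LISTEN.

```python
def listening_ports(proc_net_tcp_text):
    """Return the sorted TCP ports in LISTEN state from /proc/net/tcp text."""
    ports = set()
    for line in proc_net_tcp_text.splitlines()[1:]:  # first line is a header
        fields = line.split()
        if len(fields) < 4:
            continue
        local_address, state = fields[1], fields[3]
        if state != "0A":  # 0A is the LISTEN state
            continue
        # local_address is "<hex ip>:<hex port>"
        ports.add(int(local_address.rsplit(":", 1)[1], 16))
    return sorted(ports)
```

The same parsing works on /proc/net/tcp6 text, since only the port after the last colon is used.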

Related features

  • “arvados-client forward lport ctr rport” listens to lport on localhost and forwards all incoming connections to rport on ctr (note this isn't needed on a system that has openssh, because “arvados-client shell ctr -N -Llport:rhost:rport” already works)
  • #17103 “arvados-client shell ctr” connects your pty to a new interactive shell running in ctr (with #17657 optional/multiple -Llport:rport arguments to forward tcp connections at the same time).

Related issues

  • Related to Arvados - Feature #17206: crunch-run reverse proxies HTTP requests to container (New)
  • Related to Arvados - Feature #17209: Controller forwards web requests to crunch worker nodes (In Progress, Tom Clegg, 05/11/2021)
  • Related to Arvados - Feature #17170: Shell into container proof of concept (Resolved, Tom Clegg, 01/14/2021)
  • Related to Arvados - Feature #17657: [container shell] support SSH port forwarding (Resolved, Tom Clegg, 05/10/2021)
  • Related to Arvados - Bug #17682: [arvados-client] shell stutter (Resolved, Tom Clegg, 05/26/2021)
  • Related to Arvados - Feature #19166: Container shell support for SLURM and LSF dispatchers (Resolved, Tom Clegg, 06/24/2022)

#1

Updated by Tom Clegg almost 2 years ago

  • Related to Feature #17206: crunch-run reverse proxies HTTP requests to container added

#2

Updated by Tom Clegg almost 2 years ago

  • Related to Feature #17209: Controller forwards web requests to crunch worker nodes added

#3

Updated by Peter Amstutz almost 2 years ago

A use case we haven't really talked about yet is supporting communication between containers. For example, could we use something like this to set up a Spark cluster from within Arvados? You could submit a bunch of containers that run the Spark agent, and each one would have a host:port that allows the leader process to communicate and distribute computations through Spark's own job/task scheduling mechanism.

#4

Updated by Tom Clegg almost 2 years ago

  • Related to Story #17103: Developer shell inside running container added

#5

Updated by Tom Clegg almost 2 years ago

  • Related to Feature #17170: Shell into container proof of concept added

#6

Updated by Tom Clegg almost 2 years ago

  • Description updated (diff)

#7

Updated by Tom Clegg almost 2 years ago

  • Description updated (diff)

#9

Updated by Peter Amstutz almost 2 years ago

  • Related to deleted (Story #17103: Developer shell inside running container)

#10

Updated by Peter Amstutz almost 2 years ago

  • Category deleted (Crunch)
  • Project changed from Arvados to Arvados Epics

#11

Updated by Peter Amstutz almost 2 years ago

  • Start date set to 09/01/2021
  • Due date set to 12/31/2021

#12

Updated by Tom Clegg over 1 year ago

  • Related to Feature #17657: [container shell] support SSH port forwarding added

#13

Updated by Ward Vandewege over 1 year ago

  • Related to Bug #17682: [arvados-client] shell stutter added

#14

Updated by Tom Clegg over 1 year ago

  • Description updated (diff)

#15

Updated by Peter Amstutz over 1 year ago

  • Start date changed from 09/01/2021 to 01/01/2022
  • Due date changed from 12/31/2021 to 03/31/2022

#16

Updated by Peter Amstutz about 1 year ago

  • Start date changed from 01/01/2022 to 03/01/2022
  • Due date changed from 03/31/2022 to 06/30/2022

#17

Updated by Peter Amstutz 11 months ago

  • Start date changed from 03/01/2022 to 09/01/2022
  • Due date changed from 06/30/2022 to 12/31/2022

#18

Updated by Peter Amstutz 7 months ago

  • Start date changed from 09/01/2022 to 12/01/2022
  • Due date changed from 12/31/2022 to 03/31/2023

#19

Updated by Tom Clegg 6 months ago

  • Related to Feature #19166: Container shell support for SLURM and LSF dispatchers added