Story #17207

Updated by Tom Clegg about 2 years ago

 * container request zzzzz-xvhdp-iiiiiiiiiiiiiii was submitted with runtime_constraints: API: true 
 * corresponding container zzzzz-dz642-iiiiiiiiiiiiiii is running 
 * process in the container has an http server listening on tcp port 6061 (or some other port(s)) 
 * cluster’s wildcard DNS, TLS certificates, and Nginx routing are suitably configured 
 * “v2/foo/bar” is a valid api token for a user who has write permission on the container request 
 * token scope includes “all” 

 Pointing a browser to a URL like 

 (or a similar URL with the container UUID) will result in a redirect-with-cookie to a URL like 

 which will be proxied through to the http server listening on port 6061. 

 h3. Implementation phase 1 

 Nginx proxies * to controller. 

 Controller extracts UUID from vhost, checks token/scope permission. 
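
 A minimal sketch of the vhost parsing step, under stated assumptions: the hostname layout (UUID at the start of the leftmost DNS label, optionally followed by "-" and more text) and the names @uuidRe@ and @uuidFromVhost@ are hypothetical, and the example hostnames are made up. Only the two UUID types named in this story (xvhdp for container requests, dz642 for containers) are accepted.

```go
package main

import (
	"fmt"
	"net"
	"regexp"
	"strings"
)

// uuidRe matches a container request (xvhdp) or container (dz642) UUID at
// the start of a hostname label. Allowing "-..." after the UUID is an
// assumption of this sketch, not something the story specifies.
var uuidRe = regexp.MustCompile(`^([a-z0-9]{5}-(?:xvhdp|dz642)-[a-z0-9]{15})(?:-|$)`)

// uuidFromVhost returns the UUID found in the leftmost label of a Host
// header value, which may or may not include a ":port" suffix.
func uuidFromVhost(hostport string) (string, bool) {
	host := hostport
	if h, _, err := net.SplitHostPort(hostport); err == nil {
		host = h
	}
	label := strings.SplitN(strings.ToLower(host), ".", 2)[0]
	m := uuidRe.FindStringSubmatch(label)
	if m == nil {
		return "", false
	}
	return m[1], true
}

func main() {
	// Hypothetical hostnames, for illustration only.
	fmt.Println(uuidFromVhost("zzzzz-dz642-iiiiiiiiiiiiiii.containers.example.com:443"))
	fmt.Println(uuidFromVhost("workbench.example.com"))
}
```

 The token/scope permission check would follow, using whichever UUID was extracted.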

 Controller looks up the container’s crunch-run IP addr:port and forwards the request there. 
 * don’t read/modify the authorization header (it must be passed through to the container unchanged) 
 * strip the arvados_api_token cookie and other non-forwardable headers 
 * set “x-arvados-target-uuid” header to the target container uuid (even if original request gave the container request uuid) 
 * set “x-arvados-target-port” header to the requested port 
 * set “x-arvados-request-host” header to the original request’s Host header value 
 * set “x-arvados-authorization” header to “rand+timestamp hmac-sha256(systemroottoken, rand+timestamp)” 
 * change target URL to crunch-run’s addr:port (iow, don’t let the go http client see the original request’s Host, in case that influences its connection-sharing logic) 

 Cloud dispatcher tells crunch-run (via "env" json doc passed on crunch-run's stdin) the external IP address/hostname it is using to connect to the worker’s SSH service. 

 Crunch-run handles incoming https requests. 
 * get worker host’s external IP address from stdin supplied by dispatcher. 
 * listen on a random port. 
 * save the external IP address and listening port number in the container record when changing state to Running. 
 * (for now) use a non-verifiable/self-signed tls certificate. 
 * check “x-arvados-authorization” header timestamp and hmac. 
 * check “x-arvados-target-uuid” header is the current container uuid. 
 * get target port from “x-arvados-target-port” header. 
 * strip “x-arvados-*” and other non-forwardable headers. 
 * proxy request to target port using plain http, using the incoming “x-arvados-request-host” header as the outgoing Host header. 
 * if “x-arvados-request-host” is missing, return 502 (avoid mishandling requests from future controller versions) 

 Crunch-run integration test uses a real docker daemon (might not be jenkins-friendly but must be trivial for a developer to run). 

 h3. Implementation phase 2+ 

 * Also accept scoped token if it allows “CONNECT /arvados/v1/container/{uuid}/{port}” 
 * Work with a container process that serves https instead of http. 
 * When handling an incoming request, if needed, use the relevant docker APIs to add a private network between container and host. 
 * Periodically list all listening tcp ports in the container, and sync that list to the container record (so workbench2 and the front-end proxy can see it). 
 * In Workbench2, show a “Connect to port N” item for listening port in the context menu of any writable container/CR. 
 * Option to prohibit connections to a container (details TBD). 
 * Option to avoid reusing a container for other CRs if it has accepted any incoming connections (details TBD). 
 * Provide docs/conveniences re configuring services to produce working self-URLs in this environment. 
 * Admin option to enable certificate validation on https traffic between controller and crunch-run (worker nodes) without deploying real CA-signed certificates to the worker nodes (e.g., by passing a fresh arvados-signed cert from a-d-c to each new crunch-run). 
 * (Even if the config specifies a management port number instead of using a dynamic port) handle multiple concurrent containers on a single node without interrupting login sessions (it's easy enough for multiple crunch-run processes to listen on the same port using SO_REUSEPORT and direct traffic to the correct container based on request params/headers; but if container A finishes while its crunch-run process is still providing a tunnel to container B, it needs to release flocks, shut down tunnels to A, stop listening for new connections, etc., but stay alive to keep B tunnels running until B also ends). 

 h3. Related features 

 * “arvados-client forward lport ctr rport” listens to lport on localhost and forwards all incoming connections to rport on ctr (analogous to “ssh -Llport:rhost:rport”) 
 * #17103 “arvados-client shell ctr” connects your pty to a new interactive shell running in ctr (with optional/multiple -Llport:rport arguments to forward tcp connections at the same time).