Feature #22551

Containers can expose HTTP endpoints

Added by Peter Amstutz 2 months ago. Updated 4 days ago.

Status: In Progress
Priority: Normal
Assigned To:
Category: Crunch
Target version:
Story points: -

Description

https://dev.arvados.org/projects/arvados/wiki/Service_containers

API changes

Containers and container requests

They get two new fields:

"service" (boolean): whether the container represents a long-lived service or a once-through batch run. If "service" is true, container reuse is disabled.
"published_ports" (dict): dictionary whose keys are ports on the container and whose values are described below.

published_port values

"access" (string): "public" or "private". "public" means unauthenticated connections are allowed. "private" means you must provide an unscoped Arvados API token belonging to the user who owns the container.
"label" (string): text describing the service, to be displayed in Workbench.

Example:

{
  "published_ports": {
    "80": {
      "access": "private",
      "label": "My great web app" 
    }
  }
}
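
A minimal sketch of how a "published_ports" value could be checked against the field descriptions above. The function name is hypothetical, and treating both "access" and "label" as required is an assumption; the ticket does not say whether either is optional.

```python
import re

VALID_ACCESS = {"public", "private"}

def validate_published_ports(published_ports):
    """Return a list of problems found in a published_ports dict (empty if none)."""
    problems = []
    for port, entry in published_ports.items():
        # Keys are container ports; JSON object keys are strings, e.g. "80".
        port_text = str(port)
        if not re.fullmatch(r"[0-9]+", port_text) or not 1 <= int(port_text) <= 65535:
            problems.append(f"invalid port key: {port!r}")
        if not isinstance(entry, dict):
            problems.append(f"{port}: value must be a dict")
            continue
        # Assumption: both fields are required.
        if entry.get("access") not in VALID_ACCESS:
            problems.append(f"{port}: access must be 'public' or 'private'")
        if not isinstance(entry.get("label"), str):
            problems.append(f"{port}: label must be a string")
    return problems
```

Calling it on the example above returns an empty list.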

TODO discussion point

I think on the container, when it is running, each entry in "published_ports" should have a "connect_at" (or "connect_url" or whatever) field as well, so Workbench doesn't have to do as much work trying to figure out the right URL to provide. This would also make it feasible to distinguish services by port in order to support the single host case (useful for development).

New "link" type

The "published_port" link lets us assign friendly, stable names to services. Discussed more in the "controller" section.

link_class: "published_port"
owner_uuid: normal ownership (user or project)
name: a hostname for which requests will be proxied to the given port on the container associated with the given container request. Must be a valid DNS name component; it cannot contain dots or other invalid characters.
head_uuid: container request uuid
properties: has a single key, "port", which is an integer, e.g. {"port": 80}

The API server shall have a new unique constraint index so that only one link object with a given (link_class=published_port, name) can exist at a time.
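
The link fields and the uniqueness rule can be sketched as follows. This is an illustration only: `make_published_port_link` and `LinkStore` are hypothetical names, the in-memory dict stands in for the database index, and the DNS label pattern (letters, digits, hyphens, at most 63 characters) is my reading of "valid DNS name" with no dots.

```python
import re

# Assumption: a single DNS label, no dots, no leading or trailing hyphen.
DNS_LABEL = re.compile(r"[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?")

def make_published_port_link(name, container_request_uuid, port, owner_uuid):
    """Build a link record with the fields listed above."""
    if not DNS_LABEL.fullmatch(name):
        raise ValueError(f"not a valid single DNS label: {name!r}")
    return {
        "link_class": "published_port",
        "owner_uuid": owner_uuid,
        "name": name,
        "head_uuid": container_request_uuid,
        "properties": {"port": int(port)},
    }

class LinkStore:
    """In-memory stand-in for the unique index on (link_class, name)."""

    def __init__(self):
        self._links = {}

    def create(self, link):
        key = (link["link_class"], link["name"])
        if key in self._links:
            raise ValueError(f"name already claimed: {link['name']!r}")
        self._links[key] = link
        return link
```

A second `create` call with the same name raises, mirroring what the unique index would enforce on the API server.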

Controller/crunch-run changes

Requests to virtual hosts in the configured HTTP container domain (e.g. *.containers.zzzzz.arvadosapi.com) are directed to controller and intercepted.

1. Controller uses the hostname to determine which container the request applies to. Either:
  • The hostname has the format "uuid-port", e.g. zzzzz-xvhdp-iiiiiiiiiiiiiii-1234.containers.zzzzz.arvadosapi.com. (If the port is missing, it might be nice to default to port 80 on the target.)
  • The hostname has been claimed with a "published_port" link as described above. (I think it should fall back to this only after checking for a uuid, to prevent someone from taking over a uuid.)

This yields a container request uuid and target port. The container uuid is then taken from the container request.

If the container does not exist, return a 404 error.
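
A rough sketch of the hostname resolution in step 1, under several assumptions: container request uuids follow the usual Arvados 5-5-15 layout with the "xvhdp" type infix, a missing port defaults to 80 (only floated as a nice-to-have above), and `lookup_link` is a hypothetical callable that returns (container request uuid, port) for a claimed name or None.

```python
import re

# Assumption: "<uuid>" or "<uuid>-<port>", where uuid is a container request uuid.
CR_UUID_PORT = re.compile(r"([a-z0-9]{5}-xvhdp-[a-z0-9]{15})(?:-([0-9]+))?")

def resolve_target(host, container_domain, lookup_link):
    """Map a request hostname to (container_request_uuid, port), or None."""
    host = host.lower().split(":", 1)[0]  # drop any ":port" from the Host header
    suffix = "." + container_domain
    if not host.endswith(suffix):
        return None
    label = host[: -len(suffix)]
    if not label or "." in label:
        return None
    # Check the uuid form first so a link name can never shadow a uuid.
    match = CR_UUID_PORT.fullmatch(label)
    if match:
        return match.group(1), int(match.group(2) or 80)
    # Otherwise fall back to names claimed through links.
    return lookup_link(label)
```

If this returns None, or the container request has no container, controller would answer 404.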

2. Check for "?api_token=" in the URL. If found, remove "?api_token" from the query, and return a redirect setting a cookie containing the token (this should use the same mechanism that keep-web uses).
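
The query rewrite in step 2 might look like the sketch below. `token_redirect` is a hypothetical helper; it only computes the redirect target and extracts the token. Setting the cookie itself (name, attributes) is left to whatever mechanism keep-web already uses and is not shown.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def token_redirect(url):
    """Return (redirect_url, token) if api_token is in the query, else None."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    tokens = [value for key, value in pairs if key == "api_token"]
    if not tokens:
        return None
    # Keep every other query parameter, in its original order.
    remaining = [(key, value) for key, value in pairs if key != "api_token"]
    clean_url = urlunsplit(parts._replace(query=urlencode(remaining)))
    return clean_url, tokens[0]
```

The caller would respond with a redirect to the first element and attach the second as a cookie.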

3. Get "published_ports" from the container record. Look up the port that the user has requested to connect to. Check whether access is marked "public"; anything else is treated as "private".

If it is "private", check for an authorization cookie and/or "Authorization" header to get the token. The token must be valid and correspond to the user that owns the container.

If the container is private and either no token was provided or the token's user does not match, return a 403 error.

If access is "public", the container record should be cached to avoid excessive container/user lookups on subsequent requests.
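
The access decision in step 3 can be summarized in a few lines. This is a sketch, not the controller implementation: `check_access` is a hypothetical name, `token_user_uuid` stands for the result of validating the cookie or "Authorization" token (None if missing or invalid), and answering 404 for a port that is not listed in "published_ports" is an assumption the ticket does not spell out.

```python
def check_access(published_ports, port, container_owner_uuid, token_user_uuid):
    """Return an HTTP status: 200 to proceed with proxying, 403 or 404 to refuse."""
    entry = published_ports.get(str(port))
    if entry is None:
        return 404  # assumption: unpublished ports look like missing resources
    if entry.get("access") == "public":
        return 200
    # Anything other than "public" is treated as "private".
    if token_user_uuid is None or token_user_uuid != container_owner_uuid:
        return 403
    return 200
```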

4. If the request passes access control, the request is proxied to the container's crunch-run process using the mechanism previously developed for container shell and container logs.

5. The container's crunch-run process receives the request and proxies it to the container on the specified port.

If the port is not open on the container, return an error (should it be a 404 or a 502? That depends on whether we want the client to retry).

6. The response is relayed back through crunch-run and controller.

Workbench changes

If a container request is running and has a non-empty "published_ports", Workbench should display those ports prominently at the top as something the user can click on. It should check for "published_port" links pointing to the container request and preferentially use those when constructing the URL. The link should include "?api_token=".
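
As an illustration of the URL construction described above, here is a small sketch. The function name, the `link_names` mapping (port to claimed name), and the use of https are all assumptions made for the example.

```python
from urllib.parse import urlencode

def service_url(cr_uuid, port, container_domain, api_token, link_names=None):
    """Build the clickable URL for one published port of a container request."""
    link_names = link_names or {}
    # Prefer a friendly name claimed through a link; else use "uuid-port".
    label = link_names.get(int(port)) or f"{cr_uuid}-{port}"
    query = urlencode({"api_token": api_token})
    return f"https://{label}.{container_domain}/?{query}"
```

With a link named "myapp" on port 80, the result would point at myapp.containers.zzzzz.arvadosapi.com; without one, it falls back to the "uuid-port" hostname.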

arvados-cwl-runner changes

Introduce an extension to CommandLineTool that corresponds to setting the "service" and "published_ports" fields.

In the future, we may want to be able to launch a service and then return its endpoint as output that can be passed downstream; this is out of scope for this ticket.


Subtasks 2 (0 open, 2 closed)

Task #22569: Tom to review implementation plan in ticket description (Resolved, Peter Amstutz, 04/16/2025)
Task #22570: Review (Resolved, Tom Clegg, 04/16/2025)

Related issues 4 (1 open, 3 closed)

Related to Arvados Epics - Idea #17207: services running in containers (In Progress, 03/01/2025 - 05/31/2025)
Related to Arvados - Feature #22581: Implement API server changes to expose HTTP endpoints described in #22551 (Resolved, Peter Amstutz)
Related to Arvados - Feature #22677: Controller supports published_ports access control and "published_port" links as DNS aliases (Duplicate)
Related to Arvados - Feature #22678: Workbench renders links for "published_ports" (Duplicate)