Feature #22551
Updated by Peter Amstutz 10 days ago
https://dev.arvados.org/projects/arvados/wiki/Service_containers
h2. API changes
h3. Containers and container requests
They get two new fields:
|"service"|boolean|whether the container represents a long-lived service rather than a once-through batch run. Setting "service" to true disables container reuse.|
|"published_ports"|dict|dictionary with keys being the port on the container, and values described below|
Each "published_ports" value is a dictionary with these keys:
|"access"|string|"public" or "private". "public" means unauthenticated connections are allowed. "private" means you must provide an unscoped Arvados API token belonging to the user who owns the container|
|"label"|string|text describing the service to be displayed in workbench|
Example:
<pre>
{
  "published_ports": {
    "80": {
      "access": "private",
      "label": "My great web app"
    }
  }
}
</pre>
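As an illustration only, the shape of "published_ports" could be checked along these lines. The function name @validate_published_ports@ is hypothetical, and the sketch assumes that "access" may be omitted (treated as "private", matching the controller behaviour described later) and that "label" is optional; the ticket does not state either.

```python
# Hypothetical helper: checks the structure described in the tables above.
VALID_ACCESS = ("public", "private")


def validate_published_ports(published_ports):
    """Return a list of problems; an empty list means the value looks valid."""
    if not isinstance(published_ports, dict):
        return ["published_ports must be a dictionary"]
    problems = []
    for port, value in published_ports.items():
        # JSON object keys are strings, so ports arrive as e.g. "80".
        if not (isinstance(port, str) and port.isascii() and port.isdigit()
                and 1 <= int(port) <= 65535):
            problems.append(f"invalid port key: {port!r}")
        if not isinstance(value, dict):
            problems.append(f"value for port {port!r} must be a dictionary")
            continue
        # Assumption: a missing "access" is treated as "private".
        if value.get("access", "private") not in VALID_ACCESS:
            problems.append(f"port {port!r}: access must be public or private")
        # Assumption: "label" is optional but must be text when present.
        if not isinstance(value.get("label", ""), str):
            problems.append(f"port {port!r}: label must be a string")
    return problems
```

Calling it on the example above returns an empty list; a key such as "http" or an access value such as "open" each add one entry to the returned list.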
h3. New "link" type
The "hostname" link lets us assign friendly, stable names to services. This is discussed further in the "Controller/crunch-run changes" section.
|link_class|"hostname"|
|owner_uuid|normal ownership (user or project)|
|name|the hostname to assign to the port. Must be a valid single DNS label: no dots and no characters that are invalid in a hostname|
|head_uuid|container request uuid|
|properties|has a single key, "port", e.g. {"port": 80}|
The API server shall have a new unique constraint index on links where link_class=hostname, so that only one link with a given hostname can exist at a time.
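A minimal sketch of what a "hostname" link record and its name check might look like. The helper names are hypothetical, and the regular expression assumes the usual hostname label rules (letters, digits and hyphens, 1 to 63 characters, no leading or trailing hyphen); the ticket itself only says "valid DNS name" without dots.

```python
import re

# Assumed label rules: 1-63 chars, alphanumeric at both ends, hyphens inside.
_LABEL_RE = re.compile(r"[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?")


def is_valid_hostname_label(name):
    """True if name is usable as a single DNS label (no dots)."""
    return isinstance(name, str) and _LABEL_RE.fullmatch(name.lower()) is not None


def make_hostname_link(owner_uuid, name, container_request_uuid, port):
    """Build the link record described in the table above (hypothetical helper)."""
    if not is_valid_hostname_label(name):
        raise ValueError(f"not a valid hostname label: {name!r}")
    return {
        "link_class": "hostname",
        "owner_uuid": owner_uuid,
        "name": name,
        "head_uuid": container_request_uuid,
        "properties": {"port": port},
    }
```

The uniqueness of "name" would still be enforced by the API server's index; this check only covers the syntax of the name.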
h2. Controller/crunch-run changes
Requests to a virtual host in the configured HTTP container domain (e.g. *.containers.zzzzz.arvadosapi.com) are directed to controller and intercepted.
1. Controller uses the hostname to determine which container the request applies to. One of the following applies:
* The hostname has the format "uuid-port" e.g. zzzzz-xvhdp-iiiiiiiiiiiiiii-1234.containers.zzzzz.arvadosapi.com (It might be nice if the port is missing to default to port 80 on the target)
* The hostname has been claimed with a "hostname" link as described above (controller should fall back to this _after_ checking for a uuid, to prevent someone from taking over a uuid, I think?)
This lookup yields a container request uuid and a target port. From the container request, controller gets the container uuid.
If the container does not exist, return a 404 error.
2. Check for "?api_token=" in the URL. If found, remove "?api_token" from the query, and return a redirect setting a cookie containing the token (this should use the same mechanism that keep-web uses).
3. Get "published_ports" from the container record and look up the port that the user has requested. If access is marked "public", allow the request; otherwise the port is considered "private".
If it is "private", check for an authorization cookie and/or "Authorization" header to get the token. The token must be valid and belong to the user that owns the container.
If the port is private and there is no token, or the token's user does not match, return a 403 error.
4. If the request passes access control, the request is proxied to the container's crunch-run process using the mechanism previously developed for container shell and container logs.
5. The container's crunch-run process receives the request and proxies it to the container on the specified port.
If the port is not open on the container, return an error (should it be a 404 or a 502? It depends on whether we want the client to retry).
6. The response is relayed back through crunch-run and controller.
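Steps 1 to 3 above could be sketched roughly as follows. This is not controller's actual code (controller is not written in Python); all function names are hypothetical, @hostname_links@ stands in for an API lookup of "hostname" links by name, the uuid pattern is an assumption based on the example hostname, and returning 404 for a port that is not in "published_ports" is also an assumption, since the ticket does not say.

```python
import re
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed "<container request uuid>[-<port>]" pattern; port defaults to 80.
_UUID_PORT_RE = re.compile(r"([a-z0-9]{5}-xvhdp-[a-z0-9]{15})(?:-(\d{1,5}))?")


def resolve_target(host, container_domain, hostname_links):
    """Step 1: map a Host value to (container_request_uuid, port), or None."""
    suffix = "." + container_domain
    if not host.endswith(suffix):
        return None
    label = host[:-len(suffix)]
    match = _UUID_PORT_RE.fullmatch(label)
    if match:  # the uuid form is checked first, so a link cannot shadow a uuid
        return match.group(1), int(match.group(2) or 80)
    link = hostname_links.get(label)  # stand-in for a "hostname" link lookup
    if link:
        return link["head_uuid"], link["properties"]["port"]
    return None


def strip_api_token(url):
    """Step 2: return (url without api_token, token); token is None if absent."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    tokens = [v for k, v in query if k == "api_token"]
    if not tokens:
        return url, None
    remaining = [(k, v) for k, v in query if k != "api_token"]
    return urlunsplit(parts._replace(query=urlencode(remaining))), tokens[0]


def check_access(published_ports, port, token_user_uuid, owner_uuid):
    """Step 3: return 200 to proxy the request, otherwise an error status."""
    entry = published_ports.get(str(port))
    if entry is None:
        return 404  # assumption: an unpublished port is treated as not found
    if entry.get("access") == "public":
        return 200
    # Anything not marked "public" is private: the token's user must be the owner.
    if token_user_uuid is None or token_user_uuid != owner_uuid:
        return 403
    return 200
```

When @strip_api_token@ returns a token, the caller would answer with a redirect to the cleaned URL and set the token in a cookie, as step 2 describes; validating the token and finding its user is left out of the sketch.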
h2. Workbench changes
If a container request is running and has a non-empty "published_ports", Workbench should display those ports prominently at the top as links the user can click on. It should check for "hostname" links pointing to the container request and preferentially use those when constructing the URL. The link should include "?api_token=".
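The URL construction might look something like this sketch. Workbench itself is not written in Python, so this only illustrates the logic; the function name @service_url@, the https scheme, and passing the links as a list of records are all assumptions.

```python
from urllib.parse import urlencode


def service_url(cr_uuid, port, container_domain, api_token, hostname_links=()):
    """Build a clickable URL for one published port (hypothetical helper)."""
    # Default form: "<container request uuid>-<port>".
    label = f"{cr_uuid}-{port}"
    for link in hostname_links:
        # Prefer a "hostname" link that targets this container request and port.
        if (link.get("link_class") == "hostname"
                and link.get("head_uuid") == cr_uuid
                and link.get("properties", {}).get("port") == int(port)):
            label = link["name"]
            break
    # urlencode percent-escapes the token so it is safe in a query string.
    return f"https://{label}.{container_domain}/?" + urlencode({"api_token": api_token})
```

The port is compared as an integer because "published_ports" keys are strings ("80") while the link's "port" property in the example is a number (80).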
h2. arvados-cwl-runner changes
Introduce an extension to CommandLineTool that corresponds to setting the "service" and "published_ports" fields.
In the future, we may want to be able to launch a service and then return its endpoint as output that can be passed downstream; this is out of scope for this ticket.