Service containers » History » Version 7

Peter Amstutz, 02/05/2025 07:23 PM

h1. Service containers

Concept: Containers that are launched via the Crunch infrastructure but also provide a network port that clients can connect to.

Arvados epic: https://dev.arvados.org/issues/17207

h2. Use cases

* Applications providing an API
** A large amount of data needs to be loaded into RAM before it can be used, queried, or computed on
** e.g. large language models, databases, function-as-a-service
** Makes sense when the time spent on any given query is much smaller than the loading time

* User-facing web applications
** e.g. Integrative Genomics Viewer (IGV), Jupyter notebooks
** Also includes web applications that interact with an API (first bullet)

* Cluster maintenance services
** Services that react to events on the cluster, such as kicking off a workflow when a collection appears in a certain project, or checking projects for metadata conformance. These currently run outside of the cluster, but could benefit from Arvados features if they were also managed by the cluster.
** This doesn't strictly require the ability to expose web services, but would benefit from other tweaks to better accommodate long-lived containers.

h2. Fundamental requirement

Crunch launches a container and makes it possible for an outside client to communicate with the container.

h2. Discussion points

h3. Who can communicate with the container

Exposing services primarily to outside clients and enabling communication between containers on the inside have different requirements.

Outside: Clients must be able to connect from outside. Because containers are on a private network, some kind of proxying or network address translation (NAT) is required.

Inside: Assuming containers are on the same private network and can route to each other, they can communicate directly. Containers need to be able to discover how to contact other containers. (We might even want a way of declaring exactly which containers can connect to which other containers.)

h3. HTTP only, or arbitrary TCP connections?

HTTP only: Can proxy HTTP requests using wildcard DNS and "Host:" headers; we already have machinery and operational experience doing that. Can apply Arvados authentication to requests, e.g. setting a cookie with an Arvados token so the client can only communicate with containers that it has read access to. Cannot host services that don't use HTTP.
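
To illustrate the wildcard DNS and "Host:" header routing described above, here is a minimal sketch of how a proxy might recover the container request UUID from the hostname. The function name and the regular expression are assumptions made for illustration; the hostname layout follows the example URLs used elsewhere on this page, and nothing here is existing controller code.

```python
import re

# Assumed hostname layout: "<container request UUID>.svc.<cluster domain>".
# "xvhdp" is the UUID infix that the example URLs on this page use for
# container requests.
_SVC_HOST = re.compile(r"^([a-z0-9]{5}-xvhdp-[a-z0-9]{15})\.svc\.(.+)$")


def container_request_from_host(host_header):
    """Return the container request UUID named by a Host header, or None."""
    # Drop an optional ":port" suffix and normalize case before matching.
    hostname = host_header.split(":", 1)[0].strip().lower()
    match = _SVC_HOST.match(hostname)
    return match.group(1) if match else None
```

A hostname that does not match the pattern yields None, so the proxy can fall through to its normal handling for other virtual hosts.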

Arbitrary TCP: Would need to apply NAT or connection tunneling to connections on an arbitrary external port that is associated with the container. We don't currently have machinery to do this. Authentication is left up to the service. Can host services that have their own protocols, such as postgresql or ssh.

Container shell uses connection tunneling: it makes an HTTP connection and then does a connection upgrade to SSH. This requires special cooperation between arvados-client and ssh, which doesn't generalize.

For internal-only connections (between containers), it may be a bit easier to orchestrate arbitrary TCP connections without tunneling. Authentication is still left up to the container, or requires adjusting firewall rules on the fly to control who can access the container.

h3. Redundancy with other platforms

Kubernetes orchestrates services, so this feature overlaps with Kubernetes. We don't have the resources to compete with Kubernetes. However, because Arvados is a data analytics platform where scheduling and running code is a core feature, a carefully scoped feature for hosting services could give us significant new capability relative to the amount of work.

h3. Long-lived containers

We might want to limit certain kinds of logging such as the stats from crunchstat, hoststat, and arv-mount, because a container running for weeks will accumulate a _lot_ of logs.

h3. Container naming

If you start a service, use it for a bit, shut it down, then submit a new container request to bring it back up again, it will get a new UUID. This is a problem if the new session represents the same service and people have the old URL bookmarked, written into scripts, etc.

It would be great to be able to assign a friendly hostname to a running container. Example: instead of https://zzzzz-xvhdp-iiiiiiiiiiiiiii.svc.zzzzz.arvadosapi.com/ you could go to https://ollama.svc.zzzzz.arvadosapi.com/
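
One way to picture the friendly hostname idea is as an alias table that maps a name to whichever container request currently backs the service. The sketch below is purely illustrative: the table, both function names, and the fallback behavior are assumptions, not a proposed API.

```python
def resolve_service_label(label, aliases):
    """Map the first hostname label to a container request UUID.

    `aliases` is a hypothetical dict of friendly name -> container request
    UUID. A label that is not a registered alias is returned unchanged, so
    UUID-based hostnames keep working.
    """
    return aliases.get(label, label)


def reassign_alias(aliases, name, new_uuid):
    """Return a new alias table in which `name` points at `new_uuid`."""
    updated = dict(aliases)
    updated[name] = new_uuid
    return updated
```

When the service is restarted under a new container request, only the alias entry changes, so bookmarks and scripts that use the friendly name keep working.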

h2. Initial proposal

1. Container request zzzzz-xvhdp-iiiiiiiiiiiiiii is submitted with

<pre>
{
  "runtime_constraints": {
    "expose_http_from": 80
  }
}
</pre>

This means "expose the HTTP service running inside the container on port 80". It must be an unencrypted HTTP endpoint.
This creates a corresponding container zzzzz-dz642-iiiiiiiiiiiiiii.

2. For running containers with "expose_http_from", a user can visit a URL proxied by controller:
https://zzzzz-xvhdp-iiiiiiiiiiiiiii.svc.zzzzz.arvadosapi.com/foo?baz&api_token=v2/foo/bar
This does a cookie-setting redirect to:
https://zzzzz-xvhdp-iiiiiiiiiiiiiii.svc.zzzzz.arvadosapi.com/foo?baz
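
The cookie-setting redirect can be sketched as a small URL transformation: remove the api_token query parameter, and hand the token back as a cookie. This is only an illustration under stated assumptions; the cookie name and attributes are hypothetical, and the function is not taken from the existing controller code.

```python
from urllib.parse import quote, unquote, urlsplit, urlunsplit

# Hypothetical cookie name, chosen for this sketch only.
COOKIE_NAME = "arvados_api_token"


def cookie_setting_redirect(url):
    """Return (redirect_url, set_cookie_value) for a requested URL.

    set_cookie_value is None when the URL carries no api_token parameter.
    """
    parts = urlsplit(url)
    token = None
    kept = []
    # Work on the raw query string so fields such as "baz" (no "=") are
    # passed through unchanged.
    for field in parts.query.split("&"):
        name, _, value = field.partition("=")
        if name == "api_token":
            if token is None:
                token = unquote(value)
            continue  # never forward the token in the URL
        kept.append(field)
    redirect_url = urlunsplit(
        (parts.scheme, parts.netloc, parts.path, "&".join(kept), parts.fragment)
    )
    if token is None:
        return redirect_url, None
    cookie = f"{COOKIE_NAME}={quote(token, safe='')}; Path=/; Secure; HttpOnly"
    return redirect_url, cookie
```

The proxy would answer the first request with a redirect to redirect_url and a Set-Cookie header carrying the returned value, so the token no longer appears in the address bar or in bookmarks.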

On each request, the proxy checks the API token to determine whether the user has read access to the container request.
The proxy also adds an X-Arvados-User-UUID header to the request.
If the container is in a project shared with the anonymous user, no API token is required.
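
The per-request check described above could look roughly like the following sketch. The two callbacks stand in for lookups against the API server and are hypothetical, as is the function name. Dropping any client-supplied X-Arvados-User-UUID header is an extra precaution assumed here so that the service can trust the value; the proposal itself does not state it.

```python
def prepare_forwarded_headers(headers, token, lookup_user, can_read):
    """Return the headers to forward to the container, or None if denied.

    lookup_user(token) -> user UUID or None   (hypothetical callback)
    can_read(user_uuid) -> bool               (hypothetical callback;
        user_uuid is None for the anonymous user)
    """
    user_uuid = lookup_user(token) if token else None
    if not can_read(user_uuid):
        return None
    # Assumption: strip any client-supplied copy of the header so the
    # container only ever sees the value set by the proxy.
    forwarded = {
        name: value
        for name, value in headers.items()
        if name.lower() != "x-arvados-user-uuid"
    }
    if user_uuid is not None:
        forwarded["X-Arvados-User-UUID"] = user_uuid
    return forwarded
```

With a container in a publicly shared project, can_read(None) returns True and the request is forwarded without a user header.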

3. Controller forwards the request to the container and returns the response, using the mechanism that has been developed for container shell and container logs.

Visiting the container request on Workbench gives an easy-to-click link to "https://zzzzz-xvhdp-iiiiiiiiiiiiiii.svc.zzzzz.arvadosapi.com/?api_token=v2/foo/bar"

h2. Engineering meeting notes

We are treating a service container (a long-lived container process) and a container that is available over HTTP as distinct features.

h3. Service container request

Service containers can _only_ reuse running containers.
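
As a rough illustration of that rule, the sketch below picks a reuse candidate only when it is in the Running state. The candidate records are assumed to be dicts with "uuid" and "state" keys, and the function name is hypothetical; the real reuse logic considers more than this.

```python
def reusable_for_service(candidates):
    """Return the UUID of the first Running candidate, or None.

    `candidates` is assumed to be a list of dicts with "uuid" and "state"
    keys. Finished containers are skipped because their process is no
    longer there to serve requests.
    """
    for container in candidates:
        if container["state"] == "Running":
            return container["uuid"]
    return None
```

If no candidate is running, the caller would start a new container instead of reusing an old result.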

Need to double-check container cancellation behavior; we might want to be able to do a graceful shutdown.

h3. HTTP endpoints

Mulling over the idea of being able to connect to arbitrary ports but also have named, published endpoints.

Arbitrary ports are only available to the user that owns the container.

Published endpoints have access control:

* owner only
* can_manage
* can_write
* can_read
* public (or anonymous)
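
The access levels listed above can be read as an ordered scale, where each published endpoint states the minimum level a requester needs. The sketch below assumes that higher levels imply lower ones (for example, can_manage implies can_read); the enum, its numeric values, and the function name are hypothetical.

```python
from enum import IntEnum


class Access(IntEnum):
    """Hypothetical ordering of the access levels, weakest first."""

    PUBLIC = 0
    CAN_READ = 1
    CAN_WRITE = 2
    CAN_MANAGE = 3
    OWNER = 4


def may_connect(required, requester):
    """True if the requester's level meets the endpoint's requirement."""
    return requester >= required
```

Under this reading, an arbitrary (unpublished) port behaves like an endpoint whose requirement is Access.OWNER.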