Service containers » History » Revision 4

Revision 3 (Peter Amstutz, 02/04/2025 08:46 PM) → Revision 4/14 (Peter Amstutz, 02/04/2025 09:16 PM)

h1. Service containers 

 Concept: Containers that are launched via the Crunch infrastructure and provide a network port that clients can connect to. 

 Arvados epic: https://dev.arvados.org/issues/17207 

 h2. Use cases 

 * Applications providing an API 
 ** a bunch of data needs to be loaded into RAM before it can be used, queried, or computed on 
 ** e.g. large language models, databases, function-as-a-service 
 ** Makes sense when the time spent on any given query is much smaller than the loading time 

 * User facing web applications 
 ** e.g. Integrative Genomics Viewer (IGV), Jupyter notebooks 
 ** Also includes web applications that interact with an API (first bullet) 

 * Cluster maintenance services 
 ** Services that react to events on the cluster, such as kicking off a workflow when a collection appears in a certain project, or checking projects for metadata conformance. These services currently run outside of the cluster, but could benefit from Arvados features if they were also managed by the cluster. 

 h2. Fundamental requirement 

 Crunch launches a container and makes it possible for an outside client to communicate with the container. 

 h2. Discussion points 

 h3. Who can communicate with the container  

 Exposing services to outside clients and communication between containers on the inside have different requirements. 

 Outside: Must be able to connect from outside. Because containers are on a private network, some kind of proxying or network address translation (NAT) is required. 

 Inside: Assuming containers are on the same private network and can route to each other, they can communicate directly using direct TCP connections. Need to be able to discover how to contact other containers. (Might even want a way of declaring exactly which containers can connect to which other containers.) 
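As a rough illustration of the discovery and access-declaration idea for the "inside" case, here is a minimal in-memory sketch. The @ServiceRegistry@ class, its methods, and the container names are all hypothetical and not part of Arvados; a real design would presumably store this in the API server.

```python
from typing import Dict, Optional, Set, Tuple


class ServiceRegistry:
    """Hypothetical registry: where each service container listens, and who may connect."""

    def __init__(self) -> None:
        self._endpoints: Dict[str, Tuple[str, int]] = {}
        self._allowed: Dict[str, Set[str]] = {}

    def register(self, service: str, ip: str, port: int, allowed_clients=()) -> None:
        # Record the private address of the service and the containers allowed to reach it.
        self._endpoints[service] = (ip, port)
        self._allowed[service] = set(allowed_clients)

    def lookup(self, client: str, service: str) -> Optional[Tuple[str, int]]:
        # Return the address only if the client was declared as allowed.
        if client not in self._allowed.get(service, set()):
            return None
        return self._endpoints[service]


registry = ServiceRegistry()
registry.register("db-container", "10.0.0.5", 5432, allowed_clients=["worker-1"])
print(registry.lookup("worker-1", "db-container"))
print(registry.lookup("worker-2", "db-container"))
```

Note that this only controls discovery. Actually blocking connections from undeclared containers would still need enforcement at the network level.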

 h3. HTTP only, or arbitrary TCP connections? 

 HTTP only: Can proxy HTTP requests using wildcard DNS and "Host:" headers; we have a lot of machinery and operational experience doing that already. Can apply Arvados authentication to requests, e.g. clients have to provide an Arvados token (for instance by setting a cookie) and can only communicate with containers that they have read access to. Cannot host services that don't use HTTP. 
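The HTTP-only option can be sketched as a routing decision made from the "Host:" header plus a token check. The following is a minimal sketch, not the existing Arvados proxy code: the wildcard suffix @.containers.example.com@, the @Backend@ type, and the @route@ function are assumptions made for illustration, and the set of reader tokens stands in for a real Arvados permission lookup.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional, Tuple

# Hypothetical wildcard domain; a real cluster would configure its own.
WILDCARD_SUFFIX = ".containers.example.com"


@dataclass(frozen=True)
class Backend:
    address: str              # private "ip:port" of the running container
    readers: FrozenSet[str]   # tokens with read access (stand-in for a permission check)


def container_id_from_host(host: str) -> Optional[str]:
    """Extract the container identifier from a Host header value."""
    hostname = host.split(":", 1)[0].lower()  # drop an optional ":port"
    if not hostname.endswith(WILDCARD_SUFFIX):
        return None
    label = hostname[: -len(WILDCARD_SUFFIX)]
    # Exactly one DNS label is expected in front of the wildcard suffix.
    if not label or "." in label:
        return None
    return label


def route(host: str, token: Optional[str],
          backends: Dict[str, Backend]) -> Tuple[int, Optional[str]]:
    """Return (HTTP status, backend address to proxy to)."""
    container_id = container_id_from_host(host)
    if container_id is None or container_id not in backends:
        return (404, None)
    if token is None:
        return (401, None)
    backend = backends[container_id]
    if token not in backend.readers:
        return (403, None)
    return (200, backend.address)


backends = {"zzzzz-dz642-000000000000001": Backend("10.0.0.5:8080", frozenset({"token-a"}))}
print(route("zzzzz-dz642-000000000000001.containers.example.com", "token-a", backends))
```

A proxy built this way needs no per-container DNS or port setup; the cost is that only HTTP traffic can be carried.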

 Arbitrary TCP: Would need to apply NAT or connection tunneling to connections on an arbitrary external port that is associated with the container. We don't currently have machinery to do this. Authentication is left up to the service. Can host services that have their own protocols, such as postgresql or ssh. 
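To make the arbitrary-TCP option more concrete, here is a minimal sketch of a user-space port forwarder that accepts connections on an external port and relays bytes to a container's private address. It is one possible approach (a relay rather than kernel-level NAT), uses only the Python standard library @asyncio@ module, and is not existing Arvados machinery; @start_forwarder@ and @pipe@ are hypothetical names. As noted above, it performs no authentication.

```python
import asyncio


async def pipe(reader: asyncio.StreamReader, writer: asyncio.StreamWriter) -> None:
    """Copy bytes from reader to writer until EOF, then close the writer."""
    try:
        while True:
            data = await reader.read(65536)
            if not data:
                break
            writer.write(data)
            await writer.drain()
    except ConnectionError:
        pass  # peer went away; fall through and close our side
    finally:
        writer.close()


async def start_forwarder(listen_host: str, listen_port: int,
                          target_host: str, target_port: int) -> asyncio.AbstractServer:
    """Listen on an external port and relay each connection to the container."""

    async def handle(client_reader, client_writer):
        try:
            upstream_reader, upstream_writer = await asyncio.open_connection(
                target_host, target_port)
        except OSError:
            client_writer.close()  # container not reachable
            return
        # Relay both directions concurrently until either side closes.
        await asyncio.gather(
            pipe(client_reader, upstream_writer),
            pipe(upstream_reader, client_writer),
        )

    return await asyncio.start_server(handle, listen_host, listen_port)
```

For example, @await start_forwarder("0.0.0.0", 15432, "10.0.0.5", 5432)@ would expose a container's PostgreSQL port on external port 15432 (addresses and ports here are made up). One external port has to be allocated per exposed service, which is part of the machinery that does not exist yet.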

 Container shell uses connection tunneling: it makes an HTTP connection and then does a connection upgrade to SSH. This requires special cooperation between arvados-client and ssh, which doesn't generalize. 

 For internal-only connections (between containers), it may be a bit easier to orchestrate arbitrary TCP connections, since no tunneling is required. Authentication is still left up to the container, or requires fiddling with firewall rules on the fly to control who can access the container. 

 h3. Redundancy with other platforms 

 Kubernetes orchestrates services. This feature overlaps with Kubernetes. We don't have the resources to compete with Kubernetes. However, with Arvados as a data analytics platform where scheduling and running code is a core feature, a carefully scoped feature for hosting services could give us some very significant new capability relative to the amount of work. 

 h2. Initial proposal 

 * Running container can expose one or more TCP service ports. 
 * Internal crunch containers should be able to connect to the service. 
 * External clients should be able to connect to the service.