Idea #22580

Updated by Peter Amstutz about 2 months ago

The purpose of @arvbox@ is:

 # to provide a self-contained developer environment capable of running the entire test suite 
 # to enable launching a self-contained, auto-configured cluster that can support integration tests (such as running CWL workflows) and manual testing of components that the end user might interact with, such as Workbench and keep-web. 

 Arvbox overlaps significantly with other functionality, all of which was written after @arvbox@ was created. The approaches taken by @arvbox@ were not intended to be general purpose, whereas the newer methods (mostly based around Ansible) are general purpose, and thus could support a new arvbox. 

 So I'm thinking about how a new iteration of arvbox should work. 

 Current functional overlap: 

 * The arvbox Dockerfile uses @arvados-server install@ plus installs some additional packages, but @arvados-server install@ is redundant with the new Ansible playbook and will be removed (#22436) 
 * arvbox can launch run-tests, but the "test" environment (set up by run-tests) has entirely separate code from the arvbox scripts that create a "development" environment. Having separate binaries depending on how you're running things is a bit confusing. 
 * arvbox has its own code to configure and launch services, which overlaps with code in @run-tests@, @sdk/python/tests/run_test_server.py@, @arvados-server boot@ and the production @systemd@ units 

 A big part of what the arvbox shell script (the one the user interacts with on the host) does is manage the Docker container(s). They are brought up with a particular set of command line options that bind-mount various things into the container, so that those things persist while the container itself can be torn down. 

 One of the reasons for doing it this way was to draw clear lines between what is stateful in the container and what isn't, so if the container environment is modified a certain way that involves changing some part of the file system that isn't preserved, that had better be something that is scripted to be re-configured on the next boot.    It keeps us honest. 
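
 The split between persistent and disposable state described above can be sketched roughly as follows. This is a minimal illustration in Python, not the actual arvbox script: the container name, image name, and all host and container paths are hypothetical, and the only Docker options assumed are @docker run@ with @--detach@, @--name@ and @--volume@.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class BindMount:
    """A host directory that survives teardown of the container."""
    host_path: str
    container_path: str


def docker_run_command(name, image, mounts):
    """Build the argv for starting a container with persistent bind mounts.

    Anything written outside the listed container paths is lost when the
    container is removed, so it must be recreated by boot-time scripts.
    """
    argv = ["docker", "run", "--detach", "--name", name]
    for m in mounts:
        argv += ["--volume", f"{m.host_path}:{m.container_path}"]
    argv.append(image)
    return argv


def is_persistent(path, mounts):
    """Return True if a path inside the container lives on a bind mount."""
    target = PurePosixPath(path)
    for m in mounts:
        root = PurePosixPath(m.container_path)
        if target == root or root in target.parents:
            return True
    return False


# Hypothetical example values, not taken from the real arvbox script.
EXAMPLE_MOUNTS = [
    BindMount("/srv/example/postgres", "/var/lib/postgresql"),
    BindMount("/srv/example/keep", "/var/lib/example-keep"),
]
```

 With these example mounts, @is_persistent("/var/lib/postgresql/data", EXAMPLE_MOUNTS)@ is true, while a change to @/etc/hosts@ is not preserved and would need to be scripted on the next boot.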

 It would be great to be able to offload as much as possible to general purpose Ansible playbooks and other configuration code.    If so, then arvbox2 could focus on virtual environment management and then only needs to launch general purpose "run-tests.sh" or "launch a development arvados cluster" entry points. 

 This brings up questions about what container or VM technology to use.    Ones that we have some experience with include: 

 * Docker (currently used by arvbox) 
 * systemd-nspawn 
 * Singularity (included for completeness)  
 * kvm 
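
 If arvbox2 only does environment management, the choice of technology could sit behind a small dispatch layer. The following is a rough sketch under stated assumptions, covering only Docker and systemd-nspawn: @EnvSpec@, @launch_argv@ and the example values are hypothetical names invented for illustration, and the only command line options assumed are @docker run --detach --name --volume@ and @systemd-nspawn --machine= --directory= --bind= --boot@.

```python
from dataclasses import dataclass, field


@dataclass
class EnvSpec:
    """Backend-neutral description of a development environment (hypothetical)."""
    name: str
    source: str  # image name for Docker, root directory for systemd-nspawn
    binds: dict = field(default_factory=dict)  # host path -> path inside


def docker_argv(spec):
    """Render the spec as a `docker run` command line."""
    argv = ["docker", "run", "--detach", "--name", spec.name]
    for host, inside in spec.binds.items():
        argv += ["--volume", f"{host}:{inside}"]
    return argv + [spec.source]


def nspawn_argv(spec):
    """Render the spec as a `systemd-nspawn` command line that boots init."""
    argv = [
        "systemd-nspawn",
        f"--machine={spec.name}",
        f"--directory={spec.source}",
    ]
    for host, inside in spec.binds.items():
        argv.append(f"--bind={host}:{inside}")
    return argv + ["--boot"]


BACKENDS = {"docker": docker_argv, "systemd-nspawn": nspawn_argv}


def launch_argv(backend, spec):
    """Pick a backend by name and return the command that would start it."""
    if backend not in BACKENDS:
        raise ValueError(f"unsupported backend: {backend}")
    return BACKENDS[backend](spec)
```

 The point of the sketch is that the persistent bind mounts are declared once and each backend only translates them into its own options; the general purpose entry points would then run inside whichever environment was started.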

 h2. Docker 

 pros:  

 * The 

 cons: 

 h2. systemd-nspawn 

 pros: 

 cons: 

 h2. Singularity 

 It's unclear whether its feature set matches what we need. 

 h2. kvm 


 h2. Abstraction layers 

 h3. libvirt and virsh 

 https://ubuntu.com/server/docs/libvirt 

 This is the standard interface for @kvm@, but it also supports @LXC@, a container technology for Linux that has been around since before Docker. 

 h3. Vagrant 

 https://github.com/hashicorp/vagrant 

 Specifically intended to help create developer environments using different container/virtualization technologies, but it now has an icky "Business Source License". 
