Feature #6348

Updated by Tom Clegg over 6 years ago

The instructions at http://doc.arvados.org/install/install-compute-node.html install a bunch of packages from our apt repo. This causes dependency hell when using an older system (or a non-Debian distro) on a compute worker.

The current set of dependencies includes:
* Docker
* Arvados Python SDK Debian package (and its dependencies)

There are three reasons why the Arvados Python SDK is needed on the compute node.
# The ping script given on that install page uses the Python SDK.
# Crunch jobs invoke arv-mount on the host.
# Users can run uncontainerized jobs, and those will work only if the relevant SDKs (typically Python) are installed on the worker host.

For the ping script's purposes, the SDK can be installed like this:
<pre>
sudo apt-get install python-virtualenv python-dev libcurl4-openssl-dev
virtualenv /root/arvados-venv
/root/arvados-venv/bin/pip install arvados-python-client
</pre>

It should be possible to install arv-mount in a virtualenv rather than system-wide, using a similar recipe. However, some effort will probably be needed to make "srun arv-mount" (when invoked from the controller node) run arv-mount from the appropriate virtualenv.
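One possible way to approach the "srun arv-mount" problem is a small wrapper that forwards to the virtualenv's copy of the command. The following is only a sketch under stated assumptions: the virtualenv path is the one from the recipe above, the @ARVADOS_VENV@ variable and the @venv_cmd_path@ and @run_in_venv@ helper names are hypothetical, and the account that runs the job is assumed to have read and execute access to the virtualenv directory.

```shell
#!/bin/sh
# Sketch only: run a command from a virtualenv instead of the system PATH.
# ASSUMPTION: the virtualenv lives at /root/arvados-venv (as in the recipe
# above) unless ARVADOS_VENV (a hypothetical variable) points elsewhere.
ARVADOS_VENV="${ARVADOS_VENV:-/root/arvados-venv}"

# Print the full path of a command inside the virtualenv's bin directory.
venv_cmd_path() {
    printf '%s/bin/%s\n' "$ARVADOS_VENV" "$1"
}

# Run the named command from the virtualenv, passing along all other arguments.
run_in_venv() {
    cmd="$1"
    shift
    "$(venv_cmd_path "$cmd")" "$@"
}
```

With these helpers, a wrapper script named @arv-mount@ placed early in the worker's PATH could end with @run_in_venv arv-mount "$@"@, so that "srun arv-mount" from the controller picks up the virtualenv copy. Whether that fits how crunch invokes arv-mount still needs to be checked.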

Uncontainerized jobs can go away when #6096 is resolved, because enabling default_docker_image_for_jobs will then be a realistic option.

Solving and documenting the details of the above three points, and offering an alternative install recipe with Docker[1] as the only system-wide dependency that isn't already provided by distro vendors, should make it easy to use a much greater variety of Linux distros/versions as worker hosts.

fn1. Docker has its own install docs, so it won't be necessary to use our Debian package repo at all.