[Deployment] [Documentation] Minimize system-wide dependencies for compute node setup
The instructions at http://doc.arvados.org/install/install-compute-node.html install a bunch of packages from our apt repo. This causes dependency hell when using an older system (or a non-Debian distro) as a compute worker. The current set of dependencies includes:
- Arvados Python SDK Debian package (and its dependencies)
  - The ping script given on that install page uses the Python SDK.
  - Crunch jobs invoke arv-mount on the host.
  - Users can run uncontainerized jobs, and those will work only if the relevant SDKs (typically Python) are installed on the worker host.
sudo apt-get install python-virtualenv python-dev libcurl4-openssl-dev
virtualenv /root/arvados-venv
/root/arvados-venv/bin/pip install arvados-python-client
It should be possible to install arv-mount in a virtualenv rather than system-wide, using a similar recipe. However, some effort will probably be needed to make "srun arv-mount" (when invoked from the controller node) run arv-mount from the appropriate virtualenv.
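One possible way to handle the "srun arv-mount" case is a small wrapper placed earlier on the worker's PATH that forwards to the virtualenv copy. The following is only a sketch under assumptions: the ARVADOS_VENV variable, the /opt/arvados-venv default, and the function names are all hypothetical, and it presumes arv-mount has already been installed into that virtualenv at a location the job user can read.

```shell
#!/bin/sh
# Hypothetical wrapper helpers for running arv-mount out of a virtualenv.
# ARVADOS_VENV and its default path are illustrative assumptions, not
# anything defined by Arvados itself.
ARVADOS_VENV="${ARVADOS_VENV:-/opt/arvados-venv}"

# Print the path of a tool inside the virtualenv's bin directory.
venv_tool() {
    printf '%s/bin/%s\n' "$ARVADOS_VENV" "$1"
}

# Replace the current process with the virtualenv's arv-mount,
# passing all arguments through unchanged.
run_arv_mount() {
    exec "$(venv_tool arv-mount)" "$@"
}
```

A wrapper script installed as, say, /usr/local/bin/arv-mount would contain these definitions followed by a final line `run_arv_mount "$@"`, so that "srun arv-mount" picks it up without the controller needing to know where the virtualenv lives.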
Uncontainerized jobs can go away when #6096 is resolved -- enabling default_docker_image_for_jobs will be a realistic option.
Solving and documenting the details of the above three points, and offering an alternative install recipe with Docker [1] as the only system-wide dependency that isn't already provided by distro vendors, should make it easy to use a much greater variety of Linux distros/versions as worker hosts.
[1] Docker has its own install docs, so it won't be necessary to use our Debian package repo at all.