Singularity proof of concept
When we run a container on a compute node, we convert the docker image on the fly to a SIF file and run that with singularity instead. Perhaps we even save the SIF file in Keep and use another Link object to make it findable in the future for the corresponding docker image. TODO: check whether the framework we built for the docker image format v1 -> v2 conversion could be reused here.
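If the SIF file is cached in Keep, crunch-run needs a deterministic name to look it up by. A minimal sketch, assuming we key the cache on the content digest of the docker image tarball (the helper name and naming scheme are illustrations, not the actual implementation):

```shell
# Hypothetical helper: derive a stable SIF cache name from a docker
# image tarball, so a Link object could point at the converted file.
sif_cache_name() {
  local tarball="$1"
  local digest
  digest=$(sha256sum "$tarball" | cut -d' ' -f1)
  echo "docker-${digest}.sif"
}
```

The same digest would go into the Link object's metadata, so a later run with the same image skips the conversion step.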
- global option that switches between docker or singularity runner
- container_request runtime parameters flag that chooses between docker and singularity
- crunch-run gets docker tar file from keep (existing docker v2 format images)
- crunch-run converts docker tar file to SIF:
$ docker save arvados/jobs:latest > arvados-jobs.latest.tar
$ ls -laF arvados-jobs.latest.tar
-rw-r--r-- 1 ward ward 295209984 Jan 14 17:16 arvados-jobs.latest.tar
$ singularity build arvados-jobs.latest.sif docker-archive://arvados-jobs.latest.tar
INFO:    Starting build...
...
- crunch-run executes singularity with mount points, stdout/stderr captured to logs
- slurm dispatcher supports singularity
- ideally the backend container runner should be transparent to the dispatcher
- proof of concept will be tested on 9tee4
- assume that the user id inside the container will be the same as the crunch-run user (?)
- try to support running containers without setuid; identify the specific singularity features that require the setuid binary
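The execution step above (mount points plus stdout/stderr capture) can be sketched as a command builder. Bind paths and the DRYRUN mechanism are assumptions for illustration; `--containall` and `--bind` are documented singularity 3.x flags, and `--containall` also stops singularity from auto-binding `$HOME`:

```shell
# Minimal sketch of the singularity invocation crunch-run would build.
run_in_sif() {
  local sif="$1"; shift
  # Assemble the full command as the function's positional parameters.
  set -- singularity exec --containall \
    --bind /var/lib/crunch/keep:/keep \
    --bind /var/lib/crunch/tmp:/tmp \
    "$sif" "$@"
  if [ -n "$DRYRUN" ]; then
    echo "$*"                          # show the command instead of running it
  else
    "$@" > stdout.log 2> stderr.log    # capture stdout/stderr to log files
  fi
}
```

Usage: `DRYRUN=1 run_in_sif arvados-jobs.latest.sif /bin/true` prints the command line for inspection; without DRYRUN it runs the container and writes the logs.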
Testing goals / acceptance criteria
- MVP: runs a container
- default value for the singularity binary path (/usr/bin/singularity), but it can be changed from the arvados config.yml
- captures stdout/stderr to logs
- can bind-mount arv-mount inside the container
- can bind-mount tmp/output directories inside the container
- output files have proper permissions to be read for upload & cleaned up (deleted) by crunch-run
- see if it makes sense to have singularity mock the docker API
- should have test coverage of singularity features similar to what exists for the Docker features
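A hypothetical config.yml fragment for the runner switch and the configurable binary path (the key names and the "zzzzz" cluster id are assumptions for illustration, not the actual Arvados schema):

```yaml
Clusters:
  zzzzz:
    Containers:
      # hypothetical keys for the proof of concept; defaults shown
      RuntimeEngine: singularity               # or "docker"
      SingularityBinary: /usr/bin/singularity
```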
For future tickets:
- memory / CPU constraints
#14 Updated by Nico César about 1 month ago
Very first take: trying to create an abstraction that matches 1:1 with docker for now.
- [DONE] 0a27815bd review ContainerConfig, HostConfig settings and add them to the ThinContainerExecRunner interface as Get/Set methods to abstract from the internal representation
- review all the networking-related options and see if they can be simplified
- make a run with crunch-run --container-runner singularity to see how it behaves
- add tests related to singularity
#17 Updated by Nico César about 1 month ago
As I'm reading all the documentation available about singularity I want to write down some notes:
It is also important to note that the philosophy of Singularity is Integration over Isolation. Most container run times strive to isolate your container from the host system and other containers as much as possible. Singularity, on the other hand, assumes that the user’s primary goals are portability, reproducibility, and ease of use and that isolation is often a tertiary concern.
Therefore, Singularity only isolates the mount namespace by default, and will bind mount several host directories such as $HOME and /tmp into the container at runtime. If needed, additional levels of isolation can be achieved by passing options causing Singularity to enter any or all of the other kernel namespaces and to prevent automatic bind mounting. These measures allow users to interact with the host system from within the container in sensible ways.
(taken from https://sylabs.io/guides/3.7/user-guide/security.html )
I see a potential problem here: since singularity tries to transparently incorporate host files into the container, this could cause problems if crunch-run runs everything as the same user, possibly in a shared environment.
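One way to address this is to opt out of the automatic bind mounts whenever crunch-run builds the command line. A hypothetical helper (the function and its "strict" knob are illustrations; `--containall` and `--no-home` are documented singularity 3.x flags):

```shell
# Hypothetical helper: pick singularity isolation flags so host files
# like $HOME and /tmp are not leaked into the container by default.
isolation_flags() {
  if [ "$1" = "strict" ]; then
    echo "--containall --no-home"   # contain namespaces, env, and $HOME
  else
    echo "--no-home"                # at minimum, avoid leaking $HOME
  fi
}
```

Even the non-strict default keeps `$HOME` out of the container, which matters on a shared compute node where several containers run as the same crunch-run user.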