Crunch2 installation » History » Version 11

Tom Clegg, 06/20/2016 09:29 PM

h1. Crunch2 installation

(DRAFT -- when ready, this will move to doc.arvados.org→install)

{{toc}}

h2. Set up a crunch-dispatch service

Currently, dispatching containers via SLURM is supported.

Install crunch-dispatch-slurm on a node that can submit SLURM jobs. This can be any node appropriately configured to connect to the SLURM controller node.

<pre><code class="shell">
sudo apt-get install crunch-dispatch-slurm
</code></pre>

Create a privileged Arvados API token for use by the dispatcher. If you have multiple dispatch processes, you should give each one a different token.

<pre><code class="shell">
apiserver:~$ cd /var/www/arvados-api/current
apiserver:/var/www/arvados-api/current$ sudo -u webserver-user RAILS_ENV=production bundle exec script/create_superuser_token.rb
zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz
</code></pre>

Save the token on the dispatch node, in <code>/etc/sv/crunch-dispatch-slurm/env/ARVADOS_API_TOKEN</code>.
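
One way to write that file is sketched below. It relies on @chpst -e@ (used in the example run script) treating each file in the env directory as a variable named after the file, with the file's first line as its value. The helper name @save_token@ and the owner-only permissions are illustrative assumptions, not something the Arvados packages provide.

```shell
#!/bin/sh
# save_token ENV_DIR TOKEN
# Hypothetical helper: writes TOKEN to ENV_DIR/ARVADOS_API_TOKEN so that
# `chpst -e ENV_DIR` exports it as $ARVADOS_API_TOKEN.
save_token() {
  env_dir=$1
  token=$2
  (
    umask 077                      # anything created below is owner-only
    mkdir -p "$env_dir"
    printf '%s\n' "$token" > "$env_dir/ARVADOS_API_TOKEN"
    chmod 600 "$env_dir/ARVADOS_API_TOKEN"   # tighten a pre-existing file too
  )
}

# Example, run as root on the dispatch node (replace TOKEN with the real value):
#   save_token /etc/sv/crunch-dispatch-slurm/env TOKEN
```

Keeping the token in a file rather than in the run script keeps it out of the script itself and out of the process command line.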

Example runit script (@/etc/sv/crunch-dispatch-slurm/run@):

<pre><code class="shell">
#!/bin/sh
set -e
exec 2>&1

export ARVADOS_API_HOST=uuid_prefix.your.domain

exec chpst -e ./env -u crunch crunch-dispatch-slurm
</code></pre>

Example runit logging script (@/etc/sv/crunch-dispatch-slurm/log/run@):

<pre><code class="shell">
#!/bin/sh
set -e
[ -d main ] || mkdir main
exec svlogd -tt ./main
</code></pre>

Ensure the @crunch@ user on the dispatch node can run Docker containers on SLURM compute nodes via @srun@ or @sbatch@. Depending on your SLURM installation, this may require that the @crunch@ user exist -- and have the same UID, GID, and home directory -- on the dispatch node and all SLURM compute nodes.

For example, this should print "OK" (possibly after some extra status/debug messages from SLURM and Docker):

<pre>
crunch@dispatch:~$ srun -N1 docker run busybox echo OK
</pre>
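
If that test fails with permission or home-directory errors, it can help to compare the @crunch@ account on both sides. The following is a minimal sketch, assuming @getent@ is available on every node; the function names are hypothetical.

```shell
#!/bin/sh
# passwd_identity: read one passwd(5) line on stdin and print "UID:GID:HOME"
# (fields 3, 4 and 6).
passwd_identity() {
  cut -d: -f3,4,6
}

# check_crunch_identity: compare the crunch account on this node with the
# account seen by one SLURM compute node. Hypothetical helper; it is only
# defined here, not called.
check_crunch_identity() {
  local_id=$(getent passwd crunch | passwd_identity)
  remote_id=$(srun -N1 getent passwd crunch | passwd_identity)
  if [ -n "$local_id" ] && [ "$local_id" = "$remote_id" ]; then
    echo "OK: $local_id"
  else
    echo "MISMATCH: local=$local_id remote=$remote_id" >&2
    return 1
  fi
}
```

Running @check_crunch_identity@ on the dispatch node checks only whichever compute node SLURM picks, so it would need to be repeated per node (for example with srun's @--nodelist@ option) to cover the whole cluster.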

h2. Install crunch-run on all compute nodes

<pre><code class="shell">
sudo apt-get install crunch-run
</code></pre>

h2. Enable cgroup accounting on all compute nodes

(This requirement isn't new for crunch2/containers, but it seems to be a FAQ. The Docker install guide describes it as optional and performance-degrading, so it's not too surprising if people skip it. Perhaps we should say why/when it's a good idea to enable it?)

Check https://docs.docker.com/engine/installation/linux/ for instructions specific to your distribution.

For example, on Ubuntu:

# Update @/etc/default/grub@ to include: <pre>
GRUB_CMDLINE_LINUX="cgroup_enable=memory swapaccount=1"
</pre>
# Run @sudo update-grub@
# Reboot
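
After the reboot, the running kernel's command line should contain both parameters. A small sketch of that check follows; it assumes the parameters appear in @/proc/cmdline@ exactly as written in the GRUB setting, and the function name is a made-up helper.

```shell
#!/bin/sh
# has_cgroup_accounting CMDLINE
# Succeeds only if CMDLINE contains both kernel parameters from the GRUB step
# as whole, space-separated words.
has_cgroup_accounting() {
  case " $1 " in
    *" cgroup_enable=memory "*) ;;
    *) return 1 ;;
  esac
  case " $1 " in
    *" swapaccount=1 "*) ;;
    *) return 1 ;;
  esac
}

# Usage on a compute node:
#   has_cgroup_accounting "$(cat /proc/cmdline)" && echo "cgroup accounting enabled"
```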

h2. Configure Docker

Unchanged from current docs.

h2. Test the dispatcher

On the dispatch node, monitor the crunch-dispatch-slurm logs.

<pre><code class="shell">
dispatch-node$ tail -F /etc/sv/crunch-dispatch-slurm/log/main/current
</code></pre>

(TODO: Add example startup logs from crunch-dispatch-slurm)

On a shell VM, install a Docker image for testing.

<pre><code class="shell">
user@shellvm:~$ arv keep docker busybox
</code></pre>

(TODO: Add example log/debug messages)

On a shell VM, run a trivial container.

<pre><code class="shell">
user@shellvm:~$ arv container_request create --container-request '{
  "name":            "test",
  "state":           "Committed",
  "priority":        1,
  "container_image": "busybox",
  "command":         ["true"],
  "output_path":     "/out",
  "mounts": {
    "/out": {
      "kind":        "tmp",
      "capacity":    1000
    }
  }
}'
</code></pre>

Measures of success:

* Dispatcher log entries will indicate it has submitted a SLURM job. (TODO: Add example logs.)
* Before the container finishes, SLURM's @squeue@ command will show the new job in the list of queued/running jobs. (TODO: Add squeue output, showing how containers look there.)
* After the container finishes, @arv container list --limit 1@ will indicate the outcome: <pre>
{
 ...
 "exit_code":0,
 ...
 "state":"Complete",
 ...
}
</pre>
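
To script the last check, the JSON can be tested for the two fields shown above. The sketch below uses only @grep@ and assumes the output describes a single container (as with @--limit 1@); the function name is hypothetical, and a JSON-aware tool would be more robust than pattern matching.

```shell
#!/bin/sh
# container_succeeded: read container JSON on stdin; succeed only if it
# reports state "Complete" and exit_code 0.
container_succeeded() {
  json=$(cat)
  printf '%s\n' "$json" | grep -Eq '"state"[[:space:]]*:[[:space:]]*"Complete"' &&
  printf '%s\n' "$json" | grep -Eq '"exit_code"[[:space:]]*:[[:space:]]*0([^0-9]|$)'
}

# Usage on a shell VM:
#   arv container list --limit 1 | container_succeeded && echo "container succeeded"
```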