Migrating from arvados-node-manager to arvados-dispatch-cloud » History » Version 2

Tom Clegg, 02/11/2019 04:05 PM

h1. Migrating from arvados-node-manager to arvados-dispatch-cloud

{{toc}}

h2. Choose a node

The dispatch service can run on any host that can connect to the Arvados API service, the cloud provider's API, and the SSH service on cloud VMs. In the example below, it runs on the same node as the API server and controller.

h2. Update cluster configuration file

In @/etc/arvados/config.yml@, add configuration items for the dispatch service.

<pre><code class="yaml">
Clusters:
  uuid_prefix:
    CloudVMs:
      BootProbeCommand: "mount | grep /mnt/scratch"
      SSHPort: "2222"
      SyncInterval: 1m
      TimeoutIdle: 2m
      TimeoutBooting: 10m
      TimeoutProbe: 5m
      TimeoutShutdown: 30s
      ImageID: "image-12345678"
      Driver: Azure
      DriverParameters:
        # before #14745:
        subscription_id: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
        key: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
        secret: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
        tenant_id: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
        cloud_environment: AzurePublicCloud
        resource_group: zzzzz
        region: centralus
        network: zzzzz
        subnet: zzzzz-subnet-private
        storage_account: example
        blob_container: vhds
        delete_dangling_resources_after: 20
        # after #14745:
        SubscriptionID: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
        ClientID: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
        ClientSecret: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
        TenantID: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
        CloudEnv: AzurePublicCloud
        ResourceGroup: zzzzz
        Location: centralus
        Network: zzzzz
        Subnet: zzzzz-subnet-private
        StorageAccount: example
        BlobContainer: vhds
        DeleteDanglingResourcesAfter: 20
    Dispatch:
      PrivateKey: "..."
      StaleLockTimeout: 1m
      PollInterval: 10s
      ProbeInterval: 10s
      MaxProbesPerSecond: 10
    InstanceTypes:
      x1lg:
        ProviderType: x1.large
        VCPUs: 16
        RAM: 128G
        Scratch: 128G
        Price: 1.23
    ManagementToken: "example-secret-management-token"
    NodeProfiles:
      apiserver:                       # references ARVADOS_NODE_PROFILE in environment file (see below).
        arvados-dispatch-cloud:
          Listen: ":9005"
</code></pre>

Create the host configuration file @/etc/arvados/environment@.

<pre>
ARVADOS_NODE_PROFILE=apiserver
</pre>

h2. Stop crunch-dispatch-slurm

Stop and disable the crunch-dispatch-slurm service, and uninstall the package to make sure it doesn't start after the next reboot/upgrade.

<pre>
# systemctl stop crunch-dispatch-slurm
# systemctl disable crunch-dispatch-slurm
# apt-get remove crunch-dispatch-slurm
</pre>

Containers that have already been locked and submitted to SLURM will make their way through the SLURM queue, but newly queued containers will be left for arvados-dispatch-cloud to run.
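The drain described above can be watched from the shell. The following is a minimal sketch, not part of the original procedure: @slurm_jobs_remaining@ and @slurm_drained@ are hypothetical helper names, and the sketch assumes the SLURM queue on this cluster holds only jobs submitted by crunch-dispatch-slurm.

<pre>
# Hypothetical helper (not part of Arvados): count jobs still in the SLURM queue.
slurm_jobs_remaining() {
    # -h suppresses the header line, so each output line is one job.
    squeue -h | wc -l | tr -d ' '
}

# Hypothetical helper: succeed only when the old dispatcher is stopped
# and no jobs are left in the SLURM queue.
slurm_drained() {
    if systemctl is-active --quiet crunch-dispatch-slurm; then
        echo "crunch-dispatch-slurm is still active" >&2
        return 1
    fi
    remaining=$(slurm_jobs_remaining)
    echo "jobs remaining in SLURM queue: $remaining"
    [ "$remaining" -eq 0 ]
}
</pre>

For example, @until slurm_drained; do sleep 60; done@ would block until the old dispatcher is stopped and the queue is empty.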

h2. Install arvados-dispatch-cloud

<pre>
# apt-get install arvados-dispatch-cloud
</pre>

h2. Verify the service is running

<pre>
$ token="example-secret-management-token"
$ curl -H "Authorization: Bearer $token" http://localhost:9005/metrics
</pre>
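To script this check rather than read the metrics output by eye, one option is to look only at the HTTP status code. This is a minimal sketch: @metrics_ok@ is a hypothetical helper name, and it assumes that a working service with a valid management token answers with status 200.

<pre>
# Hypothetical helper (not part of Arvados): succeed if the metrics
# endpoint answers with HTTP 200.
metrics_ok() {
    url="$1"
    token="$2"
    # -s silences progress output, -o /dev/null discards the response body,
    # and -w '%{http_code}' prints only the numeric status code.
    status=$(curl -s -o /dev/null -w '%{http_code}' \
        -H "Authorization: Bearer $token" "$url")
    echo "HTTP status: $status"
    [ "$status" = "200" ]
}
</pre>

For example, @metrics_ok http://localhost:9005/metrics "$token" && echo "dispatcher is responding"@. Any other status, including the @000@ that curl reports when it cannot connect, makes the helper return non-zero.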

h2. Verify the service is functional