Migrating from arvados-node-manager to arvados-dispatch-cloud » History » Version 3

Tom Clegg, 02/11/2019 07:39 PM

h1. Migrating from arvados-node-manager to crunch-dispatch-cloud

{{toc}}

h2. Choose a node

The dispatch service can run on any host that can connect to the Arvados API service, the cloud provider's API, and the SSH service on cloud VMs. In the following example it runs on the same node as the API server and controller.
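
The connectivity requirement above can be checked from a candidate host before installing anything. The following is a minimal sketch, not part of Arvados: the function names @can_connect@ and @check_endpoints@ are hypothetical, every hostname and port in the usage note is a placeholder, and it assumes @bash@ (for @/dev/tcp@) and the coreutils @timeout@ command are available.

```shell
# Hypothetical helper: succeed if a TCP connection to host:port opens within 5 seconds.
can_connect() {
  local host="$1" port="$2"
  timeout 5 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Report each endpoint given as host:port; return non-zero if any is unreachable.
check_endpoints() {
  local rc=0 endpoint
  for endpoint in "$@"; do
    if can_connect "${endpoint%:*}" "${endpoint##*:}"; then
      echo "ok   ${endpoint}"
    else
      echo "FAIL ${endpoint}"
      rc=1
    fi
  done
  return "$rc"
}
```

For example, @check_endpoints api.example.com:443 cloud-api.example.com:443 10.0.0.5:2222@ would test a placeholder API server, a placeholder cloud provider endpoint, and the SSH port of one placeholder cloud VM.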

h2. Update cluster configuration file

In @/etc/arvados/config.yml@, add configuration items for the dispatch service.
<pre><code class="yaml">
Clusters:
  uuid_prefix:
    CloudVMs:
      BootProbeCommand: "mount | grep /mnt/scratch"
      SSHPort: "2222"
      SyncInterval: 1m
      TimeoutIdle: 2m
      TimeoutBooting: 10m
      TimeoutProbe: 5m
      TimeoutShutdown: 30s
      ImageID: "image-12345678"
      Driver: Azure
      DriverParameters:
        SubscriptionID: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
        subscription_id: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX        # not needed after #14745
        ClientID: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
        key: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX                    # not needed after #14745 (same value as ClientID)
        ClientSecret: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
        secret: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX         # not needed after #14745 (same value as ClientSecret)
        TenantID: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
        tenant_id: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX              # not needed after #14745
        CloudEnv: AzurePublicCloud
        cloud_environment: AzurePublicCloud                          # not needed after #14745
        ResourceGroup: zzzzz
        resource_group: zzzzz
        Location: centralus
        region: centralus                                            # not needed after #14745 (same value as Location)
        Network: zzzzz
        Subnet: zzzzz-subnet-private
        StorageAccount: example
        storage_account: example                                     # not needed after #14745
        BlobContainer: vhds
        blob_container: vhds                                         # not needed after #14745
        DeleteDanglingResourcesAfter: 20
        delete_dangling_resources_after: 20                          # not needed after #14745
    Dispatch:
      PrivateKey: "..."
      StaleLockTimeout: 1m
      PollInterval: 10s
      ProbeInterval: 10s
      MaxProbesPerSecond: 10
    InstanceTypes:
      x1lg:
        ProviderType: x1.large
        VCPUs: 16
        RAM: 128G
        Scratch: 128G
        Price: 1.23
    ManagementToken: "example-secret-management-token"
    NodeProfiles:
      apiserver:                       # references ARVADOS_NODE_PROFILE in environment file (see below).
        arvados-dispatch-cloud:
          Listen: ":9005"
</code></pre>

Create the host configuration file @/etc/arvados/environment@.
<pre>
ARVADOS_NODE_PROFILE=apiserver
</pre>

h2. Stop crunch-dispatch-slurm

Stop and disable the crunch-dispatch-slurm service, and uninstall the package to make sure it doesn't start after the next reboot/upgrade.
<pre>
# systemctl stop crunch-dispatch-slurm
# systemctl disable crunch-dispatch-slurm
# apt-get remove crunch-dispatch-slurm
</pre>

Containers that have already been locked and submitted to SLURM will make their way through the SLURM queue, but newly queued containers will be left for crunch-dispatch-cloud to run.
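
One way to tell when the already-submitted containers have finished is to poll the SLURM queue until it is empty. This is a minimal sketch under stated assumptions: @squeue@ is on the path, every job left in the queue is an Arvados container, and the function names and the 30-second default interval are arbitrary choices rather than anything Arvados provides.

```shell
# Print the number of jobs still in the SLURM queue; fail if squeue itself fails.
slurm_jobs_remaining() {
  local listing
  listing="$(squeue --noheader)" || return 1
  # grep -c prints the count of non-empty lines; it exits non-zero on a count of 0, hence "|| true".
  printf '%s\n' "$listing" | grep -c . || true
}

# Poll until the queue is empty; the interval in seconds is an optional first argument.
wait_for_slurm_drain() {
  local interval="${1:-30}" remaining
  while true; do
    remaining="$(slurm_jobs_remaining)" || { echo "squeue failed" >&2; return 1; }
    if [ "$remaining" -eq 0 ]; then
      break
    fi
    echo "waiting for ${remaining} SLURM job(s) to finish"
    sleep "$interval"
  done
  echo "SLURM queue is empty"
}
```

Running @wait_for_slurm_drain 60@ on a SLURM node would check once a minute and return when no jobs remain.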

h2. Install crunch-dispatch-cloud

<pre>
# apt-get install crunch-dispatch-cloud
</pre>

h2. Verify the service is running

<pre>
$ token="example-secret-management-token"
$ curl -H "Authorization: Bearer $token" http://localhost:9005/metrics
</pre>
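
The request above can be turned into a scripted check by inspecting the HTTP status code. This is a minimal sketch, assuming the @Listen@ address and management token from the example configuration; the function names @metrics_status@ and @check_dispatcher@ are hypothetical, and treating anything other than 200 as a failure is an assumption rather than documented behavior.

```shell
# Print the HTTP status code returned by the metrics endpoint (curl reports 000 if no response arrives).
metrics_status() {
  local token="$1" url="${2:-http://localhost:9005/metrics}"
  curl --silent --output /dev/null --write-out '%{http_code}' \
    -H "Authorization: Bearer ${token}" "$url"
}

# Succeed only when the endpoint answers 200.
check_dispatcher() {
  local status
  status="$(metrics_status "$@")" || true
  if [ "$status" = "200" ]; then
    echo "dispatcher is responding"
  else
    echo "unexpected HTTP status: ${status:-none}" >&2
    return 1
  fi
}
```

For example, @check_dispatcher "example-secret-management-token"@ exits non-zero when the service is down or rejects the token, which makes it usable from a monitoring script.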

h2. Verify the service is functional