Migrating from arvados-node-manager to arvados-dispatch-cloud » History » Version 2

Tom Clegg, 02/11/2019 04:05 PM

h1. Migrating from arvados-node-manager to crunch-dispatch-cloud

{{toc}}

h2. Choose a node

The dispatch service can run on any host that can connect to the Arvados API service, the cloud provider's API, and the SSH service on cloud VMs. In the following example it runs on the same node as the API server and controller.

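The connectivity requirement can be checked from a candidate host before anything is installed. The following is a minimal sketch, assuming bash (for its @/dev/tcp@ redirection) and the coreutils @timeout@ command are available. The host names in the usage comments are hypothetical placeholders, and port 2222 is simply the @SSHPort@ value used in the example configuration.

```shell
# check_port HOST PORT: succeed if a TCP connection to HOST:PORT opens within 3 seconds.
# Relies on bash's built-in /dev/tcp redirection, so no extra network tools are needed.
check_port() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Example usage (all host names and addresses below are hypothetical placeholders):
#   check_port api.zzzzz.example.com 443 && echo "Arvados API reachable"
#   check_port cloud-api.example.com 443 && echo "cloud provider API reachable"
#   check_port 10.0.0.5 2222             && echo "cloud VM SSH reachable"
```

A successful TCP connection only shows that the port is open; it does not confirm that credentials or SSH keys are accepted.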

h2. Update cluster configuration file

In @/etc/arvados/config.yml@, add configuration items for the dispatch service.

<pre><code class="yaml">
Clusters:
  uuid_prefix:
    CloudVMs:
      BootProbeCommand: "mount | grep /mnt/scratch"
      SSHPort: "2222"
      SyncInterval: 1m
      TimeoutIdle: 2m
      TimeoutBooting: 10m
      TimeoutProbe: 5m
      TimeoutShutdown: 30s
      ImageID: "image-12345678"
      Driver: Azure
      DriverParameters:
        # Key names before #14745:
        subscription_id: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
        key: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
        secret: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
        tenant_id: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
        cloud_environment: AzurePublicCloud
        resource_group: zzzzz
        region: centralus
        network: zzzzz
        subnet: zzzzz-subnet-private
        storage_account: example
        blob_container: vhds
        delete_dangling_resources_after: 20
        # Key names after #14745:
        SubscriptionID: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
        ClientID: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
        ClientSecret: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
        TenantID: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
        CloudEnv: AzurePublicCloud
        ResourceGroup: zzzzz
        Location: centralus
        Network: zzzzz
        Subnet: zzzzz-subnet-private
        StorageAccount: example
        BlobContainer: vhds
        DeleteDanglingResourcesAfter: 20
    Dispatch:
      PrivateKey: "..."
      StaleLockTimeout: 1m
      PollInterval: 10s
      ProbeInterval: 10s
      MaxProbesPerSecond: 10
    InstanceTypes:
      x1lg:
        ProviderType: x1.large
        VCPUs: 16
        RAM: 128G
        Scratch: 128G
        Price: 1.23
    ManagementToken: "example-secret-management-token"
    NodeProfiles:
      apiserver:                       # references ARVADOS_NODE_PROFILE in environment file (see below).
        arvados-dispatch-cloud:
          Listen: ":9005"
</code></pre>

Create the host configuration file @/etc/arvados/environment@.

<pre>
ARVADOS_NODE_PROFILE=apiserver
</pre>

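Because the profile name in @/etc/arvados/environment@ has to match a key under @NodeProfiles@ in @/etc/arvados/config.yml@, a quick consistency check can catch typos. The sketch below is an illustration only: @profile_in_config@ is a hypothetical helper, and it does a plain text match rather than parsing the YAML, so it cannot tell which section a matching key belongs to.

```shell
# profile_in_config ENV_FILE CONFIG_FILE: succeed if the profile named by
# ARVADOS_NODE_PROFILE in ENV_FILE appears as an indented "name:" key in CONFIG_FILE.
# This is a text match, not a YAML parse, so it only catches obvious mismatches.
profile_in_config() {
  local profile
  profile=$(sed -n 's/^ARVADOS_NODE_PROFILE=//p' "$1" | head -n 1)
  [ -n "$profile" ] || return 1
  grep -Eq "^[[:space:]]+${profile}:" "$2"
}

# Example usage:
#   profile_in_config /etc/arvados/environment /etc/arvados/config.yml \
#     || echo "node profile not found in config.yml" >&2
```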

h2. Stop crunch-dispatch-slurm

Stop and disable the crunch-dispatch-slurm service, then uninstall the package so that it cannot start again after the next reboot or upgrade.

<pre>
# systemctl stop crunch-dispatch-slurm
# systemctl disable crunch-dispatch-slurm
# apt-get remove crunch-dispatch-slurm
</pre>

Containers that have already been locked and submitted to SLURM will make their way through the SLURM queue, but newly queued containers will be left for crunch-dispatch-cloud to run.

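To see how many previously submitted jobs are still working through SLURM, the queue can be counted with @squeue@. This is a rough sketch, assuming @squeue@ is on the path and that every job it lists was submitted by crunch-dispatch-slurm; @slurm_jobs_remaining@ is a hypothetical helper name. If @squeue@ itself fails, the count reads as 0, so treat an empty result with some caution.

```shell
# slurm_jobs_remaining: print the number of jobs currently listed by squeue.
# --noheader suppresses the column header, so each output line is one job.
slurm_jobs_remaining() {
  squeue --noheader | wc -l
}

# Example usage: poll once a minute until the queue is empty.
#   while [ "$(slurm_jobs_remaining)" -gt 0 ]; do sleep 60; done
```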

h2. Install crunch-dispatch-cloud

<pre>
# apt-get install crunch-dispatch-cloud
</pre>

h2. Verify the service is running

<pre>
$ token="example-secret-management-token"
$ curl -H "Authorization: Bearer $token" http://localhost:9005/metrics
</pre>

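For scripted checks, the same request can be wrapped so that the exit status reflects whether the endpoint answered successfully. This is a minimal sketch, assuming @curl@ is installed; @metrics_ok@ is a hypothetical helper name, and the URL and token in the usage comment are the example values from this page.

```shell
# metrics_ok URL TOKEN: succeed only if the endpoint answers with an HTTP success status.
# -f makes curl exit non-zero on HTTP error responses, -sS hides the progress meter
# but still prints errors, and --max-time bounds how long the check can take.
metrics_ok() {
  curl -fsS --max-time 5 -o /dev/null -H "Authorization: Bearer $2" "$1"
}

# Example usage:
#   if metrics_ok http://localhost:9005/metrics "example-secret-management-token"; then
#     echo "dispatcher is answering"
#   else
#     echo "dispatcher is not answering" >&2
#   fi
```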

h2. Verify the service is functional