Revision 1 (Tom Clegg, 02/07/2019 08:07 PM) → Revision 2/22 (Tom Clegg, 02/11/2019 04:05 PM)

h1. Migrating from arvados-node-manager to crunch-dispatch-cloud 

 {{toc}} 

 h2. Choose a node 

 The dispatch service can run on any host that can connect to the Arvados API service, the cloud provider's API, and the SSH service on cloud VMs. In the following example it runs on the same node as the API server and controller. 
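Before settling on a node, it may help to confirm that the candidate host can actually open the three kinds of connections listed above. The following is a minimal sketch, not part of the Arvados tooling: @check_reachable@ is a hypothetical helper, the hostnames in the usage comments are placeholders, and it assumes an @nc@ (netcat) build that supports the @-z@ and @-w@ options.

```shell
# check_reachable LABEL HOST PORT
# Report whether a TCP connection to HOST:PORT can be opened from this host.
# Assumes a netcat that supports -z (connect without sending data) and
# -w (timeout in seconds).
check_reachable() {
    label=$1
    host=$2
    port=$3
    if nc -z -w 5 "$host" "$port" >/dev/null 2>&1; then
        echo "ok: $label ($host:$port)"
    else
        echo "FAILED: $label ($host:$port)"
        return 1
    fi
}

# Example usage (all hostnames and addresses are placeholders):
# check_reachable "Arvados API" zzzzz.example.com 443
# check_reachable "cloud provider API" management.azure.com 443
# check_reachable "SSH on a cloud VM" 10.0.0.5 2222
```

A successful TCP connection only shows that the port is reachable; it does not prove that credentials or SSH keys are accepted.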

 h2. Update cluster configuration file 

 In @/etc/arvados/config.yml@, add configuration items for the dispatch service. 

 <pre><code class="yaml"> 
 Clusters: 
   uuid_prefix: 
     CloudVMs: 
       BootProbeCommand: "mount | grep /mnt/scratch" 
       SSHPort: "2222" 
       SyncInterval: 1m 
       TimeoutIdle: 2m 
       TimeoutBooting: 10m 
       TimeoutProbe: 5m 
       TimeoutShutdown: 30s 
       ImageID: "image-12345678" 
       Driver: Azure 
       DriverParameters: 
         # if your version predates #14745, use these key names: 
         subscription_id: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX 
         key: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX 
         secret: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 
         tenant_id: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX 
         cloud_environment: AzurePublicCloud 
         resource_group: zzzzz 
         region: centralus 
         network: zzzzz 
         subnet: zzzzz-subnet-private 
         storage_account: example 
         blob_container: vhds 
         delete_dangling_resources_after: 20 
         # if your version includes #14745, use these key names instead: 
         SubscriptionID: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX 
         ClientID: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX 
         ClientSecret: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 
         TenantID: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX 
         CloudEnv: AzurePublicCloud 
         ResourceGroup: zzzzz 
         Location: centralus 
         Network: zzzzz 
         Subnet: zzzzz-subnet-private 
         StorageAccount: example 
         BlobContainer: vhds 
         DeleteDanglingResourcesAfter: 20 ... 
     Dispatch: 
       PrivateKey: "..." 
       StaleLockTimeout: 1m 
       PollInterval: 10s 
       ProbeInterval: 10s 
       MaxProbesPerSecond: 10 
     InstanceTypes: 
       x1lg: 
         ProviderType: x1.large 
         VCPUs: 16 
         RAM: 128G 
         Scratch: 128G 
         Price: 1.23 
     ManagementToken: "example-secret-management-token" 
     NodeProfiles: 
       apiserver:                         # must match the value of ARVADOS_NODE_PROFILE in the environment file (see below). 
         arvados-dispatch-cloud: 
           Listen: ":9005" 
 </code></pre> 

 Create the host configuration file @/etc/arvados/environment@. 

 <pre> 
 ARVADOS_NODE_PROFILE=apiserver 
 </pre> 
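Because the profile name in the environment file has to match a key under @NodeProfiles@ in the cluster configuration, a quick check of what the file actually sets can catch typos. This is only a sketch: @node_profile@ is a hypothetical helper, and it assumes the file uses plain @NAME=value@ lines as in the example above.

```shell
# node_profile FILE
# Print the value assigned to ARVADOS_NODE_PROFILE in FILE.
# If the variable is set more than once, the last assignment is printed.
node_profile() {
    sed -n 's/^ARVADOS_NODE_PROFILE=//p' "$1" | tail -n 1
}

# Example usage:
# node_profile /etc/arvados/environment
```

With the environment file shown above, the printed value should be the same name used as the key under @NodeProfiles@ (here, @apiserver@).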

 h2. Stop crunch-dispatch-slurm 

 Stop and disable the crunch-dispatch-slurm service, then uninstall the package so that it cannot start again after the next reboot or upgrade. 

 <pre> 
 # systemctl stop crunch-dispatch-slurm 
 # systemctl disable crunch-dispatch-slurm 
 # apt-get remove crunch-dispatch-slurm 
 </pre> 
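To confirm that the old dispatcher really is out of the way, the unit state can be queried with @systemctl is-active@ and @systemctl is-enabled@. The sketch below wraps both checks in a hypothetical helper, @service_is_off@; it assumes that a unit which has been removed entirely is reported as neither active nor enabled.

```shell
# service_is_off UNIT
# Succeed only if UNIT is neither running nor enabled to start at boot.
service_is_off() {
    unit=$1
    if systemctl is-active --quiet "$unit"; then
        echo "$unit is still running"
        return 1
    fi
    if systemctl is-enabled --quiet "$unit" 2>/dev/null; then
        echo "$unit is still enabled"
        return 1
    fi
    echo "$unit is stopped and disabled"
}

# Example usage:
# service_is_off crunch-dispatch-slurm
```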

 Containers that have already been locked and submitted to SLURM will make their way through the SLURM queue, but newly queued containers will be left for crunch-dispatch-cloud to run. 

 h2. Install crunch-dispatch-cloud 

 <pre> 
 # apt-get install crunch-dispatch-cloud 
 </pre> 

 h2. Verify the service is running 

 <pre> 
 $ token="example-secret-management-token" 
 $ curl -H "Authorization: Bearer $token" http://localhost:9005/metrics 
 </pre> 
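The @curl@ command above prints the metrics body, which can be long. For a quick pass/fail check, one option is to look only at the HTTP status code. This is a sketch under assumptions: @metrics_status@ and @metrics_ok@ are hypothetical helpers, and it assumes that the service answers a valid request with HTTP 200.

```shell
# metrics_status TOKEN [URL]
# Print the HTTP status code returned by the metrics endpoint.
# curl prints 000 if no HTTP response was received at all.
metrics_status() {
    token=$1
    url=${2:-http://localhost:9005/metrics}
    curl -s -o /dev/null -w '%{http_code}' \
        -H "Authorization: Bearer $token" "$url"
}

# metrics_ok TOKEN [URL]
# Succeed only if the endpoint answers with HTTP 200.
metrics_ok() {
    [ "$(metrics_status "$@")" = "200" ]
}

# Example usage:
# metrics_ok "example-secret-management-token" && echo "dispatcher is answering"
```

Any other status code suggests that the service is not listening on the configured port or that the token does not match @ManagementToken@ in the cluster configuration.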

 h2. Verify the service is functional