Support #18606

closed

GPU support on tordo cluster

Added by Peter Amstutz 6 months ago. Updated 6 months ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: Crunch
Target version:
Start date:
Due date:
% Done: 0%
Estimated time:
Story points: -

Description

Support GPUs on tordo:

  • Build compute image with nvidia support
  • Add instance type g4dn.xlarge to configuration
  • Check that "container shell" feature is enabled on tordo

Related issues

  • Related to Arvados - Feature #18325: Option to include CUDA tooling in cloud compute image (Resolved, Ward Vandewege, 12/20/2021)
  • Related to Arvados Epics - Story #15957: GPU support (Resolved, 10/01/2021 to 03/31/2022)

Actions #1

Updated by Peter Amstutz 6 months ago

  • Description updated (diff)
  • Subject changed from GPU support on dev cluster to GPU support on tordo cluster
Actions #2

Updated by Peter Amstutz 6 months ago

  • Description updated (diff)
Actions #3

Updated by Ward Vandewege 6 months ago

  • Status changed from New to In Progress
Actions #4

Updated by Ward Vandewege 6 months ago

  • Related to Feature #18325: Option to include CUDA tooling in cloud compute image added
Actions #5

Updated by Ward Vandewege 6 months ago

Actions #6

Updated by Ward Vandewege 6 months ago

The new image is in place, and this entry was added to `config.yml`:

      g4dnxlarge:
        ProviderType: g4dn.xlarge
        VCPUs: 4
        RAM: 16GiB
        IncludedScratch: 125GB
        Price: 0.526
        CUDA:
          DriverVersion: "11.4" 
          HardwareCapability: "7.5" 
          DeviceCount: 1
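
For context, here is a sketch of where such an entry would sit in `config.yml`. It assumes the usual Arvados layout of `Clusters` → cluster ID → `InstanceTypes`. The cluster ID `tordo` is inferred from the hostname, and the comments are annotations, not part of the deployed file.

```yaml
Clusters:
  tordo:                          # cluster ID (assumed from the hostname)
    InstanceTypes:
      g4dnxlarge:                 # name used within the Arvados config
        ProviderType: g4dn.xlarge # instance type name at the cloud provider
        VCPUs: 4
        RAM: 16GiB
        IncludedScratch: 125GB
        Price: 0.526              # price per hour
        CUDA:
          DriverVersion: "11.4"       # CUDA version supported by the driver in the compute image
          HardwareCapability: "7.5"   # compute capability of the GPU
          DeviceCount: 1              # number of GPUs on the instance
```

The version strings are quoted so that YAML keeps them as strings rather than parsing them as floating-point numbers.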

Diagnostics are running at

https://workbench.tordo.arvadosapi.com/container_requests/tordo-xvhdp-g5nv9tz7ez9vn2a#Status

Container shell support was already enabled.

Actions #7

Updated by Ward Vandewege 6 months ago

  • Status changed from In Progress to Resolved

The new image is in place, and this entry was added to `config.yml`:

      g4dnxlarge:
        ProviderType: g4dn.xlarge
        VCPUs: 4
        RAM: 16GiB
        IncludedScratch: 125GB
        Price: 0.526
        CUDA:
          DriverVersion: "11.4" 
          HardwareCapability: "7.5" 
          DeviceCount: 1

Diagnostics completed successfully at

https://workbench.tordo.arvadosapi.com/container_requests/tordo-xvhdp-g5nv9tz7ez9vn2a#Status

Container shell support was already enabled.
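
A workflow step could then ask for a GPU node through a CUDA hint. This is a minimal sketch, assuming the `cwltool:CUDARequirement` extension as supported by `arvados-cwl-runner`. The tool, the Docker image tag, and the requested values are illustrative, chosen to match the instance type above. It is not the diagnostics workflow linked in this comment.

```yaml
cwlVersion: v1.2
class: CommandLineTool
$namespaces:
  cwltool: "http://commonwl.org/cwltool#"
hints:
  cwltool:CUDARequirement:
    cudaVersionMin: "11.4"          # minimum CUDA version the step needs
    cudaComputeCapability: "7.5"    # minimum GPU compute capability
    cudaDeviceCountMin: 1           # at least one GPU
  DockerRequirement:
    dockerPull: nvidia/cuda:11.4.3-base-ubuntu20.04   # hypothetical image choice
baseCommand: nvidia-smi             # prints the GPUs visible inside the container
inputs: []
outputs: []
```

If the hint is honored, the container should be scheduled on an instance type whose `CUDA` section satisfies these minimums, which on this cluster would be `g4dnxlarge`.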
