Support #18606

closed

GPU support on tordo cluster

Added by Peter Amstutz about 2 years ago. Updated about 2 years ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: Crunch
Target version:
Due date:
Story points: -

Description

Support GPUs on tordo:

  • Build compute image with nvidia support
  • Add instance type g4dn.xlarge to configuration
  • Check that "container shell" feature is enabled on tordo

Related issues

  • Related to Arvados - Feature #18325: Option to include CUDA tooling in cloud compute image (Resolved, Ward Vandewege, 12/20/2021)
  • Related to Arvados Epics - Idea #15957: GPU support (Resolved, 10/01/2021, 03/31/2022)
Actions #1

Updated by Peter Amstutz about 2 years ago

  • Description updated (diff)
  • Subject changed from GPU support on dev cluster to GPU support on tordo cluster
Actions #2

Updated by Peter Amstutz about 2 years ago

  • Description updated (diff)
Actions #3

Updated by Ward Vandewege about 2 years ago

  • Status changed from New to In Progress
Actions #4

Updated by Ward Vandewege about 2 years ago

  • Related to Feature #18325: Option to include CUDA tooling in cloud compute image added
Actions #5

Updated by Ward Vandewege about 2 years ago

Actions #6

Updated by Ward Vandewege about 2 years ago

The new image is in place, and this entry was added to `config.yml`:

      g4dnxlarge:
        ProviderType: g4dn.xlarge
        VCPUs: 4
        RAM: 16GiB
        IncludedScratch: 125GB
        Price: 0.526
        CUDA:
          DriverVersion: "11.4" 
          HardwareCapability: "7.5" 
          DeviceCount: 1

Diagnostics are running at

https://workbench.tordo.arvadosapi.com/container_requests/tordo-xvhdp-g5nv9tz7ez9vn2a#Status

Container shell support was already enabled.

Actions #7

Updated by Ward Vandewege about 2 years ago

  • Status changed from In Progress to Resolved

The new image is in place, and this entry was added to `config.yml`:

      g4dnxlarge:
        ProviderType: g4dn.xlarge
        VCPUs: 4
        RAM: 16GiB
        IncludedScratch: 125GB
        Price: 0.526
        CUDA:
          DriverVersion: "11.4" 
          HardwareCapability: "7.5" 
          DeviceCount: 1

Diagnostics completed successfully at

https://workbench.tordo.arvadosapi.com/container_requests/tordo-xvhdp-g5nv9tz7ez9vn2a#Status

Container shell support was already enabled.
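
For context, instance type entries like the one above normally sit under the cluster's `InstanceTypes` section of the Arvados configuration. A minimal sketch of the nesting, assuming the cluster ID is `tordo` (inferred from the container request UUID, not stated in this ticket) and omitting every other setting:

```yaml
# Sketch only: the enclosing keys follow the usual Arvados config layout.
# The cluster ID "tordo" is an assumption inferred from the UUID prefix.
Clusters:
  tordo:
    InstanceTypes:
      g4dnxlarge:
        ProviderType: g4dn.xlarge
        VCPUs: 4
        RAM: 16GiB
        IncludedScratch: 125GB
        Price: 0.526
        CUDA:
          DriverVersion: "11.4"
          HardwareCapability: "7.5"
          DeviceCount: 1
```

The `CUDA` block is what marks the instance type as GPU-capable; the values shown are the ones reported in this ticket.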
