Feature #18324

closed

LSF support for requesting node with CUDA support

Added by Peter Amstutz over 2 years ago. Updated about 2 years ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: -
Target version:
Story points: -
Release relationship: Auto

Description

https://www.ibm.com/docs/en/spectrum-lsf/10.1.0?topic=features-enabling-jobs-use-gpu-resources

According to this, GPUs can be configured at the job level, but also at the queue level, so depending on the site, you might need to request a specific queue.

Customer email:

these are the parameters we're using to request GPUs:

-gpu "num=1:j_exclusive=yes"

The exclusive part should probably be configurable, as it's not mandatory, but on our cluster GPUs are shared by default, so we recommend that our users request them exclusively.

Maybe there could be a parameter for the GPU string, with a placeholder for the number of GPUs, similar to the ones for memory or CPUs.

Proposed design:

Add a new option, "LSF.BsubCUDAArguments". It is appended to the end of "BsubArgumentsList" when CUDA.DeviceCount > 0 in the container runtime constraints. Introduce a new template variable, %G, for the value of DeviceCount.

Example:

BsubCUDAArguments: ["-gpu", "num=%G:j_exclusive=yes"]
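
The proposed behavior could be sketched roughly as follows. This is only an illustration of the design above, not the dispatcher's actual code: the function name `build_bsub_args`, its parameters, and the sample base argument list are hypothetical, and the sketch assumes %G is substituted in every argument after the CUDA arguments are appended. Other template variables are left out.

```python
def build_bsub_args(bsub_arguments_list, bsub_cuda_arguments, cuda_device_count):
    """Hypothetical helper: assemble bsub arguments for one container.

    bsub_arguments_list  -- base arguments (stands in for BsubArgumentsList)
    bsub_cuda_arguments  -- extra arguments (stands in for BsubCUDAArguments)
    cuda_device_count    -- CUDA.DeviceCount from the runtime constraints
    """
    args = list(bsub_arguments_list)
    # Append the CUDA arguments only when the container requests GPUs.
    if cuda_device_count > 0:
        args.extend(bsub_cuda_arguments)
    # Assumption: %G is expanded in all arguments, not only the CUDA ones.
    return [arg.replace("%G", str(cuda_device_count)) for arg in args]


# Sample values; the base list here is made up for illustration.
base = ["-J", "example-job"]
cuda = ["-gpu", "num=%G:j_exclusive=yes"]

print(build_bsub_args(base, cuda, 2))
print(build_bsub_args(base, cuda, 0))
```

With a DeviceCount of 0 the CUDA arguments are omitted entirely, so sites that never request GPUs would see no change in the submitted command.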


Subtasks 1 (0 open, 1 closed)

Task #18598: review 18324-lsf-gpu (Resolved, Peter Amstutz, 01/05/2022)

Related issues

Related to Arvados Epics - Idea #15957: GPU support (Resolved, 10/01/2021, 03/31/2022)