Feature #18656

Expression to dynamically request number of GPUs

Added by Peter Amstutz 4 months ago. Updated 3 months ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: CWL
Target version:
Start date: 03/01/2022
Due date:
% Done: 100%
Estimated time: (Total: 0.00 h)
Story points: -
Release relationship: Auto

Description

Allow a CWL expression to be used for the minimum and maximum number of GPUs requested.

Also need to re-introduce matching a specific list of hardware capabilities.


Subtasks

Task #18708: Review 18656-dynamic-gpu-req - Resolved - Peter Amstutz

History

#1 Updated by Peter Amstutz 4 months ago

  • Release set to 46
  • Description updated

#2 Updated by Peter Amstutz 4 months ago

  • Assigned To set to Peter Amstutz

#3 Updated by Peter Amstutz 3 months ago

  • Target version changed from 2022-02-16 sprint to 2022-03-02 sprint

#4 Updated by Peter Amstutz 3 months ago

  • Target version changed from 2022-03-02 sprint to 2022-03-16 sprint

#5 Updated by Peter Amstutz 3 months ago

  • Target version changed from 2022-03-16 sprint to 2022-03-02 sprint

#6 Updated by Peter Amstutz 3 months ago

  • Status changed from New to In Progress

#7 Updated by Peter Amstutz 3 months ago

18656-dynamic-gpu-req @ 926c011fb4f7a4d7722b88a19afed51c5d4bd1c4

  • Update cwltool version
  • Update extension
  • Update tests

developer-run-tests: #2937

#8 Updated by Peter Amstutz 3 months ago

  • Target version changed from 2022-03-02 sprint to 2022-03-16 sprint

#9 Updated by Lucas Di Pentima 3 months ago

Sorry for the delay! Just a couple of comments:

  • There's documentation referencing the old keywords that needs updating.
  • In sdk/cwl/arvados_cwl/arvcontainer.py, L298: there's a resources.get("cudaDeviceCount", 1) call, but cudaDeviceCount doesn't exist without its Max/Min suffix, correct?

#10 Updated by Peter Amstutz 3 months ago

Lucas Di Pentima wrote:

Sorry for the delay! Just a couple of comments:

  • There's documentation referencing the old keywords that needs updating.

Good catch, fixed.

  • In sdk/cwl/arvados_cwl/arvcontainer.py, L298: there's a resources.get("cudaDeviceCount", 1) call, but cudaDeviceCount doesn't exist without its Max/Min suffix, correct?

That's the "resources" object, which holds the actual resources that are (or will be) allocated; it is separate from the min/max request.

Right now Arvados is dumb about this and doesn't actually do anything useful with min/max ranges (it just requests the "min" value), but that is a different issue (#16316).
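
To illustrate the distinction between the min/max request and the resolved "resources" object, here is a minimal Python sketch. It is not the actual arvcontainer.py code: resolve_gpu_count is a hypothetical helper, the request is assumed to have had any CWL expressions evaluated to integers already, and the fallback when no minimum is given is an assumption made for the example.

```python
from typing import Any, Dict, Optional


def resolve_gpu_count(requirement: Dict[str, Any]) -> int:
    """Collapse a min/max GPU request into a single device count.

    Hypothetical helper: it mirrors the behaviour described above, where
    only the minimum is honoured and the maximum is otherwise ignored.
    """
    count_min: Optional[int] = requirement.get("cudaDeviceCountMin")
    count_max: Optional[int] = requirement.get("cudaDeviceCountMax")
    if count_min is None:
        # Assumption: use the max if only it was given, otherwise one device.
        count_min = count_max if count_max is not None else 1
    if count_max is not None and count_max < count_min:
        raise ValueError("cudaDeviceCountMax must be >= cudaDeviceCountMin")
    return int(count_min)


# The request as written in the workflow (expressions assumed already evaluated).
request = {"cudaDeviceCountMin": 2, "cudaDeviceCountMax": 8}

# The "resources" object carries the resolved value, without a Min/Max suffix.
resources = {"cudaDeviceCount": resolve_gpu_count(request)}

# This is why a lookup without the suffix is valid on "resources".
gpu_count = resources.get("cudaDeviceCount", 1)
print(gpu_count)
```

With a request of 2 to 8 devices, the sketch resolves to the minimum, which matches the "just requests the min value" behaviour; making real use of the range is what #16316 tracks.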

18656-cuda-expr-request @ 2dbbd648655ceb248dafff72e659c47277d11539

#11 Updated by Lucas Di Pentima 3 months ago

LGTM, thanks!

#12 Updated by Peter Amstutz 3 months ago

  • Status changed from In Progress to Resolved
