Feature #19263

open

Support preemptible containers on LSF

Added by Peter Amstutz almost 2 years ago. Updated about 2 months ago.

Status: New
Priority: Normal
Assigned To: -
Category: Crunch
Target version:
Story points: -
Release:
Release relationship: Auto

Description

HPC systems can support low-priority / preemptible jobs, similar to cloud providers. One way to do this is with a separate "preemptible" queue.

Proposal: as with GPU support, if a job is marked "preemptible", the Arvados configuration can supply a command line fragment that submits the job to a site-specific "preemptible" queue.
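
As a rough illustration of the proposal above, the sketch below appends a site-supplied fragment to the bsub arguments only when the job is marked preemptible. The function name, the example arguments, and the queue name "preemptible" are hypothetical and are not existing Arvados configuration keys or code; `-q` is the standard bsub option for selecting a queue.

```python
from typing import List, Sequence


def build_bsub_args(base_args: Sequence[str],
                    preemptible_args: Sequence[str],
                    preemptible: bool) -> List[str]:
    """Return the bsub argument list for one job.

    base_args: arguments used for every submission (hypothetical example).
    preemptible_args: site-specific fragment taken from configuration
        (hypothetical; no such config key exists in the ticket).
    preemptible: whether the job is marked "preemptible".
    """
    args = list(base_args)  # copy so the configured list is never mutated
    if preemptible:
        args.extend(preemptible_args)
    return args


# Hypothetical site configuration: send preemptible jobs to a queue
# named "preemptible".
base = ["-J", "example-container", "-n", "1"]
fragment = ["-q", "preemptible"]

print(build_bsub_args(base, fragment, preemptible=True))
print(build_bsub_args(base, fragment, preemptible=False))
```

Keeping the fragment as an opaque list of arguments means each site can decide how preemptible jobs are routed without the dispatcher knowing about queue names.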

The dispatcher must also recognize when a job has been preempted and behave appropriately (e.g. not try to re-submit the job). For example, when LSF preempts a job, it can be configured to suspend the job with SIGSTOP and keep it in memory to resume later. Arvados should not treat a suspended job as failed and try to cancel it.
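
One way to picture the required behavior is a mapping from the job state reported by LSF to what the dispatcher does next. This is only a sketch: the action names and the function are hypothetical, and it assumes the dispatcher sees the state strings that `bjobs` reports (PEND, RUN, PSUSP, USUSP, SSUSP, DONE, EXIT). Whether a preempted job actually shows up as suspended depends on how the site configures preemption.

```python
# LSF job states as reported in the STAT column of bjobs.
SUSPENDED_STATES = {"PSUSP", "USUSP", "SSUSP"}
ACTIVE_STATES = {"PEND", "RUN"}
FINISHED_STATES = {"DONE", "EXIT"}


def dispatcher_action(lsf_state: str) -> str:
    """Return a hypothetical dispatcher action for an LSF job state.

    "track":       job is queued or running, keep monitoring it.
    "wait":        job is suspended (possibly preempted); do not cancel
                   or re-submit, LSF may resume it later.
    "finalize":    job has ended, so the container outcome can be recorded.
    "investigate": state is not recognized by this sketch.
    """
    if lsf_state in SUSPENDED_STATES:
        return "wait"
    if lsf_state in ACTIVE_STATES:
        return "track"
    if lsf_state in FINISHED_STATES:
        return "finalize"
    return "investigate"


for state in ["RUN", "SSUSP", "EXIT"]:
    print(state, dispatcher_action(state))
```

The key point is that the suspended states are handled separately from the finished ones, so a job stopped with SIGSTOP is neither cancelled nor submitted a second time.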

#1

Updated by Peter Amstutz almost 2 years ago

  • Description updated (diff)
#2

Updated by Peter Amstutz almost 2 years ago

  • Subject changed from Support preemptible on HPC to Support preemptible containers on HPC
#4

Updated by Peter Amstutz over 1 year ago

  • Subject changed from Support preemptible containers on HPC to Support preemptible containers on LSF
#5

Updated by Peter Amstutz over 1 year ago

  • Target version set to 2022-10-12 sprint
#6

Updated by Peter Amstutz over 1 year ago

  • Description updated (diff)
#7

Updated by Peter Amstutz over 1 year ago

  • Target version changed from 2022-10-12 sprint to 2022-10-26 sprint
#8

Updated by Peter Amstutz over 1 year ago

  • Target version changed from 2022-10-26 sprint to 2022-11-09 sprint
#9

Updated by Peter Amstutz over 1 year ago

  • Target version changed from 2022-11-09 sprint to 2022-11-23 sprint
#10

Updated by Peter Amstutz over 1 year ago

  • Target version changed from 2022-11-23 sprint to 2022-12-07 Sprint
#11

Updated by Peter Amstutz over 1 year ago

  • Target version changed from 2022-12-07 Sprint to 2022-12-21 Sprint
#12

Updated by Peter Amstutz over 1 year ago

  • Target version deleted (2022-12-21 Sprint)
#13

Updated by Peter Amstutz about 1 year ago

  • Release set to 60
#14

Updated by Peter Amstutz about 2 months ago

  • Target version set to Future