Feature #19982
Ability to know when a container died because of spot instance reclamation and option to resubmit
Status:
In Progress
Priority:
Normal
Assigned To:
Category:
CWL
Target version:
Start date:
Due date:
% Done:
0%
Estimated time:
(Total: 0.00 h)
Story points:
3.0
Description
New arvados-cwl-runner behavior when spot instances are enabled
- When submitting to a spot instance, don't retry
- Ability to detect when a container failed due to reclaimed spot instance (#19961)
- Exit code to indicate the workflow failed due to spot instance reclamation
- Option to automatically re-submit as reserved instance
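The behavior listed above could be sketched roughly as follows. This is only an illustration of the control flow, not arvados-cwl-runner code: `ContainerResult`, its `preempted` flag, `run_step`, the `submit` callable, and the `EXIT_SPOT_RECLAIMED` value are all hypothetical names chosen for this example, and the sketch assumes the detection from #19961 is available as a simple boolean.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical exit code; the ticket does not specify a value.
EXIT_SPOT_RECLAIMED = 75


@dataclass
class ContainerResult:
    """Hypothetical summary of one container attempt."""
    succeeded: bool
    # Assumed to be filled in by the detection described in #19961.
    preempted: bool


def run_step(
    submit: Callable[[bool], ContainerResult],
    use_spot: bool,
    resubmit_reserved: bool,
) -> int:
    """Run one step and return a process-style exit code.

    `submit(spot)` is a stand-in for whatever actually submits the
    container; it takes True for a spot instance, False for reserved.
    """
    result = submit(use_spot)  # single attempt, no automatic retry
    if result.succeeded:
        return 0
    if use_spot and result.preempted:
        if resubmit_reserved:
            # Optional second attempt on a reserved instance.
            retry = submit(False)
            return 0 if retry.succeeded else 1
        # Tell the caller the failure was a spot reclamation.
        return EXIT_SPOT_RECLAIMED
    return 1  # ordinary failure
```

With a `submit` stand-in that always reports a preempted failure on spot and success on reserved, `run_step(submit, True, False)` returns `EXIT_SPOT_RECLAIMED`, while `run_step(submit, True, True)` makes a second reserved attempt and returns 0.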
Related issues
Updated by Peter Amstutz 11 months ago
- Blocked by Feature #19961: Detect and log spot instance interruption notices added
Updated by Peter Amstutz 11 months ago
- Category changed from CWL to Crunch
- Description updated (diff)
Updated by Peter Amstutz 11 months ago
- Target version changed from To be groomed to To be scheduled
Updated by Peter Amstutz 10 months ago
- Related to Feature #19975: Option to re-submit container with higher memory request if previous job was killed and crunchstat shows >90% memory usage added
Updated by Peter Amstutz 10 months ago
- Related to Feature #19974: Option to re-submit preemptible jobs to reserved nodes when previous attempt was interrupted added
Updated by Peter Amstutz 8 months ago
- Related to Story #18179: Better spot instance support added
Updated by Peter Amstutz 5 months ago
- Target version changed from To be scheduled to Development 2023-08-02 sprint
Updated by Peter Amstutz 4 months ago
- Target version changed from Development 2023-08-02 sprint to Development 2023-08-16
Updated by Peter Amstutz 4 months ago
- Target version changed from Development 2023-08-16 to Development 2023-08-30
Updated by Peter Amstutz 3 months ago
- Target version changed from Development 2023-08-30 to Development 2023-09-13 sprint
Updated by Brett Smith 3 months ago
- Related to Bug #20606: Unstartable preemptible:true containers should not be reused by non-retryable preemptible:false requests added
Updated by Brett Smith 3 months ago
We should consider undoing or narrowing the reuse changes we made in #20606 after we implement this. If Arvados gets better about retrying, the odds go up that the reuse narrowing is wasteful rather than helpful.
Updated by Peter Amstutz 3 months ago
- Target version changed from Development 2023-09-13 sprint to Development 2023-09-27 sprint
Updated by Peter Amstutz 2 months ago
- Target version changed from Development 2023-09-27 sprint to Development 2023-10-11 sprint
Updated by Peter Amstutz about 2 months ago
- Target version changed from Development 2023-10-11 sprint to Development 2023-10-25 sprint
Updated by Peter Amstutz about 2 months ago
- Target version changed from Development 2023-10-25 sprint to Development 2023-11-08 sprint
Updated by Peter Amstutz about 1 month ago
- Target version changed from Development 2023-11-08 sprint to Development 2023-11-29 sprint
Updated by Peter Amstutz 11 days ago
- Target version changed from Development 2023-11-29 sprint to Development 2024-01-03 sprint