Story #18179

Better spot instance support

Added by Peter Amstutz 29 days ago. Updated 29 days ago.

Status: New
Priority: Normal
Assigned To: -
Target version: -
Start date: 11/01/2021
Due date: 03/31/2022
% Done: 0%
Estimated time:
Story points: -
Release:
Release relationship: Auto

Description

  • Spot instance use is currently a sitewide on/off choice; it can't be chosen per workflow
  • Instance types have to be duplicated in the config (obnoxious)
  • Records the wrong price (uses the price from the instance type config, not the actual price reported by the cloud)
  • Scheduling choices are too narrow; it should be possible to request different node types when the node type you want isn't available
    • Could we query spot prices on the fly to make scheduling decisions?
    • Try bigger instance types, but only bid the spot price for the smallest node type
    • Should eventually escalate to an on-demand instance if a spot instance isn't available
  • Users should be able to communicate their cost tolerance
  • Want to try other availability zones, but that requires the feature of running Keepstore on compute nodes (#16516)
  • Need a better way to handle spot instance shutdown
    • Maybe just always retry on a regular-cost node
  • Consider shutting down spot instances after a job because there is a timer?
    • Need to research this more
  • Can the VM be frozen / restored?
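
The scheduling ideas above (try bigger node types at the smallest type's spot price, respect a user cost tolerance, eventually escalate to on-demand) could be sketched roughly as follows. This is only an illustration of the proposal, not existing Arvados code: InstanceType, Attempt, plan_attempts and all their fields are hypothetical names, the prices are made up, and the tolerance is assumed to be a percentage over the smallest fitting type's current spot price.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class InstanceType:
    # Hypothetical fields; they do not mirror the Arvados config schema.
    name: str
    vcpus: int
    ram_gib: float
    on_demand_price: float        # USD per hour, from the config
    spot_price: Optional[float]   # current USD per hour, None if unknown


@dataclass(frozen=True)
class Attempt:
    instance_type: str
    market: str       # "spot" or "on-demand"
    max_price: float  # highest hourly price we agree to pay


def plan_attempts(types: List[InstanceType], min_vcpus: int,
                  min_ram_gib: float, tolerance_pct: float = 0.0) -> List[Attempt]:
    """Return the attempts to try in order, cheapest spot option first."""
    fits = [t for t in types
            if t.vcpus >= min_vcpus and t.ram_gib >= min_ram_gib]
    if not fits:
        return []
    # "Smallest" here means the cheapest on-demand type that satisfies the request.
    smallest = min(fits, key=lambda t: t.on_demand_price)
    attempts = []
    if smallest.spot_price is not None:
        # Bid cap: the smallest type's spot price plus the user's tolerance.
        cap = smallest.spot_price * (1 + tolerance_pct / 100)
        spot = [t for t in fits
                if t.spot_price is not None and t.spot_price <= cap]
        for t in sorted(spot, key=lambda t: t.spot_price):
            attempts.append(Attempt(t.name, "spot", cap))
    # Last resort (assumption): fall back to on-demand for the smallest type.
    attempts.append(Attempt(smallest.name, "on-demand", smallest.on_demand_price))
    return attempts


if __name__ == "__main__":
    catalog = [
        InstanceType("small", 2, 4, 0.10, 0.030),
        InstanceType("medium", 4, 8, 0.20, 0.035),
        InstanceType("large", 8, 16, 0.40, 0.090),
    ]
    for attempt in plan_attempts(catalog, min_vcpus=2, min_ram_gib=4,
                                 tolerance_pct=25):
        print(attempt)
```

A dispatcher following this sketch would walk the returned list until one request succeeds. Because the plan always ends with an on-demand attempt, the same list could also serve as the retry order after a spot instance is shut down.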

Related issues

Related to Arvados - Feature #18180: Ability to control use of spot instances on a per-workflow basis (New)

Related to Arvados - Feature #18181: Ability to specify a % of compute instance price that user is willing to go over from cheapest (New)

Related to Arvados - Feature #17695: [costanalyzer] make an accurate report for spot instances on AWS (In Progress)

Blocked by Arvados - Feature #18205: [api] [cloud] add live compute instance price to container record (New)

History

#1 Updated by Peter Amstutz 29 days ago

  • Start date set to 11/01/2021
  • Due date set to 03/31/2022

#2 Updated by Peter Amstutz 29 days ago

  • Description updated

#3 Updated by Peter Amstutz 29 days ago

  • Related to Feature #18180: Ability to control use of spot instances on a per-workflow basis added

#4 Updated by Peter Amstutz 29 days ago

  • Related to Feature #18181: Ability to specify a % of compute instance price that user is willing to go over from cheapest added

#5 Updated by Ward Vandewege 29 days ago

  • Description updated

#6 Updated by Ward Vandewege 29 days ago

  • Related to Feature #17695: [costanalyzer] make an accurate report for spot instances on AWS added

#7 Updated by Ward Vandewege 23 days ago

  • Blocked by Feature #18205: [api] [cloud] add live compute instance price to container record added
