Idea #7478

closed

[Node Manager] Creates compute nodes using AWS spot instances

Added by Brett Smith over 8 years ago. Updated almost 6 years ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: Node Manager
Target version:
Start date: 05/25/2018
Due date:
Story points: 3.0
Release:
Release relationship: Auto

Description

Functional requirements:

  • Requests spot instances, waits for those requests to be fulfilled (minutes?) and launches the instances as compute nodes.
  • For the initial implementation, just bid the standard price rather than trying to design a fancy bidding strategy. We'll still get the cost benefit as long as the spot price is lower.
  • When the bid price is exceeded (hopefully rarely/never), we're likely to lose our entire fleet of compute instances and, perhaps, not be able to start any until demand subsides enough to cause the spot prices to go down. In this scenario, we'll need some configuration knobs to control whether to fall back to on-demand instances, wait for spot instances to become available again, etc.
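
The fallback knobs mentioned in the last point could be sketched roughly as follows. The names `SpotFallbackPolicy` and `choose_market`, and the option values, are hypothetical and not taken from Node Manager; this only illustrates one way to express "fall back to on-demand" versus "wait for spot" as a configurable decision.

```python
from enum import Enum


class SpotFallbackPolicy(Enum):
    """Hypothetical config knob: what to do when spot capacity is unavailable."""
    WAIT = "wait"            # keep waiting for spot prices to come back down
    ON_DEMAND = "on_demand"  # fall back to on-demand instances


def choose_market(spot_price, bid_price, policy):
    """Return 'spot', 'on_demand', or None (meaning: wait and retry later).

    spot_price: current spot price for the instance type
    bid_price:  our bid (the standard price in the initial implementation)
    """
    if spot_price <= bid_price:
        # The bid still covers the spot price, so spot instances are usable.
        return "spot"
    if policy is SpotFallbackPolicy.ON_DEMAND:
        return "on_demand"
    return None
```

With a bid of 0.10, `choose_market(0.25, 0.10, SpotFallbackPolicy.ON_DEMAND)` selects on-demand instances, while the same call with `SpotFallbackPolicy.WAIT` returns `None` so the caller can retry later.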

Implementation details:

  • Enhance libcloud to support AWS spot instances. (Done)
  • API server will have a config option which specifies whether spot instances are enabled or not. If they are enabled, child containers will get created with the spot instances scheduling parameter set.
  • Spot instances will be their own instance type. Node manager currently treats the libcloud-specified instance type as its own instance type; it needs to manage its instance types separately from the libcloud one. Node manager will use the new libcloud support to request spot instances when needed. No arvados-cwl-runner required.
  • Nodemanager spot instance handling:
    • [Size <name>] sections in the config currently use the instance type as <name>. Decouple the two: add an instance_type attribute inside the section and keep <name> for descriptive purposes only.
    • Each size section will have a boolean “preemptable” attribute, defaulting to False.
    • Update ServerCalculator & related code so that the instance type is not the unique id of a "nodesize"
    • Update ec2 driver to pass the ex_spot_market=True parameter on the libcloud create_node call
  • Update documentation explaining nodemanager config file format changes
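
The config changes listed above might look something like this minimal sketch. The `NodeSize` record, `load_sizes`, `create_node_kwargs`, and the example section names are hypothetical, and the flag spelling `ex_spot_market` is an assumption about the parameter named in the list. It shows two size sections sharing one instance type, so the section name, not the instance type, serves as the unique id.

```python
import configparser
from collections import namedtuple

# Hypothetical record; Node Manager's real size objects carry more fields.
NodeSize = namedtuple("NodeSize", ["name", "instance_type", "preemptable"])

# Example config: <name> is only a label, instance_type is a separate attribute.
EXAMPLE_CONFIG = """
[Size m4.large]
instance_type = m4.large

[Size m4.large.spot]
instance_type = m4.large
preemptable = true
"""


def load_sizes(text):
    """Parse [Size <name>] sections into a dict keyed by <name>."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    sizes = {}
    for section in parser.sections():
        if not section.startswith("Size "):
            continue
        name = section[len("Size "):]
        sizes[name] = NodeSize(
            name=name,
            # Assumption: fall back to <name> when instance_type is omitted.
            instance_type=parser.get(section, "instance_type", fallback=name),
            # "preemptable" defaults to False, as described above.
            preemptable=parser.getboolean(section, "preemptable", fallback=False),
        )
    return sizes


def create_node_kwargs(size):
    """Build keyword arguments for a create_node call.

    A real libcloud call passes a size object rather than a string;
    the string here is a stand-in to keep the sketch self-contained.
    """
    kwargs = {"size": size.instance_type}
    if size.preemptable:
        kwargs["ex_spot_market"] = True  # assumed spelling of the flag
    return kwargs
```

Keying the dict by section name lets `m4.large` and `m4.large.spot` coexist even though both map to the same underlying instance type.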

Subtasks 1 (0 open, 1 closed)

Task #13461: Review 7478-anm-spot-instances | Resolved | Peter Amstutz | 05/25/2018

Related issues

Related to Arvados - Bug #13649: c-d-s doesn't request a preemptible instance when it should | Resolved | Lucas Di Pentima | 06/21/2018
Blocked by Arvados - Idea #13051: Spike - Investigate/prototype AWS spot instance support in libcloud | Resolved | Lucas Di Pentima | 04/18/2018