Bug #18101

[a-d-c] [AWS] add option to spin up (spot) instances in more/all availability zones in the region

Added by Ward Vandewege 3 months ago. Updated 3 months ago.

Status: New
Priority: Normal
Assigned To: -
Category: -
Target version: -
Start date:
Due date:
% Done: 0%
Estimated time:
Story points: -

Description

When using spot instances on AWS, it is common to see a message like this in the a-d-c logs:

InsufficientInstanceCapacity: We currently do not have sufficient m5.8xlarge capacity in the Availability Zone you requested (us-east-1a). Our system will be working on provisioning additional capacity. You can currently get m5.8xlarge capacity by not specifying an Availability Zone in your request or choosing us-east-1b, us-east-1c, us-east-1d, us-east-1f.

Currently, a-d-c requests compute instances in a specific subnet, which is tied to one availability zone, and we recommend that this zone be the same as the one the keepstores run in.

Traffic between availability zones in the same AWS region costs $0.02/GB (cf. https://aws.amazon.com/ec2/pricing/on-demand/#Data_Transfer_within_the_same_AWS_Region).

Once #16516 (run Keepstore on the compute node) is implemented, it will be advantageous to configure a cluster on AWS where (spot) instances are requested across multiple (all?) availability zones in a region. When a spot instance runs in a different AZ, there would be an extra cost of $0.02/GB for all traffic to/from the permanent EC2 instances (e.g. API server), but that traffic should be minimal (mostly crunchstat-summary log traffic).
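To put the $0.02/GB figure in perspective, here is a small back-of-the-envelope sketch. The rate is the one quoted in this ticket; the traffic volumes in `main` are made-up illustrations, not measurements from a real cluster.

```go
package main

import "fmt"

// crossAZRatePerGB is the inter-AZ transfer rate quoted in this ticket (USD per GB).
const crossAZRatePerGB = 0.02

// crossAZCostUSD estimates the extra cost of moving gb gigabytes between
// a compute node and permanent instances in a different availability zone.
func crossAZCostUSD(gb float64) float64 {
	return gb * crossAZRatePerGB
}

func main() {
	// Hypothetical traffic volumes, for illustration only.
	for _, gb := range []float64{1, 50, 1000} {
		fmt.Printf("%6.0f GB -> $%.2f\n", gb, crossAZCostUSD(gb))
	}
}
```

As long as Keep traffic stays on the compute node (#16516), the remaining cross-AZ volume should sit at the low end of that range.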

The Arvados configuration should support multiple subnets:

CloudVMs:
  Driver: ec2
  DriverParameters:
    SubnetIDs: ['subnet-...', 'subnet-...']

Alternatively, it would be nice if we could omit the AZ from the request. I'm not sure how that would work in the AWS SDK; presumably you would still have to supply a desired subnet. This needs a bit of investigation.
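A rough sketch of how a-d-c could use a SubnetIDs list: try each subnet in order and move on only when the failure is a capacity shortage. This is not the existing driver code. `launchFunc`, `launchInAnySubnet`, `errInsufficientCapacity` and the subnet names are all hypothetical stand-ins, and the AWS SDK is deliberately left out; a real implementation would check the SDK's error code for InsufficientInstanceCapacity instead of a sentinel error.

```go
package main

import (
	"errors"
	"fmt"
)

// errInsufficientCapacity is a hypothetical sentinel standing in for the
// EC2 InsufficientInstanceCapacity error seen in the a-d-c logs.
var errInsufficientCapacity = errors.New("InsufficientInstanceCapacity")

// launchFunc is a hypothetical stand-in for the driver call that creates
// one instance in the given subnet and returns its instance ID.
type launchFunc func(subnetID string) (instanceID string, err error)

// launchInAnySubnet tries the configured subnets in order. It falls through
// to the next subnet only on a capacity error; any other error is returned
// immediately, since retrying elsewhere is unlikely to help.
// It returns the instance ID and the subnet that accepted the request.
func launchInAnySubnet(subnetIDs []string, launch launchFunc) (string, string, error) {
	if len(subnetIDs) == 0 {
		return "", "", errors.New("no subnets configured")
	}
	var lastErr error
	for _, subnet := range subnetIDs {
		id, err := launch(subnet)
		if err == nil {
			return id, subnet, nil
		}
		if !errors.Is(err, errInsufficientCapacity) {
			return "", subnet, err
		}
		lastErr = err
	}
	return "", "", fmt.Errorf("all %d subnets lack capacity: %w", len(subnetIDs), lastErr)
}

func main() {
	// Simulate the first subnet's AZ being out of capacity.
	fake := func(subnet string) (string, error) {
		if subnet == "subnet-a" {
			return "", fmt.Errorf("%w: no m5.8xlarge capacity", errInsufficientCapacity)
		}
		return "i-fake-" + subnet, nil
	}
	id, subnet, err := launchInAnySubnet([]string{"subnet-a", "subnet-b"}, fake)
	fmt.Println(id, subnet, err)
}
```

Always starting from the first subnet keeps instances in the preferred AZ whenever it has capacity. Rotating the starting subnet, or remembering which AZs recently failed, would be alternative policies worth weighing during the investigation.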


Related issues

Blocked by Arvados Epics - Story #16516: Run Keepstore on local compute nodes | In Progress | 10/01/2021 | 11/30/2021

History

#1 Updated by Ward Vandewege 3 months ago

  • Description updated (diff)
  • Subject changed from [a-d-c] [AWS] add option to spin up spot instances in all availability zones in the region to [a-d-c] [AWS] add option to spin up spot instances in more/all availability zones in the region

#2 Updated by Ward Vandewege 3 months ago

  • Description updated (diff)
  • Subject changed from [a-d-c] [AWS] add option to spin up spot instances in more/all availability zones in the region to [a-d-c] [AWS] add option to spin up (spot) instances in more/all availability zones in the region

#3 Updated by Ward Vandewege 3 months ago

  • Blocked by Story #16516: Run Keepstore on local compute nodes added
