Bug #18101

Updated by Ward Vandewege over 2 years ago

When using spot instances on AWS, it is common to see a message like this in the a-d-c logs: 

 <pre> 
 InsufficientInstanceCapacity: We currently do not have sufficient m5.8xlarge capacity in the Availability Zone you requested (us-east-1a). Our system will be working on provisioning additional capacity. You can currently get m5.8xlarge capacity by not specifying an Availability Zone in your request or choosing us-east-1b, us-east-1c, us-east-1d, us-east-1f. 
 </pre> 

 Currently, a-d-c requests compute instances in a specific subnet, which is tied to a single availability zone, and we recommend that this zone be the same one the keepstores run in. 

 Traffic between availability zones in the same AWS region costs $0.02/GB (cf. https://aws.amazon.com/ec2/pricing/on-demand/#Data_Transfer_within_the_same_AWS_Region). 

 Once #16516 (run Keepstore on the compute node) is implemented, it will be advantageous to configure a cluster on AWS so that spot instances are requested across multiple (possibly all) availability zones in a region. When a spot instance runs in a different AZ, there would be an extra cost of $0.02/GB for all traffic to/from the permanent EC2 instances (e.g. API server), but that traffic should be minimal (mostly crunchstat-summary log traffic). 
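
 As a rough illustration of the extra cost, the estimate below multiplies a traffic volume by the $0.02/GB cross-AZ rate quoted above. The function name and the example volume are hypothetical, and the rate is the figure from this ticket rather than one fetched from AWS.

```python
from decimal import Decimal

# Cross-AZ transfer rate as quoted in this ticket (assumption: still current).
CROSS_AZ_USD_PER_GB = Decimal("0.02")


def cross_az_cost_usd(gigabytes):
    """Estimate the cross-AZ transfer cost in USD for a traffic volume in GB."""
    return CROSS_AZ_USD_PER_GB * Decimal(str(gigabytes))


# Hypothetical example: a container that exchanges 50 GB with the API server.
print(cross_az_cost_usd(50))
```

 Decimal arithmetic is used so that small per-GB amounts do not pick up floating-point rounding noise.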

 The Arvados configuration should support multiple subnets: 

 <pre> 
 CloudVMs: 
   Driver: ec2 
   DriverParameters: 
     SubnetIDs: ['subnet-...', 'subnet-...'] 
 </pre> 
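
 One way a list of subnets could be used is to try each subnet in turn and move on when the provider reports insufficient capacity. The following is a minimal sketch of that idea only, not a-d-c's implementation: @InsufficientCapacity@, @launch_with_fallback@ and @fake_launch@ are hypothetical names, and the real AWS call is replaced by an injected function so the example runs without credentials.

```python
from typing import Callable, Sequence


class InsufficientCapacity(Exception):
    """Hypothetical stand-in for the InsufficientInstanceCapacity error."""


def launch_with_fallback(subnet_ids: Sequence[str],
                         launch: Callable[[str], str]) -> str:
    """Try each subnet in order and return the first instance ID obtained.

    Re-raises the last capacity error if every subnet is out of capacity.
    """
    if not subnet_ids:
        raise ValueError("at least one subnet ID is required")
    last_error = None
    for subnet_id in subnet_ids:
        try:
            return launch(subnet_id)
        except InsufficientCapacity as err:
            # This subnet's AZ has no capacity; fall through to the next one.
            last_error = err
    raise last_error


def fake_launch(subnet_id: str) -> str:
    """Hypothetical launcher: pretends 'subnet-a' has no spot capacity."""
    if subnet_id == "subnet-a":
        raise InsufficientCapacity("no capacity in " + subnet_id)
    return "instance-in-" + subnet_id


print(launch_with_fallback(["subnet-a", "subnet-b"], fake_launch))
```

 Trying subnets in list order keeps the preferred AZ (for example the one the keepstores run in) first, so cross-AZ traffic is only incurred when the preferred zone is out of capacity.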

 Alternatively, it would be nice if we could pass *no* AZ in the request. I'm not sure how that would work in the AWS SDK; presumably you would still have to supply a desired subnet. This needs a bit of investigation.
