AWS Usage: Cost and Availability Impacts of Auto Scaling and ELB
Both multi-AZ Auto Scaling and ELB are designed by AWS to ensure high availability. As AWS says: “we provide a way to achieve a balanced group of EC2 instances that are spread across multiple Availability Zones for high availability, and provide a single entity for you to manage. In addition, we mitigate the problem of zones becoming unavailable or congested by temporarily allocating capacity in other zones and rebalancing the group back over time.”
The issue that arises with multi-AZ Auto Scaling and ELB is that users can lose control of where their instance launches and where the resource is assigned. AWS states that “availability zone balance triumphs everything else.”
Although it may be hidden in the AWS fine print, the automatic reallocation of resources is important to keep in mind when users employ either Reserved Instances (RI) or Spot instances. As will become clear, it may directly and significantly decrease availability and/or increase user costs.
Before delving deeper into this issue, let’s review how RI and Spot instance pricing work. When purchasing an RI reservation, users must specify platform (e.g. Windows SQL Server Standard), size (e.g. c1.xlarge), AZ (e.g. us-east-1b), VPC (in a VPC or not), and Tenancy (dedicated or not). A newly launched instance must match the reservation on all of these attributes to utilize it. If it does not, you will be left paying for an On-Demand instance while your RI reservation sits idle.
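The matching rule above can be sketched in a few lines of Python. This is only an illustration of the exact-match idea, not AWS billing logic: the class name, field names, and the `uses_reservation` helper are hypothetical, and the second AZ (us-east-1c) is an assumed example value.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class InstanceAttributes:
    """The five attributes listed above (field names are illustrative)."""
    platform: str
    size: str
    az: str
    in_vpc: bool
    dedicated: bool


def uses_reservation(instance: InstanceAttributes,
                     reservation: InstanceAttributes) -> bool:
    # Dataclass equality compares every field, so a single mismatch
    # (for example the AZ) means the reservation is not applied.
    return instance == reservation


reservation = InstanceAttributes(
    platform="Windows SQL Server Standard",
    size="c1.xlarge",
    az="us-east-1b",
    in_vpc=False,
    dedicated=False,
)
launched_in_same_az = replace(reservation)                     # identical attributes
launched_in_other_az = replace(reservation, az="us-east-1c")  # only the AZ differs

print(uses_reservation(launched_in_same_az, reservation))
print(uses_reservation(launched_in_other_az, reservation))
```

The second instance differs only in its AZ, yet under this model it would be billed On-Demand while the reservation sits unused.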
When bidding for a Spot instance, the user is guaranteed availability as long as the Spot price does not exceed the bid price. Once the Spot price exceeds the bid price, the instance is terminated.
These are the operating constraints: an exact attribute match to use an RI reservation, and a maximum price (the bid) for a Spot instance.
Impacts for RI:
When using RI, the automatic balancing and/or response to AZ failure/congestion will occur irrespective of cost or RI placement. Users could therefore be subject to a cost-unfavorable shift in resources from an AZ with unused RI reservations to an AZ with no available RI reservations.
As a practical matter, we have seen this happen to CloudCheckr customers. We have seen AWS users who have purchased a Heavy Utilization RI but, for one reason or another, had the instance bumped into another AZ. The users are then left paying double: once for the unused Heavy RI and again for On-Demand in the other AZ.
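To make the double payment concrete, here is a rough worked example in Python. All figures are hypothetical (10 cents per hour for the RI, 30 cents per hour On-Demand, 720 hours in a month), the upfront RI fee is ignored, and the sketch assumes the RI hourly fee is charged whether or not an instance uses the reservation.

```python
HOURS_PER_MONTH = 720        # assumed 30-day month
RI_HOURLY_CENTS = 10         # hypothetical RI hourly rate
ON_DEMAND_HOURLY_CENTS = 30  # hypothetical On-Demand hourly rate


def monthly_cost_cents(instance_matches_ri: bool) -> int:
    # Assumption: the reservation's hourly fee accrues whether it is used or not.
    ri_cost = HOURS_PER_MONTH * RI_HOURLY_CENTS
    # If the instance landed in another AZ, it is billed On-Demand on top.
    on_demand_cost = 0 if instance_matches_ri else HOURS_PER_MONTH * ON_DEMAND_HOURLY_CENTS
    return ri_cost + on_demand_cost


matched = monthly_cost_cents(instance_matches_ri=True)
bumped = monthly_cost_cents(instance_matches_ri=False)
print(f"matched AZ: ${matched / 100:.2f}, bumped to another AZ: ${bumped / 100:.2f}")
```

With these assumed rates, the bumped instance costs the On-Demand charge plus the full price of a reservation that delivers nothing.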
In summary, the risk of using RI with multi-AZ Auto Scaling or ELB is one of increased and inefficient spend.
Impacts for Spot:
The prioritization of balance is even more critical when thinking about using Spot instances. Users who employ Spot instances accept the risk of termination in exchange for lower cost. Users can mitigate the termination risk by bidding above market.
However, most users are not aware that when Spot prices differ between AZs, Auto Scaling (or ELB) will not look for instances in the cheapest AZ, or even for instances in an AZ with prices below your bid price. Users could be bidding above the Spot price in one AZ but still lose availability because Auto Scaling sends the resource request to another AZ with a Spot price above the bid. This is because resource dispersion and balance are the priority in the AWS hierarchy. AWS is aware of this issue but, as of today, has not offered a solution to address it.
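The behavior described above can be illustrated with a deliberately simplified model. This is not the actual Auto Scaling algorithm: the "fewest instances wins" rule, the function names, the AZ names, and all prices are assumptions chosen only to show how a balance-first choice can ignore price.

```python
def pick_az_for_balance(instance_counts: dict[str, int]) -> str:
    # Simplified balance-first rule: launch in the AZ with the fewest
    # instances. Spot prices are not consulted at all.
    return min(sorted(instance_counts), key=lambda az: instance_counts[az])


def spot_request_fulfilled(bid: float, spot_prices: dict[str, float], az: str) -> bool:
    # A Spot instance runs only while the Spot price does not exceed the bid.
    return bid >= spot_prices[az]


instance_counts = {"us-east-1a": 3, "us-east-1b": 2}    # hypothetical group state
spot_prices = {"us-east-1a": 0.08, "us-east-1b": 0.25}  # hypothetical prices
bid = 0.10

target_az = pick_az_for_balance(instance_counts)
cheapest_az = min(spot_prices, key=spot_prices.get)

print("balance-first target:", target_az,
      "fulfilled:", spot_request_fulfilled(bid, spot_prices, target_az))
print("cheapest AZ:", cheapest_az,
      "fulfilled:", spot_request_fulfilled(bid, spot_prices, cheapest_az))
```

In this toy scenario the bid clears the Spot price in one AZ, but the balance-first rule sends the request to the other AZ, where it cannot be fulfilled.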
Consequently, the risk for Spot users is that Auto Scaling or ELB may actually decrease resource availability.
How to Mitigate these Risks:
The solutions are imperfect.
RI users should recognize that multi-AZ usage requires balancing reservations across AZs. They need to understand their typical resource utilization patterns to assess and optimize the balance of RI reservations and non-RI instances.
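One way to start such an assessment is to compare, per AZ, how many instances are running against how many reservations are held. The sketch below does this for a single instance type; the `ri_coverage` function and the counts are hypothetical, and a real assessment would also need to account for the other matching attributes and for how usage changes over time.

```python
def ri_coverage(running: dict[str, int], reserved: dict[str, int]) -> dict[str, dict[str, int]]:
    """Per-AZ count of covered instances, idle reservations, and On-Demand instances."""
    report = {}
    for az in sorted(set(running) | set(reserved)):
        run = running.get(az, 0)
        res = reserved.get(az, 0)
        report[az] = {
            "covered": min(run, res),               # instances using a reservation
            "idle_reservations": max(res - run, 0),  # paid for but unused
            "on_demand": max(run - res, 0),          # running without a reservation
        }
    return report


# Hypothetical snapshot: six instances and six reservations, but unevenly placed.
running = {"us-east-1a": 2, "us-east-1b": 4}
reserved = {"us-east-1a": 4, "us-east-1b": 2}

for az, row in ri_coverage(running, reserved).items():
    print(az, row)
```

Even though the totals match, two reservations sit idle in one AZ while two instances run On-Demand in the other, which is the imbalance this section warns about.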
Spot users need to be aware of pricing patterns and not rely exclusively on Spot instances. Price spikes and availability issues arise with unfortunate frequency, and Spot users need to consider their usage and availability requirements carefully.
These assessments need to be performed. Users can undertake them manually, automate much of the process with the free CloudCheckr solution, or employ other third-party solutions.