The cloud is a useful tool for many purposes. Most notably, it removes the heavy startup cost of buying new hardware when beginning or expanding a project. It can serve as your baseline resource for everyday work, or simply as something you tap into when you need extra computing power under heavy load. Whatever your needs, there is no shortage of providers, nor any lack of options in terms of server power, speed, memory, etc. However, one currently underutilized type of cloud instance is also the best deal available: the Discount Cloud Instance. Throughout this three-part blog series, I am going to explore the following: what these instances are, what the benefits and drawbacks of using them are, which cloud providers offer them and how they compare, and finally how Azure can learn from the other providers to make itself a more appealing cloud provider.
What are Discount Cloud Instances?
First, we should define what a Discount Cloud Instance is, as this is not a technical term. Every provider that offers such an option has given it a unique name. The first to become available were Amazon’s AWS Spot Instances, followed later by Google’s Preemptible VMs, and now we have Microsoft Azure’s Low Priority VMs. The details differ from provider to provider, but the benefit is the same: significant discounts over On-Demand pricing. These discounts can be almost shocking, generally around an 80% price drop, which leaves a price little more than the cost of the electricity needed to power these machines. These instances are offered when there is underutilized space on an active server, so that the provider can recoup some of the cost of operating that server.
When a cloud instance is requested, the provider can place the requesting user in a virtual space on a server that is already in operation. When no space of the requested size is available, the provider must turn on a new server and give the user access to that instead. In the latter case, the provider has created a situation where a server is on and drawing 70% or more of its maximum power consumption while delivering 10% or less of its capacity. This means massive overhead and lost profit for the company unless other users fill the remaining space on the server soon. In such a case, the company can attempt to recoup some of that lost revenue by selling off the remainder of the server’s space very cheaply, giving users an incentive to choose it over a different server type. This benefits the company for as long as no users are willing to pay full price for the space. The problem arises later, when the space has been sold off cheaply and a full-price customer requests access to a now-full server.
This is the drawback of using these instances: reliability. They are designed to recoup costs until a customer arrives who is willing to pay full price. Once one does, the provider tells one of the users on a discount instance that their time is expiring and that they are about to be removed from the server. This is the tradeoff discount instance users make in exchange for these prices: the instances are inherently unreliable and can be taken away with only 30 seconds’ notice, if any at all.
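To make the idea of working under an eviction notice concrete, here is a minimal sketch of a worker loop that stops picking up new work once a notice arrives. The names drain_queue, handle, and eviction_notice are hypothetical, and the notice is modeled as a plain threading.Event; how a real provider delivers its notice is outside the scope of this sketch.

```python
import threading


def drain_queue(work_items, handle, eviction_notice):
    """Process items until the queue is empty or an eviction notice arrives.

    work_items: iterable of units of work (hypothetical).
    handle: callable that performs one unit of work.
    eviction_notice: threading.Event set by whatever watches for the
        provider's notice (an assumption; delivery differs per provider).

    Returns (completed, remaining) so the caller can persist what is left.
    """
    completed = []
    remaining = list(work_items)
    # The flag is checked before each item, so at most one unit of work
    # is in flight when the notice arrives.
    while remaining and not eviction_notice.is_set():
        item = remaining.pop(0)
        handle(item)
        completed.append(item)
    return completed, remaining
```

Keeping each unit of work small relative to the notice period is what makes this pattern safe: the shorter each item, the less likely it is to be cut off mid-way.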
When are Discount Cloud Instances useful?
Even though Discount Cloud Instances come with no reliability guarantees, they can still be very useful for the right types of jobs. Consider the following example:
Let’s assume you work for a company on a biweekly pay period system. Let’s further assume that your next pay period ends on Friday, September 30. Presumably, you’ll receive your next payment about a week later on Friday, October 7. What happens during this week? In the interim time, your company verifies you worked the time you claimed, removes any sick time or vacation time you used in this period, gives you vacation and sick time that you’ve accrued, etc. In short, there are quite a few “little” things that go into it, iterated over every employee in the company, and this work has to be complete sometime in that week. If we can automate most of this work, we are left with a program that needs to run and complete in 1 week’s time. For something so computationally trivial, each employee can be handled individually very quickly, but the total computational cost increases linearly with the number of employees, so dealing with an entire company in that week can add up to a noticeable amount of server use. However, if we choose to deal with employees individually or in small batches, we can complete everything in separated chunks of effort and not have to do so in one concentrated effort.
We now have a common use case: an easily parallelized problem that requires some amount of non-sequential computing power and must be completed sometime within a week. This is exactly the type of problem we can solve cheaply using Discount Cloud Instances. As discussed earlier, these instances have no guaranteed availability and thus come with a far reduced price tag. Since we do not need the instances at any particular time, only some total amount of computing power within the given timeframe, we can write a program that obtains cheap instances as soon as they are available, gets some amount of work done, and ensures all completed work is saved before the instance is lost.
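The pattern described here can be sketched as a resumable batch job. The following is only an illustration under stated assumptions: process_employee is a hypothetical placeholder for the real payroll work, the checkpoint is a local JSON file (on a real discount instance it would need to live on storage that survives the instance), and max_batches merely simulates the instance being reclaimed.

```python
import json
import os
import tempfile


def load_done(checkpoint_path):
    """Return the set of employee IDs already recorded in the checkpoint."""
    if not os.path.exists(checkpoint_path):
        return set()
    with open(checkpoint_path) as f:
        return set(json.load(f))


def save_done(checkpoint_path, done):
    """Write the checkpoint to a temp file, then rename it into place,
    so an eviction in the middle of a write cannot corrupt it."""
    directory = os.path.dirname(os.path.abspath(checkpoint_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(sorted(done), f)
    os.replace(tmp_path, checkpoint_path)


def process_employee(employee_id):
    """Hypothetical placeholder for the per-employee payroll work."""
    return f"processed {employee_id}"


def run_until_evicted(employee_ids, checkpoint_path, batch_size=2, max_batches=None):
    """Process pending employees in small batches, saving after each batch.

    max_batches is a stand-in for losing the instance: when it is reached,
    the function simply stops, as if the server had been taken away.
    """
    done = load_done(checkpoint_path)
    pending = [e for e in employee_ids if e not in done]
    batches_run = 0
    for start in range(0, len(pending), batch_size):
        if max_batches is not None and batches_run >= max_batches:
            break
        for employee_id in pending[start:start + batch_size]:
            process_employee(employee_id)
            done.add(employee_id)
        save_done(checkpoint_path, done)
        batches_run += 1
    return done
```

Calling run_until_evicted again with the same checkpoint path picks up where the previous run stopped, so at most one batch of work is ever repeated after an eviction.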
While this example used some very specific details, applying the same logic and process to other situations is trivial. Another such example is health care providers billing insurers. They could hold all bills going to one specific insurer for a week or so at a time, and then use this same process to send all of the accumulated bills at once. The benefit in these situations is savings on overhead cost. A company working from the cloud either has to acquire more computing resources when it runs jobs such as these, or already purchases more resources than it normally requires and could reduce that amount by separating out its batch processes. In either case, saving upwards of 80% of the cost of these resources can amount to a significant sum over time.
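To put rough numbers on that saving, here is a small worked calculation. The 80% discount is the figure quoted above; the hourly rate used in the usage note and the rework_fraction parameter (extra hours spent redoing work lost to evictions) are assumptions added purely for illustration.

```python
def discounted_job_cost(compute_hours, on_demand_rate, discount=0.80, rework_fraction=0.0):
    """Compare the cost of a batch job on on-demand vs. discount instances.

    compute_hours: hours of compute the job needs if never interrupted.
    on_demand_rate: on-demand price per hour (hypothetical figure).
    discount: fraction taken off the on-demand price (0.80 = 80% off).
    rework_fraction: assumed extra hours, as a fraction of compute_hours,
        spent repeating work that was lost when an instance was reclaimed.

    Returns (on_demand_cost, discount_cost, savings).
    """
    effective_hours = compute_hours * (1.0 + rework_fraction)
    on_demand_cost = compute_hours * on_demand_rate
    discount_cost = effective_hours * on_demand_rate * (1.0 - discount)
    return on_demand_cost, discount_cost, on_demand_cost - discount_cost
```

For instance, with a hypothetical rate of 0.10 per hour, a 1,000-hour job costs about 100 on demand and about 20 on discount instances; even if a quarter of the work had to be repeated, the discounted cost would only rise to about 25.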
Azure Low Priority Virtual Machines
To provide a specific example of these types of instances, the following is a brief description of Azure Low Priority VMs:
Azure’s Low Priority VMs have been available since May 2017 and so are still very new. As of this article, Azure has made Low Priority VMs available only through Batch, an Azure offering that specializes in workloads that can be trivially parallelized, such as the example provided above. Azure Batch allows users to specify how many of the servers they request should be Low Priority VMs, if any are available. While completing the given job, Batch will attempt to maintain the specified ratio of Low Priority to Standard instances and replace lost instances when more become available. While still very limited in scope, setting up Discount Instances this way takes care of all of the work for you and makes them very easy to try.
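The ratio-keeping behavior described above can be illustrated with a small sketch. This is not the Azure Batch API and makes no claim about how Batch is implemented; split_pool and nodes_to_request are hypothetical helpers that only show the arithmetic behind choosing a mix and topping it back up after instances are lost.

```python
def split_pool(total_nodes, low_priority_fraction):
    """Split a requested pool size into (standard, low_priority) node counts.

    The low-priority count is rounded to the nearest whole node and the
    standard count takes whatever is left (an illustrative choice).
    """
    if not 0.0 <= low_priority_fraction <= 1.0:
        raise ValueError("low_priority_fraction must be between 0 and 1")
    low_priority = round(total_nodes * low_priority_fraction)
    return total_nodes - low_priority, low_priority


def nodes_to_request(target_low_priority, running_low_priority, available):
    """How many low-priority nodes to ask for after some were reclaimed.

    Never requests more than the shortfall, and never more than the
    provider currently has available.
    """
    shortfall = max(target_low_priority - running_low_priority, 0)
    return min(shortfall, available)
```

For example, a 10-node pool with 80% low priority would be split into 2 standard and 8 low-priority nodes; if 3 of the low-priority nodes are reclaimed and only 2 are available again, the sketch requests 2 now and the remaining one later.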
For the next post in this series, we will discuss the offerings of Microsoft’s competitors and how they compare with one another.