On-Prem vs the Cloud: Comparing the Cost of Configuring for High Availability
This article compares the cost of configuring SQL Server for high availability on-premises and in the cloud.
Any way you look at it, configuring SQL Server for high availability involves the cost of redundant infrastructure. If you need your SQL Server data to be available 99.99% of the time and the infrastructure supporting your database goes offline — for whatever reason — you need backup infrastructure that can take over immediately.
But the costs of deploying a high availability infrastructure on-premises vs. in the cloud are very different and it’s important to understand those differences in order to achieve the high availability assurances you seek at the optimal cost to your organization.
Let’s start with the basics: If you’re deploying any production SQL Server infrastructure on-prem, you’ll need personnel to administer and maintain that infrastructure. If you build your backup SQL Server infrastructure in a remote location (always a good idea if you’re trying to ensure HA in a region prone to hurricanes or earthquakes), then your support costs may be even higher, as you may need personnel at that secondary location to manage and administer that infrastructure. Add the cost of a dedicated high-speed network link, which you’ll need to ensure the ongoing replication of data between the sites.
Contrast that to the cloud, where neither the personnel nor inter-site networking infrastructure issues are yours to worry about. The cloud service providers employ a full staff of experts that can administer and maintain the compute, storage, and network resources you’ll need—and they can provide all those services at a much lower price point than you can.
Then, consider the infrastructure itself. If you’re building an HA solution on-prem, you will need to build at least two identical infrastructures on which to run SQL Server: one active, and at least one standby. Moreover, you’ll need to configure the systems to support the heaviest traffic loads you anticipate, even if those loads occur only as spikes during a quarterly close or at holiday shopping periods. That requirement may be particularly painful from a cost perspective because the CPU and RAM configuration needed to accommodate those spikes may be wholly unnecessary for most of your day-to-day operations.
In the cloud, the underlying infrastructure demands are dramatically different. You can configure your primary SQL Server instance with the number of cores and the amount of RAM needed to meet the demands you encounter on a day-to-day basis. When traffic spikes, cloud-based services are elastic enough to add more CPU and RAM resources, automatically and dynamically. As your SQL Server infrastructure needs more power and memory to perform at a prescribed level, the cloud makes those resources available. When the spikes subside, the added CPU and RAM resources are removed. You pay for what you use, and you pay when you use it.
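To make the "pay for what you use" point concrete, here is a minimal sketch of the arithmetic. The hourly rates, the 730-hour month, and the function names are all illustrative assumptions, not real cloud prices or any provider's billing API.

```python
# Illustrative only: rates and hours are made-up values, not real cloud prices.
HOURS_PER_MONTH = 730  # assumed average month length in hours


def peak_provisioned_cost(peak_rate: float, hours: int = HOURS_PER_MONTH) -> float:
    """Monthly cost when the server is sized for peak load all month."""
    return peak_rate * hours


def elastic_cost(base_rate: float, peak_rate: float, spike_hours: int,
                 hours: int = HOURS_PER_MONTH) -> float:
    """Monthly cost when the peak size is paid for only during spike hours."""
    if not 0 <= spike_hours <= hours:
        raise ValueError("spike_hours must be between 0 and hours")
    return base_rate * (hours - spike_hours) + peak_rate * spike_hours


if __name__ == "__main__":
    # Hypothetical rates: 1.00/hour for the day-to-day size, 4.00/hour for the peak size.
    print(f"{peak_provisioned_cost(4.0):.2f}")   # 2920.00
    print(f"{elastic_cost(1.0, 4.0, 40):.2f}")   # 850.00
```

With these assumed numbers, 40 spike hours a month cost far less than keeping the peak configuration running around the clock. The gap narrows as spike hours grow.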
Note too that I said that you can configure your primary SQL Server instance in the cloud to support your day-to-day traffic demands. For HA in the cloud, you’re still going to need to use Windows Server Failover Clustering (WSFC) to bind together two or more fully configured SQL Server nodes in a multi-node failover cluster instance (FCI). However, the elasticity of the cloud is such that you can under-configure the virtual machines (VMs) supporting your secondary SQL Server nodes. If the primary instance of SQL Server goes offline for any reason, you can immediately resize one of the secondary VMs and then reboot it.
When the VM restarts, it comes back as a VM with as many cores and as much memory as you need to meet your SQL Server needs. Just as when you draw on added CPU and RAM capacity to support your spikes, you’ll pay more for the “larger” VM than you would pay for the under-configured VM that was waiting to be put into service – and you’ll pay that higher fee for as long as you use that larger VM. However, once your former primary VM comes back online, you can fail back over to the primary VM and reboot your secondary VM into its under-configured (and less expensive) state until the next time it is needed.
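The savings from an under-configured standby can be estimated with a rough sketch like the one below. The rates, the failover duration, and the assumption that the primary VM is billed for the whole month (even while it is offline) are hypothetical simplifications, not any provider's actual billing rules.

```python
# Illustrative only: all rates and durations are assumed values.
HOURS_PER_MONTH = 730  # assumed average month length in hours


def identical_standby_cost(primary_rate: float, hours: int = HOURS_PER_MONTH) -> float:
    """Two nodes, both sized like the primary for the whole month."""
    return 2 * primary_rate * hours


def resizable_standby_cost(primary_rate: float, small_rate: float,
                           failover_hours: int,
                           hours: int = HOURS_PER_MONTH) -> float:
    """Primary billed all month (simplifying assumption); the standby is billed
    at the small rate except while it is resized to cover a failover."""
    if not 0 <= failover_hours <= hours:
        raise ValueError("failover_hours must be between 0 and hours")
    primary = primary_rate * hours
    standby_idle = small_rate * (hours - failover_hours)
    standby_active = primary_rate * failover_hours
    return primary + standby_idle + standby_active


if __name__ == "__main__":
    # Hypothetical: 2.00/hour full-size VM, 0.25/hour small VM, 8 hours of failover.
    print(identical_standby_cost(2.0))           # 2920.0
    print(resizable_standby_cost(2.0, 0.25, 8))  # 1656.5
```

Under these assumptions the standby costs full price only for the hours it is actually carrying the workload.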
Data Replication Costs
Finally, there’s the question of how you ensure that your backup instance (or instances) of SQL Server stays data-synchronized with the primary instance. On-prem you might use a storage area network (SAN) that all the nodes in the failover cluster can access. That in itself can be very costly, but that’s a cost easily avoided in the cloud because a SAN simply isn’t a cloud storage option. To configure an FCI in the cloud, you must configure each node in your cluster with storage and then replicate data from storage on the active primary node to storage on the secondary nodes. That way, in a failover situation, a secondary node can start up and it already has a copy of all the SQL Server data that the primary node had been using.
There are two ways to replicate data among nodes in a cloud-based HA cluster. One relies on the Availability Group (AG) feature of SQL Server; the other relies on a third-party SANless Clustering tool. Either is likely to be less costly than an on-premises SAN, but your operational requirements could make one of these approaches far more expensive than the other. If you have more than one database to replicate — or are replicating primary storage to multiple secondary nodes — using the AG approach will require you to use SQL Server Enterprise Edition, even if your application does not require it. If you’re budgeting on SQL Server Standard Edition, that’s going to be an onerous cost difference — and that’s where using the SANless Cluster approach can avoid those unnecessary costs.
The SANless Cluster approach uses block-level replication to copy data from your primary node to your secondary node(s) without regard for the edition of SQL Server you’re using. That means you can continue to use SQL Server Standard Edition if that’s what works for your organization — and replicate as many databases, to as many secondary nodes, as your needs require — which will keep your costs down.
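The edition rule described above can be written down as a small check. This is only a sketch of the rule as stated in this article; the `ReplicationPlan` type and the function names are hypothetical, and actual licensing requirements should be confirmed against Microsoft's documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicationPlan:
    """Hypothetical description of what needs to be replicated."""
    databases: int        # number of databases to replicate
    secondary_nodes: int  # number of secondary nodes receiving the data


def ag_requires_enterprise(plan: ReplicationPlan) -> bool:
    """Per the article: more than one database, or more than one secondary
    node, pushes the Availability Group approach to Enterprise Edition."""
    return plan.databases > 1 or plan.secondary_nodes > 1


def minimum_edition(plan: ReplicationPlan, approach: str) -> str:
    """Return the minimum SQL Server edition for the chosen approach.

    approach is "ag" (Availability Groups) or "sanless" (block-level
    replication with a third-party SANless clustering tool).
    """
    if approach == "sanless":
        return "Standard"  # block-level replication is edition-independent
    if approach == "ag":
        return "Enterprise" if ag_requires_enterprise(plan) else "Standard"
    raise ValueError(f"unknown approach: {approach!r}")


if __name__ == "__main__":
    plan = ReplicationPlan(databases=3, secondary_nodes=1)
    print(minimum_edition(plan, "ag"))       # Enterprise
    print(minimum_edition(plan, "sanless"))  # Standard
```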