Active-Passive vs Multi-Active Database Topologies
There are a couple of things to consider when designing a resilient database architecture. Each type of database deployment, Multi-Active or Active-Passive, has its pros and cons, and in this blog we will outline them.
The most basic deployment is a single-site, single-node architecture. This does not give you anything in terms of business continuity. It provides no high availability (HA), and the only disaster recovery (DR) mechanism is to restore your database from a backup file. This type of deployment is normally seen in less critical environments such as development, or in CI/CD pipelines where testing is automated as part of the process. Nearly all databases can be deployed in this manner, including CockroachDB, Oracle, SQL Server, etc.
The Benefits of This Model Are:
- Cost-Effective as only one node is licensed. However, the cost of an outage to the business if running this model in production could be astronomical in terms of lost revenue and productivity.
The Cons of This Model Are:
- Lack of HA. If the node goes down or has issues, there is no failover. You have to fix the existing node or restore it from a backup.
- Any maintenance that could cause downtime has to be scheduled around quiet traffic periods, and there will always be some customer or service impact when patching, upgrading, etc.
The next level up from a single node is Single-Site Multi-Node. This gives you more in terms of DR and can also give a little bit of HA, depending on the technologies used. In this deployment, there are typically two or more nodes involved in either an active-passive model or a multi-active model. The nodes are typically spread across different failure domains such as racks, network switches, and disks.
In the active-passive model, there is a single master node and any number of passive (secondary) nodes. If there is an issue with the master node, a secondary node can be promoted to master and the application pointed at it. This is typically automatic but does involve some downtime while the application repoints to the new node. Recovery from failure is a lot faster than with a single-node architecture, but this is still not a perfect solution for production deployments. Examples of databases that run in active-passive configurations are Oracle, SQL Server, MySQL, and Postgres.
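The promotion step described above can be sketched in a few lines of Python. This is a toy in-memory model and not how any particular database implements failover; the `Node` and `ActivePassiveCluster` classes and the node names are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    name: str
    role: str            # "master" or "secondary" (hypothetical labels)
    healthy: bool = True


@dataclass
class ActivePassiveCluster:
    nodes: List[Node] = field(default_factory=list)

    def master(self) -> Optional[Node]:
        """Return the current master if it is healthy, otherwise None."""
        for node in self.nodes:
            if node.role == "master" and node.healthy:
                return node
        return None

    def fail_over(self) -> Node:
        """Demote the failed master and promote the first healthy node."""
        for node in self.nodes:
            if node.role == "master":
                node.role = "secondary"   # needs a re-sync before it is useful again
        for node in self.nodes:
            if node.healthy:
                node.role = "master"
                return node
        raise RuntimeError("no healthy node available to promote")

    def route_write(self) -> str:
        """Return the name of the node that should receive writes."""
        node = self.master() or self.fail_over()
        return node.name


cluster = ActivePassiveCluster(
    [Node("db1", "master"), Node("db2", "secondary"), Node("db3", "secondary")]
)
print(cluster.route_write())      # db1 is healthy, so writes go there
cluster.nodes[0].healthy = False  # simulate losing the master
print(cluster.route_write())      # a secondary is promoted and takes the writes
```

In a real deployment the gap between the master failing and `route_write` succeeding again is the downtime the application sees while it repoints.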
The Pros of the Active Passive Model Are:
- Still relatively cost-effective, as some providers allow you to run a passive node at no charge provided you have an active support contract.
- Provides much better HA capability than a single-node architecture.
The Cons of the Active Passive Model are:
- Can get expensive if you have to license the secondary node(s), as you are paying for hardware resources that sit idle. The only time they are used is in the event of a disaster or failure.
- Expensive in terms of operational cost, as after a failure all the servers need to be re-synced in order to get back to an active-passive configuration.
Multi-Active Single Site
In this model, all the nodes in the cluster are available for read and write operations. There is no concept of master and secondary nodes, as all nodes are equal within a multi-active cluster. This gives you a lot of benefits and control over HA and DR capabilities. It also makes the cluster inherently easier to scale. Databases that can be deployed in this category include CockroachDB, Cassandra, and Couchbase.
The Pros of the Multi-Active Model are:
- Scalability of both Read and Write operations
- Always-on availability, meaning no downtime when completing maintenance tasks like upgrades and patching.
- Cost-effective in terms of resource utilization, as all nodes are actively used all of the time. This makes these solutions some of the most cost-effective on the market, although a higher upfront licensing cost may be observed.
- RPO of 0 and RTO < 10 seconds
The Cons of the Multi-Active Model are:
- Most of these solutions take a performance hit because nodes have to communicate over the network. This is usually minimal in the single-site model, where the network is very fast and has plenty of bandwidth.
- Some of the technologies, like Cassandra, need regular maintenance jobs such as repair operations to be completed to ensure the data on all nodes is replicated and consistent.
The overall con for the single-site deployment is there is no coverage for a whole site/regional outage. If this is a requirement, then a multi-site deployment is a more suitable model.
The Multi-Site Model dictates that the nodes are spread across different sites or regions. This is important if the survivability criterion is the loss of a site or region. A multi-site deployment's single biggest advantage over a single-site deployment is that you can survive the outage of a region, data center, or site.
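As a rough illustration of that survivability criterion, the sketch below checks whether a cluster keeps a majority of its nodes after losing any one site. It assumes a database that needs a strict majority of nodes to stay available, which is an assumption for this example and not true of every multi-active product; the function name and site names are hypothetical.

```python
from collections import Counter
from typing import Dict


def survives_site_loss(placement: Dict[str, str]) -> bool:
    """Check a node-to-site placement against the loss of any single site.

    Returns True only if a strict majority of all nodes is still up
    after every node in one site (whichever site it is) goes down.
    """
    total = len(placement)
    majority = total // 2 + 1
    nodes_per_site = Counter(placement.values())
    return all(total - lost >= majority for lost in nodes_per_site.values())


single_site = {"n1": "london", "n2": "london", "n3": "london"}
two_sites = {"n1": "london", "n2": "london", "n3": "dublin"}
three_sites = {"n1": "london", "n2": "dublin", "n3": "frankfurt"}

for label, placement in [
    ("single site", single_site),
    ("two sites", two_sites),
    ("three sites", three_sites),
]:
    print(label, survives_site_loss(placement))
```

Under this assumption, two sites are not enough for a three-node cluster: losing the site that holds two of the nodes leaves only one, which is not a majority.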
A lot of the pros and cons of Multi-Active vs Active-Passive are the same as in the single-site case. However, more consideration should be given to the points below.
Network Latency: As sites are normally geographically dispersed for resiliency, network latency normally plays a part in the response time of applications. This is normally the price you have to pay for regional or data center resiliency. Some databases, like CockroachDB, allow you to control a portion of this latency using its Geo-Partitioning feature.
DR: Backups have to be taken from all nodes within the cluster, not just a single node as in the active-passive scenario.
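To make the network latency point more concrete, here is a back-of-the-envelope sketch of how round-trip times between sites could feed into write latency. It assumes a write is acknowledged once a majority of replicas (including the node coordinating the write) has it, and it ignores disk and processing time. The function name and the latency figures are invented for illustration.

```python
from typing import List


def quorum_commit_latency_ms(follower_rtts_ms: List[float]) -> float:
    """Estimate write latency from the coordinator's point of view.

    follower_rtts_ms holds the round-trip time from the coordinating node
    to each of the other replicas. The coordinator counts towards the
    majority itself, so it waits for the fastest (replicas // 2) followers.
    """
    replicas = len(follower_rtts_ms) + 1
    acks_needed = replicas // 2
    if acks_needed == 0:
        return 0.0
    return sorted(follower_rtts_ms)[acks_needed - 1]


# Three replicas in one data center: sub-millisecond round trips.
print(quorum_commit_latency_ms([0.5, 0.6]))

# Three replicas in three regions: the nearest other region sets the pace.
print(quorum_commit_latency_ms([12.0, 85.0]))

# Five replicas: two acknowledgements are needed, so the second-fastest counts.
print(quorum_commit_latency_ms([2.0, 30.0, 70.0, 140.0]))
```

The takeaway is that where the replicas sit relative to the node handling the write matters more than the distance to the furthest site.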
The other solutions are still solid options depending on your application requirements and needs.
I hope the pros and cons of each solution outlined here will help you decide on the correct deployment for your applications. Whichever solution you pick needs to meet your requirements in terms of business continuity (HA and DR), cost (not only in terms of $ but also in terms of operational and outage cost; a full TCO model should be considered), and performance.
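A very simple TCO comparison could look like the sketch below. Every figure in it is a placeholder and not a real price or outage statistic, and the `Deployment` class is only a hypothetical way to line the costs up side by side.

```python
from dataclasses import dataclass


@dataclass
class Deployment:
    name: str
    licence_cost: float            # per year, placeholder figure
    operations_cost: float         # per year, placeholder figure
    expected_outage_hours: float   # per year, placeholder figure

    def annual_tco(self, outage_cost_per_hour: float) -> float:
        """Licensing plus operations plus the expected cost of downtime."""
        outage_cost = self.expected_outage_hours * outage_cost_per_hour
        return self.licence_cost + self.operations_cost + outage_cost


# All numbers below are invented purely to show the shape of the calculation.
options = [
    Deployment("single node", 10_000.0, 5_000.0, 20.0),
    Deployment("active-passive", 15_000.0, 12_000.0, 4.0),
    Deployment("multi-active", 30_000.0, 8_000.0, 0.5),
]

outage_cost_per_hour = 5_000.0  # what an hour of downtime costs the business
for option in sorted(options, key=lambda d: d.annual_tco(outage_cost_per_hour)):
    print(f"{option.name}: {option.annual_tco(outage_cost_per_hour):,.0f}")
```

Changing `outage_cost_per_hour` to match your own business is the interesting part: the cheaper a deployment is to license, the more its ranking tends to depend on how much an outage actually costs you.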