Database Metro Area Clustering Across Data Centers
With ClustrixDB 9, you can define zones within a cluster that each contain a subset of the nodes in the cluster. Each zone can reside in a different data center within a large metro area.
Data integrity, high availability, and easy disaster recovery have always been major requirements of our ClustrixDB customers. Their OLTP applications are mission-critical and often customer- or consumer-facing, which makes the acceptable margin for error virtually nil.
Over the years, we have brought a variety of innovations to market to meet these requirements, such as our patented ClustrixDB nResiliency for fault tolerance and the Clustrix Rebalancer that automatically optimizes data distribution for the number of available nodes.
With the advent of networking between data centers within a metropolitan region, such as with AWS Availability Zones (AWS AZs), we saw the opportunity to take another leap in meeting the stringent application requirements of our high-profile customers.
ClustrixDB Metro Area Clustering and Availability Zones
ClustrixDB 9 supports Metro Area Clustering and Availability Zones as a new way to deploy our distributed database in public or private clouds across networked data centers in large metropolitan areas. AWS Availability Zones is one example, but other cloud vendors offer similar networking support, and some private data centers are deploying it as well.
With ClustrixDB 9, you can define "zones" within a cluster, where each zone contains a subset of the nodes in the cluster and each zone can reside in a different data center within a large metropolitan area; for example, San Francisco, Palo Alto, and Oakland.
A Quick Review of Failure Domains
A Failure Domain is any logical group of resources that are likely to fail together. For most distributed databases, the cluster would consider a single server (AKA node) as a failure domain. If a disk failed in the node, the database considers the entire node failed. If other nodes in the cluster can't reach a node over the network, for any reason, the cluster considers that node failed. This means the failure domain is at the level of a server.
That's great, but what if you have a bunch of servers on a server rack and that rack has an unreliable power distribution system causing the entire rack to lose power? And let's assume this is a common problem that you have not been able to take care of, though you know you should. In this case, the entire rack is a failure domain. If the rack's power stops, all servers on that rack will stop.
Now, in ClustrixDB 9, you can tell the cluster that a set of nodes should be considered as a single failure domain. So, let's say you have a nine-node cluster and you have three nodes in each rack. And you still haven't found time to fix that power distribution problem yet (we're not judging you). You can now tell ClustrixDB 9 that the nine nodes are grouped into three zones of three nodes each, corresponding to which rack they are in.
What this does is give ClustrixDB 9 the information that it needs to avoid placing both (or all) replicas of a data slice in the same zone. In any deployment scenario, ClustrixDB makes sure it does not put both replicas of a data slice on the same node by default, but now that you have told it that multiple nodes are in the same zone (AKA failure domain), it will know not to put both replicas in the same zone. That way, if the zone fails, you haven't lost all copies of that data slice and the database will continue reading and writing all of the data.
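To make the placement rule concrete, here is a minimal sketch of zone-aware replica placement. It is not ClustrixDB's actual algorithm or API: the node-to-zone map, the function names, and the round-robin choice of zones are all assumptions made for illustration. The only thing it takes from the text above is the rule itself, which is that the replicas of a slice must land in different zones.

```python
# Hypothetical nine-node layout from the example above: node id -> zone (rack or AZ).
NODE_ZONES = {
    1: "zone-a", 2: "zone-a", 3: "zone-a",
    4: "zone-b", 5: "zone-b", 6: "zone-b",
    7: "zone-c", 8: "zone-c", 9: "zone-c",
}


def place_replicas(slice_id, node_zones, replicas=2):
    """Pick one node per replica, each in a different zone.

    Illustrative round-robin policy only; the real Rebalancer also weighs
    things like data distribution, which this sketch ignores.
    """
    zones = sorted(set(node_zones.values()))
    if replicas > len(zones):
        raise ValueError("need at least as many zones as replicas")
    chosen = []
    for i in range(replicas):
        # Consecutive replicas go to consecutive zones, so no zone repeats.
        zone = zones[(slice_id + i) % len(zones)]
        nodes = sorted(n for n, z in node_zones.items() if z == zone)
        chosen.append(nodes[slice_id % len(nodes)])
    return chosen


def survives_zone_failure(placement, node_zones, failed_zone):
    """True if at least one replica lives outside the failed zone."""
    return any(node_zones[n] != failed_zone for n in placement)


if __name__ == "__main__":
    for slice_id in range(6):
        placement = place_replicas(slice_id, NODE_ZONES)
        print(slice_id, placement, survives_zone_failure(placement, NODE_ZONES, "zone-a"))
```

Because every slice has its replicas in distinct zones, losing any single zone (a rack losing power, say) still leaves one copy of each slice reachable.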
Extend That Concept to Metro Area Clustering and Availability Zones
Now, let's take that three-zone cluster of nine nodes, and instead of running each zone in a rack in your data center, we'll put it into AWS. Yay! Now, you don't have to fix that power distribution system. We will create the cluster with three nodes in one AWS Availability Zone (AZ), three nodes in another AZ, and the remaining three nodes in a third AZ. A key point is that all of these AZs are in the same AWS Region.
If you're not 100% clear on the difference between an AWS region and an AZ, check out this great blog article by our good partners at Rackspace: AWS 101: Regions and Availability Zones.
The reason we need all nodes in AZs in the same AWS region is that ClustrixDB needs low-latency networking between the nodes in the cluster. We state a latency requirement of less than 2 ms to ensure the good performance of your ClustrixDB cluster. AWS does a good job of keeping the network latency between same-region AZs at around 1 ms or less (most of the time). But between AWS regions, your network traffic travels across the public Internet, so there's little anyone can do to ensure the latency.
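If you want to sanity-check a candidate set of zones against that 2 ms requirement, a small script along these lines may help. It is only a sketch: the zone names and latency figures are made-up sample data, the helper name is hypothetical, and how you actually measure latency between sites (and whether you count one-way or round-trip time) is left to your own tooling.

```python
from itertools import combinations

MAX_LATENCY_MS = 2.0  # the "less than 2 ms" requirement stated above


def unsuitable_links(zones, latency_ms, limit_ms=MAX_LATENCY_MS):
    """Return (pair, latency) for every zone pair that is unmeasured or too slow.

    `latency_ms` maps an alphabetically ordered (zone, zone) tuple to a
    measured latency in milliseconds. A missing pair is reported with None.
    """
    problems = []
    for pair in combinations(sorted(zones), 2):
        ms = latency_ms.get(pair)
        if ms is None or ms >= limit_ms:
            problems.append((pair, ms))
    return problems


if __name__ == "__main__":
    # Hypothetical measurements, not real benchmark data.
    zones = ["san-francisco", "palo-alto", "oakland"]
    measured = {
        ("oakland", "palo-alto"): 1.4,
        ("oakland", "san-francisco"): 0.6,
        ("palo-alto", "san-francisco"): 0.9,
    }
    print(unsuitable_links(zones, measured))  # an empty list means every link qualifies
```

Checking every pair matters because any node may need to talk to any other node in the cluster, so a single slow link between two zones is enough to hurt.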
If you read that great Rackspace article, you now know that AWS AZs within a region can be in totally different data centers within that region. So, you can think of multiple AZs as actually multiple data centers (although technically even one AZ can be multiple data centers, but let's not go there today).
Finally, let's say you actually don't want to run in AWS, but instead, you want to run ClustrixDB across multiple data centers in your metropolitan area. Let's say you have a data center in San Francisco, another in Palo Alto, and one in Oakland, and your inter-data-center network has latencies consistently less than 2 ms.
What you've just created is a Metro Area Cluster of Zones using ClustrixDB 9's new Zones feature and your own data centers.
Why Metro Area Clustering and Availability Zones?
With ClustrixDB, the important thing to remember is that this is a single database instance that is stretched across three closely networked data centers. It is not three (or even nine) databases that are replicated to each other.
This means that any application that writes to a node in San Francisco can immediately see the effect of that write in Palo Alto. Regular RDBMS transaction semantics still hold true and are not changed in any way by this configuration (i.e. the app in SF needs to commit before any session in Palo Alto will be allowed to see that change).
That means that you don't need to set up or manage any replication between the data centers. And since there is no replication, there is no slave lag (when the replication slave is transactionally behind the master).
If you had used replication instead of ClustrixDB's Zones, then you would have a master database in one zone, with two slave databases in other zones. Aside from the aforementioned slave lag, which many applications can't tolerate, your application could only make changes to the master database and must not make changes to the slaves. This means that any database session that needs to perform an INSERT, UPDATE, DELETE, or SELECT ... FOR UPDATE must run only on the master. Therefore, your transactional workload is limited to only the servers in the master's zone. But with ClustrixDB, all nodes in all zones are read/write and enforce 100% consistency with each other. So, your app can read and write from any node in any zone at the same time it's reading and writing from other nodes in other zones. This means that your app can use all nine nodes instead of just using the nodes in a master zone.
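The arithmetic behind that last point can be spelled out in a few lines. This is a toy model, not a benchmark: the node layout is the hypothetical nine-node, three-zone example used throughout, and the function simply counts which nodes may accept writes under each deployment style as described above.

```python
# Hypothetical nine-node, three-zone layout: node id -> zone.
NODE_ZONES = {
    1: "zone-a", 2: "zone-a", 3: "zone-a",
    4: "zone-b", 5: "zone-b", 6: "zone-b",
    7: "zone-c", 8: "zone-c", 9: "zone-c",
}


def writable_nodes(node_zones, master_zone=None):
    """Return the nodes that can serve write transactions.

    master_zone=None models one cluster stretched across zones, where every
    node is read/write. Passing a zone name models master/slave replication,
    where only the servers in the master's zone may take writes.
    """
    if master_zone is None:
        return sorted(node_zones)
    return sorted(n for n, z in node_zones.items() if z == master_zone)


if __name__ == "__main__":
    print("replication:", writable_nodes(NODE_ZONES, master_zone="zone-a"))
    print("zones:      ", writable_nodes(NODE_ZONES))
```

In this model, replication leaves three of the nine nodes available for transactional work, while the zoned cluster can use all nine.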
Published at DZone with permission of Lisa Schultz, DZone MVB. See the original article here.