Geo Clustering: What, How, and Why?
SUSE has been working on a new DR technology called geo clustering, involving failover procedures that link datacenters that might be far apart.
Disaster Recovery (DR) has been an industry buzzword for decades. But DR can mean several things: mobile offices that act as workspaces after a disaster, off-site backups, recovering data from destroyed disks, live replication between multiple sites, and so on. As a result, the term never developed into a single well-defined concept except at a very high level.
So in terms of data protection, how do IT folks distinguish live replication between two sites from periodic backups between them? SUSE has tossed its hat into the ring by defining a new industry term: “Geo Clustering.”
Geo Clustering does exactly what it says: it takes a High Availability cluster built on standard commodity hardware and replicates the data live across long distances. If the primary datacenter fails, services can automatically fail over to a working datacenter.
Geo Clustering technologies are host-based, which means they can replicate any data that can be written to a local hard drive. In SUSE’s case, that includes any application that runs on Linux. This functionality also extends to VMs, physical hosts, hypervisors, and anything else that runs on standard commodity hardware. More information about the combination of software used to accomplish this can be found in the Technical Details section below.
SUSE’s new Geo Clustering capability for the SUSE Linux Enterprise High Availability Extension can automatically fail over services from a failed datacenter over any distance. SUSE leverages a software stack that is in production in a variety of situations: across campuses with dedicated fiber, over metro-area networks with 1 Gb/s connections, across countries with WAN connectivity, and even on some ships that replicate their data to shore over satellite. Unlike proprietary appliances, which have strict requirements, SUSE’s Geo Clustering software works in virtually any production scenario.
So how have users with critical data handled datacenter failover in the past? On the very expensive end of the cost spectrum, there are traditional proprietary SANs using array-based replication. I wouldn’t recommend dabbling in these technologies if you’re trying to avoid vendor lock-in. In addition, the cost doesn’t stop at the hardware and licensing: users must also pay per gigabyte transferred over long distances for these proprietary hardware solutions. For companies with deep pockets and a preference for large vendors, these are the technologies you come across for rapidly recovering from a full-site outage.
On the other end of the cost spectrum are things like LVM snapshots, which are free but only update the DR system periodically. For users with more relaxed Recovery Point Objectives (RPOs) and Recovery Time Objectives (RTOs), daily or even weekly snapshots are an alternative to live replication between SANs. The SUSE-LINBIT solution aims to bridge the gap here: tried-and-tested mission-critical capability based on open software and commodity hardware. It’s the Swiss army knife of DR: a lightweight tool, used worldwide, which can be depended on for multiple uses, and is completely certified as tested, tried, and true.
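To make the RPO trade-off concrete, here is a minimal sketch of the arithmetic behind periodic snapshots. It assumes a simplified model in which the worst-case data loss equals one full snapshot interval and transfer time is ignored; the function names are hypothetical and not part of any SUSE or LINBIT tool.

```python
from datetime import timedelta


def worst_case_data_loss(snapshot_interval: timedelta) -> timedelta:
    """Simplified model (an assumption): a failure just before the next
    snapshot loses up to one full interval of writes."""
    return snapshot_interval


def meets_rpo(snapshot_interval: timedelta, rpo: timedelta) -> bool:
    """Return True if periodic snapshots at this interval satisfy the RPO."""
    return worst_case_data_loss(snapshot_interval) <= rpo


if __name__ == "__main__":
    rpo = timedelta(hours=24)
    for label, interval in [("daily", timedelta(days=1)),
                            ("weekly", timedelta(weeks=1))]:
        print(label, "snapshots meet a 24h RPO:", meets_rpo(interval, rpo))
```

Under this model, an RPO measured in minutes or seconds rules out daily snapshots entirely, which is where live replication comes in.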
IDC has reported that the global enterprise storage market fell 3% between 2015 and 2016, while server sales rose over the same period. The reason? Commodity hardware solutions are hitting mainstream businesses. Over the past decade, hosting providers and major ISPs used these ‘software on commodity hardware’ combinations to create a cost advantage over other players. Now, Fortune 500 companies, governments, and even financial institutions are realizing that in order to stay cost competitive, they need to use software-defined technologies.
Try it Out!
LINBIT and SUSE have created a joint SUSE Best Practices paper, a technical guide, which describes how to install and test the new geo-clustering feature. If you have some fresh servers running SUSE Linux Enterprise Server and a lab, just view the guide.
Technical Details
SUSE and LINBIT have collaborated on the open source DRBD software for local HA clusters for years. In combination with Corosync and Pacemaker, DRBD lets local high availability clusters fail over automatically. SUSE wrapped this up into its SUSE Linux Enterprise High Availability Extension and developed a GUI for users to interface with the software.
In addition to developing the well-known DRBD software, LINBIT has built a Disaster Recovery add-on called DRBD Proxy, designed to replicate data over WAN environments. Because Pacemaker wasn’t designed to fail over services in WAN environments, DR failovers used to be a manual process.
SUSE’s latest project, ‘Booth’, allows Pacemaker to be used over WAN environments, which enables this much-desired automatic failover functionality.
Now, with the combination of DRBD, DRBD Proxy, Pacemaker, and Booth, organizations can replace their proprietary WAN replication appliances with standard commodity hardware, and a completely software-based stack.
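As a rough illustration of how automatic failover between sites can be decided safely, the sketch below models a ticket that only one site may hold at a time, granted by simple majority among the sites and an arbitrator. This is a toy model under stated assumptions, not Booth's actual algorithm or API: the member names, `can_hold_ticket`, and `next_ticket_holder` are all hypothetical.

```python
from typing import Dict, Optional, Set

# Hypothetical membership: two cluster sites plus a tie-breaking arbitrator.
SITES = {"site-a", "site-b"}
MEMBERS = SITES | {"arbitrator"}


def can_hold_ticket(site: str, reachable: Set[str]) -> bool:
    """A site may hold the ticket only if it, together with the members it
    can still reach, forms a strict majority of all members."""
    voters = ({site} | set(reachable)) & MEMBERS
    return len(voters) * 2 > len(MEMBERS)


def next_ticket_holder(current: Optional[str],
                       views: Dict[str, Set[str]]) -> Optional[str]:
    """Decide which site should hold the ticket.

    `views` maps each member to the set of members it can reach.
    The current holder keeps the ticket while it still has a majority;
    otherwise the ticket moves to another site that does, or to no one.
    """
    if current is not None and can_hold_ticket(current, views.get(current, set())):
        return current
    for site in sorted(SITES):
        if site != current and can_hold_ticket(site, views.get(site, set())):
            return site
    return None


if __name__ == "__main__":
    # site-a has failed: site-b and the arbitrator still see each other.
    views = {"site-a": set(),
             "site-b": {"arbitrator"},
             "arbitrator": {"site-b"}}
    print(next_ticket_holder("site-a", views))
```

The majority rule is the point of the sketch: a site that is merely cut off from the WAN cannot keep or take the ticket on its own, so services never run at both sites at once.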
Published at DZone with permission of Greg Eckert.