Curator's Note: The content of this article was written by Philip Stoev, Adam Abrevaya, and Paul Lothridge over at the NuoDB Techblog.
We’re often asked about how replication in the NuoDB NewSQL database compares to replication in MySQL. The truth is, “replication” in NuoDB is not the same as your garden variety replication in a traditional RDBMS. In this article we’ll discuss the advantages derived from our unique architecture and the key differences these produce for replication behavior in NuoDB.
From an historical view, the ease of installation and use in small deployments contributed greatly to the popularity of MySQL for powering many sites on the Internet. Eventually, the traffic to any successful web site would grow beyond what a single server can handle, and to further scale-out would typically involve installing some form of MySQL replication. Not always a straightforward experience, MySQL replication has created a proliferation of books, third-party tools and even entire products aiming to tame it and mitigate the associated pains of scale. DBAs and DevOps had to become intimately familiar with its many quirks, and maintaining large replication deployments quickly became a full time job and an expert field all to its own.
The new-generation technologies implemented in NuoDB provide much easier scale-out from one to one hundred servers and beyond, and remove many of the limitations inherent to the approach used in MySQL. Here is a list of the various issues one would run into with MySQL replication and the ways NuoDB’s architecture bypasses them with considerably less development effort and operator involvement.
Masters and Slaves
Replication in the standard MySQL product is based on the concept of master and slaves, where all updates flow from the master to the slave in a pre-configured hierarchy of database machines. In NuoDB, each process, whether a Transaction Engine or Storage Manager, is an equal peer to each other, so there are no masters and slaves. This fact alone greatly simplifies management of many processes, as there are no master machines that require special treatment and no slave machines that must be configured to replicate from a particular master.
Setting up Replication
Easy provisioning of new servers
Adding a new slave to an existing MySQL setup requires several steps. Since it is unlikely that all the binary logs that are required to create a slave from scratch have been preserved, usually an initial, consistent snapshot of the data needs to be transferred to the slave first using some other means before actual replication can begin.
NuoDB provides “one-click” creation for both Storage Managers and Transaction Engines. Storage Managers will automatically synchronize with the rest of the database, even if they start up empty. The Transaction Engines will self-organize and fetch whatever data is required to process queries without having to be pointed to it.
Multi-master out of the box
To MySQL, multi-master, or being able to perform updates on multiple machines at once, does not come out of the box. It requires the use of customized setups or third-party products or tools. By that definition, NuoDB is multi-master from the start, as soon as a second host is added to the database.
Running Applications in a Replicated Environment
A NuoDB cluster is more than a collection of separate databases that are replicated from one another. It it a unified whole that is always consistent and presents a single interface to the application, even if the cluster is spread across multiple data centers. The application can select the appropriate transaction isolation level for each transaction, and all NuoDB servers will work as one to provide the expected transactional semantics.
No difference between reads and writes
In the replication that comes with standard MySQL, SQL update queries can only be sent to the master server. The separation of the write part from the read part of a workload does not come naturally, so dedicated drivers, proxies, or frameworks are required. In NuoDB, reads and writes are treated identically and any host process can handle both requests, so any transaction can be executed from any connection. The application needs to open and manage just a single database connection.
Sending the SELECT queries to a MySQL slave can cause them to return older data that does not reflect updates that have just been made. Unlike in NuoDB, updates to the master and reads from the slave can not participate in the same transaction and therefore can not be fully consistent with each other.
No replication breakages
There are various instances where MySQL replication can “break”, that is, the flow of replicated data from the master to the slaves stops. One example would be if the data on the slave is changed outside of replication, such as by operator error. Issuing an incorrect
CHANGE MASTER command with its multiple parameters can also have unexpected sequences, which may not become visible immediately.
NuoDB does not require that DBAs partake in the inner workings of the replication – there is no need to configure who replicates from whom, starting from what position and against what data. The NuoDB database presents a unified view of a single collection of data, so there is no opportunity for a single change made on a single slave to cause replication to stop. There is no need to explicitly set slaves to be read-only in order to shield them from harm.
Availability and Handling Failure
A failure in a MySQL master causes the cluster to become unavailable for writing until a slave is promoted to take its place and the entire replication topology reconfigured. This process does not happen automatically and requires a separate monitoring mechanism to detect failures of the master.
NuoDB includes built-in monitoring of all processes, which allows non-responsive processes to be removed automatically from the database. In addition, every Transaction Engine is ready to receive the clients of the failed process without an external failover and promotion procedure. Every Storage Manager maintains a complete copy of the database, so that it will continue to record transactions to disk without interruption. As no process is special, the failure of any individual Storage Manager or Transaction Engine does not impact the availability of the database as a whole. New client connections are automatically directed to one of the remaining processes.
Rejoining the database
In MySQL, a node that was previously a master can not rejoin the cluster back in its previous role without going through a re-synchronization process and a change in the replication topology. In NuoDB, a process that has gone offline for some reason can rejoin the database without any special administrative procedures. Bringing it up to speed with the rest of the processes happens automatically and it is ready to begin serving requests. The reduced downtime per process in turn increases the overall availability of the system.
The NuoDB NewSQL database peer-to-peer architecture manages to remove many of the sources of difficulty when establishing database replication. Scale-out, consistency and reliability are achieved with a reduced administration effort and without having to make changes to the application itself. There are no different types of servers dedicated to different tasks (writes, analytics queries, and dedicated stand-by) and no need for application infrastructure code to support scale-out in read queries.