In NuoDB 2.0.1, a region is a property that represents a geographic location for the database. Such locations might include a data center in a major city, country, or building location. A single NuoDB database can have multiple distinct regions. In addition to representing a location, regions can also be used to enforce database consistency, by constraining write transactions to complete only when a specified number of regions observed the write. This transaction level constraint is call "region level commit".
This article describes how to configure a NuoDB database to include geographical regions and region-level commits. To demonstrate this functionality, we run this setup against Jepsen (nuodb-specific fork here), a distributed database stress test. Jepsen subjects databases to network partitions while clients are performing inserts. Other posts about running Jepsen against NuoDB can be found at: Jepsen Part 1, and Jepsen Part 2.
Provisioning a region starts with the NuoDB Broker and or Agent. For simplicity, this article is only going to use Brokers in its examples, however, the same configuration steps apply to Agents as well. Brokers can accept the following flag which designates their region:
The name is an arbitrary string that represents a physical location associated with the Broker. For example, one can indicate that a Broker running in New York City by assigning the string "NYC" as the region argument. The full command is as follows:
java -jar /opt/nuodb/jar/nuoagent.jar --bin-dir /opt/nuodb/bin --broker --port 17600 --domain inventory --password foo --region NYC
This command provisions a host to the region NYC. Subsequent NuoDB Transaction Engines (TEs) and Storage Managers (SMs) started on this host will belong to the NYC region.
Region Level Commit
One of the challenges for a database that spans multiple regions is to ensure that data is consistent across these regions. If a database is running across various regions, data written to one region should eventually be written to all other regions. At the transaction level, it may be desirable to ensure that a write transaction doesn't complete until it receives an acknowledgement from one or more peer regions. For example, a transaction that inserts into an inventory database might need to confirm that the insert took place in at least two data centers before it can commit. NuoDB provides region level commit to enforce this write level constraint.
To enable region-level commit, each TE and SM in the database must run with the following argument:
This argument specifies that write transactions cannot commit until at least n regions have acknowledged the write. The following is an example using this argument to start a SM:
java -jar nuodbmanager.jar --broker host1.foo.com --password bar --command "start process sm host host1.foo.com database inventory archive /var/archive initialize yes options '--commit region:2'"
This section describes the results of the Jepsen stress test run against a NuoDB database configured with three regions and setup with a region level commit requirement of at least two regions. The goal of this experiment is to show how a running NuoDB database can sustain the loss of one a region while handling insert transactions. A real life analogy to this experiment is a database handling inserts across three major US cities at which point, one city is completely isolated due to a major network failure.
The database was configured to run on five different hosts which represents data centers in the US cities New York, LA, and Boston:
Each host was provisioned with a NuoDB Broker using the --region flag that corresponds to it's region:
dottavio@nyc$ java -jar nuoagent.jar --bin-dir /opt/nuodb/bin --broker --port 17600 --domain jepsen --password jepsen --region NYC dottavio@la$ java -jar nuoagent.jar --bin-dir /opt/nuodb/bin --broker --port 17600 --domain jepsen --password jepsen --region LA dottavio@bos$ java -jar nuoagent.jar --bin-dir /opt/nuodb/bin --broker --port 17600 --domain jepsen --password jepsen --region BOS
The NYC and LA regions each contained one SM and two TEs. The BOS region, representing a smaller datacenter, had one SM and one TE. Each SM and TE used the --commit region flag as follows:
# SM example java -jar /opt/nuodb/jar/nuodbmanager.jar --broker nyc.foo.com --password jepsen --command "start process sm host nyc.nuodb.com database jepsen archive /var/archive initialize yes options '--ping-timeout 30 --commit region:2'"
# TE example java -jar /opt/nuodb/jar/nuodbmanager.jar --broker nyc.foo.com --password jepsen --command "start process te host nyc.nuodb.com database jepsen options '--ping-timeout 30 --commit region:2'"
By specifying a region '2' commit, write transactions will only commit if 2 of the 3 cities (i.e., regions) are available.
The argument --ping-timeout enables NuoDB's failure detection system. This feature will shutdown minority partitions of the database when network partitions occur. In the context of this experiment, if regions of the database become isolated they will be removed from the database. For a more detailed description of NuoDB's failure detection see our Failure Detection techblog entry.
A query to NuoDB's system table shows a more detailed description of the database and its regions:
select address, port, georegion from system.nodes order by georegion;
ADDRESS TYPE GEOREGION -------------------- ----------- ---------- bos.nuodb.com Storage BOS bos.nuodb.com Transaction BOS la1.nuodb.com Storage LA la2.nuodb.com Transaction LA nyc1.nuodb.com Transaction NYC nyc2.nuodb.com Storage NYC
Jepsen was configured to partition the LA region from NYC and BOS. The command line to run jepsen was as follows:
lein run nuodb -f partition -u <username> -p <passwd> -X insert -t 17600
The above command will run jepsen against using NuoDB using inserts on port 17600. The '-f partition' argument specifies that a network partition (i.e., the LA region) will occur while inserts are generated.
The jepsen output from the test is included below. Each line in the output represents an insert attempt by the jepsen client. The first column is the insert count, followed by the insert status, followed by how long the insert took in milliseconds.
The network partition occurred after insert #105. Following the network partition, the clients connected to the LA region eventually timeout. This is due to NuoDB's failure detection removal of the LA region from the database. The inserts errors represent the connections to the LA region, there are six in total. The summary of the jepsen output indicates that 1994 inserts out of 2000 were properly acknowledged.
Run will take 200 seconds 0 :ok 9 ▏ 1 :ok 79 ▎ 2 :ok 91 ▎ ... ... ... 103 :ok 4 ▏ 104 :ok 2 ▏ 105 :ok 3 ▏ Partitioned. 106 :error 65006 107 :ok 3 ▏ 108 :error 65005 109 :error 65006 110 :ok 3 ▏ 111 :error 65003 112 :ok 4 ▏ 113 :ok 4 ▏ 114 :ok 32562 115 :ok 3 ▏ 116 :ok 3 ▏ 117 :ok 3 ▏ 118 :ok 3 ▏ 119 :error 65004 120 :ok 5 ▏ 121 :ok 5 ▏ 122 :ok 5 ▏ 123 :ok 4 ▏ 124 :error 65004 125 :ok 3 ▏ 126 :ok 3 ▏ 127 :ok 3 ▏ ... ... ... 1995 :ok 4 ▏ 1996 :ok 4 ▏ 1997 :ok 4 ▏ 1998 :ok 4 ▏ 1999 :ok 3 ▏ 0 unrecoverable timeouts Collecting results. Writes completed in 200.052 seconds 2000 total 1994 acknowledged 1994 survivors all 1994 acked writes out of 2000 succeeded. :-)
Checking the system table again shows that the LA region has been removed by NuoDB's failure detection system.
select address, type, georegion from system.nodes order by georegion;
ADDRESS TYPE GEOREGION -------------------- ----------- ---------- bos.nuodb.com Storage BOS bos.nuodb.com Transaction BOS nyc1.nuodb.com Storage NYC nyc2.nuodb.com Transaction NYC
Region-level commit offers users of NuoDB flexibility at the transaction level with respect to write consistency across geographically seperated regions. We've shown in the previous sections how to setup a NuoDB database using this new region option. In addition to carving the database into distinct regions, it's possible to ensure that at the transaction level writes are acknowleded from a minimum subset of those regions. A simple use-case for this functionality is managing a database that spans a few major US cities. In this use-case, write based transactions only commit when at least two of the three cities acknowledge the write. When one of the three cities is partitioned (due to network failure), subsequent write transactions will continue in the remaining cities. Using Jepsen, we demonstrated NuoDB's resilience to network failure when connections between regions are severed.