In this post, we describe VoltDB's high availability architecture and how it interacts with bonded networks. Together, they offer protection and continuity when network links fail. How do we know? We tested it, and this post describes the tests we ran to verify compatibility with various bonded network modes.
What Is a Bonded Network?
Bonded networking allows multiple physical network interfaces to act as one logical interface, providing higher bandwidth, redundancy for high availability, or both. It is also known as Channel Bonding, NIC Teaming, Link Aggregation Groups (LAG), Trunk Group, EtherChannel and 802.3ad. The Linux driver that bonds the server interfaces is transparent to all higher-level applications and shouldn't require any extra configuration by applications.
VoltDB High Availability Feature
VoltDB allows a user to specify how many copies of each record the cluster will store. If you tell the database to store three copies, two machines can fail and the cluster will continue operating. Failure detection and failure handling are automatic and backed by strong consensus guarantees. Availability is configured by specifying the number of cluster nodes that can be lost without interrupting the service, a setting VoltDB calls K-safety.
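The relationship between copies and tolerated failures can be written down as simple arithmetic. The sketch below is illustrative only; the function names are hypothetical and are not part of any VoltDB API.

```python
def copies_for_k(k: int) -> int:
    """Copies of each record stored for a given K-safety value (k + 1)."""
    if k < 0:
        raise ValueError("k must be non-negative")
    return k + 1


def guaranteed_to_survive(k: int, failed_nodes: int) -> bool:
    """True if the cluster is guaranteed to keep running.

    Worst case: every failed node held a copy of the same record,
    so at most k nodes may be lost.
    """
    return failed_nodes <= k


# Three copies of each record corresponds to k = 2.
print(copies_for_k(2))              # 3
print(guaranteed_to_survive(2, 2))  # True
print(guaranteed_to_survive(2, 3))  # False
```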
VoltDB uses heartbeats to verify the presence of other nodes in the cluster. If a heartbeat is not received within a specified time limit, that server is assumed to be down and the cluster re-configures itself with the remaining nodes. This time limit is called the heartbeat timeout and is specified as an integer number of seconds.
While waiting for heartbeats, any transaction that requires data on an unreachable node waits until either the node re-establishes connectivity and resumes heartbeating, or the timeout expires and the VoltDB cluster reconfigures itself to run without the missing node's copies of the data.
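The timeout rule described above can be illustrated with a small model. This is a minimal sketch of the general technique, not VoltDB's implementation; the class and method names are hypothetical, and proddb02 is a made-up host name.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class HeartbeatMonitor:
    """Tracks the last heartbeat time per host (times in seconds)."""
    timeout_s: float
    last_seen: Dict[str, float] = field(default_factory=dict)

    def record(self, host: str, now: float) -> None:
        """Note that a heartbeat from `host` arrived at time `now`."""
        self.last_seen[host] = now

    def dead_hosts(self, now: float) -> List[str]:
        """Hosts whose last heartbeat is older than the timeout."""
        return sorted(
            host for host, seen in self.last_seen.items()
            if now - seen > self.timeout_s
        )


monitor = HeartbeatMonitor(timeout_s=10)
monitor.record("proddb01", now=0.0)
monitor.record("proddb02", now=0.0)
monitor.record("proddb02", now=9.0)   # proddb02 keeps heartbeating
print(monitor.dead_hosts(now=8.0))    # []
print(monitor.dead_hosts(now=10.5))   # ['proddb01']
```

A link failover that finishes before the timeout simply shows up as a late heartbeat; only a longer interruption marks the host as dead.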
Summary of Results
VoltDB can handle a bonded network failover as long as the network re-establishes connectivity within VoltDB's configured heartbeat timeout. If the heartbeat timeout is too short for the bonded network failover, one or more nodes can be evicted from the cluster. As long as the K-safety value is high enough, the remaining cluster nodes will continue operating.
We tested five types of network bonding, and VoltDB was resilient to link failures in all of them. In addition, we verified that VoltDB ran correctly in the modes that also support load balancing. Failover duration differed between network switches. In our controlled environment under a moderate workload, a heartbeat timeout of ten seconds was adequate to prevent unnecessary dead host detections.
The network switch configuration can dramatically affect the duration of failovers. For example, in one of our tests we were connected to a switch with spanning tree enabled. A spanning tree topology change was triggered, which caused some ports on the switch to go through the listening and learning states and temporarily block traffic on those ports. The delay before the ports resumed forwarding was longer than our heartbeat timeout, and it caused a node to fail.
If you only require high availability with link redundancy and don't need aggregation, we recommend setting up the bonded network for failover using active-backup mode. This networking mode has the simplest configuration and produced the most reliable failovers in our testing. After setting up the network, we recommend testing the duration of failover under load and setting VoltDB's heartbeat timeout setting to a value that is long enough to prevent a node from being evicted from the cluster.
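One way to turn measured failover durations into a heartbeat timeout is to take the slowest observed failover and apply a safety margin. The helper below is a minimal sketch; the function name and the 2x margin are assumptions for illustration, not a VoltDB recommendation. It rounds up because the timeout is specified as a whole number of seconds.

```python
import math
from typing import Sequence


def suggest_heartbeat_timeout(failover_durations_s: Sequence[float],
                              margin: float = 2.0) -> int:
    """Return a whole-second timeout above the slowest measured failover.

    `margin` is an assumed safety factor; choose one that fits your
    own network measurements.
    """
    if not failover_durations_s:
        raise ValueError("need at least one measured failover duration")
    slowest = max(failover_durations_s)
    return max(1, math.ceil(slowest * margin))


# Hypothetical measurements from repeated link failovers under load.
print(suggest_heartbeat_timeout([0.4, 3.0, 2.2]))  # 6
```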
Bonded Network Modes
Linux Channel Bonding can be configured in one of seven modes and each mode has benefits and limitations. Below is a chart of each mode, its features and whether we can verify VoltDB is resilient to link failures.
Software Setup and Test Configuration
We ran tests on CentOS 7.4 and Ubuntu 16.04. We also tested VoltDB inside a VM using LXC containers that shared a bridged interface over the bonded interface and verified that the configuration had no adverse effects.
We tested failover and performance by running the VoltDB "Voter" application on a 2-node, k=1 cluster, with a client that continuously generates transactions. We monitored for errors and found no transaction failures while repeatedly killing network links at various intervals. We also varied the VoltDB heartbeat timeout value to find the limitation of heartbeat timeout vs. link failover time.
Hardware and Driver Configuration
Our testing used the Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011). This is the same kernel module used in Ubuntu 14.04, 16.04, and CentOS 7.4. In all modes tested, we used the default parameters for the v3.7.1 bonding driver on both Ubuntu and CentOS.
We compared two server and switch configurations:
- Ubuntu 16.04 (Linux kernel 4.4.0-96-generic) attached to an IBM RackSwitch™ G8264 using dual Intel 10Gb copper interfaces. The interfaces used the kernel module ixgbe: Intel(R) 10 Gigabit PCI Express Network Driver, version 4.2.1-k.
- CentOS 7.4 (Linux kernel 3.10.0-693) attached to a Dell PowerConnect 8024F using dual Intel 10Gb copper interfaces. The interfaces used the kernel module ixgbe: Intel(R) 10 Gigabit PCI Express Network Driver, version 4.4.0-k-rh7.4.
We used module defaults for all of our tests; in particular, miimon was used for port fail detection.
We also verified proper failover in a multi-switch configuration using a single server (Ubuntu 16.04) with interfaces attached to each switch above.
With one exception, we were able to configure our equipment so all failover tests worked correctly. When we repeatedly brought the interfaces up and down within three seconds of each other and VoltDB had a heartbeat timeout of one second, the network couldn't keep up and a Dead Host detection was triggered. Increasing the heartbeat timeout to two seconds corrected this issue. In general, any link thrashing at intervals of three seconds or less caused the network to become unstable. This wasn't specific to VoltDB.
We recommend that the network administrator determine the maximum acceptable duration of a failover involving multiple links and take this into consideration when configuring VoltDB's heartbeat timeout.
In active-backup, continuously thrashing (bringing each interface up and down) every second severely impacted networking in our configuration. This affected the network in general, and VoltDB was impacted the same as if a single link had failed and recovered continuously. In our limited testing of our configurations with two bonded interfaces, we saw that the best case for any repeated failover was three seconds. If the user has interfaces that are constantly bouncing up and down, they will experience network issues that are outside the control of VoltDB.
Balance-rr is the only mode that allows a single TCP/IP stream to use more than one interface's worth of throughput. However, it can deliver packets out of order, which causes TCP to retransmit packets and triggers its congestion control. This could add latency to VoltDB transactions, so it should be measured to make sure that application performance meets all requirements.
Our switch hardware didn't specifically support balanced round-robin traffic, sometimes called trunking, so we were not able to fully utilize two interfaces' worth of throughput. Even without hardware support, we were able to verify that throughput was not degraded in this mode and that link failover didn't adversely affect VoltDB when the heartbeat timeout was large enough.
The two hardware configurations (Dell/CentOS and IBM/Ubuntu) had significantly different failover durations. In this mode, some failovers took longer than 10 seconds, and we had to use a heartbeat timeout of up to 30 seconds to maintain stability.
Tools for Testing Bonded Network Failover
It is important to set VoltDB's heartbeat timeout to be longer than the time it takes for the failover to occur. This section describes some of the methods that we used to measure the link failover duration.
A simple method is to start continuous pings across the bonded network to the host as well as an ssh session to the host. You may notice ping timeouts and/or missing sequence numbers. On the ssh session, if you type continuously you will notice a delay in the echo of your keystrokes. Measuring the ping timeouts and the typing delay will give you an estimate of how long the link failover can take. You should do more than one failover of each link in sequence. In our tests, the first failover was usually very fast, but the second failover was often slower.
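The ping-based measurement can be automated by looking for gaps in the reply sequence numbers. The sketch below assumes Linux-style ping output with one request per second (the default interval); the function name and the sample address are hypothetical.

```python
import re

SEQ_RE = re.compile(r"icmp_seq=(\d+)")


def longest_outage(ping_output: str, interval_s: float = 1.0) -> float:
    """Estimate the longest outage from missing ping replies.

    Only reply lines ("... bytes from ...") are counted, so error lines
    that also carry an icmp_seq value are ignored.
    """
    seqs = []
    for line in ping_output.splitlines():
        if "bytes from" in line:
            match = SEQ_RE.search(line)
            if match:
                seqs.append(int(match.group(1)))
    longest_gap = 0
    for prev, cur in zip(seqs, seqs[1:]):
        longest_gap = max(longest_gap, cur - prev - 1)
    return longest_gap * interval_s


# Illustrative capture: replies 4, 5 and 6 were lost during a failover.
SAMPLE_PING = """\
64 bytes from 10.0.0.5: icmp_seq=1 ttl=64 time=0.21 ms
64 bytes from 10.0.0.5: icmp_seq=2 ttl=64 time=0.19 ms
64 bytes from 10.0.0.5: icmp_seq=3 ttl=64 time=0.22 ms
64 bytes from 10.0.0.5: icmp_seq=7 ttl=64 time=0.25 ms
64 bytes from 10.0.0.5: icmp_seq=8 ttl=64 time=0.20 ms
"""
print(longest_outage(SAMPLE_PING))  # 3.0
```

The result is only as fine-grained as the ping interval, so treat it as an estimate rather than an exact failover time.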
The command cat /proc/net/bonding/bond0 gives the most information about the state of the bond, including which links are active or backup, some failover statistics and any module parameters used. The output is different for each type of bonding. Link state changes on the host should also appear in the system logs.
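If you want to track bond state over a test run, the file can be parsed with a few lines of code. The sketch below handles only a handful of fields as they appear for active-backup mode; the sample text and the interface names eth0 and eth1 are illustrative, not output captured from our tests.

```python
def parse_bond_status(text: str) -> dict:
    """Extract mode, active slave and per-slave status from bonding output."""
    info = {"mode": None, "active": None, "slaves": {}}
    current = None  # name of the slave section being read, if any
    for raw in text.splitlines():
        key, sep, value = raw.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "Bonding Mode":
            info["mode"] = value
        elif key == "Currently Active Slave":
            info["active"] = value
        elif key == "Slave Interface":
            current = value
            info["slaves"][current] = {}
        elif current is not None and key == "MII Status":
            info["slaves"][current]["mii_status"] = value
        elif current is not None and key == "Link Failure Count":
            info["slaves"][current]["link_failures"] = int(value)
    return info


SAMPLE_BOND = """\
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: fault-tolerance (active-backup)
Currently Active Slave: eth0
MII Status: up
MII Polling Interval (ms): 100

Slave Interface: eth0
MII Status: up
Link Failure Count: 1

Slave Interface: eth1
MII Status: down
Link Failure Count: 2
"""
status = parse_bond_status(SAMPLE_BOND)
print(status["active"], status["slaves"]["eth1"])
```

On a real host you would pass in the contents of /proc/net/bonding/bond0 instead of the sample string.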
You may also want to monitor out of order or resent packets, as this may contribute to increased latency for VoltDB transactions.
>watch -n 5 'netstat -s | grep segments'
>netstat -s | grep segments
    49049707 segments received
    56121614 segments send out
    6702 segments retransmited
    1 bad segments received.
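To compare runs, it can help to reduce these counters to a single retransmission ratio. The sketch below parses the counters shown above; the function names are hypothetical, and the patterns also accept the "sent out" and "retransmitted" spellings in case your netstat version prints those.

```python
import re


def parse_segments(netstat_output: str) -> dict:
    """Pull the TCP segment counters out of `netstat -s` output."""
    patterns = {
        "sent": r"(\d+) segments sen[dt] out",
        "retransmitted": r"(\d+) segments retransmit",
    }
    counters = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, netstat_output)
        if match:
            counters[name] = int(match.group(1))
    return counters


def retransmit_ratio(netstat_output: str) -> float:
    """Retransmitted segments as a fraction of segments sent."""
    counters = parse_segments(netstat_output)
    return counters["retransmitted"] / counters["sent"]


SAMPLE_NETSTAT = """\
    49049707 segments received
    56121614 segments send out
    6702 segments retransmited
    1 bad segments received.
"""
print(f"{retransmit_ratio(SAMPLE_NETSTAT):.4%}")
```

For the counters above, the ratio is roughly 0.012%. Because the counters are cumulative since boot, comparing two samples taken before and after a failover gives a clearer picture than a single reading.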
After determining the failover time without VoltDB, you should try the link failover with VoltDB running. VoltDB will issue warnings at regular intervals when heartbeats are skipped. For example:
2017-09-26 15:03:19,466 WARN [ZooKeeperServer] HOST: Have not received a message from host proddb01 for 10.005 seconds
2017-09-26 15:03:29,468 WARN [ZooKeeperServer] HOST: Have not received a message from host proddb01 for 20.005 seconds
If the network interruption from the failover exceeds the heartbeat timeout (set to 30 seconds in this example), then surviving nodes will log an error such as:
2017-09-26 15:03:39,467 ERROR [ZooKeeperServer] HOST: DEAD HOST DETECTED, hostname: proddb01
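During repeated failover tests, it is convenient to summarize these messages per host. The sketch below matches only the two message formats shown above; the function name is hypothetical, and other VoltDB versions may word the messages differently.

```python
import re

SILENCE_RE = re.compile(
    r"Have not received a message from host (\S+) for ([\d.]+) seconds"
)
DEAD_RE = re.compile(r"DEAD HOST DETECTED, hostname: (\S+)")


def summarize_heartbeat_log(log_text: str) -> dict:
    """Return the longest reported silence and dead-host flag per host."""
    summary = {}
    for line in log_text.splitlines():
        silence = SILENCE_RE.search(line)
        if silence:
            host, seconds = silence.group(1), float(silence.group(2))
            entry = summary.setdefault(
                host, {"max_silence_s": 0.0, "dead": False}
            )
            entry["max_silence_s"] = max(entry["max_silence_s"], seconds)
            continue
        dead = DEAD_RE.search(line)
        if dead:
            entry = summary.setdefault(
                dead.group(1), {"max_silence_s": 0.0, "dead": False}
            )
            entry["dead"] = True
    return summary


SAMPLE_LOG = """\
2017-09-26 15:03:19,466 WARN [ZooKeeperServer] HOST: Have not received a message from host proddb01 for 10.005 seconds
2017-09-26 15:03:29,468 WARN [ZooKeeperServer] HOST: Have not received a message from host proddb01 for 20.005 seconds
2017-09-26 15:03:39,467 ERROR [ZooKeeperServer] HOST: DEAD HOST DETECTED, hostname: proddb01
"""
print(summarize_heartbeat_log(SAMPLE_LOG))
```

If the longest reported silence regularly approaches your heartbeat timeout without a dead host being declared, the timeout has little headroom for a slower failover.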
By tuning your network and testing failover, you should be able to determine a heartbeat timeout value that enables VoltDB to maintain high availability during bonded network events.