RocketMQ: HA Design
RocketMQ: HA Design
Explore RocketMQ's HA design.
Join the DZone community and get the full member experience.Join For Free
When any messaging system sends messages, there will be errors. Even though the name servers do check the health of brokers beforehand, there is always delay. Plus, a network outage can happen during the transmission. But the handling of errors can be different.
The CAP Theorem is a fundamental theorem in distributed systems that states that any distributed system can have, at most, two of the following three properties: Consistency, Availability, and Partition tolerance. The following diagram shows the tradeoffs:
While no system design can satisfy all three, if we enhance A and P, the requirement for C will be less critical. And that’s what RocketMQ does with its HA design: The master/slave approach.
RocketMQ ‘s master/slave architecture has two layers of proof: multi-data center deployment, distributed locks, and notification mechanism.
To achieve high availability, different data center deployment is recommended. Zookeeper is the scheduler, it serves as:
- Persistent node, which stores the master status
- Ephemeral node, which stores the current status
- During failover, it will notify observers.
For any write operations, the request will only go to the master, and it will be copied to the slave via sync or async communication. For a read operation, the request will go to the master first. However, in case the master is busy, it will go to the slave.
RocketMQ interacts with Zookeeper to:
- Update current status
- Act as observers to the changes of the cluster.
This design will avoid content loss in the failover. We use a global variable SequenceID to guarantee the consistency of data when syncing.
Another key issue is the performance of recovery. The following is a state machine diagram of the failover.
When the first node starts, the Controller will notify the node to serve as the master.
When the 2nd node (slave) starts, the controller will switch into async status. The master will send data with the slave.
When the sync is almost finished, the Controller will then switch to half async status. All the write ops to the aster will be put on hold until the sync is finished. This guarantees a consistent state of master and slave.
When the slave has the full copy of the master, the controller will switch back to sync mode. From then on, the master will use sync mode to copy data to the slave until the master is down, at which time the slave will be promoted to master. This change can be done in seconds.
HA is a complicated topic. We switch between sync, async, and hybrid mode to balance throughput and consistency. Overall, this mechanism provides a balanced performance between low latency, high throughput, and consistency.
Opinions expressed by DZone contributors are their own.