Battle of the RabbitMQ Queues: Performance Insights on Classic and Quorum
When deciding between RabbitMQ’s Classic and Quorum queues, it’s important to recognize that both have their strengths and weaknesses. Learn more here.
RabbitMQ is a powerful and widely used message broker that facilitates communication between distributed applications by handling the transmission, storage, and delivery of messages. As a message broker, RabbitMQ serves as an intermediary between producers (applications that send messages) and consumers (applications that receive messages), ensuring reliable message delivery even in complex, distributed environments.
One of the core components of RabbitMQ is the queue, where messages are temporarily stored until they are consumed. Queues play a critical role in RabbitMQ’s architecture, enabling asynchronous communication and decoupling the producers and consumers. This decoupling allows applications to operate independently, promoting scalability, resilience, and fault tolerance.
Understanding the performance characteristics of different types of queues in RabbitMQ is essential for designing efficient system architectures. Queues determine how messages are routed, stored, and consumed, impacting throughput, latency, and durability.
Understanding Classic Queues
Classic Queues in RabbitMQ are the default queue type, designed for high throughput and simplicity. They follow a First In, First Out (FIFO) model, where messages are delivered to consumers in the order they were received, ensuring a predictable message flow. Classic queues are widely used in scenarios where performance and speed are more important than fault tolerance and message durability across multiple nodes.
Key Features of Classic Queues
- Single-node storage: Classic queues are stored on a single RabbitMQ node. Messages are not replicated across other nodes, making the queue faster but less resilient in case of node failure.
- FIFO message processing: Messages are stored and consumed in the order they arrive, ensuring a straightforward processing model, especially for tasks where message order is important.
- Persistent and transient messages: Classic queues can hold messages in memory only (transient) or write them to disk (persistent). Persistent messages survive server restarts or crashes, provided the queue itself is declared durable, though writing to disk comes with a performance trade-off.
- High throughput: Classic queues are optimized for speed and can handle a high volume of messages with low latency. They are best suited for applications where message processing speed is critical, such as real-time systems or log aggregation services.
- Single-node durability: While Classic queues support message persistence, they lack cross-node replication. This means that if the node hosting the queue fails, persistent messages will survive if the node recovers, but there is no built-in redundancy to continue operations across other nodes.
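To make the persistence options above concrete, here is a minimal sketch of the settings involved. The helper names are hypothetical and the queue name `orders` is only an illustration; the delivery-mode values (1 for transient, 2 for persistent) come from AMQP 0-9-1, and `x-queue-type` is the queue argument RabbitMQ uses to select the queue type.

```python
# AMQP 0-9-1 delivery modes: 1 = transient, 2 = persistent (written to disk).
TRANSIENT = 1
PERSISTENT = 2


def classic_queue_declaration(name: str, durable: bool = True) -> dict:
    """Keyword arguments for a queue_declare call on a classic queue (hypothetical helper)."""
    return {
        "queue": name,
        "durable": durable,  # the queue definition itself survives a broker restart
        "arguments": {"x-queue-type": "classic"},
    }


def delivery_mode(persistent: bool) -> int:
    """Per-message flag; a message is kept across restarts only if it is persistent
    and sits in a durable queue."""
    return PERSISTENT if persistent else TRANSIENT


# With a client such as pika (an assumption, the article names no client library),
# usage might look like:
#   channel.queue_declare(**classic_queue_declaration("orders"))
#   props = pika.BasicProperties(delivery_mode=delivery_mode(True))
#   channel.basic_publish(exchange="", routing_key="orders", body=b"hello", properties=props)
```

Even with both settings enabled, the data still lives on a single node, which is the limitation the last bullet describes.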
Use Cases for Classic Queues
- Real-time systems: Classic queues are ideal for applications that require high-speed message handling, such as gaming systems, streaming platforms, or monitoring tools.
- Stateless applications: Applications that do not require message replication or high availability across nodes benefit from the simplicity and performance of Classic queues.
- Low latency: For workloads where minimizing delay between message production and consumption is crucial, Classic queues offer low-latency message delivery.
Understanding Quorum Queues
Quorum Queues in RabbitMQ are a newer, highly available, and fault-tolerant queue type designed for systems that require strong durability guarantees and resilience against node failures. Quorum queues leverage the Raft consensus algorithm to replicate messages across multiple nodes, ensuring that message availability is maintained even in the event of hardware or software failures. This makes them ideal for critical applications where message loss or downtime is unacceptable.
Key Features of Quorum Queues
- Leader-follower replication: Quorum queues operate using a leader-follower model. Each quorum queue has a single leader node responsible for handling incoming messages and multiple follower nodes that replicate the leader’s messages. This replication ensures data redundancy, with messages being acknowledged only after a majority of nodes (a quorum) have confirmed the replication.
- Raft consensus algorithm: The Raft algorithm ensures consistency between nodes. When a message is sent to a quorum queue, the leader replicates it to the followers, and the message becomes "committed" only once a majority (quorum) of the queue's members, the leader included, have written it. This provides strong durability guarantees, ensuring that the system can recover from failures without losing data.
- Fault tolerance: Quorum queues are designed to survive node failures. If the leader node crashes, a new leader is elected from the followers using the Raft protocol, allowing message processing to continue with minimal disruption. This provides high availability and makes quorum queues resilient in distributed environments.
- Message durability: All messages in a quorum queue are persistent by default, meaning they are written to disk and replicated across multiple nodes. This ensures that messages are not lost, even if the RabbitMQ cluster experiences node failures or restarts.
- High availability: Quorum queues prioritize availability by ensuring that as long as a majority of nodes are operational, message delivery and consumption can continue. This makes them ideal for mission-critical systems that cannot tolerate downtime or data loss.
- No single point of failure: Unlike Classic queues, which are vulnerable to node failure since they reside on a single node, quorum queues eliminate the risk of a single point of failure by distributing messages across multiple nodes in a RabbitMQ cluster.
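The majority rule behind these features is simple arithmetic, sketched below. The helper names are hypothetical, and the initial group size of 3 is an illustrative choice rather than a value taken from the benchmark; `x-queue-type` and `x-quorum-initial-group-size` are queue arguments RabbitMQ accepts when declaring a quorum queue.

```python
def quorum_size(members: int) -> int:
    """Smallest majority of a Raft group with the given number of members."""
    return members // 2 + 1


def tolerated_failures(members: int) -> int:
    """How many members can be down while a majority is still reachable."""
    return members - quorum_size(members)


def quorum_queue_declaration(name: str, initial_group_size: int = 3) -> dict:
    """Keyword arguments for a queue_declare call on a quorum queue (hypothetical helper)."""
    return {
        "queue": name,
        "durable": True,  # quorum queues must be declared durable
        "arguments": {
            "x-queue-type": "quorum",
            "x-quorum-initial-group-size": initial_group_size,
        },
    }


for n in (3, 4, 5):
    print(f"{n} members: quorum of {quorum_size(n)}, tolerates {tolerated_failures(n)} failure(s)")
```

A group of four tolerates no more failures than a group of three, which is why odd member counts are the usual choice.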
Use Cases for Quorum Queues
- Financial services: Systems that handle transactions, payments, or sensitive financial data benefit from the fault tolerance and message durability provided by quorum queues. These systems cannot afford to lose messages or experience downtime.
- Mission-critical applications: Applications that require continuous uptime and cannot tolerate message loss, such as healthcare systems, real-time monitoring, or industrial control systems, are ideal for quorum queues.
- Distributed systems: In multi-node or distributed environments where server failures are a possibility, quorum queues ensure that message processing continues seamlessly, even if individual nodes go down.
Performance Benchmarking: Classic vs. Quorum
We used the RabbitMQ PerfTest tool to evaluate the performance of classic and quorum queues across three scenarios. Each scenario used a different combination of publishers and consumers, with a fixed message size and a fixed 30-second run.
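The exact PerfTest invocations are not given here, so the following is only a sketch of how the three scenarios could be expressed as command lines. The jar path, the queue names, and the 1000-byte message size are placeholders, not the values used for these results; the flags are PerfTest's documented long options for producers, consumers, message size, run duration, queue name, and quorum queue mode.

```python
from typing import List

# (publishers, consumers) for the three scenarios described above.
SCENARIOS = [(1, 1), (1, 2), (2, 4)]

# Placeholder: the message size used for the published results is not stated.
MESSAGE_SIZE_BYTES = 1000


def perftest_command(producers: int, consumers: int, size_bytes: int,
                     queue: str, quorum: bool = False,
                     seconds: int = 30, jar: str = "perf-test.jar") -> List[str]:
    """Build an argument list for one PerfTest run (hypothetical helper)."""
    cmd = [
        "java", "-jar", jar,
        "--producers", str(producers),
        "--consumers", str(consumers),
        "--size", str(size_bytes),
        "--time", str(seconds),
        "--queue", queue,
    ]
    if quorum:
        cmd.append("--quorum-queue")  # declare the queue as a quorum queue
    return cmd


for publishers, consumers in SCENARIOS:
    for quorum in (False, True):
        name = "bench-quorum" if quorum else "bench-classic"
        print(" ".join(perftest_command(publishers, consumers, MESSAGE_SIZE_BYTES, name, quorum)))
```

Each printed line could be passed to a shell or to `subprocess.run` against a test cluster; nothing is executed by the sketch itself.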
Classic Queue Performance
| Scenario | Sending rate (msg/s) | Receiving rate (msg/s) | 99th percentile latency (µs) |
|---|---|---|---|
| Scenario 1 (1 publisher, 1 consumer) | 13,329 | 9,897 | 20,649,010 |
| Scenario 2 (1 publisher, 2 consumers) | 14,112 | 9,573 | 21,415,269 |
| Scenario 3 (2 publishers, 4 consumers) | 21,829 | 13,577 | 27,186,651 |
| Average (across all) | 16,423 | 10,349 | 23,083,643 |
Quorum Queue Performance
| Scenario | Sending rate (msg/s) | Receiving rate (msg/s) | 99th percentile latency (µs) |
|---|---|---|---|
| Scenario 1 (1 publisher, 1 consumer) | 9,202 | 5,581 | 37,644,181 |
| Scenario 2 (1 publisher, 2 consumers) | 10,717 | 5,368 | 29,972,278 |
| Scenario 3 (2 publishers, 4 consumers) | 13,132 | 4,505 | 32,919,489 |
| Average (across all) | 11,017 | 5,151 | 33,511,316 |
Insights From the Tests Performed
1. Throughput (Sending and Receiving Rates)
- Classic Queues:
  - Average sending rate: 16,423 msg/s (49% higher than quorum queues)
  - Average receiving rate: 10,349 msg/s (about twice the quorum queue rate)
- Quorum Queues:
  - Average sending rate: 11,017 msg/s
  - Average receiving rate: 5,151 msg/s
Classic queues consistently outperform quorum queues in both sending and receiving rates, showing higher throughput across all scenarios. Quorum queues, while more resilient, suffer from throughput degradation, especially in terms of receiving rate.
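The relative figures quoted above can be reproduced from the reported averages. The snippet below is just that arithmetic; the variable and function names are illustrative.

```python
# Reported averages from the two results tables (msg/s).
CLASSIC = {"send": 16423, "recv": 10349}
QUORUM = {"send": 11017, "recv": 5151}


def percent_higher(a: float, b: float) -> float:
    """How much larger a is than b, in percent."""
    return (a / b - 1) * 100


send_gap = percent_higher(CLASSIC["send"], QUORUM["send"])
recv_ratio = CLASSIC["recv"] / QUORUM["recv"]

print(f"Classic sending rate is {send_gap:.0f}% higher")          # about 49%
print(f"Classic receiving rate is {recv_ratio:.1f}x the quorum rate")  # about 2.0x
```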
2. Latency
- Classic Queue:
  - Average 99th percentile latency: 23.08 million µs (about 23.1 s).
  - Latency rises steadily with load, from about 20.6 s in scenario 1 to about 27.2 s in scenario 3 (2 producers, 4 consumers).
- Quorum Queue:
  - Average 99th percentile latency: 33.51 million µs (about 33.5 s).
  - Latency is about 45% higher than classic queues on average; per scenario, the gap ranges from roughly 21% (scenario 3) to roughly 82% (scenario 1).
Quorum queues introduce significantly more latency due to replication and fault tolerance. For applications where low latency is crucial, classic queues are the clear choice.
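Because the average hides how much the gap varies, it can help to compare the two queue types scenario by scenario. The following sketch does so with the 99th percentile values from the results tables; the names are illustrative.

```python
# 99th percentile latency in microseconds, scenarios 1 to 3, from the results tables.
CLASSIC_P99_US = [20649010, 21415269, 27186651]
QUORUM_P99_US = [37644181, 29972278, 32919489]


def to_seconds(microseconds: int) -> float:
    return microseconds / 1_000_000


# Percentage by which quorum latency exceeds classic latency in each scenario.
gaps = [(q / c - 1) * 100 for c, q in zip(CLASSIC_P99_US, QUORUM_P99_US)]

for i, (c, q, gap) in enumerate(zip(CLASSIC_P99_US, QUORUM_P99_US, gaps), start=1):
    print(f"Scenario {i}: classic {to_seconds(c):.1f} s, "
          f"quorum {to_seconds(q):.1f} s, quorum is {gap:.0f}% higher")
```

The gap is widest in scenario 1 and narrowest in scenario 3, so the penalty is not a fixed percentage across these runs.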
3. Scalability
- Classic Queue:
  - Scales well with increased load: between scenario 2 (1 producer, 2 consumers) and scenario 3 (2 producers, 4 consumers), the sending rate rises from 14,112 to 21,829 msg/s and the receiving rate from 9,573 to 13,577 msg/s.
- Quorum Queue:
  - Throughput struggles under high load. Between scenario 2 and scenario 3, the sending rate rises only from 10,717 to 13,132 msg/s while the receiving rate drops from 5,368 to 4,505 msg/s, demonstrating poorer scalability than classic queues.
Classic queues scale more efficiently, while quorum queues show diminishing returns in throughput as more producers and consumers are added.
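One way to quantify this is the percentage change in throughput when going from scenario 2 to scenario 3, where both producers and consumers double. The sketch below uses the rates from the results tables; the names are illustrative.

```python
# (sending, receiving) rates in msg/s for scenario 2 and scenario 3.
CLASSIC = {"scenario_2": (14112, 9573), "scenario_3": (21829, 13577)}
QUORUM = {"scenario_2": (10717, 5368), "scenario_3": (13132, 4505)}


def growth_percent(before: float, after: float) -> float:
    """Percentage change from before to after (negative means a drop)."""
    return (after / before - 1) * 100


def scaling(results: dict) -> tuple:
    """Growth of the sending and receiving rates between scenario 2 and scenario 3."""
    (send_2, recv_2), (send_3, recv_3) = results["scenario_2"], results["scenario_3"]
    return growth_percent(send_2, send_3), growth_percent(recv_2, recv_3)


classic_send, classic_recv = scaling(CLASSIC)
quorum_send, quorum_recv = scaling(QUORUM)

print(f"Classic: sending {classic_send:+.0f}%, receiving {classic_recv:+.0f}%")
print(f"Quorum:  sending {quorum_send:+.0f}%, receiving {quorum_recv:+.0f}%")
```

Neither queue type doubles its throughput when the client count doubles, but only the quorum queue's receiving rate goes down.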
Making the Right Choice
Choosing between Classic Queues and Quorum Queues in RabbitMQ depends on your system's specific requirements regarding performance, durability, fault tolerance, and resource availability.
When To Choose Classic Queues
- High throughput and low latency requirements
- Non-critical applications
- Single-node or low-cost environments
When To Choose Quorum Queues
- High availability and fault tolerance
- Durable messaging
- Distributed systems
- Mission-critical systems
Trade-Offs Between Classic and Quorum Queues
| Aspect | Classic Queues | Quorum Queues |
|---|---|---|
| Throughput | High throughput, low latency | Lower throughput due to replication overhead |
| Durability | Optional, single-node persistence | Strong durability with multi-node replication |
| Fault Tolerance | Limited (no replication, single-node failure impacts) | High fault tolerance (multi-node replication) |
| Availability | Dependent on single-node availability | High availability with automatic leader election |
| Resource Usage | Low (single node, less disk and memory overhead) | High (multiple nodes, higher CPU, memory, disk usage) |
| Latency | Low latency (no replication) | Higher latency due to Raft replication |
| Use Case | High-performance, non-critical apps | Critical apps, high availability, no message loss |
Conclusion
RabbitMQ’s Classic and Quorum queues each have strengths and weaknesses, and the best choice depends on your system’s specific needs. The decision hinges on the trade-off between performance and reliability: in these tests, Classic queues delivered higher throughput and lower latency, while Quorum queues provide replication and fault tolerance at a performance cost. By understanding how each type of queue performs, you can design a RabbitMQ setup that aligns with your goals, whether that is efficiency, robustness, or a bit of both.