Breaking Free from ZooKeeper: Why Kafka’s KRaft Mode Matters
Kafka shifts from ZooKeeper to KRaft mode for better scalability, faster recovery, and lower complexity, using Raft-based quorum for metadata management.
Join the DZone community and get the full member experience.
Join For FreeAny modern distributed system which requires high throughput, scaling, high availability etc., utilizes Kafka as one of its component. Thus, making Kafka a popular platform which need no introduction for itself.
However even though being an integral part of Kafka, Apache ZooKeeper is neither explored nor understood as much it should have. In this article we would briefly touch upon these aspects, and understand the next generation Kafka via KRaft mode and the benefits it bring over ZooKeeper.
Kafka Cluster
To achieve the design goals of high availability and high throughput Kafka utilizes concept of distributed system via multiple nodes (viz. brokers) as its core component. These brokers together form a Kafka cluster.
Typical to any distributed system the Kafka cluster too requires certain aspects of it to be managed. For this purpose Apache ZooKeeper was chosen.
ZooKeeper acts as a Cluster Controller or Consistent Core and is typically responsible for below aspects of a Kafka cluster:
- Cluster Membership – Maintain details of every brokers which are active member of the cluster. Do note that since brokers can dynamically join (typically for scaling out) or exit (due to failures or scale down), managing cluster membership is a crucial aspect towards achieving high availability.
- Leader Election via Controller Broker – Maintain the information of controller broker and co-ordinate with it for the leader election (of topic partitions).
- Cluster Metadata – Maintain the topics information viz. partitions, ISR, consumers, consumer group, current offsets, leaders etc. It also includes partition assignment, tracking and notifying the routing service about changes.
- Cluster Recovery – In extreme case of complete cluster failure, the metadata stored in ZooKeeper is utilized for recovery. However, its imperative to note that the recovered cluster state is as fresh as the latest snapshot of ZooKeeper.
- Service Discovery – ZooKeeper allows service discovery between brokers by acting as a central registry. Also the complete cluster topology is available. This helps brokers to make appropriate decisions (e.g. rebalancing) whenever a broker dynamically joins and/or exits the cluster.
- Access Control List (ACLs) – Maintain the details of access control towards topics/consumers groups allowing brokers for required authorization.
Complete ZooKeeper’s responsibilities stretch way beyond the aspects discussed above and is out of scope of this article.
Limitations of ZooKeeper
Despite being an integral part of Kafka ecosystem, ZooKeeper has certain limitations as described below:
- ZooKeeper in itself requires high availability thus adds to cluster complexity during deployment and operations as well. Moreover, this causes the infrastructure cost increase to maintain ZooKeeper servers.
- Being a consensus based system the throughput goes lower as the cluster grows beyond certain thresholds in terms of number of brokers, topic partitions etc. This is known cause for slowing down leader elections and consumer rebalancing respectively.
- ZooKeeper being central registry of metadata it becomes bottle neck if all clusters members fetch meta data at once. Although this issue has been solved long back via KAFKA-901 it essentially diminished the responsibility of ZooKeeper.
- ZooKeeper utilizes Single Socket Channel and Thread per follower for all communication. This is essential to maintain strict ordered processing. However, the downside is a increased latency and lower throughput for large clusters.
- For a given Kafka cluster there could be at max one Controller broker. The election of Controller broker happens during the startup of cluster and/or after the Controller broker fails. For large cluster it may take a while for this election as every broker tries to register itself as Controller. This eventually could cause the cluster unusable for a brief period of time which may be unacceptable for variety of use cases.
KRaft Mode
To understand the reasoning behind the drastic decision to move away from ZooKeeper altogether lets refer to official KIP-500 itself.
Currently, Kafka uses ZooKeeper to store its metadata about partitions and brokers, and to elect a broker to be the Kafka Controller. We would like to remove this dependency on ZooKeeper. This will enable us to manage metadata in a more scalable and robust way, enabling support for more partitions. It will also simplify the deployment and configuration of Kafka.
In other words — KRaft simplifies cluster management by removing ZooKeeper, eliminating the need for two separate systems with different configurations. Instead, Kafka now alone handles metadata, reducing errors and operational complexity. Metadata is treated as an event stream, allowing quick updates using a single offset, similar to how producers and consumers function.
The key difference and advantages KRaft mode brings w.r.t. ZooKeeper are as follows →
Controller Quorum
Instead of single controller broker in ZooKeeper mode, there is a quorum maintained for controllers. Everything that is currently stored in ZooKeeper, such as topics, partitions, ISRs, configurations, and so on, is stored in this controller.
Using Raft algorithm one broker among the controllers is chosen as leader and termed as active controller. Moreover, Raft is utilized for log replication as well. Essentially all controller brokers thus have all the latest information among themselves and act as hot standbys. This drastically reduces the recovery time, during controller failure, as there is no longer need to transfer all the data to new controller broker.
Side note — In case you wondered how the KRaft name was chosen, you may now have the answer!
Broker Metadata and State
Instead of the controller pushing out updates to the other brokers, brokers fetch updates from the active controller. The fetched metadata is persisted to disk. This enables quick recovery of brokers even if there are hundreds or thousands of partitions.
Moreover, the metadata fetch is delta in nature (most of the time) which means only newer updates are fetched. In few cases when broker lag behind too much from active controller or no cached metadata is there, the active controller sends a full metadata snapshot instead of incremental deltas.
The metadata fetch double up as heartbeat for broker, letting know the controller that broker is still alive. Failure to receive heartbeat thus now result in immediate eviction from cluster. This essentially eliminates edge case where broker might be disconnected from ZooKeeper but still connected to other broker(s) leading to some tricky situation of false durability and divergent state.
Partition Reassignment
Partition reassignment largely remains unchanged. Except that KRaft controller now allows the topic deletion which is still undergoing partition reassignment. The partition reassignment is terminated immediately in case of topic deletion is requested midway thus avoiding any redundant computations.
Shutdown and Recovery Time
With the difference discussed above the controlled shutdown time and recovery from uncontrolled shutdown has improved drastically.
Infrastructure
Although migrating to KRaft mode reduces the complexity of config management towards two different systems, viz. ZooKeeper and Kafka brokers, it doesn’t immediately translate into reduced number of nodes.
Although with KRaft a broker could have process role of both controller or broker or both its highly recommended to utilize nodes with dedicated roles. Thus, for a large cluster there could be N dedicated ZooKeeper nodes, which are to be converted into controller nodes to meet the quorum. Depending upon the cluster setup and requirement the number of nodes may vary between ZooKeeper and Kraft mode.
Challenges
ZooKeeper served the ecosystem well for more than a decade and is battle tested. KRaft is an attempt to solve the challenges faced in ZooKeeper. However, as the adoption of KRaft takes speed its imperative to note that there would be challenges faced with KRaft as well.
Migration Guide
For new clusters KRaft can be used directly as its the default mode starting Kafka 4.0. However, for existing cluster Kafka need to be upgraded to at least version 3.9.
Detailed migration steps are out of scope of this article. Official migration guide can be a good starting point.
Conclusion
The evolution from ZooKeeper to KRaft mode marks a significant milestone in Kafka’s journey toward better scalability, efficiency, and simplicity. ZooKeeper has played a crucial role in Kafka’s architecture for over a decade. However, as Kafka deployments grew larger and more complex, ZooKeeper’s limitations became increasingly evident — ranging from scalability bottlenecks to infrastructure overhead and slower recovery times.
KRaft mode offers a streamlined approach by eliminating ZooKeeper and bringing metadata management directly into Kafka’s architecture. This transition not only improves operational efficiency, reduces recovery times, and simplifies deployment, but also aligns Kafka with modern distributed system patterns, leveraging the Raft consensus algorithm for better fault tolerance and high availability.
For new Kafka clusters, adopting KRaft mode is highly recommended due to its native support for scalability, improved fault tolerance, and reduced infrastructure complexity. For existing clusters, organizations must evaluate their current limitations with ZooKeeper and assess the risks and benefits of migration.
References and Further Read
Published at DZone with permission of Ammar Husain. See the original article here.
Opinions expressed by DZone contributors are their own.
Comments