Apache Kafka Topics: Architecture and Partitions

An introductory look into Apache Kafka and the architecture that makes it up. We'll cover topics, partitions, and what they mean for devs and data engineers.

By anjita agrawal · Jan. 28, 19 · Opinion

What Is a Kafka Topic?

A Kafka topic is essentially a named stream of records. Kafka stores each topic as a log, and that log is broken up into several partitions, which Kafka spreads across multiple servers or disks. In other words, a topic in Kafka is a category, stream name, or feed.

Topics in Apache Kafka follow a pub-sub style of messaging. A topic can have zero or more subscribers, which are called Kafka consumer groups. Topics are broken up into partitions for speed, scalability, and size.

How to Create a Kafka Topic

To create a topic in Kafka, run kafka-topics.sh and specify the topic name, the number of partitions, the replication factor, and any other attributes:

bin/kafka-topics.sh --create \
   --zookeeper <hostname>:<port> \
   --topic <topic-name> \
   --partitions <number-of-partitions> \
   --replication-factor <number-of-replicating-servers>

The example below creates a topic named “test1” with one partition and one replica:

bin/kafka-topics.sh --create \
   --zookeeper localhost:2181 \
   --replication-factor 1 \
   --partitions 1 \
   --topic test1

To view the topic, run the list command:

> bin/kafka-topics.sh --list --zookeeper localhost:2181
test1

Note that when the auto.create.topics.enable property is set to true, Kafka automatically creates a topic whenever an application attempts to produce to, consume from, or fetch metadata for a nonexistent topic.

Kafka Topic Partitions

Kafka breaks topic logs up into several partitions. The record key determines which partition a Kafka producer sends a record to: by default, a record with a key is assigned to a partition based on that key, and a record without a key is assigned round-robin.
Kafka uses partitions to scale a topic across many servers for producer writes and to let consumers read in parallel. For failover, Kafka can replicate partitions to multiple Kafka brokers.
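
The key-based and round-robin behavior described above can be illustrated with a small Python sketch. This is a toy model, not Kafka's partitioner: the class name SimplePartitioner is hypothetical, and CRC32 stands in for the murmur2 hash that Kafka's default partitioner applies to the key.

```python
import itertools
import zlib


class SimplePartitioner:
    """Toy partitioner: hash keyed records, round-robin keyless ones."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        # Endless 0, 1, ..., n-1, 0, 1, ... cycle for records without a key.
        self._round_robin = itertools.cycle(range(num_partitions))

    def partition_for(self, key=None):
        if key is None:
            return next(self._round_robin)
        # Assumption: CRC32 is only a stand-in; Kafka itself hashes keys with murmur2.
        return zlib.crc32(key.encode("utf-8")) % self.num_partitions


partitioner = SimplePartitioner(num_partitions=3)

# The same key always maps to the same partition.
assert partitioner.partition_for("user-42") == partitioner.partition_for("user-42")

# Keyless records cycle through the partitions in turn.
print([partitioner.partition_for() for _ in range(4)])  # [0, 1, 2, 0]
```

Because a given key always lands on the same partition, all records for that key keep their relative order.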

Kafka Topic Log Partition’s Ordering and Cardinality

Kafka maintains record order only within a single partition. A partition is an ordered, immutable sequence of records, and Kafka continually appends to it, using the partition as a structured commit log. Each record in a partition is assigned a sequential ID number called an offset, which identifies the record's location within the partition.
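
As a rough illustration of offsets, the sketch below models one partition as an append-only Python list. The PartitionLog class is hypothetical and ignores everything a real broker does on disk; it only shows that an offset is the record's position in the sequence.

```python
class PartitionLog:
    """Toy append-only log: a record's offset is its position in the list."""

    def __init__(self):
        self._records = []

    def append(self, record):
        offset = len(self._records)  # next sequential offset
        self._records.append(record)
        return offset

    def read(self, offset):
        if not 0 <= offset < len(self._records):
            raise IndexError(f"offset {offset} out of range")
        return self._records[offset]


log = PartitionLog()
print(log.append("first"))   # 0
print(log.append("second"))  # 1
print(log.read(1))           # second
```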

Topic partitions allow a Kafka log to scale beyond a size that will fit on a single server. A topic can span many partitions hosted on many servers, but each partition must fit on the server that hosts it. Topic partitions are also the unit of parallelism in Apache Kafka: at any one time, a partition can be worked on by only one Kafka consumer in a consumer group. Each consumer runs in its own process or its own thread. If a consumer stops, Kafka spreads its partitions across the remaining consumers in the same consumer group.
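
The one-consumer-per-partition rule and the redistribution after a consumer stops can be sketched as follows. The assign_partitions helper and the consumer names are hypothetical, and the simple round-robin spread is an assumption for illustration; Kafka's real assignment strategies are more involved.

```python
def assign_partitions(partitions, consumers):
    """Spread partitions over consumers so each partition has exactly one owner."""
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


partitions = [0, 1, 2, 3]

print(assign_partitions(partitions, ["c1", "c2", "c3"]))
# {'c1': [0, 3], 'c2': [1], 'c3': [2]}

# c2 stops: the partitions are redistributed among the remaining consumers.
print(assign_partitions(partitions, ["c1", "c3"]))
# {'c1': [0, 2], 'c3': [1, 3]}
```

In both cases every partition has exactly one owner within the group, which is what limits a group's parallelism to the number of partitions.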

Kafka Topic Partition Replication

For fault tolerance, Kafka can replicate partitions across a configurable number of Kafka servers. Each partition has one leader server and a given number of follower servers. The leader handles all read and write requests for the partition.

The followers replicate the leader, and if the leader dies, one of them takes over. Kafka also uses partitions for parallel consumer handling within a group.

Replication: Kafka Partition Leaders, Followers, and ISRs

Using ZooKeeper, Kafka chooses one broker's partition replica as the leader. The broker that holds the partition leader handles all reads and writes of records for that partition. Kafka replicates writes from the leader partition to the followers (node/partition pairs).

A follower that is in sync is called an ISR (in-sync replica). If a partition leader fails, Kafka chooses one of the ISRs as the new leader.
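
A minimal sketch of that failover rule, assuming hypothetical broker IDs and a hypothetical elect_leader helper. Picking the first eligible replica in list order is a simplification for illustration, not a statement of how Kafka's controller orders candidates.

```python
def elect_leader(replicas, in_sync, failed):
    """Return the first replica that is in sync and has not failed, or None."""
    for broker in replicas:
        if broker in in_sync and broker not in failed:
            return broker
    return None


replicas = [1, 2, 3]   # hypothetical broker ids holding the partition
in_sync = {1, 3}       # broker 2 has fallen behind

print(elect_leader(replicas, in_sync, failed=set()))  # 1
print(elect_leader(replicas, in_sync, failed={1}))    # 3
```

Broker 2 is never chosen here because it is not in sync, even though it holds a replica.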

Kafka Architecture: Kafka Replication – Replicating to Partition 0

A record is considered “committed” when all ISRs for the partition have written it to their logs. Consumers can read only committed records.
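
The commit rule can be modeled as taking the minimum over what each in-sync replica has written. In this sketch the committed_offset helper, the broker IDs, and the offsets are all hypothetical.

```python
def committed_offset(isr_log_end_offsets):
    """Highest offset that every in-sync replica has written, or -1 if none."""
    # A log end offset is the offset the replica will assign to its next record,
    # so the last record it holds is one below that.
    return min(isr_log_end_offsets.values()) - 1


# Hypothetical brokers: the leader (1) has written offsets 0..4, follower 3 only 0..2.
isr = {1: 5, 3: 3}
print(committed_offset(isr))  # 2

# Once follower 3 catches up, offsets 3 and 4 become readable too.
isr[3] = 5
print(committed_offset(isr))  # 4
```

Under this model a consumer would see records up to the returned offset and nothing beyond it.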

Published at DZone with permission of anjita agrawal. See the original article here.
