Kafka Producer Architecture - Picking the Partition of Records

This article covers Kafka Producer Architecture, including how a partition is chosen, producer cadence, partitioning strategies, and consumers.

This article covers some lower-level details of Kafka producer architecture: how a partition is chosen, producer write cadence, and partitioning strategies. It is a continuation of the Kafka Architecture and Kafka Topic Architecture articles.

Kafka Producers

Kafka producers send records to topics. The records are sometimes referred to as messages.
For each topic, the producer picks which partition a record is sent to. The producer can send records round-robin. It could also implement a priority scheme by sending records to certain partitions based on the priority of each record.
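
The priority idea above can be sketched in plain Java. This is only an illustration under stated assumptions, not Kafka's API: the class `PriorityRouting`, the `Priority` enum, and the convention that partition 0 is reserved for high-priority records are all hypothetical names and choices made for this example.

```java
// Hypothetical sketch: reserve partition 0 for high-priority records and
// spread all other records across the remaining partitions by key.
final class PriorityRouting {
    enum Priority { HIGH, NORMAL }

    static int partitionFor(Priority priority, String key, int numPartitions) {
        if (numPartitions < 2) {
            return 0; // a single partition leaves nothing to reserve
        }
        if (priority == Priority.HIGH) {
            return 0; // assumed convention: partition 0 is the priority lane
        }
        // Math.floorMod keeps the result non-negative for negative hash codes.
        return 1 + Math.floorMod(key.hashCode(), numPartitions - 1);
    }

    public static void main(String[] args) {
        System.out.println(partitionFor(Priority.HIGH, "order-1", 4));
        System.out.println(partitionFor(Priority.NORMAL, "order-1", 4));
    }
}
```

In a real producer, logic like this would typically live in a custom partitioner or be used to set the partition explicitly on each record. Consumers would then need to read the reserved partition preferentially for the scheme to have any effect.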

Generally speaking, producers send records to a partition based on the record’s key. The default partitioner for Java uses a hash of the record’s key to choose the partition, or uses a round-robin strategy if the record has no key.
The important concept here is that the producer picks the partition.
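
As a rough illustration of the two strategies just described, the following sketch hashes the key when one is present and rotates through partitions when it is not. It is a simplification rather than Kafka's implementation: the class name `SketchPartitioner` is hypothetical, and `Arrays.hashCode` stands in for the hash function Kafka actually uses, so the partition numbers it produces will not match a real cluster.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of key-hash and round-robin partition selection.
final class SketchPartitioner {
    private final AtomicInteger counter = new AtomicInteger();

    int partition(String key, int numPartitions) {
        if (key == null) {
            // No key: rotate through the partitions in order.
            return Math.floorMod(counter.getAndIncrement(), numPartitions);
        }
        // Keyed record: the same key bytes always produce the same hash,
        // so the record always lands on the same partition.
        // Arrays.hashCode is a stand-in hash chosen for this sketch only.
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        return Math.floorMod(Arrays.hashCode(keyBytes), numPartitions);
    }

    public static void main(String[] args) {
        SketchPartitioner partitioner = new SketchPartitioner();
        System.out.println(partitioner.partition("employee-42", 3));
        System.out.println(partitioner.partition(null, 3));
    }
}
```

Note that the mapping depends on the partition count, so a key may move to a different partition if the number of partitions for the topic changes.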

Kafka Architecture: Kafka Producers

Producers are writing at offset 12 while, at the same time, Consumer Group A is reading from offset 9.

Kafka Producers Write Cadence and Partitioning of Records

Producers write at their own cadence, so the order of records cannot be guaranteed across partitions. Producers configure their consistency/durability level (acks=0, acks=1, or acks=all), which we will cover later. Producers pick the partition so that records (messages) go to a given partition based on the data. For example, you could have all the events for a certain ‘employeeId’ go to the same partition. If order within a partition is not needed, a round-robin partition strategy can be used so that records are evenly distributed across partitions.
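
The durability level mentioned above is set through the producer's `acks` configuration property. The sketch below only builds the configuration with `java.util.Properties`, so it runs without the Kafka client on the classpath. The class and method names are hypothetical, and the broker address passed in `main` is a placeholder.

```java
import java.util.Properties;

// Hypothetical helper that assembles producer settings as plain properties.
final class ProducerSettings {
    static Properties durableProducerProps(String bootstrapServers) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        // acks=0: do not wait for any acknowledgement.
        // acks=1: wait for the partition leader only.
        // acks=all: wait for all in-sync replicas (strongest durability).
        props.put("acks", "all");
        return props;
    }

    public static void main(String[] args) {
        // "localhost:9092" is a placeholder address for this example.
        System.out.println(durableProducerProps("localhost:9092"));
    }
}
```

A properties object like this would normally be passed to the producer's constructor. Lower `acks` values trade durability for lower latency.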

Review of Producers

Can producers occasionally write faster than consumers?

Yes. A producer could have a burst of records, and a consumer does not have to keep pace with the producer.

What is the default partition strategy for producers without using a key?

Round-Robin

What is the default partition strategy for producers using a key?

Records with the same key get sent to the same partition.

What picks which partition a record is sent to?

The producer picks which partition a record goes to.

Kafka Consumer Architecture

Please continue reading about Kafka Architecture. The next article covers Kafka Consumer Architecture with a discussion of how records are divided up among consumers in a consumer group, consumer failover, and consumer load balancing.


Jean-Paul Azar works at Cloudurable. Cloudurable provides Kafka training, Kafka consulting, and Kafka support, and helps set up Kafka clusters in AWS.

Topics: kafka, software architecture, integration, producer, consumer

Published at DZone with permission of Jean-Paul Azar. See the original article here.

Opinions expressed by DZone contributors are their own.
