Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

Kafka Producer Delivery Semantics

DZone 's Guide to

Kafka Producer Delivery Semantics

We take a deep dive into the various delivery semantics available for use in Apache Kafka.

· Big Data Zone ·
Free Resource

This article is a continuation of Part 1, Kafka Technical Overview and Part 2, Kafka Producer Overview articles. Let's look into different delivery semantics and how to achieve them using producer and broker properties.

Delivery Semantics

Based on broker and producer configuration, all three delivery semantics— “at most once”, “at least once” and “exactly once” — are supported.

Different delivery semantics

Different delivery semantics

At Most Once

In an 'at most once' delivery semantics, a message should be delivered maximum only once. It's acceptable to lose a message rather than delivering a message twice in this semantic. A few use cases of at most once includes metrics collection, log collection, and so on. Applications adopting at most semantics can easily achieve higher throughput and low latency.

At most delivery semantic

At most delivery semantic

At Least Once

In 'at least once' delivery semantics it is acceptable to deliver a message more than once but no message should be lost. The producer ensures that all messages are delivered for sure, even though it may result in message duplication. This is the most preferred semantics system out of them all. Applications adopting at least once semantics may have moderate throughput and moderate latency.

At least once semantic retry

At least once semantic retry

Exactly Once

In 'exactly one' delivery semantics, a message must be delivered only once and no message should be lost. This is the most difficult delivery semantic of all. Applications adopting exactly once semantics may have lower throughput and higher latency than the other two semantic systems we've looked at.

Exactly once delivery semantics

Exactly once delivery semantics

Delivery Semantics Summary

The table below summarizes the behavior of all delivery semantics.

Delivery semantics summary

Delivery semantics summary

Producer Delivery Semantics

Different delivery semantics can be achieved in Kafka using the Acks property of the producer and the min.insync.replica property of the broker (considered only when acks = all).

Acks = 0

When the acks property is set to zero you get at most once delivery semantics. Kafka producer sends the record to the broker and doesn't wait for any response. Messages, once sent, will not be retried in this setting. The producer uses the “send and forget approach” with acks = 0.

Kafka producer Acks = 0

Kafka producer Acks = 0

Data Loss

In this mode, the chances of data loss occurring are high, as the producer does not confirm the message was received by the broker. The message may not have even reached the broker or broker failure soon after message delivery can result in data loss.

Kafka producer Acks = 0 - data loss

Kafka producer Acks = 0 - data loss

Acks = 1

Kafka producer Acks = 1

Kafka producer Acks = 1

When this property is set to 1 you can achieve at least once delivery semantics. A Kafka producer sends the record to the broker and waits for a response from the broker. If no acknowledgment is received for the message sent, then the producer will retry sending the messages based on a retry configuration. The retries property, by default, is set to 0; make sure this is set to the desired number or Max.INT.

Kafka producer Acks = 1 - retry

Kafka producer Acks = 1 - retry

Data Loss

In this mode, chances for data loss are moderate as the producer confirms that the message was received by the broker (leader partition). As the replication of the follower partition happens after the acknowledgment this may still result in data loss. For example, after sending the acknowledgment and before the replication, if the broker goes down this may result in data loss as the producer will not resend the message.

Kafka producer Acks = 1 - data loss

Kafka producer Acks = 1 - data loss

Acks = All

Kafka producer Acks = all

Kafka producer Acks = all

When the acks property is set to all, you can achieve exactly once delivery semantics. The Kafka producer sends the record to the broker and waits for a response from the broker. If no acknowledgment is received for the message sent, then the producer will retry sending the messages based on the retry config being set to n. The broker sends acknowledgment only after replication based on the min.insync.replica property.

For example, a topic may have a replication factor of 3 and a min.insync.replica of 2. In this case, an acknowledgment will be sent after the second replication is complete. In order to achieve exactly once delivery semantics the broker has to be idempotent. Acks = all should be used in conjunction with min.insync.replicas.

Data Loss

Kafka producer Acks = all - data loss

Kafka producer Acks = all - data loss

In this mode, the chances of data loss are low as the producer confirms that the message was received by the broker (leader and follower partition) only after replication. As the replication of the follower partition happens before the acknowledgment of data loss, the chances of actually losing data are minimal. For example, before replication and sending acknowledgment, if the broker goes down, the producer will not receive the acknowledgment and will send the message again to the newly elected leader partition.

Exception

When there are not enough nodes to replicate as per the min.insync.replica property, then the broker will return an exception instead of an acknowledgment.

Kafka producer Acks = all - exception

Kafka producer Acks = all - exception

Safe Producer

In order to create a safe producer that ensures minimal data loss, use the below producer properties.

Producer Properties

  • Acks = all (default 1) – Ensures replication before acknowledgement.
  • Retries = MAX_INT (default 0) – Retry in case of exceptions.
  • Max.in.flight.requests.per.connection = 5 (default) – Parallel connections to broker.

Broker Properties

  • Min.insync.replicas = 2 (at least 2) – Ensures a minimum In Sync Replica (ISR).

Acks Impact

The table below summarizes the impact of acks property on latency, throughput, and durability.

Kafka producer Acks property impact

Kafka producer Acks property impact

Summary

Configure Kafka producers and brokers to achieve the desired delivery semantics based on the following properties.

  • Acks
  • Retries
  • Max.in.flight.requests.per.connection
  • Min.insync.replicas

In Part 4 of the series, we'll look into Kafka consumers, consumer groups, and how to achieve different Kafka consumer delivery semantics.

Topics:
kafka ,kafka apache ,kafka tutorial ,big data

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}