
Apache Kafka In Action


Produce and consume messages from a Topic in Kafka.


Challenges and Limitations in Messaging

Messaging is a fairly simple paradigm for transferring data between applications and data stores. However, it comes with several challenges:

  1. Limited scalability, because the broker can become a bottleneck.
  2. Message brokers strained by growing message sizes.
  3. Consumers not always being able to consume messages at a reasonable rate.
  4. A lack of fault tolerance on the consumer side: there must be a way to make sure that messages already consumed are not gone forever.

Messaging Limitations Due to:

High Volume

Messaging applications are typically hosted on a single host or node. As a result, the broker can become a bottleneck because of that single host and its local storage.

Also, if the Subscribers consume data slowly, or do not consume it at all, the broker or publisher may go down, which can result in a complete denial of service.

Application Faults

There is a possibility of a bug in the Subscriber logic. Because of this, data might be processed incorrectly, poisoned, or corrupted. Once the bug in the Subscriber is fixed, there must be a way to fetch the old data for reprocessing, which is only possible if the data is still stored somewhere.

We also might have to reprocess all of the messages once the bug is fixed.
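
Kafka addresses this by retaining messages on the broker for a configurable period, so a fixed Subscriber can rewind and replay old data. As a minimal sketch (the group name "demo-group" and topic "Demo" here are illustrative), the bundled consumer-groups tool can reset a group's offsets back to the earliest retained message:

# The group must be inactive while its offsets are reset
bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --group demo-group --topic Demo --reset-offsets --to-earliest --execute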

Middleware Logic

Different applications acting as publishers and subscribers each have their own custom logic for writing to the broker, and each handles errors differently. Maintaining data consistency in this case can be difficult.

How Does Kafka Solve These Issues?

  1. Provides high throughput for large volumes of data, in terabytes and beyond.
  2. Is horizontally scalable, so it can scale out by adding machines that seamlessly share the load.
  3. Provides reliability, so no data is lost in case of failure, because Topics can be replicated across brokers (see the sketch after this list).
  4. Keeps Publishers and Consumers loosely coupled, so they are involved only in data exchange.
  5. Uses pub-sub messaging semantics, so independent applications publish data to a Topic and interested Subscribers consume data from that Topic.
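
For example, here is a minimal sketch of how replication is requested when a Topic is created. It assumes a three-broker cluster (the tutorial below runs a single broker, which is why it uses a replication factor of 1), and the Topic name "ReplicatedDemo" is illustrative:

# Assumes three brokers are running; each of the 3 partitions gets 3 replicas
bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 3 --partitions 3 --topic ReplicatedDemo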

Let's walk through how to:

  1. Set up an Apache Kafka cluster.
  2. Start ZooKeeper and a Kafka Broker.
  3. Produce some messages on a Topic.
  4. Consume the messages from the same Topic.

The steps are listed below:

Step 1 - Go to the Kafka Installation Directory
cd /../kafka_2.12-2.3.0

Step 2 - Start ZooKeeper
bin/zookeeper-server-start.sh config/zookeeper.properties

Step 3 - Start the Kafka Broker
bin/kafka-server-start.sh config/server.properties
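
To try the horizontal scaling mentioned earlier on a single machine, additional brokers can be started from the same config file by overriding a few properties. This is a sketch: the broker id, port 9093, and log directory below are illustrative choices.

# Start a second broker; the id, port, and log dir must differ from the first broker's
bin/kafka-server-start.sh config/server.properties --override broker.id=1 --override listeners=PLAINTEXT://:9093 --override log.dirs=/tmp/kafka-logs-1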

Step 4 - Create a Topic
bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic Demo 

Step 5 - List the Created Topics
bin/kafka-topics.sh --list --bootstrap-server localhost:9092
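
To verify the partition count, replication factor, and leader assignment of the Topic we just created, it can also be described:

bin/kafka-topics.sh --describe --bootstrap-server localhost:9092 --topic Demo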

Step 6 - Start the Producer
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic Demo
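
Type a message and press Enter to send it. If you want each message to carry a key (Kafka uses the key to choose a partition), the console producer can also parse keys from the input; the ":" separator below is an arbitrary choice:

# Input lines are now read as key:value pairs
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic Demo --property parse.key=true --property key.separator=: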

Step 7 - Start the Consumer
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic Demo --from-beginning
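
The --from-beginning flag makes the consumer read the Topic from its earliest retained message. To see the loose coupling described earlier in action, several consumers can also share the work by joining the same consumer group (the group name "demo-group" is illustrative):

# Run this in two terminals; the Topic's partitions are divided among the group's members
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic Demo --group demo-group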


