
A Beginner's Guide to Apache Kafka

A bare-bones guide to what Apache Kafka can do and why it is popular.

A conventional messaging queue is not designed to handle big data volumes, which is where a distributed messaging queue comes to the rescue.

Features of a Distributed Messaging System

  • It should be scalable: it should scale easily to thousands of nodes.
  • It should be fault tolerant: it should keep working even if some nodes in the cluster go down.
  • It should support replication.
  • It should have no single point of failure: the system should keep working even if a node goes down.
  • It should have high throughput: it should handle millions of messages per second.

This is where Apache Kafka fits in the world of distributed messaging.

Features of Apache Kafka

  • It is scalable: capacity grows by adding brokers to the cluster.
  • It is durable: messages are persisted to the file system and replicated across the brokers in a cluster.
  • It is fault tolerant.
  • It has no single point of failure.
  • It supports replication: messages are copied to multiple brokers in a cluster.
  • It has high throughput.
  • Its brokers are peers: there is no fixed master broker, and leadership is assigned per partition.
  • It was originally developed at LinkedIn, which open sourced it and donated it to the Apache Software Foundation.
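The replication and fault-tolerance points above can be illustrated with a small toy model. This is only a sketch under simplifying assumptions, not Kafka's actual replication protocol: the class `ReplicatedLog`, its methods, and the broker names are all hypothetical, every append is copied synchronously to each live broker, and a failed broker never rejoins.

```python
class ReplicatedLog:
    """Toy replicated log (hypothetical, not Kafka's API).

    Every append is copied to each broker that is still alive, so the
    messages stay readable as long as at least one replica survives.
    """

    def __init__(self, broker_ids):
        # One in-memory copy of the log per broker.
        self.replicas = {broker_id: [] for broker_id in broker_ids}
        self.alive = set(broker_ids)

    def append(self, message):
        """Copy the message to the log of every live broker."""
        if not self.alive:
            raise RuntimeError("no live brokers")
        for broker_id in self.alive:
            self.replicas[broker_id].append(message)

    def fail(self, broker_id):
        """Simulate a broker going down."""
        self.alive.discard(broker_id)

    def read_all(self):
        """Read the full log from any live broker (here: the lowest id)."""
        if not self.alive:
            raise RuntimeError("no live brokers")
        source = min(self.alive)
        return list(self.replicas[source])


log = ReplicatedLog(["broker-1", "broker-2", "broker-3"])
log.append("m1")
log.append("m2")
log.fail("broker-1")   # one node goes down
log.append("m3")       # writes still succeed on the remaining brokers
print(log.read_all())  # ['m1', 'm2', 'm3']
```

Because each message lives on more than one broker, losing a single node loses no data in this model, which is the sense in which replication removes a single point of failure.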

[Figure: Apache Kafka architecture]

Apache Kafka consists of the following components:

  1. The producer sends messages to the broker using a push mechanism.

  2. The consumer reads data from the broker using a pull mechanism.

  3. The broker is a very lightweight component that handles TCP connections and writes data to an append-only log file.

  4. ZooKeeper acts as a coordinator between the brokers and consumers.
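The push and pull roles described above can be sketched with a minimal in-memory model. This is an illustration only, not the Kafka client API: the classes `Broker`, `Producer`, and `Consumer` and their methods are hypothetical, the log lives in a Python list rather than a file, there is no networking, and ZooKeeper coordination is left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class Broker:
    """Toy broker (hypothetical): one append-only log per topic, kept in memory."""
    logs: dict = field(default_factory=dict)

    def append(self, topic, message):
        """Add a message to the end of the topic's log and return its offset."""
        log = self.logs.setdefault(topic, [])
        log.append(message)
        return len(log) - 1

    def fetch(self, topic, offset, max_messages=10):
        """Return up to max_messages starting at offset; the log is never modified."""
        return self.logs.get(topic, [])[offset:offset + max_messages]


class Producer:
    """Pushes messages to the broker."""

    def __init__(self, broker):
        self.broker = broker

    def send(self, topic, message):
        return self.broker.append(topic, message)


class Consumer:
    """Pulls messages from the broker and tracks its own read position."""

    def __init__(self, broker, topic):
        self.broker = broker
        self.topic = topic
        self.offset = 0  # index of the next message to read

    def poll(self, max_messages=10):
        batch = self.broker.fetch(self.topic, self.offset, max_messages)
        self.offset += len(batch)
        return batch


broker = Broker()
producer = Producer(broker)
consumer = Consumer(broker, "events")

producer.send("events", "signup")
producer.send("events", "login")

print(consumer.poll())  # ['signup', 'login']
print(consumer.poll())  # [] (nothing new since the last pull)
```

In this model the broker only appends and serves slices of the log, while the consumer decides when to pull and remembers how far it has read. That division of work is one way to see why the broker can stay lightweight.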
