A Beginner's Guide to Apache Kafka

A bare-bones, bare-necessities guide to what Apache Kafka can do and why it is popular.

A traditional messaging queue that runs on a single node is generally not built to handle big data volumes, which is where a distributed messaging queue comes to the rescue.

Features of a Distributed Messaging System

  • It should be scalable, meaning it should easily scale to thousands of nodes.
  • It should be fault tolerant, so that it keeps working even if some nodes in a cluster go down.
  • It should support replication.
  • There shouldn't be a single point of failure: the loss of any one node should not bring the system down.
  • It should have high throughput; for example, it should be able to handle millions of messages per second.

This is where Apache Kafka fits in the world of distributed messaging.

Features of Apache Kafka

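  • It is designed to scale horizontally: brokers can be added to a cluster to increase capacity.
  • It is durable. Messages are persisted to the file system and replicated across the brokers in a cluster.
  • It is fault tolerant.
  • It has no single point of failure.
  • It supports replication, so that copies of each message are kept on more than one broker in the cluster.
  • It has high throughput.
  • Its brokers are peers: there is no dedicated master node for the whole cluster. Instead, each partition has a leader broker, and the other replicas of that partition act as followers.
  • It was originally developed at LinkedIn, which open sourced it and contributed it to the Apache community.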
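To make the replication and fault-tolerance points above concrete, here is a minimal Python sketch. It is a toy in-memory model, not Kafka's replication protocol or client API. The names ToyCluster, publish, fail, and read_all are hypothetical and exist only for this illustration, and the round-robin placement of copies is an assumption chosen for simplicity.

```python
class ToyCluster:
    """Toy model: keeps each message on `replication_factor` in-memory nodes.

    Hypothetical illustration only; this is not how Kafka is implemented.
    """

    def __init__(self, node_count, replication_factor):
        if not 1 <= replication_factor <= node_count:
            raise ValueError("replication factor must be between 1 and node count")
        self.replication_factor = replication_factor
        # Each node maps message id -> message; None marks a failed node.
        self.nodes = [dict() for _ in range(node_count)]
        self.next_id = 0

    def publish(self, message):
        """Store copies of the message on several distinct live nodes."""
        live = [i for i, node in enumerate(self.nodes) if node is not None]
        if len(live) < self.replication_factor:
            raise RuntimeError("not enough live nodes to replicate")
        start = self.next_id % len(live)  # simple round-robin placement
        for k in range(self.replication_factor):
            target = live[(start + k) % len(live)]
            self.nodes[target][self.next_id] = message
        self.next_id += 1

    def fail(self, node_index):
        """Simulate a node going down and losing its local data."""
        self.nodes[node_index] = None

    def read_all(self):
        """Return every message still held by a live node, in publish order."""
        surviving = {}
        for node in self.nodes:
            if node is not None:
                surviving.update(node)
        return [surviving[i] for i in sorted(surviving)]


cluster = ToyCluster(node_count=3, replication_factor=2)
for text in ("a", "b", "c"):
    cluster.publish(text)

cluster.fail(0)            # one node goes down
print(cluster.read_all())  # ['a', 'b', 'c']: every message has a second copy
```

With a replication factor of 2, the toy cluster loses nothing when one node fails. If a second node fails, messages whose two copies both lived on the failed nodes are gone, which is why the replication factor is a trade-off between storage cost and tolerance to failures.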

The architecture of Apache Kafka is shown in the diagram below:

[Figure: Apache Kafka Architecture]

Apache Kafka consists of the following components:

  1. The producer sends messages to the broker through a push mechanism.

  2. The consumer reads data from the broker through a pull mechanism.

  3. The broker is a lightweight component that handles TCP connections and writes data to an append-only log file.

  4. ZooKeeper acts as a coordinator between the broker and the consumer.
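The push and pull roles described above can be sketched with a small in-memory model. This is a simplified illustration in Python, not Kafka's client API: ToyBroker, ToyProducer, ToyConsumer, and their methods are hypothetical names, the log lives in a Python list rather than a file, and the coordination role played by ZooKeeper is left out entirely.

```python
class ToyBroker:
    """Holds an append-only log; records are only ever added at the end."""

    def __init__(self):
        self._log = []

    def append(self, record):
        """Add a record to the end of the log and return its offset."""
        self._log.append(record)
        return len(self._log) - 1

    def fetch(self, offset, max_records):
        """Return up to `max_records` records starting at `offset`."""
        return self._log[offset:offset + max_records]


class ToyProducer:
    """Pushes records to the broker."""

    def __init__(self, broker):
        self.broker = broker

    def send(self, record):
        return self.broker.append(record)


class ToyConsumer:
    """Pulls records from the broker and remembers how far it has read."""

    def __init__(self, broker):
        self.broker = broker
        self.offset = 0  # position of the next record to read

    def poll(self, max_records=10):
        records = self.broker.fetch(self.offset, max_records)
        self.offset += len(records)
        return records


broker = ToyBroker()
producer = ToyProducer(broker)
consumer = ToyConsumer(broker)

for text in ("m1", "m2", "m3"):
    producer.send(text)                # producer pushes

print(consumer.poll(max_records=2))    # ['m1', 'm2']: consumer pulls at its own pace
print(consumer.poll())                 # ['m3']
print(consumer.poll())                 # []: nothing new yet
```

Because the consumer asks for data instead of having it pushed, it controls its own read rate, and the broker in this sketch only has to append records and serve slices of the log.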
