DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Related

  • Event-Driven Pipelines With Apache Pulsar and Go
  • Contract-First Integration: Building Scalable Systems With Flyway, OpenAPI, and Kafka
  • Kafka and Spark Structured Streaming in Enterprise: The Patterns That Hold Up Under Pressure
  • Exactly-Once Processing: Myth vs Reality

Trending

  • Advanced Error Handling and Retry Patterns in Enterprise REST Integrations
  • Good Data, Bad Metric: A Mutation Testing Pattern for Analytics Engineering
  • Liquid Glass, Material 3, and a Lot of Plumbing
  • Mastering Fluent Bit: Beginners' Guide for Contributing to Our CNCF Project Website
  1. DZone
  2. Data Engineering
  3. Big Data
  4. A Beginner's Guide to Apache Kafka

A Beginner's Guide to Apache Kafka

A bare bones, bare necessities guide to what Apache Kafka can do and why it is popular.

By 
Shiv Shet user avatar
Shiv Shet
·
Mar. 23, 16 · Tutorial
Likes (7)
Comment
Save
Tweet
Share
10.8K Views

Join the DZone community and get the full member experience.

Join For Free

A normal messaging queue is not capable of handling big data, which is where a Distributed Messaging Queue comes to the rescue.

Features of a Distributed Messaging System

  • It should be scalable, meaning it should easily scale to thousands of nodes.
  • It should be fault tolerant in such a way that it should work even if some nodes in a cluster go down.
  • It should support replication.
  • There shouldn't be a single point of failure, the  system should work even if some node goes down.
  • It should have higher throughput, it should handle millions of messages per second.

This is where Apache Kafka fits in the world of distributed messaging.

Features of Apache Kafka

  • It can easily scale to thousands of nodes in no time.
  • It is durable. Messages are persisted into file system and even replicated across clusters.
  • It is fault tolerant.
  • It has no single point of failure.
  • It supports replication in such a way that messages are replicated across a cluster.
  • It has higher throughput.
  • It is a peer-to-peer architecture and doesn’t follow master-slave.
  • It is open sourced by LinkedIn to the Apache Community.

Please see this architecture diagram of Apache Kafka below:

Apache Kafka- Architecture

Apache Kafka consists of the following components mentioned below:

  1. The producer sends a message to the broker through the push mechanism.

  2. The consumer reads data from the broker through the pull mechanism.

  3. The broker is a very lightweight component that handles just TCP connections and writes data to a append only log file.

  4. Zookeeper acts a coordinator between the broker and consumer.

kafka

Opinions expressed by DZone contributors are their own.

Related

  • Event-Driven Pipelines With Apache Pulsar and Go
  • Contract-First Integration: Building Scalable Systems With Flyway, OpenAPI, and Kafka
  • Kafka and Spark Structured Streaming in Enterprise: The Patterns That Hold Up Under Pressure
  • Exactly-Once Processing: Myth vs Reality

Partner Resources

×

Comments

The likes didn't load as expected. Please refresh the page and try again.

  • RSS
  • X
  • Facebook

ABOUT US

  • About DZone
  • Support and feedback
  • Community research

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 215
  • Nashville, TN 37211
  • [email protected]

Let's be friends:

  • RSS
  • X
  • Facebook