
Apache Pulsar: Distributed Pub-Sub Messaging System


Apache Pulsar is an open-source distributed pub-sub messaging system that was originally created at Yahoo! and is now part of the Apache Software Foundation.

Pulsar is a multi-tenant, high-performance solution for server-to-server messaging.

Pulsar's key features include:

Architecture Overview

At the highest level, a Pulsar instance is composed of one or more Pulsar clusters. Clusters within an instance can replicate data amongst themselves.
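The relationship between an instance and its clusters can be sketched as a toy in-memory model. This is only an illustration of the idea, not Pulsar's implementation or client API: the `Cluster` and `Instance` classes, the cluster names, and the `orders` topic are all hypothetical, and replication is done inline rather than over a network.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """One cluster, reduced to a per-topic message log (toy model)."""
    name: str
    topics: dict = field(default_factory=dict)

    def append(self, topic, message):
        self.topics.setdefault(topic, []).append(message)


@dataclass
class Instance:
    """An instance: one or more clusters that replicate to each other."""
    clusters: dict = field(default_factory=dict)

    def add_cluster(self, name):
        self.clusters[name] = Cluster(name)
        return self.clusters[name]

    def publish(self, cluster_name, topic, message):
        # Store the message in the cluster that received it, then copy it
        # to every other cluster in the instance. A real system would do
        # this over the network; the sketch does it inline.
        self.clusters[cluster_name].append(topic, message)
        for name, cluster in self.clusters.items():
            if name != cluster_name:
                cluster.append(topic, message)


instance = Instance()
for name in ("us-west", "us-east"):  # hypothetical cluster names
    instance.add_cluster(name)

instance.publish("us-west", "orders", "order-1")
print(instance.clusters["us-east"].topics["orders"])
```

A message published to one cluster becomes readable from the other, which is the property the paragraph above describes.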

The diagram below provides an illustration of a Pulsar cluster:

Pulsar Comparison With Apache Kafka

The table below lists the similarities and differences between Apache Pulsar and Apache Kafka:


| | Kafka | Pulsar |
|---|---|---|
| Concepts | Producer-topic-consumer group-consumer | Producer-topic-subscription-consumer |
| Consumption | More focused on streaming, with exclusive messaging on partitions. No shared consumption. | Unified messaging model and API: streaming via exclusive or failover subscriptions; queuing via shared subscriptions. |
| Acking | Simple offset management. Prior to Kafka 0.8, offsets are stored in ZooKeeper; after Kafka 0.8, offsets are stored on offset topics. | Unified messaging model and API: streaming via exclusive or failover subscriptions; queuing via shared subscriptions. |
| Retention | Messages are deleted based on retention. If a consumer doesn’t read messages before the retention period expires, it loses data. | Messages are deleted only after all subscriptions have consumed them, so no data is lost even if the consumers of a subscription are down for a long time. Messages can also be kept for a configured retention period after all subscriptions have consumed them. |
| TTL | No TTL support | Supports message TTL |
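The Pulsar column can be made concrete with a small in-memory sketch. It is a toy model under stated assumptions, not the Pulsar client API: the `Subscription` and `Topic` classes, the mode strings, and the consumer names are hypothetical. It assumes a shared subscription hands messages to its consumers in round-robin order, that exclusive and failover subscriptions deliver to a single active consumer, and that a message leaves the backlog only once every subscription has acknowledged it. The configured retention period and message TTL are left out.

```python
class Subscription:
    """Toy model of one subscription on a topic."""

    def __init__(self, mode, consumers):
        if mode not in ("exclusive", "failover", "shared"):
            raise ValueError(f"unknown mode: {mode}")
        if mode == "exclusive" and len(consumers) != 1:
            raise ValueError("an exclusive subscription takes one consumer")
        self.mode = mode
        self.consumers = list(consumers)
        self.acked = set()
        self._next = 0  # round-robin position for shared mode

    def dispatch(self):
        """Return the consumer that receives the next message."""
        if self.mode == "shared":
            # Queuing: spread messages across all attached consumers.
            consumer = self.consumers[self._next % len(self.consumers)]
            self._next += 1
            return consumer
        # Streaming: a single active consumer gets every message.
        return self.consumers[0]

    def disconnect(self, consumer):
        # In failover mode the next consumer in line becomes active.
        self.consumers.remove(consumer)


class Topic:
    def __init__(self):
        self.backlog = {}  # message id -> payload
        self.subscriptions = {}
        self._next_id = 0

    def subscribe(self, name, mode, consumers):
        self.subscriptions[name] = Subscription(mode, consumers)

    def publish(self, payload):
        message_id = self._next_id
        self._next_id += 1
        self.backlog[message_id] = payload
        # Every subscription receives its own copy of the message.
        return {name: sub.dispatch() for name, sub in self.subscriptions.items()}

    def ack(self, name, message_id):
        self.subscriptions[name].acked.add(message_id)
        # Delete only after all subscriptions have acknowledged.
        if all(message_id in s.acked for s in self.subscriptions.values()):
            self.backlog.pop(message_id, None)


topic = Topic()
topic.subscribe("audit", "failover", ["a1", "a2"])
topic.subscribe("workers", "shared", ["w1", "w2"])

deliveries = [topic.publish(f"m{i}") for i in range(3)]
print(deliveries)

topic.ack("workers", 0)
still_retained = 0 in topic.backlog  # "audit" has not acknowledged yet
topic.ack("audit", 0)
deleted = 0 not in topic.backlog

topic.subscriptions["audit"].disconnect("a1")
after_failover = topic.publish("m3")
```

In this sketch the `audit` subscription always delivers to `a1` until that consumer disconnects, while `workers` alternates between `w1` and `w2`. Message 0 stays in the backlog until both subscriptions have acknowledged it.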

Conclusion

Apache Pulsar is undergoing incubation at The Apache Software Foundation (ASF), sponsored by the Apache Incubator PMC. Features such as its unified queuing and streaming model, retention based on subscription consumption, and message TTL support position it as a competitive alternative to Apache Kafka.




Published at DZone with permission of
