
Data Streaming Made Easy With Apache Kafka

Learn how to use Python in tandem with Apache Kafka to create producers and consumers in a data stream.

by John Hammink · May. 27, 19 · Tutorial

Apache Kafka Message Streams

In a previous post, we introduced Apache Kafka and examined the rationale behind the publish-subscribe (pub-sub) model. In another, we examined some scenarios where loosely coupled components, like some of those in a microservices architecture (MSA), could be well served by the asynchronous communication that Apache Kafka provides.

Apache Kafka is a distributed, partitioned, replicated commit log service. It provides all of the functionality of a messaging system, with a distinctive design.

In this post, we’ll look at the Pub/Sub model and examine why it’s an excellent choice for asynchronous communication and loose coupling of components. Then, we’ll set up an Aiven Kafka instance, implement a producer and a consumer, and finally run it all together.

The Pub/Sub Model: Producers and Consumers

Kafka schematic. Source: Martin Kleppmann’s Kafka Summit 2018 presentation, Is Kafka a Database?

As noted previously, Apache Kafka passes messages via a publish-subscribe model where software components called producers append events to distributed logs called topics, which are essentially named, append-only data feeds.

Consumers, on the other hand, consume data from these topics by offset (the record's position in the log; in Kafka, offsets are tracked per partition of a topic). Because consumers decide what they will consume, the setup is less complicated than it would be with producer-based routing rules.

In other words, the producer simply appends messages to the topic, and the consumer decides, by offset, which messages to consume from the topic!
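To make the append-and-read-by-offset idea concrete, here is a minimal sketch in plain Python. It is a toy in-memory model, not Kafka itself: the `TopicLog` class and its methods are hypothetical names invented for illustration, and it models only a single partition.

```python
class TopicLog:
    """Toy in-memory stand-in for one partition of a topic (not a real Kafka client)."""

    def __init__(self):
        self._records = []

    def append(self, value):
        """Producer side: append a record and return the offset it was stored at."""
        self._records.append(value)
        return len(self._records) - 1

    def read(self, offset, max_records=10):
        """Consumer side: return up to max_records records starting at offset."""
        return self._records[offset:offset + max_records]


log = TopicLog()
for n in range(5):
    log.append("event-{}".format(n))

# Two independent reads of the same log, each choosing its own starting offset.
from_start = log.read(offset=0, max_records=2)
from_middle = log.read(offset=3)

print(from_start)   # ['event-0', 'event-1']
print(from_middle)  # ['event-3', 'event-4']
```

Note that reading does not remove anything from the log; the producer never needs to know who is reading or from where.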

The producer/consumer, or pub-sub, model serves communication patterns such as those within MSAs particularly well when loose coupling between components and asynchronous communication are required. Why?

Because there is no time or task dependency between when the producer creates and sends a message and when the consumer receives or acts on it. In other words, the producer produces messages in its own time and at its own pace, and the consumer does the same, without worrying about synchronization locks.

Our demo, which we’ll walk through in the following sections, illustrates this.

Setting Up Your Aiven Kafka Instance

We’ll start by spinning up an Aiven Kafka service in the cloud. In Aiven’s Services pane, create a new Kafka service instance:

Creating a New Kafka Service in the Aiven Console

From the Connection parameters section within the Kafka service’s overview pane, copy the URL (you’ll reference it in your producer and consumer code later on).

Kafka Service Overview in the Aiven Console

Also, generate the following files and copy them to the local directory where your producer and consumer files will be created:

  • ca.pem
  • service.cert
  • service.key

Keys and Certificates in the Aiven console

Finally, from the Kafka service’s Topics tab, generate a topic called demo-topic. This is the topic you’ll be writing to.
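As a rough sketch of how the service URL and the three downloaded files might be wired into a client, the snippet below builds connection settings for the kafka-python library. That library choice is an assumption on our part, as are the helper names `build_ssl_config` and `make_producer` and the placeholder URL; substitute the URL you copied from the console.

```python
from pathlib import Path


def build_ssl_config(service_url, cert_dir="."):
    """Return keyword arguments for a kafka-python client that uses SSL client certs."""
    cert_dir = Path(cert_dir)
    return {
        "bootstrap_servers": service_url,
        "security_protocol": "SSL",
        "ssl_cafile": str(cert_dir / "ca.pem"),
        "ssl_certfile": str(cert_dir / "service.cert"),
        "ssl_keyfile": str(cert_dir / "service.key"),
    }


def make_producer(service_url, cert_dir="."):
    """Create a producer. Needs kafka-python installed and a reachable service."""
    # Imported lazily so the config helper above runs without the library.
    from kafka import KafkaProducer
    return KafkaProducer(**build_ssl_config(service_url, cert_dir))


# Placeholder URL for illustration only.
config = build_ssl_config("kafka-example.aivencloud.com:12345")
print(config["security_protocol"])
```

The same keyword arguments can be passed to a consumer, so keeping them in one helper avoids repeating the certificate paths in both scripts.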

Creating a Kafka Topic in the Aiven Console

Introducing the iBeacon

iBeacon is a protocol, with accompanying hardware transmitters, developed by Apple and introduced at WWDC in 2013. Additional vendors have followed up with beacon systems of their own, following the same protocol. The technology allows smartphones, tablets, and other devices to do certain things when in a specified proximity to a given beacon.

These include actions like determining a device’s (and customer’s) physical location, tracking customers, sending targeted ads to customer devices based on where in the retail space the customers are located, and so on. The iBeacon protocol and methods are similar in function to those derived from GPS, but more accurate within small spaces and less battery-intensive.

For our purposes, we’ll be simulating an iBeacon that sends events with the following fields:

  • uuid: universally unique identifier for a specific beacon. Ours are 27 characters in length.
  • major: value assigned to a specific iBeacon to incrementally fine-tune identification beyond uuid. major corresponds to a specific group of beacons, for example, those located in a specific department or floor. Major values are unsigned integers between 0 and 65535.
  • measured power: a factory-calibrated, read-only constant that indicates the expected RSSI at a distance of one meter from the beacon.
  • rssi: stands for Received Signal Strength Indicator, and indicates the beacon’s signal strength from the point of view of the receiver, for example, a customer’s tablet. Combined with measured power, it lets you estimate the distance between the device and the beacon.
  • accuracy: some iBeacon devices also calculate a string for signal accuracy.
  • proximity: one of ‘Near’, ‘Immediate’, ‘Far’, or ‘Distant’.
For demo purposes, we’ve taken some shortcuts. As configured, our producer emits events faster than the consumer consumes them. This is intentional: we want to illustrate how Kafka provides temporary persistence for stored messages that have not yet been delivered.

Setting Up Your Producer

The excerpt below should give you an overview of the producer; the original article links to the whole code block.

# Excerpt: producer, iterator, and the generate_*/choose_* helpers are defined in the full script.
for i in range(iterator):
    # Slow down / throttle produce calls somewhat for the demo
    sleep(1)
    # Generate the simulated field values
    uuid = str(generate_uuid(length=27))
    major = str(generate_major(length=5))
    measured_power = str(generate_measuredPower())
    rssi = str(generate_rssi())
    accuracy_substring = str(generate_accuracy_substring(length=16))
    proximity_choice = str(choose_proximity())

    # Pack the fields into a single plain-text message
    message = (
        "message number " + str(i)
        + " uuid: " + uuid
        + " major: " + major
        + " measured_power: " + measured_power
        + " rssi: " + rssi
        + " accuracy_substring: " + accuracy_substring
        + " proximity_choice: " + proximity_choice
    )

    print("Sending: {}".format(message))
    # Encode the text to bytes before sending it to the topic
    producer.send("demo-topic", message.encode("utf-8"))
    # Block until the buffered message has actually been sent
    producer.flush()

This code simulates an iBeacon: it generates the kind of fields that your iBeacon is likely to send, packs the fields into a message, and sends the message to the topic as a Kafka producer. For the sake of simplicity, we’ve avoided JSON handling for now. Rinse, lather, repeat.
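If you later want structured messages instead of a concatenated string, a JSON variant could look like the sketch below. Everything here is hypothetical: the `simulate_event` generator stands in for the helper functions of the full script (which are not shown in this post), and the value ranges for measured_power and rssi are arbitrary choices for illustration.

```python
import json
import random
import string


def simulate_event(sequence_number, rng=random):
    """Build one simulated beacon event as a dict (hypothetical generators)."""
    alphabet = string.ascii_lowercase + string.digits
    return {
        "sequence": sequence_number,
        "uuid": "".join(rng.choice(alphabet) for _ in range(27)),
        "major": rng.randint(0, 65535),
        "measured_power": rng.randint(-70, -50),
        "rssi": rng.randint(-100, -40),
        "accuracy": "".join(rng.choice(string.digits) for _ in range(16)),
        "proximity": rng.choice(["Near", "Immediate", "Far", "Distant"]),
    }


def serialize(event):
    """Encode an event dict as UTF-8 JSON bytes, ready to send."""
    return json.dumps(event).encode("utf-8")


def deserialize(payload):
    """Decode UTF-8 JSON bytes back into an event dict."""
    return json.loads(payload.decode("utf-8"))


event = simulate_event(0, random.Random(42))  # seeded for repeatable output
payload = serialize(event)
print(deserialize(payload) == event)  # True
```

With kafka-python (assumed here), functions like these can be passed as `value_serializer` to the producer and `value_deserializer` to the consumer, so the rest of the code works with dicts rather than bytes.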

We’ve added a short sleep between message iterations so that you can actually see the messages go by on the console as they are produced.

Setting Up Your Consumer

The excerpt below should give you an overview of the consumer; the original article links to the whole code block.

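# Excerpt: consumer is created in the full script.
while True:
    # Wait up to 100 seconds for new messages
    raw_msgs = consumer.poll(timeout_ms=100000)
    # The result maps each topic partition to the list of messages fetched from it
    for tp, msgs in raw_msgs.items():
        for msg in msgs:
            print("Received: {}".format(msg.value))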
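The nested loop above reflects the shape of what `poll()` returns. The sketch below mimics that shape with stand-in objects so it runs without a Kafka connection. It assumes the kafka-python library, where the result is a mapping from topic partitions to lists of records; the `TopicPartition` and `Record` namedtuples here are simplified stand-ins, and `decode_batch` is a hypothetical helper.

```python
from collections import namedtuple

# Simplified stand-ins for the objects a kafka-python poll() call returns.
TopicPartition = namedtuple("TopicPartition", ["topic", "partition"])
Record = namedtuple("Record", ["offset", "value"])


def decode_batch(raw_msgs):
    """Flatten a {partition: [records]} mapping into (partition, offset, text) tuples."""
    decoded = []
    for tp, msgs in raw_msgs.items():
        for msg in msgs:
            # Message values arrive as bytes; decode them back to text.
            decoded.append((tp.partition, msg.offset, msg.value.decode("utf-8")))
    return decoded


fake_batch = {
    TopicPartition("demo-topic", 0): [
        Record(0, b"message number 0"),
        Record(1, b"message number 1"),
    ]
}

for partition, offset, text in decode_batch(fake_batch):
    print("partition {} offset {}: {}".format(partition, offset, text))
```

An empty mapping, which is what a poll that times out would produce, simply yields an empty list.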

Running It All Together

Open two command prompts, both set to the directory that contains your producer and consumer scripts.

  • In the first, enter: $ python ibeacon_producer.py 500000

You can set the trailing value (the iteration count) to anything you want; a higher value simply ensures that the producer runs continuously, so that the consumer, started in the next step, doesn’t immediately time out.

  • In the second, enter: $ python ibeacon_consumer.py

Now you should have the producer and consumer running side by side:

Kafka Producer (left) and Consumer (right) Streams

In this example, the producer writes to the log faster than the consumer reads from it; this is built into the scripts. We did this intentionally, using sleep commands in each, to illustrate that the consumer need not be tightly coupled to the producer and can consume events, the ones it chooses, at its own pace.
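The effect of a fast producer and a slow consumer can be simulated without Kafka at all. The sketch below is a toy model with hypothetical names: a plain list stands in for the topic, the producer appends one record per tick, and the consumer reads one record every third tick.

```python
def simulate(ticks, consume_every=3):
    """Producer appends one record per tick; consumer reads one every few ticks."""
    log = []             # stands in for the topic
    consumer_offset = 0  # next offset the consumer will read
    consumed = []
    for tick in range(ticks):
        log.append("event-{}".format(tick))
        if tick % consume_every == 0 and consumer_offset < len(log):
            consumed.append(log[consumer_offset])
            consumer_offset += 1
    # Lag: records written but not yet read. They stay in the log.
    lag = len(log) - consumer_offset
    return consumed, lag


consumed, lag = simulate(ticks=9)
print(consumed)  # ['event-0', 'event-1', 'event-2']
print(lag)       # 6
```

The consumer falls behind, but it still reads every record in order; the unread records simply wait in the log until it catches up.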

Wrapping Up

We’ve reviewed the producer-consumer (or pub/sub) communication model and looked at why it may be quite useful for systems that require asynchronous communication between components.

Then we set up an Aiven Kafka instance, implemented a producer and a consumer, and ran the lot together. Our consumer ingests messages more slowly than our producer writes them, demonstrating how Kafka can serve as a temporary persistence store for messages that have yet to be read.

We hope you find this example useful and illustrative!

Published at DZone with permission of John Hammink, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
