DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports Events Over 2 million developers have joined DZone. Join Today! Thanks for visiting DZone today,
Edit Profile Manage Email Subscriptions Moderation Admin Console How to Post to DZone Article Submission Guidelines
View Profile
Sign Out
Refcards
Trend Reports
Events
Zones
Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
  1. DZone
  2. Data Engineering
  3. IoT
  4. Deep Learning KSQL UDF for Streaming Detection of MQTT IoT Sensor Data

Deep Learning KSQL UDF for Streaming Detection of MQTT IoT Sensor Data

Want to learn more about how to build a hybrid for machine learning? Check out this post on using Apache Kafka and sensor data for streaming MQTT detection.

Kai Wähner user avatar by
Kai Wähner
CORE ·
Aug. 03, 18 · Tutorial
Like (9)
Save
Tweet
Share
11.20K Views

Join the DZone community and get the full member experience.

Join For Free

I built a scenario for a hybrid machine learning infrastructure, leveraging Apache Kafka as a scalable central nervous system. The public cloud is used for training analytic models at an extreme scale (e.g. using TensorFlow and TPUs on Google Cloud Platform (GCP) via Google ML Engine. The predictions (i.e. model inference) are executed on-premise at the edge in a local Kafka infrastructure (e.g. leveraging Kafka Streams or KSQL for streaming analytics).

This post focuses on the on-premise deployment. I created a Github project with a KSQL UDF for sensor analytics. It leverages the new API features of KSQL to build UDF/UDAF functions easily with Java to do continuous stream processing on incoming events.

Use Case: Connected Cars and Real-Time Streaming Analytics Using Deep Learning

Continuously processing millions of events from connected devices (sensors of cars in this example)

Image title

I built different analytic models for this purpose. They are trained on public cloud leveraging TensorFlow, H2O, and Google ML Engine. Model creation is not the focus of this example.The final model is ready for production and can be deployed for doing predictions in real time.

Model serving can be done via a model server or natively embedded into the stream processing application. See the trade-offs of RPC vs. Stream Processing for model deployment and a “TensorFlow + gRPC + Kafka Streams” example here.

Demo: Model Inference at the Edge With MQTT, Kafka, and KSQL

The Github project generates car sensor data, forwards it via Confluent MQTT Proxy to Kafka cluster for KSQL processing and real-time analytics.

This project focuses on the ingestion of data into Kafka via MQTT and the processing of data via KSQL:

Image title

A great benefit of Confluent MQTT Proxy is simplicity for realizing IoT scenarios without the need for an MQTT Broker. You can forward messages directly from the MQTT devices to Kafka via the MQTT Proxy. This reduces efforts and costs significantly. This is a perfect solution, if you “just” want to communicate between Kafka and MQTT devices.

If you want to see the other part of the story (integration with sink applications like Elasticsearch / Grafana), please take a look at the Github project “KSQL for streaming IoT data." This realizes the integration with ElasticSearch and Grafana via Kafka Connect and the Elastic connector.

KSQL UDF Source Code

It is pretty easy to develop UDFs. Just implement the function in one Java method within a UDF class:

            @Udf(description = "apply analytic model to sensor input")
            public String anomaly(String sensorinput){ "YOUR LOGIC" }


Here is the full source code for the Anomaly Detection KSQL UDF.

How to Run the Demo With Apache Kafka and MQTT Proxy?

All steps to execute the demo are described in the Github project. You just need to install the Confluent Platform and, then, follow these steps to deploy the UDF, create MQTT events, and process them via KSQL, leveraging the analytic model.

I use Mosquitto to generate MQTT messages. Of course, you can use any other MQTT client, too. That is the great benefit of an open and standardized protocol.

Hybrid Cloud Architecture for Apache Kafka and Machine Learning

If you want to learn more about the concepts behind a scalable, vendor-agnostic machine learning infrastructure, take a look at my presentation on Slideshare or watch the recording of the corresponding Confluent webinar, “Unleashing Apache Kafka and TensorFlow in the Cloud."

MQTT kafka Machine learning Deep learning Data (computing) IoT Stream processing

Published at DZone with permission of Kai Wähner, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

Popular on DZone

  • Kubernetes vs Docker: Differences Explained
  • Easy Smart Contract Debugging With Truffle’s Console.log
  • What Is Policy-as-Code? An Introduction to Open Policy Agent
  • Kotlin Is More Fun Than Java And This Is a Big Deal

Comments

Partner Resources

X

ABOUT US

  • About DZone
  • Send feedback
  • Careers
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 600 Park Offices Drive
  • Suite 300
  • Durham, NC 27709
  • support@dzone.com
  • +1 (919) 678-0300

Let's be friends: