Apache Kafka 2.4.1 + Spark Streaming 3.0.0 Preview + Scala Example Code


In this article, take a look at Apache Kafka and a Spark streaming preview plus a Scala code example.


OK, let's get straight into the code. Here is an example of how to integrate Spark Streaming with Kafka. In this example, I read data from two Kafka topics, transform it (map, flatMap, join), and then send the transformed data on to other Kafka topics.

The code below is written in Scala, since Spark works well with Scala. The Spark version used here is 3.0.0-preview, and the Kafka version is 2.4.1. I suggest using the Scala IDE build of the Eclipse SDK for coding.

  1. First, get all the JARs listed below. You can download them from the Maven repository.
  • Spark 3.0.0 preview
  • spark-core_2.12-3.0.0-preview
  • spark-streaming_2.12-3.0.0-preview
  • spark-streaming-kafka-0-10_2.12-3.0.0-preview
  • kafka-clients-2.4.1
  • streaming-kafka_2.12-0.8.0
  • spark-token-provider-kafka-0-10_2.12
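If you would rather have a build tool resolve these instead of downloading JARs by hand, the same dependencies can be declared in sbt. This is only a sketch, assuming Scala 2.12 and the versions listed above:

```scala
// build.sbt -- a sketch mirroring the JAR list above
name := "KafkaStreaming"
scalaVersion := "2.12.10"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"                  % "3.0.0-preview",
  "org.apache.spark" %% "spark-streaming"             % "3.0.0-preview",
  "org.apache.spark" %% "spark-streaming-kafka-0-10"  % "3.0.0-preview",
  "org.apache.kafka" %  "kafka-clients"               % "2.4.1"
)
```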

2. Create a Scala project, and in it create a Scala object called “KafkaStreaming”.


3. Add all the above-listed Jars to the project.

4. Create the topics test, test1, vowel, consonant, and number in Kafka. The command below creates a topic in Kafka; run it once for each of the five topics.

kafka-topics.bat --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test
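Spelled out for all five topics (assuming the same local ZooKeeper address and single-broker settings as above), the commands look like this:

```shell
kafka-topics.bat --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test
kafka-topics.bat --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test1
kafka-topics.bat --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic vowel
kafka-topics.bat --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic consonant
kafka-topics.bat --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic number
```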

5. Below is the Scala code to read strings from the two Kafka topics test and test1 with a batch interval of 15 seconds (you can change it). After reading, we create jobs to count all vowels, consonants (this bucket also picks up special characters, so feel free to change the logic when you use it), and numbers in the data, then send the vowel details to the vowel topic, the consonant details to the consonant topic, and the number details to the number topic.
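The counting logic itself is plain Scala and can be sketched independently of Spark. The object and method names below are my own, not from the original listing; as noted above, anything that is neither a vowel nor a digit lands in the consonant bucket, special characters included:

```scala
object CharClassifier {
  // Classify a single character as "vowel", "number", or "consonant".
  // Anything that is neither a vowel nor a digit is treated as a
  // consonant, so special characters are counted there too.
  def classify(c: Char): String =
    if ("aeiouAEIOU".contains(c)) "vowel"
    else if (c.isDigit) "number"
    else "consonant"

  // Count vowels, consonants, and numbers in a string.
  def counts(s: String): Map[String, Int] =
    s.map(classify).groupBy(identity).map { case (k, v) => k -> v.length }
}
```

The Spark job then only has to apply this per record and route each result to the matching topic.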

In short: read from two topics, transform (map, flatMap, join), and send the transformed data to three Kafka topics.
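Since the original listing is not reproduced here, the pipeline can be sketched roughly as follows. This is a sketch under stated assumptions, not the author's exact code: the topic names and 15-second batch interval come from the article, while the helper name `classify`, the consumer group id, and routing individual characters to topics are my own choices (the join mentioned above is omitted for brevity). It assumes a Kafka broker on localhost:9092 and the JARs from step 1 on the classpath.

```scala
import java.util.Properties

import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent

object KafkaStreaming {
  // Same classification rule as described above: non-vowel,
  // non-digit characters (special characters included) count as consonants.
  def classify(c: Char): String =
    if ("aeiouAEIOU".contains(c)) "vowel"
    else if (c.isDigit) "number"
    else "consonant"

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[*]").setAppName("KafkaStreaming")
    val ssc  = new StreamingContext(conf, Seconds(15)) // 15-second batch interval

    val kafkaParams = Map[String, Object](
      "bootstrap.servers"  -> "localhost:9092",
      "key.deserializer"   -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id"           -> "kafka-streaming-example", // assumed group id
      "auto.offset.reset"  -> "latest"
    )

    // Read from the two input topics.
    val stream = KafkaUtils.createDirectStream[String, String](
      ssc, PreferConsistent, Subscribe[String, String](Seq("test", "test1"), kafkaParams))

    // Split each record into characters and tag each with its class,
    // which doubles as the name of its destination topic.
    val tagged = stream.flatMap(r => r.value.map(c => (classify(c), c.toString)))

    // For each batch, send every tagged character to the matching topic.
    tagged.foreachRDD { rdd =>
      rdd.foreachPartition { part =>
        val props = new Properties()
        props.put("bootstrap.servers", "localhost:9092")
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
        val producer = new KafkaProducer[String, String](props)
        part.foreach { case (topic, value) =>
          producer.send(new ProducerRecord[String, String](topic, value))
        }
        producer.close()
      }
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Creating the producer inside `foreachPartition` (rather than on the driver) is the usual pattern, since producers are not serializable and must be instantiated on the executors.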


6. Once the class is done, run the program as a Scala application, and the Spark UI will be available at localhost:4040 if port 4040 is free. Below is the Spark dashboard.

7. Start sending data (string data) to the Kafka topics test and test1. You will see the data printed in the console, and you can see task and job timings in the Spark dashboard.

8. You can check the streamed data at the other end in the topics vowel, consonant, and number in Kafka. Run the command below in CMD to consume the vowel topic, and do the same for the other topics in separate command prompts.

kafka-console-consumer.bat --bootstrap-server localhost:9092 --topic vowel



Opinions expressed by DZone contributors are their own.
