

Kafka on Kubernetes, the Strimzi Way! (Part 1)

In this article, learn how to run Kafka on Kubernetes.

By Abhishek Gupta · Jul. 30, 2020 · Tutorial

Some of my previous blog posts (such as Kafka Connect on Kubernetes, the easy way!) demonstrate how to use Kafka Connect in a Kubernetes-native way. This is the first in a series of blog posts that will cover Apache Kafka on Kubernetes using the Strimzi Operator. In this post, we will start off with the simplest possible setup, i.e., a single-node Kafka (and Zookeeper) cluster, and learn:

  • Strimzi overview and setup
  • Kafka cluster installation
  • Kubernetes resources used/created behind the scenes
  • Testing the Kafka setup using clients within the Kubernetes cluster

The code is available on GitHub - https://github.com/abhirockzz/kafka-kubernetes-strimzi

What Do I Need to Try This Out?

kubectl - https://kubernetes.io/docs/tasks/tools/install-kubectl/

I will be using Azure Kubernetes Service (AKS) to demonstrate the concepts, but by and large it is independent of the Kubernetes provider (e.g. feel free to use a local setup such as minikube). If you want to use AKS, all you need is a Microsoft Azure account which you can get for FREE if you don't have one already.

Install Helm

I will be using Helm to install Strimzi. Here is the documentation to install Helm itself - https://helm.sh/docs/intro/install/

You can also use the YAML files directly to install Strimzi. Check out the quick start guide here - https://strimzi.io/docs/quickstart/latest/#proc-install-product-str

(Optional) Set Up Azure Kubernetes Service

Azure Kubernetes Service (AKS) makes it simple to deploy a managed Kubernetes cluster in Azure. It reduces the complexity and operational overhead of managing Kubernetes by offloading much of that responsibility to Azure. Here are examples of how you can set up an AKS cluster using:

  • Azure CLI
  • Azure portal, or
  • ARM template

Once you set up the cluster, you can easily configure kubectl to point to it:

```shell
az aks get-credentials --resource-group <CLUSTER_RESOURCE_GROUP> --name <CLUSTER_NAME>
```


Wait, What Is Strimzi?

From the Strimzi documentation:

Strimzi simplifies the process of running Apache Kafka in a Kubernetes cluster. Strimzi provides container images and Operators for running Kafka on Kubernetes. It is a part of the Cloud Native Computing Foundation as a Sandbox project (at the time of writing).

Strimzi Operators are fundamental to the project. These Operators are purpose-built with specialist operational knowledge to effectively manage Kafka. Operators simplify the process of deploying and running Kafka clusters and components, configuring and securing access to Kafka, upgrading and managing Kafka, and even taking care of managing topics and users.

Here is a diagram that shows a 10,000-foot overview of the Operator roles:

Install Strimzi

Installing Strimzi using Helm is pretty easy:

```shell
# add the Helm chart repo for Strimzi
helm repo add strimzi https://strimzi.io/charts/

# install it! (I have used strimzi-kafka as the release name)
helm install strimzi-kafka strimzi/strimzi-kafka-operator
```

This will install the Strimzi Operator (which is nothing but a Deployment), Custom Resource Definitions, and other Kubernetes components such as Cluster Roles, Cluster Role Bindings, and Service Accounts.

For more details, check out this link

To delete, simply run helm uninstall strimzi-kafka.

To confirm that the Strimzi Operator has been deployed, check its Pod (it should transition to the Running status after a while):

```shell
kubectl get pods -l=name=strimzi-cluster-operator

NAME                                        READY   STATUS    RESTARTS   AGE
strimzi-cluster-operator-5c66f679d5-69rgk   1/1     Running   0          43s
```


Check the Custom Resource Definitions as well:

```shell
kubectl get crd | grep strimzi

kafkabridges.kafka.strimzi.io           2020-04-13T16:49:36Z
kafkaconnectors.kafka.strimzi.io        2020-04-13T16:49:33Z
kafkaconnects.kafka.strimzi.io          2020-04-13T16:49:36Z
kafkaconnects2is.kafka.strimzi.io       2020-04-13T16:49:38Z
kafkamirrormaker2s.kafka.strimzi.io     2020-04-13T16:49:37Z
kafkamirrormakers.kafka.strimzi.io      2020-04-13T16:49:39Z
kafkas.kafka.strimzi.io                 2020-04-13T16:49:40Z
kafkatopics.kafka.strimzi.io            2020-04-13T16:49:34Z
kafkausers.kafka.strimzi.io             2020-04-13T16:49:33Z
```

The kafkas.kafka.strimzi.io CRD represents Kafka clusters in Kubernetes.

Now that we have the "brain" (the Strimzi Operator) wired up, let's use it!

Time to Create a Kafka Cluster!

As mentioned, we will keep things simple and start off with the following setup (which we will incrementally update as a part of subsequent posts in this series):

  • A single node Kafka cluster (and Zookeeper)
  • Available internally to clients in the same Kubernetes cluster
  • No encryption, authentication or authorization
  • No persistence (uses emptyDir volume)

To deploy a Kafka cluster, all we need to do is create a Strimzi Kafka resource. This is what it looks like:

```yaml
apiVersion: kafka.strimzi.io/v1beta1
kind: Kafka
metadata:
  name: my-kafka-cluster
spec:
  kafka:
    version: 2.4.0
    replicas: 1
    listeners:
      plain: {}
    config:
      offsets.topic.replication.factor: 1
      transaction.state.log.replication.factor: 1
      transaction.state.log.min.isr: 1
      log.message.format.version: "2.4"
    storage:
      type: ephemeral
  zookeeper:
    replicas: 1
    storage:
      type: ephemeral
```

For a detailed Kafka CRD reference, please check out the documentation - https://strimzi.io/docs/operators/master/using.html#type-Kafka-reference

We define the name (my-kafka-cluster) of the cluster in metadata.name. Here is a summary of the attributes in spec.kafka:

  • version - The Kafka broker version (defaults to 2.5.0 at the time of writing, but we're using 2.4.0)
  • replicas - Kafka cluster size i.e. the number of Kafka nodes (Pods in the cluster)
  • listeners - Configures the listeners of Kafka brokers. In this example, we are using the plain listener, which means that the cluster will be accessible to internal clients (in the same Kubernetes cluster) on port 9092 (no encryption, authentication, or authorization involved). Supported types are plain, tls, and external (see https://strimzi.io/docs/operators/master/using.html#type-KafkaListeners-reference). It is possible to configure multiple listeners (we will cover this in subsequent blog posts)
  • config - These are key-value pairs used as Kafka broker config properties
  • storage - Storage for Kafka cluster. Supported types are ephemeral, persistent-claim and jbod. We are using ephemeral in this example which means that the emptyDir volume is used and the data is only associated with the lifetime of the Kafka broker Pod (a future blog post will cover persistent-claim storage)
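
We use only the plain listener in this post. As a point of reference, here is a sketch of a listeners block that also enables the tls type, based on the KafkaListeners reference linked above (not used in this post's setup):

```yaml
# Sketch only: spec.kafka.listeners with both internal listener types
listeners:
  plain: {}  # plaintext, port 9092
  tls: {}    # TLS-encrypted, port 9093
```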

Zookeeper cluster details (spec.zookeeper) are similar to those of Kafka. In this case, we are just configuring the number of replicas and the storage type. Refer to https://strimzi.io/docs/operators/master/using.html#type-ZookeeperClusterSpec-reference for details.
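
The ephemeral storage above is fine for experimentation. Since a later post in this series will cover persistent-claim storage, here is a rough sketch of what that stanza could look like, based on the Kafka CRD reference linked above (the size and deleteClaim values are illustrative assumptions):

```yaml
# Sketch: persistent storage variant (not used in this post's ephemeral setup)
spec:
  kafka:
    storage:
      type: persistent-claim
      size: 10Gi          # illustrative PVC size
      deleteClaim: false  # keep the PersistentVolumeClaim if the cluster is deleted
```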

To create the Kafka cluster:

```shell
kubectl apply -f https://raw.githubusercontent.com/abhirockzz/kafka-kubernetes-strimzi/master/part-1/kafka.yaml
```

What's Next?

The Strimzi operator spins into action and creates many Kubernetes resources in response to the Kafka CRD instance we just created.

The following resources are created:

  • StatefulSet - Kafka and Zookeeper clusters exist in the form of StatefulSets, which are used to manage stateful workloads in Kubernetes. Please refer to https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/ and related material for details
  • Service - Kubernetes ClusterIP Service for internal access
  • ConfigMap - Kafka and Zookeeper configuration is stored in Kubernetes ConfigMaps
  • Secret - Kubernetes Secrets to store private keys and certificates for Kafka cluster components and clients. These are used for TLS encryption and authentication (covered in subsequent blog posts)

Kafka Custom Resource

```shell
kubectl get kafka

NAME               DESIRED KAFKA REPLICAS   DESIRED ZK REPLICAS
my-kafka-cluster   1                        1
```


StatefulSet and Pod

Check Kafka and Zookeeper StatefulSets using:

```shell
kubectl get statefulset/my-kafka-cluster-zookeeper
kubectl get statefulset/my-kafka-cluster-kafka
```


Kafka and Zookeeper Pods

```shell
kubectl get pod/my-kafka-cluster-zookeeper-0
kubectl get pod/my-kafka-cluster-kafka-0
```

ConfigMap

Individual ConfigMaps are created to store Kafka and Zookeeper configurations

```shell
kubectl get configmap

my-kafka-cluster-kafka-config         4      19m
my-kafka-cluster-zookeeper-config     2      20m
```


Let's peek into the Kafka configuration

```shell
kubectl get configmap/my-kafka-cluster-kafka-config -o yaml
```


The output is quite lengthy, but I will highlight the important bits. As part of the data section, there are two configuration entries for the Kafka broker - log4j.properties and server.config.

Here is a snippet of server.config. Notice the advertised.listeners (which highlights internal access over port 9092) and the User provided configuration section (the one we specified in the YAML manifest):

```properties
##############################
##############################
# This file is automatically generated by the Strimzi Cluster Operator
# Any changes to this file will be ignored and overwritten!
##############################
##############################

broker.id=${STRIMZI_BROKER_ID}
log.dirs=/var/lib/kafka/data/kafka-log${STRIMZI_BROKER_ID}

##########
# Plain listener
##########

##########
# Common listener configuration
##########
listeners=REPLICATION-9091://0.0.0.0:9091,PLAIN-9092://0.0.0.0:9092
advertised.listeners=REPLICATION-9091://my-kafka-cluster-kafka-${STRIMZI_BROKER_ID}.my-kafka-cluster-kafka-brokers.default.svc:9091,PLAIN-9092://my-kafka-cluster-kafka-${STRIMZI_BROKER_ID}.my-kafka-cluster-kafka-brokers.default.svc:9092
listener.security.protocol.map=REPLICATION-9091:SSL,PLAIN-9092:PLAINTEXT
inter.broker.listener.name=REPLICATION-9091
sasl.enabled.mechanisms=
ssl.secure.random.implementation=SHA1PRNG
ssl.endpoint.identification.algorithm=HTTPS

##########
# User provided configuration
##########
log.message.format.version=2.4
offsets.topic.replication.factor=1
transaction.state.log.min.isr=1
transaction.state.log.replication.factor=1
```

Service

If you query for Services, you should see something similar to this:

```shell
kubectl get svc

NAME                                TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)
my-kafka-cluster-kafka-bootstrap    ClusterIP   10.0.240.137   <none>        9091/TCP,9092/TCP
my-kafka-cluster-kafka-brokers      ClusterIP   None           <none>        9091/TCP,9092/TCP
my-kafka-cluster-zookeeper-client   ClusterIP   10.0.143.149   <none>        2181/TCP
my-kafka-cluster-zookeeper-nodes    ClusterIP   None           <none>        2181/TCP,2888/TCP,3888/TCP
```


my-kafka-cluster-kafka-bootstrap makes it possible for internal Kubernetes clients to access the Kafka cluster, and my-kafka-cluster-kafka-brokers is the headless Service corresponding to the StatefulSet.
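
Clients inside the Kubernetes cluster reach Kafka through the bootstrap Service using standard Kubernetes DNS. A small sketch of how that address is composed (the default namespace is an assumption; Strimzi's convention is <cluster-name>-kafka-bootstrap):

```shell
# Compose the internal bootstrap address a client in the cluster would use
CLUSTER_NAME=my-kafka-cluster
NAMESPACE=default   # assumption: the Kafka cluster was created in the default namespace

BOOTSTRAP_SERVER="${CLUSTER_NAME}-kafka-bootstrap.${NAMESPACE}.svc:9092"
echo "${BOOTSTRAP_SERVER}"
```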

Secret

Although we're not using them, it's helpful to look at the Secrets created by Strimzi:

```shell
kubectl get secret

my-kafka-cluster-clients-ca               Opaque
my-kafka-cluster-clients-ca-cert          Opaque
my-kafka-cluster-cluster-ca               Opaque
my-kafka-cluster-cluster-ca-cert          Opaque
my-kafka-cluster-cluster-operator-certs   Opaque
my-kafka-cluster-kafka-brokers            Opaque
my-kafka-cluster-kafka-token-vb2qt        kubernetes.io/service-account-token
my-kafka-cluster-zookeeper-nodes          Opaque
my-kafka-cluster-zookeeper-token-xq8m2    kubernetes.io/service-account-token
```

  • my-kafka-cluster-cluster-ca-cert - the cluster CA certificate used to sign Kafka broker certificates; a connecting client uses it to establish a TLS-encrypted connection
  • my-kafka-cluster-clients-ca-cert - the clients CA certificate with which a user signs its own client certificate to allow mutual authentication against the Kafka cluster

Ok, but does it work?

Let's take it for a spin!

Create a producer Pod:

```shell
export KAFKA_CLUSTER_NAME=my-kafka-cluster

kubectl run kafka-producer -ti --image=strimzi/kafka:latest-kafka-2.4.0 --rm=true --restart=Never -- bin/kafka-console-producer.sh --broker-list $KAFKA_CLUSTER_NAME-kafka-bootstrap:9092 --topic my-topic
```


In another terminal, create a consumer Pod:

```shell
export KAFKA_CLUSTER_NAME=my-kafka-cluster

kubectl run kafka-consumer -ti --image=strimzi/kafka:latest-kafka-2.4.0 --rm=true --restart=Never -- bin/kafka-console-consumer.sh --bootstrap-server $KAFKA_CLUSTER_NAME-kafka-bootstrap:9092 --topic my-topic --from-beginning
```

The above demonstration was taken from the Strimzi doc - https://strimzi.io/docs/operators/master/deploying.html#deploying-example-clients-str

You can use other clients as well.
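
The console producer above relies on the topic being auto-created on first use. Since Strimzi's Topic Operator also manages topics, the same topic could instead be declared as a KafkaTopic resource. Here is a sketch, with partition and replica counts chosen to match our single-node cluster:

```yaml
apiVersion: kafka.strimzi.io/v1beta1
kind: KafkaTopic
metadata:
  name: my-topic
  labels:
    strimzi.io/cluster: my-kafka-cluster  # ties the topic to our Kafka cluster
spec:
  partitions: 1
  replicas: 1
```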

We're Just Getting Started...

We started small, but have a Kafka cluster on Kubernetes, and it works (hopefully for you as well!). As I mentioned before, this is the beginning of a multi-part blog series. Stay tuned for upcoming posts where we will explore other aspects such as external client access, TLS access, authentication, persistence, etc.


Opinions expressed by DZone contributors are their own.

Related

  • Kafka on Kubernetes, the Strimzi Way (Part 2)
  • Building Hybrid Multi-Cloud Event Mesh With Apache Camel and Kubernetes
  • Data Ingestion Into Azure Data Explorer Using Kafka Connect
  • Kafka on Kubernetes, the Strimzi Way! (Part 4)
