
Kafka on Kubernetes, the Strimzi Way! (Part 1)


In this article, learn how to run Kafka on Kubernetes.


Some of my previous blog posts (such as Kafka Connect on Kubernetes, the easy way!) demonstrate how to use Kafka Connect in a Kubernetes-native way. This is the first in a series of blog posts which will cover Apache Kafka on Kubernetes using the Strimzi Operator. In this post, we will start off with the simplest possible setup, i.e., a single-node Kafka (and Zookeeper) cluster, and learn:

  • Strimzi overview and setup
  • Kafka cluster installation
  • Kubernetes resources used/created behind the scenes
  • Testing the Kafka setup using clients within the Kubernetes cluster

The code is available on GitHub - https://github.com/abhirockzz/kafka-kubernetes-strimzi

What Do I Need to Try This Out?

You will need kubectl - https://kubernetes.io/docs/tasks/tools/install-kubectl/

I will be using Azure Kubernetes Service (AKS) to demonstrate the concepts, but by and large it is independent of the Kubernetes provider (e.g. feel free to use a local setup such as minikube). If you want to use AKS, all you need is a Microsoft Azure account which you can get for FREE if you don't have one already.

Install Helm

I will be using Helm to install Strimzi. Here is the documentation to install Helm itself - https://helm.sh/docs/intro/install/

You can also use the YAML files directly to install Strimzi. Check out the quick start guide here - https://strimzi.io/docs/quickstart/latest/#proc-install-product-str

(Optional) Set Up Azure Kubernetes Service

Azure Kubernetes Service (AKS) makes it simple to deploy a managed Kubernetes cluster in Azure. It reduces the complexity and operational overhead of managing Kubernetes by offloading much of that responsibility to Azure. The AKS documentation covers several ways to set up a cluster, e.g. using the Azure CLI or the Azure portal.

Once you set up the cluster, you can configure kubectl to point to it:

```shell
az aks get-credentials --resource-group <CLUSTER_RESOURCE_GROUP> --name <CLUSTER_NAME>
```


Wait, What Is Strimzi?

From the Strimzi documentation:

Strimzi simplifies the process of running Apache Kafka in a Kubernetes cluster. Strimzi provides container images and Operators for running Kafka on Kubernetes. It is a part of the Cloud Native Computing Foundation as a Sandbox project (at the time of writing).

Strimzi Operators are fundamental to the project. These Operators are purpose-built with specialist operational knowledge to effectively manage Kafka. Operators simplify the process of deploying and running Kafka clusters and components, configuring and securing access to Kafka, upgrading and managing Kafka, and even taking care of managing topics and users.
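
That "specialist operational knowledge" is packaged using the Kubernetes operator pattern: a control loop that watches the declared state (the Kafka custom resource) and drives the actual cluster toward it. Here is a deliberately simplified, hypothetical sketch of one reconciliation step — this is not Strimzi's actual code (which is written in Java); the field names simply mirror the Kafka resource we create later in this post:

```python
# Illustrative sketch of the operator reconciliation pattern that Strimzi
# applies to Kafka. NOT Strimzi's actual implementation; the field names
# mirror the Kafka custom resource used later in this post.

def reconcile(desired: dict, actual: dict) -> list:
    """Compare the desired state (from the Kafka CR) with the actual
    cluster state and return the actions needed to converge them."""
    actions = []
    if actual.get("replicas", 0) != desired["replicas"]:
        actions.append(f"scale Kafka StatefulSet to {desired['replicas']} replica(s)")
    if actual.get("version") != desired["version"]:
        actions.append(f"roll brokers to Kafka {desired['version']}")
    return actions

# First reconciliation of a single-node cluster: nothing exists yet
desired = {"replicas": 1, "version": "2.4.0"}
print(reconcile(desired, actual={}))
```

The real Operator runs this loop continuously, so editing the Kafka resource is all it takes to reconfigure the cluster.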

Here is a diagram which shows a 10,000-foot overview of the Operator roles:

Install Strimzi

Installing Strimzi using Helm is pretty easy:

```shell
# add the Helm chart repo for Strimzi
helm repo add strimzi https://strimzi.io/charts/

# install it! (I have used strimzi-kafka as the release name)
helm install strimzi-kafka strimzi/strimzi-kafka-operator
```



This will install the Strimzi Operator (which is nothing but a Deployment), Custom Resource Definitions, and other Kubernetes components such as Cluster Roles, Cluster Role Bindings, and Service Accounts.

For more details, check out this link

To delete, simply run helm uninstall strimzi-kafka

To confirm that the Strimzi Operator has been deployed, check its Pod (it should transition to the Running status after a while):

```shell
kubectl get pods -l=name=strimzi-cluster-operator

NAME                                        READY   STATUS    RESTARTS   AGE
strimzi-cluster-operator-5c66f679d5-69rgk   1/1     Running   0          43s
```



Check the Custom Resource Definitions as well:

```shell
kubectl get crd | grep strimzi

kafkabridges.kafka.strimzi.io           2020-04-13T16:49:36Z
kafkaconnectors.kafka.strimzi.io        2020-04-13T16:49:33Z
kafkaconnects.kafka.strimzi.io          2020-04-13T16:49:36Z
kafkaconnects2is.kafka.strimzi.io       2020-04-13T16:49:38Z
kafkamirrormaker2s.kafka.strimzi.io     2020-04-13T16:49:37Z
kafkamirrormakers.kafka.strimzi.io      2020-04-13T16:49:39Z
kafkas.kafka.strimzi.io                 2020-04-13T16:49:40Z
kafkatopics.kafka.strimzi.io            2020-04-13T16:49:34Z
kafkausers.kafka.strimzi.io             2020-04-13T16:49:33Z
```


The kafkas.kafka.strimzi.io CRD represents Kafka clusters in Kubernetes.

Now that we have the "brain" (the Strimzi Operator) wired up, let's use it!

Time to Create a Kafka Cluster!

As mentioned, we will keep things simple and start off with the following setup (which we will incrementally update as a part of subsequent posts in this series):

  • A single node Kafka cluster (and Zookeeper)
  • Available internally to clients in the same Kubernetes cluster
  • No encryption, authentication or authorization
  • No persistence (uses emptyDir volume)

To deploy a Kafka cluster, all we need to do is create a Strimzi Kafka resource. This is what it looks like:

```yaml
apiVersion: kafka.strimzi.io/v1beta1
kind: Kafka
metadata:
  name: my-kafka-cluster
spec:
  kafka:
    version: 2.4.0
    replicas: 1
    listeners:
      plain: {}
    config:
      offsets.topic.replication.factor: 1
      transaction.state.log.replication.factor: 1
      transaction.state.log.min.isr: 1
      log.message.format.version: "2.4"
    storage:
      type: ephemeral
  zookeeper:
    replicas: 1
    storage:
      type: ephemeral
```


For a detailed Kafka CRD reference, please check out the documentation - https://strimzi.io/docs/operators/master/using.html#type-Kafka-reference

We define the name (my-kafka-cluster) of the cluster in metadata.name. Here is a summary of the attributes in spec.kafka:

  • version - The Kafka broker version (defaults to 2.5.0 at the time of writing, but we're using 2.4.0)
  • replicas - Kafka cluster size i.e. the number of Kafka nodes (Pods in the cluster)
  • listeners - Configures listeners of Kafka brokers. In this example we are using the plain listener, which means that the cluster will be accessible to internal clients (in the same Kubernetes cluster) on port 9092 (no encryption, authentication or authorization involved). Supported types are plain, tls, external (See https://strimzi.io/docs/operators/master/using.html#type-KafkaListeners-reference). It is possible to configure multiple listeners (we will cover this in subsequent blog posts)
  • config - These are key-value pairs used as Kafka broker config properties
  • storage - Storage for Kafka cluster. Supported types are ephemeral, persistent-claim and jbod. We are using ephemeral in this example which means that the emptyDir volume is used and the data is only associated with the lifetime of the Kafka broker Pod (a future blog post will cover persistent-claim storage)
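
For comparison (and as a preview of a later post), here is a sketch of what the storage section could look like with persistent-claim; the size and deleteClaim values below are illustrative, not taken from this post's setup:

```yaml
storage:
  type: persistent-claim
  size: 10Gi          # illustrative size for the PersistentVolumeClaim
  deleteClaim: false  # keep the claim (and data) if the Kafka cluster is deleted
```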

Zookeeper cluster details (spec.zookeeper) are similar to those of Kafka. In this case, we are just configuring the number of replicas and the storage type. Refer to https://strimzi.io/docs/operators/master/using.html#type-ZookeeperClusterSpec-reference for details

To create the Kafka cluster:

```shell
kubectl apply -f https://raw.githubusercontent.com/abhirockzz/kafka-kubernetes-strimzi/master/part-1/kafka.yaml
```


What's Next?

The Strimzi operator spins into action and creates many Kubernetes resources in response to the Kafka CRD instance we just created.

The following resources are created:

  • StatefulSet - Kafka and Zookeeper clusters exist in the form of StatefulSets, which are used to manage stateful workloads in Kubernetes. Please refer to https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/ and related material for details
  • Service - Kubernetes ClusterIP Service for internal access
  • ConfigMap - Kafka and Zookeeper configuration is stored in Kubernetes ConfigMaps
  • Secret - Kubernetes Secrets to store private keys and certificates for Kafka cluster components and clients. These are used for TLS encryption and authentication (covered in subsequent blog posts)

Kafka Custom Resource

```shell
kubectl get kafka

NAME               DESIRED KAFKA REPLICAS   DESIRED ZK REPLICAS
my-kafka-cluster   1                        1
```



StatefulSet and Pod

Check Kafka and Zookeeper StatefulSets using:

```shell
kubectl get statefulset/my-kafka-cluster-zookeeper
kubectl get statefulset/my-kafka-cluster-kafka
```



Check the Kafka and Zookeeper Pods:

```shell
kubectl get pod/my-kafka-cluster-zookeeper-0
kubectl get pod/my-kafka-cluster-kafka-0
```


ConfigMap

Individual ConfigMaps are created to store the Kafka and Zookeeper configurations:

```shell
kubectl get configmap

my-kafka-cluster-kafka-config         4      19m
my-kafka-cluster-zookeeper-config     2      20m
```



Let's peek into the Kafka configuration:

```shell
kubectl get configmap/my-kafka-cluster-kafka-config -o yaml
```



The output is quite lengthy, but I will highlight the important bits. In the data section, there are two configuration entries for the Kafka broker - log4j.properties and server.config.

Here is a snippet of server.config. Notice advertised.listeners (which highlights internal access over port 9092) and the User provided configuration section (the one we specified in the YAML manifest):

```properties
##############################
##############################
# This file is automatically generated by the Strimzi Cluster Operator
# Any changes to this file will be ignored and overwritten!
##############################
##############################

broker.id=${STRIMZI_BROKER_ID}
log.dirs=/var/lib/kafka/data/kafka-log${STRIMZI_BROKER_ID}

##########
# Plain listener
##########

##########
# Common listener configuration
##########
listeners=REPLICATION-9091://0.0.0.0:9091,PLAIN-9092://0.0.0.0:9092
advertised.listeners=REPLICATION-9091://my-kafka-cluster-kafka-${STRIMZI_BROKER_ID}.my-kafka-cluster-kafka-brokers.default.svc:9091,PLAIN-9092://my-kafka-cluster-kafka-${STRIMZI_BROKER_ID}.my-kafka-cluster-kafka-brokers.default.svc:9092
listener.security.protocol.map=REPLICATION-9091:SSL,PLAIN-9092:PLAINTEXT
inter.broker.listener.name=REPLICATION-9091
sasl.enabled.mechanisms=
ssl.secure.random.implementation=SHA1PRNG
ssl.endpoint.identification.algorithm=HTTPS

##########
# User provided configuration
##########
log.message.format.version=2.4
offsets.topic.replication.factor=1
transaction.state.log.min.isr=1
transaction.state.log.replication.factor=1
```


Service

If you query for Services, you should see something similar to this:

```shell
kubectl get svc

NAME                                TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)
my-kafka-cluster-kafka-bootstrap    ClusterIP   10.0.240.137   <none>        9091/TCP,9092/TCP
my-kafka-cluster-kafka-brokers      ClusterIP   None           <none>        9091/TCP,9092/TCP
my-kafka-cluster-zookeeper-client   ClusterIP   10.0.143.149   <none>        2181/TCP
my-kafka-cluster-zookeeper-nodes    ClusterIP   None           <none>        2181/TCP,2888/TCP,3888/TCP
```



my-kafka-cluster-kafka-bootstrap makes it possible for internal Kubernetes clients to access the Kafka cluster, and my-kafka-cluster-kafka-brokers is the headless Service corresponding to the StatefulSet.
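
Putting the Service listing and the advertised.listeners entries together: internal clients bootstrap via the ClusterIP Service, while individual brokers are addressed through the headless Service. A small Python sketch of how these DNS names are composed (the naming pattern comes from the output above; the <service>.<namespace>.svc form is standard Kubernetes DNS):

```python
# Sketch of how the Strimzi-generated addresses fit together. Service names are
# taken from the 'kubectl get svc' output; everything else is standard K8s DNS.

def bootstrap_address(cluster: str, port: int = 9092) -> str:
    # ClusterIP Service used by internal clients to bootstrap
    return f"{cluster}-kafka-bootstrap:{port}"

def broker_address(cluster: str, broker_id: int,
                   namespace: str = "default", port: int = 9092) -> str:
    # Per-broker DNS name via the headless Service (see advertised.listeners)
    return f"{cluster}-kafka-{broker_id}.{cluster}-kafka-brokers.{namespace}.svc:{port}"

print(bootstrap_address("my-kafka-cluster"))
print(broker_address("my-kafka-cluster", 0))
```

The second address matches the PLAIN-9092 entry in advertised.listeners, with STRIMZI_BROKER_ID substituted by 0 for our single broker.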

Secret

Although we're not using them, it's helpful to look at the Secrets created by Strimzi:

```shell
kubectl get secret

my-kafka-cluster-clients-ca               Opaque
my-kafka-cluster-clients-ca-cert          Opaque
my-kafka-cluster-cluster-ca               Opaque
my-kafka-cluster-cluster-ca-cert          Opaque
my-kafka-cluster-cluster-operator-certs   Opaque
my-kafka-cluster-kafka-brokers            Opaque
my-kafka-cluster-kafka-token-vb2qt        kubernetes.io/service-account-token
my-kafka-cluster-zookeeper-nodes          Opaque
my-kafka-cluster-zookeeper-token-xq8m2    kubernetes.io/service-account-token
```


  • my-kafka-cluster-cluster-ca-cert - The cluster CA certificate used to sign Kafka broker certificates; it is also used by connecting clients to establish a TLS-encrypted connection
  • my-kafka-cluster-clients-ca-cert - The clients CA certificate, which a user can use to sign its own client certificate and allow mutual authentication against the Kafka cluster

Ok, But Does It Work?

Let's take it for a spin!

Create a producer Pod:

```shell
export KAFKA_CLUSTER_NAME=my-kafka-cluster

kubectl run kafka-producer -ti --image=strimzi/kafka:latest-kafka-2.4.0 --rm=true --restart=Never -- bin/kafka-console-producer.sh --broker-list $KAFKA_CLUSTER_NAME-kafka-bootstrap:9092 --topic my-topic
```



In another terminal, create a consumer Pod:

```shell
export KAFKA_CLUSTER_NAME=my-kafka-cluster

kubectl run kafka-consumer -ti --image=strimzi/kafka:latest-kafka-2.4.0 --rm=true --restart=Never -- bin/kafka-console-consumer.sh --bootstrap-server $KAFKA_CLUSTER_NAME-kafka-bootstrap:9092 --topic my-topic --from-beginning
```


The above demonstration was taken from the Strimzi doc - https://strimzi.io/docs/operators/master/deploying.html#deploying-example-clients-str

You can use other clients as well.
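
For instance, an in-cluster Python client (say, one built with the kafka-python library - an assumption, since this post only demonstrates the Strimzi console clients) would need nothing more than the bootstrap Service address, because the plain listener involves no TLS or authentication:

```python
# Hypothetical configuration for an internal Python client. Only the bootstrap
# address is required because the 'plain' listener has no encryption or auth.
client_config = {
    "bootstrap_servers": "my-kafka-cluster-kafka-bootstrap:9092",
    "security_protocol": "PLAINTEXT",  # matches the plain listener in the Kafka CR
}

# e.g. KafkaProducer(**client_config) or
#      KafkaConsumer("my-topic", **client_config)
print(client_config["bootstrap_servers"])
```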

We're Just Getting Started...

We started small, but have a Kafka cluster on Kubernetes, and it works (hopefully for you as well!). As I mentioned before, this is the beginning of a multi-part blog series. Stay tuned for upcoming posts where we will explore other aspects such as external client access, TLS access, authentication, persistence, etc.

Topics:
big data, cncf, docker, kafka, kubernetes, strimzi, tutorial

