
Kafka on Kubernetes, the Strimzi Way! (Part 4)

In this part, we will configure persistence for our Kafka cluster.

By Abhishek Gupta · Aug. 07, 20 · Tutorial

Welcome to part four of this blog series! So far, we have a single-node Kafka cluster with TLS encryption, on top of which we configured different authentication modes (TLS and SASL SCRAM-SHA-512), defined users with the User Operator, connected to the cluster using CLI and Go clients, and saw how easy it is to manage Kafka topics with the Topic Operator. Until now, our cluster has used ephemeral persistence, which, in the case of a single-node cluster, means that we will lose data if the Kafka or Zookeeper nodes (Pods) are restarted for any reason.

Let's march on! In this part, we will cover:

  • How to configure Strimzi to add persistence to our cluster
  • The components involved, such as PersistentVolume and PersistentVolumeClaim
  • How to modify the storage quality
  • How to (try to) expand the storage size for our Kafka cluster

The code is available on GitHub - https://github.com/abhirockzz/kafka-kubernetes-strimzi/

What Do I Need to Go Through This Tutorial?

kubectl - https://kubernetes.io/docs/tasks/tools/install-kubectl/

I will be using Azure Kubernetes Service (AKS) to demonstrate the concepts, but by and large, the material is independent of the Kubernetes provider. If you want to use AKS, all you need is a Microsoft Azure account, which you can get for FREE if you don't have one already.

I will not be repeating some of the common sections (such as installation/setup of Helm, Strimzi, and Azure Kubernetes Service, or the Strimzi overview) in this or subsequent parts of this series; please refer to part one for those.
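If you need a fresh AKS cluster to follow along, here is a minimal sketch using the Azure CLI; the resource group and cluster names (kafka-rg, kafka-aks) are placeholders, and the node count is just an example:

Shell

# Hypothetical names - substitute your own resource group, cluster name, and region
az group create --name kafka-rg --location southeastasia
az aks create --resource-group kafka-rg --name kafka-aks --node-count 2 --generate-ssh-keys
az aks get-credentials --resource-group kafka-rg --name kafka-aks

# Confirm kubectl now points at the new cluster
kubectl get nodes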

Add Persistence

We will start off by creating a persistent cluster. Here is a snippet of the specification (you can access the complete YAML on GitHub):

YAML

apiVersion: kafka.strimzi.io/v1beta1
kind: Kafka
metadata:
  name: my-kafka-cluster
spec:
  kafka:
    version: 2.4.0
    replicas: 1
    storage:
      type: persistent-claim
      size: 2Gi
      deleteClaim: true
....
  zookeeper:
    replicas: 1
    storage:
      type: persistent-claim
      size: 1Gi
      deleteClaim: true


The key things to notice:

  • storage.type is persistent-claim (as opposed to ephemeral in the previous examples).
  • storage.size for the Kafka and Zookeeper nodes is 2Gi and 1Gi, respectively.
  • deleteClaim: true means that the corresponding PersistentVolumeClaims will be deleted when the cluster is deleted/un-deployed.

You can take a look at the storage reference for details: https://strimzi.io/docs/operators/master/using.html#type-PersistentClaimStorage-reference
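As an aside, deleteClaim is easy to see in action once the cluster is up: deleting the Kafka custom resource should also remove the claims. A quick sketch (don't run this until you're done with the cluster):

Shell

# Deleting the Kafka custom resource...
kubectl delete kafka/my-kafka-cluster

# ...should leave no claims behind, since deleteClaim is true
kubectl get pvc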

To create the cluster:

Shell

kubectl apply -f https://raw.githubusercontent.com/abhirockzz/kafka-kubernetes-strimzi/master/part-4/kafka-persistent.yaml


Let's see what happens in response to the cluster creation.

Strimzi Kubernetes Magic...

Strimzi does all the heavy lifting of creating the Kubernetes resources required to operate the cluster. We covered most of these in part 1 - StatefulSet (and Pods), LoadBalancer Service, ConfigMap, Secret, etc. In this blog, we will just focus on the persistence-related components: PersistentVolume and PersistentVolumeClaim.

To check the PersistentVolumeClaims:

Shell

kubectl get pvc

NAME                                STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
data-my-kafka-cluster-kafka-0       Bound    pvc-b4ece32b-a46c-4fbc-9b58-9413eee9c779   2Gi        RWO            default        94s
data-my-kafka-cluster-zookeeper-0   Bound    pvc-d705fea9-c443-461c-8d18-acf8e219eab0   1Gi        RWO            default        3m20s


... and the PersistentVolumes they are bound to:

Shell

kubectl get pv

NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                       STORAGECLASS   REASON   AGE
pvc-b4ece32b-a46c-4fbc-9b58-9413eee9c779   2Gi        RWO            Delete           Bound    default/data-my-kafka-cluster-kafka-0       default                 107s
pvc-d705fea9-c443-461c-8d18-acf8e219eab0   1Gi        RWO            Delete           Bound    default/data-my-kafka-cluster-zookeeper-0   default                 3m35s

Notice that the disk sizes are as specified in the manifest, i.e., 2Gi and 1Gi for Kafka and Zookeeper, respectively.

Where Is the Data?

If we want to see the data itself, let's first check the ConfigMap which stores the Kafka server config:

Shell

export CLUSTER_NAME=my-kafka-cluster
kubectl get configmap/${CLUSTER_NAME}-kafka-config -o yaml


In the server.config section, you will find an entry such as this:

Shell

##########
# Kafka message logs configuration
##########
log.dirs=/var/lib/kafka/data/kafka-log${STRIMZI_BROKER_ID}


This tells us that the Kafka data is stored in /var/lib/kafka/data/kafka-log${STRIMZI_BROKER_ID}. In this case, STRIMZI_BROKER_ID is 0, since all we have is a single node.

With this info, let's look at the Kafka Pod:

Shell

export CLUSTER_NAME=my-kafka-cluster
kubectl get pod/${CLUSTER_NAME}-kafka-0 -o yaml


If you look into the kafka container section, you will notice the following:

Here is the relevant volumes configuration:

YAML

volumes:
- name: data
  persistentVolumeClaim:
    claimName: data-my-kafka-cluster-kafka-0


The volume named data is associated with the data-my-kafka-cluster-kafka-0 PVC, and the corresponding volumeMounts entry uses this volume to ensure that Kafka data is stored in /var/lib/kafka/data:

YAML

volumeMounts:
- mountPath: /var/lib/kafka/data
  name: data
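If you'd rather not scan the full Pod YAML, a one-liner along these lines (a sketch; the jsonpath expression is hand-written here) should print the claim backing the data volume:

Shell

kubectl get pod/my-kafka-cluster-kafka-0 \
  -o jsonpath='{.spec.volumes[?(@.name=="data")].persistentVolumeClaim.claimName}'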


To see the contents:

Shell

export STRIMZI_BROKER_ID=0
kubectl exec -it my-kafka-cluster-kafka-0 -- ls -lrt /var/lib/kafka/data/kafka-log${STRIMZI_BROKER_ID}

You can repeat the same for the Zookeeper node as well.
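For example, something like this should list the Zookeeper data directory (the mount path /var/lib/zookeeper is an assumption here; verify it against your Zookeeper Pod's volumeMounts):

Shell

# Mount path is an assumption - check the zookeeper Pod spec to confirm
kubectl exec -it my-kafka-cluster-zookeeper-0 -- ls -lrt /var/lib/zookeeper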

What About the Cloud?

As mentioned before, in the case of AKS, the data will end up being stored in an Azure Managed Disk. The type of disk is determined by the default storage class in your AKS cluster. In my case, it is:

Shell

kubectl get sc

NAME                PROVISIONER                AGE
azurefile           kubernetes.io/azure-file   58d
azurefile-premium   kubernetes.io/azure-file   58d
default (default)   kubernetes.io/azure-disk   2d18h
managed-premium     kubernetes.io/azure-disk   2d18h

# to get details of the storage class
kubectl get sc/default -o yaml

More on the semantics of the default storage class in AKS can be found in the documentation.

To query the disk in Azure, extract the PersistentVolume info using kubectl get pv/<name of kafka pv> -o yaml and grab the ID of the Azure Disk, i.e., spec.azureDisk.diskURI.
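A sketch of doing that in one step, using the PV name from the earlier kubectl get pv output:

Shell

kubectl get pv/pvc-b4ece32b-a46c-4fbc-9b58-9413eee9c779 \
  -o jsonpath='{.spec.azureDisk.diskURI}'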

You can then use the Azure CLI az disk show command:

Shell

az disk show --ids <diskURI value>


You will see that the storage type, as defined in the sku section, is StandardSSD_LRS, which corresponds to a Standard SSD:

JSON

  "sku": {
    "name": "StandardSSD_LRS",
    "tier": "Standard"
  }

This table provides a comparison of the different Azure Disk types.


... and the tags attribute highlights the PV and PVC association:

JSON

  "tags": {
    "created-by": "kubernetes-azure-dd",
    "kubernetes.io-created-for-pv-name": "pvc-b4ece32b-a46c-4fbc-9b58-9413eee9c779",
    "kubernetes.io-created-for-pvc-name": "data-my-kafka-cluster-kafka-0",
    "kubernetes.io-created-for-pvc-namespace": "default"
  }

You can repeat the same for the Zookeeper disk as well.

Quick Test...

Follow these steps to confirm that the cluster is working as expected.

Create a producer Pod:

Shell

export KAFKA_CLUSTER_NAME=my-kafka-cluster

kubectl run kafka-producer -ti --image=strimzi/kafka:latest-kafka-2.4.0 --rm=true --restart=Never -- bin/kafka-console-producer.sh --broker-list $KAFKA_CLUSTER_NAME-kafka-bootstrap:9092 --topic my-topic


In another terminal, create a consumer Pod:

Shell

export KAFKA_CLUSTER_NAME=my-kafka-cluster

kubectl run kafka-consumer -ti --image=strimzi/kafka:latest-kafka-2.4.0 --rm=true --restart=Never -- bin/kafka-console-consumer.sh --bootstrap-server $KAFKA_CLUSTER_NAME-kafka-bootstrap:9092 --topic my-topic --from-beginning
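Since persistence is the theme of this part, it is also worth confirming that data actually survives a restart. A sketch: send a few messages with the producer, delete the Kafka Pod (the StatefulSet recreates it with the same PVC), and consume from the beginning again; the earlier messages should still be there.

Shell

# Delete the broker Pod; it will be recreated and re-attached to the same PVC
kubectl delete pod my-kafka-cluster-kafka-0

# Wait until it is Running again
kubectl get pod my-kafka-cluster-kafka-0 -w

# Re-run the consumer; previously produced messages should reappear
kubectl run kafka-consumer -ti --image=strimzi/kafka:latest-kafka-2.4.0 --rm=true --restart=Never -- bin/kafka-console-consumer.sh --bootstrap-server my-kafka-cluster-kafka-bootstrap:9092 --topic my-topic --from-beginning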

What if(s) ...

Let's explore how to tackle a couple of requirements that you'll come across:

  • Using a different storage type - in the case of Azure, for example, you might want to use Azure Premium SSD for production workloads
  • Re-sizing the storage - at some point, you'll want to add storage to your Kafka cluster

Change the Storage Type

Recall that the default behavior is for Strimzi to create a PersistentVolumeClaim that references the default Storage Class. To customize this, you can simply include the class attribute in the storage specification in spec.kafka (and/or spec.zookeeper).

In Azure, the managed-premium storage class corresponds to a Premium SSD: kubectl get sc/managed-premium -o yaml

Here is a snippet from the storage config, where class: managed-premium has been added:

YAML

storage:
  type: persistent-claim
  size: 2Gi
  deleteClaim: true
  class: managed-premium


Please note that you cannot update the storage type for an existing cluster. To try this out:

  • Delete the existing cluster (and wait for a while for its resources to be cleaned up)
  • Create a new cluster using the Premium storage manifest

Shell

# Delete the existing cluster
kubectl delete kafka/my-kafka-cluster

# Create a new cluster
kubectl apply -f https://raw.githubusercontent.com/abhirockzz/kafka-kubernetes-strimzi/master/part-4/kafka-persistent-premium.yaml


To confirm, check the PersistentVolumeClaim for the Kafka node - notice the STORAGECLASS column:

Shell

kubectl get pvc/data-my-kafka-cluster-kafka-0

NAME                            STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
data-my-kafka-cluster-kafka-0   Bound    pvc-3f46d6ed-9da5-4c49-87ef-86684ab21cf8   2Gi        RWO            managed-premium   21s

We only configured the Kafka broker to use Premium storage, so the Zookeeper Pod will still use the StandardSSD storage type, as the quick check below shows.
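You can confirm this by printing the Zookeeper claim's storage class:

Shell

kubectl get pvc/data-my-kafka-cluster-zookeeper-0 \
  -o jsonpath='{.spec.storageClassName}'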

Re-size Storage (TL;DR - Does Not Work Yet)

Azure Disks allow you to add more storage to them. In the case of Kubernetes, it is the storage class that defines whether this is supported - for AKS, if you check the default (or managed-premium) storage class, you will notice the property allowVolumeExpansion: true, which confirms that you can do so in the context of a Kubernetes PVC as well.

Strimzi makes it really easy to increase the storage for our Kafka cluster - all you need to do is update the storage.size field to the desired value and re-apply the spec, as sketched below.
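For example, a JSON merge patch against the Kafka custom resource should do it (a sketch; 3Gi is just an example target size):

Shell

# Bump the Kafka storage size; a merge patch leaves the other storage fields intact
kubectl patch kafka my-kafka-cluster --type merge \
  -p '{"spec":{"kafka":{"storage":{"size":"3Gi"}}}}'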

Check the PVC now: kubectl describe pvc data-my-kafka-cluster-kafka-0

Plain Text

Conditions:
  Type       Status  LastProbeTime                     LastTransitionTime                Reason  Message
  ----       ------  -----------------                 ------------------                ------  -------
  Resizing   True    Mon, 01 Jan 0001 00:00:00 +0000   Mon, 22 Jun 2020 23:15:26 +0530
Events:
  Type     Reason              Age                From           Message
  ----     ------              ----               ----           -------
  Warning  VolumeResizeFailed  3s (x11 over 13s)  volume_expand  error expanding volume "default/data-my-kafka-cluster-kafka-0" of plugin "kubernetes.io/azure-disk": compute.DisksClient#CreateOrUpdate: Failure sending request: StatusCode=0 -- Original Error: autorest/azure: Service returned an error. Status=<nil> Code="OperationNotAllowed" Message="Cannot resize disk kubernetes-dynamic-pvc-3f46d6ed-9da5-4c49-87ef-86684ab21cf8 while it is attached to running VM /subscriptions/9a42a42f-ae42-4242-b6a7-dda0ea91d342/resourceGroups/mc_my-k8s-vk_southeastasia/providers/Microsoft.Compute/virtualMachines/aks-agentpool-42424242-1. Resizing a disk of an Azure Virtual Machine requires the virtual machine to be deallocated. Please stop your VM and retry the operation."


Notice the "Cannot resize disk..." error message. This is happening because the Azure Disk is currently attached to an AKS cluster node, which in turn is because the Pod is associated with the PersistentVolumeClaim - this is a documented limitation.

I am not the first one to run into this problem, of course. Please refer to issues such as this one for details.

There are workarounds, but they are not discussed in this blog. I included this section because I wanted you to be aware of the caveat.

Final Countdown 

We want to leave on a high note, don't we? Alright, to wrap things up, let's scale our cluster out from one to three nodes. It's dead simple!

All you need to do is increase the replicas to the desired number - in this case, I configured it to 3 (for both Kafka and Zookeeper):

YAML

...
spec:
  kafka:
    version: 2.4.0
    replicas: 3
  zookeeper:
    replicas: 3
...


In addition to this, I also added an external load balancer listener (this will create an Azure Load Balancer, as discussed in part 2):

YAML

...
    listeners:
      plain: {}
      external:
        type: loadbalancer
...


To create the new cluster, simply apply the new manifest:

Shell

kubectl apply -f https://raw.githubusercontent.com/abhirockzz/kafka-kubernetes-strimzi/master/part-4/kafka-persistent-multi-node.yaml

Please note that overall cluster readiness will take some time, since additional components (Azure Disks, Load Balancer public IPs, etc.) have to be created before the Pods are activated.
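To keep an eye on progress, watching the Pods works well:

Shell

kubectl get pods -l=app.kubernetes.io/instance=my-kafka-cluster -w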

In Your k8s Cluster, You Will See...

Three Pods each for Kafka and Zookeeper

Shell

kubectl get pod -l=app.kubernetes.io/instance=my-kafka-cluster

NAME                           READY   STATUS    RESTARTS   AGE
my-kafka-cluster-kafka-0       2/2     Running   0          54s
my-kafka-cluster-kafka-1       2/2     Running   0          54s
my-kafka-cluster-kafka-2       2/2     Running   0          54s
my-kafka-cluster-zookeeper-0   1/1     Running   0          4m44s
my-kafka-cluster-zookeeper-1   1/1     Running   0          4m44s
my-kafka-cluster-zookeeper-2   1/1     Running   0          4m44s


Three PersistentVolumeClaims each for Kafka and Zookeeper ...

Shell

kubectl get pvc

NAME                                STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
data-my-kafka-cluster-kafka-0       Bound    pvc-0f52dee1-970a-4c55-92bd-a97dcc41aee6   3Gi        RWO            managed-premium   10m
data-my-kafka-cluster-kafka-1       Bound    pvc-f8b613cb-3da0-4932-acea-7e5e96df1433   3Gi        RWO            managed-premium   4m24s
data-my-kafka-cluster-kafka-2       Bound    pvc-fedf431c-d87a-4bf7-80d0-d43b1337c079   3Gi        RWO            managed-premium   4m24s
data-my-kafka-cluster-zookeeper-0   Bound    pvc-1fda3714-3c37-428f-9e4b-bdb5da71cda6   1Gi        RWO            default           12m
data-my-kafka-cluster-zookeeper-1   Bound    pvc-702556e0-890a-4c07-ae5c-e2354d74d006   1Gi        RWO            default           6m42s
data-my-kafka-cluster-zookeeper-2   Bound    pvc-176ffd68-7e3a-4e04-abb1-52c54dcb84f0   1Gi        RWO            default           6m42s


... and the respective PersistentVolumes they are bound to

Plain Text

NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM                                       STORAGECLASS      REASON   AGE
pvc-0f52dee1-970a-4c55-92bd-a97dcc41aee6   3Gi        RWO            Delete           Bound       default/data-my-kafka-cluster-kafka-0       managed-premium            12m
pvc-176ffd68-7e3a-4e04-abb1-52c54dcb84f0   1Gi        RWO            Delete           Bound       default/data-my-kafka-cluster-zookeeper-2   default                    8m45s
pvc-1fda3714-3c37-428f-9e4b-bdb5da71cda6   1Gi        RWO            Delete           Bound       default/data-my-kafka-cluster-zookeeper-0   default                    14m
pvc-702556e0-890a-4c07-ae5c-e2354d74d006   1Gi        RWO            Delete           Bound       default/data-my-kafka-cluster-zookeeper-1   default                    8m45s
pvc-f8b613cb-3da0-4932-acea-7e5e96df1433   3Gi        RWO            Delete           Bound       default/data-my-kafka-cluster-kafka-1       managed-premium            6m27s
pvc-fedf431c-d87a-4bf7-80d0-d43b1337c079   3Gi        RWO            Delete           Bound       default/data-my-kafka-cluster-kafka-2       managed-premium            6m22s


... and Load Balancer IPs. Notice that one is created for each Kafka broker, as well as a bootstrap IP, which is the recommended entry point for client applications.

Shell

kubectl get svc -l=app.kubernetes.io/instance=my-kafka-cluster

NAME                                        TYPE           CLUSTER-IP     EXTERNAL-IP      PORT(S)                      AGE
my-kafka-cluster-kafka-0                    LoadBalancer   10.0.11.154    40.119.248.164   9094:30977/TCP               10m
my-kafka-cluster-kafka-1                    LoadBalancer   10.0.146.181   20.43.191.219    9094:30308/TCP               10m
my-kafka-cluster-kafka-2                    LoadBalancer   10.0.223.202   40.119.249.20    9094:30313/TCP               10m
my-kafka-cluster-kafka-bootstrap            ClusterIP      10.0.208.187   <none>           9091/TCP,9092/TCP            16m
my-kafka-cluster-kafka-brokers              ClusterIP      None           <none>           9091/TCP,9092/TCP            16m
my-kafka-cluster-kafka-external-bootstrap   LoadBalancer   10.0.77.213    20.43.191.238    9094:31051/TCP               10m
my-kafka-cluster-zookeeper-client           ClusterIP      10.0.3.155     <none>           2181/TCP                     18m
my-kafka-cluster-zookeeper-nodes            ClusterIP      None           <none>           2181/TCP,2888/TCP,3888/TCP   18m


To access the cluster, you can use the steps outlined in part 2; the sketch below shows how to grab the external bootstrap address as a starting point.
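A minimal sketch (the external listener is on port 9094; client security settings should match whatever you configured per part 2):

Shell

# External bootstrap address for clients outside the Kubernetes cluster
export BOOTSTRAP_IP=$(kubectl get svc my-kafka-cluster-kafka-external-bootstrap \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}')

echo "Bootstrap endpoint: ${BOOTSTRAP_IP}:9094"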

It's a Wrap!

That's it for this blog series, which covered some of the aspects of running Kafka on Kubernetes using the open source Strimzi operator.

If this topic is of interest to you, I encourage you to check out other solutions such as Confluent operator and Banzai Cloud Kafka operator.


