Kafka on Kubernetes, the Strimzi Way! (Part 4)

In this part, we will configure persistence for our Kafka cluster.

Welcome to part four of this blog series! So far, we have a single-node Kafka cluster with TLS encryption, on top of which we configured different authentication modes (TLS and SASL SCRAM-SHA-512), defined users with the User Operator, connected to the cluster using CLI and Go clients, and saw how easy it is to manage Kafka topics with the Topic Operator. Until now, our cluster has used ephemeral storage, which, in the case of a single-node cluster, means that we will lose data if the Kafka or Zookeeper nodes (Pods) are restarted for any reason.

Let's march on! In this part we will cover:

  • How to configure Strimzi to add persistence for our cluster.
  • The Kubernetes components involved, such as PersistentVolume and PersistentVolumeClaim.
  • How to change the storage type.
  • How to (try to) expand the storage size for our Kafka cluster.

The code is available on GitHub - https://github.com/abhirockzz/kafka-kubernetes-strimzi/

What Do I Need to Go Through This Tutorial?

kubectl - https://kubernetes.io/docs/tasks/tools/install-kubectl/

I will be using Azure Kubernetes Service (AKS) to demonstrate the concepts, but by and large it is independent of the Kubernetes provider. If you want to use AKS, all you need is a Microsoft Azure account which you can get for FREE if you don't have one already.

I will not be repeating some of the common sections (such as Installation/Setup for Helm, Strimzi, and Azure Kubernetes Service, or the Strimzi overview) in this or subsequent parts of this series. Please refer to part one for those.

Add Persistence

We will start off by creating a persistent cluster. The complete YAML is available on GitHub; the part of the specification that matters here is the storage section for Kafka and Zookeeper.
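A minimal sketch of what the storage part of such a manifest might look like is shown below, written to a local file from the shell. The apiVersion and the file path are assumptions (they depend on your Strimzi version and setup), and listeners, config, and the Entity Operator are left out, so this is not the complete manifest from the GitHub repository.

```shell
# Sketch only: write the storage-related part of a Kafka custom resource to a file.
# Assumptions: the apiVersion (depends on the Strimzi version) and the file location.
MANIFEST="${TMPDIR:-/tmp}/kafka-persistent-sketch.yaml"

cat > "$MANIFEST" <<'EOF'
apiVersion: kafka.strimzi.io/v1beta1
kind: Kafka
metadata:
  name: my-kafka-cluster
spec:
  kafka:
    replicas: 1
    # listeners, config, etc. omitted for brevity
    storage:
      type: persistent-claim
      size: 2Gi
      deleteClaim: true
  zookeeper:
    replicas: 1
    storage:
      type: persistent-claim
      size: 1Gi
      deleteClaim: true
EOF

# Show the storage settings that were written.
grep -n -A 3 'storage:' "$MANIFEST"
```

Because the fragment is incomplete, treat it as a reading aid for the storage fields rather than something to apply directly.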


The key things to notice:

  • storage.type is persistent-claim (as opposed to ephemeral in previous examples).
  • storage.size for Kafka and Zookeeper nodes is 2Gi and 1Gi respectively.
  • deleteClaim: true means that the corresponding PersistentVolumeClaims will be deleted when the cluster is deleted/un-deployed.

You can take a look at the storage reference: https://strimzi.io/docs/operators/master/using.html#type-PersistentClaimStorage-reference

To create the cluster, apply the manifest with kubectl apply -f <manifest URL or file>.


Let's see what happens in response to the cluster creation.

Strimzi Kubernetes Magic...

Strimzi does all the heavy lifting of creating the Kubernetes resources required to operate the cluster. We covered most of these in part one: StatefulSet (and Pods), LoadBalancer Service, ConfigMap, Secret, etc. In this blog, we will focus only on the persistence-related components: PersistentVolume and PersistentVolumeClaim.

To check the PersistentVolumeClaims, run kubectl get pvc


... and kubectl get pv shows the PersistentVolumes they are bound to.


Notice that the disk size is as specified in the manifest, i.e. 2Gi and 1Gi for Kafka and Zookeeper respectively.
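To compare the requested sizes with the manifest at a glance, a small helper along these lines may be handy. It is only a sketch: the function name is made up for illustration, and it assumes kubectl is already pointed at the cluster and namespace where Kafka runs.

```shell
# Hypothetical helper: list PVCs with requested size, storage class, bound volume and status.
# Assumes kubectl is configured for the right cluster and namespace.
show_pvc_summary() {
  kubectl get pvc \
    -o custom-columns='NAME:.metadata.name,SIZE:.spec.resources.requests.storage,CLASS:.spec.storageClassName,VOLUME:.spec.volumeName,STATUS:.status.phase'
}
```

Calling show_pvc_summary prints one row per claim, so the Kafka and Zookeeper sizes can be read side by side.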

Where Is the Data?

If we want to see the data itself, let's first check the ConfigMap which stores the Kafka server config:


In the server.config section, you will find a log.dirs entry pointing at /var/lib/kafka/data/kafka-log${STRIMZI_BROKER_ID}.


This tells us where the Kafka data is stored. In this case STRIMZI_BROKER_ID is 0, since all we have is a single node.

With this info, let's look at the Kafka Pod:


If you look at the Pod specification, you will notice two related pieces: an entry in the volumes configuration and a volumeMounts entry in the kafka container section.


The volume named data is associated with the data-my-kafka-cluster-kafka-0 PVC, and the corresponding volumeMounts entry uses this volume to ensure that Kafka data is stored in /var/lib/kafka/data.


To see the contents, list that directory from inside the kafka container of the Pod.
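One way to do that is sketched below. Both function names are hypothetical, and the sketch assumes the broker Pod is named <cluster>-kafka-<ID> (consistent with the data-my-kafka-cluster-kafka-0 claim name above) and that the container is called kafka.

```shell
# Hypothetical helper: build the Kafka data directory for a given broker ID,
# following the /var/lib/kafka/data/kafka-log<ID> layout described above.
kafka_log_dir() {
  printf '/var/lib/kafka/data/kafka-log%s\n' "$1"
}

# List the contents of that directory inside the kafka container of a broker Pod.
# Assumes the Pod is named <cluster>-kafka-<ID>, e.g. my-kafka-cluster-kafka-0.
list_kafka_data() {
  cluster="$1"
  broker_id="$2"
  kubectl exec "${cluster}-kafka-${broker_id}" -c kafka -- ls -l "$(kafka_log_dir "$broker_id")"
}
```

For the single-node cluster in this post, the call would be list_kafka_data my-kafka-cluster 0.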


You can repeat the same for the Zookeeper node as well.

What About the Cloud?

As mentioned before, in the case of AKS, the data will end up being stored in an Azure Managed Disk. The type of disk is as per the default storage class in your AKS cluster. In my case, it is:


The AKS documentation has more on the semantics of the default storage class.

To query the disk in Azure, extract the PersistentVolume info using kubectl get pv/<name of kafka pv> -o yaml and get the ID of the Azure Disk, i.e. spec.azureDisk.diskURI.

You can then pass that ID to the Azure CLI command az disk show.
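The two steps can be chained, roughly as in the sketch below. The function names are invented for illustration, and it assumes the PersistentVolume exposes spec.azureDisk.diskURI as described above and that you are logged in with the Azure CLI.

```shell
# Hypothetical helper: print the Azure Disk URI behind a PersistentVolume.
# Reads spec.azureDisk.diskURI, so it assumes the PV carries that field.
disk_uri_for_pv() {
  kubectl get pv "$1" -o jsonpath='{.spec.azureDisk.diskURI}'
}

# Show the disk in Azure; --ids takes the full resource ID returned above.
show_disk_for_pv() {
  az disk show --ids "$(disk_uri_for_pv "$1")"
}
```

Call it as show_disk_for_pv <name of kafka pv> and look at the sku and tags sections of the output.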


You will see that the storage type, as defined in the sku section, is StandardSSD_LRS, which corresponds to a Standard SSD.

The Azure documentation includes a table comparing the different Azure Disk types.


... and the tags attribute highlights the PV and PVC association.


You can repeat the same for the Zookeeper disks as well.

Quick Test...

Follow these steps to confirm that the cluster is working as expected.

Create a producer Pod:


In another terminal, create a consumer Pod:
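Both Pods can be started with the Kafka console clients, roughly as sketched below. Several things here are assumptions rather than details from this post: the function names, a plain (non-TLS, unauthenticated) listener on port 9092, the topic name you pass in, and the Strimzi Kafka image, which you need to supply to match your Strimzi version. Newer Kafka versions use --bootstrap-server for the producer instead of --broker-list.

```shell
# Hypothetical helper: run a console producer in a throwaway Pod.
# Arguments: <strimzi kafka image> <cluster name> <topic>
# Assumes a plain listener on port 9092 behind the <cluster>-kafka-bootstrap Service.
run_producer() {
  image="$1"; cluster="$2"; topic="$3"
  kubectl run kafka-producer -ti --image="$image" --rm=true --restart=Never -- \
    bin/kafka-console-producer.sh \
    --broker-list "${cluster}-kafka-bootstrap:9092" --topic "$topic"
}

# Hypothetical helper: run a console consumer that reads the topic from the start.
run_consumer() {
  image="$1"; cluster="$2"; topic="$3"
  kubectl run kafka-consumer -ti --image="$image" --rm=true --restart=Never -- \
    bin/kafka-console-consumer.sh \
    --bootstrap-server "${cluster}-kafka-bootstrap:9092" --topic "$topic" --from-beginning
}
```

Run run_producer in one terminal and run_consumer in the other, with the same image, cluster name, and topic. Messages typed into the producer should show up in the consumer. If the cluster only exposes the TLS or authenticated listeners from the earlier parts, the clients need extra configuration that this sketch does not cover.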


What if(s) ...

Let's explore how to tackle a couple of requirements which you'll come across:

  • Using a different storage type - in the case of Azure, for example, you might want to use Azure Premium SSD for production workloads.
  • Re-sizing the storage - at some point you'll want to add storage to your Kafka cluster.

Change the Storage Type

Recall that the default behavior is for Strimzi to create a PersistentVolumeClaim that references the default Storage Class. To customize this, you can simply include the class attribute in the storage specification in spec.kafka (and/or spec.zookeeper).

In Azure, the managed-premium storage class corresponds to a Premium SSD: kubectl get sc/managed-premium -o yaml

Here is a snippet from the storage config, where class: managed-premium has been added.
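As an illustration, the storage section with a class attribute might look like the output of the sketch below. The helper name is hypothetical, and the size and deleteClaim values are simply carried over from the earlier example rather than taken from the premium manifest.

```shell
# Hypothetical helper: print a storage section for a given size and storage class.
storage_section() {
  size="$1"
  class="$2"
  cat <<EOF
storage:
  type: persistent-claim
  size: ${size}
  class: ${class}
  deleteClaim: true
EOF
}

# Print the fragment for a 2Gi volume on the managed-premium class.
storage_section 2Gi managed-premium
```

The only change from the default case is the class line; leaving it out makes Strimzi fall back to the default Storage Class.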


Please note that you cannot update the storage type for an existing cluster. To try this out:

  • Delete the existing cluster - kubectl delete kafka/my-kafka-cluster (wait for a while)
  • Create a new cluster - kubectl apply -f https://raw.githubusercontent.com/abhirockzz/kafka-kubernetes-strimzi/master/part-4/kafka-persistent-premium.yaml

To confirm, check the PersistentVolumeClaim for the Kafka node and notice the STORAGECLASS column.


We only configured the Kafka broker to use the Premium storage, so the Zookeeper Pod will use the StandardSSD storage type.

Re-size Storage (TL;DR - Does Not Work Yet)

Azure Disks can be expanded to add more storage. In the case of Kubernetes, it is the storage class which defines whether this is supported or not. For AKS, if you check the default (or the managed-premium) storage class, you will notice the property allowVolumeExpansion: true, which confirms that you can do so in the context of a Kubernetes PVC as well.

Strimzi makes it really easy to increase the storage for our Kafka cluster: all you need to do is update the storage.size field to the desired value.
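One way to update that field without editing the manifest is a JSON merge patch, sketched below. The function names are made up, the new size is whatever you choose, and this is an alternative to re-applying an edited manifest rather than the method used in this post. On AKS the resize itself still ran into the limitation described in this section.

```shell
# Hypothetical helper: build a JSON merge patch that sets spec.kafka.storage.size.
storage_size_patch() {
  printf '{"spec":{"kafka":{"storage":{"size":"%s"}}}}\n' "$1"
}

# Apply the patch to the Kafka resource; a merge patch leaves the other storage fields untouched.
resize_kafka_storage() {
  cluster="$1"
  new_size="$2"
  kubectl patch kafka "$cluster" --type merge -p "$(storage_size_patch "$new_size")"
}
```

For example, resize_kafka_storage my-kafka-cluster 4Gi would request 4Gi for the Kafka volume (4Gi is only an example value).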

Check the PVC now: kubectl describe pvc data-my-kafka-cluster-kafka-0


Notice the "Cannot resize disk..." error message. This happens because the Azure Disk is currently attached to an AKS cluster node, which in turn is because the Pod is using the PersistentVolumeClaim. This is a documented limitation.

I am not the first one to run into this problem of course. Please refer to issues such as this one for details.

There are workarounds, but they are not discussed in this blog. I included this section because I wanted you to be aware of the caveat.

Final Countdown 

We want to leave on a high note, don't we? Alright, so to wrap it up, let's scale our cluster out from one to three nodes. It's dead simple!

All you need to do is increase the replicas to the desired number. In this case, I configured it to 3 (for both Kafka and Zookeeper).


In addition to this, I also added an external load balancer listener (this will create an Azure Load Balancer, as discussed in part 2).


To create the new cluster, simply apply the new manifest.


Please note that the overall cluster readiness will take time, since additional components (Azure Disks, Load Balancer public IPs, etc.) have to be created before the Pods are activated.

In Your k8s Cluster, You Will See...

Three Pods each for Kafka and Zookeeper


Three pairs (each for Kafka and Zookeeper) of PersistentVolumeClaims ...
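The claim names to look for can be predicted from the replica count, as in this small sketch. The function name is hypothetical; the Kafka names follow the data-my-kafka-cluster-kafka-0 pattern seen earlier, and the sketch assumes the Zookeeper claims follow the same pattern with zookeeper in place of kafka.

```shell
# Hypothetical helper: print the PVC names expected for a cluster with the given replica count,
# following the data-<cluster>-kafka-<N> pattern seen earlier.
# Assumption: Zookeeper claims use the same pattern with "zookeeper" in place of "kafka".
expected_pvc_names() {
  cluster="$1"
  replicas="$2"
  for component in kafka zookeeper; do
    i=0
    while [ "$i" -lt "$replicas" ]; do
      printf 'data-%s-%s-%s\n' "$cluster" "$component" "$i"
      i=$((i + 1))
    done
  done
}

# Names expected for the three-node cluster in this section.
expected_pvc_names my-kafka-cluster 3
```

Comparing this list with the claims in the cluster is a quick way to spot one that has not been created or bound yet.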


... and the respective PersistentVolumes they are bound to


... and Load Balancer IPs. Notice that one is created for each Kafka broker, along with a bootstrap IP, which is the recommended address to use when connecting from a client application.


To access the cluster, you can use the steps outlined in part 2.

It's a Wrap!

That's it for this blog series, which covered some of the aspects of running Kafka on Kubernetes using the open source Strimzi operator.

If this topic is of interest to you, I encourage you to check out other solutions such as Confluent operator and Banzai Cloud Kafka operator.
