
Optimizing Kubernetes Storage for Production


You’re ready to take Tier 1 apps into production on Kubernetes, but storage is holding you back. Here’s how to optimize for production and scale.

· Cloud Zone ·

You’ve containerized apps in Kubernetes and are ready to move to production, but accessing storage dynamically has been a roadblock. This is where your choice of underlying storage fabric can make or break your ability to scale.

Your infrastructure is a wide assortment of server and storage classes, generations, and media, each one with a different personality. With the right storage fabric, neither the app developer nor the storage admin needs to be prescriptive about which resources are the best fit; that should be expressed in human language terms (e.g., super performant, highly secure) and based on Storage Classes defined in Kubernetes.

Defining Storage Classes

Optimizing storage in Kubernetes means managing a class of storage against application intent. Just as a Terraform script defines application needs, the storage platform should supply storage using templates, whether you're after IOPS, latency, scale, efficiency (compression and dedupe), or security (encryption).

Dynamic storage provisioning in Kubernetes is based on Storage Classes. A persistent volume uses the Storage Class specified in its definition. A claim can request a particular class by naming a Storage Class in its definition; only volumes of the requested class can be bound to that claim.
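As a quick sketch, a claim requesting a hypothetical class named `fast` might look like this (the claim name, class name, and size are illustrative, not from the Datera setup below):

```yaml
# Hypothetical PVC: requests 10Gi of storage from a class named "fast".
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: fast      # must match an existing StorageClass name
  resources:
    requests:
      storage: 10Gi
```

The scheduler binds this claim only to a volume provisioned from the `fast` class; if no such class exists, the claim stays Pending.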

Multiple Storage Classes can be defined to match the diverse requirements on storage resources across multiple applications. This allows the cluster administrator to define multiple types of storage within a cluster, each with a custom set of parameters.

Your storage fabric should then have an intelligent optimizer that analyzes the user request, matches to resources, and places them appropriately onto a server that matches the personality of the data. It should also be policy-driven and use telemetry to continuously inventory and optimize for application and microservice needs without human involvement.

Let’s say you want to create a Storage Class to access your MySQL data, add extra protection by making 3 replicas, and place it in a higher performance tier to serve reads/writes better. You can set this up in the Datera platform with just a few steps.

Create a Storage Class for the Datera storage backend, as in the following datera-storage-class.yaml example:

$ cat datera-storage-class.yaml 

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: datera-storage-class
provisioner: io.datera.csi.dsp
reclaimPolicy: Delete
parameters:
  template: "basic"
  secretName: "mysecret"
  secretNamespace: "default"
  fsType: "xfs"

$ kubectl create -f datera-storage-class.yaml 


Datera allows the Kubernetes cluster administrator to configure a wide range of parameters in the Storage Class definition. Another way to define a Storage Class is with labels (policies), e.g. gold, silver, bronze. These options convey the storage personality and define the necessary policies at the Datera storage system level.
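For example, a second class mapping to a higher tier could be composed the same way as the earlier example. This is a sketch, assuming a template named "gold" exists on the Datera backend; the class name and template name here are illustrative:

```yaml
# Sketch: a "gold"-tier class alongside the basic one, assuming the
# Datera backend defines a matching "gold" template/policy.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: datera-gold
provisioner: io.datera.csi.dsp
reclaimPolicy: Delete
parameters:
  template: "gold"            # assumed template name on the Datera backend
  secretName: "mysecret"
  secretNamespace: "default"
  fsType: "xfs"
```

Applications then select a tier simply by naming the class in their claims, with no change to the application definition itself.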

It's very easy to compose three or 300 Storage Classes on the fly in Datera. To manage data services such as dedupe, compression, and encryption for high-velocity apps and microservices, you can attach volumes to a storage pool that can be further configured alongside policies for data protection and efficiency. Where this is typically done in a silo, Datera achieves this level of velocity and scale for Tier 1 workloads.

If you have absolutely no idea what the app requirements will be, that's OK: Datera uses AI/ML to find the best placement and resources, and will automatically adjust based on inputs. Mature apps can then graduate to policies and templates.

No Scale Without Automation

Kubernetes lets applications scale along with a Storage Class without deviating from the base definition/requirements of resources. Datera keeps that promise by binding Storage Classes to templates and homogeneously extending resources (volumes) to match the intent of applications (consumers).

As application needs change, the storage should adapt alongside it and not rely on human intervention. This is done by defining policies and templates. The storage fabric should also recognize and adapt to new nodes or hardware and automatically adjust to enhance performance, capacity, and resilience of a cluster, as well as making the resources available to all the underlying PVs.
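The consul-sts.yaml deployed in the next step is not shown in full here. A minimal sketch of what it might contain follows; the image tag, args, and port details are assumptions, while the `data` volume claim template, 1Gi size, and `datera-storage-class` match the pod and PVC names that appear later in the walkthrough:

```yaml
# Sketch of consul-sts.yaml: a 3-replica Consul StatefulSet whose
# volumeClaimTemplates provision one Datera-backed PVC per pod.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: consul
spec:
  serviceName: consul          # headless service assumed to exist
  replicas: 3
  selector:
    matchLabels:
      app: consul
  template:
    metadata:
      labels:
        app: consul
    spec:
      containers:
      - name: consul
        image: consul:1.0.2    # assumed tag; members output reports build 1.0.2
        args:
        - "agent"
        - "-server"
        - "-bootstrap-expect=3"
        - "-data-dir=/consul/data"
        volumeMounts:
        - name: data
          mountPath: /consul/data
  volumeClaimTemplates:
  - metadata:
      name: data               # yields PVCs named data-consul-0, data-consul-1, ...
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: datera-storage-class
      resources:
        requests:
          storage: 1Gi
```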

Create the StatefulSet:

$ kubectl create -f consul-sts.yaml

List the pods:

$ kubectl get pods -o wide 

NAME     READY STATUS          RESTARTS  AGE IP          NODE
consul-0 1/1   Running         0         32s 10.38.3.152 kubew03
consul-1 0/1   PodInitializing 0         15s 10.38.4.4   kubew04


The first pod has been created. The second pod will be created only after the first one is up and running, and so on. StatefulSets behave this way because some stateful applications can fail if two or more cluster members come up at the same time. For a StatefulSet with N replicas, pods are created sequentially, in order from {0...N-1}, each with a sticky, unique identity in the form $(statefulset name)-$(ordinal). The (i)th pod is not created until the (i-1)th is running, which ensures a predictable order of pod creation. However, if a strict creation order is not required, pods can be created in parallel by setting the podManagementPolicy: Parallel option in the StatefulSet template.
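The parallel option is a one-line change in the StatefulSet spec (a fragment, not a full manifest):

```yaml
# Fragment: opt out of ordered pod creation.
spec:
  podManagementPolicy: Parallel   # default is OrderedReady
```

With Parallel, the controller launches and terminates all pods at once; ordinals and per-pod PVCs are still assigned, but startup no longer waits on lower-numbered pods.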

List the pods again to see how the pod creation is progressing:

$ kubectl get pods -o wide 

NAME     READY STATUS   RESTARTS  AGE IP          NODE
consul-0 1/1   Running  0         29m 10.38.3.152 kubew03
consul-1 1/1   Running  0         28m 10.38.4.4   kubew04
consul-2 1/1   Running  0         27m 10.38.5.128 kubew05


Now all pods are running and forming the initial cluster.

Scaling the Cluster

Scaling down a StatefulSet and then scaling it up is similar to deleting a pod and waiting for the StatefulSet to recreate it. Remember that scaling down a StatefulSet only deletes the pods; it leaves the Persistent Volume Claims in place. Also note that scaling down and scaling up follow the same ordering as initial creation: when scaling down, the pod with the highest index is deleted first, and only after that pod is gone is the pod with the next-highest index deleted, and so on.

What is the expected behavior when scaling up the Consul cluster? Since Consul is based on the Raft algorithm, we scale our 3-node cluster up by 2 nodes at once, because an odd number of nodes is required to keep the Consul cluster healthy. We also expect a new Persistent Volume Claim to be created for each new pod.

Scale the StatefulSet:

$ kubectl scale sts consul --replicas=5

By listing the pods, we see our Consul cluster gets scaled up:

$ kubectl get pods -o wide

NAME     READY STATUS  RESTARTS  AGE  IP          NODE
consul-0 1/1   Running    0      5m   10.38.3.160 kubew03
consul-1 1/1   Running    0      5m   10.38.4.10  kubew04
consul-2 1/1   Running    0      4m   10.38.5.132 kubew05
consul-3 1/1   Running    0      1m   10.38.3.161 kubew03
consul-4 1/1   Running    0      1m   10.38.4.11  kubew04


Check the membership of the scaled cluster:

$ kubectl exec -it consul-0 -- consul members 

NODE     ADDRESS          STATUS TYPE   BUILD PROTOCOL DC
consul-0 10.38.3.160:8301 alive  server 1.0.2 2        kubernetes
consul-1 10.38.4.10:8301  alive  server 1.0.2 2        kubernetes
consul-2 10.38.5.132:8301 alive  server 1.0.2 2        kubernetes
consul-3 10.38.3.161:8301 alive  server 1.0.2 2        kubernetes
consul-4 10.38.4.11:8301  alive  server 1.0.2 2        kubernetes


Also, check that the dynamic Datera storage provisioner created the additional volumes:

$ kubectl get pvc -o wide 

NAME          STATUS VOLUME                                   CAPACITY ACCESS MODES STORAGECLASS         AGE
data-consul-0 Bound  pvc-91df35af-0123-11e8-86d2-000c29f8a512 1Gi      RWO          datera-storage-class 7m
data-consul-1 Bound  pvc-a010257c-0123-11e8-86d2-000c29f8a512 1Gi      RWO          datera-storage-class 6m
data-consul-2 Bound  pvc-adaa4d2d-0123-11e8-86d2-000c29f8a512 1Gi      RWO          datera-storage-class 6m
data-consul-3 Bound  pvc-1b1b9bd6-0124-11e8-86d2-000c29f8a512 1Gi      RWO          datera-storage-class 3m
data-consul-4 Bound  pvc-28feff1c-0124-11e8-86d2-000c29f8a512 1Gi      RWO          datera-storage-class 2m


Today, companies may be deploying apps by Storage Class. But according to IDC, 90% of new applications will be built with a microservices architecture in 2019. Data velocity must match microservices with the right Storage Class, continually and granularly enhancing the storage environment with additional performance and capabilities, not via legacy arrays or monolithic, siloed migrations, but via discrete racks. In other words, autonomous storage is a necessity for any business that needs to scale.

Topics:
kubernetes tutorial, kubernetes cluster, storage software, storage optimization, storage infrastructure, kubernetes, software defined storage

Opinions expressed by DZone contributors are their own.
