Kubernetes and Running Stateful Workloads
In this article, we discuss the fundamentals behind Kubernetes and how to run stateful workloads in the cloud.
Kubernetes, as we know, is currently the most popular container orchestration tool used to scale, deploy, and manage containerized applications. In its initial days, Kubernetes was mostly used to run web-based stateless services.
However, if you wanted to run stateful services such as a database, you had to run them either in virtual machines (VMs) or as a cloud service. With the rise of the Kubernetes-based hybrid cloud, many users now want to deploy stateful workloads on Kubernetes clusters as well.
Stateless and Stateful Workloads
The Kubernetes sweet spot is running stateless services and applications, which can be scaled horizontally. By keeping state out of applications, Kubernetes can seamlessly add, remove, restart, and delete pods to keep your services healthy and scalable. Developing a stateless application is the easiest way to ensure that your app can scale with Kubernetes.
A key point to keep in mind is that statefulness requires persistent storage. An application can only be stateful if it has a place to store information about its state, and that information should be available on-demand to read in the future.
- Stateless applications do not persist state and use ephemeral storage: temporary storage provided by Kubernetes that is destroyed once the pod is deleted or shut down, whether deliberately or by accident. This type of storage is an excellent option for stateless applications that use minimal resources, do their tasks, and vanish. Ephemeral storage is local to the Kubernetes cluster, which also offers persistent storage.
- Stateful applications, by contrast, need storage that follows the workload and stays attached to it persistently. They must be provisioned with a volume that always stays connected to the pods hosting them. Such applications mostly require storage outside the Kubernetes cluster, such as network-attached or remote storage.
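The difference between the two kinds of storage can be sketched in manifest form. This is a minimal illustration rather than a recommended setup: the resource names, the busybox image, the mount paths, and the 1Gi request are all assumptions, and the claim relies on the cluster having a default StorageClass.

```yaml
# Stateless case: scratch space that lives and dies with the pod.
apiVersion: v1
kind: Pod
metadata:
  name: scratch-worker            # hypothetical name
spec:
  containers:
    - name: worker
      image: busybox:1.36         # assumed image
      command: ["sh", "-c", "date > /scratch/started && sleep 3600"]
      volumeMounts:
        - name: scratch
          mountPath: /scratch
  volumes:
    - name: scratch
      emptyDir: {}                # ephemeral: removed when the pod is deleted
---
# Stateful case: a claim on persistent storage that outlives any single pod.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data                  # hypothetical name
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 1Gi                # assumed size
---
apiVersion: v1
kind: Pod
metadata:
  name: stateful-worker           # hypothetical name
spec:
  containers:
    - name: worker
      image: busybox:1.36         # assumed image
      command: ["sh", "-c", "date >> /data/history && sleep 3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: app-data       # data survives deletion of this pod
```

Deleting and recreating `scratch-worker` starts it with an empty `/scratch`, whereas a recreated `stateful-worker` that references the same claim sees the earlier contents of `/data`.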
Stateless vs Stateful Lifecycles
Stateless: Lifecycle management of stateless containers is simple. They can be started and stopped at any time and can run on any node in the cluster. As long as at least one instance of the container is running, the service provided by that application remains available.
Stateful: Stateful containers are not as flexible, because their state information must be accessible on any node that the container can be moved to. More recently, Kubernetes has added container-native storage solutions that allow state to be accessed this way.
Container lifecycle management raises more difficult issues. For example, to migrate a container from one node to another, Kubernetes shuts down the current container and starts a new one. During this process, the old and new container instances may run concurrently for a brief period, which means that applications could connect to either instance.
StatefulSet: StatefulSets were introduced to make it easy to run stateful clustered workloads in Kubernetes. Kubernetes has supported stateful workloads directly since version 1.9, when StatefulSets became generally available. A StatefulSet is a special type of controller for running such clustered workloads.
- The Pods that belong to a StatefulSet are guaranteed to have stable, unique identifiers.
- StatefulSets are a group of pods with unique stable hostnames and persistent identities.
- StatefulSets are designed to run replicated, stateful Kubernetes services. Kubernetes maintains the identities of these pods regardless of where the pods are scheduled.
- Each Pod in a StatefulSet gets its hostname from the name of the StatefulSet and the ordinal of the Pod. The identity of this Pod is sticky, irrespective of which Node the Pod is scheduled to or how many times it is rescheduled.
- Pod identifiers in StatefulSets are unique and ordered. Pods within StatefulSets are created sequentially, waiting for the previous Pod to be in a healthy state before moving on to the next Pod.
- The state information and other resilient data for any given StatefulSet pod are stored in persistent storage associated with the StatefulSet.
When to Use StatefulSets
StatefulSets are valuable for applications that require one or more of the following:
- Stable, unique network identifiers.
- Stable, persistent storage.
- Ordered, graceful deployment and scaling.
- Ordered, automated rolling updates.
The example below demonstrates the components of a StatefulSet.
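A minimal sketch of such a manifest, consistent with the description that follows, might look like this. The Service name nginx, the StatefulSet name web, the three replicas, and the use of volumeClaimTemplates come from that description; the container image, port, mount path, volume name, and storage size are assumptions, and the claim template relies on the cluster's default StorageClass.

```yaml
# Headless Service: gives each Pod a stable network identity.
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  clusterIP: None                 # "None" is what makes the Service headless
  selector:
    app: nginx
  ports:
    - name: web
      port: 80
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: nginx              # must match the headless Service above
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.25       # assumed image
          ports:
            - name: web
              containerPort: 80
          volumeMounts:
            - name: www           # assumed volume name
              mountPath: /usr/share/nginx/html
  # One PersistentVolumeClaim is created per Pod from this template.
  volumeClaimTemplates:
    - metadata:
        name: www
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi          # assumed size
```

With three replicas, the Pods are named web-0, web-1, and web-2, and each gets its own claim (www-web-0, www-web-1, www-web-2) that is reattached if the Pod is rescheduled.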
In the above example:
- A headless Service, named nginx, is used to control the network domain.
- The StatefulSet, named web, has a spec indicating that 3 replicas of the nginx container will be launched in unique Pods.
- The volumeClaimTemplates will provide stable storage using PersistentVolumes provisioned by a PersistentVolume Provisioner.
Published at DZone with permission of Saurabh Gupta. See the original article here.