Managing Persistent Storage in Kubernetes With PVs and PVCs
This article explores Persistent Volumes (PVs) and Persistent Volume Claims (PVCs), the essential abstractions Kubernetes provides for handling storage resources.
As containerized applications continue to gain popularity, managing persistent storage in these ephemeral environments has emerged as a significant challenge. Kubernetes addresses this challenge with two key abstractions: Persistent Volumes (PVs) and Persistent Volume Claims (PVCs). These abstractions separate storage concerns, such as provisioning, lifecycle management, and capacity, from the application logic, allowing developers to focus on building scalable and resilient applications.
Understanding Persistent Volumes (PVs)
A PV in Kubernetes is a representation of a piece of storage that has been provisioned by an administrator or dynamically through storage plugins. It is a cluster-level resource: it is not tied to a particular namespace and can be claimed by a PVC from any namespace. PVs support various storage backends, including block storage, file systems, and object storage, provided by on-premises solutions or cloud storage services.
PVs embody the following characteristics:
- Capacity: The storage size of the PV, which is defined at the time of creation.
- Access Modes: The ways in which the PV can be mounted on a pod, such as ReadWriteOnce (RWO), ReadOnlyMany (ROX), and ReadWriteMany (RWX).
- Reclaim Policy: The policy that dictates what happens to the PV's data after the associated PVC is deleted. Common policies are Retain and Delete; the older Recycle policy is deprecated.
- Storage Class: An optional attribute linking the PV to a particular storage class, which defines provisioning policies and other storage parameters.
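To make these attributes concrete, here is a minimal illustrative PV manifest; the hostPath backend, size, and class name are placeholder choices for a single-node test setup, not a recommendation:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-pv
spec:
  capacity:
    storage: 10Gi                          # capacity
  accessModes:
    - ReadWriteOnce                        # access mode (RWO)
  persistentVolumeReclaimPolicy: Retain    # reclaim policy
  storageClassName: standard               # optional storage class (placeholder name)
  hostPath:
    path: /mnt/example                     # backing storage (placeholder path)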
Lifecycle of Persistent Volumes
The lifecycle of a PV begins with its creation and provisioning. Once created, a PV can be in one of the following states:
- Available: The PV is not yet bound to any PVC and is available for claiming.
- Bound: The PV has been claimed by a PVC and is no longer available for new claims.
- Released: The PVC associated with the PV has been deleted, but the underlying storage resource is not yet reclaimed.
- Failed: The PV has encountered an error during automatic reclamation.
Understanding Persistent Volume Claims (PVCs)
A PVC is a request for storage by a user (typically a developer or an application). It specifies the size and access modes, among other storage attributes. PVCs are namespaced resources, meaning they belong to a specific namespace and can only be accessed by pods within the same namespace.
When a PVC is created, the Kubernetes control plane looks for an available PV that satisfies the claim's requirements. If a suitable PV is found, the PVC binds to it, creating a one-to-one mapping between the PV and PVC. If no suitable PV exists and dynamic provisioning is configured, a new PV is dynamically provisioned to satisfy the claim.
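For example, assuming the cluster has a StorageClass named standard backed by a dynamic provisioner (a common but not universal default), a claim like the following is enough to trigger provisioning of a matching PV:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dynamic-claim
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: standard   # assumed StorageClass with a dynamic provisioner
  resources:
    requests:
      storage: 10Gi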
Interaction Between PVs and PVCs
The relationship between PVs and PVCs is fundamental to managing stateful workloads in Kubernetes. This interaction allows for:
- Decoupling: Applications are decoupled from the underlying storage infrastructure, simplifying development and deployment.
- Portability: The use of abstracted storage resources enables workload portability across different environments and cloud providers.
- Scalability: The dynamic provisioning of storage resources allows applications to scale seamlessly without manual intervention from administrators.
Creating a Persistent Volume (PV) on a GlusterFS brick involves a few steps, including setting up your GlusterFS cluster and bricks. A brick in GlusterFS is a basic unit of storage corresponding to a directory on a server in the storage network. Once your GlusterFS cluster is ready, you can create a PV referencing the GlusterFS volume (composed of one or more bricks).
Here are the general steps to create a PV on a GlusterFS brick:
Set Up a GlusterFS Cluster
Ensure that you have a running GlusterFS cluster with at least one volume created and started. GlusterFS volumes are made from bricks, where each brick is a directory on a server in the storage network.
Retrieve GlusterFS Volume Information
You'll need the following information about your GlusterFS volume:
- The volume name
- Endpoints (the list of IP addresses or hostnames of the GlusterFS servers)
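Both pieces of information can usually be read from any of the GlusterFS servers with the gluster CLI; replace <glusterfs-volume-name> with your volume's name:
gluster volume info <glusterfs-volume-name>   # shows the volume name and its bricks (server:path)
gluster peer status                            # lists the other servers in the trusted storage pool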
Create GlusterFS Endpoints and Service in Kubernetes
Create an Endpoints resource that lists the IP addresses of your GlusterFS servers, and then create a Service that points to the Endpoints.
Endpoints YAML (glusterfs-endpoints.yaml):
apiVersion: v1
kind: Endpoints
metadata:
  name: glusterfs-cluster
subsets:
  - addresses:
      - ip: <glusterfs-server-1-ip>
      - ip: <glusterfs-server-2-ip>
      - ip: <glusterfs-server-n-ip>
    ports:
      - port: 1
Service YAML (glusterfs-service.yaml):
apiVersion: v1
kind: Service
metadata:
  name: glusterfs-cluster
spec:
  ports:
    - port: 1
Apply the configurations using kubectl:
kubectl apply -f glusterfs-endpoints.yaml
kubectl apply -f glusterfs-service.yaml
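To confirm that both objects were created as expected, list them:
kubectl get endpoints glusterfs-cluster
kubectl get service glusterfs-cluster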
Create the Persistent Volume
With the endpoints in place, you can now create a PV that references the GlusterFS volume.
PV YAML (glusterfs-pv.yaml):
apiVersion: v1
kind: PersistentVolume
metadata:
  name: glusterfs-pv
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany
  glusterfs:
    endpoints: glusterfs-cluster
    path: <glusterfs-volume-name>
    readOnly: false
  persistentVolumeReclaimPolicy: Retain
Replace <glusterfs-server-1-ip>, <glusterfs-server-2-ip>, <glusterfs-server-n-ip>, and <glusterfs-volume-name> with your actual GlusterFS server IPs and volume name. Then, create the PV:
kubectl apply -f glusterfs-pv.yaml
Verify the Persistent Volume
After creating the PV, check that it's available in your Kubernetes cluster:
kubectl get pv
Please note that this is a general guide and assumes that the GlusterFS volume is already set up and that your Kubernetes cluster can communicate with the GlusterFS servers. Always refer to the official GlusterFS and Kubernetes documentation for detailed instructions tailored to your specific environment and version.
To create a Persistent Volume (PV) on a file system in Kubernetes, you need to define a PV resource that specifies the details of the storage on which it is hosted. Here's how to create a PV using a directory from a node's local filesystem, which could be a mounted disk or any directory accessible to the node.
Important: Using local storage ties your PV to a specific node, which limits the portability of the pods that use it. Furthermore, data on local storage is not replicated, so it is not suitable for production workloads that require high availability.
Here is an example of how you can create a PV using a directory from the local filesystem:
Prepare the Storage on the Node
Choose a directory on your node that you want to expose as a PV. For instance, you might have a directory at /mnt/data that you wish to use. Make sure that this directory exists and has the proper permissions set:
sudo mkdir -p /mnt/data
sudo chown -R nobody:nogroup /mnt/data
sudo chmod 0777 /mnt/data
Define the Persistent Volume
Create a YAML file for your PV, such as local-pv.yaml, and define the PV resource:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/data
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - <node-name>
In the above YAML file, replace <node-name> with the name of the node where the storage is located. This ensures that the PV is only available to pods running on that specific node.
Using kubectl to Get Nodes With JSON or YAML Output
To find the node name, run kubectl get nodes. You can also output the information in JSON or YAML format and then use tools like jq for JSON processing to extract node names:
kubectl get nodes -o json | jq '.items[].metadata.name'
This jq command will print each node name on its own line, in quotes:
"node1"
"node2"
"node3"
For YAML output, you would use:
kubectl get nodes -o yaml
You can then manually look through the YAML output for the node names or use a tool like yq to parse YAML from the command line.
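If you would rather not depend on jq or yq at all, kubectl's built-in JSONPath output can extract the node names directly:
kubectl get nodes -o jsonpath='{.items[*].metadata.name}'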
Create the Persistent Volume
Apply the configuration to your cluster:
kubectl apply -f local-pv.yaml
Verify the Persistent Volume
After creating the PV, you can check its status with the following command:
kubectl get pv local-pv
Using the Persistent Volume
To use this PV, a pod needs to create a Persistent Volume Claim (PVC) that requests storage of the appropriate size and access modes. Here is an example PVC that could be used to claim the local PV:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: local-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: local-storage
  resources:
    requests:
      storage: 5Gi
The storageClassName in the PVC should match the storageClassName defined in your local PV. This is what Kubernetes uses to match the claim to the appropriate PV.
Once you have created the PVC, you can reference it in the volumes section of a pod's spec to mount the local storage:
volumes:
  - name: local-storage
    persistentVolumeClaim:
      claimName: local-pvc
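For reference, here is a minimal self-contained Pod sketch that mounts the claim; the nginx image and mount path are illustrative choices rather than requirements:
apiVersion: v1
kind: Pod
metadata:
  name: local-storage-pod
spec:
  containers:
    - name: app
      image: nginx
      volumeMounts:
        - name: local-storage
          mountPath: /usr/share/nginx/html   # where the claimed storage appears inside the container
  volumes:
    - name: local-storage
      persistentVolumeClaim:
        claimName: local-pvc                 # the PVC defined above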
Remember that when using local volumes, if the node fails or the pod is rescheduled to another node, the data will not be accessible from the new node. Local volumes are typically used for temporary storage or in situations where the application can handle node-specific storage and potential data loss.
To create pods that use a PVC through a Deployment, you need to define a Deployment resource in Kubernetes. The Deployment will specify a template for pod creation, which includes volume mounts that refer to your PVC.
Here's an example of how you can set this up:
Ensure you have a PersistentVolumeClaim (PVC)
Before you can use a PVC in your Deployment, you need to have an existing PVC in your Kubernetes cluster. Here's an example YAML definition for a PVC:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
Apply this definition with kubectl:
kubectl apply -f my-pvc.yaml
Make sure the PVC is bound to a PersistentVolume (PV) and is ready for use.
Create a Deployment that uses the PVC
Define a Deployment YAML that includes a volume mount for the PVC. Here's an example:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-deployment
  namespace: default
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-container
          image: nginx
          ports:
            - containerPort: 80
          volumeMounts:
            - mountPath: /usr/share/nginx/html
              name: my-volume
      volumes:
        - name: my-volume
          persistentVolumeClaim:
            claimName: my-pvc
In this example, the Deployment will create pods with a container running Nginx, and the PVC will be mounted at /usr/share/nginx/html. Note that my-pvc requests ReadWriteOnce access, so all three replicas can share the volume only if they are scheduled onto the same node; for pods spread across nodes, use a volume that supports ReadWriteMany or give each replica its own claim.
Apply the Deployment with kubectl:
kubectl apply -f my-deployment.yaml
Verify the Deployment and Pods
Check the status of your Deployment and pods to ensure they are running and that the PVC is correctly mounted:
kubectl get deployment my-deployment
kubectl get pods --selector=app=my-app
You can also describe one of the pods to see more details about the volume mounts:
kubectl describe pod <pod-name>
Replace <pod-name> with the actual name of one of your pods.
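As a quick sanity check (a sketch, assuming pod names taken from kubectl get pods), you can write a file into the mounted path, delete the pod, and confirm the file survives in the replacement pod the Deployment creates:
# Write a test file into the mounted volume
kubectl exec <pod-name> -- sh -c 'echo "hello from the PVC" > /usr/share/nginx/html/test.html'
# Delete the pod; the Deployment recreates it
kubectl delete pod <pod-name>
# Read the file back from the replacement pod
kubectl exec <new-pod-name> -- cat /usr/share/nginx/html/test.html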
By following these steps, you'll have a Kubernetes Deployment that creates pods using a PVC for persistent storage. Remember to adjust the image, volume mount path, and any other configuration details to match the specific needs of your application.
Ways of Associating PV to PVC
In Kubernetes, a Persistent Volume Claim (PVC) is typically bound to a Persistent Volume (PV) using the storage class and the capacity requirements specified in the PVC. However, there are other ways to associate a PVC with a PV, which can be useful in scenarios where you need more control over the binding process. Here are some alternative methods:
Manual Static Provisioning
When you manually pre-provision PVs, you can ensure that a specific PVC binds to a particular PV by matching the accessModes and resources.requests.storage values. In addition, you can use labels and selectors to make the match more explicit.
Example PV with a custom label:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv
  labels:
    type: local
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: manual
  local:
    path: /mnt/disks/ssd1
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - your-node-name
Example PVC with a selector that matches the label:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: manual
  resources:
    requests:
      storage: 5Gi
  selector:
    matchLabels:
      type: local
In this example, the PVC will only bind to PVs with the label type: local.
VolumeName Field in PVC
You can explicitly specify the name of the PV you want your PVC to bind to by setting the volumeName field in the PVC spec.
Example PVC with volumeName set:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  volumeName: my-pv
This method directly associates the PVC with the specified PV, bypassing the usual dynamic provisioning process.
StorageClass and VolumeBindingMode
By setting the volumeBindingMode field in a StorageClass to WaitForFirstConsumer, you can delay the binding and provisioning of a PV until a pod that uses the PVC is created. This can be useful for local volumes where the PV must be on the same node as the pod.
Example StorageClass with WaitForFirstConsumer:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-wait
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
PVCs that reference this StorageClass will wait to bind until a pod requests the PVC.
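As an illustration, a claim referencing this class might look like the following; it will stay in the Pending state until a pod that mounts it is scheduled:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: local-wait-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: local-wait
  resources:
    requests:
      storage: 5Gi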
Pre-Binding PVC to PV
You can pre-bind a PVC to a PV before creating the PVC by specifying claimRef in the PV spec. This method is not commonly used because it requires manual intervention and careful coordination.
Example PV with claimRef set:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  claimRef:
    namespace: default
    name: my-pvc
  local:
    path: /mnt/disks/ssd1
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - your-node-name
When the PVC my-pvc is created, it will automatically bind to my-pv.
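You can confirm that the pre-binding took effect by checking that both objects report a Bound status:
kubectl get pv my-pv
kubectl get pvc my-pvc -n default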
Each of these methods provides a different level of control over the binding process between PVs and PVCs.
Summary
Persistent Volumes and Persistent Volume Claims are pivotal components in the Kubernetes storage model, facilitating the deployment and management of stateful applications. By abstracting storage details away from the application layer, Kubernetes enables a more flexible, efficient, and developer-friendly approach to persistent storage. As the Kubernetes ecosystem continues to evolve, PVs and PVCs will remain central to its strategy for stateful workload orchestration.