
Managing Persistent Storage in Kubernetes With PVs and PVCs

This article explores the concepts of Persistent Volumes (PVs) and Persistent Volume Claims (PVCs), the essential abstractions to handle storage resources.

By Abhishek Gupta · Nov. 25, 24 · Tutorial

As containerized applications continue to gain popularity, managing persistent storage in these ephemeral environments has emerged as a significant challenge. Kubernetes addresses this challenge with two key abstractions: Persistent Volumes (PVs) and Persistent Volume Claims (PVCs). These abstractions separate storage concerns, such as provisioning, lifecycle management, and capacity, from the application logic, allowing developers to focus on building scalable and resilient applications.

Understanding Persistent Volumes (PVs)

A PV in Kubernetes is a representation of a piece of storage that has been provisioned by an administrator or dynamically through storage plugins. It is a cluster-level resource, meaning it is not tied to a particular namespace and can be claimed by PVCs across different namespaces. PVs support various storage backends, including block storage, file systems, and object storage, provided by on-premises solutions or cloud storage services.

PVs embody the following characteristics:

  • Capacity: The storage size of the PV, which is defined at the time of creation.
  • Access Modes: The ways in which the PV can be mounted on a pod, such as ReadWriteOnce (RWO), ReadOnlyMany (ROX), and ReadWriteMany (RWX).
  • Reclaim Policy: The policy that dictates what happens to the PV's data after the associated PVC is deleted. The common policies are Retain and Delete; the older Recycle policy is deprecated.
  • Storage Class: An optional attribute linking the PV to a particular storage class, which defines provisioning policies and other storage parameters.
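
A minimal PV manifest brings these attributes together. The following is an illustrative example backed by an NFS share; the server address, export path, and storage class name are placeholders you would replace with values from your environment:

YAML
 
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-pv
spec:
  capacity:
    storage: 10Gi                          # Capacity
  accessModes:
    - ReadWriteOnce                        # Access mode
  persistentVolumeReclaimPolicy: Retain    # Reclaim policy
  storageClassName: standard               # Optional storage class
  nfs:
    server: <nfs-server-ip>
    path: /exports/data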

Lifecycle of Persistent Volumes

The lifecycle of a PV begins with its creation and provisioning. Once created, a PV can be in one of the following states:

  • Available: The PV is not yet bound to any PVC and is available for claiming.
  • Bound: The PV has been claimed by a PVC and is no longer available for new claims.
  • Released: The PVC associated with the PV has been deleted, but the underlying storage resource is not yet reclaimed.
  • Failed: The PV has encountered an error during automatic reclamation.
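
You can see which state each PV is currently in from the STATUS column of kubectl get pv, or in more detail with kubectl describe:

  • kubectl get pv
  • kubectl describe pv <pv-name>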

Understanding Persistent Volume Claims (PVCs)

A PVC is a request for storage by a user (typically a developer or an application). It specifies the size and access modes, among other storage attributes. PVCs are namespaced resources, meaning they belong to a specific namespace and can only be accessed by pods within the same namespace.

When a PVC is created, the Kubernetes control plane looks for an available PV that satisfies the claim's requirements. If a suitable PV is found, the PVC binds to it, creating a one-to-one mapping between the PV and PVC. If no suitable PV exists and dynamic provisioning is configured, a new PV is dynamically provisioned to satisfy the claim.
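
As a simple illustration, the claim below requests 10 GiB of ReadWriteOnce storage; the name and size are placeholders, and if storageClassName is omitted, the cluster's default StorageClass (if one is configured) is used for dynamic provisioning:

YAML
 
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi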

Interaction Between PVs and PVCs

The relationship between PVs and PVCs is fundamental to managing stateful workloads in Kubernetes. This interaction allows for:

  • Decoupling: Applications are decoupled from the underlying storage infrastructure, simplifying development and deployment.
  • Portability: The use of abstracted storage resources enables workload portability across different environments and cloud providers.
  • Scalability: The dynamic provisioning of storage resources allows applications to scale seamlessly without manual intervention from administrators.

Creating a Persistent Volume on a GlusterFS Brick

Creating a Persistent Volume (PV) on a GlusterFS brick involves a few steps, including setting up your GlusterFS cluster and bricks. A brick in GlusterFS is a basic unit of storage corresponding to a directory on a server in the storage network. Once your GlusterFS cluster is ready, you can create a PV referencing the GlusterFS volume (composed of one or more bricks).

Here are the general steps to create a PV on a GlusterFS brick:

Set Up a GlusterFS Cluster

Ensure that you have a running GlusterFS cluster with at least one volume created and started. GlusterFS volumes are made from bricks, where each brick is a directory on a server in the storage network.

Retrieve GlusterFS Volume Information

You'll need the following information about your GlusterFS volume:

  • The volume name
  • Endpoints (the list of IP addresses or hostnames of the GlusterFS servers)
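
If you need to look these up, they are available from any of the GlusterFS servers, assuming the standard gluster CLI is installed:

  • gluster volume list
  • gluster volume info <volume-name>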

Create GlusterFS Endpoints and Service in Kubernetes

Create an Endpoints resource that lists the IP addresses of your GlusterFS servers, and then create a Service that points to the Endpoints. 

Endpoints YAML (glusterfs-endpoints.yaml):

YAML
 
apiVersion: v1
kind: Endpoints
metadata:
  name: glusterfs-cluster
subsets:
  - addresses:
      - ip: <glusterfs-server-1-ip>
      - ip: <glusterfs-server-2-ip>
      - ip: <glusterfs-server-n-ip>
    ports:
      - port: 1


Service YAML (glusterfs-service.yaml):

YAML
 
apiVersion: v1
kind: Service
metadata:
  name: glusterfs-cluster
spec:
  ports:
    - port: 1


Apply the configurations using kubectl:

  • kubectl apply -f glusterfs-endpoints.yaml
  • kubectl apply -f glusterfs-service.yaml

Create the Persistent Volume

With the endpoints in place, you can now create a PV that references the GlusterFS volume.

PV YAML (glusterfs-pv.yaml):

YAML
 
apiVersion: v1
kind: PersistentVolume
metadata:
  name: glusterfs-pv
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  glusterfs:
    endpoints: glusterfs-cluster
    path: <glusterfs-volume-name>
    readOnly: false


Replace <glusterfs-server-1-ip>, <glusterfs-server-2-ip>, <glusterfs-server-n-ip>, and <glusterfs-volume-name> with your actual GlusterFS server IPs and volume name. Then, create the PV:

  • kubectl apply -f glusterfs-pv.yaml

Verify the Persistent Volume

After creating the PV, check that it's available in your Kubernetes cluster:

  • kubectl get pv

Please note that this is a general guide and assumes that the GlusterFS volume is already set up and that your Kubernetes cluster can communicate with the GlusterFS servers. Also be aware that the in-tree glusterfs volume plugin has been deprecated and removed in recent Kubernetes releases, so this approach applies only to clusters that still support it. Always refer to the official GlusterFS and Kubernetes documentation for detailed instructions tailored to your specific environment and version.

Creating a Persistent Volume on a Local Filesystem

To create a Persistent Volume (PV) on a file system in Kubernetes, you define a PV resource that describes the underlying storage. Here's how to create a PV using a directory from a node's local filesystem, which could be a mounted disk or any directory accessible to the node.

Important: Using local storage ties your PV to a specific node, which limits the portability of the pods that use it. Furthermore, data on local storage is not replicated, so it is not suitable for production workloads that require high availability.

Here is an example of how you can create a PV using a directory from the local filesystem:

Prepare the Storage on the Node

Choose a directory on your node that you want to expose as a PV. For instance, you might have a directory at /mnt/data that you wish to use. Make sure that this directory exists and has the proper permissions set:

  • sudo mkdir -p /mnt/data  
  • sudo chown -R nobody:nogroup /mnt/data  
  • sudo chmod 0777 /mnt/data

Define the Persistent Volume

Create a YAML file for your PV, such as local-pv.yaml, and define the PV resource:

YAML
 
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/data
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - <node-name>


In the above YAML file, replace <node-name> with the name of the node where the storage is located. This will ensure that the PV is only available to pods running on that specific node.

Using kubectl to Get Nodes With JSON or YAML Output

You can output node information in JSON or YAML format and then use a tool like jq to extract the node names:

  • kubectl get nodes -o json | jq '.items[].metadata.name'

This jq command prints the node names in quotes, one per line:

  • "node1"
  • "node2"
  • "node3"

For YAML output, you would use:

  • kubectl get nodes -o yaml

You can then manually look through the YAML output for the node names or use a tool like yq to parse YAML from the command line.
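
Alternatively, kubectl can extract the node names directly without external tools, using JSONPath or custom columns:

  • kubectl get nodes -o jsonpath='{.items[*].metadata.name}'
  • kubectl get nodes --no-headers -o custom-columns=NAME:.metadata.name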

Create the Persistent Volume

Apply the configuration to your cluster:

kubectl apply -f local-pv.yaml

Verify the Persistent Volume

After creating the PV, you can check its status with the following command:

kubectl get pv local-pv

Using the Persistent Volume

To use this PV, a pod needs to create a Persistent Volume Claim (PVC) that requests storage of the appropriate size and access modes. Here is an example PVC that could be used to claim the local PV:

YAML
 
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: local-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: local-storage
  resources:
    requests:
      storage: 5Gi


The storageClassName in the PVC should match the storageClassName defined in your local PV. This is what the Kubernetes control plane uses to bind the PVC to the appropriate PV.

Once you have created the PVC, you can reference it in the volumes section of a pod's spec to mount the local storage:

YAML
 
volumes:
  - name: local-storage
    persistentVolumeClaim:
      claimName: local-pvc


Remember that when using local volumes, if the node fails or the pod is rescheduled to another node, the data will not be accessible from the new node. Local volumes are typically used for temporary storage or in situations where the application can handle node-specific storage and potential data loss.

Using a PVC in a Deployment

To create pods that use a PVC through a Deployment, you need to define a Deployment resource in Kubernetes. The Deployment specifies a template for pod creation, which includes volume mounts that refer to your PVC.

Here's an example of how you can set this up:

Ensure You Have a PersistentVolumeClaim (PVC)

Before you can use a PVC in your Deployment, you need to have an existing PVC in your Kubernetes cluster. Here's an example YAML definition for a PVC:

YAML
 
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi


Apply this definition with kubectl:

  • kubectl apply -f my-pvc.yaml

Make sure the PVC is bound to a PersistentVolume (PV) and is ready for use.
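
For example, you can check that the claim's STATUS is Bound before referencing it:

  • kubectl get pvc my-pvc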

Create a Deployment That Uses the PVC

Define a Deployment YAML that includes a volume mount for the PVC. Here's an example:

YAML
 
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-deployment
  namespace: default
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-container
          image: nginx
          ports:
            - containerPort: 80
          volumeMounts:
            - mountPath: /usr/share/nginx/html
              name: my-volume
      volumes:
        - name: my-volume
          persistentVolumeClaim:
            claimName: my-pvc


In this example, the Deployment will create pods with a container running Nginx, and the PVC will be mounted at /usr/share/nginx/html. Note that because the claim requests ReadWriteOnce access, all three replicas can share it only if they are scheduled onto the same node; workloads that need the volume on multiple nodes require a ReadWriteMany-capable backend.

Apply the Deployment with kubectl:

  • kubectl apply -f my-deployment.yaml

Verify the Deployment and Pods

Check the status of your Deployment and pods to ensure they are running and that the PVC is correctly mounted:

  • kubectl get deployment my-deployment
  • kubectl get pods --selector=app=my-app

You can also describe one of the pods to see more details about the volume mounts:

  • kubectl describe pod <pod-name>

Replace <pod-name> with the actual name of one of your pods.

By following these steps, you'll have a Kubernetes Deployment that creates pods using a PVC for persistent storage. Remember to adjust the image, volume mount path, and any other configuration details to match the specific needs of your application. 

Ways of Associating PV to PVC

In Kubernetes, a Persistent Volume Claim (PVC) is typically bound to a Persistent Volume (PV) using the storage class and the capacity requirements specified in the PVC. However, there are other ways to associate a PVC with a PV, which can be useful in scenarios where you need more control over the binding process. Here are some alternative methods:

Manual Static Provisioning

When you manually pre-provision PVs, you can ensure that a specific PVC binds to a particular PV by matching the accessModes and resources.requests.storage values. In addition, you can use labels and selectors to make the match more explicit.

Example PV with a custom label:

YAML
 
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv
  labels:
    type: local
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: manual
  local:
    path: /mnt/disks/ssd1
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - your-node-name


Example PVC with a selector that matches the label:

YAML
 
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: manual
  resources:
    requests:
      storage: 5Gi
  selector:
    matchLabels:
      type: local


In this example, the PVC will only bind to PVs with a label type: local.

VolumeName Field in PVC

You can explicitly specify the name of the PV you want your PVC to bind to by setting the volumeName field in the PVC spec. 

Example PVC with volumeName set:

YAML
 
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  volumeName: my-pv


This method directly associates the PVC with the specified PV, bypassing the normal matching and dynamic provisioning process.
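
To confirm which PV a claim ended up bound to, you can read the claim's spec.volumeName field, for example:

  • kubectl get pvc my-pvc -o jsonpath='{.spec.volumeName}'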

StorageClass and VolumeBindingMode

By setting the volumeBindingMode field in a StorageClass to WaitForFirstConsumer, you can delay the binding and provisioning of a PV until a pod that uses the PVC is created. This can be useful for local volumes where the PV must be on the same node as the pod.

Example StorageClass with WaitForFirstConsumer:

YAML
 
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-wait
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer


PVCs that reference this StorageClass will wait to bind until a pod requests the PVC.
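
For example, a claim that references this class (the name and size below are illustrative) will remain Pending until a pod that mounts it is scheduled:

YAML
 
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: local-wait-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: local-wait
  resources:
    requests:
      storage: 5Gi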

Pre-Binding PVC to PV

You can pre-bind a PVC to a PV before creating the PVC by specifying claimRef in the PV spec. This method is not commonly used because it requires manual intervention and careful coordination.

Example PV with claimRef set:

YAML
 
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  claimRef:
    namespace: default
    name: my-pvc
  local:
    path: /mnt/disks/ssd1
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - your-node-name


When the PVC my-pvc is created, it will automatically bind to my-pv.
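
For completeness, a matching claim might look like the following; setting storageClassName to an empty string is a common way to prevent a default StorageClass from dynamically provisioning a different volume, though whether you need it depends on your cluster's configuration:

YAML
 
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ""
  resources:
    requests:
      storage: 5Gi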

Each of these methods provides a different level of control over the binding process between PVs and PVCs.

Summary

Persistent Volumes and Persistent Volume Claims are pivotal components in the Kubernetes storage model, facilitating the deployment and management of stateful applications. By abstracting storage details away from the application layer, Kubernetes enables a more flexible, efficient, and developer-friendly approach to persistent storage. As the Kubernetes ecosystem continues to evolve, PVs and PVCs will remain central to its strategy for stateful workload orchestration.


