
Managing Persistent Storage in Kubernetes With PVs and PVCs

This article explores the concepts of Persistent Volumes (PVs) and Persistent Volume Claims (PVCs), the essential abstractions to handle storage resources.

By Abhishek Gupta · Nov. 25, 24 · Tutorial

As containerized applications continue to gain popularity, managing persistent storage in these ephemeral environments has emerged as a significant challenge. Kubernetes addresses this challenge with two key abstractions: Persistent Volumes (PVs) and Persistent Volume Claims (PVCs). These abstractions separate storage concerns, such as provisioning, lifecycle management, and capacity, from the application logic, allowing developers to focus on building scalable and resilient applications.

Understanding Persistent Volumes (PVs)

A PV in Kubernetes is a representation of a piece of storage that has been provisioned by an administrator or dynamically through storage plugins. It is a cluster-level resource, meaning it is not tied to a particular namespace and can be claimed by PVCs across different namespaces. PVs support various storage backends, including block storage, file systems, and object storage, provided by on-premises solutions or cloud storage services.

PVs embody the following characteristics:

  • Capacity: The storage size of the PV, which is defined at the time of creation.
  • Access Modes: The ways in which the PV can be mounted on a pod, such as ReadWriteOnce (RWO), ReadOnlyMany (ROX), and ReadWriteMany (RWX).
  • Reclaim Policy: The policy that dictates what happens to the PV's data after the associated PVC is deleted. The common policies are Retain and Delete; the older Recycle policy is deprecated.
  • Storage Class: An optional attribute linking the PV to a particular storage class, which defines provisioning policies and other storage parameters.
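
A minimal PV manifest brings these attributes together. The following is an illustrative example backed by an NFS share; the server address, export path, and storage class name are placeholders you would replace with values from your environment:

YAML
 
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-pv
spec:
  capacity:
    storage: 10Gi                          # Capacity
  accessModes:
    - ReadWriteOnce                        # Access mode
  persistentVolumeReclaimPolicy: Retain    # Reclaim policy
  storageClassName: standard               # Optional storage class
  nfs:
    server: <nfs-server-ip>
    path: /exports/data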

Lifecycle of Persistent Volumes

The lifecycle of a PV begins with its creation and provisioning. Once created, a PV can be in one of the following states:

  • Available: The PV is not yet bound to any PVC and is available for claiming.
  • Bound: The PV has been claimed by a PVC and is no longer available for new claims.
  • Released: The PVC associated with the PV has been deleted, but the underlying storage resource is not yet reclaimed.
  • Failed: The PV has encountered an error during automatic reclamation.
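
You can see which state each PV is currently in from the STATUS column of kubectl get pv, or in more detail with kubectl describe:

  • kubectl get pv
  • kubectl describe pv <pv-name>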

Understanding Persistent Volume Claims (PVCs)

A PVC is a request for storage by a user (typically a developer or an application). It specifies the size and access modes, among other storage attributes. PVCs are namespaced resources, meaning they belong to a specific namespace and can only be accessed by pods within the same namespace.

When a PVC is created, the Kubernetes control plane looks for an available PV that satisfies the claim's requirements. If a suitable PV is found, the PVC binds to it, creating a one-to-one mapping between the PV and PVC. If no suitable PV exists and dynamic provisioning is configured, a new PV is dynamically provisioned to satisfy the claim.
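
As a simple illustration, the claim below requests 10 GiB of ReadWriteOnce storage; the name and size are placeholders, and if storageClassName is omitted, the cluster's default StorageClass (if one is configured) is used for dynamic provisioning:

YAML
 
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi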

Interaction Between PVs and PVCs

The relationship between PVs and PVCs is fundamental to managing stateful workloads in Kubernetes. This interaction allows for:

  • Decoupling: Applications are decoupled from the underlying storage infrastructure, simplifying development and deployment.
  • Portability: The use of abstracted storage resources enables workload portability across different environments and cloud providers.
  • Scalability: The dynamic provisioning of storage resources allows applications to scale seamlessly without manual intervention from administrators.

Creating a Persistent Volume on a GlusterFS Brick

Creating a Persistent Volume (PV) on a GlusterFS brick involves a few steps, including setting up your GlusterFS cluster and bricks. A brick in GlusterFS is a basic unit of storage corresponding to a directory on a server in the storage network. Once your GlusterFS cluster is ready, you can create a PV referencing the GlusterFS volume (composed of one or more bricks).

Here are the general steps to create a PV on a GlusterFS brick:

Set Up a GlusterFS Cluster

Ensure that you have a running GlusterFS cluster with at least one volume created and started. GlusterFS volumes are made from bricks, where each brick is a directory on a server in the storage network.

Retrieve GlusterFS Volume Information

You'll need the following information about your GlusterFS volume:

  • The volume name
  • Endpoints (the list of IP addresses or hostnames of the GlusterFS servers)
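
If you need to look these up, they are available from any of the GlusterFS servers, assuming the standard gluster CLI is installed:

  • gluster volume list
  • gluster volume info <volume-name>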

Create GlusterFS Endpoints and Service in Kubernetes

Create an Endpoints resource that lists the IP addresses of your GlusterFS servers, and then create a Service that points to the Endpoints. 

Endpoints YAML (glusterfs-endpoints.yaml):

YAML
 
apiVersion: v1
kind: Endpoints
metadata:
  name: glusterfs-cluster
subsets:
  - addresses:
      - ip: <glusterfs-server-1-ip>
      - ip: <glusterfs-server-2-ip>
      - ip: <glusterfs-server-n-ip>
    ports:
      - port: 1


Service YAML (glusterfs-service.yaml):

YAML
 
apiVersion: v1
kind: Service
metadata:
  name: glusterfs-cluster
spec:
  ports:
    - port: 1


Apply the configurations using kubectl:

  • kubectl apply -f glusterfs-endpoints.yaml
  • kubectl apply -f glusterfs-service.yaml

Create the Persistent Volume

With the endpoints in place, you can now create a PV that references the GlusterFS volume.

PV YAML (glusterfs-pv.yaml):

YAML
 
apiVersion: v1
kind: PersistentVolume
metadata:
  name: glusterfs-pv
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  glusterfs:
    endpoints: glusterfs-cluster
    path: <glusterfs-volume-name>
    readOnly: false


Replace <glusterfs-server-1-ip>, <glusterfs-server-2-ip>, <glusterfs-server-n-ip>, and <glusterfs-volume-name> with your actual GlusterFS server IPs and volume name. Then, create the PV:

  • kubectl apply -f glusterfs-pv.yaml

Verify the Persistent Volume

After creating the PV, check that it's available in your Kubernetes cluster:

  • kubectl get pv

Please note that this is a general guide and assumes that the GlusterFS volume is already set up and that your Kubernetes cluster can communicate with the GlusterFS servers. Also be aware that the in-tree glusterfs volume plugin has been deprecated and removed in recent Kubernetes releases, so this approach applies only to clusters that still support it. Always refer to the official GlusterFS and Kubernetes documentation for detailed instructions tailored to your specific environment and version.

Creating a Persistent Volume on a Local Filesystem

To create a Persistent Volume (PV) on a file system in Kubernetes, you define a PV resource that describes the underlying storage. Here's how to create a PV using a directory from a node's local filesystem, which could be a mounted disk or any directory accessible to the node.

Important: Using local storage ties your PV to a specific node, which limits the portability of the pods that use it. Furthermore, data on local storage is not replicated, so it is not suitable for production workloads that require high availability.

Here is an example of how you can create a PV using a directory from the local filesystem:

Prepare the Storage on the Node

Choose a directory on your node that you want to expose as a PV. For instance, you might have a directory at /mnt/data that you wish to use. Make sure that this directory exists and has the proper permissions set:

  • sudo mkdir -p /mnt/data  
  • sudo chown -R nobody:nogroup /mnt/data  
  • sudo chmod 0777 /mnt/data

Define the Persistent Volume

Create a YAML file for your PV, such as local-pv.yaml, and define the PV resource:

YAML
 
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/data
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - <node-name>


In the above YAML file, replace <node-name> with the name of the node where the storage is located. This will ensure that the PV is only available to pods running on that specific node.

Using kubectl to Get Nodes With JSON or YAML Output

You can output node information in JSON or YAML format and then use a tool like jq to extract the node names:

  • kubectl get nodes -o json | jq '.items[].metadata.name'

This jq command prints the node names in quotes, one per line:

  • "node1"
  • "node2"
  • "node3"

For YAML output, you would use:

  • kubectl get nodes -o yaml

You can then manually look through the YAML output for the node names or use a tool like yq to parse YAML from the command line.
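
Alternatively, kubectl can extract the node names directly without external tools, using JSONPath or custom columns:

  • kubectl get nodes -o jsonpath='{.items[*].metadata.name}'
  • kubectl get nodes --no-headers -o custom-columns=NAME:.metadata.name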

Create the Persistent Volume

Apply the configuration to your cluster:

kubectl apply -f local-pv.yaml

Verify the Persistent Volume

After creating the PV, you can check its status with the following command:

kubectl get pv local-pv

Using the Persistent Volume

To use this PV, a pod needs to create a Persistent Volume Claim (PVC) that requests storage of the appropriate size and access modes. Here is an example PVC that could be used to claim the local PV:

YAML
 
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: local-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: local-storage
  resources:
    requests:
      storage: 5Gi


The storageClassName in the PVC should match the storageClassName defined in your local PV. This is what the Kubernetes control plane uses to bind the PVC to the appropriate PV.

Once you have created the PVC, you can reference it in the volumes section of a pod's spec to mount the local storage:

YAML
 
volumes:
  - name: local-storage
    persistentVolumeClaim:
      claimName: local-pvc


Remember that when using local volumes, if the node fails or the pod is rescheduled to another node, the data will not be accessible from the new node. Local volumes are typically used for temporary storage or in situations where the application can handle node-specific storage and potential data loss.

Using a PVC in a Deployment

To create pods that use a PVC through a Deployment, you need to define a Deployment resource in Kubernetes. The Deployment specifies a template for pod creation, which includes volume mounts that refer to your PVC.

Here's an example of how you can set this up:

Ensure You Have a PersistentVolumeClaim (PVC)

Before you can use a PVC in your Deployment, you need to have an existing PVC in your Kubernetes cluster. Here's an example YAML definition for a PVC:

YAML
 
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi


Apply this definition with kubectl:

  • kubectl apply -f my-pvc.yaml

Make sure the PVC is bound to a PersistentVolume (PV) and is ready for use.
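
For example, you can check that the claim's STATUS is Bound before referencing it:

  • kubectl get pvc my-pvc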

Create a Deployment That Uses the PVC

Define a Deployment YAML that includes a volume mount for the PVC. Here's an example:

YAML
 
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-deployment
  namespace: default
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-container
          image: nginx
          ports:
            - containerPort: 80
          volumeMounts:
            - mountPath: /usr/share/nginx/html
              name: my-volume
      volumes:
        - name: my-volume
          persistentVolumeClaim:
            claimName: my-pvc


In this example, the Deployment will create pods with a container running Nginx, and the PVC will be mounted at /usr/share/nginx/html. Note that because the claim requests ReadWriteOnce access, all three replicas can share it only if they are scheduled onto the same node; workloads that need the volume on multiple nodes require a ReadWriteMany-capable backend.

Apply the Deployment with kubectl:

  • kubectl apply -f my-deployment.yaml

Verify the Deployment and Pods

Check the status of your Deployment and pods to ensure they are running and that the PVC is correctly mounted:

  • kubectl get deployment my-deployment
  • kubectl get pods --selector=app=my-app

You can also describe one of the pods to see more details about the volume mounts:

  • kubectl describe pod <pod-name>

Replace <pod-name> with the actual name of one of your pods.

By following these steps, you'll have a Kubernetes Deployment that creates pods using a PVC for persistent storage. Remember to adjust the image, volume mount path, and any other configuration details to match the specific needs of your application. 

Ways of Associating PV to PVC

In Kubernetes, a Persistent Volume Claim (PVC) is typically bound to a Persistent Volume (PV) using the storage class and the capacity requirements specified in the PVC. However, there are other ways to associate a PVC with a PV, which can be useful in scenarios where you need more control over the binding process. Here are some alternative methods:

Manual Static Provisioning

When you manually pre-provision PVs, you can ensure that a specific PVC binds to a particular PV by matching the accessModes and resources.requests.storage values. In addition, you can use labels and selectors to make the match more explicit.

Example PV with a custom label:

YAML
 
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv
  labels:
    type: local
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: manual
  local:
    path: /mnt/disks/ssd1
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - your-node-name


Example PVC with a selector that matches the label:

YAML
 
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: manual
  resources:
    requests:
      storage: 5Gi
  selector:
    matchLabels:
      type: local


In this example, the PVC will only bind to PVs with a label type: local.

VolumeName Field in PVC

You can explicitly specify the name of the PV you want your PVC to bind to by setting the volumeName field in the PVC spec. 

Example PVC with volumeName set:

YAML
 
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  volumeName: my-pv


This method directly associates the PVC with the specified PV, bypassing the normal matching and dynamic provisioning process.
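
To confirm which PV a claim ended up bound to, you can read the claim's spec.volumeName field, for example:

  • kubectl get pvc my-pvc -o jsonpath='{.spec.volumeName}'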

StorageClass and VolumeBindingMode

By setting the volumeBindingMode field in a StorageClass to WaitForFirstConsumer, you can delay the binding and provisioning of a PV until a pod that uses the PVC is created. This can be useful for local volumes where the PV must be on the same node as the pod.

Example StorageClass with WaitForFirstConsumer:

YAML
 
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-wait
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer


PVCs that reference this StorageClass will wait to bind until a pod requests the PVC.
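
For example, a claim that references this class (the name and size below are illustrative) will remain Pending until a pod that mounts it is scheduled:

YAML
 
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: local-wait-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: local-wait
  resources:
    requests:
      storage: 5Gi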

Pre-Binding PVC to PV

You can pre-bind a PVC to a PV before creating the PVC by specifying claimRef in the PV spec. This method is not commonly used because it requires manual intervention and careful coordination.

Example PV with claimRef set:

YAML
 
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  claimRef:
    namespace: default
    name: my-pvc
  local:
    path: /mnt/disks/ssd1
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - your-node-name


When the PVC my-pvc is created, it will automatically bind to my-pv.
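
For completeness, a matching claim might look like the following; setting storageClassName to an empty string is a common way to prevent a default StorageClass from dynamically provisioning a different volume, though whether you need it depends on your cluster's configuration:

YAML
 
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ""
  resources:
    requests:
      storage: 5Gi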

Each of these methods provides a different level of control over the binding process between PVs and PVCs.

Summary

Persistent Volumes and Persistent Volume Claims are pivotal components in the Kubernetes storage model, facilitating the deployment and management of stateful applications. By abstracting storage details away from the application layer, Kubernetes enables a more flexible, efficient, and developer-friendly approach to persistent storage. As the Kubernetes ecosystem continues to evolve, PVs and PVCs will remain central to its strategy for stateful workload orchestration.


