CubeFS: High-Performance Storage for Cloud-Native Apps
CubeFS is a CNCF-graduated, cloud-native distributed file system delivering scalable performance, fault tolerance, and smooth Kubernetes integration.
By graduating from the Cloud Native Computing Foundation (CNCF), CubeFS, a community-driven distributed file system, has reached a significant milestone. Graduation signals both technical maturity and a proven track record of running production workloads at scale. By handling metadata and data storage separately, CubeFS delivers low-latency file lookups and high-throughput storage with strong data protection, and it remains well suited to a wide range of computing workloads.
CubeFS's cloud-native design fits naturally with Kubernetes, enabling fully automated deployments, rolling upgrades, and node scaling as data needs grow. Backed by a dedicated open-source community and held to CNCF quality standards, CubeFS is a trustworthy, high-performance option for container-based organizations looking to modernize their storage systems.
Introduction to CubeFS
CubeFS is an open-source distributed file system. File operations are split between MetaNodes, which manage metadata, and DataNodes, which store the data itself, while a Master node coordinates cluster activity. This separation enables fast file lookups and sustains high throughput. Replication protects data when DataNodes fail, making CubeFS reliable enough for critical large-scale applications.
Why Deploy on Kubernetes
Kubernetes offers automated container orchestration, scaling, and a consistent way to deploy microservices. By running CubeFS on Kubernetes:
- You can quickly add or remove MetaNodes and DataNodes to match storage needs.
- You benefit from Kubernetes features like rolling updates, health checks, and autoscaling.
- You can integrate with the Container Storage Interface (CSI) for dynamic provisioning of volumes.
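For example, dynamic provisioning through the CubeFS CSI driver is configured with a StorageClass. The sketch below is illustrative only: the provisioner name and parameter keys are assumptions based on the cubefs-csi project, so verify them against the driver documentation for your version.

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: cubefs-sc
provisioner: csi.cubefs.com              # assumed driver name from the cubefs-csi project
parameters:
  masterAddr: "cubefs-master-svc:17010"  # assumed key; address of the CubeFS master service
  owner: "cubefs-user"                   # assumed key; hypothetical volume owner
reclaimPolicy: Delete
allowVolumeExpansion: true

A PVC that references cubefs-sc as its storageClassName would then be provisioned on demand.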
End-to-End Deployment Examples
Below are YAML manifests that illustrate a straightforward deployment of CubeFS on Kubernetes. They define persistent storage for each component (a standalone PersistentVolumeClaim for the Master, and per-replica volumeClaimTemplates for the MetaNode and DataNode StatefulSets), plus a Deployment for the Master and StatefulSets for the MetaNodes and DataNodes. Finally, they show how to mount and use the file system from a sample pod.
Master Setup
Master PVC
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cubefs-master-pvc
  labels:
    app: cubefs-master
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  storageClassName: <YOUR_STORAGECLASS_NAME>
Master Service
apiVersion: v1
kind: Service
metadata:
  name: cubefs-master-svc
  labels:
    app: cubefs-master
spec:
  selector:
    app: cubefs-master
  ports:
    - name: master-port
      port: 17010
      targetPort: 17010
  type: ClusterIP
Master Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cubefs-master-deploy
  labels:
    app: cubefs-master
spec:
  replicas: 1
  # Recreate prevents a rolling update from deadlocking on the ReadWriteOnce PVC
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app: cubefs-master
  template:
    metadata:
      labels:
        app: cubefs-master
    spec:
      containers:
        - name: cubefs-master
          image: cubefs/cubefs:latest
          ports:
            - containerPort: 17010
          volumeMounts:
            - name: master-data
              mountPath: /var/lib/cubefs/master
          env:
            - name: MASTER_ADDR
              value: "0.0.0.0:17010"
            - name: LOG_LEVEL
              value: "info"
      volumes:
        - name: master-data
          persistentVolumeClaim:
            claimName: cubefs-master-pvc
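With the Master manifests saved locally (the file names here are arbitrary), bring the Master up and verify it before adding the other components:

kubectl apply -f master-pvc.yaml
kubectl apply -f master-svc.yaml
kubectl apply -f master-deploy.yaml
kubectl rollout status deployment/cubefs-master-deploy
kubectl get pods -l app=cubefs-master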
MetaNode Setup
The MetaNodes run as a StatefulSet with two replicas. A single shared PVC with ReadWriteOnce access would fail as soon as the replicas were scheduled onto different nodes, so instead of a standalone claim, the StatefulSet below uses a volumeClaimTemplate that provisions a dedicated 10Gi volume for each replica.
MetaNode StatefulSet
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cubefs-meta-sts
  labels:
    app: cubefs-meta
spec:
  serviceName: "cubefs-meta-sts"
  replicas: 2
  selector:
    matchLabels:
      app: cubefs-meta
  template:
    metadata:
      labels:
        app: cubefs-meta
    spec:
      containers:
        - name: cubefs-meta
          image: cubefs/cubefs:latest
          ports:
            - containerPort: 17011
          volumeMounts:
            - name: meta-data
              mountPath: /var/lib/cubefs/metanode
          env:
            - name: MASTER_ENDPOINT
              value: "cubefs-master-svc:17010"
            - name: METANODE_PORT
              value: "17011"
            - name: LOG_LEVEL
              value: "info"
  # One 10Gi PVC is created per replica instead of a shared standalone claim
  volumeClaimTemplates:
    - metadata:
        name: meta-data
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi
        storageClassName: <YOUR_STORAGECLASS_NAME>
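A StatefulSet's serviceName must reference a headless Service, which gives each replica a stable DNS identity; the manifests above do not define one. A minimal sketch for the MetaNodes follows (the DataNode StatefulSet below needs an equivalent Service named cubefs-data-sts):

apiVersion: v1
kind: Service
metadata:
  name: cubefs-meta-sts
  labels:
    app: cubefs-meta
spec:
  clusterIP: None   # headless: no virtual IP, per-pod DNS records instead
  selector:
    app: cubefs-meta
  ports:
    - name: meta-port
      port: 17011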
DataNode Setup
As with the MetaNodes, each DataNode replica receives its own volume through a volumeClaimTemplate (100Gi per replica here) rather than a single shared PVC, which ReadWriteOnce access would otherwise restrict to one node.
DataNode StatefulSet
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cubefs-data-sts
  labels:
    app: cubefs-data
spec:
  serviceName: "cubefs-data-sts"
  replicas: 3
  selector:
    matchLabels:
      app: cubefs-data
  template:
    metadata:
      labels:
        app: cubefs-data
    spec:
      containers:
        - name: cubefs-data
          image: cubefs/cubefs:latest
          ports:
            - containerPort: 17012
          volumeMounts:
            - name: data-chunk
              mountPath: /var/lib/cubefs/datanode
          env:
            - name: MASTER_ENDPOINT
              value: "cubefs-master-svc:17010"
            - name: DATANODE_PORT
              value: "17012"
            - name: LOG_LEVEL
              value: "info"
  # One 100Gi PVC is created per replica instead of a shared standalone claim
  volumeClaimTemplates:
    - metadata:
        name: data-chunk
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 100Gi
        storageClassName: <YOUR_STORAGECLASS_NAME>
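The MetaNode and DataNode manifests apply the same way as the Master's. Because each StatefulSet uses a volumeClaimTemplate, expect one PVC per replica:

kubectl apply -f meta-sts.yaml -f data-sts.yaml
kubectl get pods -l app=cubefs-meta
kubectl get pods -l app=cubefs-data
kubectl get pvc   # one claim per StatefulSet replica, plus the Master's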
Consuming CubeFS
With the Master, MetaNodes, and DataNodes running, you can mount CubeFS in your workloads. Below is a simple pod spec that uses a hostPath for demonstration. In practice, you may prefer the CubeFS CSI driver for dynamic volume provisioning.
apiVersion: v1
kind: Pod
metadata:
  name: cubefs-client-pod
spec:
  containers:
    - name: cubefs-client
      image: cubefs/cubefs:latest
      command: ["/bin/sh"]
      args: ["-c", "while true; do sleep 3600; done"]
      securityContext:
        privileged: true   # required so the container can perform a FUSE mount
      volumeMounts:
        - name: cubefs-vol
          mountPath: /mnt/cubefs
  volumes:
    - name: cubefs-vol
      hostPath:
        path: /mnt/cubefs-host
        type: DirectoryOrCreate
Inside this pod, you would run:
mount.cubefs -o master=cubefs-master-svc:17010 /mnt/cubefs
Check logs to ensure successful mounting, and test file I/O operations.
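A quick smoke test from inside the pod might look like this; these are ordinary shell commands rather than CubeFS-specific tooling:

# Write a 16 MB file of random data, then read it back and compare checksums
dd if=/dev/urandom of=/mnt/cubefs/testfile bs=1M count=16
md5sum /mnt/cubefs/testfile
cp /mnt/cubefs/testfile /tmp/testfile.copy
md5sum /tmp/testfile.copy   # should print the same checksum as above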
Post-Deployment Checks
- Master Logs: kubectl logs cubefs-master-deploy-<POD_ID>
- MetaNode Logs: kubectl logs cubefs-meta-sts-0 and kubectl logs cubefs-meta-sts-1
- DataNode Logs: kubectl logs cubefs-data-sts-0, etc.
- I/O Test: Write and read files on /mnt/cubefs to confirm everything is functioning.
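Beyond pod logs, the Master exposes a management API on its service port. The /admin/getCluster path below is taken from the CubeFS master API documentation; confirm it matches your version before relying on it:

kubectl port-forward svc/cubefs-master-svc 17010:17010 &
curl -s "http://127.0.0.1:17010/admin/getCluster"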
Conclusion
With CNCF graduation, CubeFS is confirmed as an enterprise-grade, cloud-native storage system capable of handling demanding data workloads. Its scalable architecture and tight Kubernetes integration give organizations fault-tolerant storage that is straightforward to operate, performs well, and uses resources efficiently. With features that continue to evolve under active community backing, CubeFS is a dependable choice for modern storage at any data volume.