DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

The Latest Containers Topics

article thumbnail
How to Deploy a Spring Boot App on AWS ECS Cluster
Learn how to setup an AWS ECS (Elastic Container Service) Cluster. Create the Cluster and deploy a Docker image containing a Spring Boot App.
September 15, 2021
by Gunter Rotsaert DZone Core CORE
· 2,148 Views · 1 Like
article thumbnail
7 Microservices Best Practices for Developers
In this article, we’ll look at some microservices best practices and suggest a few ways to help you design, orchestrate, and secure your microservices architecture.
September 13, 2021
by Michael Bogan DZone Core CORE
· 186,471 Views · 22 Likes
article thumbnail
Provisioning High-Performance Storage for NoSQL databases with OpenEBS
Modern applications rely on multiple data models that generate different data types. To help such use cases, NoSQL (Not only SQL) databases allow for the storage and processing of different data types in a non-tabular fashion. NoSQL databases process unstructured data using flexible schemas to enable efficient storage and analysis for distributed, data-driven applications. By relaxing data consistency restrictions of SQL-based databases, NoSQL databases enable low latency, scalability and high performance for data access. The performance of SQL databases varies with the cluster size, the application’s configuration and network latency. This means that developers don’t have to worry about optimizing data structures, indexes and queries to achieve peak performance from the storage subsystem. NoSQL databases are popular with modern cloud applications since they use APIs for the storage and retrieval of data structures. This makes it easy to interface NoSQL databases with a host of microservice-based applications for agile, cloud-native deployment. OpenEBS is a leading Container Attached Storage (CAS) that helps developers deploy workloads efficiently by turning storage available on worker nodes into dynamically provisioned volumes. With OpenEBS, developers can implement granular policies per workload, reduce storage costs, avoid vendor lock-in and develop persistent storage for applications with high availability. This post delves into how OpenEBS can be used to provision high-performance storage for NoSQL databases. A Deep Dive into NoSQL Databases Before the mid-2000s, relational databases dominated application development. The Structured Query Language (SQL) was used to store and retrieve data structures, with popular databases being MySQL, Oracle, PostgreSQL, DB2 and SQL Server. The decrease in the cost of storage options eliminated the need to use strict, complex data models allowing for data applications to store and query more types of data. NoSQL databases evolved to handle the flexible schema needed to process different combinations of structured, semi-structured and polymorphic data. This section explores various features, types and popular distributions of NoSQL databases. Features of NoSQL Databases Though there are various flavours of NoSQL databases, here’s a list of common features that distinguish them from SQL databases: Multi-Model NoSQL databases are built for flexibility to handle large amounts of data. Unlike SQL-based databases that access & analyze data in tables and columns, NoSQL databases ingest all types of data with relative ease. NoSQL databases enable the creation of specific data models for each application, enhancing agility in handling different data types without having to manage separate databases. Schema Agnostic Schemas are used to instruct a relational database on what data to expect and how to store the data. Any changes in data structures or data paths require extensive re-architecting of the database using a modified schema. On the contrary, NoSQL databases require no upfront design work before data is stored. These databases shorten development time by allowing developers to start coding and accessing data without having to understand the internal implementation of the database. Non-Relational Relational databases have strict restrictions on how tables associate with each other while relying on traditional master-slave architecture. On the other hand, NoSQL databases do not rely on tabularized data with fixed row and column records while running on peer-to-peer networks with no concept of relationships between their records. The databases, therefore, facilitate easy storage & retrieval and fast query speed for data-driven applications. Distributable NoSQL database systems are designed to use multiple locations involving different data centres/regions for global scalability. Thanks to their masterless architecture, NoSQL databases can maintain continuous availability through the replication of data in multiple read/write locations. Easily Scalable While relational databases are also scalable, their scalability is costly and complex since their architecture requires the addition of larger, more powerful hardware for scaling. NoSQL databases allow for linear scalability both vertically and horizontally, so developers can either provide more processors or powerful ones for increased workloads. Types of NoSQL Databases Document Databases In this database, data is represented as a JSON-type document or object for efficient and intuitive data modelling. Document-type NoSQL databases simplify application development since developers can query and store data with the same document model format used in application source code. Document databases are flexible, hierarchical and semi-structured, so they scale to match with an application’s evolving needs. Graph NoSQL Databases These databases store data in edges and nodes. A node stores information about entities while the edge stores information about the relationship between these entities. These databases are mostly used in applications that analyze relationships between users and other entities to identify patterns. These include social networking, knowledge graphs, recommendation engines and fraud detection. Key-Value Databases These store data in simple key-value pairs. Values are retrieved by referencing their keys, which simplifies how developers query for specific data entries. These databases are highly applicable in applications that need to store large amounts of data without needing complex query operations for retrieval. The most common use-case for key-value databases is the storage of user preferences. Wide-Column Stores These databases use dynamic columns, rows and tables to store data. They are more flexible than relational databases since each row can have a different number of columns. Wide column stores are used for large amounts of data with predictable query patterns. Popular NoSQL Databases Some popular NoSQL Database management solutions include: Cassandra A free, open-source, distributed, wide-column store that powers mission-critical deployments with fault-tolerant replication, hybrid cloud support and audit logging. Apache Cassandra is trusted by major brands, such as Facebook, Netflix, Macy’s and MobilePay for scalability, high availability and improved performance. Follow this guide to explore the steps to deploy Cassandra StatefulSets with OpenEBS storage. MongoDB MongoDB is popular with modern application development as it relies on the document data model to simplify database management for developers. MongoDB includes functionality such as data-geolocation, horizontal scaling and automatic failover to accelerate development and adapt to application changes. Read this guide to discover how to provision Persistent Volumes for MongoDB StatefulSets in Kubernetes with OpenEBS. Redis An open-source, distributed, in-memory key-value data store that can be used in application development as a message broker, cache and database. Redis supports multiple types of abstract data structures and provides high availability through automatic disk partitioning and the Redis Sentinel framework. This guide demonstrates how OpenEBS can be used to provide persistent storage for Redis StatefulSets. OpenEBS for NoSQL Databases OpenEBS simplifies the deployment of stateful Kubernetes workloads using a collection of data engines to implement persistent volumes. The OpenEBS control plane is deeply integrated with Kubernetes, and uses Kubernetes-friendly constructs to manage how volumes are provisioned, scheduled and maintained. With OpenEBS, cluster administrators can take advantage of dynamic provisioning for local and distributed volumes, depending on workload and cluster size. These features make OpenEBS a popular choice to orchestrate storage for stateful applications. The following section explores the steps taken to provision OpenEBS storage for NoSQL databases. Why Adopt OpenEBS for NoSQL? Many organizations and users have adopted OpenEBS to deploy and provision storage for their stateful workloads, including those who use NoSQL. Some of the following are reasons to adopt OpenEBS for NoSQL databases include: Open-Source, Kubernetes-Centric Cloud Native Storage OpenEBS follows a loosely coupled Container Attached Storage (CAS) architecture. OpenEBS itself is deployed as a workload on Kubernetes nodes, bringing the DevOps benefits of container orchestration in the application layer to the data layer. This allows developers to leverage the cloud-native benefits of Kubernetes such as agility and scalability for developing reliable, effective data-driven applications. No Cloud Lock-in With OpenEBS, application data is written into storage engines, creating a data abstraction layer. This allows developers to move data easily between multiple Kubernetes environments. OpenEBS can be deployed on-premises, local storage or managed cloud services -- thereby allowing NoSQL applications to simultaneously access data stored on different deployment platforms. OpenEBS Enables Granular Deployment and Management of Workloads The cloud-native, loosely-coupled architecture of OpenEBS clusters enable multiple teams to deliver faster since they are free of cross-functional dependencies. OpenEBS also makes it easier to declare policies and settings on a per-workload or per-volume basis, with constant monitoring to ensure workloads achieve desired results. Granularity makes it easier for developers to segregate large amounts of data based on data type, structure or use-case. Reduced Storage Costs The dynamic nature of NoSQL data often necessitates the over-provision of cloud storage resources to achieve higher performance and a lower risk of disruption. To help with this, OpenEBS relies on thin provisioning mechanisms to pool storage and grow data volumes when the NoSQL database needs it. While adjusting storage on the fly without disrupting volumes attached to workloads, OpenEBS enables cost savings of up to 60% leveraging a thin, dynamic provisioning. Configuration Workflow To provision high-performance storage for NoSQL databases using OpenEBS, the following steps are undertaken: Installing OpenEBS - OpenEBS integrates seamlessly into the Kubernetes workflow and is installed into the cluster using installation modes available to Kubernetes applications, such as helm and with YAML files via kubectl. This activates a declarative storage control plane that can be managed from within the cluster. Selecting the OpenEBS storage engine - The storage engine represents OpenEBS data plane components. OpenEBS includes a number of storage engines and may automatically choose one that suits the storage available on nodes and application requirements. OpenEBS comes with three storage engines: cStor, Jiva, Local PV and Mayastor. Creating a Storage Class - The StorageClass is used to provision volumes on physical storage devices. These classes consume storage pools created on the disks attached to cluster nodes. Launching the NoSQL database - The database is then deployed into the cluster either using operators or helm charts. A typical architecture of NoSQL database with OpenEBS Persistent Volumes Monitoring Deployment Metrics OpenEBS includes post-deployment recommendations to ensure a successful deployment for NoSQL workloads. While developers are advised to allocate sufficient volume size during initial configuration, the volume size should be constantly monitored to ensure seamless operation. Additionally, developers and administrators should watch pool capacity and add more physical disks once the workload hits an 80% threshold. OpenEBS integrates with Kubernetes centric monitoring & logging tools such as Prometheus and Grafana for easier metric collection, analysis and visualization. NoSQL databases enable data-driven application development since they facilitate global scalability, flexibility and delivery agility. Cloud-native application architectures are considered perfect for NoSQL databases since they deliver on-demand infrastructure for dynamic workloads. As an essential enabler, OpenEBS can orchestrate stateful Kubernetes workloads including NoSQL databases, offering multiple benefits such as reduced storage costs and zero lock-in. To know more on how OpenEBS can help to manage your organization’s stateful workloads, contact us here. This article has already been published on https://blog.mayadata.io/provisioning-high-performance-storage-for-nosql-databases-with-openebs and has been authorized by MayaData for a republish.
September 10, 2021
by Sudip Sengupta DZone Core CORE
· 4,737 Views · 2 Likes
article thumbnail
Limit Communication Between Microservices With Kubernetes Network Policies
To prevent a few compromised services from affecting all the services on the platform, a microservices platform needs to limit the interactions between services.
September 9, 2021
by Rahul Rai
· 5,818 Views · 4 Likes
article thumbnail
Nextcloud and Kubernetes in the Cloud With Kuma Service Mesh
Want a powerful, self-hosted personal cloud? Then look no further than Nextcloud running on Kubernetes with a service mesh to add all the help and features you need.
Updated September 4, 2021
by Chris Ward DZone Core CORE
· 7,876 Views · 3 Likes
article thumbnail
Mule Server, Server Group and Cluster
Read this article to learn core concepts of Mule servers, server group, and cluster of Mule runtime engine instances (servers).
August 30, 2021
by Mukesh Thakur
· 18,242 Views · 4 Likes
article thumbnail
How Kafka Can Make Microservice Planet a Better Place
In this article, we want to focus on using Kafka in the microservice architecture, and we need an important concept named Kafka Topic for that.
August 28, 2021
by Reza Ganji DZone Core CORE
· 19,283 Views · 13 Likes
article thumbnail
Testcontainers: From Zero To Hero [Video]
Do you want to make the most out of your integration tests by using the popular Testcontainers library? Learn how to use it from the ground up.
August 27, 2021
by Marco Behler
· 8,234 Views · 4 Likes
article thumbnail
Redis-Based Tomcat Session Management
Learn what Tomcat clustering is and what problems it can solve by working together with Redis.
Updated August 23, 2021
by Nikita Koksharov
· 60,001 Views · 20 Likes
article thumbnail
Popular Tools Supporting YAML Data Format
Learning about YAML is very beneficial for today's software engineers. This article includes a list, and a description, of ten tools that support YAML.
Updated August 21, 2021
by Tarun Telang
· 22,978 Views · 4 Likes
article thumbnail
Implementing JDBC Persistent Object Store With Anypoint Clustering | MySQL Database
Clustering is group of nodes that act as a single unit. With JDBC Persistent Object Store, data can be persisted in case of Mule Runtime failure, crashes or shutdown.
Updated August 21, 2021
by Jitendra Bafna
· 18,996 Views · 3 Likes
article thumbnail
Spring Boot Docker Deployment
Spring Boot Docker Deployment is a quick and easy way to deploy Spring Boot Microservices as containers. Take a look at all the steps involved in this process below.
August 15, 2021
by Saurabh Dashora DZone Core CORE
· 8,098 Views · 6 Likes
article thumbnail
ContainerD Kubernetes Syslog Forwarding
Move from Logspout to Filebeat to support containerd logging architecture.
August 15, 2021
by Miguel Callejas
· 14,949 Views · 3 Likes
article thumbnail
Boost Your Development Environment With Ubuntu Multipass
Ubuntu Multipass is used to quickly launch and manage Linux virtual machines on any workstation. It's available for Linux, macOS, and Windows. From a developer’s perspective, it is an interesting alternative to Docker or VirtualBox. For me, it feels easier and simpler to use.
August 14, 2021
by Tomasz Waraksa
· 9,463 Views · 6 Likes
article thumbnail
How Your Application Architecture Has Evolved
In this post, I will discuss how application architecture, in my opinion, has evolved in the last few years and what has been the driving factor for each evolution.
August 12, 2021
by Himanshu Gupta
· 13,044 Views · 11 Likes
article thumbnail
Why Use LocalPV with NVMe for Your Workload?
Containerized applications are ephemeral, which means any data created by a container is lost as soon as the process terminates. This requires a pragmatic approach to data persistence and management when orchestrating containers using Kubernetes. To deal with this, the Kubernetes orchestration platform uses Volume plugins to isolate storage consumption from provisioned hardware. A Persistent Volume (PV) is a Kubernetes API resource that provisions persistent storage for PODs. Cluster resources can use a PV construct to mount any storage unit -- file system folders or block storage options -- to Kubernetes nodes. PODs request for a PV using Persistent Volumes Claims (PVC). These storage integrations and other features make it possible for containerized applications to share data with other containers and preserve the container state. PVs can be provisioned statically by the cluster administrator or dynamically using Storage Classes. Some important features that distinguish different Storage Classes include Capacity, Volume Mode, Access Modes, Performance and Resiliency. When a Local Disk is attached directly to a single Kubernetes node, it is known as a Local PV which provides the best performance and is only accessible from a single node where it is attached. This post explores why LocalPV and NVMe storage should be used for Kubernetes workloads. Non-Volatile Memory Express (NVMe) for Kubernetes NVMe is a high-speed access protocol that delivers low latency and high throughput for SSD storage devices by connecting them to the processor through a PCIe interface. Early SSDs connected to the CPU through SATA or Serial Attached SCSi (SAS). These relied on legacy standards customized for Hard Disk speeds which were considered inefficient since each connection to the processor remained limited by synchronized locking or the SAS Host Bus Adapter (HBA). To overcome this challenge, NVMe unlocks the true potential of flash storage using the Peripheral Component Interconnect Express (PCIe) that supports high performance, Non-Uniform Memory Access (NUMA). NVMe also supports parallel processing, with 64K Input-Output queues with each queue having 64K entries. This high-bandwidth, low-latency storage hosts applications that can create as many I/O queues as system configuration, workload and the NVMe controller allows. Following a NUMA based storage protocol, NVMe allows different CPUs to manage I/O queues, using various arbitration mechanisms. Modern enterprises are data-driven, with users and devices generating huge amounts of data that may overwhelm companies. By enhancing the capabilities of multi-core CPUs, NVMe provides low latency and fast transfer rates for better access and processing of large data sets. NVMe devices typically rely on NAND Flash Memory that can be hosted on various SSD form factors including normal SSDs, U2 Cards, M2 Cards, and PCIe Add-In Cards. NVMe over Fabrics (NVMe-oF) extends the advantages of NVMe storage access by implementing the NVMe protocol for remotely connected devices. The architecture allows one node to directly access a storage device of another computer over several transport protocols. NVMe Architecture In NVMe architecture, the host computer is connected to SSD storage devices via a high throughput Host-Controller Interface. The storage service is composed of three main elements: SSD Controllers The PCIe Host Interface Non-Volatile Memory (e.g., NAND Flash) To submit queues to the Input/Output, the NVMe controller utilizes Memory-Mapped Controller Registers and the host system’s DRAM. The number of mapped registers determines the number of parallel I/O operations the protocol can support. A typical NVMe storage architecture Advantages of Using NVMe for Kubernetes Clusters PCIe reduces the need for various abstract implementation layers, allowing for faster, efficient storage. Some benefits of using NVMe for storage include: Efficient memory transfer - NVMe protocol only requires one ring per CPU to communicate directly with Non-Volatile Memory, thereby reducing locking speeds for I/O controllers. NVMe also enables parallelism by combining the number of Message Signalled Interrupts with multi-core CPUs to further reduce latency. Secured Cluster Data - NVMe-oF enables secure tunnelling protocols developed and managed by reputable data security communities such as the Trusted Security Group (TSG). This enables enterprise-grade security features such as Encryption at REST, Access Control and Crypto-Erase for cluster nodes and SSD storage devices. Supports Multi-Core Computing - The NVMe protocol utilizes a private queueing strategy to support up to 64K commands per queue over 64K queues. Since every controller has its own set of queues, the throughput increases linearly with the number of CPU cores available. Requires Fewer Instructions to Process I/O requests - NVMe relies on an efficient set of commands to half the number of CPU instructions required to implement Input-Output operations. This reduces latency while enabling advanced security features like reservations and power management for cluster administrators. Why use LocalPV with NVMe Storage for Kubernetes Clusters? While most storage systems used to persist data for Kubernetes clusters are remote and independent of the source nodes, it is possible to attach a local disk directly to a single node. Locally attached storage typically guarantees higher performance and tighter security than remote storage. A Kubernetes LocalPV represents a portion of local disk storage that can be used for data persistence in StatefulSets. With LocalPV, the local disk is specified as a persistent volume that can be consumed with the same PVC and Storage Class abstractions used for remote storage. This results in low latency storage that is suitable for fault-tolerant use-cases such as: Distributed data stores that share replicated data across multiple nodes LocalPV can also be used to cache data sets that require faster processing over data gravity LocalPV vs. hostPath Volumes Before the introduction of LocalPV volumes, hostPath volumes were used for accessing local storage. There were certain challenges while orchestrating local storage with hostPath as it didn’t support important Kubernetes features, such as StatefulSets. Additionally, hostPath volumes required separate operators for disk management, POD scheduling and topology, making them difficult to use in production environments. LocalPV volumes were designed in response to issues with the scheduling, disk accounting and portability of hostPath volumes. One of the major distinctions is that the Kubernetes control plane knows the Node that owns a LocalPV. With hostPath, data is lost when a POD referencing the volume is scheduled to a different node. LocalPV volumes can only be referenced using a Persistent Volume Claim (PVC) while hostPath volumes can be referenced both directly in the POD definition file and via PVC. How to Configure a Kubernetes Cluster with LocalPV NVMe Storage Workloads can be configured to access NVMe SSDs on a local machine using LocalPV and a Persistent Volume Claim, or StatefulSet with volume claim attributes. This section explores how to attach a local disk to a Kubernetes cluster with NVMe storage configured. The first step is to create a storage class that enables Volume Topology-Aware Scheduling. This will instruct the Kubernetes API to not bind a PVC until a Pod consuming the PVC is scheduled. The configuration file for the storage class will be similar to: YAML $ cat sc.yaml apiVersion: storage.k8s.io/v1 kind: StorageClass metadata: name: openebs-device-sc allowVolumeExpansion: true parameters: devname: "test-device" provisioner: device.csi.openebs.io volumeBindingMode: WaitForFirstConsumer Check the doc on storageclasses to know all the supported parameters for Device LocalPV. If the device with a meta partition is available on certain nodes only, then make use of topology to tell the list of nodes where we have the devices available. As shown in the below storage class, we can use allowedTopologies to describe device availability on nodes. YAML apiVersion: storage.k8s.io/v1 kind: StorageClass metadata: name: openebs-device-sc allowVolumeExpansion: true parameters: devname: "test-device" provisioner: device.csi.openebs.io allowedTopologies: - matchLabelExpressions: - key: kubernetes.io/hostname values: - device-node1 - device-node2 The above storage class tells that device with meta partition test-device is available on nodes device-node1 and device-node2 only. The Device CSI driver will create volumes on those nodes only. The OpenEBS Device driver has its own scheduler which will try to distribute the PV across the nodes so that one node should not be loaded with all the volumes. Currently, the driver supports two scheduling algorithms: VolumeWeighted and CapacityWeighted, in which it will try to find a device which has lesser number of volumes provisioned in it or less capacity of volume provisioned out of a device respectively, from all the nodes where the devices are available. To know about how to select a scheduler via storage-class, refer this link. Once it is able to find the node, it will create a PV for that node and also create a DeviceVolume custom resource for the volume with the node information. The watcher for this DeviceVolume CR will get all the information for this object and create a partition with the given size on the mentioned node. The scheduling algorithm currently only accounts for either the number of volumes or total capacity occupied from a device and does not account for other factors like available cpu or memory while making scheduling decisions. So if you want to use node selector/affinity rules on the application pod, or have cpu/memory constraints, a Kubernetes scheduler should be used. To make use of kubernetes scheduler, you can set the volumeBindingMode as WaitForFirstConsumer in the storage class. This will cause a delayed binding, i.e Kubernetes scheduler will schedule the application pod first, and then it will ask the Device driver to create the PV. The driver will then create the PV on the node where the pod is scheduled. YAML apiVersion: storage.k8s.io/v1 kind: StorageClass metadata: name: openebs-device-sc allowVolumeExpansion: true parameters: devname: "test-device" provisioner: device.csi.openebs.io volumeBindingMode: WaitForFirstConsumer Please note that once a PV is created for a node, the application using that PV will always get scheduled to that particular node only, as PV will be sticky to that node. The scheduling algorithm by Device driver or kubernetes will come into picture only during the deployment time. Once the PV is created, the application can not move anywhere as the data is there on the node where the PV is. Create a PVC using the storage class created for the device driver. YAML $ cat pvc.yaml kind: PersistentVolumeClaim apiVersion: v1 metadata: name: csi-devicepv spec: storageClassName: openebs-device-sc accessModes: - ReadWriteOnce resources: requests: storage: 4Gi Create the deployment YAML using the PVC backed by device driver storage. YAML $ cat fio.yaml apiVersion: v1 kind: Pod metadata: name: fio spec: restartPolicy: Never containers: - name: perfrunner image: openebs/tests-fio command: ["/bin/bash"] args: ["-c", "while true ;do sleep 50; done"] volumeMounts: - mountPath: /datadir name: fio-vol tty: true volumes: - name: fio-vol persistentVolumeClaim: claimName: csi-devicepv After the deployment of the application, we can go to the node and see that the partition is created and is being used as a volume by the application for reading/writing the data. Advantages of Using LocalPV with NVMe for Kubernetes Operators Some benefits of integrating LocalPV into clusters using NVMe for storage include: Compared to remotely connected storage systems, Local Persistent Volumes support more Input-Output Operations Per Second (IOPS) and throughput since the volume directory is directly mounted on the node. This means that with LocalPV Volumes, organizations can hone in on the high performance offered by NVMe SSDs. LocalPV also enables the dynamic reservation of storage resources needed for stateful services. This makes it easy to relaunch a process on the same node using the same SSD volume. LocalPV volume configuration pins tasks to the nodes where their data resides, eliminating the need for scheduling constraints, thereby enabling quicker access of SSDs through NVMe. Destroying a LocalPV is as easy as deleting the PVC consuming it, allowing for simpler storage management. Summary Non-Volatile Memory Express (NVMe) enhances data storage and access by leveraging the performance benefits of flash memory for SSD based storage. By connecting storage devices to the CPU directly via the PCIe interface, data companies eliminate the bottlenecks associated with SATA or SAS based access. LocalPV reduces the data path between storage and Kubernetes nodes by mounting a volume directly on a Kubernetes node. This results in higher throughput and IOPS, suitable for fault-tolerant stateful applications. OpenEBS by MayaData is one of the popular open-source, agile storage stacks for performance-sensitive databases orchestrated by Kubernetes. Mayastor, OpenEBS’s latest storage engine, delivers very low overhead versus the performance capabilities of underlying devices.. OpenEBS Mayastor does not require NVMe devices or that workloads consume NVMe, although in both cases performance will increase. OpenEBS Mayastor is unique currently amongst open source storage projects in utilizing NVMe internally, to communicate to options OpenEBS replicas. To learn more about how OpenEBS Mayastor, leveraging NVMe as a protocol, performs when leveraging some of the fastest NVMe devices currently available on the market, visit this article. OpenEBS Mayastor builds a foundational layer that enables workloads to coalesce and control storage as needed in a declarative, Kubernetes-native way. While doing so, the user can focus on what's important, that is, deploying and operating stateful workloads. If you’re interested in trying out Mayastor for yourself, instructions for how to set up your own cluster, and run a benchmark like `fio` may be found at https://docs.openebs.io/docs/next/mayastor.html. Related Blogs: https://blog.mayadata.io/the-benefits-of-using-nvme-for-kubernetes https://blog.mayadata.io/mayastor-nvme-of-tcp-performance
August 12, 2021
by Sudip Sengupta DZone Core CORE
· 6,911 Views · 2 Likes
article thumbnail
Container Attached Storage (CAS) vs. Shared Storage: Which One to Choose?
An Overview of Storage in Kubernetes Kubernetes supports a powerful storage architecture that is often complex to implement unless done right. The Kubernetes orchestrator relies on volumes-abstracted storage resources - that help to save and share data between ephemeral containers. Since these storage resources abstract the underlying infrastructure, volumes enable dynamic provisioning of storage for containerized workloads. In Kubernetes, shared storage is typically achieved by mounting volumes and connecting to an external filesystem or block storage solution. Container Attached Storage (CAS) is a relatively newer solution that allows Kubernetes administrators to deploy storage as containerized microservices in a cluster. The CAS architecture makes workloads more portable and simpler to modify storage based on application needs. Because CAS is deployed per workload or per cluster, it also eliminates the cross workload and cluster blast radius of traditional shared storage. This article compares CAS with traditional shared storage to explore their similarities, differences and architecture overview. Container Attached Storage: Container Attached Storage (CAS) is a solution for stateful workloads that deploys storage as a cluster running in the cloud or on-premises. Unlike traditional storage options where storage is a shared filesystem or block storage running externally, CAS enables storage controllers that can be managed by Kubernetes. These storage controllers can run anywhere with a Kubernetes distribution, whether on top of traditional shared storage systems, or managed storage services like Amazon EBS. Data stored in CAS is accessed directly from containers within the cluster, thereby significantly reducing Read/Write times. Architecture Overview: CAS leverages the container orchestrator’s environment to enable persistent storage. The CAS software has storage targets in containers that run as services. If desired, these services are replicated as microservice-based storage replicas that can easily be scheduled and scaled independently of each other. CAS services can be orchestrated using Kubernetes or any other orchestration platform as containerized workloads, ensuring the autonomy and agility of software development teams. For any CAS solution, the cluster is typically divided into two layers: The control plane consists of the storage controllers, storage policies, and instructions on how to configure the data plane. Control plane components are responsible for the provisioning volumes and other storage associated tasks. The data plane components receive and execute instructions from the control plane on how to save and access container information. The main element of the data plane is the Storage Engine which implements pooled storage. The engine is essentially responsible for the Input-Output volume path. Some popular storage engines of OpenEBS include Mayastor, cStor, Jiva and OpenEBS LocalPV. Some prominent users of OpenEBS include the CNCF, ByteDance(Tiktok), Optro, Flipkart, Bloomberg and others. Features: Container Attached Storage is built to primarily run on Kubernetes and other cloud-native container orchestrators. This makes the solution inherently platform-agnostic and portable, thereby making it an efficient storage solution that can be deployed on any platform without the inconvenience of vendor lock-in. CAS decomposes storage controllers into constituent units that can be scaled and run independently. Every storage controller is attached to a Persistent Volume and typically runs within the user-space, achieving storage granularity and independence from underlying operating systems Control plane entities are deployed as Custom Resource Definitions that deal with physical storage entities such as disks Data plane entities are deployed as a collection of PODs running in the same cluster as the workload The CAS architecture can offer synchronous replication in order to add additional availability. When to Use: Container Attached Storage is steadily becoming the de-facto standard for persistent storage of stateful Kubernetes workloads. CAS is most like the Direct Attached Storage that many current workloads expect, such as NoSQL, logging, machine learning pipelines, Kafka and Pulsar. Many workload communities and users have embraced CAS. CAS also allows small teams to retain control over their workloads. In short, CAS may be preferred where: The workloads expect local storage Teams want to be able to efficiently turn local storage, including disks or cloud volumes, into volumes on demand for Kubernetes workloads Performance is a concern The loose coupling of the architecture is desired to be maintained at the storage layer Increased density of workloads on hosts is desired Small team autonomy is desired to be maintained Traditional Shared Storage: Shared storage was designed to allow multiple users/machines to access and store data in a pool of devices. Shared storage provided additional availability to workloads that themselves were unable to provide for their own availability; additionally, shared storage was able to work around the poor performance of the underlying disk which at the time we're able to deliver no more than 150 I/O operations per second. Today’s underlying drives can be 10,000 times more performant; massively faster than the performance requirements of most workloads. A shared storage infrastructure typically consists of block storage systems in Storage Area Networks (SANs) or file system based storage devices in Network Attached Storage (NAS) configurations. Adoption While the storage industry was once a rapidly growing industry, with growth rates in excess of 30% - 50% YoY in the late 1990s and early 2000s. In the 2010s this growth rate moderated and in certain years stopped entirely. In the 2020s growth started again, however, at a rate much slower than the exponential growth in the amount of data storage. Meanwhile, Direct Attached Storage and Cloud storage each grew more quickly in terms of capacity shipped and overall spending. Architecture Overview In traditional shared storage, all nodes in a network share the same physical storage resources but have their own private memory and processing devices. Files and other data can be accessed by any machine connected to the central storage. For a Kubernetes application, traditional shared storage is first implemented by using monolithic storage software to virtualize physical storage resources, which could either be bare-metal servers, SAN/NAS networks or block storage solutions. The software then connects to Persistent Volumes that store cluster data. Each Persistent Volume (PV) is bound to a Persistent Volume Claim (PVC) which application PODs use to request a portion of the shared storage. Both CAS and shared storage can utilize the Container Storage Interface (CSI). CSI is used to issue the commands to the underlying storage such as the need to provision a PV or to expand or snapshot that capacity. Features: Embraces centralized, consolidated storage for Block and File Storage systems, allowing administration from a single interface. Traditional storage is distinctly divided into 3 layers: the Hosts tier which has client machines, the Fabric layer which includes switches and other networking devices, and the storage layer which includes the controllers used to read/write data onto physical disks. Shared storage integrates redundancy into the design of storage devices, allowing systems to withstand failure to a sizable degree. To scale up traditional shared storage, additional storage devices should be deployed and configured into the existing array. When to Use Shared storage is used to manage large amounts of data generated and accessed by a number of different machines. This is because traditional shared storage enables high performance for large files with no bottlenecks or downtimes. Shared storage is also the go-to storage solution for organizations that depend on collaboration between teams. As data and files are managed centrally, shared storage allows efficient version control and consolidated information management. Traditional Shared Storage is also used to eliminate the need for multiple drives containing the same information, which helps reduce redundancies, thus increasing storage capacity. CAS vs. Shared Storage The two storage options vary greatly in how they persist application data. While traditional shared storage relies on an external array of storage devices to persist data, CAS uses containers within an orchestrated environment. Following are a few similarities and differences between CAS and Traditional Shared Storage: Similarities: Both CAS and traditional shared storage offer high availability storage for applications. CAS allows high availability using Data POD replicas that ensure storage is always available for the CAS cluster. While traditional shared storage uses a redundant design to ensure that the storage system can withstand failure. Both options provide quick storage options for critical applications. CAS uses agile microservices to ensure quick I/O times while shared storage allows multiple machines to quickly read and write data on a shared pool of storage devices, reducing the need to create connections between individual machines. Both solutions accommodate software-defined storage which leverages the performance of physical devices with the agility of software. Both can utilize the Container Storage Interface (CSI) to issue the commands to the underlying storage. Both can be Open Source, extending the openness of Kubernetes to the data layer. It appears that container attached storage is somewhat more likely to be open source however that is yet to be determined conclusively. Differences CAS follows a container-based microservice framework for storage, which means teams can take advantage of the agility and portability of containers to ensure faster, more efficient storage. On the contrary, traditional shared storage involves different virtual or physical machines reading/writing into a shared pool of storage devices, thereby increasing latency and reducing access speeds. CAS is platform-agnostic. This means CAS-based storage solutions can run either on-premises or the cloud, without requiring extensive configuration changes. While shared storage relies on Kernel modifications, making it is inefficient to deploy for workloads across different environments. While traditional shared storage relies on consolidated monolithic storage software, CAS runs on the userspace, enabling independent management capabilities for efficient storage administration at the granular level. CAS allows linear scalability since storage containers can be brought up as required, while in traditional shared storage, scaling involves adding newer devices to an existing storage array Summary Designed in Kubernetes, CAS enables agility, granularity and linear scalability, making it a favourite for cloud-native applications. Traditional shared storage offers a mature stack of storage technology that mainly falls short in persisting storage for stateful applications due to the inherent lack of linear scalability. CAS is a novel solution that enables the implementation of storage controllers to exist in userspace, allowing maximum scalability. OpenEBS, a popular CAS based storage solution, has helped several enterprises run stateful workloads. Originally developed by MayaData, OpenEBS is now a CNCF project with a vibrant community of organizations and individuals alike. This was also evident from CNCF’s 2020 Survey Report that highlighted MayaData (OpenEBS) in the top-5 list of popular storage solutions. Resources: Canonical definition of Container Attached Storage: https://www.cncf.io/blog/2020/09/22/container-attached-storage-is-cloud-native-storage-cas/ To read Adopter use-cases or contribute your own, visit: https://github.com/openebs/openebs/blob/master/ADOPTERS.md. CNCF 2020 Survey Report: https://www.cncf.io/wp-content/uploads/2020/11/CNCF_Survey_Report_2020.pdf OpenEBS LocalPV Quick Start Guide: https://docs.openebs.io/docs/next/localpv.html This article has already been published on https://blog.mayadata.io/container-attached-storage-cas-vs.-shared-storage-which-one-to-choose and has been authorized by MayaData for a republish.
August 10, 2021
by Sudip Sengupta DZone Core CORE
· 4,689 Views · 3 Likes
article thumbnail
Microservices vs Monolith: The Ultimate Comparison 2021
Microservices vs Monolith: Why microservices are something new that has hit the software market thread while the monolithic approach is losing its value?
August 9, 2021
by Alfonso Valdes
· 8,416 Views · 7 Likes
article thumbnail
Managing Secrets in Node.js With HashiCorp Vault
Safely manage your company's secrets by learning how to access Vault via Node.js applications, retrieve secrets, and interface with Vault via Web UI and CLI.
August 5, 2021
by Kentaro Wakayama
· 19,922 Views · 2 Likes
article thumbnail
IBM App Connect Enterprise
In this article, we describe and provide a walkthrough of the configuration required to run a SAP Inbound scenario in an IBM App Connect Enterprise Container.
August 5, 2021
by Amar Shah
· 9,777 Views · 3 Likes
  • Previous
  • ...
  • 64
  • 65
  • 66
  • 67
  • 68
  • 69
  • 70
  • 71
  • 72
  • 73
  • ...
  • Next
  • RSS
  • X
  • Facebook

ABOUT US

  • About DZone
  • Support and feedback
  • Community research

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 215
  • Nashville, TN 37211
  • [email protected]

Let's be friends:

  • RSS
  • X
  • Facebook
×