The Latest Big Data Topics

AWS Data Pipeline vs Glue vs Lambda: Who Is a Clear Winner?
This article compares AWS Data Pipeline, Glue, and Lambda.
July 17, 2021
by Alex Jordan
· 27,622 Views · 2 Likes
Sensors and Actuators in IoT - Enabling Industrial Automation
In IoT, automation is enabled by connecting data to a machine. Sensors and actuators in IoT represent these two end points of the system.
July 16, 2021
by Madhuri Jadhav
· 24,837 Views · 2 Likes
What Is Data Locality?
This article covers leveraging Data Locality in your Big Data processing.
July 16, 2021
by Tomasz Lelek
· 12,605 Views · 4 Likes
Snowflake vs. Redshift: Which Cloud Data Warehouse Is Right for You?
Analysis of Snowflake and Redshift's scalability, performance, support, security, and more to help determine which one is the best fit for your business.
July 7, 2021
by Ben Putano
· 6,695 Views · 3 Likes
EC2 Instance Types: the Good, the Bad, and the Ugly
Let's explore EC2 instance types and choose ones that offer the best price-performance combo. Let's discuss the best practices for AWS cost optimization.
July 1, 2021
by Vito Clover
· 8,033 Views · 3 Likes
The SOC Technology Stack: XDR, SIEM, WAF, and More
A SOC is composed of a wide range of processes and technologies, as well as a team of security experts. The team often employs automation to support their efforts.
June 27, 2021
by Eddie Segal
· 13,225 Views · 2 Likes
How Carbon Uses PrestoDB With Ahana to Power Real-Time Customer Dashboards
Why ad tech company Carbon chose Presto on AWS for SQL data lake analytics.
June 25, 2021
by Jordan Hoggart
· 13,597 Views · 6 Likes
Integration Patterns in Microservices World
Architects and developers can use this in-depth article to guide their integration solutions.
June 24, 2021
by Jignesh Karia
· 38,535 Views · 20 Likes
5 Characteristics of Modern Enterprise Architect
Five characteristics of a modern enterprise architect that everyone should know, and seven ways you can work on building them.
Updated June 24, 2021
by Mir Ali
· 26,408 Views · 16 Likes
Data Lake and Data Mesh Use Cases
Data lakes are here to stay and may be supplemented with a data mesh.
June 23, 2021
by Tom Smith DZone Core
· 8,448 Views · 3 Likes
Understanding the Lag in Your Kafka Cluster
Kafka powers compelling consumer experiences at many companies, but consumer lag is a major challenge. Learn how to understand and address consumer lag in Kafka.
Updated June 14, 2021
by Rohit Choudhary
· 17,058 Views · 4 Likes
Introduction to Spring Data Elasticsearch 4.1
Getting started with the latest version of Spring Data Elasticsearch 4.1 using Elasticsearch 7 as a NoSQL database.
June 13, 2021
by Arnošt Havelka DZone Core
· 23,128 Views · 9 Likes
Confluent’s Kafka REST Proxy, The Silk Route for Data Movement to Operational Kafka Cluster
In this article, I detail the steps to integrate the prebuilt version of Confluent REST Proxy with a running multi-broker Apache Kafka cluster.
June 13, 2021
by Gautam Goswami DZone Core
· 20,364 Views · 3 Likes
Introducing Cloudera SQL Stream Builder (SSB)
SSB is an improved release of Eventador's SQL Stream Builder with integration into Cloudera Manager, Cloudera Flink, and other streaming tools.
Updated June 6, 2021
by Tim Spann DZone Core
· 14,852 Views · 5 Likes
Applications for GPU-Based AI and Machine Learning
We look at some of the most talked-about areas of artificial intelligence and machine learning where graphics processing units (GPUs) play an ever-increasing role.
June 6, 2021
by Kevin Vu
· 16,695 Views · 5 Likes
Oracle BI vs. Tableau: Which Business Intelligence Tool Is Better?
The choice between these two equally capable BI tools depends on the scale and complexity of the data and on the enterprise's objectives for its BI implementation.
June 6, 2021
by Raju Shahi
· 9,629 Views · 4 Likes
Deploying CockroachDB on Kubernetes using OpenEBS LocalPV
CockroachDB is a cloud-native SQL database that features both scalability and consistency. The database is designed to withstand data center failures by deploying multiple instances of symmetric nodes in a cluster spanning several machines, disks, and data centers. Kubernetes’ built-in capabilities to scale and survive node failures make it well suited to orchestrating CockroachDB, particularly because Kubernetes simplifies cluster management and helps maintain high availability by replicating data across independent nodes. This guide focuses on how OpenEBS LocalPV devices can be used to persist storage for Kubernetes-hosted CockroachDB clusters.
Introduction to Distributed, Scaled-out Databases
Ever-growing demands for resilience, performance, scalability, and ease of use have led to an explosion of choices for developers and data scientists in search of an open source database. Databases are often characterized as either SQL databases, noted for their consistency guarantees (PostgreSQL and MariaDB, for example, are considered ACID compliant: Atomic, Consistent, Isolated, Durable), or NoSQL databases, noted for their scalability and flexibility but not considered either ACID compliant or fully compatible with SQL. More recently, distributed, scaled-out databases were introduced that promise to avoid the trade-offs between SQL and NoSQL databases, offering the scalability of NoSQL databases along with the ACID transactions, strong consistency, and relational schemas of SQL databases.
CockroachDB is a distributed database built on top of RocksDB, which serves as its transactional key-value store. CockroachDB supports both ACID transactions and vertical and horizontal scalability. With extensive geographical distribution, CockroachDB can maintain availability with controlled latency in case of a disk, machine, or even data center failure.
How CockroachDB Works
CockroachDB is deployed in clusters consisting of multiple nodes. Each node is divided into five layers:
  • The SQL Layer converts client queries to key-value operations by first parsing them against a YACC file and then converting them into an abstract syntax tree. From this tree, the database generates a network of plan nodes containing key-value code. When the plan nodes are executed, they initiate communication with the transaction layer.
  • The Transaction Layer uses two-phase commits to implement the semantics of ACID transactions. These commits are executed across all nodes in the cluster. A commit involves posting write intents and transaction records, then executing read operations.
  • The Distribution Layer receives a request once a commit has been made at the transaction layer. It identifies the destination node for the request and forwards the request to that node’s replication layer.
  • The Replication Layer’s primary responsibility is creating multiple copies of data across cluster nodes. It also uses the Raft consensus algorithm to ensure agreement between the nodes holding copies of the same data.
  • The Storage Layer uses RocksDB to store data as key-value pairs.
Although CockroachDB can run on macOS, Linux, and Windows, production instances are typically run on Linux virtual machines or containers. The database can be orchestrated either in the cloud or on-premises. Orchestration tools like Kubernetes are well suited to running stateful applications.
Orchestrating CockroachDB with Kubernetes Clusters: Before We Begin
To understand how CockroachDB is orchestrated on Kubernetes, here are some Kubernetes terms applicable to storage and stateful applications:
  • A StatefulSet is a collection of Kubernetes Pods viewed as a single stateful unit with its own network identity. A StatefulSet is a stable Kubernetes object that always binds to the same persistent storage when it restarts.
  • A Persistent Volume is a block-storage-based file system that is bound to a Pod. A volume’s lifecycle is not tied to the Pod to which it is attached, and every CockroachDB node can attach to the same persistent volume every time it restarts.
  • A Certificate Signing Request is a request by a client to have its TLS certificate signed by the Certificate Authority built into Kubernetes.
  • Role-Based Access Control (RBAC) is the system Kubernetes uses to administer access permissions in the cluster. Roles allow users to access certain resources within the cluster.
To use the most up-to-date files, Kubernetes version 1.15 or higher is required to run CockroachDB clusters. The database can be deployed on any Kubernetes distribution, including a local cluster (such as Minikube), Amazon AWS, EKS, Google GKE and GCE, among others. For persistence and replication, CockroachDB relies on external persistent volumes such as OpenEBS LocalPV.
Installing CockroachDB Operators on OpenEBS LocalPV Devices
When using OpenEBS with CockroachDB, a LocalPV is provisioned on the node where a CockroachDB Pod is attached. The volume uses an unattached block device to store data. The OpenEBS Dynamic LocalPV provisioner can create Kubernetes Local Persistent Volumes using block devices available on the node to persist data, hereafter referred to as OpenEBS LocalPV Device volumes. Compared to native Kubernetes Local Persistent Volumes, OpenEBS LocalPV Device volumes have the following advantages:
  • A dynamic volume provisioner as opposed to a static provisioner.
  • Better management of the block devices used for creating LocalPVs by OpenEBS NDM. NDM provides capabilities like discovering block device properties, setting up device filters, collecting metrics, and detecting whether block devices have moved across nodes.
Once a volume claims a block device, no other application can use the device for storage.
If there are limited block devices in other nodes, nodeSelectors can be used to provision storage for applications on particular cluster nodes. The recommended configuration for CockroachDB clusters is at least three nodes with one unclaimed local SSD per node.
This solution guide takes you through installing CockroachDB Kubernetes operators and then configuring the cluster to use local OpenEBS devices as the storage engine. The guide also highlights how to access the database for SQL queries, and finally demonstrates how to monitor the database using Prometheus and Grafana.
Let us know how you use CockroachDB in production and if you have an interesting use case to share. Also, please check out other OpenEBS deployment guides on common Kubernetes stateful workloads:
  • Deploying Kafka on Kubernetes
  • Deploying Elasticsearch on Kubernetes
  • Deploying WordPress on DigitalOcean Kubernetes
  • Deploying Magento on Kubernetes
  • Deploying Percona on Kubernetes
  • Deploying Cassandra on Kubernetes
  • Deploying MinIO on Kubernetes
  • Deploying Prometheus on Kubernetes
This article was originally published at https://blog.mayadata.io/deploying-cockroachdb-on-kubernetes-using-openebs-localpv and is republished with MayaData's authorization.
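As a rough illustration of the Transaction Layer described in the CockroachDB guide above, the following Python sketch models the idea of posting provisional writes alongside a transaction record and only making them visible on commit. It is a toy under stated assumptions: `ToyStore`, `TxnStatus`, and every method name are hypothetical and are not part of CockroachDB's API, and the sketch ignores distribution across nodes, timestamps, and Raft replication.

```python
from dataclasses import dataclass, field
from enum import Enum


class TxnStatus(Enum):
    PENDING = "pending"
    COMMITTED = "committed"
    ABORTED = "aborted"


@dataclass
class ToyStore:
    """Hypothetical single-process key-value store (not CockroachDB's real API)."""
    committed: dict = field(default_factory=dict)  # key -> value visible to readers
    intents: dict = field(default_factory=dict)    # key -> (txn_id, provisional value)
    records: dict = field(default_factory=dict)    # txn_id -> TxnStatus

    def begin(self, txn_id):
        # Create the transaction record in the PENDING state.
        self.records[txn_id] = TxnStatus.PENDING

    def write(self, txn_id, key, value):
        # Post a provisional write; refuse if another transaction holds the key.
        holder = self.intents.get(key)
        if holder is not None and holder[0] != txn_id:
            raise RuntimeError(f"key {key!r} is held by transaction {holder[0]}")
        self.intents[key] = (txn_id, value)

    def read(self, key):
        # Readers only see committed values, never provisional ones.
        return self.committed.get(key)

    def commit(self, txn_id):
        # Flip the record first, then promote the provisional writes.
        self.records[txn_id] = TxnStatus.COMMITTED
        self._resolve(txn_id, apply=True)

    def abort(self, txn_id):
        self.records[txn_id] = TxnStatus.ABORTED
        self._resolve(txn_id, apply=False)

    def _resolve(self, txn_id, apply):
        # Remove this transaction's provisional writes, applying them on commit.
        owned = [k for k, (owner, _) in self.intents.items() if owner == txn_id]
        for key in owned:
            _, value = self.intents.pop(key)
            if apply:
                self.committed[key] = value


if __name__ == "__main__":
    store = ToyStore()
    store.begin("t1")
    store.write("t1", "balance/alice", 90)
    print(store.read("balance/alice"))  # provisional write is not yet visible
    store.commit("t1")
    print(store.read("balance/alice"))  # visible after commit
```

The point of the design is that a single status change on the transaction record decides the fate of all provisional writes at once, which is what makes the group of writes atomic from a reader's perspective.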
May 31, 2021
by Sudip Sengupta DZone Core
· 13,855 Views · 3 Likes
AWS Serverless Data Lake: Built Real-time Using Apache Hudi, AWS Glue, and Kinesis Stream
In an enterprise system, populating a data lake relies heavily on interdependent batch processes. Today’s business demands high-quality data in minutes or seconds.
May 29, 2021
by Gaurav Gupta
· 13,227 Views · 3 Likes
4 Ways the IoT Creates Intelligent Pipeline Monitoring
IoT sensors make it possible to detect and pinpoint pipeline leaks more effectively. How else can they improve pipeline monitoring?
May 27, 2021
by Emily Newton
· 21,230 Views · 2 Likes
Azure Synapse Analytics – New Insights Into Data Security
Integrated Azure Synapse Workspace helps handle the security of data in one place for all data lakes, data analytics, and warehousing needs, but also requires learning some new concepts.
May 24, 2021
by Piotr Gwiazda
· 8,452 Views · 2 Likes