Spotlight on CockroachDB
Learn about CockroachDB and see the architecture, how to synchronize it with Kubernetes, how to deploy it on a Kubernetes cluster, and more!
The construction, processes, and usage of databases have evolved a lot over the last few decades. Traditional relational databases were enough to work with the data present at that time, but with today’s reliance on the Internet, the progression of cloud-native architecture, and advances in how businesses use and analyze data, traditional single-machine relational databases are no longer cutting it. What happens if the one machine hosting a traditional relational database fails? Your database goes down, along with any applications that depend on it.
Over time, NoSQL databases were introduced, which are capable of handling large amounts of data in real time. With them, the risk of apps failing began to decrease, but the risk of data inconsistencies increased. So, there has been a growing need for a storage solution that can cope with today’s dynamic cloud-native architecture. CockroachDB was designed specifically to meet this need.
What Is CockroachDB?
CockroachDB is a globally distributed SQL database constructed on top of a transactional and consistent key-value store that you can use everywhere. The database tool is optimized for the cloud to deliver guaranteed transactions for local and globally distributed workloads and it allows you to build global, scalable and resilient cloud services.
When using CockroachDB, you will typically run into two main terms: nodes and clusters. Nodes are individual machines running CockroachDB, and when we join these nodes together, they form a cluster, which is the starting point of a fully functioning CockroachDB system. It’s advisable to run CockroachDB as a multi-node cluster to leverage the full scope of its cloud-native database design.
CockroachDB is implemented as a distributed key-value store over a monolithic sorted map, which makes it easy to support large tables and indexes. Although CockroachDB is a distributed SQL database, developers can treat it like a traditional relational database because it uses the same SQL syntax. At the architecture level, however, it differs from a traditional relational database. In CockroachDB, every table is ordered lexicographically by key, so when we store data in the database, we are leveraging the key-value store.
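To make the idea of a table stored in a sorted key-value map more concrete, here is a minimal Python sketch. It is only an illustration: the `SortedKV` class, the `row_key` helper, and the `/table/primary-key/column` key layout are hypothetical and much simpler than the encoding CockroachDB actually uses. It shows why lexicographic ordering helps: all the cells of one row sit next to each other, so a row lookup becomes a short range scan.

```python
import bisect


class SortedKV:
    """Toy in-memory sorted key-value map (illustrative only)."""

    def __init__(self):
        self._keys = []     # keys kept in lexicographic order
        self._values = {}   # key -> value

    def put(self, key, value):
        # Insert the key at its sorted position the first time it is seen.
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def scan(self, start, end):
        # Return all (key, value) pairs with start <= key < end, in key order.
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        return [(k, self._values[k]) for k in self._keys[lo:hi]]


def row_key(table, primary_key, column):
    # Hypothetical encoding; the id is zero-padded so that string order
    # matches numeric order.
    return f"/{table}/{primary_key:08d}/{column}"


store = SortedKV()
store.put(row_key("employee", 2, "name"), "Geoff")
store.put(row_key("employee", 1, "name"), "Rob")
store.put(row_key("employee", 1, "city"), "California")
store.put(row_key("employee", 2, "city"), "New York")

# All cells of employee 1 are adjacent in the map, whatever the insert order.
rows_for_employee_1 = store.scan("/employee/00000001/", "/employee/00000002/")
```

Because the keys are sorted, the scan touches only the two entries that belong to employee 1, regardless of how many other rows the map holds.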
Since CockroachDB has a distributed architecture, we just need to spin up a CockroachDB node and point it at a cluster, and the node joins that cluster. CockroachDB then coordinates among the nodes to reach consensus for all queries and transactions. When a node joins the cluster, data is balanced across the nodes based on what you want to do with that data. The whole cluster has just one type of node as a single composable unit, and every node is a consistent gateway to the entirety of the database. So, we could have a database with nodes worldwide that still looks like one logical database to an application accessing it from any region.
CockroachDB architecture offers high availability and consistency. In the CockroachDB system, if a node dies, your application continues running by leveraging the other nodes in the cluster, and when you bring the node back online, its reads are immediately consistent with the other nodes.
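One way to reason about why a cluster keeps working when a node dies is the majority-quorum rule used by consensus protocols such as Raft: a write succeeds as long as more than half of the replicas acknowledge it. The sketch below is a simplified illustration of that arithmetic only, assuming a fixed number of replicas per piece of data. The function names are hypothetical and are not part of any CockroachDB API.

```python
def quorum_size(replicas: int) -> int:
    """Smallest number of replicas that forms a strict majority."""
    return replicas // 2 + 1


def can_make_progress(replicas: int, live: int) -> bool:
    """True if enough replicas are alive to reach a majority."""
    return live >= quorum_size(replicas)


def tolerated_failures(replicas: int) -> int:
    """How many replicas can be lost while a majority still remains."""
    return replicas - quorum_size(replicas)


# With three replicas, losing one node still leaves a majority of two.
three_way_survives_one = can_make_progress(replicas=3, live=2)
three_way_survives_two = can_make_progress(replicas=3, live=1)
```

Under these assumptions, three replicas tolerate one failure and five replicas tolerate two, which is why a three-node cluster like the one deployed later in this article can lose a single pod without going down.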
Synchronizing CockroachDB and Kubernetes
If you have spent a lot of time around people in the container or orchestration community, you may have occasionally heard the opinion that running databases on Kubernetes is a bad idea. While some of these complaints were valid a couple of years ago, today there are multiple resilient databases that are capable of coping with virtualization, failovers, container workloads, etc. This is what makes CockroachDB a great fit for Kubernetes clusters.
While Kubernetes makes it easy to deploy, scale, and manage applications, managing and retaining the state of a cluster is a challenge on the orchestration platform. Kubernetes needs a storage system that can replicate data across the database nodes to survive any kind of failure, and that is where CockroachDB excels. Combining CockroachDB with Kubernetes allows us to orchestrate containers without sacrificing high availability and helps maintain the correctness of stateful databases.
Deploying CockroachDB on a Kubernetes Cluster
To follow the steps below, you need a Kubernetes cluster up and running on which to deploy and use CockroachDB.
There are multiple ways to deploy CockroachDB on Kubernetes. For this article, we will deploy it through the CockroachDB Kubernetes operator. First, apply the CustomResourceDefinition (CRD) for the CockroachDB operator.
mylab@mylab:~$ kubectl apply -f https://raw.githubusercontent.com/cockroachdb/cockroach-operator/master/config/crd/bases/crdb.cockroachlabs.com_crdbclusters.yaml
customresourcedefinition.apiextensions.k8s.io/crdbclusters.crdb.cockroachlabs.com created
Next, apply the operator manifest. This will create all the necessary roles, service accounts, and deployments.
mylab@mylab:~$ kubectl apply -f https://raw.githubusercontent.com/cockroachdb/cockroach-operator/master/manifests/operator.yaml
clusterrole.rbac.authorization.k8s.io/cockroach-database-role created
serviceaccount/cockroach-database-sa created
clusterrolebinding.rbac.authorization.k8s.io/cockroach-database-rolebinding created
role.rbac.authorization.k8s.io/cockroach-operator-role created
clusterrolebinding.rbac.authorization.k8s.io/cockroach-operator-rolebinding created
clusterrole.rbac.authorization.k8s.io/cockroach-operator-role created
serviceaccount/cockroach-operator-sa created
rolebinding.rbac.authorization.k8s.io/cockroach-operator-default created
deployment.apps/cockroach-operator created
We can now check if the cockroach operator pod is running.
mylab@mylab:~$ kubectl get pods
NAME                                  READY   STATUS    RESTARTS   AGE
cockroach-operator-75787667ff-qf2xq   1/1     Running   0          109s
Download the example.yaml file, which contains the specification of the CockroachDB cluster that the operator will create on Kubernetes.
mylab@mylab:~$ curl -O https://raw.githubusercontent.com/cockroachdb/cockroach-operator/master/examples/example.yaml
Apply the example manifest.
mylab@mylab:~$ kubectl apply -f example.yaml
crdbcluster.crdb.cockroachlabs.com/cockroachdb created
If we check the pod list, we can see that three pod instances of CockroachDB are running. This is to provide high availability.
mylab@mylab:~$ kubectl get pods
NAME                                  READY   STATUS    RESTARTS   AGE
cockroach-operator-75787667ff-qf2xq   1/1     Running   1          24m
cockroachdb-0                         1/1     Running   0          5m28s
cockroachdb-1                         1/1     Running   0          5m7s
cockroachdb-2                         1/1     Running   0          4m48s
Using the built-in SQL client, open a SQL shell inside one of the CockroachDB pod instances.
mylab@mylab:~$ kubectl exec -it cockroachdb-0 -- ./cockroach sql --certs-dir cockroach-certs
#
# Welcome to the CockroachDB SQL shell.
root@:75262/defaultdb>
Now, create a user for CockroachDB.
root@:75262/defaultdb> CREATE USER demo WITH PASSWORD 'demo123';
CREATE ROLE
Open a new terminal and port-forward the CockroachDB service on port 8080.
mylab@mylab:~$ kubectl port-forward service/cockroachdb-public 8080
Forwarding from 127.0.0.1:8080 -> 8080
Forwarding from [::1]:8080 -> 8080
Open the browser and go to localhost:8080 to access the CockroachDB user interface. To log in to CockroachDB, enter the username and password that we created in the steps above.
Once you log in, you will see all the details of CockroachDB running in the pods on the Kubernetes cluster.
Now, let us go back to the terminal where we were running the SQL shell and execute a few basic SQL commands to create a table.
root@:75262/defaultdb> show databases;
  database_name | owner
----------------+--------
  defaultdb     | root
  postgres      | root
  system        | node
(3 rows)
root@:75262/defaultdb> create database company;
CREATE DATABASE
root@:75262/defaultdb> use company;
SET
root@:75262/company> CREATE TABLE Employee (EmployeeID int, Name varchar(30), City varchar(50));
CREATE TABLE
root@:75262/company> INSERT INTO Employee VALUES (1, 'Rob', 'California');
INSERT 1
root@:75262/company> INSERT INTO Employee VALUES (2, 'Geoff', 'New York');
INSERT 1
root@:75262/company> SELECT * FROM Employee;
  employeeid | name   | city
-------------+--------+-------------
           1 | Rob    | California
           2 | Geoff  | New York
(2 rows)
root@:75262/company> GRANT admin TO demo;
GRANT
Go back to the browser, refresh the page, and click on the database tab. The employee table that we created in CockroachDB now appears there.
And there you have it. Even if you accidentally delete a CockroachDB instance running on the cluster, a new instance will start immediately.
mylab@mylab:~$ kubectl delete pod cockroachdb-0
pod "cockroachdb-0" deleted
We can see that a new container is being created.
mylab@mylab:~$ kubectl get pods
NAME                                  READY   STATUS              RESTARTS   AGE
cockroach-operator-75787667ff-qf2xq   1/1     Running             1          45m
cockroachdb-0                         0/1     ContainerCreating   0          11s
cockroachdb-1                         1/1     Running             0          25m
cockroachdb-2                         1/1     Running             0          25m
cockroachdb-vcheck-26982384-2bstc     0/1     Completed           0          26m
Here, all three CockroachDB instances are back online.
mylab@mylab:~$ kubectl get pods
NAME                                  READY   STATUS      RESTARTS   AGE
cockroach-operator-75787667ff-qf2xq   1/1     Running     1          46m
cockroachdb-0                         1/1     Running     0          61s
cockroachdb-1                         1/1     Running     0          26m
cockroachdb-2                         1/1     Running     0          26m
cockroachdb-vcheck-26982384-2bstc     0/1     Completed   0          27m
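The automatic replacement of the deleted pod follows the general Kubernetes pattern of a controller that compares the desired state with the observed state and creates whatever is missing. The following Python sketch is a toy model of that idea only. It is not the operator’s or Kubernetes’ actual code, and the `reconcile` function is a hypothetical name. The pod naming scheme simply mirrors the `cockroachdb-0`, `cockroachdb-1`, `cockroachdb-2` names seen in the output above.

```python
from typing import Iterable, List


def reconcile(name: str, desired_replicas: int, running: Iterable[str]) -> List[str]:
    """Return the pod names that must be created to reach the desired state."""
    running_set = set(running)
    # Pods are expected to be named <name>-0, <name>-1, ... with stable ordinals.
    desired = [f"{name}-{i}" for i in range(desired_replicas)]
    return [pod for pod in desired if pod not in running_set]


# After cockroachdb-0 is deleted, only that pod is missing from the desired set.
to_create = reconcile("cockroachdb", 3, ["cockroachdb-1", "cockroachdb-2"])

# Once all three pods are running again, there is nothing left to do.
nothing_to_do = reconcile(
    "cockroachdb", 3, ["cockroachdb-0", "cockroachdb-1", "cockroachdb-2"]
)
```

In this model the replacement pod gets the same name as the one that was deleted, which matches the `cockroachdb-0` entry reappearing in the pod list with a fresh age.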
CockroachDB aims to make leveraging business data easy. Rather than waste time and energy troubleshooting database shortcomings, refocus that time, investment, and engineering into optimizing your company to become stronger on the market.
Published at DZone with permission of Kevin Taylor. See the original article here.