
How to Scale a Distributed SQL Database to 1M+ Inserts Per Sec

Cut down on your query latency quickly and easily with this tutorial!

By Karthik Ranganathan · Oct. 28, 19 · Tutorial

There are a number of well-known experiments where eventually consistent NoSQL databases were scaled out to perform millions of inserts and queries. Here, we do the same using YSQL, YugabyteDB's PostgreSQL-compatible, strongly consistent, distributed SQL API.

We created a 100-node YugabyteDB cluster and ran single-row INSERT and SELECT workloads at high concurrency, each for an hour, while measuring the sustained performance (throughput and latency). This post details the results of this experiment and highlights the key aspects of the YugabyteDB architecture that make it a fit for such high-volume ingest workloads.

You may also like: How Data Sharding Works in a Distributed SQL Database.

Database Cluster Setup

While YugabyteDB can be deployed across multiple zones or regions, this benchmark focused on the aggregate performance of a 100-node cluster. Therefore, all 100 nodes were deployed in the AWS cloud, in the Oregon (us-west-2) region and in a single zone (us-west-2a). Each instance was of type c5.4xlarge (16 vCPUs). This information is summarized below.

  • Cluster name: MillionOps.
  • Cloud: Amazon Web Services.
  • Region: Oregon (us-west-2).
  • Zone: us-west-2a.
  • Number of nodes: 100.
  • Instance type: c5.4xlarge (16 vCPUs).
  • Disk on each node: 1TB EBS SSD (gp2).
  • Replication Factor: 3.
  • Consistency level: Strong consistency for both writes and reads.

Initial performance dashboard

Benchmark Setup

The benchmark application was an open-source Java program. The application’s database workload simply does multi-threaded, single-row INSERTs and SELECTs against a table that has a key and a value column. The size of each row was 64 bytes. The insert and select benchmarks were run for one hour each in order to measure the sustained throughput and latency.

This benchmark application was run on six instances with 8 cores each. Note that we could not consolidate onto fewer, more powerful instances because we were hitting the maximum network bandwidth on those instances. Each benchmark instance was prepared as shown below.

  • Java 8 was installed using the following commands.
$ sudo apt update
$ sudo apt install default-jre


  • The benchmark program was downloaded onto these machines as shown below.
$ wget -P target https://github.com/YugaByte/yb-sample-apps/releases/download/v1.2.0/yb-sample-apps.jar



  • The benchmark program is capable of taking a list of the servers that comprise the database cluster and performing random operations across them. To do this, we set up an environment variable with the comma-separated host:port entries of the 100 database servers, as shown below.
$ export YSQL_NODES=node-1-ip-addr:5433,node-2-ip-addr:5433,...
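
As an illustration of how a client might consume this list, the minimal sketch below (not the actual yb-sample-apps code) picks a random node from YSQL_NODES and opens a connection to it over the PostgreSQL wire protocol using the stock PostgreSQL JDBC driver. The database name and credentials shown are assumptions, not values taken from the benchmark.

import java.sql.Connection;
import java.sql.DriverManager;
import java.util.concurrent.ThreadLocalRandom;

public class RandomNodeConnector {
    // Pick one host:port entry at random so client load spreads across all 100 nodes.
    public static Connection connect() throws Exception {
        String[] nodes = System.getenv("YSQL_NODES").split(",");
        String node = nodes[ThreadLocalRandom.current().nextInt(nodes.length)];
        // YSQL speaks the PostgreSQL wire protocol on port 5433, so the standard
        // PostgreSQL JDBC driver works; database and user here are assumed defaults.
        return DriverManager.getConnection("jdbc:postgresql://" + node + "/postgres", "postgres", "");
    }
}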


Benchmarking the INSERT Workload

The first step was to run an INSERT benchmark on this 100-node cluster. The following command was run on each of the benchmark instances.

java -jar ~/yb-sample-apps-no-table-drop.jar    \
                   --workload SqlInserts        \
                   --nodes $YSQL_NODES          \
                   --num_unique_keys 5000000000 \
                   --num_threads_write 400      \
                   --num_threads_read 0         \
                   --uuid 00000000-0000-0000-0000-00000000000n


The table on which the benchmark was run had the following simple schema.

CREATE TABLE table_name (k varchar PRIMARY KEY, v varchar);


This workload performed a number of INSERTs using prepared statements, as shown below.

INSERT INTO table_name (k, v) VALUES (?, ?);
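
Conceptually, each of the 400 writer threads in a benchmark program runs a loop like the hedged sketch below. The class name, key format, and value format are illustrative only; the actual implementation lives in the yb-sample-apps repository linked above.

import java.sql.Connection;
import java.sql.PreparedStatement;

public class InsertWorker implements Runnable {
    private final Connection conn;      // one YSQL connection per writer thread
    private final String uuidPrefix;    // per-program uuid, varied via the trailing n
    private final long startKey;
    private final long keyCount;

    public InsertWorker(Connection conn, String uuidPrefix, long startKey, long keyCount) {
        this.conn = conn;
        this.uuidPrefix = uuidPrefix;
        this.startKey = startKey;
        this.keyCount = keyCount;
    }

    @Override
    public void run() {
        String sql = "INSERT INTO table_name (k, v) VALUES (?, ?)";
        try (PreparedStatement stmt = conn.prepareStatement(sql)) {
            for (long i = startKey; i < startKey + keyCount; i++) {
                stmt.setString(1, uuidPrefix + ":" + i);  // uuid prefix keeps keys unique per program
                stmt.setString(2, "val:" + i);            // key plus value is roughly 64 bytes
                stmt.executeUpdate();                     // single-row, strongly consistent write
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}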


Note a few points about the benchmark setup.

  • Each benchmark program writes a unique set of rows. The uuid parameter forms a prefix of the row key. It is set differently (by varying the value of n from 1 to 6) on each benchmark instance to ensure that each instance writes separate keys.
  • A total of 30 billion unique rows would be inserted upon completion. Each benchmark program proceeds to write out 5 billion keys, and there are six such programs running in parallel.
  • There are 2,400 concurrent clients performing inserts. Each benchmark program uses 400 write threads, and there are six such programs running concurrently.

Performance specs


A screenshot of the write throughput on this cluster while the benchmark was in progress is shown below. The write throughput was 1.26 million inserts/sec.

Total YSQL Ops/sec

The corresponding average insert latency across all 100 nodes was 1.66 ms, as shown below. Note that each insert is replicated three ways to make the cluster fault-tolerant.

YSQL Op Latency

The average CPU usage across the nodes in the cluster was about 78%, as shown in the graph below.

CPU Usage

Benchmarking the SELECT Workload

The following command was run for the SELECT workload.

java -jar ~/yb-sample-apps-no-table-drop.jar \
               --workload SqlInserts         \
               --nodes $YSQL_NODES           \
               --max_written_key 500000000   \
               --num_writes 0                \
               --num_reads 50000000000       \
               --num_threads_write 0         \
               --num_threads_read 400        \
               --read_only                   \
               --uuid 00000000-0000-0000-0000-00000000000n


The SELECT workload looked up random rows in the table that the INSERT workload (described in the previous section) had populated. Each SELECT query was done using prepared statements, as shown below.

SELECT * FROM table_name WHERE k=?;
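
Each of the 400 reader threads conceptually does something like the sketch below: it picks a random key index below max_written_key, rebuilds the uuid-prefixed key, and looks it up with the prepared statement. Again, the class name and key format are illustrative assumptions, not the benchmark program's exact code.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.util.concurrent.ThreadLocalRandom;

public class SelectWorker {
    // Look up one random row from the range already populated by the INSERT workload.
    static String readRandomRow(Connection conn, String uuidPrefix, long maxWrittenKey) throws Exception {
        long i = ThreadLocalRandom.current().nextLong(maxWrittenKey);   // e.g. maxWrittenKey = 500_000_000L
        try (PreparedStatement stmt =
                 conn.prepareStatement("SELECT * FROM table_name WHERE k = ?")) {
            stmt.setString(1, uuidPrefix + ":" + i);
            try (ResultSet rs = stmt.executeQuery()) {
                return rs.next() ? rs.getString("v") : null;            // strongly consistent read from the leader
            }
        }
    }
}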


The main points worth noting in this workload are:

  • There are 2,400 concurrent clients issuing selects. Each benchmark program uses 400 threads, and there are six programs running in parallel.
  • Each read operation randomly selects one row from a total of 3 billion rows. Each benchmark program randomly queries one row from a total of 500 million rows, and there are six concurrent programs.

Query performance specs

Once again, a screenshot of the read throughput on this cluster while the benchmark was in progress is shown below. The read throughput was 2.8 million selects/sec. YugabyteDB reads are strongly consistent by default and that is the setting used for this benchmark. Additional throughput can be achieved by simply allowing timeline-consistent reads from follower replicas (see final section of this post).

Total YSQL Ops/sec

The corresponding average select latency across all 100 nodes was 0.56 ms, as shown below.

YSQL Op Latency

The average CPU usage across the nodes in the cluster was about 64%, as shown in the graph below.

CPU Usage

Architecture for Horizontal Write Scaling

The architecture of a YugabyteDB cluster is shown in the figure below. The YB-TServer service is responsible for managing the data in the cluster, while the YB-Master service manages the system configuration of the cluster. YB-TServer automatically shards every table into a number of tablets (also known as shards).

Given the Replication Factor (RF) of three for the cluster, each tablet is represented as a Raft group of three replicas, with one replica acting as the leader and the other two as followers. In a 100-node cluster, these three replicas are automatically stored on exactly three (out of 100) different nodes, where each node can be thought of as an independent fault domain.

YB-Master automatically balances the total number of leader and follower replicas on all the nodes so that no single node becomes a bottleneck and every node contributes its fair share to incoming client requests. The end result is strong write consistency (by ensuring writes are committed at a majority of replicas) and tunable read consistency (by serving strong reads from leaders and timeline-consistent reads from followers) irrespective of the number of nodes in the cluster.

Data architecture

For those new to Raft, the simplest explanation is that it is a protocol with which a cluster of nodes can agree on values. It is arguably the most popular distributed consensus protocol in use today. Business-critical, cloud-native systems like etcd (the configuration store for Kubernetes) and Consul (HashiCorp's popular service discovery solution) are built on Raft as a foundation.
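
As a purely conceptual illustration (this is not YugabyteDB's actual code), hash sharding and majority commit can be pictured as follows: each key hashes into a fixed hash space that is split into contiguous ranges, one per tablet, and a write to a tablet is committed once a majority of its RF=3 Raft replicas have accepted it.

public class ConsensusSketch {
    // Map a key to one of numTablets tablets by splitting a 16-bit hash space
    // into contiguous ranges (a stand-in for the real partitioning scheme).
    static int tabletFor(String key, int numTablets) {
        int hash = Math.floorMod(key.hashCode(), 65536);
        return hash * numTablets / 65536;
    }

    // With RF=3, a write commits once 2 of the 3 replicas (a Raft majority) acknowledge it.
    static boolean isCommitted(int acks, int replicationFactor) {
        return acks >= replicationFactor / 2 + 1;
    }
}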


Further Reading

  • 10 Easy Steps to a Complete Understanding of SQL
  • Graph-Powered Search: Neo4j & Elasticsearch

Published at DZone with permission of Karthik Ranganathan.
