2 Billion MySQL Records

Handling 2 billion MySQL records is actually possible. Sure, you'll need a monster server, and zero indexes or foreign keys during import, but it is possible.

By Thomas Hansen · Mar. 30, 22 · Tutorial

Yesterday, a friend of mine, Gary Gray, sent me the following screenshot. He's got a database table with 2 billion records he intends to use Magic on, and he wanted to show it to me, thinking I'd probably be like, "That's cool." Gary works for Dialtone in South Africa and is used to handling a "monster amount of data". For the record, neither Gary nor I would encourage anyone to handle 2 billion records in MySQL, but as you can obviously see, it is possible.

[Screenshot from Gary]

This, of course, is a legacy system; if it were started today, Gary would obviously use ScyllaDB, Cassandra, or something similar. Simply counting the records in the table above takes 5 minutes in MySQL Workbench, so such sizes are not for the faint of heart. A database like this also implies a lot of inserts, and since every index must be updated on every insert, adding indexes becomes impossible in practice. The result is that every SELECT query with a WHERE clause implies a full table scan. However, it is possible.
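To make that concrete, here is a minimal sketch of what such queries look like; the events table and customer_id column are hypothetical stand-ins for Gary's actual schema:

    -- Counting all rows forces a scan of the clustered index;
    -- at 2 billion rows, this is the 5-minute query mentioned above.
    SELECT COUNT(*) FROM events;

    -- With no secondary index on customer_id, EXPLAIN reports
    -- type = ALL, i.e., a full table scan over ~2 billion rows.
    EXPLAIN SELECT * FROM events WHERE customer_id = 42;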

Another technique to use (if you can create database snapshots) is to import your huge database into a similar table structure to a "read-only database copy" without any indexes or foreign keys, and then add indexes and keys on your read-only copy after having imported your records. This allows you to do at least some selection into it afterward, resulting in query capacity into a read-only "copy" of your database. For the record, importing 2 billion records into MySQL took Gary 20 hours, even on a "monster machine" (as you can see literally was the name of his server in the above screenshot). I don't know how much time creating indexes would require, but I would assume something similar being 20+ hours for each index.
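A minimal sketch of that import-first, index-later workflow, assuming a hypothetical events table and a CSV dump of the source data:

    -- 1. Create the read-only copy with no secondary indexes or foreign keys.
    CREATE TABLE events_readonly (
      id          BIGINT UNSIGNED NOT NULL,
      customer_id BIGINT UNSIGNED NOT NULL,
      created_at  DATETIME NOT NULL,
      payload     TEXT,
      PRIMARY KEY (id)
    ) ENGINE = InnoDB;

    -- 2. Relax integrity checks for the duration of the bulk load.
    SET unique_checks = 0;
    SET foreign_key_checks = 0;

    -- 3. Bulk-load the dump; LOAD DATA is far faster than row-by-row INSERTs.
    LOAD DATA INFILE '/tmp/events.csv'
    INTO TABLE events_readonly
    FIELDS TERMINATED BY ',' ENCLOSED BY '"'
    LINES TERMINATED BY '\n';

    -- 4. Re-enable checks, and only now build the indexes you need.
    SET unique_checks = 1;
    SET foreign_key_checks = 1;
    ALTER TABLE events_readonly ADD INDEX idx_customer (customer_id);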

Once you have indexes on your read-only copy, you can actually do some basic CRUD reads on it, with WHERE clauses for filtering, and maybe even some basic paging and sorting, which, of course, was Gary's point. The idea is then to generate a Hyperlambda CRUD API, giving him a backend that lets him extract at least some meaningful data by adding filter conditions, exposed again through an automatically generated frontend.
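For illustration, this is the shape of the filtered, paged read such an API would issue; the names are again hypothetical, and keyset pagination (seeking past the last-seen primary key) is used here because LIMIT ... OFFSET paging degrades badly at this scale:

    -- Filtered, paged read against the indexed read-only copy.
    -- idx_customer from the previous step turns the WHERE clause into
    -- an index range scan instead of a full table scan.
    SELECT id, customer_id, created_at
    FROM events_readonly
    WHERE customer_id = 42
      AND id > 1000000          -- keyset cursor: last id of the previous page
    ORDER BY id
    LIMIT 25;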

For the record, handling 2 billion database records in MySQL is (probably) madness, but sometimes you have no choice, having inherited a legacy project that slowly built up over time. At the very least, the above gives you a method for doing some basic handling of your data. Interestingly, once he was done, Magic allowed Gary to extract his data just as rapidly as he could with MySQL Workbench, which I think is pretty cool. So, yeah:

That's cool, Gary. ;)
