Despite advances in computing, faster processors and high-speed networks, relational database applications keep getting slower.
These performance problems are happening because of the rapid growth not only in the volume and velocity of data, but in its variety, complexity, and interconnectedness – that is, the data relationships inherent in any dataset.
Today’s tidal wave of data is densely connected and semi-structured, and its data model is highly volatile. And as the volume, velocity and variety of data increase, data relationships are growing even faster.
In this series on SQL strain, we’ll dive into the causes – and cures – of relational database performance issues, including the future-proof alternative of graph databases.
This week, we’ll discuss the five surest signs and symptoms it’s time to give up your relational database (RDBMS). But first, let’s quickly cover why relational databases keep you from extracting meaningful insights from your data.
The Problem with Relational Databases
Relational databases were designed for tabular data, with a consistent structure and a fixed schema. They work best for problems that are well defined at the outset.
However, attempting to answer questions about data relationships (e.g., a product recommendations engine, a social graph or a fraud detection solution) with a relational database involves numerous and expensive JOINs between database tables.
Despite their name, relational databases do not store relationships between data elements as first-class entities. They reconstruct them at query time by matching keys across tables, which makes them a poor fit for today’s highly connected data.
Relational databases also have a fixed schema, so they don’t adapt well to change. Database Administrators (DBAs) and developers face a steady stream of requests to meet changing business requirements, yet the schema changes those requests demand are problematic and take a great deal of time.
Many relational database applications are working fine within their limits. Some, however, may be showing significant signs of strain induced by the database, especially when an RDBMS is being used to handle highly connected data.
In a world where the only constant is flux and business data is more connected than ever, here are the five surest signs it’s time to abandon your SQL database:
1. A Large Number of JOINs
When your queries join many different tables, both query complexity and computing resource consumption explode, and query response times rise along with them.
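To make this concrete, here is a minimal sketch of a “customers who bought this also bought” query, run against an in-memory SQLite database from Python. The tables, column names and sample rows are all hypothetical and chosen only for illustration; the point is that even this small relationship question needs five JOINs.

```python
import sqlite3

def build_demo_db():
    """Create a tiny, hypothetical store schema in memory."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE products  (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER REFERENCES customers(id)
        );
        CREATE TABLE order_items (
            order_id   INTEGER REFERENCES orders(id),
            product_id INTEGER REFERENCES products(id)
        );
        INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
        INSERT INTO products  VALUES (1, 'Tent'), (2, 'Stove'), (3, 'Lantern');
        INSERT INTO orders    VALUES (1, 1), (2, 2), (3, 3);
        INSERT INTO order_items VALUES (1, 1), (1, 2), (2, 1), (2, 3), (3, 2);
    """)
    return conn

# Every hop (product -> order item -> order -> customer's orders ->
# their items -> other products) costs one more JOIN.
ALSO_BOUGHT = """
SELECT DISTINCT other.name
FROM products AS seed
JOIN order_items AS seed_item   ON seed_item.product_id = seed.id
JOIN orders      AS seed_order  ON seed_order.id = seed_item.order_id
JOIN orders      AS other_order ON other_order.customer_id = seed_order.customer_id
JOIN order_items AS other_item  ON other_item.order_id = other_order.id
JOIN products    AS other       ON other.id = other_item.product_id
WHERE seed.name = ? AND other.id <> seed.id
ORDER BY other.name
"""

def also_bought(conn, product_name):
    """Return names of products bought by customers who bought product_name."""
    return [row[0] for row in conn.execute(ALSO_BOUGHT, (product_name,))]

if __name__ == "__main__":
    print(also_bought(build_demo_db(), "Tent"))
```

Extending the question by one more hop (say, products bought by friends of those customers) would add further JOINs to the same statement.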
2. Numerous Self-JOINs (or Recursive JOINs)
Self-JOIN statements are common for hierarchy and tree representations of data, but traversing relationships by repeatedly joining tables to themselves is inefficient. In fact, some of the longest SQL queries in the world involve recursive JOINs.
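As a rough illustration, the sketch below walks a management hierarchy stored in a single hypothetical employees table (the schema and names are assumptions, not taken from any real system). A fixed-depth lookup needs one self-JOIN per level, while an arbitrary-depth lookup needs a recursive query that re-joins the table to itself at every step.

```python
import sqlite3

def build_org():
    """Create a hypothetical four-level reporting chain in memory."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE employees (
            id INTEGER PRIMARY KEY,
            name TEXT,
            manager_id INTEGER REFERENCES employees(id)
        );
        INSERT INTO employees VALUES
            (1, 'Dana', NULL),
            (2, 'Eli', 1),
            (3, 'Fay', 2),
            (4, 'Gus', 3);
    """)
    return conn

# Fixed depth: one self-JOIN per level of the hierarchy.
TWO_LEVELS_UP = """
SELECT boss2.name
FROM employees AS e
JOIN employees AS boss1 ON boss1.id = e.manager_id
JOIN employees AS boss2 ON boss2.id = boss1.manager_id
WHERE e.name = ?
"""

# Unknown depth: a recursive common table expression that joins
# employees back to the rows found so far until no manager is left.
MANAGEMENT_CHAIN = """
WITH RECURSIVE chain(id, name, manager_id, depth) AS (
    SELECT id, name, manager_id, 0 FROM employees WHERE name = ?
    UNION ALL
    SELECT m.id, m.name, m.manager_id, chain.depth + 1
    FROM employees AS m
    JOIN chain ON m.id = chain.manager_id
)
SELECT name FROM chain WHERE depth > 0 ORDER BY depth
"""

def two_levels_up(conn, name):
    """Return the manager's manager (assumes names are unique)."""
    return [row[0] for row in conn.execute(TWO_LEVELS_UP, (name,))]

def management_chain(conn, name):
    """Return every manager above the employee, nearest first."""
    return [row[0] for row in conn.execute(MANAGEMENT_CHAIN, (name,))]

if __name__ == "__main__":
    org = build_org()
    print(two_levels_up(org, "Gus"))
    print(management_chain(org, "Gus"))
```

Each additional level in the fixed-depth version means editing the query itself, which is one reason such statements grow so long.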
3. Frequent Schema Changes
At a time when business agility is at a premium, requests for changes are more often than not put off by DBAs because the schema of a relational database isn’t designed for frequent changes and pivots. Frequent schema changes indicate that the data or requirements are rapidly evolving, calling for a more flexible model.
4. Slow-Running Queries (Despite Extensive Tuning)
Your DBA might use every trick in the book to speed up query times, but many SQL queries still aren’t fast enough to support your application’s needs. In addition, denormalizing data models for performance can negatively impact data quality and update behavior.
5. Pre-Computing Your Results
Because queries run so slowly, many applications pre-compute their results using past data. However, this effectively means using yesterday’s data for queries that should be handled in real time today. Furthermore, your system usually must pre-compute 100% of your data, even if only 1-2% of it will be accessed at any given time.
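The staleness problem can be shown with a toy example. The sketch below is plain Python with made-up order data, not a real batch pipeline: a hypothetical nightly job builds the full “also bought” table for every product, and an order that arrives afterwards is invisible until the next run.

```python
from collections import defaultdict
from itertools import permutations

def precompute_also_bought(orders):
    """Batch job: build the 'also bought' table for every product at once."""
    table = defaultdict(set)
    for items in orders:
        # Record each ordered pair of distinct products in the same order.
        for a, b in permutations(set(items), 2):
            table[a].add(b)
    return table

# Hypothetical order history at the time of the nightly run.
orders = [["tent", "stove"], ["tent", "lantern"]]
nightly = precompute_also_bought(orders)  # covers all products, read or not

# A new order arrives during the day, after the batch job has finished.
orders.append(["tent", "cooler"])

stale = sorted(nightly["tent"])                         # what the app serves
fresh = sorted(precompute_also_bought(orders)["tent"])  # what is actually true

if __name__ == "__main__":
    print("served:", stale)
    print("actual:", fresh)
```

Note that the batch job computes an entry for every product in the history, even if the application only ever reads a handful of them.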
If you or your development team frequently suffer from any of these symptoms of SQL strain, then you’re probably trying to use a relational database to solve a graph problem. A graph database is purpose-built to store highly connected data, to flex as schemas change and to capture real-time insights from data relationships.