Graph Database vs. Relational Database
Learn the main differences between a graph database and a relational database, the use cases best suited to each database type, and their strengths and weaknesses.
At the very beginning of most development endeavors lies an important question: What database to choose? With so many database technologies available today, it’s no wonder many developers don’t have the time or energy to research new ones. If you are one of those developers and you aren’t very familiar with graph databases in general, you’ve come to the right place!
In this article, you will learn about the main differences between a graph database and a relational database, what kinds of use cases are best suited for each database type, and what their strengths and weaknesses are.
How Does a Graph Database Differ From a Relational Database?
The main difference is the way relationships between entities are stored. In a graph database, relationships are stored at the individual record level, while a relational database uses predefined structures, that is, table definitions.
Relational databases tend to be faster when handling huge numbers of records because the structure of the data is known ahead of time. This also leads to a smaller memory footprint. Graph databases don’t have a predefined structure for the data, which is why each record has to be examined individually during a query to determine the structure of the data.
The Graph Data Model
First things first! To decide if you need a graph database, you need to be familiar with the basic terminology. The fundamental components of a graph database are:
- Nodes: the main entities in a graph. You can think of them as rows in a relational database.
- Relationships: the connections between those entities. These would be foreign keys in a relational database.
- Labels: attributes that group similar nodes together.
- Properties: key/value pairs stored within nodes or relationships.
In a typical social network graph, the nodes represent people in different social groups and their connections with one another. Every person is represented with a node that’s labeled as Person, and these nodes contain properties that describe the person. The relationships between people are of the type FRIENDS_WITH and contain a yearsOfFriendship property to specify the duration of the friendship connection. Each person is assigned a location through :LIVES_IN relationships with nodes labeled Location.
While this is a very simple example, it concisely demonstrates the power and benefits of using a graph database. For example, if you wanted to add different properties to some of the nodes, you would be able to. Unlike a table, where you need to add a column for each additional attribute, here you can be much more flexible with the data structure and types. A property that was meant to be a string can be used as an integer without any constraints. To be fair, this can cause problems for you in the long run, but you can do it if need be.
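To make the terminology concrete, here is a minimal in-memory sketch of the social network graph in Python. It is not how any particular graph database stores data. The class names, the helper function, and the sample people and values (Ana, Ben, Zagreb, the nickname property) are all assumptions made up for illustration; only the labels and relationship types come from the example above.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Node:
    node_id: int
    labels: set[str]
    # Properties are free-form key/value pairs; nodes need not share the same keys.
    properties: dict[str, Any] = field(default_factory=dict)


@dataclass
class Relationship:
    start: int      # node_id of the source node
    end: int        # node_id of the target node
    rel_type: str
    properties: dict[str, Any] = field(default_factory=dict)


# Hypothetical sample data, invented for this sketch.
nodes = {
    1: Node(1, {"Person"}, {"name": "Ana"}),
    2: Node(2, {"Person"}, {"name": "Ben", "nickname": "B"}),  # extra property on one node only
    3: Node(3, {"Location"}, {"name": "Zagreb"}),
}
relationships = [
    Relationship(1, 2, "FRIENDS_WITH", {"yearsOfFriendship": 5}),
    Relationship(1, 3, "LIVES_IN"),
    Relationship(2, 3, "LIVES_IN"),
]


def neighbors(node_id: int, rel_type: str) -> list[Node]:
    """Return the nodes reached from node_id over outgoing relationships of rel_type."""
    return [
        nodes[r.end]
        for r in relationships
        if r.start == node_id and r.rel_type == rel_type
    ]


friends_of_ana = [n.properties["name"] for n in neighbors(1, "FRIENDS_WITH")]
```

Note that the second Person node carries a nickname property that the first one lacks, and nothing else had to change to allow it.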
The Relational Data Model
A relational database requires a predefined and carefully modeled set of tables. We create one for each entity and add the needed attributes as columns. While this is also pretty straightforward, it’s much more rigid than the graph schema and not as extensible.
For example, each person is connected to other people through friendships, and to model this relationship, we have to add another table. If there were different kinds of connections (related to, no longer friends…) we would have to change the schema accordingly. A relational database isn’t suited for this specific use case because the focus isn’t on the data itself but rather on the relationships within it.
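As a rough illustration of that extra table, the sketch below uses Python’s built-in sqlite3 module with an in-memory database. The table names, column names, and sample rows are assumptions chosen for this example, and friendships are stored in one direction only to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    -- The additional table needed to model friendships between people.
    CREATE TABLE friendship (
        person_id           INTEGER NOT NULL REFERENCES person(id),
        friend_id           INTEGER NOT NULL REFERENCES person(id),
        years_of_friendship INTEGER,
        PRIMARY KEY (person_id, friend_id)
    );
""")

# Hypothetical sample data, invented for this sketch.
conn.executemany(
    "INSERT INTO person (id, name) VALUES (?, ?)",
    [(1, "Ana"), (2, "Ben"), (3, "Cara")],
)
conn.executemany(
    "INSERT INTO friendship VALUES (?, ?, ?)",
    [(1, 2, 5), (2, 3, 2)],
)

# Friends of Ana's friends: every additional hop adds another JOIN on friendship.
rows = conn.execute(
    """
    SELECT p2.name
    FROM person AS p
    JOIN friendship AS f1 ON f1.person_id = p.id
    JOIN friendship AS f2 ON f2.person_id = f1.friend_id
    JOIN person AS p2 ON p2.id = f2.friend_id
    WHERE p.name = ?
    """,
    ("Ana",),
).fetchall()
friends_of_friends = [name for (name,) in rows]
```

A new kind of connection would typically mean either another table like friendship or an extra column describing the connection type, which is the schema change the paragraph above refers to.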
When to Use a Graph Database?
There are always two sides to every story and graph databases aren’t a perfect solution for every problem. Far from it. There are a lot of use cases for which you should stick with relational databases or maybe search for other alternatives aside from graph databases.
Here are three simple questions you can ask yourself to decide if there are any benefits to using a graph database.
1. Is My Data Highly Connected?
Graph solutions are focused on highly connected data that comes with an intrinsic need for relationship analysis. If the connections within the data are not the primary focus and the data is of a transactional nature, then a graph database is probably not the best fit. Sometimes it’s just important to store the data and complex analysis isn’t needed.
In our example, if we were to store only people without their relationships, then we would end up with a sparsely connected graph. Yes, a number of simpler graphs would remain because of the connections between Person and Location nodes, but this degree of connectedness and the consistency of the data structure are well suited for a relational database.
2. Is Retrieving the Data More Important to Me Than Storing It?
Graph databases are optimized for data retrieval and if you choose one, then you should probably use this functionality often. If your focus is on writing to the database and you’re not concerned with analyzing the data, then a graph database wouldn’t be an appropriate solution. A good rule of thumb is, if you don’t intend to use JOIN operations in your queries, then a graph is not a must-have.
In our example, if you only store data for the sake of logging interactions and you don’t intend to analyze it later on, then a graph database isn’t particularly helpful. However, if there are numerous connections within the data being stored, then a graph might be worth considering.
3. Does My Data Model Change Often?
If your data model is inconsistent and demands frequent changes, then using a graph database might be the way to go. Because graph databases are more about the data itself than the schema structure, they allow a degree of flexibility.
On the other hand, there are often benefits in having a predefined and consistent table that’s easy to understand. Developers are comfortable with and used to relational databases, and that fact cannot be downplayed.
For example, if you are storing personal information such as names, dates of birth, locations… and don’t expect many new fields or a change in data types, relational databases are the go-to solution. On the other hand, a graph database could be useful if:
- Additional attributes could be added at some point,
- Not all entities will have all the attributes in the table, and
- The attribute types are not strictly defined.
In our example, the attributes and relationships of a person could be set in stone due to a specific use case, and no further changes may be needed. In that case, the flexibility of a graph database brings little benefit.
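The difference in flexibility can be sketched in a few lines of Python. The relational half uses the built-in sqlite3 module, and the graph half is just a dictionary standing in for node properties. The person table, the nickname attribute, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO person (id, name) VALUES (1, 'Ana'), (2, 'Ben')")

# Relational: a new attribute for one person changes the schema for every row.
conn.execute("ALTER TABLE person ADD COLUMN nickname TEXT")
conn.execute("UPDATE person SET nickname = 'B' WHERE id = 2")

columns = [row[1] for row in conn.execute("PRAGMA table_info(person)")]
nicknames = conn.execute("SELECT name, nickname FROM person ORDER BY id").fetchall()

# Graph-style records: only the node that needs the property carries it.
graph_nodes = {1: {"name": "Ana"}, 2: {"name": "Ben"}}
graph_nodes[2]["nickname"] = "B"
```

After the ALTER TABLE, every row has a nickname column, and rows without a value hold NULL. In the dictionary version, the first node simply has no such key.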
When Not to Use a Graph Database?
1. When Queries Don’t Include Specific Starting Points
If you need to run frequent table scans and searches for data that fits defined categories, a graph database wouldn’t be very helpful. Graph databases are well equipped to traverse relationships when you have a specific starting point or at least a set of points to start with (nodes with the same label). They are not suited for traversing the whole graph often. While it’s possible to run such queries, other storage solutions may be more optimized for such bulk scans.
If the majority of the queries in our example include searches by property values over the entire network, then a graph database wouldn’t be the right fit.
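The contrast between the two query shapes can be sketched in plain Python. This is a toy model rather than a real graph engine, and the adjacency list, the city property, and both function names are assumptions for illustration. A traversal from a known person only touches the nodes it can reach, while a search by property value has to look at every node.

```python
from collections import deque

# Hypothetical adjacency list: person -> friends.
friends = {
    "Ana": ["Ben"],
    "Ben": ["Ana", "Cara"],
    "Cara": ["Ben"],
    "Dan": ["Eve"],
    "Eve": ["Dan"],
}
# Hypothetical node property used for the scan example.
city = {"Ana": "Zagreb", "Ben": "Split", "Cara": "Zagreb", "Dan": "Zagreb", "Eve": "Split"}


def reachable_within(start: str, max_hops: int) -> set[str]:
    """Breadth-first traversal from a known starting node, limited to max_hops."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for friend in friends[person]:
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, hops + 1))
    return seen - {start}


def people_in(target_city: str) -> list[str]:
    """Property search with no starting point: every node has to be checked."""
    return [person for person, c in city.items() if c == target_city]
```

Here reachable_within("Ana", 2) never visits Dan or Eve, whereas people_in("Zagreb") iterates over all five people regardless of how they are connected.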
2. When You Need Key/Value Storage
Very often, databases are used to look up information stored in key/value pairs. When you have a known key and need to retrieve the data associated with it, a graph database is not particularly useful.
For example, if the sole purpose of your database is storing a user’s personal information and retrieving it by name or ID, then refrain from using a graph. But if there were other entities involved (visited locations for example), and a large number of connections is required to map them to users, then a graph database could bring performance benefits. A good rule of thumb is, if most of your queries return a single node via a simple identifier (key), then just skip graph databases.
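For comparison, this is roughly what the key/value access pattern looks like in Python, with a dictionary standing in for a key/value store. The user IDs, fields, and the get_user helper are hypothetical.

```python
from typing import Optional

# Hypothetical user records keyed by ID.
users_by_id = {
    101: {"name": "Ana", "email": "ana@example.com"},
    102: {"name": "Ben", "email": "ben@example.com"},
}


def get_user(user_id: int) -> Optional[dict]:
    """Single lookup by a known key; no relationships are traversed."""
    return users_by_id.get(user_id)
```

If nearly every query looks like get_user(101), there are no relationships for a graph database to take advantage of.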
3. When You Need to Store Large Chunks of Information
If the entities in your model have very large attributes like BLOBs, CLOBs, long texts… then graph databases aren’t the best solution. While you can store those objects as nodes and link them to other nodes to utilize the power of traversing relationships, sometimes it just makes more sense to store them directly with the entities they are connected to.
In our example, if each person had a long biography that needed to be included in the same database, a graph wouldn’t be the answer. However, if you needed to connect these biographies to other entities in the database (for example, people who are mentioned in them), then the strengths of a graph database could outweigh the limitations.
Is a Graph Database Worth It?
It very much depends on your specific use case. Graph databases are a very powerful tool when it comes to handling interconnected data. If you have a hard time deciding, then go through the aforementioned requirements and check if any of them apply to your scenario.
In this article, you have gained some insights into the fundamental differences between relational and graph databases.
Published at DZone with permission of Ivan Despot. See the original article here.