DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Related

  • Why SQL Isn’t the Right Fit for Graph Databases
  • Graph-Oriented Solutions Enhancing Flexibility Over Mutant Requirements
  • How Graph Databases Fight Organized Crime
  • Distribution and Partitioning in Graph Databases

Trending

  • Securing the AI Host: Spring AI MCP Server Communication With API Keys
  • When One MVP Is Really Four Systems: A Better Way to Plan Multi-Role Apps
  • Jakarta EE 12: Entering the Data Age of Enterprise Java
  • The Big Data Architecture Blueprint: Core Storage, Integration, and Governance Patterns
  1. DZone
  2. Data Engineering
  3. Databases
  4. Performance of Graph vs. Relational Databases

Performance of Graph vs. Relational Databases

By 
Josh Adell user avatar
Josh Adell
·
Feb. 04, 13 · Interview
Likes (2)
Comment
Save
Tweet
Share
35.9K Views

Join the DZone community and get the full member experience.

Join For Free

Curator's Note:  Here's a post from back in 2011 that offers a general look at the difference in performance of graph databases and relational databases.


A few weeks ago, Emil Eifrem, CEO of Neo Technology gave a webinar introduction to graph databases. I watched it as a lead up to my own presentation on graph databases and Neo4j.


Right around the 54 minute mark, Emil talks about a very interesting experiment showing the performance difference between a relational database and a graph database for a certain type of problem called "arbitrary path query", specifically, given 1,000 users with an average of 50 "friend" relationships each, determine if one person is connected to another in 4 or fewer hops.


Against a popular open-source relational database, the query took around 2,000 ms. For a graph database, the same determination took 2 ms. So the graph database was 1,000 times faster for this particular use case.


Not satisfied with that, they then decided to run the same experiment with 1,000,000 users. The graph database took 2 ms. They stopped the relational database after several days of waiting for results.


I showed this clip at my presentation to a few people who stuck around afterwards, and we tried to figure out why the graph database had the same performance with 1,000 times the data, while the relational database became unusably slow. The answer has to do with the way in which each type of database searches information.


Relational databases search all of the data looking for anything that meets the search criteria. The larger the set of data, the longer it takes to find matches, because the database has to examine everything in the collection.


Here's an example: let's assume there is a table with "friend" relationships:

> SELECT * FROM friends;
+-------------+--------------+
| user_id     | friend_id    |
+-------------+--------------+
| 1           | 2            |
| 1           | 3            |
| 1           | 4            |
| 2           | 5            |
| 2           | 6            |
| 2           | 7            |
| 3           | 8            |
| 3           | 9            |
| 3           | 10           |
+-------------+--------------+

In order to see if user "9" is connected to user "2", the database has to find all the friends of user "9", and see if user "2" is in that list. If not, find all of their friends, and then see if user "2" is in that list. The database has to scan the entire table each time. This means that if you double the number of rows in the table, you've doubled the amount of data to search, and thus doubled the amount of time it takes to find what you are looking for. (Even with indexing, it still has to find the values in the index tree, which involves traversing that tree. The index tree grows larger with each new record, meaning the time it takes to traverse grows larger as well. And for each search, you always start at the root of the tree.)


Each new batch of friends to look at requires an entirely new scan of the table/index. So more records leads to more search time.


Conversely, a graph database looks only at records that are directly connected to other records. If it is given a limit on how many "hops" it is allowed to make, it can ignore everything more than that number of hops away.

Only the blue records are ever seen during the search. And since a graph traversal "remembers" where it is at any time, it never has to start from the beginning, only from its last known position.


You could add another ring of records around the outside, or add a thousand more rings of records outside, but the search would never need to look at them because they are more steps away than the limit. The only way to increase the number of records searched (and thereby decrease performance) is to add more records within the 2-step limit, or add more relationships between the existing records.


So the reason why having 1,000 vs. 1,000,000 records causes such a stark difference between a relational and a graph database is that relational database performance decreases in relation to the number of records in the table, while graph database performance decreases in relation to the number of connections between the records.

Database Relational database Graph (Unix)

Published at DZone with permission of Josh Adell. See the original article here.

Opinions expressed by DZone contributors are their own.

Related

  • Why SQL Isn’t the Right Fit for Graph Databases
  • Graph-Oriented Solutions Enhancing Flexibility Over Mutant Requirements
  • How Graph Databases Fight Organized Crime
  • Distribution and Partitioning in Graph Databases

Partner Resources

×

Comments

The likes didn't load as expected. Please refresh the page and try again.

  • RSS
  • X
  • Facebook

ABOUT US

  • About DZone
  • Support and feedback
  • Community research

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 215
  • Nashville, TN 37211
  • [email protected]

Let's be friends:

  • RSS
  • X
  • Facebook