Understanding Graph Databases

Why are graph databases important to you?

By Steve Sarsfield · Nov. 20, 18 · Opinion

What Enterprise Architects Should Know

The descriptions of graph databases that you find when you Google the term are mostly academic. Many of them talk about the seven bridges of Königsberg or about Berners-Lee, the inventor of the World Wide Web. The theories and visions are fine, but I still think it’s important to lead with the relevance. Why are graph databases important to you?

RDBMS vs. Graphs

Imagine the data that a local restaurant chain stores. If you were keeping track, you’d store customer information in one database table, the items you offer in another, and the sales you’ve made in a third. This is fine when you want to understand what you sold, order inventory, and know who your best customers are. What’s missing is the connective tissue: the connections between those items, along with database functions that let you make the most of them.

A graph database stores the same sort of data but can also store the links between things. John buys a lot of Pepsi. Jack is married to Valerie and buys different drinks. I don’t have to run JOINs to understand how I should market to each individual customer. I can see the relationships in the data without having to form a hypothesis and test it.
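
As a rough illustration of links stored alongside the data, here is a minimal Python sketch that keeps relationships as an edge list and indexes them by subject. The drink names for Jack and the helper names `build_graph` and `related` are assumptions made for the example, and this is not how any particular graph database is implemented.

```python
from collections import defaultdict

# Hypothetical edge list: (subject, relationship, object).
# Jack's drinks are invented for illustration.
EDGES = [
    ("John", "buys", "Pepsi"),
    ("John", "buys", "Pepsi"),
    ("John", "buys", "Pepsi"),
    ("Jack", "married_to", "Valerie"),
    ("Jack", "buys", "Iced Tea"),
    ("Jack", "buys", "Lemonade"),
]


def build_graph(edges):
    """Index edges by subject so a node's relationships are one lookup away."""
    graph = defaultdict(list)
    for subject, relation, obj in edges:
        graph[subject].append((relation, obj))
    return graph


def related(graph, subject, relation):
    """Return every object linked to `subject` by `relation`."""
    return [obj for rel, obj in graph.get(subject, []) if rel == relation]


graph = build_graph(EDGES)
print(related(graph, "John", "buys"))        # ['Pepsi', 'Pepsi', 'Pepsi']
print(related(graph, "Jack", "married_to"))  # ['Valerie']
```

The relationship is read directly from the stored edge, with no key matching between separate tables.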

This connected information layer does a lot for you. It’s not just about buyer intent. It helps in many use cases, and it is especially helpful in machine learning or whenever you want the machine to do the analysis or inferencing (see the table below).

Examples of applications for the semantic layer

| Semantic Info Stored | Example | Use Case |
| Ownership | Susan owns a Honda. Who else owns a Honda? | Buyer Intent |
| Interest | Steve is interested in Football. Who else? | Buyer Intent |
| Designed by | Frank Lloyd Wright designed the Guggenheim. What else? | Knowledge Graph |
| <classification> | Guggenheim is a museum. What are other museums? | Knowledge Graph |
| Connections | Via port, e.g. server1 connected via port 8080 to server2. Did this happen more than expected? | Network/IT operations |
| Is associated with | Gene is associated with cancer. What other genes? | Life Sciences |
| Many more | | |
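
The questions in the table can be read as pattern matches over subject-predicate-object triples. The sketch below is one way to picture that in plain Python. The `match` helper and the predicate names are assumptions, and the entries for Maria, Fallingwater, and the Louvre were added only so the queries return more than one result.

```python
# Facts as (subject, predicate, object) triples. Entries marked "added" are
# illustrative and do not come from the table above.
TRIPLES = {
    ("Susan", "owns", "Honda"),
    ("Maria", "owns", "Honda"),                          # added
    ("Steve", "interested_in", "Football"),
    ("Frank Lloyd Wright", "designed", "Guggenheim"),
    ("Frank Lloyd Wright", "designed", "Fallingwater"),  # added
    ("Guggenheim", "is_a", "museum"),
    ("Louvre", "is_a", "museum"),                        # added
}


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    )


# "Susan owns a Honda. Who else owns a Honda?"
who_else = [s for s, _, _ in match(TRIPLES, predicate="owns", obj="Honda")
            if s != "Susan"]

# "Guggenheim is a museum. What are other museums?"
museums = [s for s, _, _ in match(TRIPLES, predicate="is_a", obj="museum")]

print(who_else)  # ['Maria']
print(museums)   # ['Guggenheim', 'Louvre']
```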

Traditional databases are designed around tables rather than linked data, so SQL alone is not a natural fit. This has given rise to query languages for linked datasets and graph data structures, such as SPARQL, Gremlin, and Cypher, to name a few. A major difference is the analytical functions you need to act upon the linked data. If you want to find the most popular time to buy a certain product on your website, or rank the popularity of an item, for example, there’s a new syntax for that. You need to learn the language of connected data to make the most of it.
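
To give a feel for what ranking the popularity of an item means on linked data, the following Python sketch counts the incoming purchase edges for each item (its in-degree). In a graph database this would be written in a language such as SPARQL, Cypher, or Gremlin. The purchase data and the `rank_items` helper are assumptions for illustration.

```python
from collections import Counter

# Hypothetical "bought" edges: (customer, item).
PURCHASES = [
    ("John", "Pepsi"),
    ("Jack", "Iced Tea"),
    ("Valerie", "Pepsi"),
    ("Susan", "Pepsi"),
    ("Steve", "Iced Tea"),
    ("Susan", "Lemonade"),
]


def rank_items(purchases):
    """Rank items by in-degree, i.e. how many purchase edges point at them."""
    in_degree = Counter(item for _, item in purchases)
    return in_degree.most_common()


print(rank_items(PURCHASES))
# [('Pepsi', 3), ('Iced Tea', 2), ('Lemonade', 1)]
```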

Can’t You Do That With an RDBMS?

Yes, it is possible to create these linkages in a traditional relational database management system (RDBMS). To do so, however, database administrators have to maintain unique keys and reconstruct relationships with JOINs. In a graph database, each entity and its relationships, known as subjects and predicates, are stored explicitly. There’s no need to reconstruct the connections.
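
For comparison, here is roughly what the relational version looks like, sketched with Python’s built-in `sqlite3` module. The table names, column names, and rows are assumptions based on the restaurant example earlier in the article.

```python
import sqlite3

# In-memory database with the three tables from the restaurant example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE items     (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE sales     (customer_id INTEGER REFERENCES customers(id),
                            item_id     INTEGER REFERENCES items(id));
    INSERT INTO customers VALUES (1, 'John'), (2, 'Jack');
    INSERT INTO items     VALUES (1, 'Pepsi'), (2, 'Iced Tea');
    INSERT INTO sales     VALUES (1, 1), (1, 1), (2, 2);
""")

# The "who buys what" relationship is not stored as such. It has to be
# reconstructed by joining on the unique keys.
rows = conn.execute("""
    SELECT c.name, i.name, COUNT(*) AS times_bought
    FROM sales AS s
    JOIN customers AS c ON c.id = s.customer_id
    JOIN items     AS i ON i.id = s.item_id
    GROUP BY c.name, i.name
    ORDER BY c.name
""").fetchall()

print(rows)  # [('Jack', 'Iced Tea', 1), ('John', 'Pepsi', 2)]
```

Every question about a relationship goes through the keys in the `sales` table, which is the maintenance and JOIN work described above.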

Inference is another example. If you have defined that Mary is the mother of Zoe, a graph database that supports inferencing can conclude that Zoe is the daughter of Mary. You do not necessarily need to define both relationships explicitly. By comparison, a relational database only knows what has been explicitly defined. This inferencing capability has clear value when looking at interests, households, and communities.
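
One simple way to picture this kind of inference is a rule that declares one relationship to be the inverse of another. The Python sketch below applies such a rule to stored facts. The `INVERSE_OF` table and the `infer_inverses` helper are assumptions for illustration, and the sketch uses a gender-neutral `child_of` relationship, since concluding "daughter" would also require a fact about Zoe herself.

```python
# Explicitly stored fact.
FACTS = {("Mary", "mother_of", "Zoe")}

# Hypothetical rule table: each predicate maps to its inverse.
INVERSE_OF = {"mother_of": "child_of"}


def infer_inverses(facts, inverse_of):
    """Return the stored facts plus every fact implied by an inverse rule."""
    inferred = set(facts)
    for subject, predicate, obj in facts:
        if predicate in inverse_of:
            inferred.add((obj, inverse_of[predicate], subject))
    return inferred


all_facts = infer_inverses(FACTS, INVERSE_OF)
print(("Zoe", "child_of", "Mary") in all_facts)  # True
```

Only one direction was stored. The other is derived from the rule when it is needed.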

Caveat: Graph Databases Have Specialties

Like a traditional RDBMS, a graph database can be either transactional or analytical. Choose your focus when you choose your graph database. For example, the popular Neo4j is a transactional (OLTP) graph database, while AnzoGraph is an analytical (OLAP) graph database. The distinction may seem subtle when you first try graph databases. However, you may need different engines for quick queries that touch single entities (e.g. What car does Susan own?) and for analytical queries that scan the whole database (e.g. What is the average price paid for a car by people like Susan?). Graph OLAP databases are becoming very important as machine learning and AI grow, since a number of machine learning algorithms are inherently graph algorithms and run more efficiently on a graph OLAP database than on an RDBMS.
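
The difference between the two query shapes can be sketched in a few lines of Python. All data here is hypothetical, including the prices, and "people like Susan" is assumed to mean people who share at least one interest with her. The point is only that the first query touches one entity while the second has to visit every person.

```python
from statistics import mean

# Hypothetical data, keyed by person.
OWNS = {"Susan": "Honda", "Steve": "Ford", "Maria": "Toyota"}
PRICE_PAID = {"Susan": 24000, "Steve": 35000, "Maria": 28000}
INTERESTED_IN = {
    "Susan": {"Football"},
    "Steve": {"Football"},
    "Maria": {"Art"},
}


def car_of(person):
    """OLTP-style query: a single lookup on one entity."""
    return OWNS[person]


def average_price_for_people_like(person):
    """OLAP-style query: scan everyone, keep peers who share an interest."""
    peers = [
        other for other, interests in INTERESTED_IN.items()
        if interests & INTERESTED_IN[person]
    ]
    return mean(PRICE_PAID[p] for p in peers)


print(car_of("Susan"))                         # Honda
print(average_price_for_people_like("Susan"))  # 29500
```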

If you want to learn more about SPARQL, the W3C-defined standard query language for RDF graph data, check out one of the many SPARQL tutorials online. There’s ample opportunity to try a graph database like AnzoGraph.


Published at DZone with permission of Steve Sarsfield, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
