Database Building 101: Let's Graph This for Real

At the start of a new series from Ayende Rahien, learn the premise behind building a database and what to expect from future posts.

by Oren Eini · Aug. 16, 16 · Database Zone · Tutorial

In the Guts n’ Glory of Database Internals series (which I’ll probably continue if people suggest new topics), I talked about the very low-level things that are involved in actually building a database, from how to ensure consistency to the network protocols. Those are important concerns, but they are very low level. In this series, I want to go up a bit in the stack and actually implement a toy database on top of a real production system, to show you what the database engine actually does.

In practice, we divide the layers of a database engine this way:

  1. Low-level storage (how we save the bits to disk), journaling, ACID.
  2. High-level storage (what kind of storage options we have: B+Trees, nested trees, etc.).
  3. Low-level data operations (working on a single item at a time).
  4. High-level data operations (large-scale operations, typically).
  5. Additional features (subscribing to changes, for example).

In order to do something interesting, we are going to write a toy graph database. I’m going to focus on levels 3 & 4 here, the kind of data operations we need to provide for the database we want, and we are going to build on top of a pre-existing storage solution that handles levels 1 & 2.

Selecting the storage engine: sometimes it makes sense to go elsewhere for the storage engine. Typical examples include using LMDB or LevelDB as embedded databases that handle the storage, with the data operations built on top of them. This works, but it is limiting. You can’t do certain things, and sometimes you really want to. For example, LMDB supports the notion of multiple trees (and even recursive trees), while LevelDB has a flat key space. That has a big impact on how you design and build the database engine.
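
To make the difference between a flat key space and multiple trees a bit more concrete, here is a minimal sketch in plain Python. It does not use the real LMDB or LevelDB bindings: `FlatStore` and `TreeStore` are hypothetical stand-ins, and the `nodes/` and `edges/` key layout is an assumption made up for illustration.

```python
from bisect import bisect_left, insort


class FlatStore:
    """Hypothetical stand-in for a single flat, sorted key space."""

    def __init__(self):
        self._keys = []    # all keys, kept sorted
        self._values = {}  # key -> value

    def put(self, key: bytes, value: bytes) -> None:
        if key not in self._values:
            insort(self._keys, key)
        self._values[key] = value

    def scan_prefix(self, prefix: bytes):
        # With only one key space, separate "tables" have to be emulated
        # with key prefixes and read back with a range scan.
        i = bisect_left(self._keys, prefix)
        while i < len(self._keys) and self._keys[i].startswith(prefix):
            key = self._keys[i]
            yield key, self._values[key]
            i += 1


class TreeStore:
    """Hypothetical stand-in for a store that offers multiple named trees."""

    def __init__(self):
        self._trees = {}  # tree name -> {key: value}

    def tree(self, name: str) -> dict:
        # Each tree has its own key space, so no prefixing is needed.
        return self._trees.setdefault(name, {})


# Flat key space: the engine has to encode structure into the keys.
flat = FlatStore()
flat.put(b"nodes/1", b"alice")
flat.put(b"nodes/2", b"bob")
flat.put(b"edges/KNOWS/1/2", b"")
flat_nodes = list(flat.scan_prefix(b"nodes/"))

# Multiple trees: the structure is provided by the storage layer.
trees = TreeStore()
trees.tree("nodes")[b"1"] = b"alice"
trees.tree("nodes")[b"2"] = b"bob"
trees.tree("edges/KNOWS")[b"1"] = b"2"
tree_nodes = sorted(trees.tree("nodes").items())
```

In the flat version, every design decision about the data layout ends up baked into the key format, which is the kind of impact on the engine design mentioned above.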

At any rate, I don’t think it will surprise anyone that I’m using Voron as the storage engine. It was developed to be a very flexible storage engine, and it works very well for that purpose.

We’ll get to the actual code in tomorrow’s post, but let’s lay out what we want to end up with:

  • The ability to store nodes (for simplicity, a node is just an arbitrary property bag).
  • The ability to connect nodes using edges.
    • Edges belong to types, so KNOWS and WORKED_AT are two different connection types.
    • An edge can be bare (no properties) or have data (again, for simplicity, just an arbitrary property bag).
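
As a rough sketch of that data model (not the code from the series, which builds on Voron), the shapes could look like this in Python. The field names `id`, `source` and `target`, and the idea that an edge has a direction, are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Node:
    id: int  # hypothetical identifier; the post does not specify one
    properties: Dict[str, Any] = field(default_factory=dict)  # arbitrary property bag


@dataclass
class Edge:
    type: str    # e.g. "KNOWS" or "WORKED_AT"
    source: int  # id of one endpoint (assumed naming)
    target: int  # id of the other endpoint (assumed naming)
    properties: Dict[str, Any] = field(default_factory=dict)  # empty for a bare edge

    @property
    def is_bare(self) -> bool:
        # A bare edge carries no data beyond its type and endpoints.
        return not self.properties


alice = Node(1, {"name": "Alice"})
bob = Node(2, {"name": "Bob"})
acme = Node(3, {"name": "Acme"})

knows = Edge("KNOWS", alice.id, bob.id)                        # bare edge
worked_at = Edge("WORKED_AT", alice.id, acme.id, {"since": 2010})  # edge with data
```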

The purpose of the toy database we build is to allow the following low-level operations:

  • Add a node.
  • Add an edge between two nodes.
  • Traverse from a node to all its edges (cheaply).
  • Traverse from an edge to the other node (cheaply).
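
The four operations above can be sketched with an in-memory stand-in. This is only an illustration under stated assumptions: `ToyGraph` and its method names are hypothetical, plain dictionaries replace the real storage engine, and "cheaply" here just means a dictionary lookup instead of a scan over all edges.

```python
from collections import defaultdict
from itertools import count


class ToyGraph:
    """Hypothetical in-memory sketch; a real engine would sit on a storage layer."""

    def __init__(self):
        self._ids = count(1)
        self._nodes = {}                   # node id -> property bag
        self._edges = {}                   # edge id -> (type, from_id, to_id, property bag)
        self._by_node = defaultdict(list)  # node id -> ids of edges touching that node

    def add_node(self, **properties) -> int:
        node_id = next(self._ids)
        self._nodes[node_id] = properties
        return node_id

    def add_edge(self, edge_type, from_id, to_id, **properties) -> int:
        if from_id not in self._nodes or to_id not in self._nodes:
            raise KeyError("both endpoints must exist before adding an edge")
        edge_id = next(self._ids)
        self._edges[edge_id] = (edge_type, from_id, to_id, properties)
        # Index the edge under both endpoints so traversal needs no scan.
        self._by_node[from_id].append(edge_id)
        if to_id != from_id:
            self._by_node[to_id].append(edge_id)
        return edge_id

    def edges_of(self, node_id):
        # Node -> edges: a single lookup in the per-node index.
        return list(self._by_node.get(node_id, []))

    def other_node(self, edge_id, node_id):
        # Edge -> the node on the opposite end from node_id.
        _type, from_id, to_id, _props = self._edges[edge_id]
        if node_id not in (from_id, to_id):
            raise ValueError("node is not an endpoint of this edge")
        return to_id if node_id == from_id else from_id


g = ToyGraph()
alice = g.add_node(name="Alice")
bob = g.add_node(name="Bob")
acme = g.add_node(name="Acme")
knows = g.add_edge("KNOWS", alice, bob)
job = g.add_edge("WORKED_AT", alice, acme, since=2010)

# Walk from a node to its edges, then from each edge to the other node.
neighbors = [g.other_node(e, alice) for e in g.edges_of(alice)]
```

Keeping a per-node list of edge ids is one simple way to make both traversals cheap; the price is extra bookkeeping every time an edge is added.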

That is it. Should be simple enough, right? With that in mind, stay tuned for the next part of this series, where we dive into building a flexible database.

Published at DZone with permission of Oren Eini, DZone MVB. See the original article here.
