Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

Database Building 101: High-Level Graph Operations

DZone's Guide to

Database Building 101: High-Level Graph Operations

Are you new to graph databases? Here's a quick note to let you dip your toe into a simple search/filter operation on nodes and edges.

· Database Zone
Free Resource

Check out the IT Market Clock report for recommendations on how to consolidate and replace legacy databases. Brought to you in partnership with MariaDB.

I talked about high-level and low-level data operations. So far, all we have seen are very low-level operations (get node, get edges for, etc).

image


Let us see how we’ll deal with a bigger challenge. In this case, we want to implement a classic graph operation, doing a depth first search, filtering by both nodes and edges.

Here is how we can implement this:

public IEnumerable<NameValueCollection> DepthFirstSearch(
long start,
string edgeType,
Func<long, NameValueCollection, bool> nodePredicate,
Func<long, NameValueCollection, bool> edgePredicate)
{
    var seen = new HashSet<long>();
    var stack = new Stack<long>();
    stack.Push(start);
    while(stack.Count > 0)
    {
        var nodeId = stack.Pop();
        if(seen.Add(node) == false)
        continue;
        var node = GetNode(nodeId);
        if(nodePredicate(node) == false)
        continue;
        yield return node;
        foreach(var edge in GetEdgesFor(edgeType, nodeId))
        {
            if(edgePredicate(edge.Item2))
            stack.Push(edge.Item1); // destination node id
        }
    }
}


In the real world, we’ll need quite a bit more. On each node (and edge) we’ll need to decide if to return it from the query, or just traverse through it, etc. And that is just to start with.

But I think this demonstrates the point of how to layer behavior on top of the lower level database operations.

There is one thing that we need to talk about still, this code will actually use a lot of individual transactions, one for each independent operation. That is quite expensive, we can open a single transaction and pass it to the functions we call, so there is just a single cost for the entire duration of the operation.

Other things we can do is to explicitly designate specific scenarios as important and change the design so we can answer them very quickly (as in the O(1) cost for accessing nodes/edge data).

Interested in reducing database costs by moving from Oracle Enterprise to open source subscription?  Read the total cost of ownership (TCO) analysis. Brought to you in partnership with MariaDB.

Topics:
database ,graph databases ,graph modeling

Published at DZone with permission of Oren Eini, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

The best of DZone straight to your inbox.

SEE AN EXAMPLE
Please provide a valid email address.

Thanks for subscribing!

Awesome! Check your inbox to verify your email so you can start receiving the latest in tech news and resources.
Subscribe

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}