Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

Graph Algorithms in Neo4j: The Neo4j Graph Algorithms Library

DZone's Guide to

Graph Algorithms in Neo4j: The Neo4j Graph Algorithms Library

Take an in-depth look at the Neo4j Graph Algorithms Library.

· Database Zone ·
Free Resource

Download the Altoros NoSQL Performance Benchmark 2018. Compare top NoSQL solutions – Couchbase Server v5.5, MongoDB v3.6, and DataStax Enterprise v6 (Cassandra).

The Neo4j Graph Algorithms Library is used on your connected data to gain new insights more easily within Neo4j. These graph algorithms improve results from your graph data, for example by focusing on particular communities or favoring popular entities.

This article series is designed to help you better leverage graph analytics so you can effectively innovate and develop intelligent solutions faster.

Last week, we took a look at various graph algorithm concepts, including two fundamental graph traversal algorithms: breadth-first search (BFS) and depth-first search (DFS).

This week, we will take an in-depth look at the Neo4j Graph Algorithms Library. We developed this library as part of our effort to make it easier to use Neo4j for a wider variety of applications.

The Neo4j Graph Algorithms Library: Why and How

We developed the graph algorithms library as part of our effort to make it easier to use the Neo4j graph database for a wider variety of applications. These algorithms have been tuned to be as efficient as possible in regards to resource utilization as well as streamlined for management and debugging.

Note: If you want to try out the examples in the rest of the article series, you'll need to first install the graph algorithms library. For detailed instructions, please download the free eBook at the end of this article and follow the directions in Appendix B.

Neo4j graph algorithms are available as user-defined procedures called as part of Cypher statements running on top of Neo4j. Here is an architecture diagram.

Architecture Diagram

Usage

These algorithms are exposed as Neo4j procedures. They are called directly using Cypher in your Neo4j Browser, from Cypher Shell or from your client code. For most algorithms, there are two procedures:

    • algo. — This procedure writes results back to the graph as node-properties and reports statistics.
    • algo..stream — This procedure returns a stream of data. For example, nodes and computed values.

For large graphs, the streaming procedure might return millions, or even billions, of results. In this case, it may be more convenient to store the results of the algorithm, and then use them with later queries.

This is one of the use cases for a handy feature called graph projection.

Graph projection places a logical subgraph into a graph algorithm when your original graph has the wrong shape or granularity for that specific algorithm. For example, if you're looking to understand the relationship between drug results for men versus women but your graph is not partitioned for this, you'll be able to temporarily project a subgraph to quickly run your algorithm upon and move on to the next step.

We project the graph we want to run algorithms on with either label and relationship-type projection, or Cypher projection (shown below).

Cypher Projection

The projected graph model is separate from Neo4j's stored graph model to enable fast caching for the topology of the graph, containing only relevant nodes, relationships, and weights. During projection of a directed subgraph, only one relationship directed in and one relationship directed out is allowed between a pair of nodes.

During the projection of an undirected subgraph, two relationships between a pair of nodes are allowed (there is no direction).

Label and Relationship-Type Projection

We project the subgraph we want to run the algorithm on by using the label parameter to describe nodes, and relationship-type to describe relationships.

The general call syntax is:

CALL algo.("NodeLabel", "RelationshipType", {config})

Cypher Projection

If label and relationship-type projection are not selective enough to describe our subgraph to run the algorithm on, we use Cypher statements to project subsets of our graph. Use a node-statement instead of the label parameter and a relationship-statement instead of the relationship-type, and use graph:"cypher" in the config.

Relationships described in the relationship-statement will only be projected if both source and target nodes are described in the node-statement. Relationships that don't have both source and target nodes described in the node-statement will be ignored.

We also return a property value or weight (according to our config) in addition to the IDs from these statements.

Cypher projection enables us to be more expressive in describing the subgraph that we want to analyze, but it might take longer to project the graph with more complex Cypher queries.

The general call syntax is:

CALL algo.(
"MATCH (n) RETURN id(n) AS id",
"MATCH (n)-->(m) RETURN id(n) AS source, id(m) AS target",
{graph: "cypher"})

Huge Graph Projection

The default label and relationship-type projection has a limitation of two billion nodes and two billion relationships, so if our projected graph is bigger than this, we need to use a huge graph projection. This is enabled by setting graph:"huge" in the config.

The general call syntax is:

CALL algo.("NodeLabel", "RelationshipType", {graph: "huge"})

Algorithm Types

For transactions and operational decisions, you need real-time graph analysis to provide a local view of relationships between specific data items and take action. To make discoveries about the overall nature of networks and model the behavior of complex systems, you need graph algorithms that provide a broader view of patterns and structures across all data and relationships.

The following table is helpful for working out the appropriate algorithm for your use case.

Conclusion

As we've shown, the Neo4j Graph Algorithms library is used on your connected data to gain new insights more easily within Neo4j. These graph algorithms improve results from your graph data, for example by focusing on particular communities or favoring popular entities.

Next week, we will take an in-depth look at pathfinding and graph search algorithms, such as Shortest Path, Single Source Shortest Path, and All Pairs Shortest Path.

Download the whitepaper, Moving From Relational to NoSQL: How to Get Started. We’ll take you step by step through your first NoSQL project.

Topics:
graph database ,database ,neo4j ,dfs ,bfs ,graph algorithms ,neo4j graph algorithms library

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}