# An Introduction to Property Graph Algorithms


The term property graph has come to denote an attributed, multi-relational graph. That is, a graph where the edges are labeled and both vertices and edges can have any number of key/value properties associated with them. An example of a property graph with two vertices and one edge is diagrammed below.
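As a rough stand-in for such a two-vertex, one-edge graph, here is a minimal Python sketch of the data structure. The class names, field names, and the property values (`marko`, `peter`, `since`) are illustrative assumptions, not taken from any particular property graph library.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Vertex:
    """A vertex with an id and arbitrary key/value properties."""
    id: int
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Edge:
    """A directed, labeled edge that can also carry key/value properties."""
    id: int
    label: str
    out_vertex: Vertex  # the vertex the edge leaves (tail)
    in_vertex: Vertex   # the vertex the edge enters (head)
    properties: Dict[str, Any] = field(default_factory=dict)


# Hypothetical example data: two vertices joined by one "knows" edge.
marko = Vertex(1, {"name": "marko", "age": 29})
peter = Vertex(2, {"name": "peter"})
knows = Edge(3, "knows", marko, peter, {"since": 2009})

print(knows.out_vertex.properties["name"], knows.label,
      knows.in_vertex.properties["name"])
```

The label on the edge is what makes the graph multi-relational, and the property maps on both vertices and edges are what make it attributed.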

Property graphs are more complex than the standard single-relational graphs of common knowledge. The reason is that there are different types of vertices (e.g., people, companies, software) and different types of edges (e.g., knows, works_for, imports). The complexities added by this data structure (and by multi-relational graphs in general, e.g., RDF graphs) affect how graph algorithms are defined and evaluated.

Standard graph theory textbooks typically present common algorithms such as various centralities, geodesics, assortative mixings, etc. These algorithms usually come pre-packaged with single-relational graph toolkits and frameworks (e.g. NetworkX, iGraph).

It is common for people to desire such graph algorithms when they begin to work with property graph software. I have been asked many times:

“Does the property graph software you work on support any of the common centrality algorithms? For example, PageRank, closeness, betweenness, etc.?”

My answer to this question is always:

“What do you mean by centrality in a property graph?”

When a heterogeneous set of vertices can be related by a heterogeneous set of edges, there are numerous ways in which to calculate centrality (or any other standard graph algorithm for that matter).

1. Ignore edge labels and use standard single-relational graph centrality algorithms.
2. Isolate a particular “slice” of the graph (e.g. the knows subgraph) and use standard single-relational graph centrality algorithms.
3. Make use of abstract adjacencies to compute centrality with higher-order semantics.
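The three options can be sketched in plain Python as three ways of turning labeled edges into an ordinary list of (source, target) pairs that a single-relational algorithm could consume. This is only an illustration under assumptions: the vertex names, the edge data, and the function names are all made up, and the third function hard-codes one possible abstract adjacency (coworkers).

```python
# Hypothetical labeled edges: (out_vertex, label, in_vertex).
edges = [
    ("alice", "knows", "bob"),
    ("alice", "works_for", "acme"),
    ("bob", "works_for", "acme"),
    ("carol", "works_for", "initech"),
    ("bob", "knows", "carol"),
]


def ignore_labels(edges):
    """Option 1: drop the labels and keep every edge."""
    return [(out_v, in_v) for out_v, _label, in_v in edges]


def slice_by_label(edges, label):
    """Option 2: keep only the edges carrying one label."""
    return [(out_v, in_v) for out_v, lbl, in_v in edges if lbl == label]


def coworker_adjacency(edges):
    """Option 3: derive person -> person pairs via a shared employer.

    Follows works_for out to the company and back in to its staff.
    Like an unfiltered two-step traversal, it also returns each
    person paired with themselves.
    """
    staff_by_employer = {}
    for out_v, lbl, in_v in edges:
        if lbl == "works_for":
            staff_by_employer.setdefault(in_v, []).append(out_v)
    return [(a, b)
            for staff in staff_by_employer.values()
            for a in staff
            for b in staff]


print(ignore_labels(edges))
print(slice_by_label(edges, "knows"))
print(coworker_adjacency(edges))
```

Each function yields a different single-relational graph from the same data, so the same centrality algorithm run on each would answer a different question.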

The purpose of this blog post is to stress point #3 and the power of property graph algorithms. In Gremlin, you can calculate numerous eigenvector centralities for the same property graph instance. At this point, you might ask: “How can a graph have more than one primary eigenvector?” The answer lies in seeing all the graphs that exist within the graph, i.e., all the higher-order, derived, implicit, virtual, abstract adjacencies. The three lines below exemplify points #1, #2, and #3 in the list above, respectively. The code examples use the power method to calculate the vertex centrality rankings, which are stored in the map m.

```
g.V.outE.inV.groupCount(m).loop(3){c++ < 10000} // point #1
g.V.outE[[label:'knows']].inV.groupCount(m).loop(4){c++ < 10000} // point #2
g.V.???.groupCount(m).loop(?){c++ < 10000} // point #3
```
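For readers unfamiliar with the power method, here is a small Python sketch of the idea. It is not a translation of the Gremlin lines above: it works on a plain list of (source, target) pairs and rescales the scores on every pass, which is an assumption made here to keep the numbers bounded. The function name, the example graph, and the iteration count are all hypothetical.

```python
def power_method(pairs, iterations=100):
    """Approximate eigenvector centrality by repeated propagation.

    pairs: list of (out_vertex, in_vertex) adjacencies.
    Every pass pushes each vertex's score along its outgoing pairs,
    then rescales so that the scores sum to 1.
    """
    vertices = sorted({v for pair in pairs for v in pair})
    rank = {v: 1.0 for v in vertices}
    for _ in range(iterations):
        received = {v: 0.0 for v in vertices}
        for out_v, in_v in pairs:
            received[in_v] += rank[out_v]
        total = sum(received.values())
        if total == 0:
            break  # nothing propagated; keep the last scores
        rank = {v: score / total for v, score in received.items()}
    return rank


# Hypothetical undirected graph, written as pairs in both directions:
# a triangle a-b-c with an extra vertex d attached to a.
undirected = [("a", "b"), ("a", "c"), ("b", "c"), ("a", "d")]
pairs = undirected + [(t, s) for s, t in undirected]

rank = power_method(pairs)
for vertex in sorted(rank, key=rank.get, reverse=True):
    print(vertex, round(rank[vertex], 3))
```

The input is just an adjacency list, so the pairs could equally come from ignoring labels, from a single label slice, or from any derived adjacency. Note that plain power iteration does not settle on every graph (bipartite graphs, for example, can oscillate), which is why the example includes a triangle.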

The ??? on line 3 is a placeholder: it can be any arbitrary traversal. For example, ??? can be:

```
outE[[label:'works_for']].inV.inE[[label:'works_for']].outV
outE[[label:'works_for']].inV[[name:'ACME']].inE[[label:'works_for']].outV
outE[[label:'develops']].inV.outE[[label:'imports']].inV[[name:'Blueprints']].back(7).outE[[label:'works_for']].inV.inE[[label:'works_for']].outV.outE[[label:'develops']].inV.outE[[label:'imports']].inV[[name:'Blueprints']].back(7)
```

The above expressions have the following meaning:

1. Coworker centrality
2. ACME Corporation coworker centrality
3. Coworkers who import Blueprints into their software centrality
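To make the difference between the first two meanings concrete, the following Python sketch counts how often each person is reached in a single pass of a coworker traversal, first over all companies and then restricted to one company by name. It is only an analogy to one pass of counting, with made-up people and companies and a hypothetical function name; it does not attempt the third, longer expression.

```python
from collections import Counter, defaultdict

# Hypothetical works_for edges: (person, company name).
works_for = [
    ("alice", "ACME"), ("bob", "ACME"), ("erin", "ACME"),
    ("carol", "Initech"), ("dave", "Initech"),
]


def coworker_counts(works_for, company=None):
    """Count arrivals at each person after one coworker step.

    Each person steps out to their employer and back in to everyone
    employed there (themselves included). If company is given, only
    paths through that company are followed.
    """
    staff = defaultdict(list)
    for person, employer in works_for:
        if company is None or employer == company:
            staff[employer].append(person)
    counts = Counter()
    for people in staff.values():
        for _start in people:
            for reached in people:
                counts[reached] += 1
    return counts


all_counts = coworker_counts(works_for)
acme_counts = coworker_counts(works_for, company="ACME")
print(dict(all_counts))
print(dict(acme_counts))
```

With the company filter in place, people outside ACME never appear in the counts at all, so the resulting ranking is defined over a different derived graph even though the underlying data is unchanged.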

There are numerous graphs within the graph. As such, “what do you mean by centrality?”

These ideas are explored in more detail in the following article and slideshow.

Rodriguez, M.A., Shinavier, J., “Exposing Multi-Relational Networks to Single-Relational Network Analysis Algorithms,” Journal of Informetrics, 4(1), pp. 29-41, Elsevier, doi:10.1016/j.joi.2009.06.004, 2009.


Published at DZone with permission of Marko Rodriguez, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.