Graph Databases for Beginners: Graph Theory and Predictive Modeling
Connecting nodes in a graph database is fundamentally different than filling a table with data. If you're new to graph databases: Read this.
Join the DZone community and get the full member experience.Join For Free
Graphs are everywhere, from illustrating the connections between Game of Thrones characters to tracking the interactions betweens hundreds of thousands of servers in a public network.
Throughout this blog series, we’ve talked a lot about the practical details of working with graph databases. Now it’s time to discuss graph theory, with its a far more practical application to everyday life.
As a more developed field, graph theory helps us gain insight into new domains. Combined with the social sciences, there are many concepts that can be straightforwardly used to gain insight from graph data.
In this “Graph Databases for Beginners” blog series, we have discussed why graphs are the future, why data relationships matter, the basics of data modeling, data modeling pitfalls to avoid, why a database query language matters, why we need NoSQL databases, ACID vs. BASE, a tour of aggregate stores, other graph data technologies, native versus non-native graph processing and storage, and graph search algorithms.
In last week’s post, we explained the lower-level traversal mechanisms of graph algorithms. If you haven’t read it yet, I would recommend doing so in order to best understand the higher-order analyses we are about to discuss. Now let’s take a look at some key concepts in social graph theory.
One of the most common properties of social graphs is that of triadic closures. This is the observation that if two nodes are connected via a path with a mutual third node, there is an increased likelihood of the two nodes becoming directly connected in the future.
In a social setting, a triadic closure would be a situation where two people with a mutual friend have a higher chance of meeting each other and becoming acquainted.
The triadic closure property is most likely to be upheld when a graph has a node A with a strong relationship to two other nodes, B and C. This then gives B and C a chance of a relationship, whether it be weak or strong. Although this is not a guarantee of a potential relationship, it serves as a credible predictive indicator.
Let’s take a look at this example.
Above is an organizational hierarchy where Alice manages both Bob and Charlie. This is rather strange, as it would be unlikely for Bob and Charlie to be unacquainted with one another while sharing the same manager.
As it is, there is a strong possibility they will end up working together due to the triadic closure property. This will create either a WORKS_WITH (strong) or PEER_OF (weak) relationship between the two of them, closing the triangle — hence the term triadic closure.
However, another aspect to consider in the formation of stable triadic closures is the quality of the relationships involved in the graph. To illustrate the next concept, assume that the MANAGES relationship is somewhat negative while the PEER_OF and WORKS_WITH relationships are more positive.
Based off of the triadic closure property, we can assume that we can fill in the third relationship with any label, such as having everyone manage each other like in the first image below or the weird situation in the second image below.
However, you can see how uncomfortable those working situations would be in reality. In the second image, Charlie finds himself both the peer of a boss and a fellow worker. It would be difficult for Bob to figure out how to treat Charlie — as a fellow coworker or as the peer of his boss?
We have an innate preference for structural symmetry and rational layering. In graph theory, this is known as structural balance.
A structurally balanced triadic closure is made of relationships of all strong, positive sentiments (such as the first example below) or two relationships with negative sentiments and a single positive relationship (second example).
Balanced closures help with predictive modeling in graphs. The simple action of searching for chances to create balanced closures allows for the modification of the graph structure for accurate predictive analysis.
We can go further and gain more valuable insight into the communications flow of our organizations by looking at local bridges. These refer to a tie between two nodes where the endpoints of the local bridge are not otherwise connected, nor do they share any common neighbors. You can think of local bridges as connections between two distinct clusters of the graph. In this case, one of the ties has to be weak.
For example, the concept of weak links is relevant in algorithms for job search. Studies have shown that the best sources of jobs come from looser acquaintances rather than close friends. This is because closer friends tend to share a similar worldview (are in the same graph component) but looser friends across a local bridge are in a different social network (and are in a different graph component).
In the image above, Davina and Alice are connected by a local bridge but belong to different graph components. If Davina were to look for a new job, she would be more likely to find a successful recommendation from Alice than from Frances.
This property of local bridges being weak links is something that is found throughout social graphs. As a result, we can make predictive analyses based on empirically derived local bridge and strong triadic closure notions.
The Final Takeaway
While graphs and our understanding of them are rooted in hundreds of years of study, we continue to find new ways to apply them to our personal, social and business lives. Technology today offers another method of understanding these principles in the form of the modern graph database.
As we have seen throughout the “Graph Databases for Beginners” blog series, we simply need to understand how to apply graph theory algorithms and analytical techniques in order to achieve our goals. Take a look back at the other posts in this series and you’ll gain the skills you need to tap into the power of graphs.
Published at DZone with permission of Joy Chao, DZone MVB. See the original article here.
Opinions expressed by DZone contributors are their own.