news,architects,nosql,architecture,graph database,graph theory

# Basic Graphs: An Introduction

The purpose of this post is to give a common footing for those reading to understand what I mean when I talk about a "graph". The field of graph theory is a very deep and well-explored one. The real trick here is trying to give an introduction to graphs and graph theory without getting lost in too much detail. I will endeavour to keep this post to the utter, simplified basics of graphs.

(My apologies to
anyone who already knows all about graph theory; this is meant more for
those who might not have been exposed to graphs before, as well as to
establish some common terms and nomenclature I'll be using in this blog.
If some of the definitions seem too simplistic, please bear in mind
I'm trying to just get the basics out there so anyone can understand the
concepts.)

We also think of graphs like the following (which is one of my personal favourites from our good friend Jay-Z):

Alas, these are
not the graphs I'm referring to in this blog. The kind I'm talking
about is a little bit different; though, at its core, perhaps we can
take a graph to mean (at a very oversimplified level), "a graphical
representation of data".

The kind of graph
I'm referring to in this blog has to do with the representation of
objects and their interconnections. Because I'm terrible at drawing
(and lazy), I'll show a simple graph that illustrates what I'm saying
and then try to describe what's going on.

This, my
graph-fiends (yes, I just made that horrible pun up), is one example of a
graph. What does this graph consist of? Here's what we have:

- 6 vertices or nodes (the circles), and,
- 7 edges or relationships (the lines between the circles).

At a first
glance, that's all there is to it. In mathematics, we commonly use the
terms "vertices" and "edges" (the vertices being the circles/points, the
edges being the lines between); however, it's likely a bit more
intuitive to most to use the terms "nodes" and "relationships" in place
of "vertices" and "edges", respectively. I'll try to stick to using
"nodes" and "relationships", though I may occasionally use the other
terms interchangeably.

Is a graph still a
graph if there are no edges involved? Absolutely. This is one example
of a "null graph". (For those interested, at the other end of the
spectrum, a "complete graph" is one whose vertices are all
connected/adjacent to each other, for a total of n(n-1)/2 edges, where n
is the number of vertices.)

Edges/relationships
can be "undirected" (like the example above), or "directed" (like the
example below). Directed edges have a source and a destination (this is
useful for showing relationships between two objects). Undirected
edges can also be used to show a bi-directional relationship (i.e. one
that exists in both directions); for example, Tom likes Dana, and Dana
likes Tom.

Still with me?

So, we know that a

**graph**consists of a series of**vertices**or**nodes**that can be connected via**edges**or**relationships**. These**edges**can be**undirected**or**directed**.
At its core, that is what a graph is.

By now, if you
haven't already, you're seeing how this might be useful to represent
highly connected data. Social networks are a great example (think of
how you could represent your "friendships" on Facebook or MySpace with a
graph; this is actually called a

**"Likes" graph**). Try to think of other information you could represent in graph fashion.
What about a family tree? For sure. As another matter of fact, a

**tree**is actually a type of graph that is**directed**and**acyclical**(meaning that there are no loops, i.e. if you start following the directions on a graph, you travel each node exactly once).**Weight**is another concept we come across in graph theory and graph databases. Think of

**weight**as being how important or strong a

**relationship**is between two

**nodes**. For example, a higher weight can imply a stronger

**relationship**between two

**nodes**representing, say, two friends.

This is an important concept as we can use it to determine a shortest

**path**or least costly**path**between two**nodes**. (See an explanation of**paths**further below.)
One slightly more complicated idea is that of

**degree**, which represents the number of**edges**attached to a**node**. So, a**node**with a high**degree**will be one that has a lot of**relationships**attached to it. This can be very useful for determining just how popular, for example, something or someone is. (Cue jokes for this blog.)
One last concept to get out there before wrapping this article up is the concept of a

**walk**or**traversal**. A**traversal**is taken to be the resultant**path**found by starting at a node X and following to some other node on the graph Y and the**edges**and**nodes**visited during the trip from X to Y. This concept becomes important as we start to look in to graph-based databases as traversals are part of what we can use to extract useful information from a graph (e.g. "Who are all of my cousins?") and to even predict and recommend a product for someone based on their browsing history.
Ok, I think
that's enough math-like stuff for one post. The definitions given in
this post don't even scratch the surface of what's out there in the
world of graph theory. While I'm no serious
graph theorist academically, I'm sure there are some standard texts out
there. That said, as mentioned earlier, Wikipedia is really a great
resource for exploring the world of graph theory a bit more.

I'll introduce
other graph-related concepts as they become needed (e.g. searching or
path-finding); however, if there are requests for discussing specific
graphing concepts, I'd be happy to address them (provided I can speak
intelligently to them, of course).

Now go weird your friends out with jokes that their family tree isn't really a tree as it has loops!

Published at DZone with permission of {{ articles[0].authors[0].realName }}, DZone MVB. (source)

Opinions expressed by DZone contributors are their own.

{{ nComments() }}