Over a million developers have joined DZone.

MapReduce: Detecting Cycles in a Network Graph

DZone's Guide to

MapReduce: Detecting Cycles in a Network Graph

· Database Zone
Free Resource

Want to free your enterprise from the costs of provisioning servers? Learn about serverless architecture with Couchbase and cloud computing.

I recently received an email from an audience of my blog on Map/Reduce algorithm design regarding how to detect whether a graph is acyclic using Map/Reduce.  I think this is an interesting problem and can imagine there can be wide range of application to it.

Although I haven't solved this exact problem in the past, I'd like to sketch out my thoughts on a straightforward approach, which may not be highly optimized.  My goal is to invite other audience who has solved this problem to share their tricks.

To define the problem:  Given a simple directed graph, we want to tell whether it contains any cycles.

Lets say the graph is represented in incidence edge format where each line represent an edge from one node to another node.  The graph can be split into multiple files.

node1, node2
node4, node7
node7, node1

Here is a general description of the algorithm.

We'll maintain two types of graph data.

  1. A set of link files where each line represent an edge in the graph.  This file will be static.
  2. A set of path files where each line represent a path from one node to another node.  This file will be updated in each round of map/reduce.
The general idea is to grow the path file until either the path cannot grow further, or the path contain a cycle, we'll use two global flags to keep track of that: "change" flag to keep track of whether new path is discovered, and "cycle" flag to keep track whether a cycle is detected.

The main driver program will create the initial path file by copying the link file.  And it will set the initial flag condition:  Set cycle and change flag to true.  In each round, the driver will
  • Check if the cycle is still false and change is true, exit if this is not the case
  • Otherwise, it will clear the change flag and invoke the map/reduce
Within each round, the map function will do the following
  • Emit the tuple [key=toNode, value=fromNode] if it is a path
  • Emit the tuple [key=fromNode, value=toNode] if it is a link
This ensure a path and all extended link reaches the same reducer, which will do the following
  • Check to see if the beginning point of the path is the same as the endpoint of the link.  If so, it detects a cycle and mark the cycle flag to be true.
  • Otherwise, it compute the product of every path to every link, and form the new path which effectively extend the length by one.

The following diagram illustrates the algorithm ...

Learn how the world’s first Engagement Database can lower your costs and simplify deployments with Couchbase in the cloud.


Published at DZone with permission of Ricky Ho, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}