POWERED BY
Platinum Partner
java,architects,nosql,neo4j,graph database

# Java Network/Graph/Data Mining Algorithm 'Arsenal' on Neo4j: Part 2

The Java Zone is brought to you in partnership with ZeroTurnaround. Discover how you can skip the build and redeploy process by using JRebel.

A few weeks ago I showed you how to visualize a graph using the chord flare visualization and how to visualize a network using a force directed graph visualization from D3.js.

On Twitter Claire Willett from Riparian Data asked:

This post on Graphs Beyond the Hairball by Robert Kosara explains why some non-traditional graph visualizations may work better and links us to an article explaining what a Node Quilt is and how it’s useful. We’re going to just take the first step and build a Matrix representation of a graph. We will use one of the JUNG clustering algorithms to help us understand it.

The study of social networks goes back to at least the ancient Greeks, but we won’t go back that far in time today… just to 1977. A man named Wayne Zachary recorded the interactions of a Karate Club at a University for 2 years. During this time a conflict developed between the administrator and the instructor and the club broke up into two. Turns out you could predict which club each member would belong to by building a graph of their weighted relationships and spliting it along the minimum-cut.

We’re working with Neo4j, and you all know Neo knows Kung Fu, so we’ll do something a little different. Plus looking at a two node cluster is kinda lame… let’s go bigger. We’re going to be using Neo4j with the JUNG jars already in the lib directory. Refer to JUNG in Neo4j – Part 1 if you need help with that. Let’s create our graph:

```def create_graph
neo = Neography::Rest.new
graph_exists = neo.get_node_properties(1)
return if graph_exists && graph_exists['name']

names = %w[Aaron Achyuta Adam Adel Agam Alex Allison Amit Andreas Andrey
Andy Anne Barry Ben Bill Bob Brian Bruce Chris Corey
Dan Dave Dean Denis Eli Eric Esteban Ezl Fawad Gabriel
James Jason Jeff Jennifer Jim Jon Joe John Jonathan Justin
Kim Kiril LeRoy Lester Mark Max Maykel Michael Musannif Neil]

commands = names.map{ |n| [:create_node, {"name" => n}]}

names.each_index do |from|
commands << [:add_node_to_index, "nodes_index", "type", "User", "{#{from}}"]
connected = []

# create clustered relationships
members = 5.times.collect{|x| x * 10 + (from % 10)}
members.delete(from)
rels = 1 + rand(4)
rels.times do |x|
to = members[x]
connected << to
commands << create_rel(from, to) unless to == from
end

# create random relationships
rels = 1 + rand(4)
rels.times do |x|
to = rand(names.size)
commands << create_rel(from, to) unless (to == from) || connected.include?(to)
end
end
batch_result = neo.batch *commands
end```

I am once again using the first 50 names from the Graph DataBase- Chicago Meet-Up group. You’ve seen me do this once or twice already, so I won’t go over it in detail, but as you can see above, I am forcing small clusters of relationships to exist along with random relationships. Our relationships have a weight property, so the create_rel method is just this:

```def create_rel(x,y,z= 1 + rand(10))
[:create_relationship, "knows", "{#{x}}", "{#{y}}", {:weight => z}]
end
```

Our visualization is expecting a list of nodes already in groups, and a list of relationships that includes it’s strength. The JSON object looks like this:

```{"nodes":[{"name":"Myriel","group":1},
{"name":"Napoleon","group":1},
{"name":"Mlle.Baptistine","group":2}],
"links":[{"source":1,"target":0,"value":1},
{"source":2,"target":0,"value":8},
{"source":3,"target":0,"value":10}]
}
```

Getting our nodes is pretty easy, we’ll use Gremlin to retrieve all but the root node.

```def get_nodes
neo = Neography::Rest.new
neo.execute_script("g.V.filter{it.id != 0}.transform{[it.id,it.name]}")
end
```

Getting the relationships is also pretty ease, to switch it up, we’ll use Cypher.

```def get_relationships
neo = Neography::Rest.new
cypher_query =  " START a = node:nodes_index(type='User')"
cypher_query << " MATCH a-[r:knows]-b"
cypher_query << " RETURN ID(a), ID(b), r.weight"
neo.execute_query(cypher_query)["data"]
end
```

Now comes the fun part. To get our clusters, we’ll be using the Voltage Clusterer Class. We don’t want the root node getting in the way, so we’ll create a sub-graph using TinkerGraph that excludes it (you could do something similar to cluster just a small part of a larger graph). We then pass this graph on to GraphJung and set our VoltageClusterer to try to get 10 clusters for us.

```def get_voltage_clusters
neo = Neography::Rest.new
lg = neo.execute_script("import edu.uci.ics.jung.algorithms.cluster.VoltageClusterer;
to = new TinkerGraph();
g.V.filter{it.id != 0}.sideEffect{toVertex = to.addVertex(it.getId());
ElementHelper.copyProperties(it, toVertex);
}.iterate();
g.E.sideEffect{toEdge = to.addEdge(it.getId(),
to.v(it.getOutVertex().getId()),
to.v(it.getInVertex().getId()),
it.getLabel());
ElementHelper.copyProperties(it, toEdge);
}.iterate();
vc = new VoltageClusterer(new GraphJung(to), 10);
vc.cluster(10).id;
")
end
```

We can then put it all together to build that JSON object.

```get '/cluster' do
clusters = Hash.new
get_voltage_clusters.each_with_index {|item, index| item.each{|i| clusters[i.to_i] = index + 1} }
nodes = get_nodes.map{|n| {"name" => n[1], "group" => clusters[n[0]]}}
relationships = get_relationships.map{|r| {"source" => r[0] - 1, "target" => r[1] - 1, "value" => r[2]} }
{:nodes => nodes, :links => relationships}.to_json
end
```

Our visualization was done by Mike Bostock with D3.js. As always, the code is on Github. Click on the image below to see the live example on Heroku.

Note: The version on Heroku is using a pre-generated JSON file.

The Java Zone is brought to you in partnership with ZeroTurnaround. Discover how you can skip the build and redeploy process by using JRebel.
ADVERTISEMENT