
The Last Mile: VisualSearch and Neo4j


The “last mile” is a telecommunications term for delivering connectivity to the customers who will actually use the system. For graph databases, it refers to how well the end user can extract value and insight from the graph. We’ve already seen an example of this concept with Graph Search, which lets a user express requests in natural language. Today we’ll see another example. We’ll be taking advantage of the features of Neo4j 2.0 to make this work, so be sure to read the previous post on the matter first.

We’re going to be using VisualSearch.js made by Samuel Clay of NewsBlur. VisualSearch.js enhances ordinary search boxes with the ability to autocomplete faceted search queries. It is quite easy to customize and there is an annotated walkthrough of the options available. You can see what it does in the image below, or click it to try their demo.


We’ve previously prepared a Neo4j 2.0 graph with Actors, Directors, Producers, Writers, and Users all connected to Movies.


The first thing we need to do is find the Facets for visualsearch.js. We don’t want to configure this manually, because that would be painful and our graph may change over time. So instead we’ll use the “list_labels” method to get the Labels of our Graph:

  get '/facets' do
    content_type :json
    cache_control :public, :max_age => 600
    facets = []
    categories = $neo.list_labels
    categories.each do |cat|
      # One facet per property, grouped under its Label (e.g. "Actor.name")
      get_properties(cat).each do |label|
        facets << {:category => cat, :label => cat + "." + label}
      end
    end
    facets.to_json
  end

One of the nice things we can do is group the properties of a Label together. We don’t have a hard schema for which properties are in each Label, but we can query the graph, grab one node, and see the properties it has.

    def get_properties(category)
      cypher = "MATCH n:#{category} RETURN n LIMIT 1"
      $neo.execute_query(cypher)["data"].first.first["data"].keys # assumed response shape
    end

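The /facets route returns a JSON array of objects, each with a “category” (the Label) and a “label” (the Label and property name joined by a dot, such as Actor.name).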


We will pass this on to visualsearch.js and have our first drop down working with these grouped label properties.
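
To make the shape of that array concrete, here is a minimal sketch that builds the facets from a plain Ruby hash instead of a live graph. SAMPLE_PROPERTIES and build_facets are hypothetical names used only for illustration; in the application the labels and properties come from $neo.list_labels and get_properties.

```ruby
require 'json'

# Hypothetical stand-in for the graph: Label => property names found on one
# sample node. In the post these come from $neo.list_labels and get_properties.
SAMPLE_PROPERTIES = {
  "Actor" => ["name", "born"],
  "Movie" => ["title"]
}

# Build one facet per Label/property pair, grouped under the Label.
def build_facets(properties_by_label)
  properties_by_label.flat_map do |label, properties|
    properties.map { |prop| { :category => label, :label => "#{label}.#{prop}" } }
  end
end

puts build_facets(SAMPLE_PROPERTIES).to_json
```

Each entry carries the Label as its category, which is what lets the drop down group properties under their Label.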

Once a user clicks on one of the properties, we will fill in some of the available options for that property. We can do this with Cypher by MATCHing the nodes of the specified Label that have the property we care about, then grouping and ordering the results so we only get the first 25 unique values.

  get '/values/:facet/' do
    content_type :json

    label, key = get_label_and_key(params)
    # COUNT(*) aggregates per value, so each value comes back only once
    cypher = "MATCH node:#{label} 
              WHERE HAS(node.#{key})
              RETURN node.#{key} AS label, COUNT(*)
              ORDER BY label
              LIMIT 25"
    $neo.execute_query(cypher)["data"].collect{|x| x.first.to_s}.compact.flatten.to_json
  end

Now we can see some of the values in our search box. In this example, we are grabbing names of Actors in our graph.
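
The routes rely on a get_label_and_key helper that isn’t shown here. A minimal sketch of what it might look like follows, assuming the facet arrives as “Label.property” (for example “Actor.name”). The IDENTIFIER check is an addition of this sketch, not something the original is known to do: because the label and key are interpolated straight into the Cypher string, it seems prudent to reject anything that isn’t a plain identifier.

```ruby
# Hypothetical version of get_label_and_key; the post does not show the original.
# Assumes params[:facet] arrives as "Label.property", e.g. "Actor.name".
IDENTIFIER = /\A[A-Za-z_][A-Za-z0-9_]*\z/

def get_label_and_key(params)
  label, key = params[:facet].to_s.split(".", 2)
  # Both parts end up inside the Cypher text, so only allow plain identifiers.
  unless [label, key].all? { |part| part.to_s =~ IDENTIFIER }
    raise ArgumentError, "invalid facet: #{params[:facet].inspect}"
  end
  [label, key]
end

label, key = get_label_and_key({ :facet => "Actor.name" })
puts "#{label} / #{key}"
```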


The top 25 items are nice, but what if we’re looking for an Actor whose name begins with the letter Z, like “Zach Grenier”? VisualSearch.js gives us the ability to start typing the value, and it will reset our options to match.


We will enhance our previous query by adding a case-insensitive regular expression with the term or part of the term we are looking for.

  get '/values/:facet/:term' do
    content_type :json
    label, key = get_label_and_key(params)
    cypher = "MATCH node:#{label} 
              WHERE HAS(node.#{key}) AND node.#{key} =~ {term}
              RETURN node.#{key} AS label, COUNT(*)
              ORDER BY label
              LIMIT 25"
    # (?i) makes the match case insensitive; .* on both sides matches the term anywhere in the value
    $neo.execute_query(cypher, {:term => "(?i).*" + params[:term] + ".*"})["data"].collect{|x| x.first.to_s}.compact.flatten.to_json
  end
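
One thing to watch: the typed term goes into the regular expression as-is, so a character like “.” or “(” would be treated as regex syntax. A minimal sketch of one way to guard against that is below. term_pattern is a hypothetical helper, and it assumes that what Ruby’s Regexp.escape produces is also read literally by the regex engine behind Cypher’s =~, which is worth verifying against your Neo4j version.

```ruby
# Hypothetical helper: wrap the user's term in a case-insensitive "contains"
# pattern. Regexp.escape is Ruby's own escaper; treating its output as safe
# for the database's regex engine is an assumption of this sketch.
def term_pattern(term)
  "(?i).*" + Regexp.escape(term.to_s) + ".*"
end

puts term_pattern("zach")
puts term_pattern("a.b")
```

The result could then be passed as the :term parameter in place of the inline string concatenation.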

Once we click on Zach Grenier, a few things happen. We get a little message telling us that:

You searched for: Actor.name: “Zach Grenier”. (1 node)

Our search bar comes alive again with the next set of Labels to query on…


… and our graph (currently consisting of just one node) is populated via vivagraph.js. See this previous vivagraph.js post for more information on how this great graph visualization library works.


Now… I know you may be thinking… we populated an Actor node, and now only Movie is available in our drop down. How did that happen? That’s the magic of this application. Instead of just grabbing any next node at random, we are taking the context of our first node and building a path of available connections from there. If we click on “Movie.title”, we call the following method under the covers to get our possibilities:

  post '/connected_values/:facet/' do
    content_type :json
    related_label, related_key = get_label_and_key(params)
    match, where, values = prepare_query(params)
    last_node = get_last_node_id(params)
    # The node at the end of the path must have the property we are completing
    where << "HAS(node#{last_node}.#{related_key})"
    cypher  = prepare_cypher(match,where)
    cypher << "WITH LAST(EXTRACT(n in NODES(p) : n.#{related_key}?)) AS label, COUNT(*) AS cnt "
    cypher << "RETURN label ORDER BY label LIMIT 25"    
    parameters = prepare_parameters(values)
    $neo.execute_query(cypher, parameters)["data"].flatten.collect{|d| d.to_s}.to_json
  end

It looks a little complicated, but all we are doing is just building a cypher query dynamically that will end up looking like this:

MATCH p = node0:Actor -- node1:Movie 
WHERE node0.name? = {value0} AND HAS(node1.title) 
WITH LAST(EXTRACT(n in NODES(p) : n.title?)) AS label, COUNT(*) AS cnt 
RETURN label 
ORDER BY label 
LIMIT 25

This Cypher query will be executed with the parameters {"value0"=>"Zach Grenier"}. It will find the Actor node for Zach Grenier in the graph, then find the nodes that are labeled “Movie” and are related to Zach Grenier, and then extract the property “title” from the last node in our path (which happens to be a movie Zach Grenier is in) and give us our answer.
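
The helpers that assemble this query (prepare_query, prepare_cypher) aren’t shown in this post, so here is a minimal sketch of how such a query could be put together from the facets chosen so far. build_path_query and its arguments are hypothetical names for illustration, and the sketch assumes the labels and keys have already been validated as plain identifiers. It keeps the Neo4j 2.0 milestone syntax used throughout the post.

```ruby
# Hypothetical sketch, not the post's actual prepare_query/prepare_cypher.
# chosen:  [label, key] pairs the user has already filled in, in order
# related: the [label, key] pair we want candidate values for
def build_path_query(chosen, related)
  labels = chosen.map(&:first) + [related.first]
  # node0:Actor -- node1:Movie -- ... one hop per selected facet
  match = labels.each_with_index.map { |label, i| "node#{i}:#{label}" }.join(" -- ")
  # Each chosen facet becomes a parameterized equality check
  where = chosen.each_with_index.map { |(_, key), i| "node#{i}.#{key}? = {value#{i}}" }
  # The last node in the path must have the property being completed
  where << "HAS(node#{chosen.size}.#{related.last})"
  "MATCH p = #{match} " \
  "WHERE #{where.join(' AND ')} " \
  "WITH LAST(EXTRACT(n in NODES(p) : n.#{related.last}?)) AS label, COUNT(*) AS cnt " \
  "RETURN label ORDER BY label LIMIT 25"
end

puts build_path_query([["Actor", "name"]], ["Movie", "title"])
```

Only the labels and property names are interpolated into the text; the values themselves travel separately as parameters, which keeps user input out of the query string.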

In our graph, we only have two things connected to Zach Grenier… the Movies “RescueDawn” and “Twister”. Let’s go ahead and click on Twister:


We query the graph for the pattern Actor named “Zach Grenier” that is connected to the movie titled “Twister”. The graph finds this pattern, returns the nodes and relationships within this pattern, and Twister gets added to our graph, connected to Zach Grenier.

The patterns we can create can go beyond just a single hop. For example: an Actor born in 1929 who acted in “Snow Falling on Cedars” alongside Rick Yune, who was also in “Ninja Assassin” alongside other actors…

MATCH p = node0:Actor -- node1:Movie -- node2:Actor -- node3:Movie -- node4:Actor 
WHERE node0.born? = {value0} AND node1.title? = {value1} AND node2.name? = {value2} AND node3.title? = {value3} AND HAS(node4.name) 
WITH LAST(EXTRACT(n in NODES(p) : n.name?)) AS label, COUNT(*) AS cnt 
RETURN label 
ORDER BY label 
LIMIT 25

This query will be executed with the parameters: {"value0"=>1929, "value1"=>"Snow Falling on Cedars", "value2"=>"Rick Yune", "value3"=>"Ninja Assassin"}. One of the Actors at the end of the pattern is “Naomie Harris”, and once we click on her we get this graph:
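
The parameter names line up with the node positions in the pattern: value0 belongs to node0, value1 to node1, and so on. A minimal sketch of building that hash from the values the user has picked, in order, might look like this. build_parameters is a hypothetical stand-in for the prepare_parameters helper, which isn’t shown in this post.

```ruby
# Hypothetical counterpart to prepare_parameters: number the chosen values
# so that value0 pairs with node0, value1 with node1, and so on.
def build_parameters(values)
  values.each_with_index.map { |value, i| ["value#{i}", value] }.to_h
end

p build_parameters([1929, "Snow Falling on Cedars", "Rick Yune", "Ninja Assassin"])
```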


Don’t just take my word for it though. Try the live Demo, take a look at the source code, and try pointing it at your own Neo4j 2.0 Labeled Graph.

What’s missing?

This is a dynamic UI that gives an end user quick access to the graph. However, the astute observer will notice something is missing: the relationship types. The patterns we are creating and matching against the graph only care that nodes are connected, not how they are connected, and that might be a very important feature of our graph that we are omitting. Alas, this little project is not the last mile. It is but one step further, and eventually we’ll reach it.

Help me work on these kinds of problems.

Understanding the power of graphs will give your data architect skills a boost. Don’t let this blog post be the last time you think in graphs. Learn about graphs at one of the dozens of events already on the Calendar and keep an eye out as more get added every week. Take some time to watch these great graph videos from the events you might have missed. Read the Graph Databases book, and of course… subscribe to my blog and follow me on Twitter.



Published at DZone with permission of Max De Marzi, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

