
At a Conference? Need a Dataset? Neo4j at NOSQL NOW


For the "Lunch and Learn around Neo4j" session with Andreas Kollegger, we wanted a dataset that is easy to understand and interesting to the conference attendees.

So we chose that day's conference program as the dataset. Conference data is usually well connected and lends itself to challenging data-model discussions and insightful queries.

We set up a Heroku instance that hosts an informational website and is connected to a provisioned Neo4j database. The site explains Neo4j, the local installation and the Heroku add-on, and lists the available drivers for the different languages.
We then used a small Ruby script with the neography gem, written by our community rock star Max De Marzi, to populate the database. From our example-data site, you can download the graph.db directory for your local Neo4j server.

require 'rubygems'
require 'neography'

# Memoized REST client for the local Neo4j server.
def neo
  @neo ||= Neography::Rest.new("http://localhost:7474")
end

# True if the node already has a relationship of the given direction and type.
def has_rel(node, dir, type)
  res = neo.get_node_relationships(node, dir, type)
  res && res.size > 0
end

# Create one talk and connect it to its slot, speakers, companies, tags and audience.
def add_talk(slot, title, speakers, audience, tags)
  root = neo.get_root()
  talk = neo.create_node({:title => title})
  slot_node = neo.create_unique_node(:slots, :slot, slot, { :slot => slot})
  neo.create_relationship(:at, talk, slot_node)
  speakers.each do |name, from|
    speaker = neo.create_unique_node(:speakers, :name, name, { :name => name})
    neo.create_relationship(:presents, speaker, talk)
    company = neo.create_unique_node(:companies, :company, from, { :company => from})
    # Link a speaker to a company only once, even if they give several talks.
    neo.create_relationship(:works_at, speaker, company) unless has_rel(speaker, :out, :works_at)
  end
  tags.each do |name|
    tag = neo.create_unique_node(:tags, :tag, name, { :tag => name})
    neo.create_relationship(:tagged, talk, tag)
    # Hang each tag off the root node once.
    neo.create_relationship(:tag, root, tag) unless has_rel(tag, :in, :tag)
  end
  who = neo.create_unique_node(:audience, :audience, audience, { :audience => audience})
  neo.create_relationship(:for, talk, who)
end

# Clear the database, keeping only the node with id 0 (the root node).
neo.execute_query("start n=node(*) match n-[r?]-m where ID(n)<>0 delete n,r")

[:slots, :speakers, :companies, :tags, :audience].each do |name|
  neo.create_node_index(name, :exact, :lucene)
end

add_talk("08:30 AM - 09:00 AM",'The Journey to Amazon DynamoDB: From Scaling by Architecture to Scaling by Commandment',
  {'Swami Sivasubramanian'=>'Amazon Web Services'}, 'Technical - Introductory', [ 'Cloud Computing',"NoSQL Architecture and Design"])
add_talk("09:00 AM - 09:45 AM", 'Then Our Buildings Shape Us: A new way to think about NoSQL technology selection',
  {'Tim Berglund'=>'GitHub'}, 'Business / Non-Technical', [ 'NoSQL Architecture and Design', "NoSQL Technology Evaluation"])
add_talk("09:45 AM - 10:00 AM",'Create Powerful New Applications with Graphs',
  {'Emil Eifrem'=>'Neo Technology'}, 'Business / Non-Technical', [ 'Graph Databases'])
add_talk("10:30 AM - 11:15 AM",'Why and When You Should Use Redis',
  {'Josiah Carlson'=>'ChowNow Inc.'}, 'Technical - Introductory', [ 'NoSQL Technology Evaluation'])
add_talk("10:30 AM - 11:15 AM",'Intro to Graph Databases 101',
  {'Andreas Kollegger'=>'Neo Technology'}, 'Technical - Introductory', [ 'Graph Databases'])
add_talk("01:15 PM - 02:00 PM",'Lunch N Learn with Neo Technology and Neo4j',
  {'Andreas Kollegger'=>'Neo Technology'}, 'Technical - Introductory', [ 'Graph Databases'])
add_talk("02:15 PM - 03:00 PM", 'Using Graph Databases to Analyze Relationships, Risks and Business Opportunities - A Case Study',
  {'Jans Aasman'=>'Franz Inc'}, 'Technical - Introductory', [ 'Graph Databases'])
add_talk("04:15 PM - 04:45 PM", 'High performance graph database using cache, cloud, and standards',
  {'Bryan Thompson'=>'SYSTAP, LLC'}, 'Technical - Advanced', [ 'Graph Databases'])
add_talk("04:15 PM - 04:45 PM", 'Introducing Hadoop and Big Data into a Healthcare Organization: A True Story and Learned Lessons',
  {'Vladimir Bacvanski'=>'SciSpike'}, 'Technical - Intermediate', [ 'Big Data'])
add_talk("04:15 PM - 04:45 PM", 'NoSQL Data Modelling for Scalable eCommerce',
  {'Dipali Trivedi'=>'Staples.com'}, 'Technical - Intermediate', [ 'NoSQL Architecture and Design'])

add_talk("05:30 PM - 06:30 PM",'The NoSQL "C Panel"', {"Robert Scoble"=>"RackSpace",
                                                      "Bob Wiederhold"=>"Couchbase",
                                                      "Dwight Merriman"=>"10gen",
                                                      "Emil Eifrem"=>"Neo Technology",
                                                      "Jay Jarrell"=>"Objectivity, Inc.",
                                                      "Kirk Dunn"=>"Cloudera, Inc."},
                                                      "Business / Non-Technical",
                                                      ["Graph Databases", "Hadoop", "MongoDB"])
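
The script leans on `create_unique_node` so that a slot, speaker, company or tag that appears in several talks ends up as a single node. The following is a minimal in-memory sketch of that get-or-create behaviour, not neography's implementation. The `UniqueIndex` class is hypothetical and never talks to Neo4j; it only illustrates why calling the method twice with the same index, key and value yields one node.

```ruby
# Hypothetical in-memory stand-in for a unique node index.
# It illustrates get-or-create semantics only; it does not talk to Neo4j.
class UniqueIndex
  def initialize
    @nodes = {} # [index, key, value] => node properties
  end

  # Return the node stored under (index, key, value), creating it on first use.
  def create_unique_node(index, key, value, props)
    @nodes[[index, key, value]] ||= props.dup
  end

  def size
    @nodes.size
  end
end

index  = UniqueIndex.new
first  = index.create_unique_node(:companies, :company, "Neo Technology", { :company => "Neo Technology" })
second = index.create_unique_node(:companies, :company, "Neo Technology", { :company => "Neo Technology" })
other  = index.create_unique_node(:companies, :company, "GitHub", { :company => "GitHub" })

puts first.equal?(second) # the second call hands back the node created by the first
puts index.size           # number of distinct nodes stored
```

This is why speakers such as Emil Eifrem, who appear in more than one `add_talk` call, do not produce duplicate speaker or company nodes.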

Andreas ran a very successful session working with the conference dataset. Here are the slides introducing Neo4j and Cypher:

To spark your creativity, we also prepared some more advanced queries against the dataset and made them available. The server's web interface offers an interactive console for running the queries and a data browser for visualizing the available data.

Have fun!


Index lookup:

    start abk=node:speakers(name="Andreas Kollegger")
    return abk;

return properties & id:

    start abk=node:speakers(name="Andreas Kollegger")
    return abk.name, id(abk);
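
To run queries like these from a script rather than the console, the `neo.execute_query` call from the import script can be reused. The sketch below assumes the result is a hash with "columns" and "data" keys, the shape returned by the Neo4j REST Cypher endpoint of that era. `rows_from` is a hypothetical helper, and the sample hash (including the node id) is made up so the example runs without a server.

```ruby
# Hypothetical helper: turn a Cypher result of the assumed shape
# {"columns" => [...], "data" => [[...], ...]} into an array of hashes.
def rows_from(result)
  columns = result["columns"]
  result["data"].map { |row| Hash[columns.zip(row)] }
end

# Made-up sample standing in for the return value of:
#   neo.execute_query('start abk=node:speakers(name="Andreas Kollegger") return abk.name, id(abk)')
sample = {
  "columns" => ["abk.name", "id(abk)"],
  "data"    => [["Andreas Kollegger", 12]] # the id 12 is invented for illustration
}

rows = rows_from(sample)
rows.each { |row| puts row.inspect }
```

Keying each row by column name makes later code independent of the order of expressions in the `return` clause.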

follow relationships:

    start abk=node:speakers(name="Andreas Kollegger")
    match abk-[:presents]->talk
    return talk.title;

    start abk=node:speakers(name="Andreas Kollegger")
    match abk-[:presents]->talk-[:at]->slot
    return talk.title,slot.slot;

which other talks are during those slots:

    start abk=node:speakers(name="Andreas Kollegger")
    match abk-[:presents]->talk-[:at]->slot<-[:at]-other
    return talk.title,slot.slot, other.title;

group them into a collection, and count them

    start abk=node:speakers(name="Andreas Kollegger")
    match abk-[:presents]->talk-[:at]->slot<-[:at]-other
    return talk.title,slot.slot, collect(other.title) as others, count(*) as cnt;

only see those talks that have more than one competing talk in the same slot

    start abk=node:speakers(name="Andreas Kollegger")
    match abk-[:presents]->talk-[:at]->slot<-[:at]-other
    with talk, count(*) as cnt
    where cnt>1
    return talk.title,cnt;

slots are connected with a next relationship, show all slots

    start n=node(2)
    match p=n-[:next*0..]->current
    return current.slot;

show the talks at the slot

    start n=node(2)
    match p=n-[:next*0..]->current<-[:at]-talk
    return current.slot, talk.title;
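
The import script shown earlier does not create the `next` relationships that these two queries traverse, so they were presumably added in a separate step. One possible way to derive them, sketched here under that assumption, is to sort the slot labels by start time and link each slot to its successor. `start_minutes` and `next_pairs` are hypothetical helpers. Against a live database, each pair could be passed to `neo.create_relationship(:next, from, to)` with the corresponding slot nodes.

```ruby
# Hypothetical helper: minutes since midnight for the start of a slot label
# such as "01:15 PM - 02:00 PM".
def start_minutes(slot)
  match = slot.match(/\A(\d{2}):(\d{2}) (AM|PM)/)
  raise ArgumentError, "unexpected slot format: #{slot}" unless match
  hour = match[1].to_i % 12 # 12 o'clock becomes 0 before the PM offset is added
  hour += 12 if match[3] == "PM"
  hour * 60 + match[2].to_i
end

# Hypothetical helper: consecutive [from, to] pairs in chronological order.
def next_pairs(slots)
  slots.uniq.sort_by { |slot| start_minutes(slot) }.each_cons(2).to_a
end

slots = ["10:30 AM - 11:15 AM", "08:30 AM - 09:00 AM",
         "01:15 PM - 02:00 PM", "09:00 AM - 09:45 AM"]

next_pairs(slots).each do |from, to|
  puts "#{from} -[:next]-> #{to}"
end
```

Sorting on the parsed start time rather than on the label text matters here, because a plain string sort would place "01:15 PM" before "08:30 AM".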

all talks with the tag Graph Databases

    start tag=node:tags(tag="Graph Databases")
    match tag<-[:tagged]-talk
    return talk;

which companies talk about graph databases

    start tag=node:tags(tag="Graph Databases")
    match tag<-[:tagged]-talk<-[:presents]-speaker-[:works_at]->company
    return talk,speaker,company;

which companies speak about graph databases (with a surprise)

    start tag=node:tags(tag="Graph Databases")
    match tag<-[:tagged]-talk<-[:presents]-speaker-[:works_at]->company
    return distinct company.company;
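
The same traversal can be replayed in plain Ruby over the talks from the import script that carry the Graph Databases tag. This is only an in-memory illustration restricted to the talks listed in that script, not a query against the live database, and `companies_for` is a hypothetical helper.

```ruby
# Talks from the import script tagged "Graph Databases": title => { speaker => company }.
GRAPH_TALKS = {
  'Create Powerful New Applications with Graphs' => { 'Emil Eifrem' => 'Neo Technology' },
  'Intro to Graph Databases 101' => { 'Andreas Kollegger' => 'Neo Technology' },
  'Lunch N Learn with Neo Technology and Neo4j' => { 'Andreas Kollegger' => 'Neo Technology' },
  'Using Graph Databases to Analyze Relationships, Risks and Business Opportunities - A Case Study' =>
    { 'Jans Aasman' => 'Franz Inc' },
  'High performance graph database using cache, cloud, and standards' =>
    { 'Bryan Thompson' => 'SYSTAP, LLC' },
  'The NoSQL "C Panel"' => {
    'Robert Scoble' => 'RackSpace', 'Bob Wiederhold' => 'Couchbase',
    'Dwight Merriman' => '10gen', 'Emil Eifrem' => 'Neo Technology',
    'Jay Jarrell' => 'Objectivity, Inc.', 'Kirk Dunn' => 'Cloudera, Inc.'
  }
}

# Hypothetical helper mirroring tag<-talk<-speaker->company followed by "distinct".
def companies_for(talks)
  talks.values.flat_map(&:values).uniq
end

puts companies_for(GRAPH_TALKS)
```

Because the panel talk carries the Graph Databases tag as well, every panelist's company is reachable from the tag, not just the companies that build graph databases.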



Published at DZone with permission of Andreas Kollegger, DZone MVB.

