
Visualizing a Set of Hiveplots with Neo4j

by Max De Marzi · Mar. 30, 12



What should a graph look like, and how can I tell two graphs apart?


These are questions Martin Krzywinski (Genome Sciences Center, Vancouver, BC) has been asking. Take a look at the picture below:

It’s the same graph, the same data, visualized 8 different ways. Which is the right way? What advantage does one layout give over another? Can you tell it’s the same network? I can’t.

Eight layouts might be too much, so let’s just look at one in the next picture:

Martin took the spring-embedded visualization and tweaked it around. Can you tell it’s the same graph, the same data underneath? I can’t.

To tackle this problem, Martin invented the hive plot, a perceptually uniform and scalable layout visualization for network visual analytics.

If you want to learn more about hive plots, take a look at his website and this presentation (it is quite large at 20 MB). I cannot do it justice in this short blog post, and in all honesty I haven’t had the time to study it properly.

Today I just want to give you a little taste of hive plots. I am going to visualize the GitHub graphs of nine languages you might not have heard of: Boo, Dylan, Factor, Gosu, Mirah, Nemerle, Nu, Parrot, Self. I’m not going to show you how to create the graph this time, because this is real data we are using. You can take a look at it in the data folder on GitHub.

The graph is basically (language)–(repository)–(user). There are two relationships between repository and user: wrote and forked.

I’ll show you how to get the data out and into our visualization.

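require 'neography'

# Returns a map of repository name => names of the users who wrote it,
# for every repository of the given language.
def wroterepos(language)
  neo = Neography::Rest.new
  neo.execute_script("m = [:]
                      g.V.filter{it.type == 'language' && it.name == '#{language}'}
                       .in.transform{m[it.name] = it.in('wrote').gather{it.name}.next()}
                       .iterate()
                      m")
end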

We do the same thing for forked. This may seem a bit strange, but what I am doing with Gremlin is roughly the equivalent of a SQL left outer join.
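To make the left outer join comparison concrete, here is a minimal plain-Ruby sketch with no Neo4j involved. The sample data and the helper name `left_outer_map` are made up for illustration. It shows the join idea only, not how Gremlin evaluates the script: every repository gets a key in the result, even when nobody forked it.

```ruby
# Hypothetical sample data: repository names for one language, and
# [user, repository] pairs standing in for the 'forked' relationship.
REPOS  = ["boo-lang", "boo-build", "boo-docs"]
FORKED = [["alice", "boo-lang"], ["bob", "boo-lang"], ["carol", "boo-docs"]]

# Build repository => [users], keeping repositories that have no match.
# That "keep the left side even without a match" rule is the left outer join.
def left_outer_map(repos, edges)
  result = {}
  repos.each do |repo|
    result[repo] = edges.select { |_user, target| target == repo }
                        .map { |user, _target| user }
  end
  result
end

forks = left_outer_map(REPOS, FORKED)
puts forks.inspect
```

An inner join would drop `boo-build` entirely. Keeping it with an empty list means the wrote and forked maps can share the same set of repository keys.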

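# Returns a map of repository name => names of the users who forked it,
# for every repository of the given language.
def forkedrepos(language)
  neo = Neography::Rest.new
  neo.execute_script("m = [:]
                      g.V.filter{it.type == 'language' && it.name == '#{language}'}
                       .in.transform{m[it.name] = it.in('forked').gather{it.name}.next()}
                       .iterate()
                      m")
end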

Now we do some Ruby magic to put our data into the JSON format the visualization wants.

require 'sinatra'
require 'json'

get '/hive/:name' do
  repos        = []
  writers      = []
  forkers      = []
  temp_forkers = []
  temp_writers = []

  # One node per repository; its first listed author becomes a candidate writer.
  wroterepos(params[:name]).each_pair do |key, value|
    repos << {"name" => key, "imports" => value, "node_type" => "repo"}
    temp_writers << { "name" => value[0] }
  end

  # Merge in the forkers by position. This relies on both queries
  # returning the repositories in the same order.
  i = 0
  forkedrepos(params[:name]).each_pair do |key, value|
    repos[i]["imports"] = repos[i]["imports"] + value
    temp_writers[i]["imports"] = value
    temp_forkers << value
    i += 1
  end

  # Collapse authors with several repositories into a single writer node.
  temp_writers.group_by {|t| t["name"]}.each do |w, f|
    writers << {"name" => w,
                "imports" => f.collect{|y| y["imports"]}.flatten.uniq,
                "node_type" => "writer"}
  end

  # Anyone who forked but is not already a writer becomes a forker node.
  writer_names = writers.collect{|y| y["name"]}
  temp_forkers.flatten.uniq.delete_if{|x| writer_names.include?(x)}.each do |f|
    forkers << {"name" => f,
                "imports" => [],
                "node_type" => "forker"}
  end

  (repos + writers + forkers).to_json
end
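To see what the route returns, the same reshaping can be run against hand-written hashes instead of Neo4j. The sketch below is a condensed rewrite, not the code from this post: the function name `hive_nodes`, the repository names and the user names are all hypothetical, and it merges the two maps by repository name rather than by position.

```ruby
require 'json'

# Condensed, Neo4j-free sketch of the reshaping done in the route.
# wrote and forked are hashes of repository name => array of user names.
def hive_nodes(wrote, forked)
  # A repository links to its authors plus everyone who forked it.
  repos = wrote.map do |repo, authors|
    { "name" => repo,
      "imports" => authors + forked.fetch(repo, []),
      "node_type" => "repo" }
  end

  # Group repositories by first listed author; a writer links to their forkers.
  writers = wrote.group_by { |_repo, authors| authors[0] }.map do |author, pairs|
    { "name" => author,
      "imports" => pairs.flat_map { |repo, _authors| forked.fetch(repo, []) }.uniq,
      "node_type" => "writer" }
  end

  # Users who forked something but wrote nothing become forker nodes.
  writer_names = writers.map { |w| w["name"] }
  forkers = (forked.values.flatten.uniq - writer_names).map do |user|
    { "name" => user, "imports" => [], "node_type" => "forker" }
  end

  repos + writers + forkers
end

# Hypothetical sample data.
wrote  = { "boo-lang" => ["alice"], "boo-build" => ["alice"], "boo-docs" => ["bob"] }
forked = { "boo-lang" => ["bob", "carol"], "boo-build" => [], "boo-docs" => ["carol"] }

nodes = hive_nodes(wrote, forked)
puts JSON.pretty_generate(nodes)
```

With this sample, alice ends up as one writer node linked to bob and carol, bob is a writer even though he also forked, and carol is the only forker node.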

The blue nodes are our repositories, the yellow nodes are our writers, and the green nodes are our forkers. The 12 o’clock axis (the top) shows nodes with only outgoing relationships. The bottom-left axis shows nodes with only incoming relationships. These are the writers without any forks, and the forkers who never started their own public projects. The remaining nodes in the bottom-right have both incoming and outgoing relationships. These are the repository writers who created projects other people found worth forking.
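The axis rule can be sketched in a few lines of Ruby. This is only an illustration under assumptions: the real placement happens in the D3.js code, the function name `axis_for` and the sample nodes are hypothetical, and the sketch treats each entry in a node's "imports" list as an outgoing link and assumes node names are unique.

```ruby
# Assign each node to an axis based on the direction of its links.
def axis_for(nodes)
  # Every name that appears in some "imports" list has an incoming link.
  imported = nodes.flat_map { |n| n["imports"] }.uniq

  nodes.each_with_object({}) do |node, axes|
    outgoing = !node["imports"].empty?
    incoming = imported.include?(node["name"])
    axes[node["name"]] =
      if outgoing && incoming then "bottom-right" # both directions
      elsif outgoing          then "top"          # only outgoing
      elsif incoming          then "bottom-left"  # only incoming
      else                         "unplaced"     # isolated node, not covered in the text
      end
  end
end

# Hypothetical sample in the same shape as the JSON the route produces.
nodes = [
  { "name" => "boo-lang", "imports" => ["alice", "bob"], "node_type" => "repo" },
  { "name" => "boo-docs", "imports" => ["carol"],        "node_type" => "repo" },
  { "name" => "alice",    "imports" => ["bob"],          "node_type" => "writer" },
  { "name" => "carol",    "imports" => [],               "node_type" => "writer" },
  { "name" => "bob",      "imports" => [],               "node_type" => "forker" }
]

axes = axis_for(nodes)
axes.each { |name, axis| puts "#{name}: #{axis}" }
```

Here both repositories land on the top axis, alice (a writer whose project was forked) lands bottom-right, and carol and bob land bottom-left.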

The graphs are ordered across each row as follows:

  • Boo, Dylan, Factor
  • Gosu, Mirah, Nemerle
  • Nu, Parrot, Self

Can you see the similarities between Boo, Factor and Nemerle? See how different they are from Gosu and Self? What does the hive plot tell you about these languages' GitHub repositories?

You can try a live version at hiveplot.herokuapp.com/index.html and, as always, the code is available on GitHub.

Our visualization was done by Rich Morin and Mike Bostock with D3.js. It is a hot-off-the-press work in progress. You can follow the action on this D3.js Google Group thread.


Published at DZone with permission of Max De Marzi, DZone MVB. See the original article here.
