Visualizing RDF Schema inferencing through Neo4J, Tinkerpop, Sail and Gephi

by Davy Suvee · Nov. 21, 11

Last week, the Neo4j plugin for Gephi was released. Gephi is an open-source visualization and manipulation tool that allows users to interactively browse and explore graphs. The graphs themselves can be loaded through a variety of file formats. Thanks to Martin Škurla, it is now possible to load and lazily explore graphs that are stored in a Neo4j data store.

In one of my previous articles, I explained how Neo4j and the TinkerPop framework can be used to load and query RDF triples. The newly released Neo4j plugin now makes it possible to visually browse these RDF triples and perform fancier operations, such as finding patterns and executing social network analysis algorithms, from within Gephi itself. TinkerPop's Sail ouplementation also supports the notion of RDF Schema inferencing. Inferencing is the process whereby new (RDF) data is automatically deduced from existing (RDF) data through reasoning. Unfortunately, the Sail reasoner cannot easily be integrated within Gephi, as the Gephi plugin grabs a lock on the Neo4j store and no RDF data can be added, except through the plugin itself.

Being able to visualize the RDF Schema reasoning process and graphically indicate which RDF triples were added manually and which were automatically inferred would be a nice feature. To implement it, we need to be able to push graph changes from TinkerPop and Neo4j to Gephi. Luckily, the Gephi graph streaming plugin allows us to do just that. In the rest of this article, I will detail how to set up the required Gephi environment and how to stream (inferred) RDF data from Neo4j to Gephi.

1. Adding the (inferred) RDF data

Let's start by setting up the required Neo4j/TinkerPop/Sail environment that we will use to store and infer RDF triples. The setup is similar to the one explained in my previous TinkerPop article. However, instead of wrapping our GraphSail as a SailRepository, we will wrap it in a ForwardChainingRDFSInferencer. This inferencer listens for RDF triples that are added and/or removed and automatically executes RDF Schema inferencing, applying the rules defined by the RDF Semantics recommendation.

Neo4jGraph neoGraph = new Neo4jGraph("var/rdf");
// let's use manual transaction mode
neoGraph.setMaxBufferSize(0);
Sail sail = new ForwardChainingRDFSInferencer(new GraphSail(neoGraph));
sail.initialize();
SailConnection connection = sail.getConnection();
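
These snippets assume the TinkerPop Blueprints and OpenRDF Sesame (Sail) libraries on the classpath. With the Blueprints 1.x line, the imports would look roughly as follows (package names shifted slightly between releases, so treat this as a sketch):

import com.tinkerpop.blueprints.pgm.TransactionalGraph;
import com.tinkerpop.blueprints.pgm.impls.neo4j.Neo4jGraph;
import com.tinkerpop.blueprints.pgm.oupls.sail.GraphSail;
import org.openrdf.model.URI;
import org.openrdf.model.impl.URIImpl;
import org.openrdf.sail.Sail;
import org.openrdf.sail.SailConnection;
import org.openrdf.sail.SailException;
import org.openrdf.sail.inferencer.fc.ForwardChainingRDFSInferencer;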

We are now ready to add RDF triples. Let's create a simple loop that reads in RDF triples and adds them to the Sail store.

InferenceLoop loop = new InferenceLoop();
Scanner in = new Scanner(System.in);
while (true) {
   System.out.println("Provide RDF statement:");
   System.out.print("=> ");
   String input = in.nextLine();
   System.out.println("The following edges were created:");
   loop.inference(input);
}
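
Run against the store, a console session then looks roughly like this (the edge output line is printed by the PushUtility introduced in section 2):

Provide RDF statement:
=> http://datablend.be/example/davy http://datablend.be/example/teaches http://datablend.be/example/bob
The following edges were created:
{"ae":{"1":{"source":"davy","target":"bob","directed":true,"label":"teaches","inferred":"false"}}}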

The inference method itself is rather simple. We start by parsing the RDF subject, predicate and object. Next, we start a new transaction, add the statement and commit the transaction. This not only adds the RDF triple to our Neo4j store but also runs the RDF Schema inferencing process and automatically adds the inferred RDF triples. Pretty easy!

// parses the RDF statement and adds it accordingly
public void inference(String statement) throws SailException, InterruptedException {
   String[] triple = statement.split(" ");
   inference(new URIImpl(triple[0]), new URIImpl(triple[1]), new URIImpl(triple[2]));
}

// adds the statement within a transaction, triggering the inference process
public void inference(URI subject, URI predicate, URI object) throws SailException, InterruptedException {
   neoGraph.startTransaction();
   connection.addStatement(subject, predicate, object);
   connection.commit();
   neoGraph.stopTransaction(TransactionalGraph.Conclusion.SUCCESS);
}

But how do we retrieve the inferred RDF triples that were added through the inference process? Although the ForwardChainingRDFSInferencer allows us to register a listener that detects changes to the graph, it does not provide the API required to distinguish between manually added and inferred RDF triples. Luckily, we can still access the underlying Neo4j store and capture these graph changes by implementing the Neo4j TransactionEventHandler interface. After a transaction is committed, we can fetch the newly created relationships (i.e. RDF triples). For each of these relationships, the start node (i.e. RDF subject), end node (i.e. RDF object) and relationship type (i.e. RDF predicate) can be retrieved. In case an RDF triple was added through inference, the value of its boolean "inferred" property is true. We filter the relationships down to the ones defined within our own domain (otherwise the full RDFS meta-model would be visualized as well). Finally, we push the relevant nodes and edges.

public class PushTransactionEventHandler implements TransactionEventHandler<Object> {

   private int id = 1;

   public void afterCommit(TransactionData transactionData, Object o) {
      // retrieve the created relationships (the relevant nodes will be retrieved through these relationships)
      Iterable<Relationship> relationships = transactionData.createdRelationships();

      // iterate and add
      for (Relationship relationship : relationships) {
         // retrieve the labels
         String start = (String) relationship.getStartNode().getProperty("value");
         String end = (String) relationship.getEndNode().getProperty("value");
         String predicate = relationship.getType().toString();

         // limit the relationships that are shown to our own domain
         if (!start.startsWith("http://www.w3.org") && !end.startsWith("http://www.w3.org")) {
            // check whether the relationship is inferred or not
            boolean inferred = (Boolean) relationship.getProperty("inferred", false);
            // retrieve the more meaningful names
            start = getName(start);
            end = getName(end);
            predicate = getName(predicate);
            // push the start and end nodes (they will only be created once)
            PushUtility.pushNode(start);
            PushUtility.pushNode(end);
            PushUtility.pushEdge(id++, start, end, predicate, inferred);
         }
      }
   }

   ...

}
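
Note that the handler only fires once it has been registered with the underlying Neo4j database (the complete source on GitHub takes care of this). A minimal sketch, assuming the neoGraph instance from section 1 and Blueprints' getRawGraph() accessor for the wrapped GraphDatabaseService:

neoGraph.getRawGraph().registerTransactionEventHandler(new PushTransactionEventHandler());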

2. Pushing the (inferred) RDF data

The streaming plugin for Gephi allows reading and visualizing data that is sent to its master server. This master server is a REST interface that receives graph data in a JSON format. The PushUtility used in the PushTransactionEventHandler is responsible for generating the required JSON node and edge messages and pushing them to the Gephi master.

public class PushUtility {

   private static final String URL = "http://localhost:8080/workspace0?operation=updateGraph";
   private static final String NODE_JSON = "{\"an\":{\"%1$s\":{\"label\":\"%1$s\"}}}";
   private static final String EDGE_JSON = "{\"ae\":{\"%1$d\":{\"source\":\"%2$s\",\"target\":\"%3$s\",\"directed\":true,\"label\":\"%4$s\",\"inferred\":\"%5$b\"}}}";

   private static void push(String message) {
      try {
         // create a connection and push the node or edge JSON message
         HttpURLConnection con = (HttpURLConnection) new URL(URL).openConnection();
         con.setRequestMethod("POST");
         con.setDoOutput(true);
         con.getOutputStream().write(message.getBytes("UTF-8"));
         // reading the response triggers the actual request
         con.getInputStream();
      }
      catch (Exception e) {
         e.printStackTrace();
      }
   }

   // pushes a node
   public static void pushNode(String label) {
      push(String.format(NODE_JSON, label));
   }

   // pushes an edge
   public static void pushEdge(int id, String source, String target, String label, boolean inferred) {
      push(String.format(EDGE_JSON, id, source, target, label, inferred));
      System.out.println(String.format(EDGE_JSON, id, source, target, label, inferred));
   }

}
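
Given the format strings above, pushNode("teacher") and a first pushEdge(...) call produce messages like the two below. You can also replay such messages against the master endpoint with a plain HTTP client such as curl, which is a handy way to sanity-check the streaming setup (assuming the default workspace0):

{"an":{"teacher":{"label":"teacher"}}}
{"ae":{"1":{"source":"davy","target":"bob","directed":true,"label":"teaches","inferred":"false"}}}

curl -X POST "http://localhost:8080/workspace0?operation=updateGraph" -d '{"an":{"teacher":{"label":"teacher"}}}'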

3. Visualizing the (inferred) RDF data

Start the Gephi streaming master server. This allows Gephi to receive the (inferred) RDF triples that we send it through its REST interface. Let's run our Java application and add the following RDF triples:

http://datablend.be/example/teaches http://www.w3.org/2000/01/rdf-schema#domain http://datablend.be/example/teacher
http://datablend.be/example/teaches http://www.w3.org/2000/01/rdf-schema#range http://datablend.be/example/student
http://datablend.be/example/davy http://datablend.be/example/teaches http://datablend.be/example/bob

The first two RDF triples above state that a teacher teaches a student. The last RDF triple states that davy teaches bob. As a result, the RDF Schema inferencer deduces that davy must be a teacher and that bob must be a student.
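
The entailment rules doing the work here are rdfs2 and rdfs3 from the RDF Semantics recommendation; informally:

(p rdfs:domain c) and (s p o)  =>  (s rdf:type c)    [rdfs2: davy rdf:type teacher]
(p rdfs:range c)  and (s p o)  =>  (o rdf:type c)    [rdfs3: bob rdf:type student]

Let's have a look at what Gephi visualized for us.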

[Gephi screenshot: the initial, unformatted graph]

Hmm ... that doesn't really look impressive :-). Let's apply some formatting. First, apply the Force Atlas layout. Afterwards, scale the edges and enable the labels on both the edges and the nodes. Finally, partition the edges by coloring the arrows according to the inferred property. We can now clearly identify the inferred RDF statements (i.e. davy being a teacher and bob being a student).

[Gephi screenshot: formatted graph with inferred edges colored via the inferred property]

Let's add some additional RDF triples.

http://datablend.be/example/teacher http://www.w3.org/2000/01/rdf-schema#subClassOf http://datablend.be/example/person
http://datablend.be/example/student http://www.w3.org/2000/01/rdf-schema#subClassOf http://datablend.be/example/person

Basically, these RDF triples state that both teacher and student are subclasses of person. As a result, the RDFS inferencer (this time through entailment rule rdfs9, which propagates rdf:type along rdfs:subClassOf) is able to deduce that both davy and bob must be persons. The Gephi visualization is updated accordingly.

[Gephi screenshot: updated graph including the inferred person types]

4. Conclusion

With just a few lines of code, we are able to stream (inferred) RDF triples to Gephi and make use of its powerful visualization and analysis tools to explore and inspect our datasets. As always, the complete source code can be found on the Datablend public GitHub repository. Make sure to surf the internet to find some other nice Gephi streaming examples, the coolest probably being the visualization of the Egyptian revolution on Twitter.

Source: http://datablend.be/?p=1146
