
And Now for Something Completely Different: Using OWL with Neo4j

Originally authored by Stefanie Wiegand

Why would you want to do this?

OWL has been around for a while now and is used for a variety of semantic applications. Ontologies are freely available and help developers create models for real-world scenarios. They can be instantiated, combined, and enriched using SWRL rules and a reasoner such as HermiT or Pellet. The reasons for creating such a representation of data differ: natural language processing, reusing data across domains, and contextualization are just some of many. The data obtained is then stored in a knowledge base, from which it can be retrieved using SPARQL queries. But if it's all already there, what's the point of combining it with a graph database?

While SPARQL certainly has its strong points, such as using different ontologies at the same time and its similarity to the well-known SQL, it also has weaknesses. Triple stores, which are the starting point for most SPARQL applications, consume a lot of disk space compared to relational databases. They can also be slow for very large datasets.

Neo4j stores whole graphs as opposed to “just” triples. It has a query language that is easy to learn and use, and a web-based graphical interface that lets users browse and explore the graph. It is also fast to query and scales well to larger datasets.

As is usually the case, there is no ideal solution; the right choice depends on the use case. Considerations include, for example:
  • The amount, frequency, and connectedness of incoming data
  • The importance of speed and size
  • The type of queries executed on the database

The Playground

There is the PROV-O ontology, which models causation and influence between activities, agents and entities. This concept is fairly abstract but useful to answer a number of questions related to the origin of entities (and what may have influenced them throughout their lifetime). It is applicable to a number of fields, for example social networking (“Who was the author of the blog post that influenced Peter to write his mashup?”) or experimenting (“Who was the last person to access the experiment before it failed and when did he access it?”).

PROV-O is used in the BonFIRE project, which is a multi-site cloud experimentation and testing facility. There are people (agents) conducting experiments using resources (entities). At an infrastructure level, to perform their experiments, they create, use, and destroy (activities) compute nodes, storage, and virtual networks (entities). After their experiment has finished, they download the results (entities) from the virtual machine for further analysis. These results are influenced by a large number of activities and agents, and it is often difficult to determine how such a result came to be, who was involved in its formation, or why it differs from other results. Using provenance, these questions can be answered.


In BonFIRE, the data arrives via RabbitMQ as a set of JSON messages. In the example message used here, bert shut down compute node 123 located on server1. Each message is read into Java classes, which are then used to transform it (“manually”) into triples. From that single message, we derive several triples that look something like this:
:Action_state.shutdown_1375801302 rdf:type :Action
:Compute_/locations/server1/computes/123 rdf:type :Compute
:Compute_/locations/server1/computes/123 prov:invalidatedBy :Action_state.shutdown_1375801302
:Experimenter_Bert rdf:type :Experimenter
:Experimenter_Bert prov:wasAssociatedWith :Action_state.shutdown_1375801302
The prefixes used are defined in the ontology into which these triples are going to be imported.
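
The “manual” transformation can be pictured with a small sketch. The message class and its field names below are hypothetical (they are not the BonFIRE schema), and only the shutdown case from the example is handled; the point is merely to show how one message fans out into the five triples listed above.

```java
import java.util.Arrays;
import java.util.List;

public class ShutdownMessageToTriples {

    /** Hypothetical message shape; these field names are assumptions, not the BonFIRE schema. */
    public static final class ShutdownMessage {
        final String user;          // e.g. "bert"
        final String eventType;     // e.g. "state.shutdown"
        final long timestamp;       // e.g. 1375801302
        final String resourcePath;  // e.g. "/locations/server1/computes/123"

        public ShutdownMessage(String user, String eventType, long timestamp, String resourcePath) {
            this.user = user;
            this.eventType = eventType;
            this.timestamp = timestamp;
            this.resourcePath = resourcePath;
        }
    }

    /** Builds the five triples for a compute-node shutdown, in the order used in the text. */
    public static List<String> toTriples(ShutdownMessage m) {
        String action = ":Action_" + m.eventType + "_" + m.timestamp;
        String compute = ":Compute_" + m.resourcePath;
        // Capitalize the user name for the individual's identifier.
        String experimenter = ":Experimenter_"
                + Character.toUpperCase(m.user.charAt(0)) + m.user.substring(1);
        return Arrays.asList(
                action + " rdf:type :Action",
                compute + " rdf:type :Compute",
                compute + " prov:invalidatedBy " + action,
                experimenter + " rdf:type :Experimenter",
                experimenter + " prov:wasAssociatedWith " + action);
    }

    public static void main(String[] args) {
        ShutdownMessage m = new ShutdownMessage(
                "bert", "state.shutdown", 1375801302L, "/locations/server1/computes/123");
        for (String triple : toTriples(m)) {
            System.out.println(triple);
        }
    }
}
```

Keeping this step separate from the ontology code is what lets other data sources plug in later: anything that can emit triples in this form can feed the same import.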

The above step is not necessary if the messages are supposed to go into the ontology directly: OWLAPI could be used instead to create individuals, properties, and so on. Transforming them to triples, however, serves as an interface that can read data from all kinds of sources, as long as the data is formatted as triples. If OWLAPI were used instead, the code would have to be changed every time the data format changes.

These triples can then be added to an ontology using the OWLRDFConsumer class from the OWLAPI. This adds the triples to the ontology where the reasoner can be invoked to enrich the data. So far, that's not really special. The interesting bit follows after the reasoning has taken place.

Getting Graphy

Now there is an ontology object sitting in memory, which contains the ontology itself as well as the individuals that came from the triples. It could simply be stored in a knowledge base, but if it were, you wouldn't be reading about it here :)

An ontology is a graph. It has a top node (owl:Thing) and classes extending it. There are individuals that belong to classes and object properties connecting the individuals. Individuals can have data properties and annotations that can be represented as node properties and relationship properties or as relationship types.

The import of an ontology is pretty straightforward:

Step 1

The only object you need is the ontology object created earlier. It could also be loaded from a file; that doesn't make a difference.
private void importOntology(OWLOntology ontology) throws Exception {
    // Reasoner is assumed to be HermiT's org.semanticweb.HermiT.Reasoner
    OWLReasoner reasoner = new Reasoner(ontology);
    if (!reasoner.isConsistent()) {
        logger.error("Ontology is inconsistent");
        // throw your exception of choice here
        throw new Exception("Ontology is inconsistent");
    }
    // db is assumed to be the GraphDatabaseService holding the target graph
    Transaction tx = db.beginTx();
    try {
        try {

Step 2

Create a starting node in Neo4j representing the owl:Thing node. This is the root node of the graph we're going to create.
Node thingNode = getOrCreateNodeWithUniqueFactory("owl:Thing");
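
The helper getOrCreateNodeWithUniqueFactory is not shown in this article; its name suggests that it looks a node up by a unique name and creates it only if it does not exist yet. As a rough illustration of that get-or-create behaviour only, here is a plain-Java stand-in that uses a map instead of a Neo4j index. The class and method names are made up for this sketch and are not the helper's actual implementation.

```java
import java.util.HashMap;
import java.util.Map;

public class UniqueNodes {

    /** Minimal stand-in for a graph node: it only carries its unique name. */
    public static final class NamedNode {
        public final String name;

        NamedNode(String name) {
            this.name = name;
        }
    }

    // One node per name; plays the role of a unique index in this sketch.
    private final Map<String, NamedNode> byName = new HashMap<>();

    /** Returns the existing node for this name, creating it on first use. */
    public NamedNode getOrCreate(String name) {
        return byName.computeIfAbsent(name, NamedNode::new);
    }

    public int size() {
        return byName.size();
    }

    public static void main(String[] args) {
        UniqueNodes nodes = new UniqueNodes();
        NamedNode first = nodes.getOrCreate("owl:Thing");
        NamedNode second = nodes.getOrCreate("owl:Thing");
        // Both calls yield the same node, so the graph gets only one owl:Thing.
        System.out.println(first == second);
        System.out.println(nodes.size());
    }
}
```

This property matters for the import: classes and individuals are visited several times (as children, parents, and property values), and each visit must end up at the same node.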

Step 3

Get all the classes defined in the ontology and add them to the graph.
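        for (OWLClass c : ontology.getClassesInSignature(true)) {
            String classString = c.toString();
            // keep only the local name between '#' and the closing '>'
            if (classString.contains("#")) {
                classString = classString.substring(
                        classString.indexOf("#") + 1, classString.lastIndexOf(">"));
            }
            Node classNode = getOrCreateNodeWithUniqueFactory(classString);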
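
The same name-shortening logic recurs for classes, individuals, and properties, so it could be pulled into one small helper. The following is a sketch of such a helper, not part of the original code; the name localName is made up here, and it assumes that names are rendered as a full IRI in angle brackets, with the short name after the '#'.

```java
public class LocalNames {

    /**
     * Returns the part of a rendered name after the first '#', without a
     * trailing '>'. Strings without a '#' are returned unchanged.
     */
    public static String localName(String rendered) {
        int hash = rendered.indexOf('#');
        if (hash < 0) {
            return rendered; // nothing to strip, e.g. "owl:Thing"
        }
        int end = rendered.lastIndexOf('>');
        if (end <= hash) {
            end = rendered.length(); // no closing bracket: keep the rest
        }
        return rendered.substring(hash + 1, end);
    }

    public static void main(String[] args) {
        System.out.println(localName("<http://www.w3.org/ns/prov#Agent>"));
        System.out.println(localName("owl:Thing"));
    }
}
```

Short names keep the node names and relationship types in the graph readable, at the cost of possible clashes if two ontologies define the same local name in different namespaces.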

Step 4

Find out if they have any super classes. If they do, link them. If they don't, link back to owl:Thing. Make sure only to link to the direct super classes! The relationship type used to express the rdf:type property is a custom one named “isA”.
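            NodeSet<OWLClass> superclasses = reasoner.getSuperClasses(c, true);

            if (superclasses.isEmpty()) {
                // no direct superclass: link the class back to owl:Thing
                classNode.createRelationshipTo(thingNode,
                        DynamicRelationshipType.withName("isA"));
            } else {
                for (org.semanticweb.owlapi.reasoner.Node<OWLClass>
                        parentOWLNode : superclasses) {
                    OWLClassExpression parent =
                            parentOWLNode.getRepresentativeElement();
                    String parentString = parent.toString();
                    if (parentString.contains("#")) {
                        parentString = parentString.substring(
                                parentString.indexOf("#") + 1,
                                parentString.lastIndexOf(">"));
                    }
                    Node parentNode =
                            getOrCreateNodeWithUniqueFactory(parentString);
                    classNode.createRelationshipTo(parentNode,
                            DynamicRelationshipType.withName("isA"));
                }
            }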
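
Why only the direct super classes? Linking a class to every ancestor would add shortcut edges that carry no extra information and clutter the graph. The reasoner already filters them out when asked for direct super classes; the plain-Java sketch below only illustrates the idea on a made-up three-class hierarchy (Experimenter, Person, Agent), which is an assumption for this example and not the actual BonFIRE ontology.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class DirectSuperclasses {

    /**
     * Given all ancestors of every class, returns the direct parents of cls:
     * the ancestors that are not themselves ancestors of another ancestor.
     * A class without any ancestor is linked to "owl:Thing".
     */
    public static Set<String> directParents(String cls, Map<String, Set<String>> ancestors) {
        Set<String> all = ancestors.getOrDefault(cls, Collections.<String>emptySet());
        Set<String> direct = new TreeSet<>(all);
        for (String ancestor : all) {
            // Anything above this ancestor is reachable through it, so it is not direct.
            direct.removeAll(ancestors.getOrDefault(ancestor, Collections.<String>emptySet()));
        }
        if (direct.isEmpty()) {
            direct.add("owl:Thing");
        }
        return direct;
    }

    /** Hypothetical hierarchy: Experimenter is a Person, and a Person is an Agent. */
    public static Map<String, Set<String>> sampleHierarchy() {
        Map<String, Set<String>> ancestors = new HashMap<>();
        ancestors.put("Experimenter", new HashSet<>(Arrays.asList("Person", "Agent")));
        ancestors.put("Person", new HashSet<>(Arrays.asList("Agent")));
        ancestors.put("Agent", new HashSet<String>());
        return ancestors;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> hierarchy = sampleHierarchy();
        System.out.println(directParents("Experimenter", hierarchy)); // only Person, not Agent
        System.out.println(directParents("Agent", hierarchy));        // falls back to owl:Thing
    }
}
```

The indirect link from Experimenter to Agent is not lost: it can still be found in the graph by following two isA relationships.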

Step 5

Now for each class, get all the individuals. Create nodes and link them back to their parent class.
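            for (org.semanticweb.owlapi.reasoner.Node<OWLNamedIndividual> in
                    : reasoner.getInstances(c, true)) {
                OWLNamedIndividual i = in.getRepresentativeElement();
                String indString = i.toString();
                if (indString.contains("#")) {
                    indString = indString.substring(
                            indString.indexOf("#") + 1, indString.lastIndexOf(">"));
                }
                Node individualNode =
                        getOrCreateNodeWithUniqueFactory(indString);
                // link the individual back to its class
                individualNode.createRelationshipTo(classNode,
                        DynamicRelationshipType.withName("isA"));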

Step 6

For each individual, get all object properties and all data properties. Add them to the graph as node properties or relationships. Make sure to get all axioms, not just the asserted ones.
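                for (OWLObjectPropertyExpression objectProperty :
                        ontology.getObjectPropertiesInSignature()) {

                    for (org.semanticweb.owlapi.reasoner.Node<OWLNamedIndividual>
                            object : reasoner.getObjectPropertyValues(i,
                            objectProperty)) {
                        String reltype = objectProperty.toString();
                        reltype = reltype.substring(reltype.indexOf("#") + 1,
                                reltype.lastIndexOf(">"));
                        String s =
                                object.getRepresentativeElement().toString();
                        s = s.substring(s.indexOf("#") + 1, s.lastIndexOf(">"));
                        Node objectNode = getOrCreateNodeWithUniqueFactory(s);
                        // object properties become relationships
                        individualNode.createRelationshipTo(objectNode,
                                DynamicRelationshipType.withName(reltype));
                    }
                }

                for (OWLDataPropertyExpression dataProperty :
                        ontology.getDataPropertiesInSignature()) {

                    for (OWLLiteral object : reasoner.getDataPropertyValues(
                            i, dataProperty.asOWLDataProperty())) {
                        String reltype =
                                dataProperty.asOWLDataProperty().toString();
                        reltype = reltype.substring(reltype.indexOf("#") + 1,
                                reltype.lastIndexOf(">"));
                        String s = object.toString();
                        // data properties become node properties
                        individualNode.setProperty(reltype, s);
                    }
                }
            } // end of the loop over individuals
        } // end of the loop over classes
        tx.success();
    } finally {
        tx.finish(); // Neo4j 1.x API
    }
}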
That's it, you're done! Now for the fun bit: querying the ontology!


This is the graph now sitting in the database:

It has the ontology as well as all the individuals and properties, represented in their “natural” form. Now the querying can begin, for example with a simple query to find out what happened to a specific entity (in this case an experiment) during its lifecycle:
START e=node:name(name="experiment123"), ag=node:name(name="Agent")
MATCH e-[r:hadActivity]->ac-->a-[:isA*]->ag
RETURN distinct e.name as experiment, type(r) as relationship, a.name as agent,
ac.name as activity, ac.startedAtTime as starttime, ac.endedAtTime as endtime
ORDER BY starttime
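
The [:isA*] part of the query follows one or more isA relationships, so it matches any node that is an Agent directly or through a chain of classes. To make that concrete, here is a plain-Java sketch of the same reachability check on a small in-memory map. The node names and edges are invented for illustration and the code is unrelated to how Neo4j evaluates the pattern internally.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class IsAPath {

    /** True if target can be reached from start over one or more isA edges. */
    public static boolean isA(String start, String target, Map<String, List<String>> isAEdges) {
        Deque<String> toVisit = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();
        toVisit.push(start);
        while (!toVisit.isEmpty()) {
            String current = toVisit.pop();
            if (!seen.add(current)) {
                continue; // already expanded
            }
            for (String parent : isAEdges.getOrDefault(current, Collections.<String>emptyList())) {
                if (parent.equals(target)) {
                    return true;
                }
                toVisit.push(parent);
            }
        }
        return false;
    }

    /** Hypothetical edges: an individual, its class, and the class hierarchy above it. */
    public static Map<String, List<String>> sampleEdges() {
        Map<String, List<String>> edges = new HashMap<>();
        edges.put("Experimenter_Bert", Arrays.asList("Experimenter"));
        edges.put("Experimenter", Arrays.asList("Agent"));
        edges.put("Agent", Arrays.asList("owl:Thing"));
        return edges;
    }

    public static void main(String[] args) {
        Map<String, List<String>> edges = sampleEdges();
        System.out.println(isA("Experimenter_Bert", "Agent", edges)); // two hops away
        System.out.println(isA("Agent", "Experimenter", edges));      // edges only point upwards
    }
}
```

In Cypher the whole traversal is a single pattern element, which is a large part of why querying the imported ontology feels natural.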


Protégé comes with a simple visualization and the possibility to execute SPARQL queries. Neo4j has Cypher, which makes querying the imported ontology much more intuitive; ontologies are graphs, after all. The webadmin interface also allows better "exploring" of the graph. Time is not an issue in this case, because the ontology import is not time-critical: it is done only once, after the experiment has finished, and imports the whole ontology. For an ontology containing several hours of experiment data, the import takes only a few seconds. Once the graph has been imported, querying is fast, which makes Neo4j a great tool for analyzing and visualizing ontologies.



Published at DZone with permission of Andreas Kollegger, DZone MVB. See the original article here.

