What we’re going to be talking about today is JCypher: Where it came from, what it is today and where we’re going in the future:
An Introduction to the JCypher Java DSL
I first came into contact with the Neo4j graph database and the Cypher graph query language about two years ago. I was immediately fascinated and wanted to start working with it, so I decided to write a native Java DSL – a domain specific language for Cypher.
This DSL is basically a fluent Java API that lets you chain method calls to formulate language expressions while avoiding method nesting as much as possible. We try to keep method calls to no more than one argument so that the DSL is intuitive to read and write and, most importantly, gets the most out of the completion proposals provided by modern Java IDEs.
In the background, this DSL creates Cypher expressions. I also introduced a database abstraction layer so that we could execute these expressions against a graph database in a uniform way – regardless of whether or not the database was remote, embedded or in memory.
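To make the idea of a fluent API that produces Cypher text more concrete, here is a minimal sketch. It is not JCypher's actual API: the class NodePattern and the methods label, property, value and matchAndReturn are hypothetical names invented for this illustration. It only shows how chained, single-argument calls can build up a Cypher expression in the background.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical mini-DSL (NOT JCypher's API): every method takes at most one
// argument and returns the builder, so calls can be chained without nesting.
class NodePattern {
    final String variable;
    private String label = "";
    private final List<String> props = new ArrayList<>();
    private String pendingKey;

    NodePattern(String variable) { this.variable = variable; }

    NodePattern label(String name) { this.label = ":" + name; return this; }

    NodePattern property(String key) { this.pendingKey = key; return this; }

    NodePattern value(String text) {
        if (pendingKey == null) {
            throw new IllegalStateException("call property(...) before value(...)");
        }
        // Sketch only: a real DSL would escape quotes or use query parameters.
        props.add(pendingKey + ": '" + text + "'");
        pendingKey = null;
        return this;
    }

    String toCypher() {
        String map = props.isEmpty() ? "" : " {" + String.join(", ", props) + "}";
        return "(" + variable + label + map + ")";
    }
}

class FluentCypherSketch {
    // Turns a node pattern into a complete MATCH ... RETURN expression.
    static String matchAndReturn(NodePattern node) {
        return "MATCH " + node.toCypher() + " RETURN " + node.variable;
    }

    public static void main(String[] args) {
        NodePattern person = new NodePattern("p")
                .label("Person")
                .property("lastName").value("Smith");
        // prints: MATCH (p:Person {lastName: 'Smith'}) RETURN p
        System.out.println(matchAndReturn(person));
    }
}
```

Because each call takes a single argument, an IDE can propose the next sensible method after every dot, which is the effect described above.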
Next, there was the question of what Java result such a query should return. The decision was to use a generic graph data model consisting of nodes, relationships, labels, types, and properties. Because the model is so simple, it’s easy to read, navigate and modify. And you can take a modified model, or even an empty model, and store it back into the database in a straightforward way by simply calling store on the model. You don’t have to write query expressions for the actual update because this is handled in the background.
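As a rough picture of such a generic graph model, the sketch below uses plain Java collections. The classes SketchNode, SketchRelation and SketchGraph are hypothetical stand-ins rather than JCypher types, and the store step is left out because it needs a database. The sketch only shows why a model of nodes, relationships, labels, types and properties is easy to navigate and modify.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for a generic graph model (not JCypher classes).
class SketchNode {
    final List<String> labels = new ArrayList<>();
    final Map<String, Object> properties = new LinkedHashMap<>();
}

class SketchRelation {
    final SketchNode start;
    final SketchNode end;
    final String type;
    final Map<String, Object> properties = new LinkedHashMap<>();

    SketchRelation(SketchNode start, String type, SketchNode end) {
        this.start = start;
        this.type = type;
        this.end = end;
    }
}

class SketchGraph {
    final List<SketchNode> nodes = new ArrayList<>();
    final List<SketchRelation> relations = new ArrayList<>();

    SketchNode createNode(String label) {
        SketchNode node = new SketchNode();
        node.labels.add(label);
        nodes.add(node);
        return node;
    }

    SketchRelation relate(SketchNode start, String type, SketchNode end) {
        SketchRelation relation = new SketchRelation(start, type, end);
        relations.add(relation);
        return relation;
    }

    // Navigate: all nodes reachable from 'from' over outgoing relations of 'type'.
    List<SketchNode> follow(SketchNode from, String type) {
        List<SketchNode> targets = new ArrayList<>();
        for (SketchRelation r : relations) {
            if (r.start == from && r.type.equals(type)) {
                targets.add(r.end);
            }
        }
        return targets;
    }
}

class GraphModelSketch {
    static SketchGraph sample() {
        SketchGraph graph = new SketchGraph();
        SketchNode john = graph.createNode("Person");
        john.properties.put("lastName", "Smith");
        SketchNode home = graph.createNode("Address");
        home.properties.put("street", "Market Street");
        graph.relate(john, "pointsOfContact", home);
        return graph;
    }

    public static void main(String[] args) {
        SketchGraph graph = sample();
        SketchNode john = graph.nodes.get(0);
        john.properties.put("firstName", "John"); // modify the model in place
        // prints: {street=Market Street}
        System.out.println(graph.follow(john, "pointsOfContact").get(0).properties);
    }
}
```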
Adding Abstraction Layers via Domain Mapping
At this point in our model, we have access to graph databases at different levels of abstraction: with a query DSL at a lower level, and with a generic graph model at a somewhat higher level.
The next question that naturally arises is: Are there more levels of abstraction that make sense? And the answer is yes.
The next level of abstraction, which we call domain mapping, is to take an arbitrarily complex graph of Java objects (POJOs, Plain Old Java Objects, or what we’ll call domain objects) and store it in the graph database for later retrieval. You don’t have to modify your objects or their classes, add annotations, or write a single line of mapping code or configuration, because JCypher provides a default mapping.
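One way to picture a default mapping that needs no annotations is plain reflection: use the class name as a label and the field names as property keys. The sketch below rests on that assumption and is not a description of how JCypher actually maps objects. MappedPerson, labelFor and toProperties are hypothetical names, and only fields declared directly on the class are handled.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Map;
import java.util.TreeMap;

// A plain domain class: no annotations, no mapping interface.
class MappedPerson {
    private final String firstName;
    private final String lastName;

    MappedPerson(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }
}

class DefaultMappingSketch {
    // Assumed convention: the simple class name becomes the node label.
    static String labelFor(Object pojo) {
        return pojo.getClass().getSimpleName();
    }

    // Assumed convention: each non-static declared field becomes a property.
    // Superclass fields and references to other objects are ignored here.
    static Map<String, Object> toProperties(Object pojo) {
        Map<String, Object> props = new TreeMap<>();
        for (Field field : pojo.getClass().getDeclaredFields()) {
            if (Modifier.isStatic(field.getModifiers()) || field.isSynthetic()) {
                continue;
            }
            field.setAccessible(true);
            try {
                props.put(field.getName(), field.get(pojo));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("cannot read field " + field.getName(), e);
            }
        }
        return props;
    }

    public static void main(String[] args) {
        MappedPerson john = new MappedPerson("John", "Smith");
        // prints: MappedPerson {firstName=John, lastName=Smith}
        System.out.println(labelFor(john) + " " + toProperties(john));
    }
}
```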
At that same level of abstraction, you need to have the ability to query your graph of domain objects, which JCypher does by providing another Java DSL called Domain Query Language. With Domain Query Language, you can formulate queries based on concepts of your domain model rather than concepts of the underlying graph model.
JCypher also provides some non-functional features such as transactions and concurrency support, which provides the ability for multiple clients to access a single database as well as multi-threaded access from within one client.
Consider the following data model:
In this example, companies and people are subjects. Each subject is related via its points of contact to one or more point-of-contact objects, each of which can be an electronic address or a postal address. A point of contact can be in a certain area, such as a city, which can in turn be part of other areas: states, countries, continents, etc. This is a simple but useful model.
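The model described above could be written as plain Java classes along these lines. This is a sketch for illustration: the class and field names (ModelSubject, pointsOfContact, partOf and so on) are assumptions derived from the description, not the classes of the actual sample project.

```java
import java.util.ArrayList;
import java.util.List;

// An area (city, state, country, continent) that may be part of a larger area.
class ModelArea {
    final String name;
    final ModelArea partOf; // null for a top-level area

    ModelArea(String name, ModelArea partOf) {
        this.name = name;
        this.partOf = partOf;
    }
}

// A point of contact can be located in an area (may be null).
abstract class ModelPointOfContact {
    ModelArea area;
}

class ModelAddress extends ModelPointOfContact {
    final String street;

    ModelAddress(String street, ModelArea area) {
        this.street = street;
        this.area = area;
    }
}

class ModelEContact extends ModelPointOfContact {
    final String eAddress;

    ModelEContact(String eAddress) {
        this.eAddress = eAddress;
    }
}

// Companies and persons are both subjects with points of contact.
abstract class ModelSubject {
    final List<ModelPointOfContact> pointsOfContact = new ArrayList<>();
}

class ModelPerson extends ModelSubject {
    final String firstName;
    final String lastName;

    ModelPerson(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }
}

class ModelCompany extends ModelSubject {
    final String name;

    ModelCompany(String name) {
        this.name = name;
    }
}

class DomainModelSketch {
    static ModelPerson sample() {
        ModelArea usa = new ModelArea("USA", null);
        ModelArea california = new ModelArea("California", usa);
        ModelArea sanFrancisco = new ModelArea("San Francisco", california);
        ModelPerson john = new ModelPerson("John", "Smith");
        john.pointsOfContact.add(new ModelAddress("Market Street 20", sanFrancisco));
        john.pointsOfContact.add(new ModelEContact("john.smith@example.com"));
        return john;
    }

    public static void main(String[] args) {
        ModelPerson john = sample();
        ModelAddress home = (ModelAddress) john.pointsOfContact.get(0);
        // prints: San Francisco is part of California
        System.out.println(home.area.name + " is part of " + home.area.partOf.name);
    }
}
```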
Now let’s have a look at some Java code:
First, we instantiate some domain objects and populate the graph. This happens behind the create method, which returns a list of persons that serve as the root objects of the domain object graph.
Next, to access a graph database, we instantiate a DBaccess object by means of a factory. We specify the database connection as remote and add some properties, most importantly the URL of the Neo4j server. Because we’re working with domain models, we also need a domain access object, which we again create by means of a factory.
A business domain must have a unique name within a database, and in our case the name is People Domain. Now, with the domain access at hand, we simply call store, passing in the domain objects, and the entire graph of domain objects is stored in the database.
Executing Domain Queries Using JCypher
To formulate and execute a domain query, we create a query object from the domain access:
Next, we create one or more domain object matches, which play a central role in domain queries because each one serves to match domain objects of a certain type. In our case, we match objects of type person.
Next, we specify some constraints on that domain object match using WHERE clauses. We specify the person(s) as having the last name “Smith” and the first name “John.” Consecutive WHERE clauses are ANDed by default. If you want an “or,” you need to insert an OR clause, and you can use brackets to nest those expressions arbitrarily.
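The semantics of these constraints (AND by default, explicit OR, grouping with brackets) can be modeled with the standard java.util.function.Predicate. The sketch below does not use the Domain Query Language itself. WherePerson and the helpers lastName and firstName are hypothetical, and “Angelina” and “Maier” are made-up sample data.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class WherePerson {
    final String firstName;
    final String lastName;

    WherePerson(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }
}

class WhereClauseSketch {
    static Predicate<WherePerson> lastName(String value) {
        return p -> p.lastName.equals(value);
    }

    static Predicate<WherePerson> firstName(String value) {
        return p -> p.firstName.equals(value);
    }

    // Applies the combined constraint and returns the matching first names.
    static List<String> firstNames(List<WherePerson> people, Predicate<WherePerson> where) {
        return people.stream()
                .filter(where)
                .map(p -> p.firstName)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<WherePerson> people = Arrays.asList(
                new WherePerson("John", "Smith"),
                new WherePerson("Angelina", "Smith"),
                new WherePerson("John", "Maier"));

        // Two consecutive constraints, combined with AND.
        Predicate<WherePerson> johnSmith = lastName("Smith").and(firstName("John"));

        // An explicit OR; the nested expression plays the role of the brackets.
        Predicate<WherePerson> johnOrAngelinaSmith =
                lastName("Smith").and(firstName("John").or(firstName("Angelina")));

        System.out.println(firstNames(people, johnSmith));           // prints: [John]
        System.out.println(firstNames(people, johnOrAngelinaSmith)); // prints: [John, Angelina]
    }
}
```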
With this query, we are trying to determine who else lives at John Smith’s addresses, which we do by using a graph traversal clause. We start traversing from John Smith forward (FORTH) via points of contact, which brings us to John Smith’s addresses, and then continue backward via points of contact to objects of type person. In that way we find the other persons who live at John Smith’s addresses. It’s that simple!
Next, we need to execute the query, and we can retrieve the actual result for every domain object match that we have specified in the query. In our case, we retrieve a list of persons who live at John Smith’s addresses. And that is a complete query against our domain model.
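To show what the forward and backward traversal computes, here is an in-memory sketch over plain Java objects. It does not use JCypher or a database. TravPerson, TravAddress and coResidents are hypothetical names, and leaving the starting person out of the result is a choice made here to match the “who else” wording.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class TravAddress {
    final String street;

    TravAddress(String street) {
        this.street = street;
    }
}

class TravPerson {
    final String firstName;
    final String lastName;
    final List<TravAddress> pointsOfContact = new ArrayList<>();

    TravPerson(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }
}

class TraversalSketch {
    static List<TravPerson> coResidents(TravPerson start, List<TravPerson> everyone) {
        // Forward step: from the start person via pointsOfContact to addresses.
        Set<TravAddress> addresses = new LinkedHashSet<>(start.pointsOfContact);

        // Backward step: from those addresses back to persons who reference them.
        List<TravPerson> result = new ArrayList<>();
        for (TravPerson candidate : everyone) {
            if (candidate == start) {
                continue; // "who else": skip the start person
            }
            for (TravAddress address : candidate.pointsOfContact) {
                if (addresses.contains(address)) {
                    result.add(candidate);
                    break;
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        TravAddress marketStreet = new TravAddress("Market Street 20");
        TravAddress elsewhere = new TravAddress("Main Street 1");

        TravPerson john = new TravPerson("John", "Smith");
        TravPerson angelina = new TravPerson("Angelina", "Smith");
        TravPerson jeremy = new TravPerson("Jeremy", "Maier");
        john.pointsOfContact.add(marketStreet);
        angelina.pointsOfContact.add(marketStreet);
        jeremy.pointsOfContact.add(elsewhere);

        for (TravPerson p : coResidents(john, Arrays.asList(john, angelina, jeremy))) {
            System.out.println(p.firstName + " " + p.lastName); // prints: Angelina Smith
        }
    }
}
```

In a graph database the backward step follows stored relationships rather than scanning every person, which is why this kind of navigation stays cheap there.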
In the next query, we want to know how many of the people who live at John Smith’s addresses live in Europe. For this query, we only need to add a MATCH for the object of type “area” with the name “Europe”:
In the next step in this query, we collect all areas of all of Smith’s addresses, which is again done with a graph traversal clause. So we start traversing from Smith forward via points of contact, which brings us to Smith’s addresses.
We then continue forward via area, which brings us to the immediate areas of the addresses, and then we recursively collect all areas that are reachable via the “part of” attribute. This leads to another domain object match.
Now we have collected all areas of all Smith’s addresses. To complete the query, we use a SELECT clause to select all of the Smiths for which the collected areas contain Europe. Next we execute the query and retrieve the actual result, which is a list of persons with the last name Smith who have addresses in Europe.
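The logic of this query, collecting every area reachable through the “part of” chain and then selecting the persons whose collected areas contain Europe, can be sketched in memory as follows. This is plain Java, not the Domain Query Language. AreaNode, AreaAddress, AreaPerson and the methods reachableAreas and livingIn are hypothetical names, and the sample places are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class AreaNode {
    final String name;
    final AreaNode partOf; // null for a top-level area

    AreaNode(String name, AreaNode partOf) {
        this.name = name;
        this.partOf = partOf;
    }
}

class AreaAddress {
    final AreaNode area;

    AreaAddress(AreaNode area) {
        this.area = area;
    }
}

class AreaPerson {
    final String firstName;
    final String lastName;
    final List<AreaAddress> pointsOfContact = new ArrayList<>();

    AreaPerson(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }
}

class AreaSelectSketch {
    // Forward via pointsOfContact, forward via area, then follow partOf upward.
    static Set<AreaNode> reachableAreas(AreaPerson person) {
        Set<AreaNode> areas = new LinkedHashSet<>();
        for (AreaAddress address : person.pointsOfContact) {
            AreaNode current = address.area;
            // add() returns false for an area already collected, which also
            // means its ancestors were collected, so the walk can stop there.
            while (current != null && areas.add(current)) {
                current = current.partOf;
            }
        }
        return areas;
    }

    // The selection step: keep persons whose collected areas contain the target.
    static List<AreaPerson> livingIn(List<AreaPerson> persons, AreaNode target) {
        List<AreaPerson> selected = new ArrayList<>();
        for (AreaPerson person : persons) {
            if (reachableAreas(person).contains(target)) {
                selected.add(person);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        AreaNode europe = new AreaNode("Europe", null);
        AreaNode munich = new AreaNode("Munich", new AreaNode("Germany", europe));
        AreaNode northAmerica = new AreaNode("North America", null);
        AreaNode sanFrancisco = new AreaNode("San Francisco", new AreaNode("USA", northAmerica));

        AreaPerson john = new AreaPerson("John", "Smith");
        john.pointsOfContact.add(new AreaAddress(munich));
        AreaPerson angelina = new AreaPerson("Angelina", "Smith");
        angelina.pointsOfContact.add(new AreaAddress(sanFrancisco));

        for (AreaPerson p : livingIn(Arrays.asList(john, angelina), europe)) {
            System.out.println(p.firstName + " " + p.lastName); // prints: John Smith
        }
    }
}
```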
You can do a lot more interesting and powerful things with domain queries, which are all described in the project’s documentation and the separate samples project. With all of those domain queries, you don’t have to worry about optimizing database structure or mappings for query performance.
Also, because the graph of domain objects is backed by a graph database, navigating the database is really cheap. In contrast, with relational databases you need to optimize database structures almost on a per-query basis, especially when it comes to navigating highly connected data.
The Future of JCypher
New JCypher features are being added on a regular basis. For example, in one of the next releases, you’ll be able to store domain queries for later use, even if you don’t have access to the Java code that originally created them.
Another new feature is called JCypher-Server, a server-side implementation of JCypher that provides a RESTful API along with a web UI. This will allow users to experiment with domain models and domain queries. It’s in the very early stages, but its functionality will grow over the next few months.