Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

Neo4j: Set Based Operations with the Experimental Cypher Optimizer

DZone's Guide to

Neo4j: Set Based Operations with the Experimental Cypher Optimizer

· Java Zone ·
Free Resource

Verify, standardize, and correct the Big 4 + more– name, email, phone and global addresses – try our Data Quality APIs now at Melissa Developer Portal!

A few months ago I wrote about cypher queries which look for a missing relationship and showed how you could optimise them by re-working the query slightly.

To refresh, we wanted to find all the people in the London office that I hadn’t worked with given this model…

…and this initial query:

MATCH (p:Person {name: "me"})-[:MEMBER_OF]->(office {name: "London Office"})<-[:MEMBER_OF]-(colleague)
WHERE NOT (p-[:COLLEAGUES]->(colleague))
RETURN COUNT(colleague)

This took on average 7.46 seconds to execute using cypher-query-tuning so we came up with the following version which took 150 ms on average:

MATCH (p:Person {name: "me"})-[:COLLEAGUES]->(colleague)
WITH p, COLLECT(colleague) as marksColleagues
MATCH (colleague)-[:MEMBER_OF]->(office {name: "London Office"})<-[:MEMBER_OF]-(p)
WHERE NOT (colleague IN marksColleagues)
RETURN COUNT(colleague)

With the release of Neo4j 2.1 we can now make use of Ronja – the experimental Cypher optimiser – which performs much better for certain types of queries. I thought I’d give it a try against this one.

We can use the experimental optimiser by prefixing our query like so:

cypher 2.1.experimental MATCH (p:Person {name: "me"})-[:MEMBER_OF]->(office {name: "London Office"})<-[:MEMBER_OF]-(colleague)
WHERE NOT (p-[:COLLEAGUES]->(colleague))
RETURN COUNT(colleague)

If we run that through the query tuner we get the following results:

$ python set-based.py
 
cypher 2.1.experimental MATCH (p:Person {name: "me"})-[:MEMBER_OF]->(office {name: "London Office"})<-[:MEMBER_OF]-(colleague)
WHERE NOT (p-[:COLLEAGUES]->(colleague))
RETURN COUNT(colleague)
Min 0.719580888748 50% 0.723278999329 95% 0.741609430313 Max 0.743646144867
 
 
MATCH (p:Person {name: "me"})-[:COLLEAGUES]->(colleague)
WITH p, COLLECT(colleague) as marksColleagues
MATCH (colleague)-[:MEMBER_OF]->(office {name: "London Office"})<-[:MEMBER_OF]-(p)
WHERE NOT (colleague IN marksColleagues)
RETURN COUNT(colleague)
Min 0.706955909729 50% 0.715770959854 95% 0.731880950928 Max 0.733670949936

As you can see there’s not much in it – our original query now runs as quickly as the optimised one. Ronja #ftw!

Give it a try on your slow queries and see how it gets on. There’ll certainly be some cases where it’s slower but over time it should be faster for a reasonable chunk of queries.


Developers! Quickly and easily gain access to the tools and information you need! Explore, test and combine our data quality APIs at Melissa Developer Portal – home to tools that save time and boost revenue. Our APIs verify, standardize, and correct the Big 4 + more – name, email, phone and global addresses – to ensure accurate delivery, prevent blacklisting and identify risks in real-time.

Topics:

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}