Over a million developers have joined DZone.

neo4j/cypher/Lucene: Dealing with special characters

DZone's Guide to

neo4j/cypher/Lucene: Dealing with special characters

· Database Zone
Free Resource

Whether you work in SQL Server Management Studio or Visual Studio, Redgate tools integrate with your existing infrastructure, enabling you to align DevOps for your applications with DevOps for your SQL Server databases. Discover true Database DevOps, brought to you in partnership with Redgate.

neo4j uses Lucene to handle indexing of nodes and relationships in the graph but something that can be a bit confusing at first is how to handle special characters in Lucene queries.

For example let’s say we set up a database with the following data:

CREATE ({name: "-one"})
CREATE ({name: "-two"})
CREATE ({name: "-three"})
CREATE ({name: "four"})

And for whatever reason we only wanted to return the nodes that begin with a hyphen.

A hyphen is a special character in Lucene so if we forget to escape it we’ll end up with an impressive stack trace:

START p = node:node_auto_index("name:-*") RETURN p;
==> RuntimeException: org.apache.lucene.queryParser.ParseException: Cannot parse 'name:-*': Encountered " "-" "- "" at line 1, column 5.
==> Was expecting one of:
==>     <BAREOPER> ...
==>     "(" ...
==>     "*" ...
==>     <QUOTED> ...
==>     <TERM> ...
==>     <PREFIXTERM> ...
==>     <WILDTERM> ...
==>     "[" ...
==>     "{" ...
==>     <NUMBER> ...

So we change our query to escape the hyphen:

START p = node:node_auto_index("name:\-*") RETURN p;

which results in the following exception:

==> SyntaxException: invalid escape sequence
==> Think we should have better error message here? Help us by sending this query to cypher@neo4j.org.
==> Thank you, the Neo4j Team.
==> "START p = node:node_auto_index("name:\-*") RETURN p"

The problem is that the cypher parser also treats ‘\’ as an escape character so we need to use two of them to make our query do what we want:

START p = node:node_auto_index("name:\\-*") RETURN p;
==> +------------------------+
==> | p                      |
==> +------------------------+
==> | Node[4]{name:"-one"}   |
==> | Node[5]{name:"-two"}   |
==> | Node[6]{name:"-three"} |
==> +------------------------+
==> 3 rows

Alternatively, as Chris pointed out, we could make use of parameters in which case we don’t need to worry about how the cypher parser handles escaping:

require 'neography'
neo = Neography::Rest.new
query = "START p = node:node_auto_index({query}) RETURN p"
result = neo.execute_query(query, { :query => 'name:\-*'})
p result["data"].map { |x| x[0]["data"] }
$ bundle exec ruby params.rb
[{"name"=>"-one"}, {"name"=>"-two"}, {"name"=>"-three"}]

It’s easier than you think to extend DevOps practices to SQL Server with Redgate tools. Discover how to introduce true Database DevOps, brought to you in partnership with Redgate


Published at DZone with permission of Mark Needham, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}