Over a million developers have joined DZone.
Silver Partner

neo4j/cypher/Lucene: Dealing with special characters

· Database Zone

The Database Zone is brought to you in partnership with DBmaestro.  Learn more about what Continuous Delivery for the Database actually is and overcome mistrust issues.

neo4j uses Lucene to handle indexing of nodes and relationships in the graph but something that can be a bit confusing at first is how to handle special characters in Lucene queries.

For example let’s say we set up a database with the following data:

CREATE ({name: "-one"})
CREATE ({name: "-two"})
CREATE ({name: "-three"})
CREATE ({name: "four"})

And for whatever reason we only wanted to return the nodes that begin with a hyphen.

A hyphen is a special character in Lucene so if we forget to escape it we’ll end up with an impressive stack trace:

START p = node:node_auto_index("name:-*") RETURN p;
==> RuntimeException: org.apache.lucene.queryParser.ParseException: Cannot parse 'name:-*': Encountered " "-" "- "" at line 1, column 5.
==> Was expecting one of:
==>     <BAREOPER> ...
==>     "(" ...
==>     "*" ...
==>     <QUOTED> ...
==>     <TERM> ...
==>     <PREFIXTERM> ...
==>     <WILDTERM> ...
==>     "[" ...
==>     "{" ...
==>     <NUMBER> ...
==>

So we change our query to escape the hyphen:

START p = node:node_auto_index("name:\-*") RETURN p;

which results in the following exception:

==> SyntaxException: invalid escape sequence
==> 
==> Think we should have better error message here? Help us by sending this query to cypher@neo4j.org.
==> 
==> Thank you, the Neo4j Team.
==> 
==> "START p = node:node_auto_index("name:\-*") RETURN p"
==>

The problem is that the cypher parser also treats ‘\’ as an escape character so we need to use two of them to make our query do what we want:

START p = node:node_auto_index("name:\\-*") RETURN p;
==> +------------------------+
==> | p                      |
==> +------------------------+
==> | Node[4]{name:"-one"}   |
==> | Node[5]{name:"-two"}   |
==> | Node[6]{name:"-three"} |
==> +------------------------+
==> 3 rows

Alternatively, as Chris pointed out, we could make use of parameters in which case we don’t need to worry about how the cypher parser handles escaping:

require 'neography'
 
neo = Neography::Rest.new
query = "START p = node:node_auto_index({query}) RETURN p"
result = neo.execute_query(query, { :query => 'name:\-*'})
 
p result["data"].map { |x| x[0]["data"] }
$ bundle exec ruby params.rb
[{"name"=>"-one"}, {"name"=>"-two"}, {"name"=>"-three"}]



















The Database Zone is brought to you in partnership with DBmaestro.  Learn more about what Continuous Delivery for the Database actually is and overcome mistrust issues.

Topics:

Published at DZone with permission of Mark Needham , DZone MVB .

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}