Full-Text-Indexing (FTS) in Neo4j 2.0
With Neo4j 2.0, we got automatic schema indexes based on labels and properties for exact lookups of nodes on property values.
Fulltext and other indexes (spatial, range) are on the roadmap but not addressed yet.
For fulltext indexes you still have to use legacy indexes.
As you probably don’t want to add nodes to an index manually, the existing “auto-index” mechanism should be a good fit.
To use it, you first have to configure the auto-index as a fulltext index and then enable it in your settings.
Setup Node Auto-Index as Fulltext-Index
To configure the auto-index as fulltext index for your Neo4j Server use:
POST http://localhost:7474/db/data/index/node/
Accept: application/json; charset=UTF-8
Content-Type: application/json
{
"name" : "node_auto_index",
"config" : {
"type" : "fulltext",
"provider" : "lucene"
}
}
You should get a response like this:
201: Created
Content-Type: application/json; charset=UTF-8
Location: http://localhost:7474/db/data/index/node/node_auto_index/
{
"template" : "http://localhost:7474/db/data/index/node/node_auto_index/{key}/{value}",
"type" : "fulltext",
"provider" : "lucene"
}
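If you would rather send that request from code than from an HTTP client, a minimal sketch using only the JDK could look like the following. The class and method names (FulltextIndexSetup, indexConfigJson, createIndex) are made up for illustration, and the sketch assumes the index name needs no JSON escaping and that the server is reachable at the base URL you pass in.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Neo4j: sends the index configuration request.
final class FulltextIndexSetup {

    // Builds the same JSON body as the request shown above.
    // Assumption: indexName contains no characters that need JSON escaping.
    static String indexConfigJson(String indexName) {
        return "{\"name\":\"" + indexName + "\","
             + "\"config\":{\"type\":\"fulltext\",\"provider\":\"lucene\"}}";
    }

    // POSTs the configuration and returns the HTTP status code (201 means created).
    static int createIndex(String baseUrl, String indexName) throws IOException {
        URL url = new URL(baseUrl + "/db/data/index/node/");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Accept", "application/json; charset=UTF-8");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setDoOutput(true);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(indexConfigJson(indexName).getBytes(StandardCharsets.UTF_8));
        }
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }
}
```

Calling FulltextIndexSetup.createIndex("http://localhost:7474", "node_auto_index") against a running server should then return the status code of the response.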
Enable Node Auto-Index for certain properties
Configure and enable the auto-index in your conf/neo4j-server.properties. You have to enable the auto-index and also list the properties to be indexed upfront, before you insert any data.
node_auto_indexing=true
node_keys_indexable=title,description
If you enable it after the fact, existing nodes are not indexed until their properties are written again, so you have to re-set the properties with a Cypher statement like this:
MATCH (n)
WHERE has(n.title)
SET n.title=n.title
If you already have many nodes in your database, you have to batch the update manually to stay within the transaction size limits, like this (increase SKIP in steps of 50000, starting from 0, until the query returns zero):
MATCH (n)
WHERE has(n.title)
WITH n
SKIP 150000 LIMIT 50000
SET n.title=n.title
RETURN COUNT(*)
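The manual stepping through SKIP values can be automated with a small loop. The following is a sketch only: BatchedReindex and its methods are hypothetical names, and runCypher stands in for whatever you use to execute a statement and read back the single count it returns (REST client, embedded execution engine, and so on). The generated statement places SKIP and LIMIT after a WITH clause, which Cypher requires.

```java
import java.util.function.ToLongFunction;

// Hypothetical helper, not part of Neo4j: drives the batched re-set of a property.
final class BatchedReindex {

    // Builds one batch statement for the given property and window.
    // Assumption: property is a plain identifier such as "title".
    static String batchStatement(String property, long skip, long limit) {
        return "MATCH (n) WHERE has(n." + property + ") "
             + "WITH n SKIP " + skip + " LIMIT " + limit + " "
             + "SET n." + property + " = n." + property + " "
             + "RETURN count(*)";
    }

    // Runs batches with SKIP 0, batchSize, 2 * batchSize, ... until a batch
    // reports zero rows, and returns the total number of rows touched.
    // runCypher is an assumed callback: it executes the statement in its own
    // transaction and returns the count from the single result row.
    static long reindex(String property, long batchSize, ToLongFunction<String> runCypher) {
        long total = 0;
        for (long skip = 0; ; skip += batchSize) {
            long touched = runCypher.applyAsLong(batchStatement(property, skip, batchSize));
            if (touched == 0) {
                return total;
            }
            total += touched;
        }
    }
}
```

Each statement should run in its own transaction; otherwise the batching does nothing to keep the transaction size down.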
Using the Fulltext Auto-Index
You can query the fulltext auto-index with a START clause in Cypher, and you can pass in any kind of Lucene query syntax there.
START movie=node:node_auto_index("title:matr*")
MATCH (movie:Movie)<-[r:RATED]-(user)
WHERE r.rating > 4
RETURN movie, count(*) AS number, avg(r.rating) AS ratings
ORDER BY ratings desc, number desc
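Because the index query is plain Lucene syntax, characters such as ":" or "(" in user-supplied search terms would be interpreted as operators. A rough sketch of building a prefix query like title:matr* from user input follows. LuceneQueries is a hypothetical helper, the list of special characters is my reading of the classic Lucene query parser syntax, and the sketch assumes the input is a single word without whitespace.

```java
// Hypothetical helper, not part of Neo4j or Lucene.
final class LuceneQueries {

    // Characters treated as operators by the classic Lucene query syntax
    // (assumption: this list covers the Lucene version bundled with your Neo4j).
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&";

    // Prefixes every special character with a backslash.
    static String escape(String term) {
        StringBuilder sb = new StringBuilder();
        for (char c : term.toCharArray()) {
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Builds "field:term*" with the term escaped; assumes a single-word term.
    static String prefixQuery(String field, String userInput) {
        return field + ":" + escape(userInput) + "*";
    }
}
```

For example, LuceneQueries.prefixQuery("title", "matr") yields the title:matr* query used in the statement above.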
Java-API
You can also set it up programmatically in the Java API like this:
db.index().forNodes( "node_auto_index",
MapUtil.stringMap( IndexManager.PROVIDER, "lucene", "type", "fulltext" ) );
Then pass the auto-index configuration to your embedded database:
GraphDatabaseService db = new GraphDatabaseFactory().newEmbeddedGraphDatabaseBuilder(DB_PATH)
.setConfig("node_auto_indexing","true").setConfig("node_keys_indexable","title,description")
.newGraphDatabase();
And then use it like this:
IndexHits<Node> nodes = db.index().forNodes( "node_auto_index").query("title:matr*");
for (Node n : nodes) {
// do something
}
// remember to close the IndexHits if you don't iterate it to the end
nodes.close();
Custom Configuration
You can also configure additional specifics for the fulltext index, such as a custom analyzer class. Just pass them in the config:
{
"name" : "node_auto_index",
"config" : {
"type" : "fulltext",
"provider" : "lucene",
"to_lower_case" : true,
"analyzer" : "com.example.indexing.MyAnalyzer"
}
}
Published at DZone with permission of , DZone MVB. See the original article here.