Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

DZone's Guide to

# Neo4j 2.1.2: Finding Where I Am in a Linked List

· Java Zone ·
Free Resource

Comment (0)

Save
{{ articles[0].views | formatCount}} Views

Download Microservices for Java Developers: A hands-on introduction to frameworks and containers. Brought to you in partnership with Red Hat.

I was recently asked how to calculate the position of a node in a linked list and realized that as the list increases in size, this is one of the occasions when we should write an unmanaged extension, rather than using Cypher.

I wrote a quick bit of code to create a linked list with 10,000 elements in it:

```public class Chains
{
public static void main(String[] args)
{
String simpleChains = "/tmp/longchains";
populate( simpleChains, 10000 );
}

private static void populate( String path, int chainSize )
{
GraphDatabaseService db = new GraphDatabaseFactory().newEmbeddedDatabase( path );
try(Transaction tx = db.beginTx()) {
Node currentNode = null;
for ( int i = 0; i < chainSize; i++ )
{
Node node = db.createNode();

if(currentNode != null) {
currentNode.createRelationshipTo( node, NEXT );
}
currentNode = node;
}
tx.success();
}

db.shutdown();
}
}```

To find our distance from the end of the linked list we could write the following cypher query:

```match n  where id(n) = {nodeId}  with n
match path = (n)-[:NEXT*]->()
RETURN id(n) AS nodeId, length(path) AS length
ORDER BY length DESC
LIMIT 1;```

For simplicity we’re finding a node by it’s internal node id and then finding the ‘NEXT’ relationships going out from this node recursively. We then filter the results so that we only get the longest path back which will be our distance to the end of the list.

I noticed that this query would sometimes take 10s of seconds so I wrote a version using the Java Traversal API to see whether I could get it any quicker.

This is the Java version:

```try(Transaction tx = db.beginTx()) {
Node startNode = db.getNodeById( nodeId );
TraversalDescription traversal = db.traversalDescription();
Traverser traverse = traversal
.depthFirst()
.relationships( NEXT, Direction.OUTGOING )
.sort( new Comparator<Path>()
{
@Override
public int compare( Path o1, Path o2 )
{
return Integer.valueOf( o2.length() ).compareTo( o1 .length() );
}
} )
.traverse( startNode );

Collection<Path> paths = IteratorUtil.asCollection( traverse );

int maxLength = traverse.iterator().next().length();
System.out.print( maxLength );

tx.failure();
}```

This is a bit more verbose than the cypher version but computes the same result. We’ve sorted the paths by length using a comparator to ensure we get the longest path back first.

I created a little program to warm up the caches and kick off a few iterations where I queried from different nodes and returned the length and time taken. These were the results:

```--------
(Traversal API) Node:    1, Length: 9998, Time (ms):  15
(Cypher) Node:    1, Length: 9998, Time (ms): 26225
(Traversal API) Node:  456, Length: 9543, Time (ms):  10
(Cypher) Node:  456, Length: 9543, Time (ms): 24881
(Traversal API) Node:  761, Length: 9238, Time (ms):   9
(Cypher) Node:  761, Length: 9238, Time (ms): 9941
--------
(Traversal API) Node:    1, Length: 9998, Time (ms):   9
(Cypher) Node:    1, Length: 9998, Time (ms): 12537
(Traversal API) Node:  456, Length: 9543, Time (ms):   8
(Cypher) Node:  456, Length: 9543, Time (ms): 15690
(Traversal API) Node:  761, Length: 9238, Time (ms):   7
(Cypher) Node:  761, Length: 9238, Time (ms): 9202
--------
(Traversal API) Node:    1, Length: 9998, Time (ms):   8
(Cypher) Node:    1, Length: 9998, Time (ms): 11905
(Traversal API) Node:  456, Length: 9543, Time (ms):   7
(Cypher) Node:  456, Length: 9543, Time (ms): 22296
(Traversal API) Node:  761, Length: 9238, Time (ms):   8
(Cypher) Node:  761, Length: 9238, Time (ms): 8739
--------```

Interestingly when I reduced the size of the linked list to 1000 the difference wasn’t so pronounced:

```--------
(Traversal API) Node:    1, Length: 998, Time (ms):   5
(Cypher) Node:    1, Length: 998, Time (ms): 174
(Traversal API) Node:  456, Length: 543, Time (ms):   2
(Cypher) Node:  456, Length: 543, Time (ms):  71
(Traversal API) Node:  761, Length: 238, Time (ms):   1
(Cypher) Node:  761, Length: 238, Time (ms):  13
--------
(Traversal API) Node:    1, Length: 998, Time (ms):   2
(Cypher) Node:    1, Length: 998, Time (ms): 111
(Traversal API) Node:  456, Length: 543, Time (ms):   1
(Cypher) Node:  456, Length: 543, Time (ms):  40
(Traversal API) Node:  761, Length: 238, Time (ms):   1
(Cypher) Node:  761, Length: 238, Time (ms):  12
--------
(Traversal API) Node:    1, Length: 998, Time (ms):   3
(Cypher) Node:    1, Length: 998, Time (ms): 129
(Traversal API) Node:  456, Length: 543, Time (ms):   2
(Cypher) Node:  456, Length: 543, Time (ms):  48
(Traversal API) Node:  761, Length: 238, Time (ms):   0
(Cypher) Node:  761, Length: 238, Time (ms):  12
--------```

which is good news as most linked lists that we’ll create will be in the 10s – 100s range rather than 10,000 which was what I was faced with.

I’m sure cypher will reach parity for this type of query in future which will be great as I like writing cypher much more than I do Java. For now though it’s good to know we have a backup option to call on when necessary.

The code is available as a gist if you want to play around with it further.

Download Building Reactive Microservices in Java: Asynchronous and Event-Based Application Design. Brought to you in partnership with Red Hat

Topics:

Comment (0)

Save
{{ articles[0].views | formatCount}} Views

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.