Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

Thoughts on Software Development Neo4j: Cypher -- Step by Step to Create a Linked List of Adjacent Nodes Using UNWIND

DZone's Guide to

Thoughts on Software Development Neo4j: Cypher -- Step by Step to Create a Linked List of Adjacent Nodes Using UNWIND

sing Neo4j 2.2's cyper query language rather than 2.0, we can actually solve this problem in what I believe is a more intuitive way using the UNWIND function.

· Database Zone
Free Resource

Running out of memory? Learn how Redis Enterprise enables large dataset analysis with the highest throughput and lowest latency while reducing costs over 75%! 

In late 2013 I wrote a post showing how to create a linked list connecting different football seasons together using Neo4j’s Cypher query language, a post I’ve frequently copy & pasted from!

Now 18 months later, and using Neo4j 2.2 rather than 2.0, we can actually solve this problem in what I believe is a more intuitive way using the UNWIND function. Credit for the idea goes to Michael, I’m just the messenger.

To recap, we had a collection of football seasons and we wanted to connect adjacent seasons to each other to allow easy querying between seasons. The following is the code we used:

1. CREATE (:Season {name: "2013/2014", timestamp: 1375315200})
2. CREATE (:Season {name: "2012/2013", timestamp: 1343779200})
3. CREATE (:Season {name: "2011/2012", timestamp: 1312156800})
4. CREATE (:Season {name: "2010/2011", timestamp: 1280620800})
5. CREATE (:Season {name: "2009/2010", timestamp: 1249084800})
01. MATCH (s:Season)
02. WITH s
03. ORDER BY s.timestamp
04. WITH COLLECT(s) AS seasons
05.   
06. FOREACH(i in RANGE(0, length(seasons)-2) |
07.     FOREACH(si in [seasons[i]] |
08.         FOREACH(si2 in [seasons[i+1]] |
09.             MERGE (si)-[:NEXT]->(si2))))

Our goal is to replace those 3 FOREACH loops with something a bit easier to understand. To start with, let’s run the first part of the query to get some intuition of what we’re trying to do:

01. MATCH (s:Season)
02. WITH s
03. ORDER BY s.timestamp
04. RETURN COLLECT(s) AS seasons
05.   
06. ==> +-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
07. ==> | seasons                                                                                                                                                                                                                                                     |
08. ==> +-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
09. ==> | [Node[1973]{timestamp:1249084800,name:"2009/2010"},Node[1972]{timestamp:1280620800,name:"2010/2011"},Node[1971]{timestamp:1312156800,name:"2011/2012"},Node[1970]{timestamp:1343779200,name:"2012/2013"},Node[1969]{timestamp:1375315200,name:"2013/2014"}] |
10. ==> +-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

So at this point we’ve got all the seasons in an array going from 2009/2010 up to 2013/2014. We want to create a ‘NEXT’ relationship between 2009/2010 -> 2010/2011, 2010/2011 -> 2011/2012 and so on.

To achieve this we need to get the adjacent seasons split into two columns, like so:

1. 2009/2010   2010/2011
2. 2010/2011   2011/2012
3. 2011/2012   2012/2013
4. 2012/2013   2013/2014

If we can get the data into that format then we can apply a MERGE between the two fields to create the ‘NEXT’ relationship. So how do we do that?

If we were in Python we’d be calling for the zip function which we could apply like this:

1. >>> seasons = ["2009/2010", "2010/2011", "2011/2012", "2012/2013", "2013/2014"]
2.   
3. >>> zip(seasons, seasons[1:])
4. [('2009/2010', '2010/2011'), ('2010/2011', '2011/2012'), ('2011/2012', '2012/2013'), ('2012/2013', '2013/2014')]

Unfortunately we don’t have an equivalent function in Cypher but we can achieve the same outcome by creating 2 columns with adjacent integer values. The RANGE and UNWIND functions are our friends here:

1. returnRANGE(0,4)
2.   
3. ==> +-------------+
4. ==> | RANGE(0,4)  |
5. ==> +-------------+
6. ==> | [0,1,2,3,4] |
7. ==> +-------------+
01. UNWIND RANGE(0,4) as idx
02. RETURN idx, idx +1;
03.   
04. ==> +--------------+
05. ==> | idx | idx +1|
06. ==> +--------------+
07. ==> | 0   | 1      |
08. ==> | 1   | 2      |
09. ==> | 2   | 3      |
10. ==> | 3   | 4      |
11. ==> | 4   | 5      |
12. ==> +--------------+
13. ==> 5rows

Now all we need to do is plug this code into our original query where ‘idx’ and ‘idx + 1′ represent indexes into the array of seasons. We use a range which stops 1 element early since there isn’t anywhere to connect our last season to:

01. MATCH (s:Season)
02. WITH s
03. ORDER BY s.timestamp
04. WITH COLLECT(s) AS seasons
05. UNWIND RANGE(0,LENGTH(seasons) - 2) as idx
06. RETURN seasons[idx], seasons[idx+1]
07.   
08. ==> +-------------------------------------------------------------------------------------------------------+
09. ==> | seasons[idx]                                      | seasons[idx+1]                                    |
10. ==> +-------------------------------------------------------------------------------------------------------+
11. ==> | Node[1973]{timestamp:1249084800,name:"2009/2010"} | Node[1972]{timestamp:1280620800,name:"2010/2011"} |
12. ==> | Node[1972]{timestamp:1280620800,name:"2010/2011"} | Node[1971]{timestamp:1312156800,name:"2011/2012"} |
13. ==> | Node[1971]{timestamp:1312156800,name:"2011/2012"} | Node[1970]{timestamp:1343779200,name:"2012/2013"} |
14. ==> | Node[1970]{timestamp:1343779200,name:"2012/2013"} | Node[1969]{timestamp:1375315200,name:"2013/2014"} |
15. ==> +-------------------------------------------------------------------------------------------------------+
16. ==> 4rows

Now we’ve got all the adjacent seasons lined up we complete the query with a call to MERGE:

01. MATCH (s:Season)
02. WITH s
03. ORDER BY s.timestamp
04. WITH COLLECT(s) AS seasons
05. UNWIND RANGE(0,LENGTH(seasons) - 2) as idx
06. WITH seasons[idx] AS s1, seasons[idx+1] AS s2
07. MERGE (s1)-[:NEXT]->(s2)
08.   
09. ==> +-------------------+
10. ==> | No data returned. |
11. ==> +-------------------+
12. ==> Relationships created: 4

And we’re done. Hopefully I can remember this approach more than I did the initial one!

Running out of memory? Never run out of memory with Redis Enterprise databaseStart your free trial today.

Topics:
nosql ,neo4j 2.2 ,unwind ,database

Published at DZone with permission of Mark Needham, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}