Neo4j & Cypher: Flatten a Collection
Join the DZone community and get the full member experience.
Join For FreeEvery now and then in Cypher land we’ll end up with a collection of arrays, often created via the COLLECT function, that we want to squash down into one array.
For example let’s say we have the following array of arrays…
$ RETURN [[1,2,3], [4,5,6], [7,8,9]] AS result; ==> +---------------------------+ ==> | result | ==> +---------------------------+ ==> | [[1,2,3],[4,5,6],[7,8,9]] | ==> +---------------------------+ ==> 1 row
…and we want to return the array [1,2,3,4,5,6,7,8,9].
Many programming languages have a ‘flatten’ function and although cypher doesn’t we can make our own by using the REDUCE function:
$ WITH [[1,2,3], [4,5,6], [7,8,9]] AS result RETURN REDUCE(output = [], r IN result | output + r) AS flat; ==> +---------------------+ ==> | flat | ==> +---------------------+ ==> | [1,2,3,4,5,6,7,8,9] | ==> +---------------------+ ==> 1 row
Here we’re passing the array ‘output’ over the collection and adding the individual arrays ([1,2,3], [4,5,6] and [7,8,9]) to that array as we iterate over the collection.
If we’re working with numbers in Neo4j 2.0.1 we’ll get this type exception with this version of the code:
==> SyntaxException: Type mismatch: expected Any, Collection<Any> or Collection<Collection<Any>> but was Integer (line 1, column 148)
We can easily work around that by coercing the type of ‘output’ like so:
WITH [[1,2,3], [4,5,6], [7,8,9]] AS result RETURN REDUCE(output = range(0,-1), r IN result | output + r);
Of course this is quite a simple example but we can handle more complicated scenarios as well by using nested calls to REDUCE. For example let’s say we wanted to completely flatten this array:
$ RETURN [[1,2,3], [4], [5, [6, 7]], [8,9]] AS result; ==> +-------------------------------+ ==> | result | ==> +-------------------------------+ ==> | [[1,2,3],[4],[5,[6,7]],[8,9]] | ==> +-------------------------------+ ==> 1 row
We could write the following cypher code:
$ WITH [[1,2,3], [4], [5, [6, 7]], [8,9]] AS result RETURN REDUCE(output = [], r IN result | output + REDUCE(innerOutput = [], innerR in r | innerOutput + innerR)) AS flat; ==> +---------------------+ ==> | flat | ==> +---------------------+ ==> | [1,2,3,4,5,6,7,8,9] | ==> +---------------------+ ==> 1 row
Here we have an outer REDUCE function which iterates over [1,2,3], [4], [5, [6,7]] and [8,9] and then an inner REDUCE function which iterates over those individual arrays.
If we had more nesting then we could just introduce another level of nesting!
Published at DZone with permission of Mark Needham, DZone MVB. See the original article here.
Opinions expressed by DZone contributors are their own.
Comments