Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

Neo4j / Cypher: Converting Queries from 1.9 to 2.0

DZone's Guide to

Neo4j / Cypher: Converting Queries from 1.9 to 2.0

· Java Zone ·
Free Resource

Verify, standardize, and correct the Big 4 + more– name, email, phone and global addresses – try our Data Quality APIs now at Melissa Developer Portal!

I’ve been playing around with the most recent Neo4j 2.0 milestone release – 2.0.0-M05 – and one of the first things I did was translate the queries from my football data set which were written against Neo4j 1.9.

The following query calculates the number of goals scored by players in matches that were shown on television, not on television and in total.

START player=node:players('name:*')
MATCH player-[:played|subbed_on]->stats-[:in]->game-[t?:on_tv]->channel
WITH COLLECT([stats.goals, TYPE(t)]) AS games, player
RETURN player.name,
       REDUCE(goals = 0, g' IN FILTER(g IN games WHERE HEAD(TAIL(g)) IS NULL)| goals + HEAD(g')) AS nonTvGoals,
       REDUCE(goals = 0, g' IN FILTER(g IN games WHERE HEAD(TAIL(g)) <> NULL)| goals + HEAD(g')) AS tvGoals,
       REDUCE(goals = 0, g' in games | goals + HEAD(g')) AS totalGoals
ORDER BY tvGoals DESC
LIMIT 10

We use a legacy index to get all of the players, find the games they participated in, check if those games were on television and then group by player to work out whether the goals were scored in televised games.

When we evaluate that query we get the following result:

==> +--------------------------------------------------------+
==> | player.name        | nonTvGoals | tvGoals | totalGoals |
==> +--------------------------------------------------------+
==> | "Robin Van Persie" | 11         | 15      | 26         |
==> | "Gareth Bale"      | 8          | 13      | 21         |
==> | "Luis Suárez"      | 12         | 11      | 23         |
==> | "Theo Walcott"     | 5          | 9       | 14         |
==> | "Demba Ba"         | 7          | 8       | 15         |
==> | "Edin Dzeko"       | 7          | 7       | 14         |
==> | "Santi Cazorla"    | 5          | 7       | 12         |
==> | "Juan Mata"        | 6          | 6       | 12         |
==> | "Steven Gerrard"   | 3          | 6       | 9          |
==> | "Carlos Tevez"     | 5          | 6       | 11         |
==> +--------------------------------------------------------+
==> 10 rows

The first step was to get rid of the legacy index and replace it with a label based one. I created a label ‘Player’ for this purpose:

START player = node:players('name:*') SET player :Player RETURN player
CREATE INDEX on :Player(name)

Then I got rid of the legacy index lookup and replaced it with a label based one:

MATCH (player:Player)-[:played|subbed_on]->stats-[:in]->game-[t?:on_tv]->channel
WITH COLLECT([stats.goals, TYPE(t)]) AS games, player
RETURN player.name,
       REDUCE(goals = 0, g' IN FILTER(g IN games WHERE HEAD(TAIL(g)) IS NULL)| goals + HEAD(g')) AS nonTvGoals,
       REDUCE(goals = 0, g' IN FILTER(g IN games WHERE HEAD(TAIL(g)) <> NULL)| goals + HEAD(g')) AS tvGoals,
       REDUCE(goals = 0, g' in games | goals + HEAD(g')) AS totalGoals
ORDER BY tvGoals DESC
LIMIT 10

Unfortunately you can’t use optional relationships like this:

PatternException: Can't use optional patterns without explicit START clause. Optional relationships: `t`

A neat workaround which Andres showed me is to first match all of the players and then make them available using a WITH statement:

MATCH (player:Player)
WITH player
MATCH player-[:played|subbed_on]->stats-[:in]->game-[t?:on_tv]->channel
WITH COLLECT([stats.goals, TYPE(t)]) AS games, player
RETURN player.name,
       REDUCE(goals = 0, g' IN FILTER(g IN games WHERE HEAD(TAIL(g)) IS NULL)| goals + HEAD(g')) AS nonTvGoals,
       REDUCE(goals = 0, g' IN FILTER(g IN games WHERE HEAD(TAIL(g)) <> NULL)| goals + HEAD(g')) AS tvGoals,
       REDUCE(goals = 0, g' in games | goals + HEAD(g')) AS totalGoals
ORDER BY tvGoals DESC
LIMIT 10

If we evaluate this query it returns us the same results as before. However, given that we now have better collection support it seemed a shame to still use HEAD and TAIL so I replaced those with the following:

MATCH (player:Player)
WITH player
MATCH player-[:played|subbed_on]->stats-[:in]->game-[t?:on_tv]->channel
WITH COLLECT([stats.goals, TYPE(t)]) AS games, player
RETURN player.name,
       REDUCE(goals = 0, g' IN FILTER(g IN games WHERE g[1] IS NULL)| goals + g'[0]) AS nonTvGoals,
       REDUCE(goals = 0, g' IN FILTER(g IN games WHERE g[1] <> NULL)| goals + g'[0]) AS tvGoals,
       REDUCE(goals = 0, g' in games | goals + g'[0]) AS totalGoals
ORDER BY tvGoals DESC
LIMIT 10

I didn’t like those array indexes either and since we can now create maps I thought the query would be clearer if we created one on line 4 of the query rather than an array:

MATCH (player:Player)
WITH player
MATCH player-[:played|subbed_on]->stats-[:in]->game-[t?:on_tv]->channel
WITH COLLECT({goals: stats.goals, ontv: TYPE(t)}) AS games, player
RETURN player.name,
       REDUCE(goals = 0, g' IN FILTER(g IN games WHERE g.ontv IS NULL)| goals + g'.goals) AS nonTvGoals,
       REDUCE(goals = 0, g' IN FILTER(g IN games WHERE g.ontv <> NULL)| goals + g'.goals) AS tvGoals,
       REDUCE(goals = 0, g' in games | goals + g'.goals) AS totalGoals
ORDER BY tvGoals DESC
LIMIT 10

The next tweak would be to remove the FILTER and replace that with a list comprehension:

MATCH (player:Player)
WITH player
MATCH player-[:played|subbed_on]->stats-[:in]->game-[t?:on_tv]->channel
WITH COLLECT({goals: stats.goals, ontv: TYPE(t)}) AS games, player
RETURN player.name,
       REDUCE(goals = 0, g' IN [g IN games WHERE g.ontv IS NULL] | goals + g'.goals) AS nonTvGoals,
       REDUCE(goals = 0, g' IN [g IN games WHERE g.ontv <> NULL] | goals + g'.goals) AS tvGoals,
       REDUCE(goals = 0, g' in games | goals + g'.goals) AS totalGoals
ORDER BY tvGoals DESC
LIMIT 10

Although the query ends up being a couple of lines longer than its 1.9 cousin I think there’s less noise and the intent is clearer.


Developers! Quickly and easily gain access to the tools and information you need! Explore, test and combine our data quality APIs at Melissa Developer Portal – home to tools that save time and boost revenue. Our APIs verify, standardize, and correct the Big 4 + more – name, email, phone and global addresses – to ensure accurate delivery, prevent blacklisting and identify risks in real-time.

Topics:

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}