Neo4j/Cypher: Combining COUNT and COLLECT in One Query

by Mark Needham · Feb. 27, 13 · Interview

Curator's Note: Check raw code snippets to see properly formatted tables.

While continuing to play around with football data, I wanted to write a Cypher query against Neo4j that would show me which teams had missed the most penalties this season and who had missed them.

I started off with a query that returned all the penalties missed this season and the games in which those misses happened:

START player = node:players('name:*')
MATCH player-[:missed_penalty_in]-game, 
      player-[:played|subbed_on]-stats-[:in]-game,
      stats-[:for]-team,
      game-[:home_team]-home,
      game-[:away_team]-away
RETURN player.name, team.name, home.name, away.name

+-------------------------------------------------------------------------------------------------+
| player.name          | team.name              | home.name              | away.name              |
+-------------------------------------------------------------------------------------------------+
| "Papiss Demba Cisse" | "Newcastle United"     | "Newcastle United"     | "Norwich City"         |
| "Wayne Rooney"       | "Manchester United"    | "Manchester United"    | "Arsenal"              |
| "Mikel Arteta"       | "Arsenal"              | "Arsenal"              | "Fulham"               |
| "David Silva"        | "Manchester City"      | "Manchester City"      | "Southampton"          |
| "Frank Lampard"      | "Chelsea"              | "Manchester City"      | "Chelsea"              |
| "Adel Taarabt"       | "Queens Park Rangers"  | "Queens Park Rangers"  | "Norwich City"         |
| "Javier Hernández"   | "Manchester United"    | "Manchester United"    | "Wigan Athletic"       |
| "Robin Van Persie"   | "Manchester United"    | "Southampton"          | "Manchester United"    |
| "Jonathan Walters"   | "Stoke City"           | "Fulham"               | "Stoke City"           |
| "Shane Long"         | "West Bromwich Albion" | "West Bromwich Albion" | "Liverpool"            |
| "Steven Gerrard"     | "Liverpool"            | "Liverpool"            | "West Bromwich Albion" |
| "Lucas Piazon"       | "Chelsea"              | "Chelsea"              | "Aston Villa"          |
+-------------------------------------------------------------------------------------------------+
12 rows

(There should actually be another penalty miss for Jonathan Walters against Chelsea, but for some reason the data source has left it out.)

I then grouped the penalty misses by team so that I’d have one row for each team and a collection showing the people who’d missed.

We can use the COLLECT function to do the latter:

START player = node:players('name:*')
MATCH player-[:missed_penalty_in]-game, 
      player-[:played|subbed_on]-stats-[:in]-game,
      stats-[:for]-team
RETURN DISTINCT team.name, COLLECT(player.name) AS players
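To make the grouping behaviour concrete outside of Neo4j, here is a minimal Python sketch (not part of the original post) that mimics what COLLECT does for each team. It uses a handful of rows from the result table above; the names `rows`, `collect_by_team`, and `players_by_team` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# (player, team) pairs copied from a handful of rows in the result table above.
rows = [
    ("Papiss Demba Cisse", "Newcastle United"),
    ("Wayne Rooney", "Manchester United"),
    ("Mikel Arteta", "Arsenal"),
    ("Javier Hernández", "Manchester United"),
]


def collect_by_team(pairs):
    """Group player names by team, roughly what COLLECT(player.name) does per team.name."""
    grouped = defaultdict(list)
    for player, team in pairs:
        grouped[team].append(player)
    return dict(grouped)


players_by_team = collect_by_team(rows)
print(players_by_team["Manchester United"])
```

Each team ends up with a single entry whose value is the list of players who missed, which is the shape the Cypher query returns as one row per team.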

I wanted to order the teams by the number of penalties they’d missed, so Manchester United would be first in the table in this case, and initially tried to order the results by a count of players:

START player = node:players('name:*')
MATCH player-[:missed_penalty_in]-game, 
      player-[:played|subbed_on]-stats-[:in]-game,
      stats-[:for]-team
RETURN DISTINCT team.name, COLLECT(player.name) AS players
ORDER BY COUNT(player.name)

which doesn’t actually compile:

SyntaxException: Aggregation expressions must be listed in the RETURN clause to be used in ORDER BY

I tried a few other variations such as the following:

START player = node:players('name:*')
MATCH player-[:missed_penalty_in]-game, 
      player-[:played|subbed_on]-stats-[:in]-game,
      stats-[:for]-team
RETURN DISTINCT team.name, COUNT(player.name) AS numberOfPlayers, 
       COLLECT(player.name) AS players
ORDER BY numberOfPlayers DESC

which again doesn’t compile:

SyntaxException: Aggregation expressions must be listed in the RETURN clause to be used in ORDER BY

I eventually found a post by Andres where he explains that you need to split the query into two parts and use WITH if you want to use two aggregation expressions.

I ended up with the following query which does the job:

START player = node:players('name:*')
MATCH player-[:missed_penalty_in]-game, 
      player-[:played|subbed_on]-stats-[:in]-game,
      stats-[:for]-team
WITH DISTINCT team, COLLECT(player.name) AS players
 
MATCH player-[:missed_penalty_in]-game, 
      player-[:played|subbed_on]-stats-[:in]-game,
      stats-[:for]-team
WITH DISTINCT team, COUNT(player) AS numberOfPlayers, players
 
RETURN team.name, players
ORDER BY numberOfPlayers DESC

+---------------------------------------------------------------------------------+
| team.name              | players                                                |
+---------------------------------------------------------------------------------+
| "Manchester United"    | ["Wayne Rooney","Javier Hernández","Robin Van Persie"] |
| "Chelsea"              | ["Frank Lampard","Lucas Piazon"]                       |
| "Liverpool"            | ["Steven Gerrard"]                                     |
| "Manchester City"      | ["David Silva"]                                        |
| "Newcastle United"     | ["Papiss Demba Cisse"]                                 |
| "Queens Park Rangers"  | ["Adel Taarabt"]                                       |
| "Stoke City"           | ["Jonathan Walters"]                                   |
| "Arsenal"              | ["Mikel Arteta"]                                       |
| "West Bromwich Albion" | ["Shane Long"]                                         |
+---------------------------------------------------------------------------------+
9 rows
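The two-stage shape of that query (aggregate first, then order by the aggregated value once it is an ordinary column) can be sketched in plain Python. This is only an illustration of the logic under my own assumptions, not how Neo4j evaluates the query; the names `missed` and `teams_by_misses` are hypothetical, and the data is copied from the 12-row table earlier in the post.

```python
from collections import defaultdict

# (player, team) pairs taken from the 12-row result table earlier in the post.
missed = [
    ("Papiss Demba Cisse", "Newcastle United"),
    ("Wayne Rooney", "Manchester United"),
    ("Mikel Arteta", "Arsenal"),
    ("David Silva", "Manchester City"),
    ("Frank Lampard", "Chelsea"),
    ("Adel Taarabt", "Queens Park Rangers"),
    ("Javier Hernández", "Manchester United"),
    ("Robin Van Persie", "Manchester United"),
    ("Jonathan Walters", "Stoke City"),
    ("Shane Long", "West Bromwich Albion"),
    ("Steven Gerrard", "Liverpool"),
    ("Lucas Piazon", "Chelsea"),
]


def teams_by_misses(pairs):
    # Stage 1 (the part before WITH): aggregate the players for each team.
    players = defaultdict(list)
    for player, team in pairs:
        players[team].append(player)

    # Stage 2 (the part after WITH): the count is now a plain value per row,
    # so ordering by it no longer mixes aggregation with sorting.
    staged = [(team, len(names), names) for team, names in players.items()]
    staged.sort(key=lambda row: row[1], reverse=True)
    return staged


ranking = teams_by_misses(missed)
for team, number_of_players, names in ranking:
    print(team, number_of_players, names)
```

Manchester United comes out first with three misses and Chelsea second with two. The order among the teams with a single miss depends on the sort's tie handling, so it may differ from the table above.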


Published at DZone with permission of Mark Needham, DZone MVB. See the original article here.
