Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

Neo4j & Cypher: Finding Movies by Decade

DZone's Guide to

Neo4j & Cypher: Finding Movies by Decade

· Java Zone
Free Resource

Microservices! They are everywhere, or at least, the term is. When should you use a microservice architecture? What factors should be considered when making that decision? Do the benefits outweigh the costs? Why is everyone so excited about them, anyway?  Brought to you in partnership with IBM.

I was recently asked how to find the number of movies produced per decade in the movie data set that comes with the Neo4j browser and can be imported with the following command:

:play movies

We want to get one row per decade and have a count alongside so the easiest way is to start with one decade and build from there.

MATCH (movie:Movie)
WHERE movie.released >= 1990 and movie.released <= 1999
RETURN 1990 + "-" + 1999 as years, count(movie) AS movies
ORDER BY years

Note that we’re doing a label scan of all nodes of type Movie as there are no indexes for range queries. In this case it’s fine as we have few movies but If we had 100s of thousands of movies then we’d want to optimise the WHERE clause to make use of an IN which would then use any indexes.

If we run the query we get the following result:

==> +----------------------+
==> | years       | movies |
==> +----------------------+
==> | "1990-1999" | 21     |
==> +----------------------+
==> 1 row

Let’s pull out the start and end years so they’re explicitly named:

WITH 1990 AS startDecade, 1999 AS endDecade
MATCH (movie:Movie)
WHERE movie.released >= startDecade and movie.released <= endDecade
RETURN startDecade + "-" + endDecade as years, count(movie)
ORDER BY years

Now we need to create a collection of start and end years so we can return more than one. We can use the UNWIND function to take a collection of decades and run them through the rest of the query:

UNWIND [{start: 1970, end: 1979}, {start: 1980, end: 1989}, {start: 1980, end: 1989}, {start: 1990, end: 1999}, {start: 2000, end: 2009}, {start: 2010, end: 2019}] AS row
WITH row.start AS startDecade, row.end AS endDecade
MATCH (movie:Movie)
WHERE movie.released >= startDecade and movie.released <= endDecade
RETURN startDecade + "-" + endDecade as years, count(movie)
ORDER BY years
==> +----------------------------+
==> | years       | count(movie) |
==> +----------------------------+
==> | "1970-1979" | 2            |
==> | "1980-1989" | 2            |
==> | "1990-1999" | 21           |
==> | "2000-2009" | 13           |
==> | "2010-2019" | 1            |
==> +----------------------------+
==> 5 rows

Alistair pointed out that we can simplify this even further by using the RANGE function:

UNWIND range(1970,2010,10) as startDecade
WITH startDecade, startDecade + 9 as endDecade
MATCH (movie:Movie)
WHERE movie.released >= startDecade and movie.released <= endDecade
RETURN startDecade + "-" + endDecade as years, count(movie)
ORDER BY years

And here’s a graph gist for you to play with.

Discover how the Watson team is further developing SDKs in Java, Node.js, Python, iOS, and Android to access these services and make programming easy. Brought to you in partnership with IBM.

Topics:

Published at DZone with permission of Mark Needham, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

THE DZONE NEWSLETTER

Dev Resources & Solutions Straight to Your Inbox

Thanks for subscribing!

Awesome! Check your inbox to verify your email so you can start receiving the latest in tech news and resources.

X

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}