Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

Scheduling Meetings With Neo4j

DZone's Guide to

Scheduling Meetings With Neo4j

Scheduling can be a pain even on the best of days. Let's see if Neo4j can come to the rescue and help make this task an easier one.

· Database Zone ·
Free Resource

Running out of memory? Learn how Redis Enterprise enables large dataset analysis with the highest throughput and lowest latency while reducing costs over 75%! 

One of the symptoms of any fast-growing company is the lack of available meeting rooms. The average office worker gets immense satisfaction to their otherwise mundane workday when they get to kick someone else out of the meeting room they booked. Of course, that joy can be cut short (along with their career) once realizing some unnoticed VIP was unceremoniously kicked out. It's not a super exciting use case, but today, I'm going to show you how to use Neo4j to perform some scheduling gymnastics.

Let's start with what the data model looks like:

So, we have a Person that sits in a Cubicle that is located in a Floor that has meeting Rooms where Meetings are booked and these Meetings are attended by People. That's a nice circle model right there. Let's build an example with three people each in their own cubicle, two floors, four meeting rooms, two in each floor, and a bunch of meetings. We'll also have one of the people booked in one of the existing meetings. We will use Longs for the times representing Unix Epoc Time in milliseconds. In Neo4j 3.4, we will have legitimate date and datetime data types, so you will be able to create date times like localdatetime({year:1984, month:10, day:11, hour:12, minute:31, second:14}) instead of this hot mess, but regardless, here is the Cypher for this example:

CREATE (person1:Person {name: "Max"})
CREATE (person2:Person {name: "Alex"})
CREATE (person3:Person {name: "Andrew"})
CREATE (cube1A:Cubicle {name: "F1A"})
CREATE (cube1B:Cubicle {name: "F1B"})
CREATE (cube2A:Cubicle {name: "F2A"})
CREATE (floor1:Floor {name: "Floor 1"})
CREATE (floor2:Floor {name: "Floor 2"})
CREATE (person1)-[:SITS_IN]->(cube1A)
CREATE (person2)-[:SITS_IN]->(cube1B)
CREATE (person3)-[:SITS_IN]->(cube1C)
CREATE (cube1A)-[:LOCATED_IN]->(floor1)
CREATE (cube1B)-[:LOCATED_IN]->(floor1)
CREATE (cube1C)-[:LOCATED_IN]->(floor2)
CREATE (room1:Room {name:"Room 1"})
CREATE (room2:Room {name:"Room 2"})
CREATE (room3:Room {name:"Room 3"})
CREATE (room4:Room {name:"Room 4"})
CREATE (room1)-[:LOCATED_IN]->(floor1)
CREATE (room2)-[:LOCATED_IN]->(floor1)
CREATE (room3)-[:LOCATED_IN]->(floor2)
CREATE (room4)-[:LOCATED_IN]->(floor2)
CREATE (m1:Meeting {start_time: 1521534600000, end_time:1521538200000}) // 8:30-9:30am
CREATE (m2:Meeting {start_time: 1521543600000, end_time:1521550800000}) // 11-1pm
CREATE (m3:Meeting {start_time: 1521550800000, end_time:1521558000000}) // 1-3pm
CREATE (m4:Meeting {start_time: 1521534600000, end_time:1521543600000}) // 8:30-11am
CREATE (m5:Meeting {start_time: 1521550800000, end_time:1521554400000}) // 1-2pm
CREATE (m6:Meeting {start_time: 1521561600000, end_time:1521565200000}) // 4-5pm
CREATE (m7:Meeting {start_time: 1521558000000, end_time:1521561600000}) // 3-4pm
CREATE (room1)-[:IS_BOOKED_ON_2018_03_20]->(m1)
CREATE (room1)-[:IS_BOOKED_ON_2018_03_20]->(m2)
CREATE (room1)-[:IS_BOOKED_ON_2018_03_20]->(m3)
CREATE (room2)-[:IS_BOOKED_ON_2018_03_20]->(m4)
CREATE (room2)-[:IS_BOOKED_ON_2018_03_20]->(m5)
CREATE (room2)-[:IS_BOOKED_ON_2018_03_20]->(m6)
CREATE (room4)-[:IS_BOOKED_ON_2018_03_20]->(m7)
CREATE (person2)-[:HAS_MEETING_ON_2018_03_20]->(m7)

This Cypher script creates this lovely set of data:

By looking at it, we can see how it is all connected, but it is not immediately obvious what times in what rooms people are able to meet. That is going to be our question. Give a set of meeting attendees and a datetime range and find me the available meeting times in the rooms that are in the same floor as at least one of the attendees. Let's try building this query together once piece at a time. So, the first thing I want to do is find out what time ranges I have to eliminate because one of the attendees is already booked for another meeting.

MATCH (p:Person)
WHERE p.name IN ["Max", "Alex", "Andrew"]
OPTIONAL MATCH (p)-[:HAS_MEETING_ON_2018_03_20]->(m:Meeting)
RETURN p, m
╒═════════════════╤═════════════════════════════════════════════════════╕
│"p"              │"m"                                                  │
╞═════════════════╪═════════════════════════════════════════════════════╡
│{"name":"Max"}   │null                                                 │
├─────────────────┼─────────────────────────────────────────────────────┤
│{"name":"Alex"}  │{"end_time":1521561600000,"start_time":1521558000000}│
├─────────────────┼─────────────────────────────────────────────────────┤
│{"name":"Andrew"}│null                                                 │
└─────────────────┴─────────────────────────────────────────────────────┘

So, it looks like Alex is busy from 3-4 PM. Next, we need to figure out where everyone sits, what floor they are in, and what rooms we are able to meet in. So, our query looks like this:

MATCH (p:Person)-[:SITS_IN]->(c:Cubicle)-[:LOCATED_IN]->(f:Floor)<-[:LOCATED_IN]-(r:Room)
WHERE p.name IN ["Max", "Alex", "Andrew"]
RETURN DISTINCT r
ORDER BY r.name

This gets us Rooms 1-4 as expected:

╒═════════════════╕
│"r"              │
╞═════════════════╡
│{"name":"Room 1"}│
├─────────────────┤
│{"name":"Room 2"}│
├─────────────────┤
│{"name":"Room 3"}│
├─────────────────┤
│{"name":"Room 4"}│
└─────────────────┘

OK, so far, so good. Now, we need to know if those rooms have already been booked for other meetings today.

MATCH (p:Person)-[:SITS_IN]->(c:Cubicle)-[:LOCATED_IN]->(f:Floor)<-[:LOCATED_IN]-(r:Room)
WHERE p.name IN ["Max", "Alex", "Andrew"]
WITH r
OPTIONAL MATCH (r)-[:IS_BOOKED_ON_2018_03_20]->(m:Meeting)
WITH r, m ORDER BY m.start_time
RETURN r.name, COLLECT(DISTINCT m) AS meetings
ORDER BY r.name

This query tells us that Room 1 has three meetings scheduled; Room 2 has three, as well; Room 3 is wide open; Room 4 just has one. But it is really hard to see the actual times since they are shown as Longs.

╒════════╤══════════════════════════════════════════════════════════════════════╕
│"r.name"│"meetings"                                                            │
╞════════╪══════════════════════════════════════════════════════════════════════╡
│"Room 1"│[{"end_time":1521538200000,"start_time":1521534600000},{"end_time":152│
│        │1550800000,"start_time":1521543600000},{"end_time":1521558000000,"star│
│        │t_time":1521550800000}]                                               │
├────────┼──────────────────────────────────────────────────────────────────────┤
│"Room 2"│[{"end_time":1521543600000,"start_time":1521534600000},{"end_time":152│
│        │1554400000,"start_time":1521550800000},{"end_time":1521565200000,"star│
│        │t_time":1521561600000}]                                               │
├────────┼──────────────────────────────────────────────────────────────────────┤
│"Room 3"│[]                                                                    │
├────────┼──────────────────────────────────────────────────────────────────────┤
│"Room 4"│[{"end_time":1521561600000,"start_time":1521558000000}]               │
└────────┴──────────────────────────────────────────────────────────────────────┘

If you are using the Neo4j APOC plugin, we can use the apoc.date.format function to make them friendlier. In Neo4j 3.4, you will be able to use a built-in function datetime.FromEpochMillis for the same thing.

MATCH (p:Person)-[:SITS_IN]->(c:Cubicle)-[:LOCATED_IN]->(f:Floor)<-[:LOCATED_IN]-(r:Room)
WHERE p.name IN ["Max", "Alex", "Andrew"]
WITH r
OPTIONAL MATCH (r)-[:IS_BOOKED_ON_2018_03_20]->(m:Meeting)
WITH r, m ORDER BY m.start_time
RETURN r.name, EXTRACT (x IN COLLECT(DISTINCT m) | apoc.date.format(x.start_time,'ms','HH:mm') 
                                            + ' to ' + apoc.date.format(x.end_time,'ms','HH:mm')) AS meetings
ORDER BY r.name

Here we go; now, that is way more readable:

╒════════╤════════════════════════════════════════════════════╕
│"r.name"│"meetings"                                          │
╞════════╪════════════════════════════════════════════════════╡
│"Room 1"│["08:30 to 09:30","11:00 to 13:00","13:00 to 15:00"]│
├────────┼────────────────────────────────────────────────────┤
│"Room 2"│["08:30 to 11:00","13:00 to 14:00","16:00 to 17:00"]│
├────────┼────────────────────────────────────────────────────┤
│"Room 3"│[]                                                  │
├────────┼────────────────────────────────────────────────────┤
│"Room 4"│["15:00 to 16:00"]                                  │
└────────┴────────────────────────────────────────────────────┘

Alright, let's combine the two queries together and see what rooms we can meet in and what times we can't meet in those rooms because they are either already booked, or one of our attendees is busy:

MATCH (p:Person)
WHERE p.name IN ["Max", "Alex", "Andrew"]
OPTIONAL MATCH (p)-[:HAS_MEETING_ON_2018_03_20]->(m:Meeting)
WITH COLLECT(m) AS occupied
MATCH (p:Person)-[:SITS_IN]->(c:Cubicle)-[:LOCATED_IN]->(f:Floor)<-[:LOCATED_IN]-(r:Room)
WHERE p.name IN ["Max", "Alex", "Andrew"]
WITH r, occupied
OPTIONAL MATCH (r)-[:IS_BOOKED_ON_2018_03_20]->(m:Meeting)
WITH r, COLLECT(DISTINCT m) + occupied AS meetings
UNWIND meetings AS m
WITH r, m ORDER BY m.start_time
RETURN r.name, EXTRACT (x IN COLLECT(m) | apoc.date.format(x.start_time,'ms','HH:mm') 
                               + ' to ' + apoc.date.format(x.end_time,'ms','HH:mm')) AS meetings
ORDER BY r.name
╒════════╤═════════════════════════════════════════════════════════════════════╕
│"r.name"│"meetings"                                                           │
╞════════╪═════════════════════════════════════════════════════════════════════╡
│"Room 1"│["08:30 to 09:30","11:00 to 13:00","13:00 to 15:00","15:00 to 16:00"]│
├────────┼─────────────────────────────────────────────────────────────────────┤
│"Room 2"│["08:30 to 11:00","13:00 to 14:00","15:00 to 16:00","16:00 to 17:00"]│
├────────┼─────────────────────────────────────────────────────────────────────┤
│"Room 3"│["15:00 to 16:00"]                                                   │
├────────┼─────────────────────────────────────────────────────────────────────┤
│"Room 4"│["15:00 to 16:00","15:00 to 16:00"]                                  │
└────────┴─────────────────────────────────────────────────────────────────────┘

Now, we could stop here and let our application mark those times as unavailable and call it a day. But what we really want is the opposite of that. We want the times that the rooms and attendees are available. So, how do we figure that out? Well, for each meeting, we want to find the next meeting start time for each room. The time slot between meetings is what we are after, defined by the entry's end time and the start time of the next event. To perform this, we are going to use a double-unwind, which is basically "for each thing in the list, I want to pair it (get a cross product) with every other thing in the list." Typically, this is the last thing you want to do since making a cross product can be very expensive, but it makes perfect sense for this query. We only care about the times where one meeting start time is greater than or equal to the other end time, and from these, we will grab our time slot as the query below shows:

MATCH (p:Person)
WHERE p.name IN ["Max", "Alex", "Andrew"]
OPTIONAL MATCH (p)-[:HAS_MEETING_ON_2018_03_20]->(m:Meeting)
WITH COLLECT(m) AS occupied
MATCH (p:Person)-[:SITS_IN]->(c:Cubicle)-[:LOCATED_IN]->(f:Floor)<-[:LOCATED_IN]-(r:Room)
WHERE p.name IN ["Max", "Alex", "Andrew"]
WITH r, occupied
OPTIONAL MATCH (r)-[:IS_BOOKED_ON_2018_03_20]->(m:Meeting)
WITH r, [{start_time:1521565200000, end_time:1521534600000}] + COLLECT(m) + occupied AS meetings
UNWIND meetings AS m
WITH r, [min(m.start_time), max(m.end_time)] AS rslot, COLLECT(m) AS mm
WITH r, rslot, mm
UNWIND mm AS m1
UNWIND mm AS m2
WITH r, rslot, m1, m2 WHERE (m2.start_time >= m1.end_time)
WITH r, rslot, [m1.end_time, min(m2.start_time)] AS slot
ORDER BY slot[0]
RETURN r.name, EXTRACT (x IN COLLECT(slot) | apoc.date.format(x[0],'ms','HH:mm') 
                               + ' to ' + apoc.date.format(x[1],'ms','HH:mm')) AS available
ORDER BY r.name

Our output looks close, but it's not quite there. Rooms 3 and 4 look correct, but for Room 1 and 2, we have start time and end times that are the same:

╒════════╤══════════════════════════════════════════════════════════════════════╕
│"r.name"│"available"                                                           │
╞════════╪══════════════════════════════════════════════════════════════════════╡
│"Room 1"│["08:30 to 08:30","09:30 to 11:00","13:00 to 13:00","15:00 to 15:00","│
│        │16:00 to 17:00"]                                                      │
├────────┼──────────────────────────────────────────────────────────────────────┤
│"Room 2"│["08:30 to 08:30","11:00 to 13:00","14:00 to 15:00","16:00 to 16:00","│
│        │17:00 to 17:00"]                                                      │
├────────┼──────────────────────────────────────────────────────────────────────┤
│"Room 3"│["08:30 to 15:00","16:00 to 17:00"]                                   │
├────────┼──────────────────────────────────────────────────────────────────────┤
│"Room 4"│["08:30 to 15:00","16:00 to 17:00"]                                   │
└────────┴──────────────────────────────────────────────────────────────────────┘

So, let's fix that by not allowing any slots that start and end at the same time:

MATCH (p:Person)
WHERE p.name IN ["Max", "Alex", "Andrew"]
OPTIONAL MATCH (p)-[:HAS_MEETING_ON_2018_03_20]->(m:Meeting)
WITH COLLECT(m) AS occupied
MATCH (p:Person)-[:SITS_IN]->(c:Cubicle)-[:LOCATED_IN]->(f:Floor)<-[:LOCATED_IN]-(r:Room)
WHERE p.name IN ["Max", "Alex", "Andrew"]
WITH r, occupied
OPTIONAL MATCH (r)-[:IS_BOOKED_ON_2018_03_20]->(m:Meeting)
WITH r, [{start_time:1521565200000, end_time:1521534600000}] + COLLECT(m) + occupied AS meetings
UNWIND meetings AS m
WITH r, [min(m.start_time), max(m.end_time)] AS rslot, COLLECT(m) AS mm
WITH r, rslot, mm
UNWIND mm AS m1
UNWIND mm AS m2
WITH r, rslot, m1, m2 WHERE (m2.start_time >= m1.end_time)
WITH r, rslot, [m1.end_time, min(m2.start_time)] AS slot
ORDER BY slot[0]
WITH r, [[1521534600000, rslot[0]]] + collect(slot) + [[rslot[1], 1521565200000]] AS open
WITH r, filter(x IN open WHERE x[0]<>x[1]) AS available
UNWIND available AS dups
WITH r, COLLECT(DISTINCT dups) AS tslots
RETURN r.name AS Room , EXTRACT (x IN tslots | apoc.date.format(x[0],'ms','HH:mm') 
                                    + ' to ' + apoc.date.format(x[1],'ms','HH:mm')) AS Available
ORDER BY r.name

...and there we go:

╒════════╤═══════════════════════════════════╕
│"Room"  │"Available"                        │
╞════════╪═══════════════════════════════════╡
│"Room 1"│["09:30 to 11:00","16:00 to 17:00"]│
├────────┼───────────────────────────────────┤
│"Room 2"│["11:00 to 13:00","14:00 to 15:00"]│
├────────┼───────────────────────────────────┤
│"Room 3"│["08:30 to 15:00","16:00 to 17:00"]│
├────────┼───────────────────────────────────┤
│"Room 4"│["08:30 to 15:00","16:00 to 17:00"]│
└────────┴───────────────────────────────────┘

Pretty neat, right? To be totally honest, I didn't come up with this query by myself. I had a ton of help from Alex Price and Andrew Bowman.

I asked Michael Hunger, and he had another idea: ordering the meeting times and using lists and ranges instead of a double unwind to get the same answer. Here, he is also using apoc.date.parse('2018-03-20 08:30:00') instead of 1521534600000 to make the query more readable. Yes, these dates will be much nicer to work with in Neo4j 3.4... I can't wait, either.

MATCH (p:Person)
WHERE p.name IN ["Max", "Alex", "Andrew"]
OPTIONAL MATCH (p)-[:HAS_MEETING_ON_2018_03_20]->(m:Meeting)
WITH COLLECT(m) AS occupied
MATCH (p:Person)‐[:SITS_IN]‐>(c:Cubicle)‐[:LOCATED_IN]‐>(f:Floor)<‐[:LOCATED_IN]‐(r:Room)
WHERE p.name IN ["Max", "Alex", "Andrew"]
WITH DISTINCT r, occupied
OPTIONAL MATCH (r)‐[:IS_BOOKED_ON_2018_03_20]‐>(m:Meeting)
WITH r, occupied + COLLECT(m {.start_time, .end_time}) AS meetings
UNWIND meetings AS m
WITH r, m order by m.start_time
WITH r, COLLECT(m) as meetings
WITH r,meetings,
               {end_time:apoc.date.parse('2018-03-20 08:30:00')} + 
               meetings + 
               {start_time:apoc.date.parse('2018-03-20 17:00:00')} AS bookedSlots
WITH r, meetings,[idx in range(0,size(bookedSlots)-2) | {start_time:(bookedSlots[idx]).end_time,end_time:(bookedSlots[idx+1]).start_time}] as allSlots
WITH r, meetings,[slot IN allSlots WHERE slot.end_time - slot.start_time > 10*60*1000] as openSlots
WITH r, [slot IN openSlots WHERE 
              NONE(m IN meetings WHERE slot.start_time < m.start_time < slot.end_time 
                                    OR slot.start_time < m.end_time < slot.end_time)] as freeSlots
RETURN r, [slot IN freeSlots | apoc.date.format(slot.start_time,'ms','HH:mm')+" to "+apoc.date.format(slot.end_time,'ms','HH:mm')] as free
ORDER BY r.name;

If you want expert help with your Cypher queries (and anything else Neo4j), be sure to join our Neo4j Users Slack Group, where over 7,500 Neo4j users hang out.

Running out of memory? Never run out of memory with Redis Enterprise databaseStart your free trial today.

Topics:
neo4j ,graph database ,cypher ,scheduling ,tutorial

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}