Have you ever wondered about how they schedule trains? They need to minimize the commune between any 2 relevant train stations, while complying to their resource limits: train sizes, train speed, physical location of train sets, employee working hours, employee qualifications, occupancy of train station platforms, weekly maintenance, ...
Did you ever try to plan a meeting, just to find that on the one day that all participants can attend, all meeting rooms are occupied? Or there's no available beamer?
The world is full of planning problems.
In this article I'll focus on solving a very simple lesson schedule:
- A school has teachers, student groups and a bunch of lessons (which combine 1 teacher with 1 student group). Those lessons need to fit into a limited number of timeslots.
- The goal is to avoid conflicts: a teacher can't teach 2 lessons in the same timeslot and a student group can't attend 2 lessons in the same timeslot either.
Size does matter
A planning problem has a score function, which determines the score for any possible solution we can construct. Based on that score, we can compare solutions and determine the optimal solution. There is usually only 1 optimal solution, but the number of possible solutions is huge.
For example, let's take a medium school with 30 student groups, 40 teachers and 33 timeslots. Each student group has a lesson in each timeslot, which makes 990 lessons. In how many distinct ways can those 990 lessons be scheduled into 33 timeslots? In other words, how many possible solutions does it have? Well, this example has 33 ^ 990 or a little over 2e1503 possible solutions. Yes, for a medium school, that number is a 2 with 1503 zero's behind it!
Calculating the score of a single solution can take quite some time, as every teacher and student group needs to checked on conflicts with any other teacher or student group. Due the sheer size of the search space and real-world time constraints, it normally suffices to find a “good enough” solution, better than those of human planners.
There are different ways to solve a planning problem:
Random: There's no overarching entity which plans the system. For example, traffic is solved almost randomly: everyone just picks their route and departure time as they please. If you ever drove through a major city, once during rush hour and once an hour after rush hour, you know this is nowhere near optimal.
Exhaustive: Through brute force every possible solution is evaluated to determine the optimal solution. In practice this simply doesn't work: if we could evaluate a billion possible solutions each millisecond, it would still take over 6e1483 years to evaluate all the possible solutions for our medium school. Even smart exhaustive algorithms, such as branch and bound, take ages.
Deterministic: The resources are put into the planning based on simple rules. The solution is far from optimal, but it's sometimes “good enough”. For example, patients in a clinic are treated in a FIFO queue (first come, first serve), but emergency patients get moved to the front of the queue. In our lesson schedule example, we could schedule one lesson at a time by assigning it a timeslot that doesn't already have a lesson with the same teacher or student group. However, we will more then likely end up with lessons that can't be assigned.
Metaheuristic: The algorithm wades through the possible solutions in an intelligent manner and remembers the best solution it encounters. It decides which solutions to visit, guided by the score of the already encountered solutions. There are various metaheuristic algorithms, such as tabu search, simulated annealing, ant colony optimization, genetic algorithms, ... For example, a greedy algorithm starts from a random solution and evaluates every solution that moves 1 lesson to a different timeslot, and it takes the best of those solutions as the new solution to continue from.
In practice, it's usually recommended to start with a deterministic algorithm and improve upon that solution with a metaheuristic algorithm.
Introducing Drools Solver
Drools Solver is an open source Java library to help you solve planning problems, by using metaheuristic algorithms. The API isn't completely stable yet, but there's already a full manual and 5 extensive examples available.
Drools Solver uses the Drools rule engine for score calculation, which greatly reduces the complexity and effort to write very scalable constraints in a declarative manner. In our lesson schedule example, we could implement a teacher conflict like this:
$l : Lesson($id : id, $t : teacher, $ts : timeslot);
exists Lesson(id > $id, teacher == $t, timeslot == $ts);
Notice how the rule to detect student group conflicts isn't mixed into this one: it's completely separated. And because the rules don't define how they should be detected, it's left up to the Drools rule engine to use for-loops, hash tables or something much more efficient (like a ReteOO network).
Drools Solver supports several algorithms to find a solution in a limited amount of time. Currently its supports hill climbing, several forms of tabu search, simulated annealing and great deluge. A tabu search configuration for our lesson schedule example with a time limit of 5 minutes, looks like this:
You can easily switch to the simulated annealing metaheuristic by changing a few lines in the configuration. There's even a benchmarker utility which you can use to play out different configurations against each other and determine the best one for your use case.
Want to know more about Drools Solver? Take a look at the manual or run the examples of the download package. Your feed back is always welcome on the Drools user mailing list.
Reference manual: http://users.telenet.be/geoffrey/tmp/solver/manual/html_single/
Mailing list through Gmane: http://dir.gmane.org/gmane.comp.java.drools.user
About the author
Geoffrey De Smet is the main developer of Drools Solver. He finished 4th in the International Timetabling Competition 2007's examination track with an early version of Drools Solver.