An Elegant Matching Algorithm In Ruby
Join the DZone community and get the full member experience.
Join For FreeRecently, I wanted to sit down a learn ruby (independent of rails), so I
grabbed a fairly standard hacking problem and went to town on it. I
now love ruby more than ever.
The Problem:
Given a set
of people and a set of jobs, where each person may take a different
amount of time to do each job, optimally match people to jobs (1:1) to
minimize the amount of time it will take to complete all jobs.
Many people will recognize this as a standard matching problem for which the Hungarian Algorithm
is a known solution. But for those that have implemented the Hungarian
Algorithm (or seen implementations of it), you know well enough to
steer clear. It is error prone, and a very specific algorithm. So, I
sought to implement a more elegant (and generally applicable) solution
using graphs.
I found this article over on topcoder describing max-flow algorithms and the beauty of such. I fell in love and decided that I needed to solve this with max-flow.
The formulation:
Lets convert our problem to a bipartite graph.
Let one set of nodes be the people, and the other set of nodes be the
jobs. Create an edge from each person to each job, with a weight (NOT
capacity!) equal to the time it will take that person to do that job.
In
our situation, the capacity for each line is one since only one person
can do a job. Flow is ofcourse initialized to zero for each edge (no one
is doing any of the jobs). Lastly, we connect every job to a SINK node
in the graph.
The philosophy and algorithm: (the important part)
Essentially,
we'll be finding shortest paths in the graph, from each person to the
SINK (via jobs) iterating through each of the people. On each
iteration, we augment the graph with the path.
To recap the
topcoder article, augmenting the graph consists of incrementing the flow
for each edge in the augmenting path and adjusting edges to represent
the new flow/capacity. There is an edge that represents "residual
capacity" with capacity == capacity - flow, and there is an edge in the
reverse direction that represents "upstream flow" with capacity == flow.
REMEMBER, in our case:
Capacity is ALWAYS 1.
Flow is either 1 or 0.
This is ENTIRELY independent of the cost/value/weight of an edge.
What
does that mean to us you say? Well, in our case there are only two
situations, a person is assigned to a job (flow == 1), or a person is
not assigned ot a job (flow == 0). In the first case, where a person is
assigned a job, there is an edge from the job to the person. During the
algorithm, this edge essentially represents the path to UNDO the
assignment. In the second case, the edge simply represents the
making that assignment.
Source: http://brianoneill.blogspot.com/2009/07/elegant-matching-algorithm-in-ruby.html
Opinions expressed by DZone contributors are their own.
Comments