Ranking Systems: What I’ve Learned So Far
I often go off on massive tangents reading all about a new topic, but I don’t record what I’ve read. If I come back to the topic later I have to start from scratch, which is quite frustrating.
I started off by reading a paper written by James Keener about the Perron-Frobenius Theorem and the ranking of American football teams.
The Perron-Frobenius Theorem asserts the following:
a real square matrix with positive entries has a unique largest real eigenvalue and that the corresponding eigenvector has strictly positive components
This is applicable to network-based ranking systems: we can build a matrix of teams in which each entry stores a value representing one team’s performance against another, and then calculate an ordered ranking based on eigenvector centrality.
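As a note to self, here is a minimal sketch of that idea in plain Python. It is not Keener’s exact method: the team names and the score matrix are made up for illustration, and `rank_by_eigenvector` is a hypothetical helper that uses simple power iteration to approximate the dominant eigenvector. Every entry in the matrix is positive, so the Perron-Frobenius Theorem applies.

```python
def rank_by_eigenvector(names, scores, iterations=100):
    """Rank teams by the dominant eigenvector of a positive score matrix.

    scores[i][j] is a made-up measure of how well team i did against team j.
    Power iteration: repeatedly multiply a vector by the matrix and
    renormalise; for a positive matrix it converges to the dominant
    eigenvector.
    """
    n = len(names)
    vec = [1.0 / n] * n
    for _ in range(iterations):
        nxt = [sum(scores[i][j] * vec[j] for j in range(n)) for i in range(n)]
        total = sum(nxt)
        vec = [x / total for x in nxt]  # keep the components summing to 1
    return sorted(zip(names, vec), key=lambda pair: pair[1], reverse=True)


# Hypothetical data: row i holds team i's results against each team.
teams = ["A", "B", "C"]
scores = [
    [1.0, 3.0, 2.0],
    [1.0, 1.0, 2.0],
    [1.0, 1.0, 1.0],
]

ranking = rank_by_eigenvector(teams, scores)
for name, strength in ranking:
    print(name, round(strength, 3))
```

The appeal of this approach is that a team’s strength depends on the strength of the teams it did well against, rather than on a plain sum of results.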
I also came across the following articles describing different network-based approaches to ranking teams/players in tennis and basketball respectively:
- A network-based dynamical ranking system for competitive sports
- Using Graph Theory to Predict NCAA March Madness Basketball
Unfortunately I haven’t come across any corresponding code showing how to implement those algorithms, so I need to do a bit more reading and figure out how to do it.
In the world of non-network-based ranking systems I came across three algorithms:
- Elo – This is a method originally developed to calculate the relative skill of chess players.
Players start out with an average rating which then increases or decreases based on the games they take part in. If they beat someone rated much higher, they gain a lot of points, whereas losing to someone rated much higher costs them very little, since that result was already expected.
- Glicko – This method was developed by Mark Glickman after he identified some flaws in the Elo rating system around the reliability of players’ ratings.
This algorithm therefore introduces the concept of a ratings deviation (RD) to measure uncertainty in a rating. If a player plays regularly they’d have a low RD and if they don’t, it’d be higher. This is then taken into account when assigning points based on games between different players.
- TrueSkill – This one was developed by Microsoft Research to rank players on Xbox Live. It seems similar to Glicko in that it keeps a rating and an uncertainty for each player. TrueSkill’s FAQs suggest the following difference between the two:
Glicko was developed as an extension of ELO and was thus naturally limited to two player matches which end in either win or loss. Glicko cannot update skill levels of players if they compete in multi-player events or even in teams. The logistic model would make it computationally expensive to deal with team and multi-player games. Moreover, chess is usually played in pre-set tournaments and thus matching the right opponents was not considered a relevant problem in Glicko. In contrast, the TrueSkill ranking system offers a way to measure the quality of a match between any set of players.
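To make the Elo and Glicko descriptions above more concrete, here is a minimal Python sketch of both update rules as I understand them. The function names are my own, the Elo K-factor of 32 is an assumed (though common) choice, and the Glicko part only covers the update within a single rating period; it leaves out the step where RD grows while a player is inactive.

```python
import math


def elo_expected(rating, opponent):
    """Expected score (between 0 and 1) of a player against an opponent."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def elo_update(rating, opponent, score, k=32):
    """New Elo rating; score is 1 for a win, 0.5 for a draw, 0 for a loss.

    k=32 is an assumed K-factor, not something fixed by the algorithm.
    """
    return rating + k * (score - elo_expected(rating, opponent))


Q = math.log(10) / 400.0


def _g(rd):
    """Glicko weighting: shrinks the impact of opponents with a high RD."""
    return 1.0 / math.sqrt(1.0 + 3.0 * Q**2 * rd**2 / math.pi**2)


def glicko_update(rating, rd, results):
    """Glicko-1 update for one rating period.

    results is a list of (opponent_rating, opponent_rd, score) tuples.
    Returns the new (rating, rd).
    """
    d2_inv = 0.0  # accumulates 1 / d^2
    delta = 0.0   # accumulates g * (score - expected score)
    for opp_rating, opp_rd, score in results:
        g = _g(opp_rd)
        expected = 1.0 / (1.0 + 10 ** (-g * (rating - opp_rating) / 400.0))
        d2_inv += Q**2 * g**2 * expected * (1.0 - expected)
        delta += g * (score - expected)
    denom = 1.0 / rd**2 + d2_inv
    return rating + (Q / denom) * delta, math.sqrt(1.0 / denom)


# Elo: an upset win earns far more than a win over an equal.
print(elo_update(1500, 1500, 1))  # equal opponents: winner gains k / 2
print(elo_update(1400, 1800, 1))  # underdog wins: gains close to k

# Glicko: one win and two losses in a single rating period.
new_rating, new_rd = glicko_update(
    1500, 200, [(1400, 30, 1), (1550, 100, 0), (1700, 300, 0)]
)
print(round(new_rating), round(new_rd, 1))
```

The Glicko call uses the same numbers as the worked example in Glickman’s description of the system, and it should come out at roughly 1464 with an RD of about 151.4. Note how the RD drops from 200 just by playing games, which is the “plays regularly” effect described above.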
Scott Hamilton has an implementation of all these algorithms in Python that I need to play around with. He based his implementation on a blog post by Jeff Moser, which works through probability, the Gaussian distribution, Bayesian probability and factor graphs on the way to deciphering the TrueSkill algorithm. Moser has also created a project implementing TrueSkill in C# on GitHub.
I follow tennis and football reasonably closely, so I thought I’d do a bit of reading about the two main rankings I know about there as well:
- UEFA club coefficients – used to rank football clubs that have taken part in a European competition over the last five seasons. It takes into account the importance of the match but not the strength of the opposition.
- ATP Tennis Rankings – used to rank tennis players on a rolling basis over the last 12 months. They take into account the importance of a tournament and the round a player reached to assign ranking points.
Now that I’ve recorded all that it’s time to go and play with some of them!
Published at DZone with permission of Mark Needham, DZone MVB. See the original article here.