Big Data: Olympics Swimming Lap Charts
Tony Hirst offers some insight into a series of data visualizations published recently in the New York Times. The content of this article was originally published on his blog, OUseful.Info.
Part of the promise of sports data journalism is the ability to use data from an event to enrich the reporting of that event. One of the graphical devices widely used in motor racing is the lap chart, which shows the relative positions of each car at the end of each lap:
Another, more complex chart, and one that can be quite hard to read when you first come across it, is the race history chart, which shows the laptime of each car relative to the average laptime (calculated over the whole of the race) of the race winner:
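One way to read the race history chart, sketched here under assumptions the text does not spell out, is to compare each car's elapsed time at the end of every lap with the time a car lapping at the winner's average pace would have taken. The function name `race_history`, the driver labels and the lap times are all made up for illustration.

```python
from itertools import accumulate


def race_history(lap_times, winner):
    """Return {driver: [race history value at end of lap 1, lap 2, ...]}.

    lap_times maps a driver label to a list of lap times in seconds.
    Each value is (winner's average lap time * lap number) minus the
    driver's elapsed time, so a higher value means further ahead on track.
    This is one interpretation of the chart, not a definitive formula.
    """
    elapsed = {d: list(accumulate(times)) for d, times in lap_times.items()}
    winner_avg = elapsed[winner][-1] / len(elapsed[winner])
    return {
        d: [winner_avg * (lap + 1) - t for lap, t in enumerate(times)]
        for d, times in elapsed.items()
    }


# Made-up lap times (seconds) for two hypothetical drivers.
sample_lap_times = {
    "W": [100.0, 98.0, 96.0],  # race winner
    "X": [99.0, 99.0, 100.0],
}
history = race_history(sample_lap_times, winner="W")
```

By construction the winner's trace ends at zero, and every other finisher ends below it by their gap to the winner.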
Both of these charts can be used to illustrate the progression of a race, and even in some cases to identify stories that might otherwise have been missed (particularly races amongst back markers, for example). For Olympics events particularly, where reporting is often at a local level (national and local press reporting on the progression of their athletes, as well as the winning athletes), timing data may be one of the few sources available for finding out what actually happened to a particular competitor who didn’t feature in the typical coverage that's focussed on the head of the race.
I’ve also experimented with some other views, including a race summary chart that captures the start position, end of first lap position, final position and range of positions held at the end of each lap by each driver:
One of the ways of using this chart is as a quick summary of the race position chart, as well as a tool for highlighting possible “driver of the day” candidates.
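The numbers behind a summary chart like this can be sketched in a few lines, assuming the grid positions and the end-of-lap positions are already to hand. The function name `race_summary`, the field names and the data are hypothetical, and "places gained" is only one possible heuristic for spotting a driver of the day.

```python
def race_summary(grid, lap_positions):
    """Summarise each driver's race from grid and end-of-lap positions.

    grid maps driver -> starting position.
    lap_positions maps driver -> [position at end of lap 1, lap 2, ...].
    """
    summary = {}
    for driver, positions in lap_positions.items():
        summary[driver] = {
            "start": grid[driver],
            "lap1": positions[0],
            "final": positions[-1],
            "best": min(positions),    # highest position held at the end of any lap
            "worst": max(positions),   # lowest position held at the end of any lap
            "gained": grid[driver] - positions[-1],  # positive means places made up
        }
    return summary


# Made-up data for three hypothetical drivers over three laps.
grid = {"A": 1, "B": 2, "C": 3}
positions = {"A": [2, 1, 2], "B": [1, 2, 3], "C": [3, 3, 1]}

summary = race_summary(grid, positions)
# Driver with the most places gained between the grid and the flag.
candidate = max(summary, key=lambda d: summary[d]["gained"])
```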
A rich lap chart might also be used to convey information about the distance between cars as well as their relative positions. Here’s one experiment I tried (using Gephi to visualise the data) in which node size is proportional to the time to the lead car, and the colour is related to the time to the car behind (when red is hot, the car behind is close):
(You might also be able to imagine a variant of this chart where we fix the y-value so each row shows data relating to one particular driver. Looking along a row then allows us to see how exciting a race they had.)
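The two quantities that drive node size and colour in that experiment, the time to the lead car and the time to the car behind, can be derived from lap times along the following lines. This is a minimal sketch with invented data and a hypothetical `lap_gaps` helper; it assumes every driver completes every lap and ignores lapped cars and retirements.

```python
from itertools import accumulate


def lap_gaps(lap_times):
    """Return one row per driver per lap with gaps in seconds.

    lap_times maps a driver label to a list of lap times in seconds.
    """
    elapsed = {d: list(accumulate(times)) for d, times in lap_times.items()}
    n_laps = min(len(times) for times in elapsed.values())
    rows = []
    for lap in range(n_laps):
        # Running order at the end of this lap, by total elapsed time.
        order = sorted(elapsed, key=lambda d: elapsed[d][lap])
        leader_time = elapsed[order[0]][lap]
        for i, driver in enumerate(order):
            t = elapsed[driver][lap]
            if i + 1 < len(order):
                behind = elapsed[order[i + 1]][lap] - t
            else:
                behind = None  # nobody behind the last car
            rows.append({
                "lap": lap + 1,
                "driver": driver,
                "position": i + 1,
                "to_leader": t - leader_time,
                "to_car_behind": behind,
            })
    return rows


# Made-up lap times (seconds) for three hypothetical drivers.
rows = lap_gaps({"A": [90.0, 91.0], "B": [90.5, 90.0], "C": [92.0, 90.0]})
```

Each row could then be written out as a node, with `to_leader` mapped to size and `to_car_behind` mapped to colour.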
All of these charts can be calculated from lap time data. Some of them can be calculated from data describing the position held by each competitor at the end of each lap. But whatever the case, the data is what drives the visualisation.
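As a rough illustration of the first step, lap times can be turned into lap chart positions by accumulating each competitor's times and ranking the totals at the end of each lap. The sketch below uses made-up data and a hypothetical `lap_positions` function, and assumes everyone starts the clock together and completes every lap.

```python
from itertools import accumulate


def lap_positions(lap_times):
    """Return {driver: [position at end of lap 1, lap 2, ...]}.

    lap_times maps a driver label to a list of lap times in seconds.
    """
    elapsed = {d: list(accumulate(times)) for d, times in lap_times.items()}
    n_laps = min(len(times) for times in elapsed.values())
    positions = {d: [] for d in lap_times}
    for lap in range(n_laps):
        # Lowest elapsed time at the end of the lap is first on the road.
        order = sorted(elapsed, key=lambda d: elapsed[d][lap])
        for pos, driver in enumerate(order, start=1):
            positions[driver].append(pos)
    return positions


# Made-up lap times (seconds) for three hypothetical drivers.
sample_lap_times = {
    "A": [92.0, 90.5, 90.8],
    "B": [91.5, 91.2, 91.0],
    "C": [93.0, 90.0, 89.9],
}
chart = lap_positions(sample_lap_times)
```

Going the other way is not possible: positions alone do not recover the gaps, which is why only some of the charts can be built from position data.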
A part of me was hoping to see laptime data for Olympics track, swimming and cycling events, but if any exists, I haven’t found a reliable source yet. What I found encouraging, though, was that the New York Times (in many ways one of the organisations that is seeing the value of using visualised data-driven storytelling in its daily activities) made some split time data available, and put it to work, in the swimming events:
Here, the NYT gives split data showing the times achieved in each leg by the relay team members, along with a lap chart that has a higher level of detail, showing the position of each team at the end of each 50 meter length (I think?!). The progression of each of the medal winners is highlighted using an appropriate colour theme.
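If the underlying data is a running time at the end of each 50 metre length, the per-swimmer leg times can be recovered by differencing. The sketch below assumes a relay in which each leg is two lengths (100 metres); that layout, the `leg_times` helper and the split times are illustrative assumptions, not values taken from the NYT graphic.

```python
def leg_times(cumulative_splits, lengths_per_leg=2):
    """Return the time taken for each relay leg, in seconds.

    cumulative_splits is the team's running time at the end of each length.
    lengths_per_leg is how many lengths each swimmer covers (assumed to be 2).
    """
    # Running time at the end of each leg (every lengths_per_leg-th split).
    marks = cumulative_splits[lengths_per_leg - 1::lengths_per_leg]
    starts = [0.0] + marks[:-1]
    return [round(end - start, 2) for start, end in zip(starts, marks)]


# Made-up running times (seconds) for one hypothetical team over eight lengths.
team_splits = [23.0, 48.5, 71.0, 96.5, 119.5, 145.0, 167.5, 192.0]
legs = leg_times(team_splits)
```

Ranking the teams' running times at each length, in the same way as for motor racing laps, would then give the positions shown in the lap chart.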
The chart provides an illustration that can be used to help a reporter identify different stories about how the race progressed, whether or not it is included in the final piece. The graphic can also be used as a sidebar illustration of a race report.
Lap charts also lend themselves to interactive views, or highlighted customisations that can be used to illustrate competition between selected individuals. Here’s another F1 example, this time from the f1fanatic blog:
(I have to admit, I prefer this sort of chart with grayed options for the unhighlighted drivers because it gives a better sense of the position churn happening elsewhere in the race.)
Of course, without the data, generating these charts can be difficult …
… which is to say: if you know where lap data is for any of the 2012 Olympics events, please post a link to the source in the comments below :-)
Published at DZone with permission of Eric Genesky. See the original article here.