# Are These "Staggering" Odds Really So Staggering?

# Are These "Staggering" Odds Really So Staggering?

Join the DZone community and get the full member experience.

Join For FreeHortonworks Sandbox for HDP and HDF is your chance to get started on learning, developing, testing and trying out new features. Each download comes preconfigured with interactive tutorials, sample data and developments from the Apache community.

I was supposed to take a holiday break, but Frédéric, professor in Tours, came back to me this morning with a tickling question. He asked me what were the odds that the Champions League draw produces exactly the same pairings from the practice draw, and the official one - an occurrence the Daily Mail describes as "staggering" at 2,000,000 to 1 odds?

To be honest, I don’t know much about soccer, so here is what happened, with the practice draw (on the left, on December 19th) and the official one (on the right, on December 20th),

Clearly, the pairs are identical, but not the order. Actually, at first, I was suprised that even which team plays at home first, was iddentical. But (it seams that) teams that play at home first are the ones that ended second after the previous stage of the competition.

And to be more specific about those draws, those pairs were obtained using real urns, real balls, so it is pure randomness (again, as far as I understood). But with very specific rules. For instance, two teams from the same country cannot play together (or one against the other) at this stage. Or teams that ended first after the previous turn can only play with (or against) teams that ended second. Actually, Frederic sent me an xls file, with a possibility matrix.

Let us find all possible pairs, regardless which team plays at home first (again, we do not care here since the order is defined by the rule mentioned above). Doing the maths might have been a bit complicated, with all those contraints. With a small code, it is possible to list all possible pairs, for those eight games. Let us import our possibility matrix,

> n=16 > uefa=read.table( + "http://freakonometrics.blog.free.fr/public/data/uefa.csv", + sep=",",header=TRUE) > LISTEIMPOSSIBLE=matrix( + (rep(1:n,n))*(uefa[1:n,2:(n+1)]=="NON"),n,n)

I can fix the first team (in my list, the fourth one is the first team that ended second). Then, I look at all possible second one (that will play with the first one),

> a1=1 > "%notin%" <- function(x, table){x[match(x, table, nomatch = 0) == 0]} > posa2=((a1+1):n)%notin%LISTEIMPOSSIBLE[,a1]

Then, consider the second team that ended second (the sixth one in my list). And look at all possible fourth team (that will play this second game), i.e exluding the one that were already drawn, and those that are not possible,

> b1=6 > posb2=(1:n)%notin%c(LISTEIMPOSSIBLE[,b1],a2)

Etc. So, given the list of home teams,

> a1=4 > b1=6 > c1=8 > d1=9 > e1=12 > f1=14 > g1=15 > h1=16

consider the following loops,

> posa2=(1:n)%notin%c(LISTEIMPOSSIBLE[,a1]) > for(a2 in posa2){ + posb2=(1:n)%notin%c(LISTEIMPOSSIBLE[,b1],a2) + for(b2 in posb2){ + posc2=(1:n)%notin%c(LISTEIMPOSSIBLE[,c1],a2,b2) + for(c2 in posc2){ + posd2=(1:n)%notin%c(LISTEIMPOSSIBLE[,d1],a2,b2,c2) + for(d2 in posd2){ + pose2=(1:n)%notin%c(LISTEIMPOSSIBLE[,e1],a2,b2,c2,d2) + for(e2 in pose2){ + posf2=(1:n)%notin%c(LISTEIMPOSSIBLE[,f1],a2,b2,c2,d2,e2) + for(f2 in posf2){ + posg2=(1:n)%notin%c(LISTEIMPOSSIBLE[,g1],a2,b2,c2,d2,e2,f2) + for(g2 in posg2){ + posh2=(1:n)%notin%c(LISTEIMPOSSIBLE[,h1],a2,b2,c2,d2,e2,f2,g2) + for(h2 in posh2){ + s=s+1 + V=c(a1,a2,b1,b2,c1,c2,d1,d2,e1,e2,f1,f2,g1,g2,h1,h2) + cat(s,V,"\n") + M=rbind(M,V) + }}}}}}}}

With the print option, we end up with

5461 4 13 6 11 8 5 9 2 12 10 14 3 15 7 16 1 5462 4 13 6 11 8 5 9 2 12 10 14 7 15 1 16 3 5463 4 13 6 11 8 5 9 2 12 10 14 7 15 3 16 1

i.e.

> nrow(M) [1] 5463

possible pairs (the list can be found here, where numbers are the same as the one in the csv file). Which was the probability mentioned in acomment in the article mentioned previously dailymail.co.uk/…. So the probability to have exactly the same output after the practise and the official draws was (in %)

> 100/nrow(M) [1] 0.01830496

Which is not *that* small when we think about it….

And if someone has a mathematical expression for this probability, I am interested. The only reliable method I found was to list all possible pairs (the csv file is available if someone wants to check). But I am not satisfied….

Hortonworks Community Connection (HCC) is an online collaboration destination for developers, DevOps, customers and partners to get answers to questions, collaborate on technical articles and share code examples from GitHub. Join the discussion.

Published at DZone with permission of Arthur Charpentier , DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

## {{ parent.title || parent.header.title}}

## {{ parent.tldr }}

## {{ parent.linkDescription }}

{{ parent.urlSource.name }}