Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

DZone's Guide to

# More on "Staggering Odds" - Visualizing Probabilities

· Big Data Zone ·
Free Resource

Comment (0)

Save
{{ articles[0].views | formatCount}} Views

Hortonworks Sandbox for HDP and HDF is your chance to get started on learning, developing, testing and trying out new features. Each download comes preconfigured with interactive tutorials, sample data and developments from the Apache community.

Following my previous post, a few more things. As mentioned by Frédéric, it is – indeed – possible to compute the probability of all pairs. More precisely, all pairs are not as likely to occur: some teams can play against (almost) eveyone, while others cannot. From the previous table, it is possible to compute probability that the last team plays against team 1. Or team 2 (numbers are from the  xls file mentioned previously). To make it simple

```> table(M[,2*n])/length(M[,2*n])*100

1        2        3        5        7       10       11
11.82500 12.61212 12.61212 13.25279 19.31173 18.70767 11.67856```

Here, the last team (as I did rank them) has 11.8% chances to play against team 1, and 19.3% to play against team 7. If we compute all the probabilities, we obtain

```> S
1     2     3     5     7    10    11    13
4   0.00 14.16 14.16  0.00 22.22 21.25 13.05 15.13
6  12.52 13.19 13.19 14.11 20.13  0.00 12.35 14.47
8  18.78  0.00 19.54 21.50  0.00  0.00 18.39 21.76
9  18.78 19.54  0.00 21.50  0.00  0.00 18.39 21.76
12 14.68 15.54 15.54 16.56  0.00 23.19 14.47  0.00
14 11.64 12.37 12.37 13.05 18.96 18.25  0.00 13.34
15 11.77 12.55 12.55  0.00 19.36 18.59 11.64 13.50
16 11.82 12.61 12.61 13.25 19.31 18.70 11.67  0.00```

that can be visualized below

White areas cannot be reached, while red ones are more likely. Here, we compute probability that home team (given on the x-axis) plays against some visitor team (on the y-axis). The fact that those probabilities are not uniform seems odd. But I guess it comes from those constraints…

Another weird point: it is possible to reach a deadlock. At least with the technique I have been using. So far, I did not count them. But we can, simply the following code

```> U=c(4,6,8,9,12,14,15,16)
> a1=U[1]
> b1=U[2]
> c1=U[3]
> d1=U[4]
> e1=U[5]
> f1=U[6]
> g1=U[7]
> h1=U[8]
> a2=b2=c2=d2=e2=f2=g2=h2=NA
> posa2=(1:n)%notin%c(LISTEIMPOSSIBLE[,a1])
> if(length(posa2)==0){na=na+1}
> for(a2 in posa2){
+ posb2=(1:n)%notin%c(LISTEIMPOSSIBLE[,b1],a2)
+ if(length(posb2)==0){na=na+1}
+ for(b2 in posb2){
+ posc2=(1:n)%notin%c(LISTEIMPOSSIBLE[,c1],a2,b2)
+ if(length(posc2)==0){na=na+1}
+ for(c2 in posc2){
+ posd2=(1:n)%notin%c(LISTEIMPOSSIBLE[,d1],
+ a2,b2,c2)
+ if(length(posd2)==0){na=na+1}
+ for(d2 in posd2){
+ pose2=(1:n)%notin%c(LISTEIMPOSSIBLE[,e1],
+ a2,b2,c2,d2)
+ if(length(pose2)==0){na=na+1}
+ for(e2 in pose2){
+ posf2=(1:n)%notin%c(LISTEIMPOSSIBLE[,f1],
+ a2,b2,c2,d2,e2)
+ if(length(posf2)==0){na=na+1}
+ for(f2 in posf2){
+ posg2=(1:n)%notin%c(LISTEIMPOSSIBLE[,g1],
+ a2,b2,c2,d2,e2,f2)
+ if(length(posg2)==0){na=na+1}
+ for(g2 in posg2){
+ posh2=(1:n)%notin%c(LISTEIMPOSSIBLE[,h1],
+ a2,b2,c2,d2,e2,f2,g2)
+ if(length(posh2)==0){na=na+1}
+ for(h2 in posh2){
+ s=s+1
+ V=c(a1,a2,b1,b2,c1,c2,d1,d2,e1,e2,f1,f2,g1,g2,h1,h2)
+ }}}}}}}}```

On the initial ordering of home team, the number of deadlocks was

```> na
[1] 657```

The probability of obtaining a deadlock is then

```> 657/(657+5463)
[1] 0.1073529```

(657 scenarios ended in a dead end, while 5463 ended well). The worst case was obtained when we considered

` [1]    6    4   16   14   12   15    8    9`

In that case, the probability of obtaining a deadlock was

```> 4047/(4047+5463)
[1] 0.4255521```

Here, it clearly depends on the ordering. So if we draw – randomly – the order of the home teams, i.e.

`> Urandom=sample(U,size=8)`

the distribution of the probablity of having a deadlock is

All those computations were based on my understanding of the drawings. But Kristof (aka @ciebiera), on his blog krzysztofciebiera.blogspot.ca/… obtained different results. For instance, based on my previous computations, the probability to obtain identical pairs was 0.018349% (1 chance out of 5463), but Kristof obtained – based on the UEFA procedure (as he called it) – a probability of 0.0181337%. Which is not _ strictly – the same, but both computations yield relatively close results…

Hortonworks Community Connection (HCC) is an online collaboration destination for developers, DevOps, customers and partners to get answers to questions, collaborate on technical articles and share code examples from GitHub.  Join the discussion.

Topics:

Comment (0)

Save
{{ articles[0].views | formatCount}} Views

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.