Fuzzy Puzzles: Having My Baby

A friend at work, Drew Fustin, proposed this puzzle in our group chat one day as I was meandering on about Bayesian shiny things.

“Alexis has a baby. Before she can even tell which gender it is, the doctor sweeps the baby away into the nursery. Prior to this new baby being added there were two girls and an unknown number of boys. You pick a baby at random from the nursery now that Alexis’s baby has been added to it. Given that you happen to choose a girl, what is the probability Alexis gave birth to a girl?”

My first reaction was “How can the number of boys not matter?” Of course it doesn’t in the end and that’s why this is a great puzzle.

What are the knowns?

First we start with our knowns and unknowns. Let’s define the concepts we’ll be working with throughout solving the problem.

P( Ag ) = probability that Alexis had a girl. 
P( Ab ) = probability that Alexis had a boy. 
P( Cg ) = probability a girl is chosen from the nursery. 
P( Cb ) = probability a boy is chosen from the nursery. 
y = 2 girls + n boys + 1 new baby (boy or girl)

With this in mind, the probability we are solving for is P( Ag | Cg ), which is the probability Alexis had a girl given that a girl is chosen from the nursery.

The question to answer: 
P( Ag | Cg ) = ?

The easy givens: 
P( Ag ) = .5 
P( Ab ) = .5 
P(Ag, Cg) = P(Ag | Cg) * P( Cg ) = P( Cg | Ag ) * P( Ag ) 
P(Ag | Cg) = P( Cg | Ag ) * P( Ag ) / P( Cg )

We can’t know the exact probability of choosing a girl since we don’t know exactly how many babies are in the nursery. Because of this we leave the unknown as a variable, y. If Alexis had a girl, 3 of the y babies are girls; if she had a boy, only 2 are: 
y = number of babies in the nursery after the new child is added 
P( Cg | Ag ) = 3/y 
P( Cg | Ab ) = 2/y
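
To make those two likelihoods concrete, here is a minimal Python sketch. The function name `likelihoods` and the parameter `n_boys` are my own labels for illustration, not part of the puzzle; the sketch assumes the nursery holds 2 girls, `n_boys` boys, and Alexis's baby.

```python
from fractions import Fraction

def likelihoods(n_boys):
    """Return (P(Cg | Ag), P(Cg | Ab)) for a nursery with n_boys boys."""
    y = 2 + n_boys + 1  # total babies after Alexis's baby is added
    p_cg_given_ag = Fraction(3, y)  # 3 girls in the nursery if Alexis had a girl
    p_cg_given_ab = Fraction(2, y)  # still only 2 girls if Alexis had a boy
    return p_cg_given_ag, p_cg_given_ab

for n_boys in (0, 1, 5, 20):
    girl_case, boy_case = likelihoods(n_boys)
    # Both likelihoods shrink as boys are added, but their ratio stays fixed.
    print(n_boys, girl_case, boy_case, girl_case / boy_case)
```

Each likelihood depends on y, but the ratio between them does not, which is a hint at why the number of boys ends up not mattering.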

The subtle known

There’s one last subtle thing that we can know. We can create an equation that represents all of the possible scenarios. This is the “hard” given: 
1 = P( Ag, Cg ) + P( Ab, Cg ) + P( Ag, Cb ) + P( Ab, Cb )
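
As a sanity check on that equation, the four joint probabilities can be tabulated for any particular nursery. This is only a sketch: `joint_table` and `n_boys` are hypothetical names, and the two boy-chosen terms use n/y and (n + 1)/y, which I am deriving from the setup rather than taking from the puzzle text.

```python
from fractions import Fraction

def joint_table(n_boys):
    """Joint probabilities of (Alexis's baby, chosen baby) for n_boys boys."""
    y = 2 + n_boys + 1      # babies in the nursery after the new arrival
    half = Fraction(1, 2)   # P(Ag) = P(Ab) = .5
    return {
        ("Ag", "Cg"): half * Fraction(3, y),           # 3 girls to choose from
        ("Ab", "Cg"): half * Fraction(2, y),           # 2 girls to choose from
        ("Ag", "Cb"): half * Fraction(n_boys, y),      # n boys to choose from
        ("Ab", "Cb"): half * Fraction(n_boys + 1, y),  # n + 1 boys to choose from
    }

table = joint_table(n_boys=4)
for scenario, probability in table.items():
    print(scenario, probability)
print("total:", sum(table.values()))
```

Whatever value of `n_boys` is used, the four entries should sum to exactly 1.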

The first two terms cover every scenario in which a girl is chosen, so together they add up to P( Cg ), the overall probability of choosing a girl: 
P( Cg ) = P( Ag, Cg ) + P( Ab, Cg ) 
P( Cg ) = P(Ag) * P(Cg | Ag) + P(Ab) * P(Cg | Ab)

Conditional probabilities are fun because we could write the second line above that way or as P( Cg ) = P(Cg) * P(Ag | Cg) + P(Cg) * P(Ab | Cg). The reason I chose the way I did is because I can replace P(Cg | Ag) and P(Cg | Ab) with 3/y and 2/y respectively.

When the number of babies stops mattering

Next remember this statement above: 
P(Ag | Cg) = P( Cg | Ag ) * P( Ag ) / P( Cg )

We can rearrange this equation to isolate P( Cg ), which gives us two expressions for the same quantity: 
P( Cg ) = P( Cg | Ag ) * P( Ag ) / P(Ag | Cg) 
P( Cg ) = P(Ag) * P(Cg | Ag) + P(Ab) * P(Cg | Ab)

Setting them equal: 
P( Cg | Ag ) * P( Ag ) / P(Ag | Cg) = P(Ag) * P(Cg | Ag) + P(Ab) * P(Cg | Ab)

Now substituting in our knowns: 
(3/y) * .5 / P(Ag | Cg) = .5 * (3/y) + .5 * (2/y)

Remember how we don’t know how many babies are in the nursery and labelled it as y? Watch them all cancel out in this step: 
3 * .5 / P(Ag | Cg) = .5 * 3 + .5 * 2

Solve for P(Ag | Cg) 
P(Ag | Cg) = (3 * .5) / (.5 * 3 + .5 * 2) 
P(Ag | Cg) = .6
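
The same arithmetic can be checked exactly with Python's `fractions` module. This is a small sketch rather than anything from the original puzzle; `posterior_girl` and `n_boys` are names I made up for it.

```python
from fractions import Fraction

def posterior_girl(n_boys):
    """P(Ag | Cg) for a nursery with 2 girls, n_boys boys, and Alexis's baby."""
    y = 2 + n_boys + 1
    p_ag = p_ab = Fraction(1, 2)
    numerator = Fraction(3, y) * p_ag                         # P(Cg | Ag) * P(Ag)
    evidence = Fraction(3, y) * p_ag + Fraction(2, y) * p_ab  # P(Cg)
    return numerator / evidence

for n_boys in (0, 1, 10, 1000):
    print(n_boys, posterior_girl(n_boys))  # 3/5 every time
```

The y in the numerator and the y in the denominator cancel, so the result is 3/5 regardless of `n_boys`.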

So there it is! Given that we chose a girl from the nursery, there is a 60% chance that Alexis gave birth to a girl, no matter how many boys are in the nursery!
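
If the algebra feels too slippery, a quick simulation is another way to convince yourself. The sketch below is my own illustration, with hypothetical names (`simulate`, `n_boys`, `trials`, `seed`): it plays the puzzle many times, keeps only the rounds where a girl was picked, and reports how often Alexis's baby was a girl in those rounds.

```python
import random

def simulate(n_boys, trials=100_000, seed=0):
    """Estimate P(Ag | Cg) by simulation for a nursery with n_boys boys."""
    rng = random.Random(seed)
    girl_chosen = 0             # rounds where the picked baby is a girl
    alexis_girl_and_chosen = 0  # ...and Alexis's baby was also a girl
    for _ in range(trials):
        alexis_girl = rng.random() < 0.5
        # Index 0 is Alexis's baby, 1 and 2 are the known girls, the rest are boys.
        pick = rng.randrange(n_boys + 3)
        chosen_is_girl = pick in (1, 2) or (pick == 0 and alexis_girl)
        if chosen_is_girl:
            girl_chosen += 1
            alexis_girl_and_chosen += alexis_girl
    return alexis_girl_and_chosen / girl_chosen

for n_boys in (0, 3, 20):
    print(n_boys, round(simulate(n_boys), 3))
```

The estimates should hover around 0.6 for every nursery size, with a little noise that shrinks as `trials` grows.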

Published at DZone with permission of Justin Bozonier, DZone MVB. See the original article here.

