Solving for Probability From Entropy

Entropy is a quantity that is basic to various machine learning classification tasks. Read on to learn how to recover a probability from it.

If a coin comes up heads with probability p and tails with probability 1-p, the entropy in the coin flip is:

S = –p log₂ p – (1-p) log₂ (1-p).
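
As a concrete reference, here is a minimal Python sketch of that formula. The function name binary_entropy is my own choice for illustration, and the code assumes the usual convention that 0 · log₂ 0 is treated as 0.

```python
import math

def binary_entropy(p):
    """Entropy, in bits, of a coin that comes up heads with probability p."""
    if p <= 0.0 or p >= 1.0:
        # Convention: 0 * log2(0) = 0, so a certain outcome has zero entropy.
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)
```

For a fair coin, binary_entropy(0.5) returns 1.0, the maximum possible value, and the function is symmetric under swapping p and 1-p.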

It’s common to start with p and compute entropy, but recently, I had to go the other way around: given entropy, solve for p. It’s easy to come up with an approximate solution.

Entropy, in this case, is approximately quadratic:

S ≈ 4p(1-p)

And so:

p ≈ (1 ± √(1-S))/2.
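
The step from S ≈ 4p(1-p) to the formula for p is just the quadratic formula applied to 4p² – 4p + S = 0. A small Python sketch of the result follows; the name p_from_entropy_quadratic is hypothetical, and the function returns both roots because p and 1-p have the same entropy.

```python
import math

def p_from_entropy_quadratic(S):
    """Approximate both solutions p of entropy(p) = S using S ≈ 4p(1-p)."""
    # Guard against tiny negative values caused by rounding when S is close to 1.
    r = math.sqrt(max(0.0, 1.0 - S))
    return (1 - r) / 2, (1 + r) / 2
```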

This is a good approximation if S is near 0 or 1 but mediocre in the middle. You could solve for p numerically, say with Newton’s method, to get more accuracy if needed.
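
Here is one way the Newton iteration might look in Python. This is a sketch under my own assumptions rather than a definitive implementation: the function names, the tolerance, the iteration cap, and the choice to start from the quadratic approximation are all mine. It uses the fact that the derivative of the entropy with respect to p is log₂((1-p)/p), and it returns the root in [0, 1/2]; the other root is 1 minus that value.

```python
import math

def binary_entropy(p):
    """Entropy, in bits, of a coin that comes up heads with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def p_from_entropy_newton(S, tol=1e-12, max_iter=100):
    """Solve binary_entropy(p) = S for the root p in [0, 1/2] by Newton's method."""
    if S <= 0.0:
        return 0.0
    if S >= 1.0:
        return 0.5
    eps = 1e-15
    # Initial guess: the smaller root of the quadratic approximation S ≈ 4p(1-p).
    p = (1 - math.sqrt(1 - S)) / 2
    for _ in range(max_iter):
        # Keep p strictly inside (0, 1/2), where the slope is finite and positive.
        p = min(max(p, eps), 0.5 - eps)
        slope = math.log2((1 - p) / p)  # derivative of entropy with respect to p
        step = (binary_entropy(p) - S) / slope
        p -= step
        if abs(step) < tol:
            break
    return min(max(p, 0.0), 0.5)
```

The problem becomes ill-conditioned as S approaches 1, because the slope goes to zero at p = 1/2, so expect fewer accurate digits there than elsewhere.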

Update

As Sjoerd Visscher pointed out in the comments on the original post, the quadratic approximation for entropy is much better if you raise it to the power 3/4. When I added this new approximation to the graph above, the new approximation agreed with the correct value to within the thickness of the plotting line.

To make the approximation error visible, here’s the absolute value of the error of the two approximations, plotted on a log scale.

[Figure: approximation error on log scale]

The error in the new approximation is about an order of magnitude smaller, sometimes more.

The improved approximation for entropy is:

S ≈ (4p(1-p))^(3/4)

So the new approximation for probability is:

p ≈ (1 ± √(1 - S^(4/3)))/2.
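
Raising both sides of the improved approximation to the 4/3 power gives S^(4/3) ≈ 4p(1-p), which is the same quadratic as before with S^(4/3) in place of S. A short Python sketch, with the hypothetical name p_from_entropy_improved:

```python
import math

def p_from_entropy_improved(S):
    """Approximate both solutions p of entropy(p) = S using S ≈ (4p(1-p))^(3/4)."""
    # Guard against tiny negative values caused by rounding when S is close to 1.
    r = math.sqrt(max(0.0, 1.0 - S ** (4.0 / 3.0)))
    return (1 - r) / 2, (1 + r) / 2
```

As a spot check, the entropy of p = 0.25 is about 0.8113. Feeding that value in, the smaller root comes out near 0.253, whereas the plain quadratic formula gives about 0.283.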

Published at DZone with permission of John Cook , DZone MVB. See the original article here.
