# Bayes Factors vs. P-Values

Bayesian analysis and Frequentist analysis often lead to the same conclusions by different routes. But sometimes the two forms of analysis lead to starkly different conclusions.

The following illustration of this difference comes from a talk by Luis Pericchi last week. He attributes the example to “Bernardo (2010)” though I have not been able to find the exact reference.

In an experiment to test the existence of extrasensory perception (ESP), researchers wanted to see whether a person could influence some process that emitted binary data. (I’m going from memory on the details here, and I have not found Bernardo’s original paper. However, you could ignore the experimental setup and treat the following as hypothetical. The point here is not to investigate ESP but to show how Bayesian and Frequentist approaches could lead to opposite conclusions.)

The null hypothesis was that the individual had no influence on the stream of bits and that the true probability of any bit being a 1 was *p* = 0.5. The alternative hypothesis was that *p* is not 0.5. There were *N* = 104,490,000 bits emitted during the experiment, and *s* = 52,263,471 of them were 1s. The *p*-value, the probability of an imbalance this large or larger under the assumption that *p* = 0.5, is about 0.0003. Such a tiny *p*-value would be regarded as extremely strong evidence in favor of ESP given the way *p*-values are commonly interpreted.
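As a rough cross-check on that *p*-value, here is a minimal sketch using the normal approximation to the binomial instead of the exact tail. The variable names are illustrative, and the approximation is an assumption of this sketch, not part of the original example.

```python
from math import erfc, sqrt

N = 104_490_000   # total bits, from the example
s = 52_263_471    # number of 1s, from the example

# Under the null, s ~ Binomial(N, 0.5): mean N/2, standard deviation sqrt(N)/2.
mean = N / 2
sd = sqrt(N) / 2
z = (s - mean) / sd

# Two-sided tail of a standard normal: P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
p_approx = erfc(abs(z) / sqrt(2))

print(f"z = {z:.3f}")
print(f"approximate p-value = {p_approx:.4f}")
```

The count of 1s sits roughly 3.6 standard deviations above what the null predicts, which is why the *p*-value is so small even though the observed proportion differs from 0.5 only in the fourth decimal place.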

The Bayes factor, however, is 18.7 in favor of the null hypothesis, meaning that the data are about 19 times more likely under the null than under the alternative. The alternative in this example uses Jeffreys’ prior, Beta(0.5, 0.5).
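For reference, this Bayes factor can be written in closed form. The following is a sketch assuming a point null at *p* = 0.5 and the Beta(0.5, 0.5) prior on *p* under the alternative; *B* denotes the beta function, and the binomial coefficient cancels in the ratio.

$$
\mathrm{BF}_{01}
= \frac{P(s \mid H_0)}{P(s \mid H_1)}
= \frac{\binom{N}{s}\left(\tfrac12\right)^{N}}
       {\binom{N}{s}\displaystyle\int_0^1 p^{s}(1-p)^{N-s}\,
        \frac{p^{-1/2}(1-p)^{-1/2}}{B\!\left(\tfrac12,\tfrac12\right)}\,dp}
= \frac{\left(\tfrac12\right)^{N} B\!\left(\tfrac12,\tfrac12\right)}
       {B\!\left(s+\tfrac12,\;N-s+\tfrac12\right)}
$$

Taking logarithms gives

$$
\log \mathrm{BF}_{01} = N\log\tfrac12 \;-\; \log B\!\left(s+\tfrac12,\;N-s+\tfrac12\right) \;+\; \log B\!\left(\tfrac12,\tfrac12\right),
$$

which is the safer form to evaluate numerically, since both the numerator and the denominator underflow for *N* this large.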

So given the data and assumptions in this example, the Frequentist concludes there is very strong evidence **for** ESP while the Bayesian concludes there is strong evidence **against** ESP.

The following Python code shows how one might calculate the *p*-value and Bayes factor.

```
from numpy import exp, log
from scipy.special import betaln
from scipy.stats import binom

N = 104490000  # total number of bits
s = 52263471   # number of 1s

# sf is the survival function, i.e. the complementary CDF: sf(k) = P(X > k),
# so sf(s - 1) = P(X >= s). Doubled because the test is two-sided.
print("p-value: ", 2*binom.sf(s - 1, N, 0.5))

# Compute the log of the Bayes factor to avoid underflow.
logbf = N*log(0.5) - betaln(s+0.5, N-s+0.5) + betaln(0.5, 0.5)
print("Bayes factor: ", exp(logbf))
```
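The Bayes factor depends on the prior placed on *p* under the alternative. As a hedged sketch of how one might probe that dependence, the snippet below recomputes the factor for a general Beta(a, b) prior. The helper `bayes_factor_null` and the uniform Beta(1, 1) comparison are assumptions added for illustration; only the Jeffreys prior appears in the example itself.

```python
from math import exp, log
from scipy.special import betaln

N = 104_490_000
s = 52_263_471

def bayes_factor_null(a, b):
    """Bayes factor for H0: p = 0.5 against H1: p ~ Beta(a, b).

    Hypothetical helper for illustration; works on the log scale
    to avoid underflow.
    """
    log_bf = N * log(0.5) - betaln(s + a, N - s + b) + betaln(a, b)
    return exp(log_bf)

bf_jeffreys = bayes_factor_null(0.5, 0.5)  # the prior used in the example
bf_uniform = bayes_factor_null(1.0, 1.0)   # uniform prior, assumed for comparison

print("Jeffreys prior:", bf_jeffreys)
print("Uniform prior: ", bf_uniform)
```

Under these assumptions both Bayes factors come out above 1, so the data still favor the null under either prior, though the strength of that support changes with the prior chosen.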

Published at DZone with permission of John Cook , DZone MVB. See the original article here.
