
The Limits of Statistics

When statisticians analyze data, they don’t just look at the data you bring to them. They also consider hypothetical data that you could have brought. In other words, they consider what could have happened as well as what actually did happen.
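A p-value is one familiar place where hypothetical data enter the calculation: it adds up the probabilities of outcomes that could have occurred but did not. The sketch below is only an illustration. The coin-flip setting, the counts (60 heads in 100 flips), and the function name `upper_tail_probability` are assumptions made for the example.

```python
from math import comb


def upper_tail_probability(heads: int, flips: int, p_heads: float = 0.5) -> float:
    """P(X >= heads) for X ~ Binomial(flips, p_heads).

    Only one outcome was actually observed; every other term in the sum
    is a hypothetical result that could have happened but did not.
    """
    return sum(
        comb(flips, k) * p_heads**k * (1.0 - p_heads) ** (flips - k)
        for k in range(heads, flips + 1)
    )


# Hypothetical observation, assumed for illustration: 60 heads in 100 flips.
observed_heads, total_flips = 60, 100
p_value = upper_tail_probability(observed_heads, total_flips)
print(f"P(at least {observed_heads} heads in {total_flips} fair flips) = {p_value:.4f}")
```

Of the 41 terms in that sum, only the one for 60 heads corresponds to what was seen. The other 40 describe data nobody brought to the statistician.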

This may seem strange, and sometimes it does lead to strange conclusions. But often it is undeniably the right thing to do. It also leads to endless debates among statisticians. The cause of the debates lies at the root of statistics.

The central dogma of statistics is that data should be viewed as realizations of random variables. This has been a very fruitful idea, but it has its limits. It’s a reification of the world. And like all reifications, it eventually becomes invisible to those who rely on it.

Data are what they are. In order to think of the data as having come from a random process, you have to construct a hypothetical process that could have produced the data. Sometimes there is near universal agreement on how this should be done. But often different statisticians create different hypothetical worlds in which to place the data. This is at the root of such arguments as how to handle multiple testing.
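The multiple testing dispute can be made concrete with a small sketch. Everything specific here is an assumption for illustration: the p-value of 0.03, the 0.05 significance level, the count of 20 tests, the independence of those tests, and the choice of the Bonferroni correction as the adjustment.

```python
def familywise_error_rate(alpha: float, num_tests: int) -> float:
    """Chance of at least one false positive across independent tests
    when every null hypothesis is true."""
    return 1.0 - (1.0 - alpha) ** num_tests


def is_significant(p_value: float, alpha: float, num_tests: int) -> bool:
    """Bonferroni rule: compare the p-value with alpha / num_tests."""
    return p_value < alpha / num_tests


P_VALUE = 0.03  # hypothetical result, assumed for illustration
ALPHA = 0.05    # conventional significance level, also an assumption

for num_tests in (1, 20):
    verdict = is_significant(P_VALUE, ALPHA, num_tests)
    fwer = familywise_error_rate(ALPHA, num_tests)
    print(f"tests={num_tests:2d}  uncorrected familywise error={fwer:.3f}  significant={verdict}")
```

The same number is judged significant when the data are placed in a world containing one test and not significant when they are placed in a world containing twenty. Nothing about the data changed, only the hypothetical process around them.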

You can debunk any conclusion by placing the data in a large enough hypothetical model. Suppose it’s Jake’s birthday, and when he comes home, there are Scrabble tiles on the floor spelling out “Happy birthday Jake.” You might conclude that someone arranged the tiles to leave him a birthday greeting. But if you are so inclined, you could attribute the apparent pattern to chance. You could argue that there are many people around the world who have dropped bags of Scrabble tiles, and eventually something like this was bound to happen. If that seems to be an inadequate explanation, you could take a “many worlds” approach and posit entire new universes. Not only are people dropping Scrabble tiles in this universe, they’re dropping them in countless other universes too. We’re only remarking on Jake’s apparent birthday greeting because we happen to inhabit the universe in which it happened.
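The arithmetic behind this move is the chance of at least one success in n independent attempts, 1 - (1 - p)^n, which approaches 1 for any fixed p > 0 as n grows. The sketch below rests on a deliberately crude assumption: each of the 17 letters in the greeting is treated as an independent uniform draw from 26 letters, ignoring real Scrabble tile counts, spacing, and layout. The function name and the attempt counts are likewise hypothetical.

```python
from math import expm1, log1p


def chance_of_at_least_one(p_single: float, attempts: int) -> float:
    """P(at least one success in `attempts` independent tries) = 1 - (1 - p)^n.

    Computed as -expm1(n * log1p(-p)) so that tiny probabilities do not
    round away to zero.
    """
    if p_single >= 1.0:
        return 1.0 if attempts > 0 else 0.0
    return -expm1(attempts * log1p(-p_single))


# Crude assumption: each of the 17 letters in "Happy birthday Jake"
# is an independent uniform draw from 26 letters.
MESSAGE_LETTERS = 17
p_message = 26.0 ** -MESSAGE_LETTERS

# Hypothetical numbers of tile spills across ever larger imagined worlds.
for attempts in (10**9, 10**30):
    prob = chance_of_at_least_one(p_message, attempts)
    print(f"{attempts:.0e} spills -> chance of the greeting somewhere: {prob:.3g}")
```

With a billion spills the greeting remains wildly improbable. Enlarge the hypothetical model enough and the same event becomes a near certainty, which is exactly how any conclusion can be explained away.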

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.
