# Data Analysis vs. Statistics

# Data Analysis vs. Statistics

### John Tukey preferred the term “data analysis” over “statistics.” It seems Tukey was reserving the term “statistics” for that portion of data analysis which is rigorously based on probability.

Join the DZone community and get the full member experience.

Join For FreeJohn Tukey preferred the term “data analysis” over “statistics.” In his paper *Data Anaysis, Computation and Mathematics*, he explains why.

My title speaks of “data analysis” not “statistics”, and of “computation” not “computing science”; it does not speak of “mathematics”, but only last. Why? …

My brother-in-squared-law, Francis J. Anscombe has commented on my use of “data analysis” in the following words:

Whereas the content of Tukey’s remarks is always worth pondering, some of his terminology is hard to take. He seems to identify “statistics” with the grotesque phenomenon generally known as “mathematical statistics”, and finds it necessary to replace “statistical analysis” with “data analysis.”

(Tukey calls Anscombe his “brother-in-squared-law” because Anscombe was a fellow statistician as well as his brother-in-law. At first I thought Tukey had said “brother-in-law-squared”, which could mean his brother-in-law’s brother-in-law, but I suppose it was a pun on the role of least-square methods in statistics.)

Tukey later says

I … shall stick to this attitude today, and shall continue to use the words “data analysis”, in part to indicate that we can take probability seriously, or leave it alone, as may from time to time be appropriate or necessary.

It seems Tukey was reserving the term “statistics” for that portion of data analysis which is rigorously based on probability.

Published at DZone with permission of John Cook , DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

## {{ parent.title || parent.header.title}}

{{ parent.tldr }}

## {{ parent.linkDescription }}

{{ parent.urlSource.name }}