The Difference Between Prediction Intervals and Confidence Intervals

Prediction intervals and confidence intervals are not the same thing. Unfortunately, the terms are often confused, and I frequently find myself correcting the error in students’ papers and in articles I am reviewing or editing.

A prediction interval is an interval associated with a random variable yet to be observed, with a specified probability of the random variable lying within the interval. For example, I might give an 80% interval for the forecast of GDP in 2014. The actual GDP in 2014 should lie within the interval with probability 0.8. Prediction intervals can arise in Bayesian or frequentist statistics.
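As a minimal sketch of the definition above, the following Python snippet builds an 80% prediction interval from a forecast distribution and checks it by simulation. The normal forecast distribution and its mean and standard deviation are assumptions made up for illustration; they are not real GDP forecasts.

```python
import random
from statistics import NormalDist

# Hypothetical forecast distribution for next year's GDP growth (percent).
# Both numbers are invented for illustration only.
forecast = NormalDist(mu=2.0, sigma=1.0)

# An 80% prediction interval leaves 10% probability in each tail.
lower = forecast.inv_cdf(0.10)
upper = forecast.inv_cdf(0.90)

# Check by simulation: draw many possible outcomes from the forecast
# distribution and count how often they land inside the interval.
rng = random.Random(1)
draws = [rng.gauss(forecast.mean, forecast.stdev) for _ in range(100_000)]
share_inside = sum(lower <= y <= upper for y in draws) / len(draws)

print(f"80% prediction interval: ({lower:.2f}, {upper:.2f})")
print(f"share of simulated outcomes inside: {share_inside:.3f}")
```

The interval is a statement about the future observation itself, so the simulated share should be close to 0.8 whenever the forecast distribution is correct.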

A confidence interval is an interval associated with a parameter and is a frequentist concept. The parameter is assumed to be non-random but unknown, and the confidence interval is computed from data. Because the data are random, the interval is random. A 95% confidence interval will contain the true parameter with probability 0.95. That is, with a large number of repeated samples, 95% of the intervals would contain the true parameter.
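The repeated-sampling interpretation can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: the data are normal, the standard deviation is treated as known (so a z-interval applies), and the true mean, sample size and number of repetitions are arbitrary choices.

```python
import random
from statistics import NormalDist, fmean

def ci_coverage(true_mean=5.0, sigma=2.0, n=30, level=0.95,
                n_repeats=20_000, seed=42):
    """Fraction of repeated-sample confidence intervals containing true_mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for a 95% level
    half_width = z * sigma / n ** 0.5           # sigma is assumed known
    hits = 0
    for _ in range(n_repeats):
        # Each repetition is a fresh sample, so the interval itself is random.
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        centre = fmean(sample)
        if centre - half_width <= true_mean <= centre + half_width:
            hits += 1
    return hits / n_repeats

coverage = ci_coverage()
print(f"share of intervals containing the true mean: {coverage:.3f}")
```

The parameter stays fixed throughout; only the intervals move from sample to sample, and the share of them that capture the parameter should be close to 0.95.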

A Bayesian confidence interval, also known as a “credible interval”, is an interval associated with the posterior distribution of the parameter. In the Bayesian perspective, parameters are treated as random variables, and so have probability distributions. Thus a Bayesian confidence interval is like a prediction interval, but associated with a parameter rather than an observation.

I think the distinction between prediction and confidence intervals is worth preserving because sometimes you want to use both. For example, consider the regression

\[ y_i = \alpha + \beta x_i + e_i, \]

where \(y_i\) is the change in GDP from quarter \(i-1\) to quarter \(i\), \(x_i\) is the change in the unemployment rate from quarter \(i-1\) to quarter \(i\), and \(e_i \sim \text{N}(0,\sigma^2)\). (This regression model is known as Okun’s law in macroeconomics.) In this case, both confidence intervals and prediction intervals are interesting. You might be interested in the confidence interval associated with the mean value of \(y\) when \(x=0\); that is, the mean growth in GDP when the unemployment rate does not change. You might also be interested in the prediction interval for \(y\) when \(x=0\); that is, the likely range of future values of GDP growth when the unemployment rate does not change.
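For reference, the standard textbook forms of the two intervals for this simple linear regression are sketched here. The notation is introduced for this sketch and is not in the original: \(\hat\alpha\) and \(\hat\beta\) are the least-squares estimates, \(n\) is the number of quarters, \(\bar{x}\) is the sample mean of the \(x_i\), and \(t_{n-2,\,0.975}\) is the 97.5% quantile of a \(t\) distribution with \(n-2\) degrees of freedom. With

\[ \hat{y}_0 = \hat\alpha + \hat\beta x_0, \qquad s^2 = \frac{1}{n-2}\sum_{i=1}^{n} \left(y_i - \hat\alpha - \hat\beta x_i\right)^2, \qquad S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2, \]

a 95% confidence interval for the mean of \(y\) at \(x = x_0\) is

\[ \hat{y}_0 \pm t_{n-2,\,0.975}\; s \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}}, \]

while a 95% prediction interval for a new observation of \(y\) at \(x = x_0\) is

\[ \hat{y}_0 \pm t_{n-2,\,0.975}\; s \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}}. \]

The extra 1 under the square root accounts for the variance of the new error term \(e_0\). It does not shrink as \(n\) grows, which is why the prediction interval is always the wider of the two. Setting \(x_0 = 0\) gives the two intervals discussed above.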

The distinction is mostly retained in the statistics literature. However, in econometrics it is common to use “confidence intervals” for both types of interval (e.g., Granger & Newbold, 1986). I once asked Clive Granger why he confused the two concepts, and he dismissed my objection as fussing about trivialities. I disagreed with him then, and I still do.

I have seen someone compute a confidence interval for the mean, and use it as if it were a prediction interval for a future observation. The trouble is, confidence intervals for the mean are much narrower than prediction intervals, and so this gave him an exaggerated and false sense of the accuracy of his forecasts. Instead of the interval containing 95% of the probability space for the future observation, it contained only about 20%.
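The size of this error can be checked with a short simulation. The sketch below assumes independent normal data with a known standard deviation and a hypothetical sample size of 60; the details of the actual case described above are not stated, so these numbers are illustrative only. It counts how often a new observation falls inside the 95% confidence interval for the mean, and compares the result with the closed-form value.

```python
import random
from statistics import NormalDist, fmean

def ci_as_pi_coverage(n=60, mu=0.0, sigma=1.0, level=0.95,
                      n_repeats=20_000, seed=7):
    """Share of future observations that fall inside a CI for the mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sigma / n ** 0.5           # CI half-width, sigma known
    hits = 0
    for _ in range(n_repeats):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        centre = fmean(sample)
        future = rng.gauss(mu, sigma)           # the observation being predicted
        if abs(future - centre) <= half_width:
            hits += 1
    return hits / n_repeats

def ci_as_pi_coverage_exact(n=60, level=0.95):
    """Closed form, using future - mean ~ N(0, sigma^2 * (1 + 1/n))."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return 2 * NormalDist().cdf(z / (n + 1) ** 0.5) - 1

simulated = ci_as_pi_coverage()
exact = ci_as_pi_coverage_exact()
print(f"simulated: {simulated:.3f}, exact: {exact:.3f}")
```

With this hypothetical sample size the interval captures roughly a fifth of future observations, and the share falls further as the sample grows, because the confidence interval keeps narrowing while the spread of a future observation does not.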

So I ask statisticians to please preserve this distinction. And I ask econometricians to stop being so sloppy about terminology. Unfortunately, I can’t continue my debate with Clive Granger. I rather hoped he would come to accept my point of view.

Published at DZone with permission of Rob J Hyndman, DZone MVB. See the original article here.
