Over a million developers have joined DZone.
Platinum Partner

The Difference Between Prediction Intervals and Confidence Intervals

· Big Data Zone

The Big Data Zone is presented by Exaptive.  Learn how rapid data application development can address the data science shortage.

Pre­dic­tion inter­vals and con­fi­dence inter­vals are not the same thing. Unfor­tu­nately the terms are often con­fused, and I am often fre­quently cor­rect­ing the error in stu­dents’ papers and arti­cles I am review­ing or editing.

A pre­dic­tion inter­val is an inter­val asso­ci­ated with a ran­dom vari­able yet to be observed, with a spec­i­fied prob­a­bil­ity of the ran­dom vari­able lying within the inter­val. For exam­ple, I might give an 80% inter­val for the fore­cast of GDP in 2014. The actual GDP in 2014 should lie within the inter­val with prob­a­bil­ity 0.8. Pre­dic­tion inter­vals can arise in Bayesian or fre­quen­tist statistics.

A con­fi­dence inter­val is an inter­val asso­ci­ated with a para­me­ter and is a fre­quen­tist con­cept. The para­me­ter is assumed to be non-​​random but unknown, and the con­fi­dence inter­val is com­puted from data. Because the data are ran­dom, the inter­val is ran­dom. A 95% con­fi­dence inter­val will con­tain the true para­me­ter with prob­a­bil­ity 0.95. That is, with a large num­ber of repeated sam­ples, 95% of the inter­vals would con­tain the true parameter.

A Bayesian con­fi­dence inter­val, also known as a “cred­i­ble inter­val”, is an inter­val asso­ci­ated with the pos­te­rior dis­tri­b­u­tion of the para­me­ter. In the Bayesian per­spec­tive, para­me­ters are treated as ran­dom vari­ables, and so have prob­a­bil­ity dis­tri­b­u­tions. Thus a Bayesian con­fi­dence inter­val is like a pre­dic­tion inter­val, but asso­ci­ated with a para­me­ter rather than an observation.

I think the dis­tinc­tion between pre­dic­tion and con­fi­dence inter­vals is worth pre­serv­ing because some­times you want to use both. For exam­ple, con­sider the regression

\[ y_i = \alpha + \beta x_i + e_i \>

where y_i is the change in GDP from quar­ter i-1 to quar­ter ix_i is the change in the unem­ploy­ment rate from quar­ter i-1 to quar­ter i, and e_i\sim\text{N}(0,\sigma^2). (This regres­sion model is known as Okun’s law in macro­eco­nom­ics.) In this case, both con­fi­dence inter­vals and pre­dic­tion inter­vals are inter­est­ing. You might be inter­ested in the con­fi­dence inter­val asso­ci­ated with the mean value of y when x=0; that is, the mean growth in GDP when the unem­ploy­ment rate does not change. You might also be inter­ested in the pre­dic­tion inter­val for y when x=0; that is, the likely range of future val­ues of GDP growth when the unem­ploy­ment rate does not change.

The dis­tinc­tion is mostly retained in the sta­tis­tics lit­er­a­ture. How­ever, in econo­met­rics it is com­mon to use “con­fi­dence inter­vals” for both types of inter­val (e.g.,Granger & New­bold, 1986). I once asked Clive Granger why he con­fused the two con­cepts, and he dis­missed my objec­tion as fuss­ing about triv­i­al­i­ties. I dis­agreed with him then, and I still do.

I have seen some­one com­pute a con­fi­dence inter­val for the mean, and use it as if it was a pre­dic­tion inter­val for a future obser­va­tion. The trou­ble is, con­fi­dence inter­vals for the mean are much nar­rower than pre­dic­tion inter­vals, and so this gave him an exag­ger­ated and false sense of the accu­racy of his fore­casts. Instead of the inter­val con­tain­ing 95% of the prob­a­bil­ity space for the future obser­va­tion, it con­tained only about 20%.

So I ask sta­tis­ti­cians to please pre­serve this dis­tinc­tion. And I ask econo­me­tri­cians to stop being so sloppy about ter­mi­nol­ogy. Unfor­tu­nately, I can’t con­tinue my debate with Clive Granger. I rather hoped he would come to accept my point of view.

The Big Data Zone is presented by Exaptive.  Learn about how to rapidly iterate data applications, while reusing existing code and leveraging open source technologies.

Topics:

Published at DZone with permission of Rob J Hyndman , DZone MVB .

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}