The Difference Between Prediction Intervals and Confidence Intervals

Prediction intervals and confidence intervals are not the same thing. Unfortunately, the terms are often confused, and I frequently find myself correcting the error in students’ papers and in articles I am reviewing or editing.

A prediction interval is an interval associated with a random variable yet to be observed, with a specified probability of the random variable lying within the interval. For example, I might give an 80% interval for the forecast of GDP in 2014. The actual GDP in 2014 should lie within the interval with probability 0.8. Prediction intervals can arise in Bayesian or frequentist statistics.

A confidence interval is an interval associated with a parameter and is a frequentist concept. The parameter is assumed to be non-random but unknown, and the confidence interval is computed from data. Because the data are random, the interval is random. A 95% confidence interval will contain the true parameter with probability 0.95. That is, with a large number of repeated samples, 95% of the intervals would contain the true parameter.
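The repeated-sampling interpretation can be checked with a small simulation. This is only a sketch under assumptions that are mine, not part of the argument above: the data are normal, the parameter is the mean, the standard deviation is treated as known (so a simple z-interval applies), and the function name `ci_coverage` and its default values are arbitrary.

```python
import math
import random
import statistics


def ci_coverage(mu=2.0, sigma=1.0, n=30, reps=5000, level=0.95, seed=42):
    """Fraction of repeated-sample confidence intervals that contain mu."""
    rng = random.Random(seed)
    # Normal quantile for the requested level (about 1.96 for 95%).
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    # Assumption: sigma is known, so the half-width is the same in every sample.
    half_width = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = statistics.fmean(sample)
        # The interval is random (it moves with xbar); mu stays fixed.
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / reps


coverage = ci_coverage()
print(f"Share of intervals containing the true mean: {coverage:.3f}")
```

The printed share should be close to 0.95. The parameter never moves; it is the interval that changes from sample to sample.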

A Bayesian confidence interval, also known as a “credible interval”, is an interval associated with the posterior distribution of the parameter. In the Bayesian perspective, parameters are treated as random variables, and so have probability distributions. Thus a Bayesian confidence interval is like a prediction interval, but associated with a parameter rather than an observation.

I think the distinction between prediction and confidence intervals is worth preserving because sometimes you want to use both. For example, consider the regression

\[ y_i = \alpha + \beta x_i + e_i, \]

where \(y_i\) is the change in GDP from quarter \(i-1\) to quarter \(i\), \(x_i\) is the change in the unemployment rate from quarter \(i-1\) to quarter \(i\), and \(e_i \sim \text{N}(0,\sigma^2)\). (This regression model is known as Okun’s law in macroeconomics.) In this case, both confidence intervals and prediction intervals are interesting. You might be interested in the confidence interval associated with the mean value of \(y\) when \(x=0\); that is, the mean growth in GDP when the unemployment rate does not change. You might also be interested in the prediction interval for \(y\) when \(x=0\); that is, the likely range of future values of GDP growth when the unemployment rate does not change.
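For reference, here is how the two intervals differ for this regression in the standard textbook least-squares setting. The notation is introduced here and is not in the text above: \(\hat\alpha\) and \(\hat\beta\) are the least-squares estimates from \(n\) observations, \(\hat\sigma\) is the residual standard error, \(\bar{x}\) is the sample mean of the \(x_i\), \(S_{xx} = \sum_i (x_i - \bar{x})^2\), and \(t_{n-2,\,0.975}\) is the 97.5% quantile of a t distribution with \(n-2\) degrees of freedom. At a given value \(x_0\), the 95% confidence interval for the mean of \(y\) is

\[ \hat\alpha + \hat\beta x_0 \pm t_{n-2,\,0.975}\,\hat\sigma \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}}, \]

while the 95% prediction interval for a new observation of \(y\) is

\[ \hat\alpha + \hat\beta x_0 \pm t_{n-2,\,0.975}\,\hat\sigma \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}}. \]

Setting \(x_0 = 0\) gives the case discussed above. The only difference is the extra 1 under the square root, which accounts for the variance of the new error term \(e_i\). That term does not shrink as \(n\) grows, so the prediction interval stays wide even when the mean is estimated very precisely.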

The distinction is mostly retained in the statistics literature. However, in econometrics it is common to use “confidence intervals” for both types of interval (e.g., Granger & Newbold, 1986). I once asked Clive Granger why he confused the two concepts, and he dismissed my objection as fussing about trivialities. I disagreed with him then, and I still do.

I have seen someone compute a confidence interval for the mean, and use it as if it were a prediction interval for a future observation. The trouble is, confidence intervals for the mean are much narrower than prediction intervals, and so this gave him an exaggerated and false sense of the accuracy of his forecasts. Instead of the interval containing 95% of the probability mass for the future observation, it contained only about 20%.
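A rough simulation shows how badly a confidence interval for the mean performs when it is used as a prediction interval. It is a sketch under my own assumptions, not a reconstruction of the case described above: the data are independent standard normal draws, the sample size of 100 is arbitrary, the normal quantile stands in for the t quantile, and the function name `interval_coverage` is hypothetical. The exact share captured by the confidence interval depends on the sample size, so it will not reproduce the 20% figure.

```python
import math
import random
import statistics


def interval_coverage(n=100, reps=2000, level=0.95, seed=1):
    """Share of future observations falling inside (a) the confidence
    interval for the mean and (b) the prediction interval."""
    rng = random.Random(seed)
    # Assumption: normal quantile used as an approximation to the t quantile.
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    ci_hits = 0
    pi_hits = 0
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        xbar = statistics.fmean(sample)
        s = statistics.stdev(sample)
        # Confidence interval: uncertainty about the mean only.
        ci_half = z * s / math.sqrt(n)
        # Prediction interval: adds the variability of one new observation.
        pi_half = z * s * math.sqrt(1 + 1 / n)
        future = rng.gauss(0.0, 1.0)  # the observation being forecast
        if abs(future - xbar) <= ci_half:
            ci_hits += 1
        if abs(future - xbar) <= pi_half:
            pi_hits += 1
    return ci_hits / reps, pi_hits / reps


ci_cover, pi_cover = interval_coverage()
print(f"Future values inside the confidence interval: {ci_cover:.2f}")
print(f"Future values inside the prediction interval: {pi_cover:.2f}")
```

With these settings the prediction interval should capture roughly 95% of future observations, while the confidence interval for the mean captures only a small fraction of them. Increasing `n` makes the confidence interval narrower still, so the gap widens.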

So I ask statisticians to please preserve this distinction. And I ask econometricians to stop being so sloppy about terminology. Unfortunately, I can’t continue my debate with Clive Granger. I rather hoped he would come to accept my point of view.

Published at DZone with permission of Rob J Hyndman, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
