Generating quantile forecasts in R


From today’s email:

I have just finished reading a copy of ‘Forecasting: Principles and Practice’ and I have found the book really interesting. I have particularly enjoyed the case studies and focus on practical applications.

After finishing the book I have joined a forecasting competition to put what I’ve learnt to the test. I do have a couple of queries about the forecasting outputs required. The output required is a quantile forecast, is this the same as prediction intervals? Is there any R function to produce quantiles from 0 to 99?

If you were able to point me in the right direction regarding the above it would be greatly appreciated.

Many Thanks,

Presumably the competition is GEFCOM2014 which I’ve posted about before.

The future value of a time series is unknown, so you can think of it as a random variable, and its distribution is the “forecast distribution”. A “quantile forecast” is a quantile of the forecast distribution. The usual point forecast is often the mean or the median of the forecast distribution. A prediction interval is a range of specified coverage probability under that distribution. For example, if we assume the forecast distribution is normal, then the 95% prediction interval is defined by the 2.5% and 97.5% quantiles of the forecast distribution.
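To make the link between quantiles and prediction intervals concrete, here is a minimal sketch under the normality assumption. The values of m and s are made-up illustrative numbers, not output from any model.

```r
# Assumed illustrative forecast mean and standard deviation (hypothetical values)
m <- 100
s <- 10

# 95% prediction interval limits, taken as the 2.5% and 97.5% quantiles
pi95 <- qnorm(c(0.025, 0.975), mean = m, sd = s)

# The same limits from the familiar "mean plus or minus about 1.96 sd" formula
pi95_formula <- m + c(-1, 1) * qnorm(0.975) * s

# An 80% interval uses the 10% and 90% quantiles instead, so it is narrower
pi80 <- qnorm(c(0.10, 0.90), mean = m, sd = s)
```

So a set of quantile forecasts contains every central prediction interval whose limits fall on the requested probabilities.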

Still assuming normality, we could generate the forecast quantiles from 1% to 99% in R using

qnorm((1:99)/100, m, s)

where m and s are the estimated mean and standard deviation of the forecast distribution. So if you are using the forecast package in R, you can do something like this:

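library(forecast)
# Fit an ARIMA model and forecast 20 steps ahead with 95% intervals
fit <- auto.arima(WWWusage)
fc <- forecast(fit, h=20, level=95)
# Rows are the 1% to 99% quantiles, columns are horizons 1 to 20
qf <- matrix(0, nrow=99, ncol=20)
# Forecast mean, and standard deviation recovered from the 95% interval width
m <- fc$mean
s <- (fc$upper-fc$lower)/qnorm(0.975)/2
for(h in 1:20)
  qf[,h] <- qnorm((1:99)/100, m[h], s[h])
# Draw the forecast plot first so that matlines() has a plot to add to
plot(fc)
matlines(101:120, t(qf), col=rainbow(120), lty=1)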

Of course, assuming a normal distribution is rather restrictive and not very interesting. For a more interesting but much more complicated approach to generating quantiles, see my 2010 paper on Density forecasting for long-term peak electricity demand.
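One simple way to relax the normality assumption is to simulate many future sample paths from the fitted model using resampled residuals, and then take empirical quantiles at each horizon. This is a different and much simpler technique than the one in that paper. The sketch below assumes the forecast package's simulate() method for ARIMA fits with bootstrap=TRUE; the seed, the number of paths, and the name qf_boot are arbitrary choices made for illustration.

```r
library(forecast)

set.seed(123)    # arbitrary seed so the sketch is reproducible
fit <- auto.arima(WWWusage)

h <- 20          # forecast horizon
npaths <- 1000   # number of simulated future paths (arbitrary choice)

# Each column is one simulated future path of length h,
# generated with residuals resampled from the fitted model
sims <- replicate(npaths,
                  simulate(fit, nsim = h, future = TRUE, bootstrap = TRUE))

# Empirical 1% to 99% quantiles at each horizon: a 99 x 20 matrix
qf_boot <- apply(sims, 1, quantile, probs = (1:99)/100)
```

The result has the same 99 x 20 layout as qf above, so it can be drawn the same way with matlines(). The extreme quantiles (1% and 99%) will be noisy unless npaths is fairly large.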


Published at DZone with permission of Rob J Hyndman, DZone MVB. See the original article here.


