Over a million developers have joined DZone.

The Forecast Mean After Back-​​Transformation

· Big Data Zone

Learn how you can maximize big data in the cloud with Apache Hadoop. Download this eBook now. Brought to you in partnership with Hortonworks.

Many func­tions in the fore­cast pack­age for R will allow a Box-​​Cox trans­for­ma­tion. The mod­els are fit­ted to the trans­formed data and the fore­casts and pre­dic­tion inter­vals are back-​​transformed. This pre­serves the cov­er­age of the pre­dic­tion inter­vals, and the back-​​transformed point fore­cast can be con­sid­ered the median of the fore­cast den­si­ties (assum­ing the fore­cast den­si­ties on the trans­formed scale are sym­met­ric). For many pur­poses, this is accept­able, but occa­sion­ally the mean fore­cast is required. For exam­ple, with hier­ar­chi­cal fore­cast­ing the fore­casts need to be aggre­gated, and medi­ans do not aggre­gate but means do.

It is easy enough to derive the mean fore­cast using a Tay­lor series expan­sion. Sup­pose f(x) rep­re­sents the back-​​transformation func­tion, \mu is the mean on the trans­formed scale and \sigma^2 is the vari­ance on the trans­formed scale. Then using the first three terms of a Tay­lor expan­sion around \mu, the mean on the orig­i­nal scale is given by

   

Box-​​Cox transformations

For a Box-​​Cox transformation,

 

So

and the back­trans­formed mean is given by

 

There­fore, to adjust the back-​​transformed mean obtained by R, the fol­low­ing code can be used.

library(fpp)
 
fit <- ets(eggs, lambda=0)
fc <- forecast(fit, h=50, level=95)
fvar <- ((BoxCox(fc$upper,fit$lambda)-BoxCox(fc$lower,fit$lambda))/qnorm(0.975)/2)^2
plot(fc)
fc$mean <- fc$mean * (1 + 0.5*fvar)
lines(fc$mean,col='red')
 
fit <- ets(eggs, lambda=0.2)
fc <- forecast(fit, h=50, level=95)
fvar <- ((BoxCox(fc$upper,fit$lambda)-BoxCox(fc$lower,fit$lambda))/qnorm(0.975)/2)^2
plot(fc)
fc$mean <- fc$mean * (1 + 0.5*fvar*(1-fit$lambda)/(fc$mean)^(2*fit$lambda))
lines(fc$mean,col='red')

The sec­ond of these plots is shown below. The blue line shows the fore­cast medi­ans while the red line shows the fore­cast means.

Scaled logis­tic transformation

In my pre­vi­ous post on trans­for­ma­tions, I described the scaled logit trans­for­ma­tion for bound­ing a fore­cast between spec­i­fied lim­its a and b. In this case,

 

and so

and the back-​​transformed mean is given by

 

In R, this can be cal­cu­lated as follows.

# Bounds
a <- 50
b <- 400
# Transform data
y <- log((eggs-a)/(b-eggs))
fit <- ets(y)
fc <- forecast(fit, h=50, level=0.95)
fvar <- ((fc$upper=fc$lower)/qnorm(0.975)/2)^2
emu <- exp(fc$mean)
# Back-transform forecasts
fc$mean <- (b-a)*exp(fc$mean)/(1+exp(fc$mean)) + a
fc$lower <- (b-a)*exp(fc$lower)/(1+exp(fc$lower)) + a
fc$upper <- (b-a)*exp(fc$upper)/(1+exp(fc$upper)) + a
fc$x <- eggs
# Plot result on original scale
plot(fc)
# Compute forecast mean
fc$mean <- 1/(1+emu)^3*((a+b*emu)*(1+emu)^2 + fvar*(b-a)*emu*(1-emu)/2)
lines(fc$mean,col='red')


Hortonworks DataFlow is an integrated platform that makes data ingestion fast, easy, and secure. Download the white paper now.  Brought to you in partnership with Hortonworks

Topics:

Published at DZone with permission of Rob J Hyndman, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

The best of DZone straight to your inbox.

SEE AN EXAMPLE
Please provide a valid email address.

Thanks for subscribing!

Awesome! Check your inbox to verify your email so you can start receiving the latest in tech news and resources.
Subscribe

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}