The Forecast Mean After Back-Transformation


Many functions in the forecast package for R will allow a Box-Cox transformation. The models are fitted to the transformed data and the forecasts and prediction intervals are back-transformed. This preserves the coverage of the prediction intervals, and the back-transformed point forecast can be considered the median of the forecast densities (assuming the forecast densities on the transformed scale are symmetric). For many purposes, this is acceptable, but occasionally the mean forecast is required. For example, with hierarchical forecasting the forecasts need to be aggregated, and medians do not aggregate but means do.
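A small simulation makes the aggregation point concrete. The sketch below is illustrative only: it assumes two independent quantities whose densities are normal on the log scale (the names `y1`, `y2` and the parameter values are made up for the example), and compares the sum of the individual means and medians with the mean and median of the total.

```r
# Illustrative simulation (assumed setup, not taken from the forecast package):
# two independent quantities that are normal on the log scale.
set.seed(1)
n <- 1e5
y1 <- exp(rnorm(n, mean = 0, sd = 1))
y2 <- exp(rnorm(n, mean = 0, sd = 1))
total <- y1 + y2

# Means aggregate: the mean of the total equals the sum of the means.
mean_gap <- mean(total) - (mean(y1) + mean(y2))

# Medians do not: the median of the total differs from the sum of the medians.
median_gap <- median(total) - (median(y1) + median(y2))

print(c(mean_gap = mean_gap, median_gap = median_gap))
```

The mean gap is zero up to rounding error, while the median gap is clearly positive for these skewed distributions.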

It is easy enough to derive the mean forecast using a Taylor series expansion. Suppose $f(x)$ represents the back-transformation function, $\mu$ is the mean on the transformed scale and $\sigma^2$ is the variance on the transformed scale. Then using the first three terms of a Taylor expansion around $\mu$, the mean on the original scale is given by

$$E[f(X)] \approx E\left[f(\mu) + (X-\mu)f'(\mu) + \tfrac{1}{2}(X-\mu)^2 f''(\mu)\right] = f(\mu) + \tfrac{1}{2}\sigma^2 f''(\mu).$$
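The second-order approximation $E[f(X)] \approx f(\mu) + \tfrac{1}{2}\sigma^2 f''(\mu)$ can be checked numerically. As a minimal sketch, assume the forecast density on the transformed scale is normal and take $f(x) = e^x$, so that $f''(x) = e^x$. The helper name `taylor_mean` and the values of `mu` and `sigma` are chosen purely for illustration.

```r
# Second-order Taylor approximation to E[f(X)] when X has mean mu and
# variance sigma2; f2 is the second derivative of f.
# (Hypothetical helper for illustration.)
taylor_mean <- function(f, f2, mu, sigma2) {
  f(mu) + 0.5 * sigma2 * f2(mu)
}

mu <- 1
sigma <- 0.2

# Reference value under an assumed normal density, by numerical integration
# over mu +/- 10 standard deviations.
exact_mean <- integrate(function(x) exp(x) * dnorm(x, mu, sigma),
                        mu - 10 * sigma, mu + 10 * sigma)$value
approx_mean <- taylor_mean(exp, exp, mu, sigma^2)
naive <- exp(mu)  # plain back-transformation: the median, not the mean

print(c(naive = naive, approx = approx_mean, exact = exact_mean))
```

For a small variance the approximation sits very close to the integrated value, while the plain back-transformation is noticeably lower.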


Box-Cox transformations

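For a Box-Cox transformation,

$$f(x) = \begin{cases} (\lambda x + 1)^{1/\lambda} & \text{if } \lambda \ne 0; \\ e^x & \text{if } \lambda = 0, \end{cases}$$

so that

$$f''(x) = \begin{cases} (1-\lambda)(\lambda x + 1)^{1/\lambda - 2} & \text{if } \lambda \ne 0; \\ e^x & \text{if } \lambda = 0, \end{cases}$$

and the back-transformed mean is given by

$$\begin{cases} (\lambda\mu + 1)^{1/\lambda}\left[1 + \frac{\sigma^2(1-\lambda)}{2(\lambda\mu+1)^2}\right] & \text{if } \lambda \ne 0; \\ e^\mu\left[1 + \frac{\sigma^2}{2}\right] & \text{if } \lambda = 0. \end{cases}$$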


Therefore, to adjust the back-transformed mean obtained by R, the following code can be used.

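library(forecast)
library(fma)  # assumed source of the eggs series

# Log transformation (lambda = 0)
fit <- ets(eggs, lambda=0)
fc <- forecast(fit, h=50, level=95)
# Forecast variance on the transformed scale, recovered from the 95% interval
fvar <- ((BoxCox(fc$upper,fit$lambda)-BoxCox(fc$lower,fit$lambda))/qnorm(0.975)/2)^2
fc$mean <- fc$mean * (1 + 0.5*fvar)

# Box-Cox transformation with lambda = 0.2
fit <- ets(eggs, lambda=0.2)
fc <- forecast(fit, h=50, level=95)
fvar <- ((BoxCox(fc$upper,fit$lambda)-BoxCox(fc$lower,fit$lambda))/qnorm(0.975)/2)^2
fc$mean <- fc$mean * (1 + 0.5*fvar*(1-fit$lambda)/(fc$mean)^(2*fit$lambda))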

The second of these plots is shown below. The blue line shows the forecast medians while the red line shows the forecast means.
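The Box-Cox adjustment can also be written as a standalone function, which is convenient when the mean and variance on the transformed scale come from somewhere other than a `forecast` object. This is a sketch under stated assumptions: `boxcox_mean` is a hypothetical helper (not part of the forecast package), and the numerical check assumes a normal forecast density on the transformed scale with made-up values of `mu` and `sigma`.

```r
# Approximate mean on the original scale after a Box-Cox back-transformation.
# mu, sigma2: forecast mean and variance on the transformed scale.
# Hypothetical helper for illustration; not a forecast-package function.
boxcox_mean <- function(mu, sigma2, lambda) {
  if (lambda == 0) {
    exp(mu) * (1 + 0.5 * sigma2)
  } else {
    med <- (lambda * mu + 1)^(1 / lambda)  # back-transformed median
    med * (1 + 0.5 * sigma2 * (1 - lambda) / (lambda * mu + 1)^2)
  }
}

# Check against numerical integration for lambda = 0.2 (illustrative values).
lambda <- 0.2
mu <- 10
sigma <- 0.5
back <- function(x) (lambda * x + 1)^(1 / lambda)
exact_mean <- integrate(function(x) back(x) * dnorm(x, mu, sigma),
                        mu - 10 * sigma, mu + 10 * sigma)$value
adjusted <- boxcox_mean(mu, sigma^2, lambda)
median_fc <- back(mu)

print(c(median = median_fc, adjusted = adjusted, exact = exact_mean))
```

With these values the adjusted mean lies much closer to the integrated value than the back-transformed median does; the remaining gap comes from the higher-order Taylor terms that the approximation drops.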

Scaled logistic transformation

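In my previous post on transformations, I described the scaled logit transformation for bounding a forecast between specified limits $a$ and $b$. In this case,

$$f(x) = \frac{(b-a)e^x}{1+e^x} + a$$

and so

$$f''(x) = \frac{(b-a)e^x(1-e^x)}{(1+e^x)^3}$$

and the back-transformed mean is given by

$$\frac{a + be^\mu}{1+e^\mu} + \frac{\sigma^2(b-a)e^\mu(1-e^\mu)}{2(1+e^\mu)^3}.$$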


In R, this can be calculated as follows.

library(forecast)
library(fma)  # assumed source of the eggs series

# Bounds
a <- 50
b <- 400
# Transform data
y <- log((eggs-a)/(b-eggs))
fit <- ets(y)
fc <- forecast(fit, h=50, level=95)
# Forecast variance on the transformed scale, recovered from the 95% interval
fvar <- ((fc$upper-fc$lower)/qnorm(0.975)/2)^2
emu <- exp(fc$mean)
# Back-transform forecasts
fc$mean <- (b-a)*exp(fc$mean)/(1+exp(fc$mean)) + a
fc$lower <- (b-a)*exp(fc$lower)/(1+exp(fc$lower)) + a
fc$upper <- (b-a)*exp(fc$upper)/(1+exp(fc$upper)) + a
fc$x <- eggs
# Plot result on original scale
plot(fc)
# Compute forecast mean
fc$mean <- 1/(1+emu)^3*((a+b*emu)*(1+emu)^2 + fvar*(b-a)*emu*(1-emu)/2)
# Overlay the forecast means in red
lines(fc$mean, col="red")
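As a sanity check on the scaled logit formula, the approximate mean can be compared with a numerically integrated one. The sketch below uses base R only; `inv_scaled_logit` and `scaled_logit_mean` are hypothetical helpers, the forecast density on the transformed scale is assumed to be normal, and the values of `mu` and `sigma` are invented for the example.

```r
# Scaled logit back-transformation onto (a, b) and its approximate mean.
# Hypothetical helpers for illustration.
inv_scaled_logit <- function(x, a, b) (b - a) * exp(x) / (1 + exp(x)) + a

scaled_logit_mean <- function(mu, sigma2, a, b) {
  emu <- exp(mu)
  (a + b * emu) / (1 + emu) +
    0.5 * sigma2 * (b - a) * emu * (1 - emu) / (1 + emu)^3
}

a <- 50
b <- 400
mu <- -1      # illustrative forecast mean on the transformed scale
sigma <- 0.3  # illustrative forecast standard deviation

# Reference value under an assumed normal density, by numerical integration.
exact_mean <- integrate(
  function(x) inv_scaled_logit(x, a, b) * dnorm(x, mu, sigma),
  mu - 10 * sigma, mu + 10 * sigma
)$value
adjusted <- scaled_logit_mean(mu, sigma^2, a, b)
median_fc <- inv_scaled_logit(mu, a, b)

print(c(median = median_fc, adjusted = adjusted, exact = exact_mean))
```

The sign of the adjustment depends on $1 - e^\mu$: the mean lies above the median when $\mu < 0$ and below it when $\mu > 0$, which reflects the symmetry of the logistic curve about the midpoint of $(a, b)$.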


Published at DZone with permission of Rob J Hyndman, DZone MVB. See the original article here.

