# HTS with Regressors

# HTS with Regressors

Join the DZone community and get the full member experience.

Join For Free**The open source HPCC Systems platform is a proven, easy to use solution for managing data at scale. Visit our Easy Guide to learn more about this completely free platform, test drive some code in the online Playground, and get started today.**

The hts package for R allows for forecasting hierarchical and grouped time series data. The idea is to generate forecasts for all series at all levels of aggregation without imposing the aggregation constraints, and then to reconcile the forecasts so they satisfy the aggregation constraints. (An introduction to reconciling hierarchical and grouped time series is available in this Foresight paper.)

The base forecasts can be generated using any method, with ETS models and ARIMA models provided as options in the `forecast.gts()`

function. As ETS models do not allow for regressors, you will need to choose ARIMA models if you want to include regressors.

Suppose `x`

is a matrix of historical regressors (with each column containing one regressor and with the number of rows equal to the number of time periods of historical data), and `f`

is the corresponding matrix of future regressors (with the number of rows equal to the forecast horizon). Then if `y`

is an `hts`

or `gts`

object, the following code can be used for forecasting:

fc = forecast(y, fmethod="arima", xreg="x", newxreg="f")

That will fit a regression with ARIMA errors to each of the original series (at all levels of aggregation), produce forecasts from each model, and then reconcile the forecasts so they satisfy the aggregation constraints.

### Infant deaths

I will illustrate using infant death numbers for Australia. These are disaggregated by state and sex as shown below. This graph is produced using `plot(infantgts)`

.

*[Click for a larger version of the graph]*

A potential forecasting method is to use a regression model on the log scale with a constant to 1970 and a decreasing trend thereafter. The figure below shows the model for the most aggregated data.

To apply this model to all series, and allow for ARMA errors, we can use the following code.

y = window(infantgts, start=1944) z = pmax(time(y$bts) - 1970, 0) fz = max(z) + 1:10 fc = forecast(y, h=10, fmethod="arima", xreg=z, newxreg=fz, lambda=0) plot(fc)

*[Click for a larger version of the graph]*

`lambda=0`

means the models are fitted to the logged data (although reconciliation must occur on the original scale).
**Managing data at scale doesn’t have to be hard. Find out how the completely free, open source HPCC Systems platform makes it easier to update, easier to program, easier to integrate data, and easier to manage clusters. Download and get started today.**

Published at DZone with permission of Rob J Hyndman , DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

## {{ parent.title || parent.header.title}}

## {{ parent.tldr }}

## {{ parent.linkDescription }}

{{ parent.urlSource.name }}