Constants and ARIMA Models in R

This post is from my new book Forecasting: principles and practice, available freely online at OTexts.com/fpp/.

A non-seasonal ARIMA model can be written as

\begin{equation*} (1-\phi_1B - \cdots - \phi_p B^p)(1-B)^d y_t = c + (1 + \theta_1 B + \cdots + \theta_q B^q)e_t \tag{1} \end{equation*}

or equivalently as

\begin{equation*} (1-\phi_1B - \cdots - \phi_p B^p)(1-B)^d (y_t - \mu t^d/d!) = (1 + \theta_1 B + \cdots + \theta_q B^q)e_t, \tag{2} \end{equation*}

where $B$ is the backshift operator, $c = \mu(1-\phi_1 - \cdots - \phi_p)$ and $\mu$ is the mean of $(1-B)^d y_t$. R uses the parametrization of equation (2).
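As a brief sketch of why the two forms agree, the steps below use the fact that the $d$th difference of $t^d$ is $d!$, and that the backshift operator leaves a constant unchanged:

\begin{align*}
(1-B)^d \frac{t^d}{d!} &= 1, \\
(1-B)^d \left(y_t - \mu \frac{t^d}{d!}\right) &= (1-B)^d y_t - \mu, \\
(1-\phi_1 B - \cdots - \phi_p B^p)\,\mu &= \mu(1-\phi_1 - \cdots - \phi_p) = c.
\end{align*}

Applying the autoregressive operator to the second line and moving $c$ to the right-hand side turns equation (2) into equation (1).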

Thus, the inclusion of a constant in a non-stationary ARIMA model is equivalent to inducing a polynomial trend of order $d$ in the forecast function. (If the constant is omitted, the forecast function includes a polynomial trend of order $d-1$.) When $d=0$, we have the special case that $\mu$ is the mean of $y_t$.

Including constants in ARIMA models using R


By default, the arima() command in R sets $c=\mu=0$ when $d>0$ and provides an estimate of $\mu$ when $d=0$. The parameter $\mu$ is called the “intercept” in the R output. It will be close to the sample mean of the time series, but usually not identical to it, because the sample mean is not the maximum likelihood estimate when $p+q>0$.

The arima() command has an argument include.mean, which is TRUE by default and only has an effect when $d=0$. Setting include.mean=FALSE will force $\mu=0$.
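A minimal sketch of these defaults, using only base R. The simulated series, the seed and the object names are illustrative assumptions and do not come from the book.

```r
# Simulated AR(1) series with mean 2 (illustrative data only)
set.seed(123)
y <- 2 + arima.sim(model = list(ar = 0.6), n = 200)

# d = 0: arima() estimates mu and reports it as "intercept"
fit_mean <- arima(y, order = c(1, 0, 0))
coef(fit_mean)
mean(y)  # close to the intercept, but usually not identical

# d = 0 with include.mean = FALSE: mu is forced to zero
fit_nomean <- arima(y, order = c(1, 0, 0), include.mean = FALSE)
coef(fit_nomean)

# d = 1: no constant is estimated, whatever include.mean says
fit_diff <- arima(y, order = c(1, 1, 0))
coef(fit_diff)
```

Comparing the names of the three coefficient vectors shows when the intercept is present.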


The Arima() command from the forecast package provides more flexibility on the inclusion of a constant. It has an argument include.mean which has identical functionality to the corresponding argument for arima(). It also has an argument include.drift which allows $\mu\ne0$ when $d=1$. For $d>1$, no constant is allowed, as a quadratic or higher-order trend is particularly dangerous when forecasting. The parameter $\mu$ is called the “drift” in the R output when $d=1$.

There is also an argument include.constant which, if TRUE, will set include.mean=TRUE when $d=0$ and include.drift=TRUE when $d=1$. If include.constant=FALSE, both include.mean and include.drift will be set to FALSE. If include.constant is used, any values supplied for include.mean and include.drift are ignored.
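The following sketch illustrates these arguments for $d=1$. It assumes the forecast package is installed; the simulated random walk with drift 0.5 and the object names are illustrative only.

```r
library(forecast)  # assumes the forecast package is installed

# Simulated random walk with drift 0.5 (illustrative data only)
set.seed(42)
y <- ts(cumsum(rnorm(200, mean = 0.5)))

# d = 1 with include.drift = TRUE: the estimate of mu is labelled "drift"
fit_drift <- Arima(y, order = c(1, 1, 0), include.drift = TRUE)
coef(fit_drift)

# include.constant = TRUE switches the drift term on because d = 1
fit_const <- Arima(y, order = c(1, 1, 0), include.constant = TRUE)
coef(fit_const)

# include.constant = FALSE: no constant of any kind
fit_none <- Arima(y, order = c(1, 1, 0), include.constant = FALSE)
coef(fit_none)
```

The first two fits should report the same coefficients, since include.constant=TRUE is just a shortcut for include.drift=TRUE in this case.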

When $d=0$ and include.drift=TRUE, the fitted model from Arima() is

    \[(1-\phi_1B - \cdots - \phi_p B^p) (y_t - a - bt) = (1 + \theta_1 B + \cdots + \theta_q B^q)e_t.\]

In this case, the R output will label $a$ as the “intercept” and $b$ as the “drift” coefficient.
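The same kind of model, a linear trend with AR errors, can be sketched in base R by passing a time index to arima() as a regressor. This is an assumed equivalent formulation for illustration, not a statement about how Arima() is implemented; the column name drift is chosen here only to mirror the label described above, and the data are simulated.

```r
# Simulated trend-stationary series: y_t = 5 + 0.3 t + AR(1) noise
# (illustrative data only)
set.seed(1)
n <- 200
time_index <- seq_len(n)
y <- 5 + 0.3 * time_index + arima.sim(model = list(ar = 0.5), n = n)

# AR(1) errors around the line a + b*t; the regressor column is named
# "drift" purely to match the label used in the text
trend <- cbind(drift = time_index)
fit_trend <- arima(y, order = c(1, 0, 0), xreg = trend)

# Coefficients: ar1, intercept (a) and drift (b)
coef(fit_trend)
```

The estimated slope should be close to the value 0.3 used in the simulation.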


The auto.arima() function automates the inclusion of a constant. By default, for $d=0$ or $d=1$, a constant will be included if it improves the AIC value; for $d>1$ the constant is always omitted. If allowdrift=FALSE is specified, then the constant is only allowed when $d=0$.
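A short sketch of this behaviour, again assuming the forecast package is installed and using a simulated random walk with drift as illustrative data. Which model is selected depends on the data, so no particular output is claimed here.

```r
library(forecast)  # assumes the forecast package is installed

# Simulated random walk with drift 0.5 (illustrative data only)
set.seed(42)
y <- ts(cumsum(rnorm(200, mean = 0.5)))

# Default search: a drift term may be kept if d = 1 is selected
fit_auto <- auto.arima(y)
fit_auto

# With allowdrift = FALSE, a constant is only considered when d = 0,
# so the selected model cannot contain a "drift" coefficient
fit_nodrift <- auto.arima(y, allowdrift = FALSE)
coef(fit_nodrift)
```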

Eventual forecast functions

The eventual forecast function (EFF) is the limit of $\hat{y}_{t+h|t}$ as a function of the forecast horizon $h$ as $h\rightarrow\infty$.

The constant $c$ has an important effect on the long-term forecasts obtained from these models.

  • If $c=0$ and $d=0$, the EFF will go to zero.
  • If $c=0$ and $d=1$, the EFF will go to a non-zero constant determined by the last few observations.
  • If $c=0$ and $d=2$, the EFF will follow a straight line with intercept and slope determined by the last few observations.
  • If $c\ne0$ and $d=0$, the EFF will go to the mean of the data.
  • If $c\ne0$ and $d=1$, the EFF will follow a straight line with slope equal to the mean of the differenced data.
  • If $c\ne0$ and $d=2$, the EFF will follow a quadratic trend.
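Three of these cases can be checked numerically with long-horizon forecasts. The sketch below uses only base R and simulated data. The drift case is fitted by adding a time index as a regressor, which is an assumed stand-in for a model with $c\ne0$ and $d=1$; all object names are illustrative.

```r
set.seed(2024)
h <- 200  # long horizon, used to approximate the limit

# c != 0, d = 0: forecasts converge to the estimated mean
y_stat <- 10 + arima.sim(model = list(ar = 0.6), n = 200)
fit_stat <- arima(y_stat, order = c(1, 0, 0))
pred_stat <- predict(fit_stat, n.ahead = h)$pred
c(last_forecast = pred_stat[h], intercept = coef(fit_stat)[["intercept"]])

# c = 0, d = 1: forecasts stay flat at a constant set by the data
# (for a pure random walk model, the last observation)
y_rw <- ts(cumsum(rnorm(200, mean = 0.5)))
fit_rw <- arima(y_rw, order = c(0, 1, 0))
pred_rw <- predict(fit_rw, n.ahead = h)$pred
range(pred_rw - y_rw[200])

# c != 0, d = 1: forecasts follow a straight line whose slope is the
# estimated drift (time index supplied as a regressor)
trend <- cbind(drift = seq_along(y_rw))
fit_rwd <- arima(y_rw, order = c(0, 1, 0), xreg = trend)
future <- cbind(drift = length(y_rw) + seq_len(h))
pred_rwd <- predict(fit_rwd, n.ahead = h, newxreg = future)$pred
range(diff(pred_rwd))
coef(fit_rwd)[["drift"]]
```

In the last case, every step of the forecast path increases by the estimated drift, which is the straight-line behaviour described in the list.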

Seasonal ARIMA models

If a seasonal model is used, all of the above will hold with $d$ replaced by $d+D$, where $D$ is the order of seasonal differencing and $d$ is the order of non-seasonal differencing.
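As a small illustration with base R, the sketch below fits a model with $d=0$ and $D=1$ to the AirPassengers data that ships with R. The choice of data set and model orders is an assumption made for the example.

```r
# Monthly series shipped with base R; the seasonal period (12) is
# taken from the frequency of the ts object
y <- log(AirPassengers)

# d = 0 but D = 1, so d + D = 1 and arima() estimates no constant
fit_seas <- arima(y, order = c(0, 0, 1), seasonal = c(0, 1, 1))
names(coef(fit_seas))
```

No intercept appears among the coefficient names, because seasonal differencing alone is enough to switch off the default constant in arima().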

Published at DZone with permission of
