Fitting Models to Short Time Series

Following my post on fitting models to long time series, I thought I’d tackle the opposite problem, which is more common in business environments.

I often get asked how few data points can be used to fit a time series model. As with almost all sample size questions, there is no easy answer. It depends on the number of model parameters to be estimated and the amount of randomness in the data: the more parameters there are, and the noisier the data, the larger the sample size required.

Using least squares estimation, or some other non-regularized estimation method, it is possible to estimate a model only if you have more observations than parameters. (If you use the LASSO, or some other regularization technique, it is possible to estimate a model with fewer observations than parameters.) However, there is no guarantee that a fitted model will be any good for forecasting, especially when the data are noisy.
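To illustrate the first point, here is a minimal sketch in base R using simulated data. The sizes (5 observations, 8 predictors) and the variable names are arbitrary choices for illustration only. With fewer observations than coefficients, lm() cannot identify all of them and returns the excess as NA.

```r
# Simulated data: 5 observations, 8 predictors plus an intercept (9 coefficients)
set.seed(1)
n <- 5
p <- 8
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
y <- rnorm(n)

# Ordinary least squares; the design matrix has rank at most n
fit <- lm(y ~ X)

# Coefficients that could not be estimated are returned as NA
n_unidentified <- sum(is.na(coef(fit)))
print(n_unidentified)
```

A regularized estimator such as the LASSO sidesteps this by shrinking coefficients, but it needs an additional package, so it is not shown here.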

Some textbooks provide rules-of-thumb giving minimum sample sizes for various time series models. These are misleading and unsubstantiated in theory or practice. Further, they ignore the underlying variability of the data and often overlook the number of parameters to be estimated as well. There is, for example, no justification whatever for the magic number of 30 often given as a minimum for ARIMA modelling.

The only reasonable approach is to first check that there are enough observations to estimate the model, and then to test if the model performs well out-of-sample. With short series, there is not enough data to allow some observations to be withheld for testing purposes. However, the AIC can be used as a proxy for the one-step forecast out-of-sample MSE (see here). The AIC allows both the number of parameters and the amount of noise to be taken into account.
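As a rough sketch of what comparing models by AIC looks like on a short series, the following uses arima() from base R on a simulated random walk of 15 observations. The series, the three candidate orders and the use of method = "ML" are my own assumptions for illustration. The candidates share the same differencing so that their AIC values are comparable.

```r
# A short simulated series: a random walk with 15 observations (illustrative only)
set.seed(42)
y <- ts(cumsum(rnorm(15)))

# Candidate models with the same order of differencing
orders <- list(c(0, 1, 0), c(1, 1, 0), c(2, 1, 0))
fits <- lapply(orders, function(ord) arima(y, order = ord, method = "ML"))

# AIC trades off fit against the number of parameters; smaller is better
aic <- sapply(fits, AIC)
names(aic) <- c("ARIMA(0,1,0)", "ARIMA(1,1,0)", "ARIMA(2,1,0)")
print(aic)

# Name of the model with the smallest AIC
best <- names(aic)[which.min(aic)]
print(best)
```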

What tends to happen with short series is that the AIC suggests very simple models because anything with more than one or two parameters will produce poor forecasts due to the estimation error. I applied the auto.arima() function from the forecast package in R to all the series from the M-competition with fewer than 20 observations. There were a total of 144 series, of which 32 had models with zero parameters (random walks), 95 had models with one parameter, 15 had models with two parameters and 2 series had models with three parameters. For what it’s worth, here is the code.

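library(Mcomp)     # assumed source of the M1 list of M-competition series
library(forecast)  # provides auto.arima()
# Length of every M1 series; keep only those with fewer than 20 observations
n <- unlist(lapply(M1, function(x) length(x$x)))
n <- n[n < 20]
series <- names(n)
nparam <- numeric(length(n))
# Fit an ARIMA model to each short series and count its estimated coefficients
for (i in seq_along(n)) {
  fit <- auto.arima(M1[[series[i]]]$x)
  nparam[i] <- length(fit$coef)
}
table(nparam)  # number of series by parameter count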

Seasonal models bring their own difficulties because the seasonality usually takes up m-1 degrees of freedom where m is the seasonal period (e.g., m=12 for monthly data). Fourier terms are one way to reduce the problem, and are useful whenever the ratio of m to sample size is large. Further comments on seasonality and sample size are in my short Foresight paper with Andrey Kostenko: “Minimum sample size requirements for seasonal forecasting models”, although I wrote that for a statistically unsophisticated audience, so there is no mention of the LASSO or AIC as possible solutions.
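To make the degrees-of-freedom point concrete, here is a small sketch in base R that compares seasonal dummy variables with Fourier terms on 30 simulated monthly observations. The series, the choice of K = 2 harmonics and the variable names are assumptions made for illustration. The forecast package has a fourier() function for building these terms, but they are constructed by hand here to keep the example self-contained.

```r
# Simulated monthly series: 30 observations with a single seasonal harmonic
set.seed(7)
m <- 12   # seasonal period
n <- 30   # sample size
K <- 2    # number of Fourier harmonics (assumed for illustration)
t <- seq_len(n)
y <- 10 + 3 * sin(2 * pi * t / m) + rnorm(n)

# Fourier terms: one sine and one cosine column per harmonic
fourier_terms <- do.call(cbind, lapply(seq_len(K), function(k) {
  cbind(sin(2 * pi * k * t / m), cos(2 * pi * k * t / m))
}))
colnames(fourier_terms) <- paste0(c("S", "C"), rep(seq_len(K), each = 2))

# Seasonal dummies use m - 1 coefficients; Fourier terms use 2 * K
fit_dummy <- lm(y ~ factor((t - 1) %% m))
fit_fourier <- lm(y ~ fourier_terms)
n_seasonal <- c(dummies = length(coef(fit_dummy)) - 1,
                fourier = length(coef(fit_fourier)) - 1)
print(n_seasonal)

# AIC can then be used to choose between the two specifications
print(c(dummies = AIC(fit_dummy), fourier = AIC(fit_fourier)))
```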

Published at DZone with permission of Rob J Hyndman, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

