
R: Seasonal Periods


I get questions about this almost every week. Here is an example from a recent comment on this blog:

I have two large time series data. One is separated by seconds intervals and the other by minutes. The length of each time series is 180 days. I’m using R (3.1.1) for forecasting the data. I’d like to know the value of the “frequency” argument in the ts() function in R, for each data set. Since most of the examples and cases I’ve seen so far are for months or days at the most, it is quite confusing for me when dealing with equally separated seconds or minutes. According to my understanding, the “frequency” argument is the number of observations per season. So what is the “season” in the case of seconds/minutes? My guess is that since there are 86,400 seconds and 1440 minutes a day, these should be the values for the “freq” argument. Is that correct?

The same question was asked on crossvalidated.com.

Yes, the “frequency” is the number of observations per season. This is the reverse of the usage in physics or Fourier analysis, where “period” is the length of the cycle and “frequency” is the inverse of the period; what ts() calls the frequency is what those fields would call the period, measured in observations. When using the ts() function in R, the following choices should be used.

Data        frequency
Annual      1
Quarterly   4
Monthly     12
Weekly      52

Actually, there are not 52 weeks in a year, but 365.25/7 = 52.18 on average. However, most functions that use ts objects require an integer frequency, so 52 is used as an approximation.
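As a minimal sketch of the choices above, the following builds a monthly and a quarterly series with ts(). The data are random placeholders and the start date is an arbitrary assumption, chosen only for illustration.

```r
# Placeholder data: ten years of monthly values (random, for illustration only)
set.seed(1)
x <- rnorm(120)

# frequency = 12: twelve observations per seasonal cycle (one year)
monthly <- ts(x, frequency = 12, start = c(2005, 1))

# The same idea for quarterly data: four observations per year
quarterly <- ts(x[1:40], frequency = 4, start = c(2005, 1))

frequency(monthly)     # observations per season
cycle(monthly)[1:14]   # position within the season: 1 to 12, then 1, 2
```

The cycle() call is a quick way to check that R has lined the observations up with the season you intended.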

Once observations are recorded more often than weekly, there is usually more than one way of handling the frequency. For example, hourly data might have a daily seasonality (frequency = 24), a weekly seasonality (frequency = 24 × 7 = 168) and an annual seasonality (frequency = 24 × 365.25 = 8766). If you want to use a ts object, then you need to decide which of these is the most important.
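To illustrate that choice, here is a small sketch that wraps the same hourly values in two different ts objects, one treating the day as the season and one treating the week as the season. The data are random placeholders, and the variable names are made up for this example.

```r
# Placeholder data: four weeks of hourly observations (random, illustration only)
set.seed(2)
hourly <- rnorm(24 * 7 * 4)

# Treat the day as the season: 24 observations per cycle
daily_season <- ts(hourly, frequency = 24)

# Treat the week as the season: 24 * 7 = 168 observations per cycle
weekly_season <- ts(hourly, frequency = 24 * 7)

# Same data, different seasonal structure: number of complete cycles
length(hourly) / frequency(daily_season)    # 28 daily cycles
length(hourly) / frequency(weekly_season)   # 4 weekly cycles
```

A ts object can carry only one of these frequencies at a time, which is why the decision matters.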

An alternative is to use an msts object (defined in the forecast package), which handles time series with multiple seasonal periods. Then you can specify all the frequencies that might be relevant. It is also flexible enough to handle non-integer frequencies.
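A minimal sketch of creating msts objects, assuming the forecast package is installed. The data are random placeholders; the periods are the hourly (24, 168) and daily (7, 365.25) values discussed in this post.

```r
library(forecast)  # provides msts()

# Placeholder data: eight weeks of hourly observations (random, illustration only)
set.seed(3)
x <- rnorm(24 * 7 * 8)

# Declare both the daily (24) and weekly (168) seasonal periods
y <- msts(x, seasonal.periods = c(24, 168))

# Non-integer periods are accepted too, e.g. daily data with weekly
# and annual seasonality (three years of placeholder values)
d <- msts(rnorm(3 * 365), seasonal.periods = c(7, 365.25))

# The declared periods are kept in the "msts" attribute
attr(y, "msts")
attr(d, "msts")
```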

Data frequencies

Data          minute   hour    day      week      year
Daily                                   7         365.25
Hourly                         24       168       8766
Half-hourly                    48       336       17532
Minutes                60      1440     10080     525960
Seconds       60       3600    86400    604800    31557600
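The day, week and year columns are just arithmetic on the number of observations per day, which the following base R sketch reproduces. The names obs_per_day and periods are made up for this illustration.

```r
# Observations per day for each sampling interval
obs_per_day <- c(Daily = 1, Hourly = 24, "Half-hourly" = 48,
                 Minutes = 24 * 60, Seconds = 24 * 60 * 60)

# Candidate seasonal periods, measured in observations.
# (For daily data the "day" entry is 1, i.e. no within-day season.)
periods <- cbind(day  = obs_per_day,
                 week = obs_per_day * 7,
                 year = obs_per_day * 365.25)
periods
```

For the question quoted at the start, this confirms that 86400 (seconds) and 1440 (minutes) are the periods for a daily season; whether a weekly period is also needed depends on the data.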

You won’t necessarily want to include all of these frequencies, just the ones that are likely to be present in the data. For example, a natural phenomenon (e.g., sunshine hours) is unlikely to have a weekly period, and if your data are measured at one-minute intervals over a 3-month period, there is no point including an annual frequency.
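One rough way to screen candidate periods against the length of a series is sketched below. The requirement that the data cover at least two full cycles is an assumed rule of thumb for this illustration, not something stated in the post, and 90 days stands in for "3 months".

```r
# One-minute data over roughly 3 months (90 days assumed here)
n_obs <- 90 * 1440

# Candidate periods for minute data, in observations
candidates <- c(hour = 60, day = 1440, week = 10080, year = 525960)

# Keep only periods the series covers at least twice (assumed rule of thumb)
usable <- candidates[n_obs >= 2 * candidates]
usable   # hour, day and week remain; the annual period is dropped
```

Length is only part of the decision: a period that passes this check should still be dropped if the phenomenon has no reason to show it.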

For example, the taylor data set from the forecast package contains half-hourly electricity demand data from England and Wales over about 3 months in 2000. It was defined as

taylor <- msts(x, seasonal.periods=c(48,336))

One convenient model for multiple seasonal time series is a TBATS model:

taylor.fit <- tbats(taylor)

(Warning: this takes a few minutes.)

If an msts object is used with a function designed for ts objects, the largest seasonal period is used as the “frequency” attribute.
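This can be checked with a small sketch, assuming the forecast package is installed and msts() is called with its default arguments. The data are random placeholders with the same half-hourly periods as taylor.

```r
library(forecast)  # provides msts()

# Placeholder data: four weeks of half-hourly values (random, illustration only)
set.seed(4)
x <- rnorm(48 * 7 * 4)
y <- msts(x, seasonal.periods = c(48, 336))

# Functions written for ts objects see a single frequency,
# which should be the larger of the two periods (336)
frequency(y)
```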



Published at DZone with permission of Rob J Hyndman, DZone MVB. See the original article here.


