
Time Series Data Library now on DataMarket



The Time Series Data Library is a collection of about 800 time series that I have maintained since about 1992 and hosted on my personal website. It includes data from a lot of time series textbooks, as well as many other series that I’ve either collected for student projects or that helpful people have sent to me.

I’ve now moved the collection onto DataMarket, which provides much better facilities for maintaining and using time series data. You can easily search the collection, graph any series, filter by seasonal period, and so on. You can also export data in many formats. Each data set has its own short link; for example, the famous Canadian lynx data is at http://data.is/Ky69xY.

One particularly useful feature is the ability to read directly into R using the rdatamarket package. All you need to know is the short link. For example, to download “Deaths from gun-related homicides in Australia, 1915–2004”, use the following R code:

library(rdatamarket)
deaths <- dmseries("http://data.is/Ky6vVf")

The result is an object of class zoo. To convert it to class ts, use

deaths <- as.ts(deaths[,1])

In this case, deaths contains only one column, but in general multivariate time series can be downloaded in this manner.
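To illustrate the multivariate case without relying on a download, here is a minimal sketch that builds a small two-column zoo object by hand and converts it to ts. The object x, its column names, and its values are invented for illustration only; they are not DataMarket data, and a real download may be structured differently.

```r
library(zoo)

# Hypothetical stand-in for a multivariate download: two annual series
# indexed by year. The names and values are made up for illustration.
years <- 2000:2004
x <- zoo(cbind(seriesA = c(10, 12, 11, 15, 14),
               seriesB = c(3, 4, 4, 5, 6)),
         order.by = years)

# Convert the whole object: the result is a multivariate ts
# with one column per series.
x_ts <- as.ts(x)

# Or select a single column first to obtain a univariate ts.
a_ts <- as.ts(x[, "seriesA"])
```

Converting the whole object keeps the series aligned on a common time index, while selecting a column first is convenient when a function expects a single series.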

DataMarket contains thousands of other time series from organizations including Eurostat, the IMF, the United Nations, Gapminder, and many more. Some time series require a subscription, but many can be used freely. The time series in the TSDL will remain freely available.

I’m grateful to DataMarket for agreeing to host my library without charge, and I encourage everyone interested in time series analysis to check them out.

If you use any data from the TSDL in a publication, please use the following citation:

Hyndman, R.J. Time Series Data Library, http://data.is/TSDLdemo. Accessed on <insert date here>.

The data files will remain on my website so that existing links will not be broken.



Published at DZone with permission of

