
Using Big Data to Improve Forecasting


When you are dealing with a large volume of data, there are two solutions: increase computer performance or reduce the quantity of information needed.


Technologies such as AI and big data have become increasingly proficient at spotting trends in large data sets. Indeed, they have become so proficient that the University of Toronto's Ajay Agrawal, Joshua Gans, and Avi Goldfarb argue that lowering the cost of prediction will be the main benefit delivered by AI in the near term.

A sign of the progress being made comes via a recent paper from researchers at the University of Cordoba, which chronicles their work on a forecasting system that produces accurate predictions with less data than previous models have required.

"When you are dealing with a large volume of data, there are two solutions. You either increase computer performance, which is very expensive, or you reduce the quantity of information needed for the process to be done properly, " the authors explain.

Predictive Modeling

The authors believe that producing reliable results from a predictive model requires both a good number of variables and a good number of examples to work with. The researchers attempted to reduce the number of examples by focusing on quality rather than quantity, in the hope of reducing the amount of computing power required to derive meaningful results.

"We have developed a technique that can tell you which set of examples you need so that the forecast is not only reliable but could even be better," the researchers explain.

Indeed, in some instances they were able to reduce the amount of data fed into the system by around 80% without reducing the overall accuracy of the predictions. This not only saves computing power but also reduces the energy used, the time taken to process the data, and ultimately the cost of making the predictions.
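The paper's exact selection algorithm isn't described here, but the underlying idea, training on a small, representative subset of examples rather than on everything, is easy to sketch. The Python example below is a minimal illustration on synthetic data, assuming a generic regression task; the k-means nearest-to-centroid selection is a common instance-selection stand-in, not the researchers' actual method, and all names and parameters are illustrative.

```python
# A minimal sketch of quality-over-quantity example selection.
# Assumption: k-means "nearest-to-centroid" picking stands in for
# the paper's (unpublished here) selection technique.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic data: plenty of examples, a handful of variables.
X, y = make_regression(n_samples=2_000, n_features=15, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)


def select_representatives(X, y, keep_fraction=0.2, random_state=0):
    """Keep ~keep_fraction of the examples: cluster the inputs and
    retain, for each cluster, the example nearest its centroid."""
    n_keep = max(1, int(len(X) * keep_fraction))
    km = KMeans(n_clusters=n_keep, n_init=10, random_state=random_state).fit(X)
    idx = np.array([
        np.argmin(np.linalg.norm(X - c, axis=1)) for c in km.cluster_centers_
    ])
    return X[idx], y[idx]


# Train once on all examples, once on the ~20% representative subset.
full_model = RandomForestRegressor(random_state=0).fit(X_train, y_train)
X_small, y_small = select_representatives(X_train, y_train, keep_fraction=0.2)
small_model = RandomForestRegressor(random_state=0).fit(X_small, y_small)

print("MAE, 100% of examples:", mean_absolute_error(y_test, full_model.predict(X_test)))
print("MAE,  20% of examples:", mean_absolute_error(y_test, small_model.predict(X_test)))
```

On reasonably clustered data, the subset model's error often lands close to the full model's, which is the effect the researchers report at much larger scale.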

If you're looking for real-time insights, this can be especially important: the predictions arrive sooner, and the lower cost makes it practical to run them more frequently.

So far the system has only been used in laboratory environments, but it's nonetheless an interesting sign of the progress being made toward more affordable prediction.


Topics: forecasting, AI, big data, predictive analytics
