The Parable of Google Flu

In a white paper made publicly available by Harvard University, researchers broach the topic of Google Flu Trends – commonly hailed as an innovative and thorough application of Big Data – and some of its shortcomings. 

It's a familiar subject: Big Data is trendy right now, and many hail it as the next big thing in public health, marketing, and any field that could benefit from thoughtful, analytics-based insights. But the researchers caution against Big Data hubris:

“Big data hubris” is the often implicit assumption that big data are a substitute for, rather than a supplement to, traditional data collection and analysis. We have asserted that there are enormous scientific possibilities in big data. However, quantity of data does not mean that one can ignore foundational issues of measurement, construct validity and reliability, and dependencies among data. The core challenge is that most big data that have received popular attention are not the output of instruments designed to produce valid and reliable data amenable for scientific analysis.

In the Google Flu Trends example, a model intended to anticipate the Centers for Disease Control and Prevention's (CDC) traditional tracking of the spread of disease during flu season instead substantially overestimated the spread of the flu. It provides a cautionary tale about what is useful about Big Data and how we should approach it in the future: by digging deep into the algorithms, being transparent about analytic methods, and not relying solely on the size of the data as a panacea for all problems. (This comes on the heels of considerable criticism that Big Data encourages confirmation bias.)

You can read the white paper here. BBC News also covered the story with a comprehensive piece that looks at the public response.


Opinions expressed by DZone contributors are their own.

