Might What Big Data Is Saying About Us Be Wrong?

Not all data is created equal, and a large sample alone is not always enough to yield the best insights. Read on to learn why.

I've written extensively about the tremendous potential for big data in healthcare to drive enormous changes in how we keep people healthy for longer. It goes without saying, however, that not all data is created equal, and a large sample alone is not always enough to yield the best insights.

A reminder of this comes from a recent study from the University of California, Berkeley. It suggests that emotion, behavior, and physiology vary hugely between individuals, so an average taken over a large dataset can still produce a 'norm' that is wide of the mark for any given individual.

"If you want to know what individuals feel or how they become sick, you have to conduct research on individuals, not on groups," the researchers say. "Diseases, mental disorders, emotions, and behaviors are expressed within individual people, over time. A snapshot of many people at one moment in time can't capture these phenomena."
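To make the point concrete, here is a minimal sketch in Python using purely synthetic data. It is not the study's data or method; the function names, the number of people and days, and the assumed relationships between the two measurements are all invented for illustration. Each simulated person has their own baseline, and within each person the second measurement falls as the first rises, yet a one-day snapshot across people shows the opposite trend.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def simulate(n_people=20, n_days=50, seed=0):
    """Generate synthetic (x, y) time series for hypothetical individuals."""
    rng = random.Random(seed)
    people = []
    for i in range(n_people):
        baseline = float(i)  # assumed: each person has a different typical level
        xs = [baseline + rng.gauss(0, 1) for _ in range(n_days)]
        # Assumed relationship: across people, a higher baseline means higher y,
        # but within one person, y drops on days when x is above their baseline.
        ys = [2 * baseline - (x - baseline) + rng.gauss(0, 0.3) for x in xs]
        people.append((xs, ys))
    return people


people = simulate()

# Group "snapshot": a single observation per person, all taken on the same day.
snapshot_x = [xs[0] for xs, _ in people]
snapshot_y = [ys[0] for _, ys in people]
group_r = pearson(snapshot_x, snapshot_y)

# Individual view: the correlation within each person's own time series.
within_r = [pearson(xs, ys) for xs, ys in people]

print(f"group snapshot correlation: {group_r:+.2f}")
print(f"mean within-person correlation: {statistics.fmean(within_r):+.2f}")
```

In this toy setup the snapshot correlation comes out positive while every within-person correlation is negative, so a conclusion drawn from the group would point the wrong way for each individual in it.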

Getting Things Wrong

What's more, the consequences of using group data in medical scenarios could be serious: misdiagnoses, wrong treatments, and the perpetuation of scientific theories based on aggregated groups rather than individuals. It is, however, a situation that the team believes can be solved.

"People shouldn't necessarily lose faith in medical or social science," they explain. "Instead, they should see the potential to conduct scientific studies as a part of routine care. This is how we can truly personalize medicine."

They believe that modern technologies allow both individuals and the medical community to collect rich, valuable datasets about each person, from which the industry can derive personalized insights. That technology simply wasn't available in the past, hence the preponderance of big, impersonal datasets.

The researchers collected and compared data on hundreds of people, ranging from healthy individuals to those with disorders including depression and PTSD. Across six separate experiments, they found that the group norm often differed markedly from individual cases.

The findings underline the importance of handling medical data sensibly and, wherever possible, tapping into the rich seam of personal data that patients generate, rather than relying on aggregated data to make inferences about individuals.

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.
