Why You Should Use Caution When Dealing with Big Data

You may have the data, the Big Data that is, but you need to exercise caution before jumping to conclusions.

Big data is increasingly being used in a predictive capacity to forecast when we might be about to do bad things. In health and social care settings, it might be used to identify people who abuse alcohol, or even other people.

A recent Australian study highlights the dangers involved in this and the extreme caution we should exercise before we go down this particular rabbit hole.

“It is possible that bringing together and mining multiple databases will provide terrific insights into social problems,” the authors say.

There have been tentative moves to use big data to better guide social work, whether that means studying the link between homelessness and mortality in America or exploring the role of big data in understanding criminal behavior in Australia.

Whilst there have been promising applications, such as the Californian project to use big data to inform the location of police officers, there is also the potential for errors to be made, and for those errors to be magnified by the scale of the data used to inform the decision.

“You could match the data of homeless people and say a large number are alcoholics, so they should be targeted with alcohol rehabilitation, but what caused their situation is never uncovered,” the authors say.

“We need caution to ensure that we aren’t going to waste resources and insult and stigmatise groups of people,” they continue.

The Use of Big Data in Human Services

The paper explored the use of big data in human services after such an application was rolled out in New Zealand, where, a few years ago, policy makers began using big data to predict the chances of someone being a child abuser.

When the data was analyzed, however, huge holes emerged, which made the likelihood of misjudgements very high indeed. What’s more, the data was insufficient to provide much, if any, additional insight over traditional methods.
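To see why misjudgements can be so likely when a model tries to predict a rare behaviour, consider the following minimal sketch of the base-rate effect. Every number in it (population size, base rate, sensitivity, specificity) is a hypothetical assumption chosen for illustration and is not taken from the paper or from the New Zealand programme; the function name is likewise made up for this example.

```python
def positive_predictive_value(base_rate, sensitivity, specificity):
    """Share of flagged people who truly belong to the target group."""
    true_positives = base_rate * sensitivity
    false_positives = (1 - base_rate) * (1 - specificity)
    return true_positives / (true_positives + false_positives)


# Hypothetical inputs, for illustration only (not figures from the study).
population = 100_000
base_rate = 0.01      # assume 1% of people are in the target group
sensitivity = 0.90    # assume the model catches 90% of true cases
specificity = 0.90    # assume it correctly clears 90% of everyone else

ppv = positive_predictive_value(base_rate, sensitivity, specificity)
wrongly_flagged = population * (1 - base_rate) * (1 - specificity)

print(f"Share of flagged people who are true cases: {ppv:.1%}")
print(f"People wrongly flagged: {wrongly_flagged:,.0f}")
```

Under these assumed numbers, only about 8% of the people the model flags would be true cases, and the number of people wrongly flagged grows in direct proportion to the size of the population screened.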

“Existing tools already tell us the most likely perpetrators, without spending millions of dollars,” the authors say.

“The phenomenal cost – and whether that money could be better spent on services – is something that is quite often overlooked.”

The paper highlights the potential for misuse by drawing a parallel with the advertising world, describing how one author's demographic information wrongly led companies to assume he has an interest in golf, which could not be further from the truth.

Far from being a panacea, therefore, big data should be treated with care by anyone hoping to garner insights from vast data sources.

Published at DZone with permission of Adi Gaskell, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
