Privacy Rules are the First Casualty of Big Data

In our haste to study larger and larger amounts of data and find information, a point is getting lost in the excitement: behind that data are people who often haven’t given their permission for it to be used for just any purpose.

This isn’t a small or isolated problem. Using consumer data to understand the marketplace is one thing, but we’ve gone beyond the question of what’s selling well.

The volume of data available today means we can see where, when and how each and every consumer made a purchase. The data about the purchase has become as important as the purchase itself, and it is less likely to be scrutinized for privacy than ever before.

The gotchas

And because we can know so much, there are new levels of responsibility for those who collect, hold and use all of that data. Those folks are operating in a world where the rules were set up for completely different times (the beginning of computerization forty years ago). Back in those days, data was extremely transactional, relatively small and always collected for very specific reasons. It was collected, stored and used in separate data repositories that were only occasionally cross-referenced in a way that drew a bigger, more personal picture. And very significantly, data was easily protected on the server before there was such a thing as a firewall or the Internet.

The old rules still exist, but those simpler times are over.

Data governance

The biggest challenge of Big Data isn’t the storage or processing. The single biggest challenge, one that many don’t see or aren’t talking about, is the need to govern data in ways that protect the individual from manipulation, fraud or worse. That takes a combination of things that are readily available but underused, like access security, log data management and managed file transfer, along with people and processes that are up to the task.

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.
