Big Data For Dummies

Big data is data that is so voluminous and complex that it cannot be managed with traditional database tools. This definition does not do the concept full justice, but it gives a reasonable idea of what the term means.

Importantly, to qualify as big data under this definition, a data set is typically expected to measure in petabytes or more and to grow at an exponential rate.

Big data has caught the attention of organizations across the board because it can help them rethink traditional business strategies, adapt to changing times, and generate more revenue in the process.

Because big data tools provide access to data in real time, they can help organizations improve their cyber security.

According to Forrester, an information technology market research firm, this data can also be used to make predictions, which will push organizations to modify their operations accordingly.

Big data can also give companies insight into consumers' buying habits by letting them track and evaluate shopping behavior.

This concept got a shot in the arm with the proliferation of mobile devices and other technological advances. 

Big data is commonly characterized by three 'Vs': data volume, data variety, and data velocity.

Data volume refers to the size of the data, which, as already discussed, is growing at an unbridled pace. In addition, data is being generated by more sources than before, all of which need to be handled.

Data variety refers to the increasing number of formats that big data needs to accommodate. Initially, there were mostly Excel tables, Word documents, and the like, but now there are also PDF files, streaming video, and audio and video files. More such formats can be expected in the future as new applications make their way into the world of IT.

Data velocity refers to the speed at which data arrives and the resulting need to analyze huge amounts of it in real time.
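
To make the idea of real-time analysis a little more concrete, here is a minimal Python sketch that updates a rolling average each time a new value arrives instead of waiting for the full data set. The function name `rolling_average`, the window size, and the sample readings are all assumptions made up for illustration; real big data systems rely on dedicated streaming platforms rather than a single loop like this.

```python
from collections import deque


def rolling_average(stream, window_size=3):
    """Yield the mean of the most recent `window_size` values after each new event."""
    window = deque(maxlen=window_size)  # oldest value is evicted automatically when full
    total = 0.0
    for value in stream:
        if len(window) == window.maxlen:
            total -= window[0]  # subtract the value that is about to be evicted
        window.append(value)
        total += value
        yield total / len(window)


# Hypothetical readings arriving one at a time
readings = [10, 20, 30, 40]
averages = list(rolling_average(readings))
print(averages)
```

Each result is available as soon as its input arrives, which is the essence of handling data at high velocity: the analysis keeps pace with the stream rather than running after the fact.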

All these factors make data challenging for organizations to manage, analyze, and transfer. A new set of technologies has therefore been developed, and is still being developed, to address these challenges.

Cloud computing is one such technology, as it gives organizations access to analysis tools developed specifically for big data.

For all these advantages, some perceive big data as a threat: they argue that it can impinge on people's privacy and that it poses a security risk and a danger of information theft.

These fears are not entirely misplaced. Keeping big data from stepping into such dangerous territory will definitely be a challenge.
