5 Trends in Big Data

In this post we’ve got 5 trends that we think are going to play a major role as big data becomes ‘bigger’ data.

“Big data” is the common term for the exponential growth and availability of data, both structured and unstructured. Calling it ‘big’ is perhaps something of an understatement: IBM estimated that 2.5 exabytes (EB) of data was generated every day in 2012. Don’t know what an exabyte is? It’s 1 billion gigabytes, so 2.5 EB a day works out to roughly 915 EB, or 915,000,000,000 GB, generated over the course of 2012.

If all those zeros aren't sinking in, take a top-of-the-range iPhone 6s with 128GB of storage: holding the roughly 915 billion GB (915 EB) generated in 2012 would take over 7.1 billion iPhones. Remember, too, that this was 2012. Mobile phone penetration is forecast to grow from 61% in 2013 to nearly 70% in 2017, so it’s safe to say these numbers are only going to get bigger. A lot bigger.
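The arithmetic behind these figures is easy to check. The sketch below is only a back-of-the-envelope calculation: it assumes decimal units (1 EB = 1 billion GB), treats 2012 as a 366-day leap year, and uses constant and function names invented for this example. It shows that the 7.1 billion iPhone figure corresponds to the full year's total rather than to a single exabyte.

```python
import math

# Assumptions for this sketch: decimal units and a 366-day year (2012 was a leap year).
GB_PER_EB = 1_000_000_000
EB_PER_DAY = 2.5            # IBM estimate quoted in the article
DAYS_IN_2012 = 366
IPHONE_CAPACITY_GB = 128    # top-of-the-range iPhone 6s


def yearly_volume_gb(eb_per_day: float, days: int) -> float:
    """Total data generated over `days` days, in gigabytes."""
    return eb_per_day * days * GB_PER_EB


def devices_needed(total_gb: float, capacity_gb: int) -> int:
    """Number of devices of `capacity_gb` needed to hold `total_gb`."""
    return math.ceil(total_gb / capacity_gb)


total_gb = yearly_volume_gb(EB_PER_DAY, DAYS_IN_2012)
phones_for_year = devices_needed(total_gb, IPHONE_CAPACITY_GB)
phones_for_one_eb = devices_needed(GB_PER_EB, IPHONE_CAPACITY_GB)

print(f"{total_gb:,.0f} GB generated in 2012")      # 915,000,000,000 GB
print(f"{phones_for_year:,} iPhones for the year")  # 7,148,437,500
print(f"{phones_for_one_eb:,} iPhones for 1 EB")    # 7,812,500
```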

So are we equipped for the sheer amount of data that we’re producing? Can businesses use it to their advantage or will it overwhelm them?

Back in 2001, Doug Laney (then at META Group, which Gartner later acquired) was the first to frame big data in terms of “the 3 V’s”, Volume, Velocity and Variety, and the issues they pose to businesses. Today, it is generally agreed that big data has far more than three characteristics, but the 3 V’s remain the core points of emphasis:


  • Volume: As noted above, this one isn’t too difficult to get your head around: the volume of data will keep increasing as the years go by. Thankfully, with storage costs falling, storing all this data is not really the issue. Other problems emerge instead, such as how to determine relevance within large data volumes and how to use analytics to create value from the relevant data.
  • Velocity: As if ever-increasing quantities of data weren’t enough, the data is also arriving at unprecedented speed. Smart labels (RFID tags), sensors and smart metering are driving the need to deal with torrents of data in near-real time. Reacting quickly enough to keep up with this velocity is a challenge for most companies.
  • Variety: Data comes in all kinds of formats: structured, unstructured, email, video, audio and so on. Managing, unifying and governing these different varieties of data is something many organizations struggle with.
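To make the velocity point concrete, here is a minimal sketch of one common way to handle a stream in near-real time: keep a rolling average over a short time window and update it as each reading arrives, rather than storing everything and analyzing it later. The function name, the reading format and the sample smart-meter values are all hypothetical, chosen only for illustration.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, Tuple

Reading = Tuple[float, float]  # (timestamp in seconds, measured value)


def rolling_mean(readings: Iterable[Reading], window_s: float) -> Iterator[Tuple[float, float]]:
    """Yield (timestamp, mean of the values seen in the last `window_s` seconds)."""
    window: Deque[Reading] = deque()
    total = 0.0
    for ts, value in readings:
        window.append((ts, value))
        total += value
        # Evict readings that have fallen out of the time window.
        while window[0][0] <= ts - window_s:
            _, old_value = window.popleft()
            total -= old_value
        yield ts, total / len(window)


# Hypothetical smart-meter readings: (seconds, kilowatts).
meter = [(0, 1.0), (1, 2.0), (2, 3.0), (3, 4.0)]
results = list(rolling_mean(meter, window_s=2))
print(results)  # [(0, 1.0), (1, 1.5), (2, 2.5), (3, 3.5)]
```

Each reading is processed once, in arrival order, and only the current window is kept in memory, which is what lets this style of computation keep pace with a fast stream.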

    Don't Get Left Behind

    Big data gives businesses a window into extremely valuable streams of information, from customer purchasing habits to inventory status, to better support their company and serve their customers. But as everything about data expands, what it can do for the enterprise is expected to expand too. This information is sure to transform business processes over the next few years, so understanding what’s in store is the best way to stay ahead of the curve. In this post we’ve got 5 trends that we think are going to play a major role as big data becomes ‘bigger’ data.

    1.  Involving everyone

    Companies of all sizes are expected to increasingly adopt a data-centric approach as they encourage collaboration among colleagues and interact with customers. You only have to go back a few years to find a time when big data tools were available only to large corporations. Now, however, big data will prompt companies to rethink the way their employees collaborate and use data to identify and quickly adapt to opportunities and challenges.

    2.  The IoT

    Gartner believes that by 2017, more than 20% of customer-facing analytic deployments will provide product tracking information leveraging the Internet of Things. Customers now demand far more information from the companies they deal with, in large part due to the Nexus of Forces (i.e. mobile, social, cloud and information). The IoT is set to spread as quickly as data itself has, and will create a new style of customer-facing analytics: product tracking. This gives businesses a way to strengthen relationships with customers by providing, for example, geospatial and performance information.

    3.  Deep Learning

    Whilst still an evolving methodology, deep learning is a set of machine-learning techniques based on neural networks. The idea is that computers can learn to recognize items of interest in large quantities of unstructured and binary data, and deduce relationships without specific programming instructions. For example, a deep learning algorithm was applied to Wikipedia and learned of its own accord that California and Texas are both states in the U.S.A., without first being modeled to understand the concept of a state or country. In terms of big data, deep learning could be used to identify different types of data, and to provide other cognitive engagement capabilities that could shape the future of advanced analytics.
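The idea of deducing relationships from raw text can be illustrated with a toy example. To be clear, the sketch below is not deep learning and is not the method used in the Wikipedia experiment: it swaps in plain co-occurrence counting over a four-sentence corpus made up for this example. It only shows the underlying intuition that words used in similar contexts end up with similar representations, without anyone defining what a "state" is.

```python
import math
from collections import Counter, defaultdict

# Made-up toy corpus, for illustration only.
corpus = [
    "california is a state in the usa",
    "texas is a state in the usa",
    "paris is a city in france",
    "berlin is a city in germany",
]


def context_vectors(sentences):
    """Map each word to counts of the other words that share a sentence with it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for i, word in enumerate(words):
            for j, other in enumerate(words):
                if i != j:
                    vectors[word][other] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


vectors = context_vectors(corpus)
sim_states = cosine(vectors["california"], vectors["texas"])
sim_mixed = cosine(vectors["california"], vectors["paris"])

# "california" and "texas" appear in identical contexts, so they score
# higher than "california" and "paris", which share only generic words.
print(f"california vs texas: {sim_states:.2f}")
print(f"california vs paris: {sim_mixed:.2f}")
```

A real deep learning system learns far richer representations from vastly more text, but the grouping emerges in the same spirit: from patterns in the data rather than from hand-written rules.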

    4.  Data agility

    The slow and rigid processes of legacy databases and data warehouses are proving too expensive and time-consuming for many business needs. As a result, data agility has come to the forefront as a driver behind the development of big data technologies. Organizations are beginning to shift their focus from simply capturing and managing data to actively using it. Data agility lets the processing and analysis of data feed directly into operations, allowing the company to respond and adjust to changes in customer preferences, market conditions, competitive actions and the status of operations.

    5.  Self-service

    Advances in big data tools and services mean IT can stop being a bottleneck between business users, data analysts and the data they need. Embracing self-service big data can empower developers, data scientists and data analysts to conduct data exploration directly. Advanced organizations will move toward binding data at execution time and away from a central structure, as the speed of self-service boosts their ability to leverage new data sources.

    Published at DZone with permission of Josh Anderson, DZone MVB. See the original article here.

    Opinions expressed by DZone contributors are their own.
