5 Myths About Big Data
Myths always tend to build up around anything that's big and transformational—and big data is no exception. Let's consider the top 5 myths and how they can be debunked.
Ever since the term "big data" was coined to describe the mix of structured and unstructured data, many businesses have been exploiting it to extract useful information. But a lot of mystery and hype has also grown up around it. While big data has proved beneficial to many businesses, has it really lived up to the hype it has generated? As inevitably happens with transformational innovations, some myths have emerged about big data. Let's consider the top five myths and how they can be debunked.
Myth #1: Big Data Is Really Big
How big is big? It's really subjective. Big data is generated in real time from all sorts of sources and sensors, including pictures, audio, and video, so it is often rather big. Compared with the structured data that organizations previously maintained in spreadsheets and relational databases, it can be many times larger. But more than big, it is diverse and unstructured. That’s what makes it so challenging to capture, store, process, and derive benefits from. So "big data" may not always be large in volume, but it is diverse and complex (and very useful, too).
Myth #2: Big Data Is Always Good Data
Yes, big data is very useful and has started to give several businesses a competitive advantage. But is all of the data in "big data" good? The answer is no. Big data contains many errors, as well as missing values. Consider, for example, teenagers who post feedback in slang: a machine may not be able to tell whether the feedback is positive or negative. Users may also tag pictures and videos incorrectly. Such data is likely to mislead and introduce errors, so an intelligent model may be needed to sift through it before analysis. It is important to include in the analysis only the data sets that appear relevant.
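As a rough illustration of sifting data before analysis, here is a minimal Python sketch. It is not a real sentiment model. The field names (`user_id`, `text`), the `AMBIGUOUS_SLANG` word list, and the `screen_feedback` function are all hypothetical, chosen only to show the idea of separating usable records from ones that are incomplete or too ambiguous to score automatically.

```python
import re

# Hypothetical list of slang terms whose sentiment is unclear out of context.
AMBIGUOUS_SLANG = {"sick", "wicked", "insane", "killer"}


def screen_feedback(records, required_fields=("user_id", "text")):
    """Split feedback records into (clean, review, rejected) lists."""
    clean, review, rejected = [], [], []
    for record in records:
        # Treat absent keys, None, and blank strings as missing data.
        missing = [
            field for field in required_fields
            if record.get(field) is None or not str(record[field]).strip()
        ]
        if missing:
            rejected.append(record)
            continue
        # Flag records containing ambiguous slang for human review.
        words = set(re.findall(r"[a-z']+", record["text"].lower()))
        if words & AMBIGUOUS_SLANG:
            review.append(record)
        else:
            clean.append(record)
    return clean, review, rejected


sample = [
    {"user_id": 1, "text": "Great product, works well"},
    {"user_id": 2, "text": "This phone is sick!!"},
    {"user_id": 3, "text": ""},
    {"text": "No user attached"},
]
clean, review, rejected = screen_feedback(sample)
print(len(clean), len(review), len(rejected))
```

In this sketch only the first record goes straight to analysis. The slang record is set aside for a person (or a better model) to interpret, and the two incomplete records are dropped.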
Myth #3: The Big Data Analyst Is God
Data is already too big, and it's getting bigger by the day thanks to high volume, high velocity, and high variety. A few years from now, a team of analysts won't be able to handle all of the data. We need continual development of tools that let users do their own first level of analysis. Expert analysts will then be needed only for deeper analysis and for providing the bigger picture. With the emergence of analytical tools, the preeminence of the data specialist is already on the wane. A good balance needs to be maintained between human analysts and analytical tools.
Myth #4: Big Data Is Only for Big Businesses
Big businesses can afford to spend a lot more on data analytics processes and tools, and they do just that. Many big businesses have an entire team of data analysts and a large collection of analytical tools. That being said, smaller businesses certainly can (and should) get started with inexpensive tools and cloud computing services to benefit from big data.
Myth #5: Machine Algorithms Will Replace Human Analysts
Good data analysts have domain experience and well-developed analytical skills, but they can’t perform every analysis task by themselves without the support of analytical tools and machine algorithms. The opposite claim, that human analysts will be completely replaced by analytical tools or machine algorithms, is equally without basis. There's no doubt that the cost of hiring data analysts is going up, and that tools for analysis are continuing to get better and cheaper. But human insight and judgment are invaluable and complex, and can’t (yet) be fully incorporated into machine algorithms. Further, apart from interpreting data, a data analyst can provide an in-depth explanation and can even recommend corrective actions. Machine algorithms can’t do that yet.
In conclusion, myths always tend to build up around anything that is big and transformational — and big data is no exception. It is up to businesses to use this beast to their competitive advantage. The race is already under way and new businesses are joining it every day.