
When to Use Data Warehousing and Hadoop


How do you know that your data is big? And, if it is big, should you use data warehouses and Hadoop? We examine these questions in this post.


Big data has become a trend, and every organization that has a data warehouse has, at some point, discussed incorporating a big data analytics platform. But the big question is: when is the right time to do so?

Based on personal and professional experience, I'll summarize my point of view in the hope of making things easier for decision makers and helping you consider the right questions before making the actual investment.

  1. Have business users asked for extra streams of data to be processed (e.g., social media), or do they have specific requirements that call for advanced data mining? If not, then stop. Without a solid business case, you can't start your journey.
  2. If your current data warehouse holds less than 10TB of data, then stop. Enterprise data warehouses can easily handle this amount of data.
  3. Who will handle the administration of the big data cluster? You will need to either hire professionals or develop an in-house team to take care of this solution.
  4. Can your current ETL engineers do the job of big data engineers as well? You can't ask them to do two jobs at the same time, so you'll need to train them. And, even then, stabilizing and taming the big data platform will be a large undertaking. Consider it seriously.
  5. Organizational politics. Is your organization ready for that much investment? How will you prove the ROI? Storing large amounts of data and doing nothing with it will do no good for the organization.
  6. Choosing a big data vendor is a big and tedious task. Initially, everything will seem fine. But once per-node licensing, technical challenges, and a lack of business support arrive, you will be in the line of fire.
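
For teams that like to make a checklist like this explicit, the six questions can be written down as a small readiness check. The following is only a rough sketch: the `Readiness` class, its field names, and the `blockers` function are hypothetical names invented for illustration, and the only number taken from the list above is the 10TB threshold from point 2.

```python
from dataclasses import dataclass
from typing import List

# Threshold from point 2 of the checklist; all other names are illustrative.
WAREHOUSE_COMFORT_LIMIT_TB = 10


@dataclass
class Readiness:
    """Hypothetical answers to the six checklist questions."""
    has_business_case: bool       # point 1
    warehouse_size_tb: float      # point 2
    has_cluster_admins: bool      # point 3
    engineers_trained: bool       # point 4
    roi_plan_agreed: bool         # point 5
    vendor_terms_reviewed: bool   # point 6


def blockers(r: Readiness) -> List[str]:
    """Return the checklist points that still argue against investing."""
    found = []
    if not r.has_business_case:
        found.append("1: no business requirement for new data or advanced mining")
    if r.warehouse_size_tb < WAREHOUSE_COMFORT_LIMIT_TB:
        found.append("2: data volume still fits an enterprise data warehouse")
    if not r.has_cluster_admins:
        found.append("3: nobody to administer the cluster")
    if not r.engineers_trained:
        found.append("4: ETL engineers not yet trained as big data engineers")
    if not r.roi_plan_agreed:
        found.append("5: no agreed way to prove ROI")
    if not r.vendor_terms_reviewed:
        found.append("6: vendor licensing and support not reviewed")
    return found


if __name__ == "__main__":
    # Example: a team with a business case but only 4TB of data and no admins.
    team = Readiness(True, 4.0, False, False, True, True)
    for item in blockers(team):
        print(item)
```

An empty result does not mean "buy a cluster"; it only means none of the six points above is an obvious reason to stop.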

In my next post, I will draw on my 12 years of experience to explain how to overcome these challenges with practical examples and scenarios that I have been through.



