When to Use Data Warehousing and Hadoop

How do you know that your data is big? And, if it is big, should you use data warehouses and Hadoop? We examine these questions in this post.

Big data has become a trend, and nearly every organization that has a data warehouse has, at some point, discussed adding a big data analytics platform. The big question is: when is the right time to do it?

Based on my personal and professional experience, I'll summarize my point of view in the hope of making the decision easier and helping decision makers consider the right questions before making the actual investment.

  1. Is there a requirement from business users to process additional streams of data, e.g., social media, or any specific requirement that calls for advanced data mining? If not, stop. Without a solid business case, you can't start your journey.
  2. If your current data warehouse holds less than 10 TB, stop. Enterprise data warehouses can handle that amount of data easily.
  3. Who will handle the administration of the big data cluster? You will need to either hire professionals or develop an in-house team to take care of the solution.
  4. Can your current ETL engineers do the job of big data engineers as well? You can't ask them to do two jobs at the same time, so you'll need to train them. Even then, stabilizing and taming the big data platform will be a large undertaking. Consider it seriously.
  5. Organizational politics. Is your organization ready for that much investment? How will you prove the ROI? Storing large amounts of data and doing nothing with it will do the organization no good.
  6. Choosing a big data vendor is a big and tedious task. Initially, everything will seem fine. But once per-node license costs, technical challenges, and a lack of business support arrive, you will be in the line of fire.
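
For teams that like to keep such checklists next to their planning documents, the six questions above can be sketched as a small script. This is only an illustration, not a formal assessment method: the `Readiness` class, its field names, and the two helper functions are hypothetical names introduced here, and the only number taken from the list is the 10 TB threshold. Items 1 and 2 are treated as hard stops because the list says to stop there; items 3 to 6 are reported as open questions.

```python
from dataclasses import dataclass
from typing import List

# Threshold taken from item 2 of the checklist above.
MIN_WAREHOUSE_TB = 10.0


@dataclass
class Readiness:
    """Hypothetical answers to the six checklist questions."""
    has_business_case: bool        # item 1
    warehouse_size_tb: float       # item 2
    has_cluster_admins: bool       # item 3
    engineers_trained: bool        # item 4
    roi_plan_approved: bool        # item 5
    vendor_costs_reviewed: bool    # item 6


def hard_stops(r: Readiness) -> List[str]:
    """Items 1 and 2: the checklist says to stop if either applies."""
    stops = []
    if not r.has_business_case:
        stops.append("no business case")
    if r.warehouse_size_tb < MIN_WAREHOUSE_TB:
        stops.append("warehouse under 10 TB")
    return stops


def open_questions(r: Readiness) -> List[str]:
    """Items 3 to 6: not hard stops, but each needs an answer before investing."""
    questions = []
    if not r.has_cluster_admins:
        questions.append("cluster administration")
    if not r.engineers_trained:
        questions.append("engineer training")
    if not r.roi_plan_approved:
        questions.append("budget and ROI")
    if not r.vendor_costs_reviewed:
        questions.append("vendor and licensing")
    return questions


if __name__ == "__main__":
    # Example values are made up for illustration.
    example = Readiness(True, 4.0, False, False, True, True)
    print("Stop:", hard_stops(example))
    print("Open:", open_questions(example))
```

An empty result from both functions does not mean the investment is justified; it only means none of the questions in the list is still unanswered.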

In my next post, I will draw on my 12 years of experience to explain how to overcome these challenges with practical examples and scenarios that I have been through.

Topics:
big data adoption, big data, data warehouse, apache hadoop

Opinions expressed by DZone contributors are their own.
