Over a million developers have joined DZone.

Here's How to Build an Optimal Hadoop Cluster

· Big Data Zone

Learn how you can maximize big data in the cloud with Apache Hadoop. Download this eBook now. Brought to you in partnership with Hortonworks.

If you're ringing in the New Year by building a Hadoop cluster, then you might want to take a look at Atlantbh's detailed tutorial:

Amount of data stored in database/files is growing every day, using this fact there become a need to build cheaper, mainatenable and scalable environments capable of storing  big amounts of data („Big Data“). Conventional RDBMS systems became too expensive and not scalable based on today’s needs, so it is time to use/develop new techinques that will be able to satisfy our needs.
One of the technologies that lead in these directions is Cloud computing. There are different implementation of Cloud computing but we selected Hadoop – MapReduce framework with Apache licence based on Google Map Reduce framework.
In this document I will try to explain how to build scalable Hadoop cluster where it is possible to store, index, search and maintain practically unlimited ammounts of data.
This article will cover installation and configuration steps divided into these sections:
  • Network architecture
  • Operating System
  • Hardware requirements
  • Hadoop software installation/setup

You can read the complete tutorial at Atlantbh's blog.

Hortonworks DataFlow is an integrated platform that makes data ingestion fast, easy, and secure. Download the white paper now.  Brought to you in partnership with Hortonworks

Topics:

Opinions expressed by DZone contributors are their own.

The best of DZone straight to your inbox.

SEE AN EXAMPLE
Please provide a valid email address.

Thanks for subscribing!

Awesome! Check your inbox to verify your email so you can start receiving the latest in tech news and resources.
Subscribe

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}