If you're ringing in the New Year by building a Hadoop cluster, then you might want to take a look at Atlantbh's detailed tutorial:
Amount of data stored in database/files is growing every day, using this fact there become a need to build cheaper, mainatenable and scalable environments capable of storing big amounts of data („Big Data“). Conventional RDBMS systems became too expensive and not scalable based on today’s needs, so it is time to use/develop new techinques that will be able to satisfy our needs.
One of the technologies that lead in these directions is Cloud computing. There are different implementation of Cloud computing but we selected Hadoop – MapReduce framework with Apache licence based on Google Map Reduce framework.
In this document I will try to explain how to build scalable Hadoop cluster where it is possible to store, index, search and maintain practically unlimited ammounts of data.
This article will cover installation and configuration steps divided into these sections:
- Network architecture
- Operating System
- Hardware requirements
- Hadoop software installation/setup
You can read the complete tutorial at Atlantbh's blog.