Apache Hadoop is a Java framework for large-scale distributed batch processing infrastructure which runs on commodity hardware. The biggest advantage is the ability to scale to hundreds or thousands of computers. Hadoop is designed to efficiently distribute and handle large amounts of work across a set of machines. This talk introduces Hadoop along with MapReduce and HDFS. It looks at possible scenarios where Hadoop fits as a robust solution and will include a case study from a project, where Hadoop is used for bulk inserts and large-scale data analytics.
Nov 28, 12