The first edition of my book went to press in November 2012, just over a year ago! That’s not long, but in Hadoop years it’s a generation, and there have been many exciting developments in Hadoop and its ecosystem, especially YARN and its promise of a general-purpose distributed platform that can support computing models beyond MapReduce.
I’m excited to announce that I’ve started work on the second edition of the book, which will bring the existing coverage up to date and add new chapters covering topics such as:
- An overview of YARN and how it works
- How MapReduce 2 works as a YARN application
- Recipes for writing your own YARN applications
- Pulling data out of Kafka into HDFS
- Running Storm on YARN and using it to perform aggregations
- Using Spark for in-memory, iterative data processing
The book is currently in MEAP, which is Manning’s early access program. The benefit of this program is that you get new content as it’s being written, and at the end you’ll get the full production-polished version of the book.
I welcome any suggestions or ideas for improving the book; please post them at the forum.