Make sure you didn't miss anything with this list of the Best of the Week in the Big Data Zone (Dec. 6 to Dec. 12). Here they are, in order of popularity:
In this article, you'll find a top-100 list of the most popular Java libraries, based on 10,000 GitHub projects and an analysis of the top trends in Java. Like the author, you may be surprised by some of the results.
This presentation from Hilary Mason at devs love bacon is an introduction to machine learning for those who have no prior experience with it. Take a look if you're interested in a quick, fun overview to help you get started.
In previous posts, the author mentioned wanting to keep using Lucene, building on existing knowledge and experience, while scaling reliably and without too much pain. Elasticsearch turned out to be a perfect fit, and in this article, you'll learn why.
Recently, Yelp released a sample dataset from the greater Phoenix metropolitan area that includes around 11,000 businesses and 8,000 check-in sets. The author is interested in finding out whether businesses can be visually clustered by category based on their check-in data.
This recent article discusses how to debug Hive (Hadoop) through an anecdote about a customer's struggling Hive job. According to the author, working with Hadoop has its downsides, and it sometimes offers little information about what has gone wrong.