The Best of the Week (Dec. 6): Big Data Zone

Make sure you didn't miss anything with this list of the Best of the Week in the Big Data Zone (Dec. 6 to Dec. 12). Here they are, in order of popularity:

1. GitHub’s 10,000 Most Popular Java Projects: Here are the Top Libraries They Use

In this article, you'll find a top 100 list of the most popular Java libraries, based on 10,000 GitHub projects and an analysis of the top trends in Java. Like the author, you may be surprised by some of the results.

2. Introduction to Machine Learning: Everything You Need to Know

This presentation from Hilary Mason at Devs Love Bacon is an introduction to machine learning for those with no prior experience. Take a look if you're interested in a quick, fun overview to help you get started.

3. Why Elasticsearch? - Refactoring Story, Part 3

In previous posts, the author mentioned that he wanted to keep using Lucene, building on existing knowledge and experience, while scaling reliably and without too much pain. Elasticsearch turned out to be a perfect fit, and in this article, you'll learn why.

4. Yelp Graph: Business Clustering Based on Check-In Data

Recently, Yelp made available a sample dataset from the greater Phoenix metropolitan area, including around 11,000 businesses and 8,000 check-in sets. This article explores whether it is possible to visually cluster businesses by category based on their check-in data.

5. When Hadoop Gets Stuck: Debugging Hive

This recent article discusses how to debug Hive (Hadoop) through an anecdote about a customer's struggling Hive job. According to the author, there are downsides to working with Hadoop, and sometimes it offers little information about what has gone wrong.
