The Best of the Week (Dec. 13): Big Data Zone


Make sure you didn't miss anything with this list of the Best of the Week in the Big Data Zone (Dec. 13 to Dec. 19). Here they are, in order of popularity:

1. Handling Big Data with HBase Part 1: Introduction

[This is part 1 of a series. Check out part 2, part 3, part 4, and part 5 as well.]

HBase is a database that provides real-time, random read and write access to tables meant to store billions of rows and millions of columns. It is designed to run on a cluster of commodity servers and to automatically scale as more servers are added, while retaining the same performance.

2. Apache Lucene: Fast Range Faceting Using Segment Trees and the Java ASM Library

Lucene's facet module recently added support for dynamic range faceting, which shows how many hits match each of a dynamic set of ranges. In this article, you'll find segment tree alternatives to the O(N) linear search generally used to find range matches.
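To make the contrast concrete, here is a minimal sketch in Python. It is not Lucene's code and not the segment tree from the linked article. It is a simplified relative of that idea: the ranges are split into non-overlapping "elementary intervals", each value is placed with a binary search, and the per-interval counts are rolled up into per-range counts at the end. The function names, the inclusive integer `(lo, hi)` range format, and the assumption that `lo <= hi` are all illustrative choices.

```python
from bisect import bisect_right
from itertools import accumulate


def count_linear(ranges, values):
    """Baseline: test every value against every range, O(N) per value."""
    counts = [0] * len(ranges)
    for v in values:
        for i, (lo, hi) in enumerate(ranges):
            if lo <= v <= hi:
                counts[i] += 1
    return counts


def count_elementary(ranges, values):
    """Binary-search each value into an elementary interval, then roll up."""
    # Every range [lo, hi] contributes the half-open boundaries lo and hi + 1.
    bounds = sorted({b for lo, hi in ranges for b in (lo, hi + 1)})
    # Elementary interval i covers [bounds[i], bounds[i + 1]).
    leaf_counts = [0] * (len(bounds) - 1)
    for v in values:
        idx = bisect_right(bounds, v) - 1
        if 0 <= idx < len(leaf_counts):  # skip values outside all ranges
            leaf_counts[idx] += 1
    # Prefix sums let each range sum its elementary intervals in O(1).
    prefix = [0] + list(accumulate(leaf_counts))
    position = {b: i for i, b in enumerate(bounds)}
    return [prefix[position[hi + 1]] - prefix[position[lo]] for lo, hi in ranges]


ranges = [(0, 10), (5, 20), (15, 15)]
values = [3, 5, 10, 15, 21, -1]
print(count_linear(ranges, values))
print(count_elementary(ranges, values))
```

Both functions return the same counts. The second does O(log N) work per value instead of O(N), at the cost of a one-time sort of the range boundaries.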

3. Hortonworks vs. Cloudera: Hadoop-er Than Thou?

This article looks at the recent mud-slinging (if you can call it that) going on between Hortonworks and Cloudera. It's got to be good news for Hadoop, at least, and it highlights the widespread influence of the open-source Big Data framework.

4. Data News: Bayesian Statistics in Python, and More

This installment of Arthur Charpentier's regular collection of data science-related links includes Bayesian statistics in Python, contagious diseases in the United States from 1888 (in R), a "24 days of R" advent calendar, and more.

5. Data News: "Programming with Big Data in R," and More

This installment of Arthur Charpentier's data science-related links includes the "Programming with Big Data in R" project, an analysis of matches and mismatches in picture recognition software, and a free ebook on data mining applications with R.

