The Best of the Week (Dec. 13): Big Data Zone

Make sure you didn't miss anything with this list of the Best of the Week in the Big Data Zone (Dec. 13 to Dec. 19). Here they are, in order of popularity:

1. Handling Big Data with HBase Part 1: Introduction

[This is part 1 of a series. Check out part 2, part 3, part 4, and part 5 as well.]

HBase is a database that provides real-time, random read and write access to tables meant to store billions of rows and millions of columns. It is designed to run on a cluster of commodity servers and to automatically scale as more servers are added, while retaining the same performance.

2. Apache Lucene: Fast Range Faceting Using Segment Trees and the Java ASM Library

Lucene's facet module recently added support for dynamic range faceting, which shows how many hits match each of a dynamic set of ranges. This article explores segment-tree-based alternatives to the O(N) linear search generally used to find range matches.

3. Hortonworks vs. Cloudera: Hadoop-er Than Thou?

This article looks at the recent mud-slinging (if you can call it that) going on between Hortonworks and Cloudera. It's got to be good news for Hadoop, at least, and it highlights the widespread influence of the open-source Big Data framework.

4. Data News: Bayesian Statistics in Python, and More

This installment of Arthur Charpentier's regular collection of data science-related links includes Bayesian statistics in Python, contagious diseases in the United States from 1888 (in R), a "24 days of R" advent calendar, and more.

5. Data News: "Programming with Big Data in R," and More

This installment of Arthur Charpentier's data science-related links includes the "Programming with Big Data in R" project, an analysis of matches and mismatches in picture recognition software, and a free ebook on data mining applications with R.

Opinions expressed by DZone contributors are their own.
