The Best of the Week (Feb. 28): Big Data Zone

Make sure you didn't miss anything with this list of the Best of the Week in the Big Data Zone (Feb. 28 to Mar. 6). Here they are, in order of popularity:

1. Apache Hadoop 2.3.0 Released

This week, Apache Hadoop 2.3.0 was released. It includes many bug fixes and small changes, all listed in Apache's release notes, but there are also some bigger changes, such as in-memory caching for HDFS and support for a heterogeneous storage hierarchy in HDFS.

2. Big Data: On the Precipice of a Collapse

Before anyone freaks out, the author's talking about a technology collapse, not a market collapse or a steep downhill slope on a hype curve. Market demands are pushing our systems to ingest increasing amounts of data in a shorter time, while also making that data available to an increasing variety of queries.

3. Hadoop in Practice, Second Edition

The author has started work on the second edition of his book, which will bring existing coverage up to date and add new chapters covering things like YARN, running Storm on YARN, pulling data out of Kafka into HDFS, using Spark for in-memory, iterative data processing, and more.

4. Python 101: Reading and Writing CSV Files

Python has a vast library of modules that are included with its distribution. The csv module gives the Python programmer the ability to parse CSV (Comma Separated Values) files.
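As a rough illustration of what the csv module offers, here is a minimal sketch that writes a few rows and reads them back. It assumes Python 3 and uses an in-memory buffer instead of a file on disk; the helper names `write_rows` and `read_rows` and the sample data are made up for this example and do not come from the linked article.

```python
import csv
import io


def write_rows(rows, fieldnames):
    """Serialize a list of dicts to CSV text (hypothetical helper)."""
    # newline="" leaves line endings to the csv module, as its docs recommend.
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    reader = csv.DictReader(io.StringIO(text, newline=""))
    return [dict(row) for row in reader]


# Sample data invented for the example; note the comma inside one field.
rows = [
    {"name": "Hadoop", "version": "2.3.0"},
    {"name": "Oozie, patched", "version": "4.0.0"},
]
text = write_rows(rows, fieldnames=["name", "version"])
parsed = read_rows(text)
```

With the default dialect, the writer quotes the field that contains a comma, so the reader can recover it intact. To work with a real file, the same reader and writer classes can wrap a file object opened with `newline=""`.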

5. Using Oozie 4.0.0 with Hadoop 2.2

The current version of Oozie (4.0.0) doesn’t build correctly when you try to target Hadoop 2.2. The Oozie team have a fix going into release 4.0.1 (see OOZIE-1551), but until then you can hack the Maven files to get it working with 4.0.0.


Opinions expressed by DZone contributors are their own.