
The Best of the Week (Nov. 15): Big Data Zone



Make sure you didn't miss anything with this list of the Best of the Week in the Big Data Zone (Nov. 15 to Nov. 21). Here they are, in order of popularity:

1. Estimating Age from First Name, Part 1

[Be sure to check out part 2 as well]

After reading a post with lists of the trendiest names in US history, the author decided to compile the lists using R. In this post, the author discusses building a dataframe, as well as a function to query the dataframe.

2. FastR: An R Virtual Machine Written in Java

Those of you who work with R (or Java) might be interested in FastR, an R virtual machine written in Java. R is not the fastest or most efficient language out there, and FastR aims to improve on it in a number of ways.

3. Data News: "The Hidden Technology That Makes Twitter Huge," and More

This installment of Arthur Charpentier's regular collection of data science-related links includes analyzing baseball data with R, a profile of a sword-swallowing statistician, and the technology behind Twitter that creates a massive network of data.

4. A New Tool for Analyzing Hadoop Data in Excel

This new tool seems to center on an easy-to-use interface for analyzing Hadoop data, probably aiming (among other things) to be more accessible to employees who are less familiar with Hadoop. Users manipulate data in Excel, and those operations are then scaled to the Hadoop dataset.

5. Recommendation Engine, Part 2: Diverse Recommender

The recommendations produced in the author's previous post on recommendation systems suffer from a lack of diversity. For example, a list may contain the same book as a soft cover, hard cover, and Kindle version. Because interests are diverse, a better recommendation list should contain items that cover a broad spectrum of the user's interests.
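As a rough illustration of the duplicate-edition problem described above, here is a minimal Python sketch. It is not the linked author's method: the `Rec` type, its fields, the `diversify` function, and the sample scores are all hypothetical. It only collapses multiple editions of the same title into the best-scoring one; covering a broad spectrum of interests would need more than this.

```python
from typing import NamedTuple


class Rec(NamedTuple):
    """A hypothetical scored recommendation (fields are assumptions)."""
    title: str
    edition: str
    score: float


def diversify(recs, k):
    """Keep the best-scoring edition of each title, then return the top k."""
    best = {}
    for rec in recs:
        current = best.get(rec.title)
        if current is None or rec.score > current.score:
            best[rec.title] = rec
    # Rank the surviving editions by score, highest first.
    return sorted(best.values(), key=lambda r: r.score, reverse=True)[:k]


# Made-up sample data: three editions of one book crowd out the second title.
recs = [
    Rec("Book A", "hard cover", 0.95),
    Rec("Book A", "soft cover", 0.94),
    Rec("Book A", "Kindle", 0.93),
    Rec("Book B", "soft cover", 0.80),
]

top = diversify(recs, 2)
print([(r.title, r.edition) for r in top])
```

Without the deduplication step, the top two slots in this sample would both go to editions of the same book.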



Opinions expressed by DZone contributors are their own.
