The Best of the Week (Feb. 7): Big Data Zone

Make sure you didn't miss anything with this list of the Best of the Week in the Big Data Zone (Feb. 7 to Feb. 13). Here they are, in order of popularity:

1. Quick FAILs in Code Questions

Sometimes it takes very little time to know that a candidate is going to be pretty horrible. As you can probably guess, the sort of questions we ask tend to be “find me this data in this sort of file.” Probably the fastest indication is when they send me projects like these.

2. Data News: Bayesian Statistics Made Simple at PyCon 2014, and More

This installment of Arthur Charpentier's regular collection of data science-related links includes a discussion of Bayesian statistics, "Getting rid of the Euler equation," a visualization of music style popularity since 1960, and much more.

3. Exploring Data With RapidMiner by Andrew Chisholm

... before you start working on any serious project, especially if it includes millions of records, read this book. It includes tons of “tips and tricks” and “deep dives” to help you get the most out of RapidMiner, and it offers plenty of practical advice on data exploration.

4. Python Data Visualization Cookbook

As with all cookbooks, reading the Python Data Visualization Cookbook won't make you an expert in the field, but it will give you a clear picture of what Python can do for data visualization, which in my personal experience is the most important thing.

5. Build Your Own Custom Lucene Query and Scorer

Every now and then we’ll come across a search problem that can’t be solved with plain Solr relevancy. This usually means a customer knows exactly how documents should be scored. For those specialized cases we prescribe a little out-patient surgery to your Solr install: building your own Lucene Query.
