Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

The Best of the Week (Oct 24): Big Data Zone

DZone's Guide to

The Best of the Week (Oct 24): Big Data Zone

· Big Data Zone ·
Free Resource

Hortonworks Sandbox for HDP and HDF is your chance to get started on learning, developing, testing and trying out new features. Each download comes preconfigured with interactive tutorials, sample data and developments from the Apache community.

Make sure you didn't miss anything with this list of the Best of the Week in the Big Data Zone (October 24 - October 31). Here they are, in order of popularity:

1. Sentiment Analysis Beyond Tweets

  • Deep belief networks have made it possible to train computers to predict if a sentence is positive, negative or neutral. Most sentiment analysis captures headlines because tweets can be analysed. However are there business applications beyond social networking analytics? Here are five examples:

2. Getting Hadoop Up and Running on Ubuntu

  • In this post my aim is to get Hadoop up and running on a Ubuntu host using Local (Standalone) Mode and on Pseudo-Distributed Mode.

3. Understanding Information Retrieval by Using Apache Lucene and Tika - Part 3

  • This is a sequal of what was presented in part 1 and part 2 of this tutorial; after indexing and querying we can highlight the results of a search by making use of Highlighter(s).

4. Andrews Curves

  • Andrews curves are a method for visualizing multidimensional data by mapping each observation onto a function. It has been shown the Andrews curves are able to preserve means, distance (up to a constant) and variances. Which means that Andrews curves that are represented by functions close together suggest that the corresponding data points will also be close together.

5. Removing Uncited References in a Tex File (with R)

  • Usually, once you have revised the paper, some references were added, others were dropped. But you need to spend some time to check that all references are actually mentioned in the paper. I wanted to work on that manually this week-end, but @3wen suggested to write a simple R function to scan the tex f file (as well as the aux file actually) to remove uncited references.

Hortonworks Sandbox for HDP and HDF is your chance to get started on learning, developing, testing and trying out new features. Each download comes preconfigured with interactive tutorials, sample data and developments from the Apache community.

Topics:

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}