The Best of the Week (Mar 20): Big Data Zone

Make sure you didn't miss anything with this list of the Best of the Week in the Big Data Zone (March 20 - March 27). Here they are, in order of popularity:

1. Why Isn't Everything Normally Distributed?

  • Why aren’t more phenomena normally distributed? Someone asked me this morning specifically about phenotypes with many genetic inputs. In this article, I will answer this question.

2. Geek Reading March 25, 2015

  • Today we have a few top stories for you. First, Apple announced their acquisition of FoundationDB. I am curious what this purchase is really about, but we will have to wait and see. Both Sides of the Table brings us an interview with Fred Wilson that is very informative. Lastly, Seth Godin talks about how some of the harder things are more worthwhile. It is definitely a good, short read.

3. Machine Learning and Magic

  • When I first heard about a lie detector as a child, I was puzzled. How could a machine detect lies? If it could, why couldn’t you use it to predict the future? For example, you could say “IBM stock will go up tomorrow” and let the machine tell you whether you’re lying. I saw a presentation of a machine learning package the other day. Some of the questions implied that the audience had a magical understanding of machine learning, as if an algorithm could extract answers from data that do not contain the answer.

4. Python: Detecting the Speaker in HIMYM Using Parts of Speech (POS) Tagging

  • Over the last couple of weeks I’ve been experimenting with different classifiers to detect speakers in HIMYM transcripts, and in all my attempts so far the only features I’ve used have been words. This led to classifiers that were overfitted to the training data, so I wanted to generalise them by introducing the parts of speech of the words in sentences, which are more generic.

5. Python: scikit-learn/lda: Extracting Topics from QCon Talk Abstracts

  • Following on from Rik van Bruggen’s blog post on a QCon graph he’s created ahead of this week’s conference, I was curious whether we could extract any interesting relationships between talks based on their abstracts. I therefore wanted to extract topics and connect each talk to the topic that describes it best.
