
The Best of the Week (Mar 29-Apr 5): Big Data Zone


In case you missed them, here are the top posts from the Big Data Zone this week, as chosen by yours truly. This week: machine learning mistaken for magic, why everything isn't normally distributed, why the unified logging layer matters, the nature of extremely small probabilities, and some Python data manipulation.

1. Machine Learning and Magic

When I first heard about a lie detector as a child, I was puzzled. How could a machine detect lies? If it could, why couldn’t you use it to predict the future? For example, you could say “IBM stock will go up tomorrow” and let the machine tell you whether you’re lying. I saw a presentation of a machine learning package the other day. Some of the questions implied that the audience had a magical understanding of machine learning, as if an algorithm could extract answers from data that do not contain the answer.


2. Why Isn't Everything Normally Distributed?

Why aren’t more phenomena normally distributed? Someone asked me this morning specifically about phenotypes with many genetic inputs. In this article, I will answer this question.


3. Why the Unified Logging Layer Matters

The volume of logs produced today is staggering. These logs provide opportunities for analysis to better understand customers and continually improve products. The log collection pipeline, then, becomes a source of valuable data.


4. Extremely Small Probabilities

Probabilities such as the following have no practical value, but it’s interesting to see how you’d compute them anyway. You could find the probability of a man having negative height by typing pnorm(-23.33) into R or scipy.stats.norm.cdf(-23.33) into Python.
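As a rough sketch of the computation mentioned above, the standard normal CDF can also be evaluated with only the Python standard library, using the identity Φ(x) = ½·erfc(−x/√2). The function name `normal_cdf` is an illustrative choice, not something from the original post, and this is an alternative to calling `scipy.stats.norm.cdf` rather than a replacement for it.

```python
import math

def normal_cdf(x):
    """Standard normal CDF via the complementary error function.

    Uses the identity Phi(x) = 0.5 * erfc(-x / sqrt(2)), which stays
    accurate far out in the lower tail where 1 - (something near 1)
    would lose all precision.
    """
    return 0.5 * math.erfc(-x / math.sqrt(2))

# The z-score quoted in the summary above.
p = normal_cdf(-23.33)
print(p)  # a tiny positive number, far below anything of practical use
```

Working with `erfc` directly is what keeps the result from rounding to zero: the tail probability is computed as a small number in its own right instead of as the difference of two values close to one.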


5. Python: Checking Any Value in a List Exists in a Line of Text

I’ve been doing some log file analysis to see what Cypher queries were being run on a Neo4j instance, and I wanted to narrow down the lines I looked at to only those with mutating operations, i.e. those containing the words MERGE, DELETE, SET or CREATE.
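One common way to express this kind of check in Python is the built-in `any()` with a generator expression. The sketch below is an illustration under stated assumptions, not necessarily the approach the linked post takes: the names `MUTATING_KEYWORDS` and `has_mutating_operation` and the sample log lines are all hypothetical.

```python
# Keywords named in the summary above.
MUTATING_KEYWORDS = ["MERGE", "DELETE", "SET", "CREATE"]

def has_mutating_operation(line, keywords=MUTATING_KEYWORDS):
    """Return True if any keyword occurs as a substring of the line."""
    return any(keyword in line for keyword in keywords)

# Hypothetical log lines, invented purely for illustration.
sample_lines = [
    "MATCH (n) RETURN n",
    "MERGE (p:Person {name: 'Ann'})",
    "MATCH (n) DELETE n",
]

# Keep only the lines that contain at least one mutating keyword.
mutating = [line for line in sample_lines if has_mutating_operation(line)]
print(mutating)
```

Note that this is plain substring matching, so a keyword such as SET would also match inside a longer word like RESET. If that matters for a given log, matching on whole words (for example with a regular expression using word boundaries) would be a stricter alternative.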



