Big Data Zone: Best of the Week (Apr. 26-May 3)

In case you missed them, here is a curated list of the best articles from this week's edition of DZone's Big Data Zone. This is a week heavy in R! We have R-related posts about numeric keys in a nested list/dictionary, a Markov chain example using Wikipedia, and solving a locomotive problem in R with posterior probabilities for different priors. Also: growing some trees (in a mathematical sense) and how to choose a Hadoop distro [video].

1. R: Numeric Keys in the Nested List/Dictionary

Last week I described how I’ve been creating fake dictionaries in R using lists and I found myself using the same structure while solving the dice problem in Think Bayes.


2. R: Markov Chain Wikipedia Example

Over the weekend I’ve been reading about Markov Chains and I thought it’d be an interesting exercise for me to translate Wikipedia’s example into R code.


3. How to Choose a Hadoop Distro [Video]

This is my first video post. Rather than writing, I simply turned on Camtasia to see how things would work out. If you are interested in a short primer on how to choose a Hadoop distro, this video is for you!


4. Growing Some Trees

Consider here the dataset used in a previous post, about visualising a classification (with more than 2 features). We can change the options here, such as the minimum number of observations per node. To visualize that classification, use the following code (to get a projection on the first two components).


5. R: Think Bayes Locomotive Problem - Posterior Probabilities for Different Priors

In my continued reading of Think Bayes, the next problem to tackle is the Locomotive problem. The interesting thing about this question is that it initially seems that we don’t have enough information to come up with any sort of answer. However, we can get an estimate if we come up with a prior to work with.
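
To make the role of the prior concrete, here is a minimal sketch in Python (not the R code from the linked post). It assumes the usual framing of the problem: locomotives are numbered 1 to N, one locomotive number is observed, and the prior over N is uniform up to some upper bound. The observed number 60, the upper bounds, and the function names are illustrative assumptions, not values taken from the post.

```python
def locomotive_posterior(observed, upper_bound):
    """Posterior over fleet size N after seeing one locomotive number.

    Assumes a uniform prior on N = 1..upper_bound (an illustrative choice).
    """
    if upper_bound < observed:
        raise ValueError("upper_bound must be at least the observed number")
    # Likelihood of seeing `observed` is 1/N when N >= observed, else 0.
    # The uniform prior is a constant, so it cancels on normalisation.
    weights = {
        n: (1.0 / n if n >= observed else 0.0)
        for n in range(1, upper_bound + 1)
    }
    total = sum(weights.values())
    return {n: w / total for n, w in weights.items()}


def posterior_mean(posterior):
    """Expected fleet size under the posterior distribution."""
    return sum(n * p for n, p in posterior.items())


if __name__ == "__main__":
    # Hypothetical observation (60) and hypothetical prior upper bounds.
    for upper in (500, 1000, 2000):
        post = locomotive_posterior(60, upper)
        print(upper, round(posterior_mean(post), 1))
```

With a single observation, the estimate shifts noticeably as the upper bound of the prior changes, which is why comparing posteriors under different priors is the interesting part of the exercise.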
