Getting Started with Data
A question I regularly get asked is “What materials would you recommend for someone just getting started in a more data-oriented job?” In this blog post I’ll try to answer it with a set of options, both books and websites.
Where Am I At?
I currently work as a conversion optimization specialist. That means I design, run, and analyze feature experiments on websites. The ultimate goal is usually to drive more, or larger, purchases. In working with other analysts, I’ve noticed a set of core skills that, when all are present, make an analyst one of my go-tos. The same skills have led me to some success at what I do.
Without Further Ado

Website: GitHub
 You’re gonna need to learn to code.
 Search for data, or statistics, or anything, and I bet you’ll find sample code.

Book: Think Stats
 Learn a little Python, learn a little stats. A great primer on using the two together.
 You might want to cover the statistics reading outlined below first.

Book: Head First Data Analysis
 Descriptive statistics
 Basic linear regression
 Establishing a “gut” for data

Book: Statistics in Plain English
 Descriptive statistics
 Statistical tests (binomial and t-tests at least)
 Confidence intervals
 Linear regression
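
To make the confidence-interval item concrete, here is a minimal sketch in Python. It is not taken from the book; it assumes the simple normal-approximation (Wald) interval for a proportion, and the function name and the visitor counts are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(successes, trials, confidence=0.95):
    """Normal-approximation (Wald) interval for a binomial proportion.

    Hypothetical helper for illustration; the approximation is reasonable
    for large samples and rates that are not too close to 0 or 1.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    p_hat = successes / trials
    # Two-sided critical value, roughly 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / trials)
    # Clamp to [0, 1] because a proportion cannot fall outside that range.
    return max(0.0, p_hat - margin), min(1.0, p_hat + margin)


# Made-up example: 120 purchases out of 2,400 visitors (a 5% conversion rate).
low, high = proportion_confidence_interval(120, 2400)
print(f"95% CI for the conversion rate: {low:.2%} to {high:.2%}")
```

The point of the exercise is that a 5% observed rate is really a range of plausible rates, and the range narrows as the number of visitors grows.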

Web Article: How Not to Run an A/B Test
 Experiment design
 Dipping your toes into power analysis
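
As a rough illustration of what power analysis looks like in practice, the sketch below estimates how many visitors each variant needs before an experiment starts. It assumes the common normal-approximation formula for comparing two proportions, which may differ from the calculation the article itself uses; the function name and the example rates are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_variant(baseline_rate, expected_rate, alpha=0.05, power=0.8):
    """Approximate visitors needed per variant for a two-sided test
    comparing two conversion rates (normal approximation)."""
    if not (0 < baseline_rate < 1 and 0 < expected_rate < 1):
        raise ValueError("rates must be strictly between 0 and 1")
    if baseline_rate == expected_rate:
        raise ValueError("rates must differ to define an effect size")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance threshold
    z_beta = NormalDist().inv_cdf(power)           # desired power
    pooled = (baseline_rate + expected_rate) / 2
    spread_null = sqrt(2 * pooled * (1 - pooled))
    spread_alt = sqrt(baseline_rate * (1 - baseline_rate)
                      + expected_rate * (1 - expected_rate))
    numerator = (z_alpha * spread_null + z_beta * spread_alt) ** 2
    return ceil(numerator / (expected_rate - baseline_rate) ** 2)


# Made-up example: detecting a lift from a 5% to a 6% conversion rate.
print(sample_size_per_variant(0.05, 0.06))
```

Deciding this number up front, and not stopping the test early when the results happen to look good, is the kind of experiment-design discipline this reading is meant to build.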

Book: The Flaw of Averages
 Why the most prevalent descriptive statistic, the average, can be a terribly misleading golden hammer in search of a nail.
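
A tiny sketch of how an average can mislead, using made-up order values rather than anything from the book: one unusually large purchase drags the mean far away from what a typical customer spends, while the median barely moves.

```python
from statistics import mean, median


def summarize(values):
    """Return the mean and median so the two can be compared side by side."""
    return {"mean": mean(values), "median": median(values)}


# Hypothetical order values in dollars; the last one is a single huge purchase.
order_values = [20, 22, 25, 25, 27, 30, 31, 35, 40, 2500]

summary = summarize(order_values)
print(summary)  # {'mean': 275.5, 'median': 28.5}
```

Nine of the ten orders are $40 or less, yet the mean suggests a typical order of about $275.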

Free Online Class: Probability & Statistics, Carnegie Mellon
 Probability (including Bayes’ theorem)
 Statistics
 Exploratory data analysis

Textbook: A Second Course in Statistics: Regression Analysis (7th Edition)
 In-depth treatment of linear regression.
 Tons of theory, but focused on learning to use statistical software to do the analysis.
 Best read after either taking a stats 101 class or learning more about classic statistical tests and how to use them correctly.

Technology: RStudio
 I have read several books on R, but none of them really helped me much. The best thing has been this program, as it makes it simple to get data into R and viewable so I can focus on analyzing it.
Currently Reading
This is a selection of books that I’m currently reading and learning from, but that I haven’t necessarily gotten results from yet.

Textbook: Statistics: A Bayesian Perspective
 A very approachable introduction to Bayesian statistics. It is far less dense than some of the other material in this list.
 Also starts to get into multiple probability distributions, and is very helpful in visualizing them.

Textbook: Doing Bayesian Data Analysis
 Once you’re done with the previous book, this one builds on it quite nicely.
 Gets down to actually computing more advanced Bayesian statistics problems, including hypothesis tests.
 Also fairly approachable, but I wouldn’t recommend it to an absolute novice.

Textbook: Introduction to Bayesian Statistics
 Very dense. Currently making my way through this bit by bit.
 Nice to use after hitting the first book, and great in parallel with “Doing Bayesian Data Analysis.”
 More mathematically focused, but still starts from the basics, so it’s definitely a book that you can use to slowly build up to a more rigorous exploration of Bayesian Statistics.
(Note: This article and the opinions expressed are solely my own and do not represent those of my employer.)
Published at DZone with permission of Justin Bozonier , DZone MVB. See the original article here.