
The Best Blogs Every Data Analyst Should Follow

In this post, the author gathered some of the best blogs and websites that will prove useful for every data analyst.


When we talk with our customers and our community of data analysts, we always come up with a common list of go-to resources that everybody uses. A few days ago we wrote a short but awesome list of newsletters about Big Data and Data Science that you should be aware of.

In this post, I gathered some of the best blogs and websites that will prove useful for every data analyst. I tried to include blogs and sources that are up to date and do not look "dead". Additions are welcome too.

So let’s get started (in alphabetical order)…

Cross Validated

Cross Validated is part of the Stack Exchange network. It is a Q&A site about statistics, machine learning, data analysis, data mining and data visualization.

Links: Cross Validated

Data Science Central

Data Science Central (DSC) is a thriving community of data scientists and data / big data experts and practitioners. It contains a large number of posts, questions, data sets, training material and more.

Links: Data Science Central


DBMS2

A blog by Curt Monash about data management, BI, and analytic technologies. It has a lot of material on topics such as Amazon and its cloud (for example Amazon Redshift), Cassandra, Kafka and Confluent, and PostgreSQL.

Links: DBMS2

Facebook Data Science Blog

Facebook Data Science Blog is actually a Facebook page, but it is a goldmine because it is the official page of the data science teams working at Facebook.

Links: Facebook Data Science Blog

Open - New York Times

This is an interesting blog. It is about code written by the New York Times development team. They cover everything from their internal projects and products to data analysis, machine learning, and data science.

Links: Open - New York Times

O’Reilly Data Radar

O’Reilly is the one-stop shop for anything from software engineering to data. On their blog, you can find a huge amount of information about data and big data, along with many events, opinions, and offers. O’Reilly Data Radar (with a new website) is a super resource to follow.

Links: O’Reilly Data Radar


Quora

Quora has many topics related to data analysis and data science. There is an active community answering questions and having discussions on them.

Links: Data Science Topic, Data Analysis Topic, Big Data Analysis Topic


R-bloggers

R-bloggers is a content aggregator that collects feeds from blogs writing about R. If you are an R fan, you already know R-bloggers; if not, it is well worth following to stay up to date.

Links: R-bloggers


Reddit

Reddit has some great subreddits about machine learning, data science, and data analysis.

Links: Machine Learning Subreddit, Data Science Subreddit

Simply Statistics

Simply Statistics is a blog run by three biostatistics professors (Rafa Irizarry, Roger Peng, and Jeff Leek). They write about statistics (obviously), data analysis, and more. You may also find posts like "What is software engineering for data science?" or "The relativity of raw data".

Links: Simply Statistics

Statistical Modeling, Causal Inference, and Social Science

In this blog, you are going to find A LOT of practical examples for data analysis, statistics, and modeling. It is updated often, as six people are involved, and it is sponsored by 10+ organizations such as Columbia University, the National Institutes of Health, and the Sloan Foundation.

Links: Statistical Modeling, Causal Inference, and Social Science

The Shape of Data

The Shape of Data is a blog by Jesse Johnson (a former math professor with a research background in low-dimensional geometry/topology). It offers introductions to data analysis with a bit of geometry.

Links: The Shape of Data


Here are some bonus resources that I love. They are great examples of actual applications of data in particular real-world domains.


FiveThirtyEight

A great resource about politics, sports, data analysis, and statistics. It is one of the best data-journalism-driven websites. We will never forget that they gave Captain America 4-to-1 odds against winning the Civil War.

Links: FiveThirtyEight


Freakonomics

A blog from the makers of the books Freakonomics and SuperFreakonomics.

Links: Freakonomics


Freakonometrics

A blog from Arthur Charpentier, a math professor, with statistics-related posts. I like it because of its content and humor.

Links: Freakonometrics

A Few Words about Blendo

When we started working on our blog, we wanted to write stuff that would help the data analyst, the data scientist, or anyone wearing the “data” hat work better with their company's data. We write posts like "How to load MailChimp's data into Tables and DataFrames" or "Access your data in Amazon Redshift and PostgreSQL with Python and R".

Links: Blendo.co


Published at DZone with permission of George Psistakis, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
