Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

Visualization, Modeling, and Surprises

DZone's Guide to

Visualization, Modeling, and Surprises

· Big Data Zone ·
Free Resource

Hortonworks Sandbox for HDP and HDF is your chance to get started on learning, developing, testing and trying out new features. Each download comes preconfigured with interactive tutorials, sample data and developments from the Apache community.

This afternoon Hadley Wickham gave a great talk on data analysis. Here’s a paraphrase of something profound he said.

Visualization can surprise you, but it doesn’t scale well.
Modelling scales well, but it can’t surprise you.

Visualization can show you something in your data that you didn’t expect. But some things are hard to see, and visualization is a slow, human process.

Modeling might tell you something slightly unexpected, but your choice of model restricts what you’re going to find once you’ve fit it.

So you iterate. Visualization suggests a model, and then you use your model to factor out some feature of the data. Then you visualize again.

Hortonworks Community Connection (HCC) is an online collaboration destination for developers, DevOps, customers and partners to get answers to questions, collaborate on technical articles and share code examples from GitHub.  Join the discussion.

Topics:

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}