Exploring Data Science: Educational and Readable
Manning Publications has put together a strong (and free) introductory guide to the R programming language and the versatility it enables.
John Mount and Nina Zumel, co-founders of a San Francisco-based data science consulting firm, show their knowledge of R and data science in their new free book, Exploring Data Science.
Published by Manning Publications as a free informational resource, Exploring Data Science is both simple and detailed, rudimentary and complex. It is a "sampler" drawn from Mount and Zumel's book Practical Data Science with R, as well as from other Manning titles. It presents knowledge that is the subject of advanced university courses or detailed in-company training, but handles that knowledge with an ease not often seen in books about programming languages.
The book has five chapters, covering topics ranging from time series to deep learning and even text mining. The topics are meant to be introductory but leave ample room to delve into code and statistical analysis. Screenshots and snippets of code appear throughout the chapters, and real-world examples make dry mathematical concepts easier to grasp.
It's the difference between saying, without context, that a time series is a collection of days to which you can assign information in a table and then analyze, and looking at weather forecasts alongside little anecdotes about history. The latter approach makes the book more readable and engaging for those new to the subject.
An example of the depth with which Mount and Zumel explore a topic comes in the chapter about deep learning, which covers the basics of artificial intelligence in just two pages. Time (and word space) is spent on the method of mimicking the processes of the human brain. Side-by-side pictures illustrate the basic functions of the brain and the deep learning techniques patterned after them to create intelligent machines.
"Data science is a broad field that touches on aspects of statistics, machine learning, and data engineering," Mount and Zumel write in the introduction. "What the tools, methods, and work look like depend a lot on your problem domain and point of view."
But they add that while the book deals exclusively with the R programming language, they never meant to imply that it is the only language data scientists can use, as if it were king over all the rest.
They write: "Our book, Practical Data Science with R, introduces readers to basic predictive modeling in the R language. But it was never our intent to imply that data scientists can restrict themselves to one problem domain or one implementation language."
This free book may be too rudimentary for those experienced with R, and some of the concepts may be dull to someone who has studied data science. But if you have a friend who has always wondered what you do at your job, or a teenager who is interested in programming languages and their use in data science, this is a worthwhile book.
About the authors:
Nina Zumel has a Ph.D. in robotics from Carnegie Mellon and over 12 years of R&D experience in emergency management research, intelligent search, and online pricing. She co-wrote Practical Data Science with R to address the gap between research and technical practice.
John Mount produces applied research, prototyping, and training in information extraction, algorithms, and data mining for web-scale businesses, hedge funds, and start-ups. He has additional research experience in the biotech industry.
To check it out, download Exploring Data Science for free at the website of Manning Publications.