Evolving From Descriptive to Prescriptive Analytics (Part 2): Acquiring the Right Skills, Faster

With the overwhelming magnitude of learning resources available on the web, it's easy to develop what we call learner's block. Here's where you can begin.

In the first part of this series, we reviewed the essential first step: acquiring leadership support to shift focus from descriptive to prescriptive analytics. The next step is to build the skills your team needs to succeed with that goal.

If you're new to machine learning and want to master it quickly, Coursera, edX, and Udemy offer a total of 319 machine learning courses, and on top of those, there's Udacity and YouTube. It's easy to be overwhelmed by the sheer volume of learning resources available on the web, and difficult to decide where to begin. You can easily waste weeks or months without learning anything significant. We call this confused state learner's block.

When we, a team of software developers, stepped into machine learning for the first time, we had learner's block for the first few months. We then did a few things that moved us out of the slow lane.

Holding the Hand of a Mentor

As the team's lead architect, I started by looking for a mentor in machine learning, which turned out to be the single most important decision in our journey. A mentor has already driven the road and knows where to turn, slow down, or accelerate. We met for 30 minutes each month, and I went to each meeting with prepared questions, sometimes sending them to him in advance. When I had a pressing question, I could also ask via Slack. I then passed everything I learned on to the rest of my team. The whole game of learning had changed for us. We were on an accelerated path to machine learning.

Starting Right: Understanding the Core Concepts

Our mentor advised us to stop wandering among many different learning channels and instead focus on one course to get started. We registered for Andrew Ng's "Machine Learning" on Coursera. Each of us was learning both individually and as a group. But machine learning can't be learned just by watching videos; it needs to be supplemented with hands-on exercises. Every day, we would gather for 30 minutes and watch Andrew's lecture together for 20-25 minutes, followed by discussion. Through group discussion, each member of the group got comfortable with the concepts we had just learned. It took us roughly two months to go through much of the course. Some of us went on to earn the course certificate. In two months, we had a firm grip on the core concepts of machine learning.

Picking up ML Arsenals: Learning Python

Andrew Ng based his course assignments on Octave, which was good for its purpose, but we knew we needed to learn either Python or R, the two most popular languages for industry-grade machine learning. We decided on Python and started with the following two courses:

  1. Google's Python Class: This is the compilation of lessons from a two-day intensive course taught internally at Google. The course offers its lessons in two formats: recorded video lectures or text lessons. We were able to finish this course in a couple of hours. This course covers the fundamentals of Python pretty well.
  2. Intro to Python for Data Science by DataCamp: DataCamp has done a good job of distilling the key concepts of Python for data science into bite-size lessons, interwoven with short programming exercises. DataCamp has simplified programming practice by embedding the notebook and Python shell in the course portal.

Swimming in the Wild: Solving Problems

According to CrowdFlower, 51% of the effort in machine learning projects goes into collecting, labeling, cleaning, and organizing data. Unfortunately, the machine learning datasets found in textbooks or in most online courses are simple and don't offer the chance to practice these data preparation tasks, so they don't help much to see, feel, and deal with real-world data challenges. Our mentor advised us to continue practicing machine learning with different datasets.
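To give a feel for the kind of data preparation that textbook datasets tend to skip, here is a minimal sketch using only the Python standard library. The CSV content, column names, and the `clean_rows` helper are hypothetical and invented for illustration; real datasets will need their own rules for duplicates, labels, and missing values.

```python
import csv
import io
from statistics import median

# Hypothetical raw data: a tiny, deliberately messy CSV invented for illustration.
RAW_CSV = """age,income,label
34,52000,Yes
,48000,no
29,,YES
34,52000,Yes
41,61000, No
"""


def clean_rows(raw_text):
    """Parse the CSV, drop exact duplicates, normalize labels, and fill gaps."""
    rows = list(csv.DictReader(io.StringIO(raw_text)))

    # Drop exact duplicate records while preserving the original order.
    seen, unique = set(), []
    for row in rows:
        key = tuple(row.items())
        if key not in seen:
            seen.add(key)
            unique.append(row)

    # Normalize inconsistent label spellings such as "YES" or " No".
    for row in unique:
        row["label"] = row["label"].strip().lower()

    # Fill missing numeric values with the median of the observed values.
    for column in ("age", "income"):
        observed = [float(r[column]) for r in unique if r[column].strip()]
        fill = median(observed)
        for r in unique:
            r[column] = float(r[column]) if r[column].strip() else fill
    return unique


cleaned = clean_rows(RAW_CSV)
for row in cleaned:
    print(row)
```

Median imputation is only one possible choice here; depending on the dataset, dropping incomplete rows or using a model-based fill may be more appropriate.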

We took his advice and, after finishing our Python courses, started practicing machine learning with datasets from Kaggle and the UCI repository. Each week, we picked one or a few machine learning algorithms and a dataset to try over the following week. We then got together to review each other's models and compare their features and performance. The process gave our whole team the courage, confidence, and competence to work on real machine learning projects.
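A weekly exercise of this kind might look roughly like the sketch below: pick a dataset, hold out part of it, and compare a couple of algorithms on the held-out samples. This is not the code we used; the toy dataset stands in for a Kaggle or UCI dataset, and the two models (a majority-class baseline and a 1-nearest-neighbor classifier) are assumptions chosen because they fit in a few lines of plain Python.

```python
import math
from collections import Counter

# Hypothetical toy data: two well-separated clusters standing in for a real dataset.
data = [((i * 0.1, i * 0.1), 0) for i in range(10)]
data += [((5 + i * 0.1, 5 + i * 0.1), 1) for i in range(10)]

# Hold out every fourth sample for testing; train on the rest.
test = data[::4]
train = [sample for i, sample in enumerate(data) if i % 4 != 0]


def majority_baseline(train_set):
    """Return a model that always predicts the most common training label."""
    top_label = Counter(label for _, label in train_set).most_common(1)[0][0]
    return lambda features: top_label


def one_nearest_neighbor(train_set):
    """Return a model that predicts the label of the closest training point."""
    def predict(features):
        _, label = min(train_set, key=lambda s: math.dist(s[0], features))
        return label
    return predict


def accuracy(model, samples):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(model(x) == y for x, y in samples) / len(samples)


candidates = {
    "majority baseline": majority_baseline,
    "1-nearest-neighbor": one_nearest_neighbor,
}
results = {name: accuracy(build(train), test) for name, build in candidates.items()}
for name, score in results.items():
    print(f"{name}: {score:.2f}")
```

Always including a trivial baseline makes the comparison more honest: a model is only interesting if it clearly beats the naive guess on held-out data.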

We now have the organizational commitment and skills. In our next article, we'll discuss the tooling needed for efficient and scalable machine learning solutions.

big data analytics, data science, big data, descriptive analytics, prescriptive analytics

Published at DZone with permission of
