Selecting the Right Machine Learning Approach
Deciding when and how to use AI in your organization is a daunting task. Options abound: according to venturescanner.com, VCs currently fund a whopping 855 AI companies with nearly $9 billion in investments. And that does not count the large number of established providers and bootstrapped companies. It's enough to make your head spin, leading to analysis paralysis.
But let's face it - with all these alternatives, we are still much better equipped to make such a choice than the AI software we're evaluating. Even though the technology is incredibly advanced, we can apply many of the same tried-and-true methodologies we use for other types of software.
Start with the End in Mind
In an earlier article, Thinking Big Data? Think Bold Questions Instead, I encourage starting with a question rather than a tool when evaluating Big Data opportunities. The same applies in the AI/machine learning space. What is exciting about the age we live in is that we can ask really bold questions. We are no longer as constrained by hardware or software limitations.
Start by spending time clarifying the question you want to answer or the problem you want to solve. Use the "Five Whys" approach (asking Why? five times) to get to the root of the problem. I have found some common threads in my experience:
- Top Line (Revenue): Which are our best/most profitable products, customers, prospects, etc., and what actions should we take to maximize revenue from them? This is an extension of classic market segmentation and Business Intelligence reporting. With newer tools in the Big Data and AI space, we can analyze massive amounts of data and group/predict with a great deal of accuracy and nuance.
- Bottom Line (Costs): What inefficiencies exist in our operations, and how can we optimize to reduce costs? This is also an extension of traditional reporting techniques.
- Customer Experience: What factors drive an optimal/positive customer experience, and what can we do to improve it? In addition to the approach and tools mentioned above, recommendation engines (like those powering Amazon and Netflix) play a big part in this space. Automated assistants for customer service also enter the realm of possibility.
- Knowledge Discovery/Decision Support: What new knowledge and insights can we glean from existing information, and how can we use them to make decisions? This is personally my favorite space and where I've spent most of my career. Decision support tools have been around for a while, but technology advances continue to improve how much analysis the computer can handle, increasingly freeing us to focus on discovery.
- Smart Machines/Software: While the other areas focus on making businesses or consumers better, this area focuses on creating smart machines to tackle specific problems in the world: from navigating the real world to analyzing and reacting to data in real time. Opportunities still exist here even if you are not a hardcore software development company. If you have a business idea in this space, you can always partner with someone who can bring your vision to life.
Don't be surprised if this line of questioning leads you to a non-technical solution. Sometimes the best solution is not to implement software, but to improve the people or process side of the equation.
For example, I was brought in to help a publishing organization evaluate new analytics tools. After digging into the details, I learned that the real problem they faced was "the innovator's dilemma." Any new technology would undermine their existing business model unless they addressed the disruption in their market first. I suggested a few modest technology improvements but encouraged them to focus mainly on addressing the business model problem head on.
You may also find that more traditional business intelligence tools are sufficient, or that what you have is more of a big data scaling problem that does not require artificial intelligence. Remember that success often comes from asking the right question, not picking the shiny new toy.
Identify the Class of Machine Learning
Despite the dizzying array of vendors and algorithms, there are really only a few classes of machine learning approaches. Start by identifying the approach you need to solve your problem; then you can narrow down to the vendors and tools that best support that approach. This may seem obvious, but I've lost count of how many times I've seen companies start with a particular tool (Hadoop, anyone?) before understanding the need or approach.
The most common approaches are:
- Feature Extraction: This approach takes a raw input such as text, images, video, or audio and extracts relevant "features" or patterns that can be used in subsequent machine learning algorithms. It is not typically useful on its own but is an important pre-processing step.
- Clustering: A form of "unsupervised learning," clustering takes raw data or features and groups objects together based on how similar they are. The only real requirement is a way to compare objects, i.e. a measure of what makes them similar or different.
- Classification: A form of "supervised learning," classification takes raw data or features along with user-defined categories and develops rules for placing objects into these categories. The rules can be used to predict categories for new, uncategorized objects. This technique is also helpful for tagging content, e.g. pictures, videos or products.
- Prediction: This approach identifies relationships in existing data to develop rules and make predictions about future events, e.g. will a customer leave ("customer churn") or will a person buy this ("recommendation engine"). Much of the interest and buzz is in prediction, and for good reason: who doesn't want to predict the future?
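To make the first two approaches concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the sample reviews, the four-word vocabulary, and the helper names (extract_features, distance, cluster) are hypothetical, and the clustering routine is a bare-bones k-means rather than what any particular vendor ships.

```python
from collections import Counter
import math

def extract_features(text, vocabulary):
    """Feature extraction: turn raw text into a vector of word counts."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]

def distance(a, b):
    """Euclidean distance: the 'means to compare' two objects."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cluster(points, centers, rounds=10):
    """Tiny k-means: assign each point to its nearest center, then re-center."""
    for _ in range(rounds):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda i: distance(p, centers[i]))
            groups[nearest].append(p)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [
            [sum(dim) / len(g) for dim in zip(*g)] if g else c
            for g, c in zip(groups, centers)
        ]
    return groups

# Hypothetical data: four customer reviews and a hand-picked vocabulary.
vocabulary = ["refund", "broken", "love", "great"]
reviews = [
    "broken on arrival want a refund",
    "refund please it is broken",
    "love it great value",
    "great product love the design",
]

vectors = [extract_features(r, vocabulary) for r in reviews]
# Seed the two clusters with one complaint and one compliment.
groups = cluster(vectors, centers=[vectors[0], vectors[2]])
print(groups)
```

Note that no labels were supplied: the complaints and the compliments end up in separate groups purely because their word counts are similar, which is what makes clustering "unsupervised."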
This may seem like a short list given how many companies are tripping over themselves to get in on the action, but that's about it. Even more advanced solutions like Google's driverless cars use these basic building blocks: feature extraction (reducing the car's 3D surroundings into a series of machine-readable objects), classification (these objects look like cars, those like pedestrians), and prediction (if a light turns red, the car in front of me will stop).
Determine which of these (either individually or in combination) you need to solve your problem, and you are well on your way to a successful machine learning project.
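As a rough illustration of how classification feeds prediction, the sketch below learns from labeled customers and then predicts churn for new ones. The data, the two features (monthly logins and support tickets), and the function names are all hypothetical, and the nearest-centroid rule is just one simple technique chosen for brevity, not a recommendation.

```python
import math

def centroid(rows):
    """Average each feature column across a group of rows."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def train(labeled):
    """Supervised step: learn one centroid per user-defined category."""
    by_label = {}
    for features, label in labeled:
        by_label.setdefault(label, []).append(features)
    return {label: centroid(rows) for label, rows in by_label.items()}

def predict(model, features):
    """Assign a new object to the category with the closest centroid."""
    return min(model, key=lambda label: math.dist(features, model[label]))

# Hypothetical history: [monthly_logins, support_tickets] plus a known outcome.
history = [
    ([20, 0], "stays"),
    ([18, 1], "stays"),
    ([2, 4], "churns"),
    ([1, 5], "churns"),
]

model = train(history)
print(predict(model, [3, 3]))   # few logins, several tickets
print(predict(model, [17, 1]))  # active, few tickets
```

Unlike clustering, this approach needs the categories ("stays", "churns") to be supplied up front; the payoff is that the learned rules can be applied to customers whose outcome is not yet known.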
Select Technology that Matches Your Risk Tolerance
Once you know the types of machine learning algorithms you need, the last step is to evaluate and select technology that meets your specific needs. You might be tempted to go with the most feature-rich and sophisticated approach, but that can lead to a mismatch with your organization's risk tolerance. I've seen large, mature organizations select software from a guy in a garage and smaller, more nimble organizations go with a giant like IBM. In each case, problems crop up before the ink on the contract even dries.
You are better off going with a vendor whose overall strategy, philosophy, and risk tolerance are in the same neighborhood as yours. The space is changing so quickly that a decision based purely on technology is rather short-sighted. You want a partner who will grow and adapt at a similar pace, so that there is no mismatch in expectations. In addition to technology, evaluate the following:
- Company growth strategy
- The leadership team
- Their consulting approach (traditional waterfall, agile, etc.)
- Their technology style (proprietary with heavy R&D, integrator, etc.)
Find those companies that match yours in terms of corporate ethos and you will have found a good partner for embarking on this journey. You can also use this evaluation to intentionally move out of your comfort zone. If you are a large company in need of more innovation, you might select a more dynamic and aggressive vendor for the sole purpose of injecting new thought and energy into a stagnant business. Just be sure you are going in with your eyes open.
Beneath the buzz of machine learning are real opportunities to tackle complex business problems or innovate new products. But with all the noise and bluster in the space, you need to keep a calm head and approach such projects in a rational way: identifying the need, selecting the right approaches, and evaluating vendors thoughtfully and comprehensively. Get these right, and you will be head and shoulders above your competition.
Published at DZone with permission of Matt Coatney, DZone MVB. See the original article here.