Machine learning (ML) and artificial intelligence (AI) are the new black in networking. The most common storyline is that algorithms will drive network behavior, and that they represent another logical opportunity for vendors to move "up-stack."
And while it is true that algorithms will — at least initially — provide a proprietary means of making solutions better, they aren't the only lucrative aspect of a move towards a more self-driving network.
Machine Learning in 3 Sentences
For those who might have only a passive understanding of machine learning, let me provide a very simple working definition:
- ML is basically the idea that systems can learn new behavior without being told explicitly by a programmer what that behavior ought to be.
- The behavior is expressed in terms of models, which are themselves the result of examining data.
- The data scientists you probably see popping up all over LinkedIn are the ones who find ways of expressing the data (and its patterns) via algorithms.
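The three sentences above can be made concrete in a few lines of code. In the sketch below, no behavior is hand-coded by a programmer; the model (a centroid per class) is derived entirely from example data and then used to classify new points. The telemetry values and labels are purely illustrative.

```python
# Minimal sketch of "learning from data": a nearest-centroid classifier.
# The model is derived from labeled examples, not from hand-written rules.

def train(samples):
    """samples: list of (features, label). Returns a model: label -> centroid."""
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, x in enumerate(features):
            acc[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def predict(model, features):
    """Classify a new point by distance to the nearest learned centroid."""
    def dist2(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features))
    return min(model, key=lambda label: dist2(model[label]))

# Hypothetical link telemetry: (utilization, error rate) -> health label
training_data = [
    ((0.2, 0.01), "healthy"), ((0.3, 0.02), "healthy"),
    ((0.9, 0.20), "degraded"), ((0.8, 0.25), "degraded"),
]
model = train(training_data)
print(predict(model, (0.85, 0.22)))  # -> degraded
```

The "learning" here is trivially simple, but the division of labor is the real point: the algorithm is generic, and everything specific comes from the data it is shown.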
What Are the Valuable Bits in ML?
The most obvious answer here is that value will accrue to the algorithms. More simply, whoever can find reason in a sea of data will be able to monetize that reason.
Basically, companies will fall into two major camps: those for whom the algorithms provide competitive differentiation and those for whom ML is really just a tool to make things cheap and cheerful. Depending on which of these most fits your objectives, the actual answer to where the value resides will be different.
Everybody intuitively understands the role of better search and tagging algorithms that improve Google's ability to tune their results, target content, and monetize ads. Most people understand that algorithms will help massive retail make purchase recommendations and tweak pricing to maximize profits. Some people are aware that gaming companies track playing and purchasing behavior, and then use ML to entice players to buy into their in-app purchase schemes.
But not every use case provides a direct link between the business and the algorithms. In fact, for the vast majority of companies and use cases, ML is more likely to be a tool than a core competency.
If ML is relegated to playing a supporting role, this means that it won't be the algorithms that companies must master — rather, algorithms will be procured, most likely as part of broader solutions. And, if done well, the actual algorithms will be analogous to source code: important, but effectively invisible so long as the solution behaves as desired.
Of course, algorithms on their own are not what drive the eventual solution behavior. The models those algorithms produce will be the means by which generalized rules become contextualized.
In fact, in a networking environment, if the goal of ML is to automate workflows as part of adaptive or predictive operations, generalized algorithms are simply building blocks. Workflows are not ubiquitous. They will be hyper-contextual. And this means that generalized building blocks probably represent only 80% of the solution.
So how do you contextualize an algorithm? In ML-speak, you train it. This is where data comes in. If the thing being trained is common and consistent across all or even many environments, then the data can come from many places and be aggregated as part of the networking solution.
But if the behavior is very specifically based on the actual deployment — both the devices and the surrounding infrastructure, applications, and tools — then the generalized algorithm has to be fed very contextualized data.
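One way to picture "feeding a generalized algorithm very contextualized data": in the sketch below, the anomaly rule (flag values far from the mean) is the same generic building block everywhere, but the thresholds come entirely from each deployment's own measurements. The site names and latency figures are hypothetical.

```python
import statistics

# Generic building block: flag values more than k standard deviations
# from the mean. The rule never changes; the learned limits do.
def fit_baseline(samples, k=3.0):
    """Derive site-specific limits from that site's own telemetry."""
    mean = statistics.fmean(samples)
    stdev = statistics.pstdev(samples)
    return (mean - k * stdev, mean + k * stdev)

def is_anomalous(baseline, value):
    low, high = baseline
    return not (low <= value <= high)

# Two hypothetical sites with very different "normal" latency (ms):
site_a = fit_baseline([10, 11, 9, 10, 12, 10, 11])
site_b = fit_baseline([80, 85, 78, 82, 84, 81, 79])

print(is_anomalous(site_a, 82))  # True: far outside site A's learned range
print(is_anomalous(site_b, 82))  # False: perfectly normal for site B
```

The same 82 ms reading is an incident at one site and business as usual at the other — which is exactly why the generalized algorithm is only part of the solution.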
In this scenario, the data is almost as important as the algorithm; getting data that is useful for training the models is hugely important. In fact, companies that don't have a data strategy are going to find the hype cycle around ML and AI particularly brutal. Imagine selling an internal effort to automate everything through this new thing called machine learning, only to find out that it requires a massive rip-and-replace across huge chunks of infrastructure.
Everything Is a Sensor and It Should Be Streaming
Over the past few years, there has been a pretty strong push for streaming data in networking. Efforts around gRPC and message buses (RabbitMQ, ZeroMQ, etc.) have been fairly popular among the DevOps crowd. It turns out that solving data distribution is critical in moving to an event-driven infrastructure.
Much of that same work will translate nicely to a world where ML plays a role. There will need to be ways to collect training data. And that is not going to be a one-time thing. If you do not update models as your infrastructure evolves, you will find that things like automation simply accelerate the rate at which you can shoot yourself in the foot. It's a little like upgrading from a revolver to a Glock.
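The point about continually updating models can be sketched as a sliding-window baseline: each streamed sample is checked against the current model and then folded into the data used to refresh it, so the definition of "normal" tracks the evolving infrastructure. The collector, window size, and latency numbers below are illustrative, not a prescription.

```python
from collections import deque
import statistics

class RollingBaseline:
    """Keep the model trained on the most recent window of telemetry,
    so 'normal' follows the infrastructure as it changes."""

    def __init__(self, window=100, k=3.0):
        self.samples = deque(maxlen=window)  # old data ages out automatically
        self.k = k

    def observe(self, value):
        """Check the value against the current model, then learn from it."""
        anomalous = False
        if len(self.samples) >= 10:  # wait for a minimal training set
            mean = statistics.fmean(self.samples)
            stdev = statistics.pstdev(self.samples) or 1e-9
            anomalous = abs(value - mean) > self.k * stdev
        self.samples.append(value)
        return anomalous

baseline = RollingBaseline(window=50)
# Infrastructure change: latency shifts from ~10 ms to ~30 ms.
for v in [10, 11, 9, 10, 12, 10, 11, 9, 10, 11]:
    baseline.observe(v)
print(baseline.observe(30))   # True: flagged right after the shift
for v in [30, 31, 29, 30, 31] * 10:
    baseline.observe(v)
print(baseline.observe(30))   # False: the refreshed model treats 30 as normal
```

A model frozen at deployment time would keep firing on the new normal forever; a model that retrains on stale data after a bad change would quietly bless the breakage. Either failure mode is the revolver-to-Glock upgrade in action.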
If you are listening to the siren song of machine learning but not considering how you are going to collect and use data, the next few years are going to be fairly disappointing. And if you are clinging to the hope that you can avoid the event-driven interim step in the automation journey, you are likely missing the value of the data in the future state.
While the algorithms are going to be important, they are not going to do the work themselves. People should be planning now for how to contextualize more generalized rule sets.
And the clever companies (both end-users and vendors) will realize that the data, in and of itself, is a thing that has value. This opens up opportunities to monetize in ways that traditional networking has not seen before. Some clever reseller is probably already onto this, imagining how they can deal with lower-priced hardware and still widen their margins by adding real value. Maybe that could be you, too.