3 Scenarios for Machine Learning on Multicloud
Learn about aligning with the principle of data gravity and letting consumption channels dictate where you deploy the ML models that will transform your organization.
More and more cloud computing experts are talking about "multicloud." The term refers to an architecture that spans multiple cloud environments in order to take advantage of different services, different levels of performance, security, or redundancy, or even different cloud vendors. But what sometimes gets lost in these discussions is that "multicloud" is not always "public cloud." In fact, it's often a combination of private and public clouds.
As machine learning (ML) continues to pervade enterprise environments, we need to understand how to make ML practical on multicloud — including those architectures that span the firewall.
Let's look at three possible scenarios.
Scenario 1: Train with On-Prem Data, Deploy on Cloud
Data science teams often need to build and train an ML model on sensitive customer data even though the model itself will be deployed on a public cloud. Data gravity and security concerns mean that the model has to be trained behind the firewall, where the data lives. The model, however, may be invoked by cloud-native applications. To keep the latency of scoring calls low, it should be deployed close to the consuming app: near the edge of the network, outside the firewall.
Scenario 2: Train on Specialized Hardware, Deploy on Systems of Record
Deep learning models, as well as some types of classic ML models, can be accelerated significantly by specialized hardware. For example, a data science team might decide to build and train the model on a PowerAI machine, which couples POWER processors to GPUs through high-speed NVLink connections. The PowerAI machine is designed to significantly speed up the training process, but the model itself may need to be consumed in a system of record, such as an on-premises system.
Scenario 3: Train on Cloud With Public Data, Deploy On-Prem
The third scenario is becoming increasingly common as public data grows in both availability and quality. Imagine a financial firm doing arbitrage on agricultural commodities. The data science team gathers a variety of publicly available data, including weather and climate data, crop yield data, currency data, and more. Because the data is high-volume and non-proprietary, they aggregate it on a public cloud, where they also train their ML model. They then pull the latest version of the model down on-prem and integrate it into a proprietary application that the firm has developed to predict the prices of the commodities they trade.
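The three scenarios above share one pattern: train where the data (or the specialized hardware) lives, and deploy where the consumer lives. As a minimal sketch of that rule, the Python below uses hypothetical names (`Placement`, `plan_placement`) and free-form location labels that are assumptions for illustration, not part of any product API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Placement:
    """Where a model is trained and where it is deployed."""
    train_in: str
    deploy_in: str


def plan_placement(
    data_location: str,
    consumer_location: str,
    accelerator_location: Optional[str] = None,
) -> Placement:
    """Pick training and deployment locations for one model.

    Simplifying assumptions (illustrative only):
    - training follows data gravity, unless specialized hardware is
      named, in which case training happens on that hardware;
    - deployment always follows the consuming application.
    """
    train_in = accelerator_location or data_location
    return Placement(train_in=train_in, deploy_in=consumer_location)


# Scenario 1: sensitive on-prem data, cloud-native consumer.
scenario_1 = plan_placement("on-prem", "public-cloud")

# Scenario 2: specialized training hardware, system-of-record consumer.
scenario_2 = plan_placement("on-prem", "system-of-record", "gpu-cluster")

# Scenario 3: public data aggregated on cloud, on-prem consumer.
scenario_3 = plan_placement("public-cloud", "on-prem")
```

A real decision would also weigh cost, compliance, and network topology; the sketch only captures the data-gravity and consumption-channel rule described in the scenarios.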
Each of these scenarios calls for a fit-for-purpose, multicloud architecture for the flexible training, deployment, and consumption of the machine learning models. IBM takes an enterprise approach by making our Data Science Experience (DSX) platform available both on-prem and in the cloud with intuitive interfaces designed to let users easily move from one to the other. With the same REST APIs, you can save, publish, and consume models across environments — on the mainframe, on a private cloud, or on the public cloud, including on non-IBM public clouds, like AWS and Azure. These two videos demonstrate how easy this is: AWS/Azure.
A Kubernetes-based implementation of the DSX platform gives you the flexibility to run DSX Local on a variety of infrastructure options. For example, you can stand up a multi-node cluster on each of two separate infrastructure vendors, build and train models wherever it's most convenient, and then move your models from one vendor's infrastructure to the other.
In DSX, each deployed model gets an external and internal endpoint. To invoke the model, simply use a REST API call for the endpoint. You can build and train the model on-prem and deploy the model to the cloud, where an external application like a chatbot can consume the model by making a REST API call to the particular endpoint.
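To make the consuming side concrete, here is a minimal sketch of how an application might call a deployed model's endpoint over REST, using only the Python standard library. The endpoint URL, the bearer-token header, and the `fields`/`values` payload shape are assumptions for illustration; the actual request format depends on the deployment and should be taken from the platform's documentation.

```python
import json
import urllib.request


def build_scoring_request(endpoint_url, token, fields, values):
    """Build (but do not send) a POST request to a model endpoint.

    The payload shape and the bearer-token header are hypothetical;
    adjust them to whatever the deployed endpoint expects.
    """
    payload = {"fields": fields, "values": values}
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def score(request, timeout=10.0):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Placeholder URL and token: replace with the model's real external endpoint.
request = build_scoring_request(
    "https://example.invalid/models/churn/score",
    "PLACEHOLDER_TOKEN",
    fields=["age", "tenure_months"],
    values=[[42, 18]],
)
```

Calling `score(request)` would then perform the network call. Keeping request construction separate from sending makes it easy to point the same consumer at an internal or external endpoint by changing only the URL.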
When multicloud flexibility lets you pick and choose the cloud environments that best fit your needs, you can align with the principle of data gravity and let your consumption channels dictate where you deploy the machine learning models that will transform your organization.
Published at DZone with permission of John Thomas, DZone MVB. See the original article here.