DZone
Model as a Service in the Generative AI Era

Learn about a cloud-based service where machine learning or Generative AI models are hosted in the cloud and are easily available for consumption through chat-based APIs.

By Bhala Ranganathan (DZone Core) · Feb. 28, 24 · Analysis

The generative AI space has seen huge advancements in recent times. AI models keep getting better at tasks like text summarization, question answering, and chat. For example, Bing Copilot has seen several improvements by taking advantage of GPT-4 technology, and Google has announced its Gemini and Bard models. However, training and fine-tuning such models requires massive computing infrastructure and is expensive. This is a huge barrier to AI adoption because few players in the market can build such large language models from scratch. Building models or applications on top of an existing foundational (base) model can solve this problem: businesses don’t have to create a foundational model themselves but can take an existing one and either fine-tune it to suit their needs or consume it directly.

MaaS and MaaP

Model as a service (MaaS) refers to a service in which machine learning or generative AI models are hosted in the cloud and made available for consumption through simple chat-based APIs. The ease of use and gentle learning curve of these services have accelerated their adoption. In general, MaaS simplifies model consumption.

Model as a platform (MaaP) differs from MaaS: instead of exposing their model directly to consumers, model providers get access to the underlying infrastructure offered by the cloud provider. In such cases, the model provider typically takes care of building, deploying, and managing its machine learning applications on top of that cloud infrastructure. MaaP empowers organizations to create comprehensive ML solutions.

MaaS Building Blocks

Three parties are involved in this process: the model provider, the model publisher, and the model consumer. The model provider is typically the one who creates the model, which can be open or closed source (e.g., OpenAI, Hugging Face). The model publisher could be a cloud provider that accepts the model from a model provider and makes it available to consumers (e.g., Amazon, Microsoft). Model consumers use the models made available by the model publisher (e.g., chat applications, bots).

To publish a model for consumption as a service, a model provider may need to go through a few fundamental steps, depending on the cloud provider they choose:

  1. Model registry: Model providers may use the cloud provider's registry service to supply the artifacts and metadata associated with the model, such as weights, parameters, .bin files, safetensors files, etc.
  2. Model catalog: A repository of models available for consumption. One may expose the foundational model directly or a model built on top of the foundational model.
  3. Model endpoints: Model providers may specify the compute they want to provision as part of their model deployment, and the cloud service may expose an endpoint that consumers can use to access the model.
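
The three steps above can be sketched as a toy, in-memory flow. This is only an illustration of how the pieces relate, not any cloud provider's actual API: the names `ModelEntry`, `ModelRegistry`, `ModelCatalog`, and `deploy_endpoint`, the `gpu-small` compute label, and the example URL are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ModelEntry:
    """Artifacts and metadata a model provider registers (hypothetical shape)."""
    name: str
    version: str
    artifacts: list                      # e.g. weight files and config files
    base_model: Optional[str] = None     # set when built on a foundational model


@dataclass
class ModelRegistry:
    """Step 1: the provider registers the model with its files and metadata."""
    entries: dict = field(default_factory=dict)

    def register(self, entry: ModelEntry) -> None:
        self.entries[entry.name] = entry


@dataclass
class ModelCatalog:
    """Step 2: the publisher lists registered models for consumers to discover."""
    registry: ModelRegistry
    published: set = field(default_factory=set)

    def publish(self, name: str) -> None:
        if name not in self.registry.entries:
            raise KeyError(f"{name} is not in the registry")
        self.published.add(name)

    def list_models(self) -> list:
        return sorted(self.published)


def deploy_endpoint(catalog: ModelCatalog, name: str, compute: str) -> str:
    """Step 3: provision compute for a published model and return its endpoint."""
    if name not in catalog.published:
        raise KeyError(f"{name} is not published in the catalog")
    # Hypothetical URL scheme; a real service would return its own address.
    return f"https://maas.example.com/{compute}/{name}/generate"


registry = ModelRegistry()
registry.register(
    ModelEntry("facebook/opt-125m", "1.0", ["model.safetensors", "config.json"])
)
catalog = ModelCatalog(registry)
catalog.publish("facebook/opt-125m")
endpoint = deploy_endpoint(catalog, "facebook/opt-125m", compute="gpu-small")
print(catalog.list_models())
print(endpoint)
```

In this sketch, a model cannot be published before it is registered, and an endpoint cannot be created before the model is published, which mirrors the order of the steps in the list.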

MaaS Interface

Typically, MaaS offerings are chat-based interfaces that generate text in response to an input prompt. One can also send sampling parameters such as temperature, repetition penalty, top-k, and max tokens in the request. Below is a hello world example of a request sent to the facebook/opt-125m model hosted on localhost as a service. The request includes a few sampling parameters that the model accepts, and the model responds with generated text as output.

Shell
 
curl http://0.0.0.0:5001/generate -H "Content-Type: application/json" -d '{
    "model": "facebook/opt-125m",
    "prompt":"Hibiscus is a beautiful",
    "max_tokens":20,
    "temperature":0.8,
    "top_p":0.95
  }'


JSON
 
 {"text":["Hibiscus is a beautiful plant.  It will grow and live for years to come."]}
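
The same request can also be issued from application code. The sketch below uses only the Python standard library and assumes the service from the curl example is reachable at `http://localhost:5001/generate` and returns a JSON object with a `text` list, as in the response shown above. The helper names `build_payload`, `parse_response`, and `generate` are illustrative and not part of any library.

```python
import json
import urllib.request

# Assumed address of the locally hosted service from the curl example.
ENDPOINT = "http://localhost:5001/generate"


def build_payload(prompt: str, max_tokens: int = 20,
                  temperature: float = 0.8, top_p: float = 0.95) -> bytes:
    """Serialize the prompt and sampling parameters as a JSON request body."""
    body = {
        "model": "facebook/opt-125m",
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
        "top_p": top_p,
    }
    return json.dumps(body).encode("utf-8")


def parse_response(raw: bytes) -> str:
    """Return the first generated text from a {"text": [...]} response."""
    return json.loads(raw)["text"][0]


def generate(prompt: str, url: str = ENDPOINT) -> str:
    """POST the request to the service and return the generated text."""
    request = urllib.request.Request(
        url,
        data=build_payload(prompt),  # supplying data makes this a POST
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return parse_response(response.read())
```

With the service running, `generate("Hibiscus is a beautiful")` would return a completion of the prompt. Because the request uses a non-zero temperature, the generated text will generally differ from run to run.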

 

Conclusion

In conclusion, Model as a Service (MaaS) offers a convenient and efficient way to use pre-trained machine learning models for specific tasks without the overhead of model development. It lets organizations focus on their core applications instead of building models on their own. Google, Microsoft, and Amazon are notable cloud providers that offer MaaS, and the list of models they support is expected to grow as new model providers emerge. The cloud infrastructure behind the scenes must also scale well to support these models.
