DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Please enter at least three characters to search
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

The software you build is only as secure as the code that powers it. Learn how malicious code creeps into your software supply chain.

Apache Cassandra combines the benefits of major NoSQL databases to support data management needs not covered by traditional RDBMS vendors.

Generative AI has transformed nearly every industry. How can you leverage GenAI to improve your productivity and efficiency?

Modernize your data layer. Learn how to design cloud-native database architectures to meet the evolving demands of AI and GenAI workloads.

Related

  • Implementing Ethical AI: Practical Techniques for Aligning AI Agents With Human Values
  • Foundational Building Blocks for AI Applications
  • Application Task Driven: LLM Evaluation Metrics in Detail
  • Architecting High-Performance Supercomputers for Tomorrow's Challenges

Trending

  • Orchestrating Microservices with Dapr: A Unified Approach
  • How to Ensure Cross-Time Zone Data Integrity and Consistency in Global Data Pipelines
  • Enhancing Business Decision-Making Through Advanced Data Visualization Techniques
  • Endpoint Security Controls: Designing a Secure Endpoint Architecture, Part 2
  1. DZone
  2. Data Engineering
  3. AI/ML
  4. Transforming Generative Tasks to Classification Tasks

Transforming Generative Tasks to Classification Tasks

The cost of large general purpose language models can be alleviated by training smaller task specific classification models which can be more efficient.

By 
Kashyap Puranik user avatar
Kashyap Puranik
·
Pranesh Srinivasan user avatar
Pranesh Srinivasan
·
Sep. 21, 23 · Tutorial
Likes (3)
Comment
Save
Tweet
Share
2.2K Views

Join the DZone community and get the full member experience.

Join For Free

ML systems are becoming increasingly powerful, with applications in a wide range of fields. Additionally, ML and NLP systems are being used to improve the quality of information retrieval systems, making it easier for users to find the information they need. This makes it much easier for us to access and understand information from a variety of sources. For example,

  • Conversational AI can be used to create chatbots that can hold natural conversations with humans.
  • Recommender systems can use NLP to recommend products relevant to a user’s interest.
  • Sentiment analysis can be used to identify the emotional tone of text.

The training of many deep learning models requires a vast amount of computational resources, such as GPUs and TPUs. The cost of inference can also be prohibitive for the application of these models in high-performance situations.

Ways To Improve Performance

There are three primary ways to scale the performance of ML models.

  1. We can simplify the task itself in creative ways.
  2. Next, we can consider simpler, smaller models when possible.
  3. Finally, we can use innovative techniques to reduce computational complexity for some decrease in precision.

Of these, the largest gains are often found by creatively redefining the task at hand. For example, if the task is well-defined and not generative, we can design smaller, more efficient models to achieve similar performance.

While larger models help with generative tasks and also with generalization across tasks (aka transfer learning), many times, generative problems can be cast into classification problems that BERT-style models can handle. In other words, if we can narrow the scope of what we want the model to do, we don't need a large model to do it. A smaller model can be just as accurate, and it will be much faster and cheaper to train and run. However, if we want the model to perform a variety of different tasks, then we may need a larger model. This is because larger models may be able to generalize better (aka transfer learning). In some cases, we can even cast a generative problem into a retrieval problem.

Case Study: Customer Support Templates

For example, in the case of support chat interactions, several interactions can be answered from the customer playbook. In this case, it is easier and cheaper to cast the problem as a retrieval problem of finding the template that best satisfies the user.

Collecting Training Data

The first step is to split the customer service log by turning it into a list of questions and responses. This means that each turn in the conversation is broken down into its individual parts, such as the customer's question and the agent's response. Once this is done, the next step is to match the questions and responses against an existing playbook repository. A playbook repository is a collection of pre-written responses that can be used to answer common customer questions. By matching the questions and responses from the customer service log against the playbook repository, the system can identify which pre-written responses are the most likely to be helpful in resolving the customer's issue. This process can help to save time and improve the efficiency of customer service.

Collecting Training Data

Stage 1: Collecting Training Data Set

Alternatively, one can use LLMs to make this step easier. Using few-shot prompts, one can match against the correct template response for each customer service interaction. LLMs can be used to match customer service interactions to the correct template response by providing the LLM with a few examples of customer service interactions and the corresponding template responses. The LLM will then be able to learn to match new customer service interactions to the correct template responses. This can save customer service representatives a lot of time and effort, as they will no longer have to manually search for the correct template response for each interaction.


Collecting Training Data Set

Collected training data in this manner has several advantages:

  • It is offline, meaning that it can be done at training time without any access to user requests.
  • It is small in size, which means that it can be stored and processed easily. This is important for training machine learning models on devices with limited storage or processing power.
  • It can be used for classification tasks with high accuracy. This is because the data is collected in a controlled environment, which reduces the amount of noise and error in the data.

Training Classification Models

We can train a classification model in three steps:

  1. Set up tasks by exploding retrieved template responses. This means breaking down the template responses into their individual components, such as words, phrases, and entities. When exploding retrieved template responses, you need to be careful to avoid choosing irrelevant templates.
  2. Encode interactions and templates using SentenceBERT, Word2Vec, etc. This means representing the interactions and templates as vectors of numbers. When encoding interactions and templates, you need to select an encoding technique that preserves the semantic meaning.
  3. Use a simple, cheap ML model to predict the best template. The model can even be a simple DNN. When using a simple, cheap DNN to predict the best template, you need to choose a DNN that is appropriate. For example, for complex classification, you might want to use a DNN that has been trained on a large corpus of text.

Training Classification Models

The advantages of this classification model are that it has low training costs and is cheap to serve. It is also very easy to re-train the model and even more explainable than using deeper, larger models.

In fact, the learned model need not be DNN. It can also be a random forest (XGBoost-based classifier) that has even more advantages in execution speed and training ease. The advantages of this classification model are:

  • Low training cost: This model is relatively inexpensive to train compared to other models, such as deep neural networks.
  • Cheap to serve: This model is also relatively inexpensive to serve once it has been trained.
  • Easy to retrain: This model is easy to retrain if new data becomes available.
  • More explainable than deeper larger models: This model is easier to explain than deeper larger models, which can be helpful for understanding how the model works and why it makes the predictions that it does.

In fact, the learned model need not be a DNN. It can also be a random forest (XGBoost-based classifier that is known for its speed and accuracy) that has even more advantages in execution speed and training ease. Each decision tree is trained on a different subset of the data, and the predictions from the individual trees are then combined to make a final prediction. This approach can be very effective for classification tasks, and it can also be very fast to train and serve.

Overall, the advantages of this classification model make it a good choice for a variety of applications. It is relatively inexpensive to train and serve, it is easy to retrain, and it is more explainable than deeper, larger models.

Machine learning DNN (software) Language model

Opinions expressed by DZone contributors are their own.

Related

  • Implementing Ethical AI: Practical Techniques for Aligning AI Agents With Human Values
  • Foundational Building Blocks for AI Applications
  • Application Task Driven: LLM Evaluation Metrics in Detail
  • Architecting High-Performance Supercomputers for Tomorrow's Challenges

Partner Resources

×

Comments
Oops! Something Went Wrong

The likes didn't load as expected. Please refresh the page and try again.

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends:

Likes
There are no likes...yet! 👀
Be the first to like this post!
It looks like you're not logged in.
Sign in to see who liked this post!