Quality Assurance and Testing the ML Model

This article delves into the meaning of quality assurance and looks at what can be tested with ML models.

By Ajitesh Kumar · Aug. 09, 18 · Opinion

This is the first post in a series on quality assurance (QA) and testing practices for data science and Machine Learning (ML) models, which I will publish over the next few months. The goal of this and upcoming posts is to build up a set of tools and a framework that can help you design your testing/QA practices around data science/ML models.

QA Practices For Testing Machine Learning Models

Are you a test engineer who wants to know how you can make a difference in the AI initiative your company is undertaking? Are you a QA manager researching tools and frameworks that can help your team perform QA on Machine Learning models built by data scientists? Are you in a strategic role in your company and looking for QA practices for ML models that your testing center of excellence (COE) can adopt to serve your clients better?

If you answered yes to any of these questions, then keep reading. I will be presenting concepts, tools, and frameworks that will help you achieve some of the objectives mentioned above.

In my experience, ML models are usually developed and tested by the data scientists themselves. This is not a desirable situation. Ideally, a quality assurance team should test ML models from time to time, just as it tests traditional software. The challenge, however, is that ML models are not like traditional software, where the behavior for a given input is predetermined. We will touch upon some of the challenges of testing ML models in later articles.

What Can Be Tested With ML Models?

The following are some of the aspects of a Machine Learning model that need to be tested/quality assured:

  • Quality of data
  • Quality of features
  • Quality of ML algorithms

Quality Assurance of Data Used for Training the Model

One of the most overlooked (or ignored) aspects of building a Machine Learning model is checking whether the data used for training and testing the model is sanitized or whether it comes from an adversarial data set. Adversarial data sets are ones that can be used to skew the results of the model by training it on incorrect data. Such an attack is also called a data poisoning attack.

The role of QA is to put test mechanisms in place to validate that the data used for training is sanitized. In other words, tests need to be performed to identify whether the data has been poisoned, intentionally or unintentionally.

To achieve this, one technique is to have QA/test engineers work with the product management and product consultant teams to do the following:

  • Understand the summary statistics of the data (mean, median, mode, etc.)
  • Understand the data and the relationships within it at a high level
  • Build tests (using scripts) to check the above statistics and relationships
  • Run the tests at regular intervals

The statistics and relationships listed above would need to be tracked at regular intervals and verified with the help of PMs/consultants before every release. We will go into the details in later articles.
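
As a rough illustration of the scripted checks described above, the sketch below compares the summary statistics of a new batch of data with a baseline approved by PMs/consultants. The function names (summarize, check_against_baseline), the 10% tolerance, and the sample values are all assumptions made for this example; they are not taken from any particular tool.

```python
import statistics


def summarize(values):
    """Return the summary statistics tracked for one numeric column."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
    }


def check_against_baseline(values, baseline, rel_tol=0.10):
    """Compare current statistics with an approved baseline.

    Returns a list of (statistic, baseline_value, current_value) tuples for
    every statistic whose relative change exceeds rel_tol.
    """
    current = summarize(values)
    drifted = []
    for name, expected in baseline.items():
        observed = current[name]
        # Fall back to an absolute comparison when the baseline value is 0.
        scale = abs(expected) if expected != 0 else 1.0
        if abs(observed - expected) / scale > rel_tol:
            drifted.append((name, expected, observed))
    return drifted


# Hypothetical data: an approved snapshot of one column and two new batches.
approved = [20.0, 22.0, 19.0, 21.0, 23.0, 20.0, 22.0, 21.0]
baseline = summarize(approved)

clean_batch = [21.0, 20.0, 22.0, 19.0, 23.0, 21.0, 20.0, 22.0]
poisoned_batch = clean_batch + [500.0, 480.0]  # injected extreme values

clean_drift = check_against_baseline(clean_batch, baseline)
poisoned_drift = check_against_baseline(poisoned_batch, baseline)

print("clean batch drift:", clean_drift)
print("poisoned batch drift:", [name for name, _, _ in poisoned_drift])
```

In this example the injected values shift the mean and standard deviation well past the tolerance while the median barely moves, which is one reason to track several statistics rather than a single one. A check like this only flags a change; deciding whether the change is poisoning or a legitimate business shift still needs the PMs/consultants.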

Quality Assurance of Features

Over time, one or more features may cease to be important or become redundant/irrelevant, which in turn affects prediction error rates. A set of QA/testing practices should be in place to proactively evaluate features using feature engineering techniques such as feature selection, dimensionality reduction, etc. We will go into the details in later articles.
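
One simple way to start such an evaluation is a filter-style audit that flags features that never vary and pairs of features that are almost perfectly correlated. This is only one basic form of feature selection, chosen here for illustration; the function names, thresholds, and sample columns below are assumptions, not part of any specific library.

```python
import math
from itertools import combinations


def variance(column):
    """Population variance of a numeric column."""
    mean = sum(column) / len(column)
    return sum((x - mean) ** 2 for x in column) / len(column)


def pearson(a, b):
    """Pearson correlation of two equal-length numeric columns."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm = math.sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    # A zero norm means at least one column is constant; report no correlation.
    return cov / norm if norm else 0.0


def audit_features(features, min_variance=1e-6, max_abs_corr=0.95):
    """Return (near_constant, redundant_pairs) for a dict of name -> column."""
    near_constant = [
        name for name, col in features.items() if variance(col) < min_variance
    ]
    usable = [name for name in features if name not in near_constant]
    redundant_pairs = [
        (a, b)
        for a, b in combinations(usable, 2)
        if abs(pearson(features[a], features[b])) > max_abs_corr
    ]
    return near_constant, redundant_pairs


# Hypothetical feature columns for five records.
features = {
    "price_usd": [10.0, 12.0, 9.0, 15.0, 11.0],
    "price_cents": [1000.0, 1200.0, 900.0, 1500.0, 1100.0],  # rescaled copy
    "country_code": [1.0, 1.0, 1.0, 1.0, 1.0],  # never varies
    "items": [3.0, 1.0, 4.0, 1.0, 5.0],
}

near_constant, redundant_pairs = audit_features(features)
print("near-constant features:", near_constant)
print("redundant pairs:", redundant_pairs)
```

Run at regular intervals, an audit like this gives the QA team a concrete list of features to raise with the data scientists. It does not measure how much a feature contributes to predictions, which needs model-based techniques.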

Quality Assurance of ML Algorithms

Data sets that evolve as a result of business changes or data poisoning attacks could result in increased prediction error rates. As the ML model gets retrained (manually or in an automated manner), increased prediction error rates should trigger a re-evaluation of the ML models, which could lead to the discovery of algorithms that give improved accuracy over the previous ones.

One way to go about testing ML algorithms with new data is the following:

  • Keep all the ML models based on different algorithms handy. I have often seen models built using different algorithms get discarded once and for all after the most accurate model is selected.
  • Retrain all of the models and evaluate their performance.
  • Track the performance of all the models on new data sets at regular intervals.
  • Raise a defect if another model starts giving greater accuracy or performing better than the existing model.

We will go into further details in later articles.
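
The tracking and defect-raising steps above can be sketched as a comparison between the model currently in use and the other candidates. The sketch below assumes that retraining has already happened elsewhere and that each model can be called as a function from an input row to a predicted label; the names, the 0.02 accuracy margin, and the toy threshold "models" are hypothetical stand-ins for real trained models.

```python
def accuracy(model, rows, labels):
    """Fraction of rows for which model(row) equals the expected label."""
    correct = sum(1 for row, label in zip(rows, labels) if model(row) == label)
    return correct / len(labels)


def find_better_models(models, current, rows, labels, margin=0.02):
    """Score every candidate on fresh labeled data.

    Returns (scores, better) where better lists the models whose accuracy
    exceeds that of the current model by more than margin.
    """
    scores = {
        name: accuracy(model, rows, labels) for name, model in models.items()
    }
    better = sorted(
        name
        for name, score in scores.items()
        if name != current and score > scores[current] + margin
    )
    return scores, better


# Stand-ins for retrained models built with different algorithms.
models = {
    "threshold_50": lambda x: int(x > 50),
    "threshold_70": lambda x: int(x > 70),
    "always_zero": lambda x: 0,
}

# Hypothetical new labeled data collected since the last release.
new_rows = [10, 40, 55, 60, 65, 72, 80, 90]
new_labels = [0, 0, 0, 0, 0, 1, 1, 1]

scores, better = find_better_models(models, "threshold_50", new_rows, new_labels)
print("scores:", scores)
if better:
    print("raise a defect: outperformed by", better)
```

The margin is there so that a defect is raised only for a meaningful improvement and not for noise in a small sample. Accuracy is used because the text mentions it; another metric may suit a given model better.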

References

  • Keeping Your Machine Learning Models Up-To-Date
  • An introduction to feature selection

Summary

In this post, you learned about the need for QA practices for data science/ML models and about the different aspects of an ML model that can be tested. Please feel free to share your thoughts or suggestions in the comments section.

Published at DZone with permission of Ajitesh Kumar, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
