Making Machine Learning Accessible for Enterprises: Part 2

Let's take a look at critical areas of machine learning-based solutions, such as model explainability and model governance.

By Ramesh Balakrishnan · Aug. 08, 18 · Opinion
In Part 1 of this series, we discussed the need for automation in data science and the need for speed and scale in data transformation and model building. In this part, we will discuss other critical areas of ML-based solutions:

  • Model Explainability
  • Model Governance (Traceability, Deployment, and Monitoring)

Model Explainability

Simpler machine learning models, such as linear and logistic regression, are highly interpretable but may have limited accuracy. Deep learning models, on the other hand, have time and again produced highly accurate results, but they are considered black boxes because they cannot explain their decisions and actions to human users. With regulations like GDPR, model explainability is quickly becoming one of the biggest challenges for data scientists, legal teams, and enterprises. Explainable AI, commonly referred to as XAI, is becoming one of the most sought-after research areas in machine learning. Predictive accuracy and explainability are frequently subject to a trade-off: higher accuracy may be achieved, but at the cost of lower explainability.

Unlike Kaggle competitions, where complex ensemble models are built to win, enterprises place great importance on model interpretability. A loan default prediction model cannot be used to reject a customer's loan unless it can explain why the loan is being rejected. Explanations are often required both at the model level and at the level of an individual test instance. At the model level, there is a need to explain which features are important and how variation in those features affects the model's decision; variable importance and partial dependence plots are popularly used for this. At the individual test instance level, packages like “lime” help explain how a black box model arrived at a decision.

Figure 1: Screenshot with Variable Importance chart from Infosys Nia Machine Learning

Figure 2: Screenshot of Partial Dependence Plots from Infosys Nia Machine Learning
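The two model-level techniques named above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the `model` function, its coefficients, and the synthetic loan data are hypothetical stand-ins, not anything taken from the Infosys Nia platform. Permutation importance is used here as one common way to compute variable importance; the platform may compute it differently.

```python
import random


def model(row):
    # Hypothetical stand-in for a trained loan-default scorer (an assumption):
    # the score rises with debt ratio, falls with income, and ignores age.
    income, debt_ratio, age = row
    return 0.8 * debt_ratio - 0.00001 * income


def mean_squared_error(rows, targets):
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(rows, targets, column, seed=0):
    """Increase in error after shuffling one column, which breaks its link to the target."""
    rng = random.Random(seed)
    shuffled = [r[column] for r in rows]
    rng.shuffle(shuffled)
    permuted = [r[:column] + (v,) + r[column + 1:] for r, v in zip(rows, shuffled)]
    return mean_squared_error(permuted, targets) - mean_squared_error(rows, targets)


def partial_dependence(rows, column, grid):
    """Average prediction when one column is forced to each grid value in turn."""
    curve = []
    for value in grid:
        forced = [r[:column] + (value,) + r[column + 1:] for r in rows]
        curve.append(sum(model(r) for r in forced) / len(forced))
    return curve


# Synthetic data: (income, debt_ratio, age). Targets come from the model itself,
# so the baseline error is zero and any increase is due to the shuffle alone.
rng = random.Random(42)
rows = [(rng.uniform(20000, 120000), rng.uniform(0.0, 1.0), rng.uniform(21, 70))
        for _ in range(200)]
targets = [model(r) for r in rows]

feature_names = ["income", "debt_ratio", "age"]
importances = {name: permutation_importance(rows, targets, i)
               for i, name in enumerate(feature_names)}
pd_curve = partial_dependence(rows, column=1, grid=[0.0, 0.5, 1.0])

print(importances)
print(pd_curve)
```

A feature the model never uses, such as `age` here, gets an importance of zero, while the partial dependence curve for `debt_ratio` rises steadily, which is the kind of shape a plot like Figure 2 would show.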


Figure 3: Test Point Variable Importance using LIME
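For a single test instance, the idea of asking a black box model "which inputs drove this decision?" can be illustrated with a simple what-if comparison. The sketch below is not the algorithm used by the lime package (which fits a local surrogate model on perturbed samples); it is a much simpler baseline-replacement attribution, and the scorer, feature names, and baseline values are all hypothetical.

```python
import math


def predict_default_probability(applicant):
    # Hypothetical black box scorer (an assumption made for illustration).
    z = (3.0 * applicant["debt_ratio"]
         - 0.00004 * applicant["income"]
         + 0.5 * applicant["late_payments"])
    return 1.0 / (1.0 + math.exp(-z))


def explain_instance(applicant, baseline):
    """For each feature, measure how much the prediction drops when that feature
    alone is replaced by a baseline (for example, portfolio average) value."""
    full = predict_default_probability(applicant)
    contributions = {}
    for name in applicant:
        what_if = dict(applicant)
        what_if[name] = baseline[name]
        contributions[name] = full - predict_default_probability(what_if)
    return full, contributions


applicant = {"debt_ratio": 0.9, "income": 30000, "late_payments": 3}
baseline = {"debt_ratio": 0.4, "income": 60000, "late_payments": 1}

probability, contributions = explain_instance(applicant, baseline)
ranked = sorted(contributions, key=contributions.get, reverse=True)

print(f"predicted default probability: {probability:.3f}")
for name in ranked:
    print(f"{name}: {contributions[name]:+.3f}")
```

The ranked list plays the role of the per-instance chart in Figure 3: it tells a loan officer which inputs pushed this particular applicant's score up the most relative to a typical customer.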

Model Governance (Traceability, Deployment, and Monitoring)

Any machine learning project involves trying multiple hypotheses, data transformation strategies, models, and so on. Machine learning algorithms have dozens of configurable parameters, and whether you work alone or in a team, it is difficult to track which parameters, code, and data went into each experiment to produce a model. Keeping track of where you started and which options were tried is a typical challenge for data scientists on any project. In addition, some industries have compliance requirements that make it essential to track all the activities of a project.

Figure 4: Project Audit feature from Infosys Nia ML platform
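As a rough illustration of what tracking "parameters, code, and data" per experiment can look like, the sketch below appends one record per run to a JSON Lines file. The record layout, the function names, and the `code_version` strings are assumptions made for this example; they do not describe how the Nia platform's audit feature works.

```python
import hashlib
import json
import os
import tempfile
import time


def log_experiment(log_path, params, data_bytes, metrics, code_version):
    """Append one audit record: what was run, on which data, with what result."""
    record = {
        "timestamp": time.time(),
        "params": params,
        # A content hash identifies the exact training data without storing it.
        "data_sha256": hashlib.sha256(data_bytes).hexdigest(),
        "code_version": code_version,
        "metrics": metrics,
    }
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record, sort_keys=True) + "\n")
    return record


def load_experiments(log_path):
    with open(log_path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def best_run(log_path, metric):
    """Return the logged run with the highest value of the given metric."""
    return max(load_experiments(log_path), key=lambda r: r["metrics"][metric])


with tempfile.TemporaryDirectory() as workdir:
    log_path = os.path.join(workdir, "experiments.jsonl")
    data = b"income,debt_ratio,defaulted\n52000,0.31,0\n"  # toy training data

    log_experiment(log_path, {"max_depth": 3}, data, {"auc": 0.81}, "commit-a")
    log_experiment(log_path, {"max_depth": 6}, data, {"auc": 0.86}, "commit-b")

    runs = load_experiments(log_path)
    best = best_run(log_path, "auc")

print(best["params"], best["metrics"])
```

Because the file is append-only and each record carries a data hash and a code version, it is possible to answer later which combination produced a given model, which is the core of the traceability requirement described above.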

Once the model is built, validated, and signed off by all stakeholders, it has to be deployed in production. REST APIs are the preferred method for model scoring because they can be easily integrated with line-of-business applications. It is critical that the entire data pipeline, including all feature engineering and transformations, is packaged along with the model for deployment.
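The point about packaging the data pipeline together with the model can be made concrete with a small sketch. Here the scaling statistics and the model weights travel as one serialized artifact, and a handler function accepts the raw JSON body that a REST endpoint would receive. The class, the handler, the feature names, and all numbers are hypothetical; a real deployment would sit behind a web framework of your choice.

```python
import json
import math


class ScoringBundle:
    """Feature scaling statistics and model weights kept together as one artifact."""

    def __init__(self, means, stds, weights, bias):
        self.means, self.stds = means, stds
        self.weights, self.bias = weights, bias

    def transform(self, raw):
        # The same standardisation used at training time, applied to raw inputs.
        return {name: (raw[name] - self.means[name]) / self.stds[name]
                for name in self.weights}

    def predict(self, raw):
        features = self.transform(raw)
        z = self.bias + sum(self.weights[n] * features[n] for n in self.weights)
        return 1.0 / (1.0 + math.exp(-z))

    def to_json(self):
        return json.dumps(self.__dict__)

    @classmethod
    def from_json(cls, text):
        return cls(**json.loads(text))


def handle_scoring_request(bundle, body):
    """What a REST endpoint would do with a request body: parse, score, respond."""
    raw = json.loads(body)
    return json.dumps({"default_probability": bundle.predict(raw)})


trained = ScoringBundle(
    means={"income": 60000.0, "debt_ratio": 0.4},
    stds={"income": 25000.0, "debt_ratio": 0.2},
    weights={"income": -0.8, "debt_ratio": 1.2},
    bias=-1.0,
)

# "Deploy": serialise in the training environment, restore in the serving one.
deployed = ScoringBundle.from_json(trained.to_json())

request_body = json.dumps({"income": 35000, "debt_ratio": 0.8})
response = json.loads(handle_scoring_request(deployed, request_body))
print(response)
```

Callers send raw business values such as income in currency units, and the bundle applies the training-time transformations itself, so the line-of-business application never needs to know how features were engineered.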

A model deployed in production should be monitored to detect “data drift,” in which production data differs from training data, with an emphasis on how such drift might impact model performance. Models should be refreshed periodically in production to counter data and model drift, and the process of moving models from training to production, along with the data pipeline, has to be automated and made seamless.
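One widely used way to quantify drift in a single feature is the Population Stability Index (PSI), which compares the binned distribution of production values against the training distribution. The article does not say which drift measure to use, so PSI is an assumption here, as are the bin count and the toy data. The thresholds often quoted for PSI (below 0.1 stable, above 0.25 significant shift) are rules of thumb rather than fixed standards.

```python
import math


def bin_proportions(values, low, width, bins, floor=1e-6):
    """Share of values in each equal-width bin; out-of-range values are clamped
    into the first or last bin, and empty bins get a small floor to avoid log(0)."""
    counts = [0] * bins
    for v in values:
        index = int((v - low) / width) if width > 0 else 0
        counts[min(max(index, 0), bins - 1)] += 1
    return [max(c / len(values), floor) for c in counts]


def population_stability_index(training, production, bins=10):
    # Bin edges are fixed from the training data so both samples share them.
    low, high = min(training), max(training)
    width = (high - low) / bins
    expected = bin_proportions(training, low, width, bins)
    actual = bin_proportions(production, low, width, bins)
    return sum((a - e) * math.log(a / e) for a, e in zip(actual, expected))


# Toy feature values: training covers 0..1 evenly, production has shifted upward.
training = [i / 100 for i in range(100)]
production_same = list(training)
production_shifted = [0.5 + i / 200 for i in range(100)]

psi_same = population_stability_index(training, production_same)
psi_shifted = population_stability_index(training, production_shifted)

print(f"PSI, unchanged data: {psi_same:.4f}")
print(f"PSI, shifted data:   {psi_shifted:.4f}")
```

A check like this could run on a schedule for each input feature, with a high PSI triggering a review or a model refresh through the automated pipeline described above.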

Conclusion

To succeed in the data-driven economy, you have to get new machine learning projects up and running quickly. For this, choosing a platform that caters to the above needs is critical.

With these platform capabilities, enterprises can help their teams accelerate the delivery of data science projects and produce more accurate results. Enterprise business and IT teams can then focus on identifying high-value business problems that can be solved with the data at hand.
