Explainable AI: Interpreting BERT Model

Learn how explainable AI techniques like Integrated Gradients can be used to make a deep learning NLP model interpretable.

By Sai Sharanya Nalla · May. 06, 24 · Tutorial

Motivation and Background

Why is it important to build interpretable AI models?

The future of AI is in enabling humans and machines to work together to solve complex problems. Organizations are attempting to improve process efficiency and transparency by combining AI/ML technology with human review.

In recent years, as AI has advanced, AI-specific regulations have emerged, such as Good Machine Learning Practices (GMLP) in pharma and Model Risk Management (MRM) in finance, alongside broader regulations addressing data privacy, such as the EU's GDPR and California's CCPA. Similarly, internal compliance teams may want to interpret a model's behavior when validating decisions based on model predictions. For instance, underwriters want to learn why a specific loan application was flagged as suspicious by an ML model.

Overview

What is interpretability?

In the ML context, interpretability refers to backtracking which factors contributed to an ML model's prediction. In general, simpler models are easier to interpret but often produce lower accuracy than complex models, such as deep learning and transformer-based models, which can capture non-linear relationships in the data and typically achieve higher accuracy.

Loosely defined, there are two types of explanations:

  • Global explanation: Explains the model at an overall level to understand which features have contributed the most to the output. For example, in a finance setting where the use case is to build an ML model to identify customers most likely to default, some of the most influential features for that decision are the customer's credit score, total number of credit cards, revolving balance, etc.
  • Local explanation: Zooms in on a particular data point and observes the behavior of the model in that neighborhood. For example, in sentiment classification of a movie review, certain words in the review may have a higher impact on the outcome than others, e.g., "bad" in "I have never watched something as bad."

What is a transformer model?

A transformer model is a neural network that tracks relationships in sequential input, such as the words in a sentence, to learn context and meaning. Transformer models use an evolving set of mathematical techniques, called attention or self-attention, to detect subtle relationships between even distant data elements in a series. Refer to Google's publication for more information.
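As a rough illustration of the self-attention idea (this sketch is not from the article, and the weight matrices here are random placeholders), each token representation is mixed with every other token's representation according to similarity scores:

Python
 
import torch
import torch.nn.functional as F

def self_attention(x, w_q, w_k, w_v):
    # Project inputs to queries, keys, and values
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    # Pairwise similarity between tokens, scaled by sqrt of key dimension
    scores = q @ k.transpose(-2, -1) / k.shape[-1] ** 0.5
    # Attention weights over the sequence
    weights = F.softmax(scores, dim=-1)
    # Each output token is a weighted mix of all value vectors
    return weights @ v

seq_len, d = 5, 8                                  # 5 tokens, 8-dimensional embeddings
x = torch.randn(seq_len, d)
w_q, w_k, w_v = (torch.randn(d, d) for _ in range(3))
print(self_attention(x, w_q, w_k, w_v).shape)      # torch.Size([5, 8])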

Integrated Gradients

Integrated Gradients (IG) is an explainable AI technique introduced in the paper Axiomatic Attribution for Deep Networks. The paper attempts to assign an attribution value to each input feature, which tells how much that input contributed to the final prediction.

IG is a local method and a popular interpretability technique due to its broad applicability to any differentiable model (e.g., text, image, structured data), ease of implementation, computational efficiency relative to alternative approaches, and theoretical justifications. Integrated gradients represent the integral of gradients with respect to the inputs along a straight-line path from a given baseline to the input. The integral can be approximated using a Riemann sum or the Gauss–Legendre quadrature rule. Formally, it can be described as follows:

Integrated Gradients along the i-th dimension of an input x, with baseline x' and scaling coefficient \alpha (equations reproduced from the original paper):

\mathrm{IntegratedGrads}_i(x) ::= (x_i - x'_i) \times \int_{\alpha=0}^{1} \frac{\partial F(x' + \alpha \times (x - x'))}{\partial x_i} \, d\alpha

and its Riemann-sum approximation over m steps:

\mathrm{IntegratedGrads}_i^{\mathrm{approx}}(x) ::= (x_i - x'_i) \times \sum_{k=1}^{m} \frac{\partial F\left(x' + \tfrac{k}{m} \times (x - x')\right)}{\partial x_i} \times \frac{1}{m}

where F is the function represented by the model.

The cornerstones of this approach are two fundamental axioms, namely sensitivity and implementation invariance. More information can be found in the original paper.
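To make the definition concrete, here is a small, self-contained sketch (illustrative, not from the article) that approximates IG for a toy differentiable PyTorch function using a Riemann sum along the straight-line path from the baseline to the input:

Python
 
import torch

def f(x):
    # Toy differentiable "model": sum of squares (illustrative only)
    return (x ** 2).sum()

def integrated_gradients(x, baseline, m=50):
    # Riemann-sum approximation of the IG integral along the straight-line path
    total_grads = torch.zeros_like(x)
    for k in range(1, m + 1):
        point = (baseline + (k / m) * (x - baseline)).detach().requires_grad_(True)
        f(point).backward()
        total_grads += point.grad
    return (x - baseline) * total_grads / m

x = torch.tensor([1.0, 2.0, 3.0])
baseline = torch.zeros_like(x)
# Close to [1., 4., 9.] (the exact integral gives x_i ** 2), and the attributions
# sum to approximately f(x) - f(baseline), illustrating the completeness property.
print(integrated_gradients(x, baseline))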

Use Case

Now let’s see in action how the Integrated Gradients method can be applied using the Captum package. We will be fine-tuning a question-answering BERT (Bidirectional Encoder Representations from Transformers) model, on the SQUAD dataset using the transformers library from HuggingFace, review notebook for a detailed walkthrough.

Steps

  • Load the tokenizer and pre-trained BERT model, in this case, bert-base-uncased
  • Next, compute attributions with respect to the BertEmbeddings layer. To do so, define baselines/references and numericalize both the baselines and the inputs:
Python
 
def construct_whole_bert_embeddings(input_ids, ref_input_ids,
                                    token_type_ids=None, ref_token_type_ids=None,
                                    position_ids=None, ref_position_ids=None):
    # Embeddings for the actual input
    input_embeddings = model.bert.embeddings(
        input_ids, token_type_ids=token_type_ids, position_ids=position_ids)
    # Embeddings for the baseline/reference input
    ref_input_embeddings = model.bert.embeddings(
        ref_input_ids, token_type_ids=ref_token_type_ids, position_ids=ref_position_ids)

    return input_embeddings, ref_input_embeddings

  • Now, let's define the question-answer pair as an input to our BERT model:
question = "What is important to us?"
text = "It is important to us to include, empower and support humans of all kinds."
  • Generate the corresponding baselines/references for the question-answer pair, replacing the real tokens with a reference token such as [PAD] (see the sketch below).
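A hedged sketch of this step follows; the helper name and exact construction are illustrative rather than copied from the original notebook. Every non-special token in the reference sequence is replaced with the [PAD] token:

Python
 
# Illustrative helper: token ids for the (question, text) pair plus a [PAD]-based baseline
def make_inputs_and_baselines(question, text, tokenizer):
    cls_id = tokenizer.cls_token_id
    sep_id = tokenizer.sep_token_id
    ref_id = tokenizer.pad_token_id  # the baseline/reference token

    q_ids = tokenizer.encode(question, add_special_tokens=False)
    t_ids = tokenizer.encode(text, add_special_tokens=False)

    input_ids = [cls_id] + q_ids + [sep_id] + t_ids + [sep_id]
    ref_input_ids = [cls_id] + [ref_id] * len(q_ids) + [sep_id] + [ref_id] * len(t_ids) + [sep_id]
    return torch.tensor([input_ids]), torch.tensor([ref_input_ids])

input_ids, ref_input_ids = make_inputs_and_baselines(question, text, tokenizer)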
  • The next step is to make predictions. One option is to use LayerIntegratedGradients and compute the attributions with respect to BertEmbeddings. LayerIntegratedGradients represents the integral of gradients with respect to the layer inputs/outputs along the straight-line path from the layer activations at the given baseline to the layer activations at the input.
Python
 
# Run the model to get start and end position scores for the answer span
start_scores, end_scores = predict(input_ids,
                                   token_type_ids=token_type_ids,
                                   position_ids=position_ids,
                                   attention_mask=attention_mask)

print('Question: ', question)
print('Predicted Answer: ', ' '.join(all_tokens[torch.argmax(start_scores) : torch.argmax(end_scores) + 1]))

# Attribute the start/end predictions to the BertEmbeddings layer
lig = LayerIntegratedGradients(squad_pos_forward_func, model.bert.embeddings)


  • Output:
Question: What is important to us?
Predicted Answer: to include , em ##power and support humans of all kinds
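The attribution call itself is not shown in the article. A hedged sketch of what it might look like follows; it assumes a forward function squad_pos_forward_func(inputs, token_type_ids, position_ids, attention_mask, position) in the spirit of Captum's SQuAD tutorial, and the summarize_attributions helper is illustrative:

Python
 
# Compute attributions for the start position w.r.t. the embedding layer.
# The additional_forward_args and helper below are assumptions, not from the notebook.
attributions_start, delta_start = lig.attribute(
    inputs=input_ids,
    baselines=ref_input_ids,
    additional_forward_args=(token_type_ids, position_ids, attention_mask, 0),
    return_convergence_delta=True)

def summarize_attributions(attributions):
    # Collapse the embedding dimension so each token gets a single score, then normalize
    attributions = attributions.sum(dim=-1).squeeze(0)
    return attributions / torch.norm(attributions)

attributions_start_sum = summarize_attributions(attributions_start)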
  • Visualize attributions for each word token in the input sequence using a helper function
Python
 
# Store a couple of samples in an array for visualization purposes
start_position_vis = viz.VisualizationDataRecord(
    attributions_start_sum,
    torch.max(torch.softmax(start_scores[0], dim=0)),
    torch.argmax(start_scores),
    torch.argmax(start_scores),
    str(ground_truth_start_ind),
    attributions_start_sum.sum(),
    all_tokens,
    delta_start)

print('\033[1m', 'Visualizations For Start Position', '\033[0m')
viz.visualize_text([start_position_vis])

print('\033[1m', 'Visualizations For End Position', '\033[0m')
viz.visualize_text([end_position_vis])

From the results above, we can tell that for predicting the start position, our model focuses more on the question side, specifically on the tokens 'what' and 'important'. It also has a slight focus on the token sequence 'to us' on the text side.

In contrast, for predicting the end position, our model focuses more on the text side and places relatively high attribution on the last end position token 'kinds'.

Conclusion

This blog describes how explainable AI techniques like Integrated Gradients can be used to make a deep learning NLP model interpretable by highlighting positive and negative word influences on the outcome of the model.

References

  • Axiomatic Attribution for Deep Networks
  • Model Interpretability for PyTorch
  • Towards Better Understanding of Gradient-Based Attribution Methods for Deep Neural Networks

Published at DZone with permission of Sai Sharanya Nalla. See the original article here.

Opinions expressed by DZone contributors are their own.
