A Guide to Aspect-Based Sentiment Analysis With GPT and BERT
Explore rapid prototyping with GPT and custom BERT fine-tuning to extract targeted sentiment insights for nuanced text analysis and business applications.
Aspect-based sentiment analysis (ABSA) focuses on determining the sentiment (positive, negative, or neutral) associated with a specific aspect of a text. For example, in the sentence "The battery life is nice, but the screen is dim," ABSA helps identify that the sentiment toward the "battery life" is positive, while the sentiment toward the "screen" is negative. This capability is crucial for businesses to gain nuanced insights from customer feedback, product reviews, and social media.
ABSA is particularly useful for businesses and researchers to gain granular insights into user feedback, such as identifying how customers perceive different product features or services. It has applications in:
- E-commerce: Analyzing product reviews to identify customer satisfaction with specific features.
- Customer Support: Extracting actionable insights from feedback or complaints.
- Social Media Monitoring: Understanding public sentiment on particular aspects of a topic or event.
Approach 1: Using OpenAI's GPT API
This approach leverages OpenAI's GPT, a cutting-edge language model capable of performing sophisticated NLP tasks. By crafting prompts, we can instruct GPT to focus on specific aspects of a sentence and classify its sentiment. This approach is excellent for rapid prototyping, especially if you're new to sentiment analysis or lack labeled data.
Why Choose GPT for ABSA?
- Ease of use: Minimal coding required; no need to collect or prepare large datasets.
- High accuracy: Benefiting from pre-trained GPT models capable of understanding complex language nuances.
- Flexible applications: Ideal for scenarios where quick insights are needed without building a full-fledged machine learning model.
Prerequisites
- Python installed: Ensure Python 3.7 or later is installed.
- OpenAI API Key: Sign up at OpenAI to get your API key.
Step-by-Step Guide
1. Install Dependencies
Before proceeding, install the openai library, which enables you to interact with the OpenAI GPT API:
pip install openai
2. Set Up the OpenAI API Key
The script requires your OpenAI API key to authenticate requests. Replace ADD-YOUR-KEY-GET-FROM-OPENAI with your actual API key in the following code snippet:
openai.api_key = "ADD-YOUR-KEY-GET-FROM-OPENAI"
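Hardcoding keys in source files is risky if the code is shared or committed to version control. A safer alternative is to read the key from the environment; here is a minimal sketch, assuming the key is stored in an OPENAI_API_KEY environment variable:

import os
import openai

# Read the key from the environment instead of embedding it in code
openai.api_key = os.environ["OPENAI_API_KEY"]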
3. Understanding the Code
The script is built around the function get_aspect_sentiment, which accepts a sentence and an aspect as input. Here's the breakdown:
- Input parameters:
- sentence: The text to analyze.
- aspect: The specific aspect to identify sentiment for.
- Prompt design: The script uses a carefully crafted prompt to instruct the GPT model to focus on analyzing the sentiment of the given aspect within the context of the sentence. The prompt also instructs the model to consider subtle expressions and negations.
prompt = (
    f"Identify the sentiment for the aspect '{aspect}' in the following sentence:\n\n"
    f"Sentence: \"{sentence}\"\n\n"
    "Sentiment options: Positive, Negative, or Neutral.\n"
    "Please consider any subtle expressions and negations while analyzing the sentiment."
)
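For instance, with the sentence from the introduction and the aspect "screen", this template renders to:

Identify the sentiment for the aspect 'screen' in the following sentence:

Sentence: "The battery life is nice, but the screen is dim."

Sentiment options: Positive, Negative, or Neutral.
Please consider any subtle expressions and negations while analyzing the sentiment.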
- OpenAI API call: The openai.ChatCompletion.create function sends the prompt to the GPT model for processing. The model's response is parsed to extract the sentiment.
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are an assistant that analyzes sentiment for specific aspects in a sentence."},
        {"role": "user", "content": prompt}
    ],
    max_tokens=50,
    temperature=0
)
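Note that openai.ChatCompletion.create is the legacy interface from openai versions below 1.0. If you are on the v1 client, the equivalent call looks roughly like this (a sketch; the parameter values mirror the snippet above):

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are an assistant that analyzes sentiment for specific aspects in a sentence."},
        {"role": "user", "content": prompt}
    ],
    max_tokens=50,
    temperature=0
)
sentiment = response.choices[0].message.content.strip()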
- Output: The function returns the sentiment as a string (Positive, Negative, or Neutral).
Complete Code:
import openai

# Set up the OpenAI API key
openai.api_key = "ADD-YOUR-KEY-GET-FROM-OPENAI"

def get_aspect_sentiment(sentence, aspect):
    # Craft the prompt for aspect-based sentiment analysis
    prompt = (
        f"Identify the sentiment for the aspect '{aspect}' in the following sentence:\n\n"
        f"Sentence: \"{sentence}\"\n\n"
        "Sentiment options: Positive, Negative, or Neutral.\n"
        "Please consider any subtle expressions and negations while analyzing the sentiment."
    )

    # Make the API call
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "You are an assistant that analyzes sentiment for specific aspects in a sentence."},
            {"role": "user", "content": prompt}
        ],
        max_tokens=50,
        temperature=0
    )

    # Extract the response text
    sentiment = response['choices'][0]['message']['content'].strip()
    return sentiment

# Example usage
sentence = "The battery life is nice."
aspect = "battery life"
sentiment = get_aspect_sentiment(sentence, aspect)
print(f"Sentiment for '{aspect}': {sentiment}")
Approach 2: Fine-Tuning BERT for ABSA
BERT (Bidirectional Encoder Representations from Transformers) is a pre-trained transformer model designed for various NLP tasks. Unlike GPT, BERT is bidirectional, meaning it understands the context of words based on both preceding and following words in a sentence. In ABSA, BERT is fine-tuned to predict sentiment for specific aspects using a labeled dataset.
Why Choose BERT for ABSA?
- Domain adaptation: Fine-tune BERT on your specific dataset for high accuracy in niche applications.
- Offline capability: Use the trained model locally without requiring an internet connection.
- Scalability: Suitable for large-scale projects with diverse datasets.
Step-by-Step Guide
1. Install Dependencies
pip install transformers torch scikit-learn pandas
2. Prepare the Dataset
The dataset includes sentences, aspects, and corresponding sentiment labels (e.g., 0 = Negative, 1 = Neutral, 2 = Positive).
import pandas as pd

data = pd.DataFrame({
    'sentence': ["The battery life is amazing", "The camera quality is poor", "The screen is bright and clear"],
    'aspect': ["battery life", "camera", "screen"],
    'label': [2, 0, 2]  # Sentiment: 2 = Positive, 0 = Negative
})
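In practice you would load a real dataset rather than an inline sample. A minimal sketch, assuming a hypothetical reviews.csv file with sentence, aspect, and label columns:

import pandas as pd

# Hypothetical file; adjust the path and column names to your data
data = pd.read_csv("reviews.csv", usecols=["sentence", "aspect", "label"])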
3. Create a Dataset Class
This class converts the dataset into a format compatible with the BERT tokenizer.
import torch
from torch.utils.data import Dataset

class ABSADataset(Dataset):
    def __init__(self, data, tokenizer, max_length):
        self.data = data
        self.tokenizer = tokenizer
        self.max_length = max_length

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        sentence = self.data.iloc[idx]['sentence']
        aspect = self.data.iloc[idx]['aspect']
        label = self.data.iloc[idx]['label']
        # Encode sentence and aspect as a sentence pair: [CLS] sentence [SEP] aspect [SEP]
        inputs = self.tokenizer(
            sentence, aspect,
            add_special_tokens=True,
            max_length=self.max_length,
            padding='max_length',
            truncation=True,
            return_tensors='pt'
        )
        item = {key: val.squeeze() for key, val in inputs.items()}
        item['labels'] = torch.tensor(label, dtype=torch.long)
        return item
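Passing the sentence and aspect as two arguments makes the tokenizer build a sentence-pair input, which is how BERT learns which aspect the sentiment label refers to. A small sketch of what that encoding looks like:

from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
enc = tokenizer("The battery life is amazing", "battery life", add_special_tokens=True)
print(tokenizer.convert_ids_to_tokens(enc['input_ids']))
# ['[CLS]', 'the', 'battery', 'life', 'is', 'amazing', '[SEP]', 'battery', 'life', '[SEP]']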
4. Model and Training
from transformers import BertTokenizer, BertForSequenceClassification, Trainer, TrainingArguments

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=3)

train_dataset = ABSADataset(train_data, tokenizer, max_length=128)
test_dataset = ABSADataset(test_data, tokenizer, max_length=128)
training_args = TrainingArguments(
    output_dir='./results',
    num_train_epochs=2,
    per_device_train_batch_size=4,
    evaluation_strategy="epoch",
    logging_dir='./logs'
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=test_dataset
)
trainer.train()
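To get the offline capability mentioned earlier, persist the fine-tuned weights and tokenizer after training. A minimal sketch; the directory name is arbitrary:

# Save the fine-tuned model and tokenizer for offline reuse
trainer.save_model("./absa-bert")
tokenizer.save_pretrained("./absa-bert")

# Later, reload them locally without retraining
model = BertForSequenceClassification.from_pretrained("./absa-bert")
tokenizer = BertTokenizer.from_pretrained("./absa-bert")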
5. Evaluation and Prediction
def predict_sentiment(model, tokenizer, sentence, aspect):
    inputs = tokenizer(
        sentence, aspect,
        return_tensors="pt",
        truncation=True,
        padding='max_length',
        max_length=128
    )
    # Run inference without tracking gradients
    model.eval()
    with torch.no_grad():
        outputs = model(**inputs)
    predicted_class = torch.argmax(outputs.logits, dim=1).item()
    sentiment_labels = {0: 'Negative', 1: 'Neutral', 2: 'Positive'}
    return sentiment_labels[predicted_class]

print(predict_sentiment(model, tokenizer, "The battery life is amazing", "battery life"))
- Output: Positive
Complete Code:
import os
import torch
import pandas as pd
from transformers import BertTokenizer, BertForSequenceClassification, Trainer, TrainingArguments
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from torch.utils.data import Dataset

# Disable WandB and other logging integrations
os.environ["WANDB_DISABLED"] = "true"

# Sample data (replace with actual dataset)
data = pd.DataFrame({
    'sentence': ["The battery life is amazing", "The camera quality is poor", "The screen is bright and clear"],
    'aspect': ["battery life", "camera", "screen"],
    'label': [2, 0, 2]  # 2 = Positive, 0 = Negative
})

# Train-test split
train_data, test_data = train_test_split(data, test_size=0.2, random_state=42)

class ABSADataset(Dataset):
    def __init__(self, data, tokenizer, max_length):
        self.data = data
        self.tokenizer = tokenizer
        self.max_length = max_length

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        sentence = self.data.iloc[idx]['sentence']
        aspect = self.data.iloc[idx]['aspect']
        label = self.data.iloc[idx]['label']
        # Encode sentence and aspect as a sentence pair
        inputs = self.tokenizer(
            sentence, aspect,
            add_special_tokens=True,
            max_length=self.max_length,
            padding='max_length',
            truncation=True,
            return_tensors='pt'
        )
        item = {key: val.squeeze() for key, val in inputs.items()}
        item['labels'] = torch.tensor(label, dtype=torch.long)
        return item

# Initialize tokenizer and model
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=3)

# Create datasets (the Trainer handles batching internally, so no DataLoader is needed)
train_dataset = ABSADataset(train_data, tokenizer, max_length=128)
test_dataset = ABSADataset(test_data, tokenizer, max_length=128)

# Define metrics computation
def compute_metrics(pred):
    labels = pred.label_ids
    preds = pred.predictions.argmax(-1)
    precision, recall, f1, _ = precision_recall_fscore_support(labels, preds, average='weighted')
    acc = accuracy_score(labels, preds)
    return {
        'accuracy': acc,
        'f1': f1,
        'precision': precision,
        'recall': recall
    }

# Training arguments
training_args = TrainingArguments(
    output_dir='./results',
    num_train_epochs=2,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    warmup_steps=100,
    weight_decay=0.01,
    logging_dir='./logs',
    logging_steps=10,
    evaluation_strategy="epoch",
    report_to="none"  # Disable wandb and other logging integrations
)

# Initialize Trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=test_dataset,
    compute_metrics=compute_metrics
)

# Train the model
trainer.train()

# Evaluate the model
results = trainer.evaluate()
print("Evaluation Results:", results)

# Define prediction function
def predict_sentiment(model, tokenizer, sentence, aspect):
    inputs = tokenizer(
        sentence, aspect,
        return_tensors="pt",
        truncation=True,
        padding='max_length',
        max_length=128
    )
    # Run inference without tracking gradients
    model.eval()
    with torch.no_grad():
        outputs = model(**inputs)
    predicted_class = torch.argmax(outputs.logits, dim=1).item()
    sentiment_labels = {0: 'Negative', 1: 'Neutral', 2: 'Positive'}
    return sentiment_labels[predicted_class]

# Example prediction
print(predict_sentiment(model, tokenizer, "The battery life is amazing", "battery life"))
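Once trained, the same function can score a whole table of sentence-aspect pairs. A quick sketch with pandas, assuming the data DataFrame defined above:

# Predict a sentiment for every (sentence, aspect) row
data['predicted'] = data.apply(
    lambda row: predict_sentiment(model, tokenizer, row['sentence'], row['aspect']),
    axis=1
)
print(data[['aspect', 'predicted']])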
Choosing the Right Approach
When selecting the right approach for ABSA, it's important to weigh the strengths and limitations of each method. See the table below for a quick comparison between the two:
| Criteria | GPT API | BERT Fine-Tuning |
|---|---|---|
| Ease of use | Easy to set up | Requires ML expertise |
| Customization | Limited | Highly customizable |
| Domain-specific applications | Moderate | Excellent |
| Online/offline | Online only | Offline after training |
Conclusion
In this tutorial, we examined two effective approaches to ABSA: using the ease of OpenAI's GPT API for rapid, low-setup sentiment extraction, and fine-tuning BERT for sophisticated, domain-specific applications. Each approach has distinct benefits depending on the project's complexity, scalability, and customization needs.
These methods offer a strong foundation for learning ABSA, regardless of your level of experience. Beginners can dive into sentiment analysis with minimal preparation, while experienced developers can build a reliable offline solution tailored to specific datasets.
Start small with GPT, hone your skills with BERT, and keep iterating as you extract useful information from your text data. With these tools at your disposal, you're ready to take on the nuanced challenges of contextualizing sentiment. The possibilities are virtually limitless.