Serverless NLP: Implementing Sentiment Analysis Using Serverless Technologies
Learn how to build a serverless application to perform sentiment analysis using AWS Lambda, API Gateway, and NLTK's VADER library.
In this article, I will discuss building a sentiment analysis tool using AWS serverless capabilities and NLTK. I will use AWS Lambda to run sentiment analysis with the NLTK VADER library and AWS API Gateway to expose this functionality as an API. This architecture eliminates the need for any server management while providing on-demand scalability and cost-efficiency.
Before we dive in, ensure that you have the following:
- An AWS account with the AWS CLI installed and configured
- Python 3.10 and pip installed locally
- An IAM role with basic Lambda execution permissions
Step-by-Step Implementation
- First, we'll create a directory for our Lambda package and install NLTK and the required dependencies.
mkdir serverless_sentiment
cd serverless_sentiment
pip install nltk -t ./
- To add the dependencies, open the Python interpreter, download the required NLTK data, and copy it into the nltk_data folder inside the Lambda package.
python
>>> import nltk
>>> nltk.download('punkt')
>>> nltk.download('vader_lexicon')
# NLTK downloads the data to ~/nltk_data by default; copy it into the package
cp -R ~/nltk_data ./nltk_data
- Create a file named lambda_function.py with the following code:
import json
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer
from nltk.tokenize import word_tokenize

intensity_analyzer = SentimentIntensityAnalyzer()

def find_overall_sentiment(score):
    if score['compound'] >= 0.05:
        return '+Positive+'
    elif score['compound'] <= -0.05:
        return '-Negative-'
    else:
        return '*Neutral*'

def lambda_handler(event, context):
    # Parse the incoming request
    body = json.loads(event['body'])
    input_text = body['text']

    # Tokenize words and remove any punctuation
    tokens = word_tokenize(input_text.lower())
    tokens = [token for token in tokens if token.isalpha()]
    processed_tokens = ' '.join(tokens)

    # Run sentiment analysis on the processed text
    sentiment_scores = intensity_analyzer.polarity_scores(processed_tokens)

    # Evaluate the score to identify the overall sentiment
    overall_sentiment = find_overall_sentiment(sentiment_scores)

    # Prepare and send the response
    return {
        'statusCode': 200,
        'body': json.dumps({
            'requested_text': input_text,
            'sentiment_scores': sentiment_scores,
            'sentiment': overall_sentiment
        }),
        'headers': {
            'Content-Type': 'application/json',
            'Access-Control-Allow-Origin': '*'
        }
    }
This function parses the "text" attribute from the request body and runs sentiment analysis on it.
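Before deploying, you can sanity-check the handler locally with a minimal test event. The sketch below is illustrative: the sample sentence is made up, and it assumes the NLTK data downloaded earlier is on NLTK's default search path (e.g., ~/nltk_data).
# local_test.py -- quick local check of the handler before deploying (illustrative)
import json
from lambda_function import lambda_handler

# Simulate the event shape the handler expects: a JSON string under 'body'
event = {'body': json.dumps({'text': 'The new release is fast, stable, and a joy to use!'})}

response = lambda_handler(event, None)
print(response['statusCode'])           # 200
print(json.loads(response['body']))     # sentiment scores plus the overall label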
- Package the Lambda code and upload it to an S3 bucket for deployment.
zip -r ../deploymentPackage.zip .
# S3 bucket names must be globally unique and cannot contain underscores
aws s3 mb s3://serverless-sentiment
aws s3 cp ../deploymentPackage.zip s3://serverless-sentiment
aws lambda create-function \
--function-name serverless_sentiment \
--runtime python3.10 \
--role <REPLACE_THIS_WITH_LAMBDA_EXECUTION_ROLE> \
--handler lambda_function.lambda_handler \
--code S3Bucket=serverless-sentiment,S3Key=deploymentPackage.zip \
--environment Variables={NLTK_DATA=./nltk_data}
Note: Make sure you have added the NLTK_DATA environment variable so that NLTK can locate the bundled data at runtime.
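At this point you can optionally verify the deployed function directly from the CLI before wiring up API Gateway. The payload below is only an example, and out.json is an arbitrary output file name.
# Invoke the deployed function with a sample payload (illustrative)
aws lambda invoke \
--function-name serverless_sentiment \
--cli-binary-format raw-in-base64-out \
--payload '{"body": "{\"text\": \"I really enjoyed this product\"}"}' \
out.json
cat out.json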
Now that your Lambda function is deployed and ready to execute, let's set up an API using AWS API Gateway to expose it.
- Create a REST endpoint.
# Create new API
aws apigateway create-rest-api --name 'serverless_sentiment' --description 'serverless sentiment analysis API'
# Preserve internal IDs for future commands
API_ID=$(aws apigateway get-rest-apis --query "items[?name==\`serverless_sentiment\`].id" --output text)
PARENT_RES_ID=$(aws apigateway get-resources --rest-api-id $API_ID --query "items[?path=='/'].id" --output text)
# Add a resource for the newly created API
aws apigateway create-resource --rest-api-id $API_ID --parent-id $PARENT_RES_ID --path-part sentiment --region us-east-1
RES_ID=$(aws apigateway get-resources --rest-api-id $API_ID --query "items[?path=='/sentiment'].id" --output text)
# Add POST operation
aws apigateway put-method --rest-api-id $API_ID --resource-id $RES_ID --http-method POST --authorization-type NONE
- Next, we are going to integrate the POST method with our Lambda.
aws apigateway put-integration --rest-api-id $API_ID \
--resource-id $RES_ID \
--http-method POST \
--type AWS \
--integration-http-method POST \
--uri <REPLACE_ME_WITH_LAMBDA_URI>
# Set the response type as JSON
aws apigateway put-method-response --rest-api-id $API_ID --resource-id $RES_ID --http-method POST --status-code 200 --response-models "{}"
aws apigateway put-integration-response --rest-api-id $API_ID --resource-id $RES_ID --http-method POST --status-code 200 --selection-pattern ".*"
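As a reference, the value for --uri above generally follows API Gateway's Lambda service-integration format; the region and account ID below are placeholders you would replace with your own.
# Illustrative format of the Lambda integration URI (replace the region and account ID)
LAMBDA_URI="arn:aws:apigateway:us-east-1:lambda:path/2015-03-31/functions/arn:aws:lambda:us-east-1:<ACCOUNT_ID>:function:serverless_sentiment/invocations"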
- Finally, deploy your API and add the necessary permissions to invoke the Lambda directly from the API.
# Deploy API
aws apigateway create-deployment --rest-api-id $API_ID --stage-name prod
# Add necessary permissions
aws lambda add-permission --function-name serverless_sentiment --statement-id apigateway-serverless-sentiment-prod --action lambda:InvokeFunction --principal apigateway.amazonaws.com --source-arn "arn:aws:execute-api:$REGION:$ACCOUNT:$API_ID/*/POST/sentiment"
You can test your solution using curl or Postman by sending a POST request to your API Gateway endpoint with the text you want to analyze in the body.
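For example, a request with curl might look like the following; the invoke URL is a placeholder that depends on your API ID, region, and stage.
# Replace <API_ID> and the region with your own values (illustrative)
curl -X POST \
-H "Content-Type: application/json" \
-d '{"text": "I really enjoyed this product"}' \
https://<API_ID>.execute-api.us-east-1.amazonaws.com/prod/sentiment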
Conclusion
We have successfully built a serverless sentiment analysis API that can be used by any system. It can handle varying loads efficiently as AWS Lambda automatically scales based on incoming requests. The same setup can be extended for various NLP tasks like text classification, entity recognition, etc.