
Building a Scalable ML Pipeline and API in AWS

This blog explains the architecture and design for building a scalable end-to-end ML pipeline on AWS, covering automated model execution, real-time data processing, and API exposure.

By Dhanush Thirugnana · Digvijay Waghela · Mar. 28, 25 · Tutorial

With rapid progress in the fields of machine learning (ML) and artificial intelligence (AI), it is important to deploy AI/ML models efficiently in production environments.

This blog post discusses an end-to-end ML pipeline on AWS SageMaker that leverages serverless computing, event-trigger-based data processing, and external API integrations. The architecture ensures scalability, cost efficiency, and real-time access for downstream applications.

In this blog, we will walk through the architecture, explain design decisions, and examine the key AWS services used to build this system.

Architecture Overview

The AWS-based ML pipeline consists of multiple components that communicate with one another seamlessly to perform model execution, data storage, processing, and API exposure. The workflow includes:

  1. ML Model Execution in AWS SageMaker
  2. Storing data in AWS S3, DynamoDB, and Snowflake
  3. Event-based processing using AWS Lambda and AWS Glue
  4. Real-time API integration with AWS Lambda and Application Load Balancer
  5. Routing traffic to applications through AWS Route 53

[Figure: Architecture overview]

Step 1: Running the ML Model on AWS SageMaker

The main component of the system is the ML model, which runs periodically on AWS SageMaker to generate predictions. This is also known as batch processing.

The SageMaker pipeline:

  • Uses preprocessed data and results from previous runs.
  • Applies the ML algorithms for inference.
  • Writes the output in both JSON and Delta formats to an S3 bucket.

Why save data in JSON and Delta formats?

  • JSON is lightweight and can be easily consumed by AWS DynamoDB for real-time querying.
  • Delta format allows for efficient data loading into Snowflake for analytics and reporting.
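As a rough illustration, the final step of the SageMaker job might write both formats like this. This is a minimal sketch: the bucket name, key prefixes, sample DataFrame, and the use of the deltalake (delta-rs) package are assumptions, not details from the original pipeline.

```python
# Hypothetical sketch of the SageMaker job's output step.
# "ml-pipeline-output" and the key prefixes are placeholder names.
import boto3
import pandas as pd
from deltalake import write_deltalake  # delta-rs; assumed to be installed in the job image

BUCKET = "ml-pipeline-output"  # placeholder bucket
predictions = pd.DataFrame(
    {"entity_id": ["a1", "b2"], "score": [0.91, 0.37]}  # stand-in for real model output
)

s3 = boto3.client("s3")

# JSON copy: lightweight records that the Glue job later loads into DynamoDB.
s3.put_object(
    Bucket=BUCKET,
    Key="predictions/json/latest.json",
    Body=predictions.to_json(orient="records"),
)

# Delta copy: columnar table that is efficient to load into Snowflake.
write_deltalake(f"s3://{BUCKET}/predictions/delta/", predictions, mode="overwrite")

# "done" marker file that fires the S3 event notification for the next step.
s3.put_object(Bucket=BUCKET, Key="predictions/_done", Body=b"")
```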

Step 2: Event-Based Data Processing and Storage

Once SageMaker writes the output to an S3 bucket, an event-based trigger will automatically run the next steps.

  1. An S3 Event Notification invokes an AWS Lambda function as soon as a new "done" marker file is created in the S3 location where the trigger was set up.
  2. The Lambda function invokes the AWS Glue job that:
    • Processes and loads the JSON data from the S3 location into DynamoDB.
    • Copies Delta data to Snowflake.
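A minimal sketch of that trigger Lambda is shown below; the Glue job name, the "done" marker convention, and the job argument are illustrative assumptions.

```python
# Hypothetical trigger Lambda: invoked by the S3 event notification,
# it starts the Glue job that ingests the new model output.
import boto3

glue = boto3.client("glue")

GLUE_JOB_NAME = "ml-output-ingestion"  # placeholder job name


def lambda_handler(event, context):
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]

        # Only react to the "done" marker file written by SageMaker.
        if not key.endswith("_done"):
            continue

        run = glue.start_job_run(
            JobName=GLUE_JOB_NAME,
            Arguments={"--output_bucket": bucket},  # passed to the Glue script
        )
        print(f"Started Glue job run {run['JobRunId']} for s3://{bucket}/{key}")

    return {"status": "ok"}
```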

Why use AWS Glue for data ingestion?

  • AWS Lambda has a max timeout of 15 minutes.
  • Processing and uploading huge amounts of data might take more than 15 minutes.
  • Glue ETL transformations ensure structured, clean data ingestion.
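For the DynamoDB side of that ingestion, a Glue job might look roughly like the following. The S3 path and DynamoDB table name are placeholders; the Snowflake copy would use the Glue Snowflake connector and is omitted here.

```python
# Hypothetical Glue job: read the JSON model output from S3 and write it to DynamoDB.
import sys

from awsglue.context import GlueContext
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["output_bucket"])
glue_context = GlueContext(SparkContext.getOrCreate())

# Read the JSON predictions written by the SageMaker job.
predictions = glue_context.create_dynamic_frame.from_options(
    connection_type="s3",
    connection_options={"paths": [f"s3://{args['output_bucket']}/predictions/json/"]},
    format="json",
)

# Write the records to DynamoDB for low-latency lookups by the API Lambda.
glue_context.write_dynamic_frame.from_options(
    frame=predictions,
    connection_type="dynamodb",
    connection_options={
        "dynamodb.output.tableName": "ml-predictions",  # placeholder table
        "dynamodb.throughput.write.percent": "0.5",
    },
)
```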

Step 3: API Processing and Real-Time Access

Now, the data stored in DynamoDB needs to be accessed by external applications. That’s done using APIs. We can use an AWS Lambda function to host the API code.

  • The API Lambda function is invoked when an application makes a request.
  • The API Lambda function:
    1. Queries DynamoDB with the latest ML model results.
    2. Integrates with real-time APIs (third-party services) to enhance the results.
    3. Processes all this information and generates an API response. 
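A condensed sketch of such an API Lambda behind an ALB might look like this; the table name, key schema, and the third-party URL are illustrative assumptions.

```python
# Hypothetical API Lambda: queries DynamoDB for the latest prediction,
# enriches it with a third-party API call, and returns an ALB-style response.
import json
import urllib.request

import boto3

table = boto3.resource("dynamodb").Table("ml-predictions")  # placeholder table


def lambda_handler(event, context):
    entity_id = (event.get("queryStringParameters") or {}).get("entity_id", "")

    # 1. Latest ML result for this entity.
    item = table.get_item(Key={"entity_id": entity_id}).get("Item", {})

    # 2. Enrich with an external real-time API -- placeholder URL.
    with urllib.request.urlopen(f"https://api.example.com/context/{entity_id}") as resp:
        context_data = json.loads(resp.read())

    # 3. Combine and return in the format an ALB expects from a Lambda target.
    body = {"prediction": item, "context": context_data}
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(body, default=str),
    }
```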

Step 4: API Exposure Using Application Load Balancer (ALB)

To handle API traffic, the Lambda function is connected to an AWS Application Load Balancer (ALB).

Why use an Application Load Balancer?

  • ALB routes traffic to the relevant Lambda function.
  • Autoscales based on the number of API requests, ensuring high availability.
  • Distributes traffic efficiently across multiple Lambda instances.
  • Secures the API endpoints by performing authentication and request filtering.
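Wiring the Lambda to an ALB can be sketched with boto3 roughly as follows. The load balancer listener ARN, function name, and path pattern are placeholders, and in practice this setup would usually live in CloudFormation or Terraform rather than a script.

```python
# Hypothetical one-time setup: register the API Lambda as an ALB target.
import boto3

elbv2 = boto3.client("elbv2")
lam = boto3.client("lambda")

FUNCTION_NAME = "ml-api"  # placeholder Lambda name
LISTENER_ARN = "arn:aws:elasticloadbalancing:...:listener/..."  # placeholder ARN

# 1. Target group of type "lambda".
tg = elbv2.create_target_group(Name="ml-api-tg", TargetType="lambda")
tg_arn = tg["TargetGroups"][0]["TargetGroupArn"]

# 2. Allow the ALB target group to invoke the function.
fn_arn = lam.get_function(FunctionName=FUNCTION_NAME)["Configuration"]["FunctionArn"]
lam.add_permission(
    FunctionName=FUNCTION_NAME,
    StatementId="alb-invoke",
    Action="lambda:InvokeFunction",
    Principal="elasticloadbalancing.amazonaws.com",
    SourceArn=tg_arn,
)

# 3. Register the function and forward matching listener traffic to it.
elbv2.register_targets(TargetGroupArn=tg_arn, Targets=[{"Id": fn_arn}])
elbv2.create_rule(
    ListenerArn=LISTENER_ARN,
    Priority=10,
    Conditions=[{"Field": "path-pattern", "Values": ["/predictions*"]}],
    Actions=[{"Type": "forward", "TargetGroupArn": tg_arn}],
)
```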

Step 5: Routing API Calls Using Route 53

We integrate AWS Route 53 with the ALB to obtain a consistent API endpoint.

  • Route 53 handles domain name resolution, making sure that the applications can easily connect to the API.
  • It also supports custom domain mapping, allowing other teams to use a user-friendly API URL instead of directly accessing ALB endpoints.
  • If the API Lambda is deployed in multiple regions, Route 53 can be configured to route traffic efficiently, ensuring reliability and failover even during high-traffic periods.
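Creating the friendly DNS name can be sketched as a Route 53 alias record pointing at the ALB; the hosted zone ID, domain, and load balancer name below are placeholders.

```python
# Hypothetical sketch: map a custom domain to the ALB with a Route 53 alias record.
import boto3

elbv2 = boto3.client("elbv2")
route53 = boto3.client("route53")

# Look up the ALB's DNS name and canonical hosted zone ID.
alb = elbv2.describe_load_balancers(Names=["ml-api-alb"])["LoadBalancers"][0]  # placeholder name

route53.change_resource_record_sets(
    HostedZoneId="Z0000000EXAMPLE",  # placeholder hosted zone for example.com
    ChangeBatch={
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": "ml-api.example.com",  # placeholder friendly API URL
                    "Type": "A",
                    "AliasTarget": {
                        "HostedZoneId": alb["CanonicalHostedZoneId"],
                        "DNSName": alb["DNSName"],
                        "EvaluateTargetHealth": True,
                    },
                },
            }
        ]
    },
)
```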

Most Critical Features of This Architecture

  • Scalability – AWS services like SageMaker, Lambda, Glue, and DynamoDB handle loads dynamically
  • Cost optimization – Use of on-demand DynamoDB, serverless Lambda, and event-based processing ensures efficient utilization of resources
  • Real-time processing – Provides real-time access to the ML output with low-latency APIs
  • Seamless integration – Supports integration with other real-time APIs, thereby enhancing results
  • Cross-team collaboration – Exporting data to Snowflake helps businesses and other teams to run analytics against ML predictions 

Future Enhancements and Considerations

  • Stream processing – Replace batch flows with Kafka or Kinesis for real-time data processing.
  • Automated model retraining – Use SageMaker Pipelines for automated model retraining.

Conclusion

This AWS-based ML architecture provides a scalable, automated, and efficient pipeline for running ML models, generating predictions, and serving real-time API responses. By utilizing AWS services such as SageMaker, Lambda, Glue, DynamoDB, ALB, and Route 53, the system ensures cost efficiency, high performance, and real-time data availability for downstream applications.

Would love to hear your thoughts!


Opinions expressed by DZone contributors are their own.
