Microservices for Machine Learning

Learn how I scaled my ML-powered finance tracker by breaking a monolithic design into microservices for better performance, maintainability, and deployment.

By Ramya Boorugula · Jul. 01, 2025 · Analysis

My latest personal project, a personal finance tracker with ML-powered insights, started with a simple feature to categorize expenses but quickly expanded to handle everything from transaction classification to spending predictions (I was tempted to add ML-based investment recommendations too, but I'm not there yet in trusting a model with my investments :D). All of it ran as one service, so when one model failed, everything failed.

So I decided to do what I'd been putting off for months: break the monolith apart. Here's what I learned decomposing my personal ML project into focused microservices, and why I think you might want to consider the same approach for your own projects.

The Problem: When Your Side Project Becomes a Nightmare

My finance tracker project started with a simple idea: automatically categorize bank transactions using a text classification model. I trained a basic logistic regression model on my transaction history, wrapped it in a Flask API, and called it done.

But then I got ambitious. The service gradually accumulated more features:

  • Spending pattern analysis for budget insights
  • Fraud detection for suspicious transactions
  • Bill prediction to forecast upcoming expenses

Each new feature made sense in isolation: they all dealt with financial data, shared some preprocessing logic, and keeping everything in one service meant simpler deployment.

The breaking point came while I was demoing the project to a friend: within minutes of running, the entire service crashed. The root cause was the memory-hungry XGBoost model I used for spending pattern analysis. Simple transaction categorization, the core feature I originally set out to build, was completely inaccessible.

That's when I realized my "efficient" monolith had become a single point of failure for my entire application.

Finding Service Boundaries

The first step in breaking apart my monolith was figuring out where to draw the lines. Initially, I grouped the models by their technical similarities — all the NLP models together, all the time series models together, etc. This was completely wrong.

Instead, I focused on functional boundaries based on what each model actually did:

Core Transaction Processing

  • Transaction categorization
  • Duplicate detection
  • Data validation and cleaning

Spending Analysis

  • Pattern recognition
  • Budget variance analysis
  • Spending trend prediction

Security and Risk

  • Fraud detection
  • Anomaly detection
  • Suspicious pattern identification

Financial Planning

  • Bill prediction and reminders

This functional approach revealed something interesting: models I'd kept together for technical convenience actually served completely different purposes with different requirements. Transaction categorization needed to be fast and accurate for real-time processing. Spending analysis could run overnight with more complex models. Fraud detection required immediate alerting, but could tolerate some false positives.

The Shared Data Challenge

One of the biggest challenges with breaking down the service was handling shared data and features. My monolith had evolved around a central database and feature computation engine that every model depended on.

User spending patterns, transaction embeddings, account balances — everything was computed once and reused everywhere. Breaking this apart meant rethinking and redesigning how the services would share information.

I ended up with three patterns:

Pattern 1: Service-Owned Data

Some data naturally belonged to specific services. The Security service owned all fraud-related features and suspicious transaction patterns. The Planning service managed investment profiles and savings goals. These stayed within their respective services.

Pattern 2: Shared Data Services

Core data like user profiles, account information, and cleaned transaction history was needed by multiple services. I created a dedicated User Data service that provided clean, consistent data to other services that needed it.

Pattern 3: Event-Driven Updates

For real-time updates, I implemented a simple event system using Redis pub/sub. When a new transaction comes in, the Core Processing service publishes an event that other services can consume to update their models or trigger new analysis.
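
As a rough sketch of what this looks like (the channel name and payload fields are my own inventions, and it assumes a local Redis instance with the redis-py client), the Core Processing service publishes a small JSON event and any interested service subscribes to it:

import json
import redis

r = redis.Redis(host="localhost", port=6379, db=0)

# Publisher side (Core Processing): announce a new transaction to anyone listening.
def publish_new_transaction(transaction_id: str, amount: float, merchant: str) -> None:
    event = {"transaction_id": transaction_id, "amount": amount, "merchant": merchant}
    r.publish("transactions.new", json.dumps(event))

# Subscriber side (e.g., Spending Analysis): react to new transactions as they arrive.
def listen_for_transactions() -> None:
    pubsub = r.pubsub()
    pubsub.subscribe("transactions.new")
    for message in pubsub.listen():
        if message["type"] != "message":
            continue  # skip subscribe confirmations
        event = json.loads(message["data"])
        # Kick off whatever analysis this service owns for the new transaction.
        print(f"Analyzing transaction {event['transaction_id']}")

The nice part is that the publisher doesn't know or care which services are listening, so adding a new consumer never requires touching Core Processing.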

Service Deployments

Not every service needs the same deployment strategy. This was a key lesson from my resource-constrained personal project environment.

Lightweight Services: Shared Container

Simple models like my transaction categorization (a basic logistic regression) run in the same container as the application logic. The entire Core Processing service fits in a single Docker container with minimal memory.
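
For illustration, a stripped-down version of that service might look something like the sketch below (the model path, route, and request fields are hypothetical, and it assumes the saved artifact is a scikit-learn text pipeline):

import joblib
from flask import Flask, jsonify, request

app = Flask(__name__)

# Load the small categorization pipeline once at startup; at this size it
# comfortably shares a container with the rest of the application logic.
model = joblib.load("categorizer.joblib")  # hypothetical artifact: vectorizer + logistic regression

@app.route("/categorize", methods=["POST"])
def categorize():
    description = request.get_json()["description"]
    category = model.predict([description])[0]
    return jsonify({"category": str(category)})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)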

Heavy Models: Dedicated Containers

My investment recommendation service uses a more complex ensemble model that requires significant memory. It runs in its own container that I can scale independently (or shut down when I'm not using it to save on hosting costs).

Batch Processing: Scheduled Jobs

Services like spending analysis don't need to run continuously. They execute as scheduled jobs — the spending analyzer runs nightly to update insights, and the bill predictor runs weekly to forecast upcoming expenses.
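
These jobs don't need a framework; each one is a plain script that cron (or any scheduler) kicks off. Here is a simplified sketch of the nightly spending job, with made-up table names and SQLite standing in for whatever database you use:

# spending_analysis_job.py -- scheduled nightly, e.g. cron: 0 2 * * * python spending_analysis_job.py
import sqlite3
from datetime import date, timedelta

def run_nightly_analysis(db_path: str = "finance.db") -> None:
    conn = sqlite3.connect(db_path)
    since = (date.today() - timedelta(days=30)).isoformat()
    # Aggregate the last 30 days of transactions into per-category totals.
    rows = conn.execute(
        "SELECT category, SUM(amount) FROM transactions "
        "WHERE posted_on >= ? GROUP BY category",
        (since,),
    ).fetchall()
    # Overwrite the insights table that the API and UI read from.
    conn.execute("DELETE FROM spending_insights")
    conn.executemany(
        "INSERT INTO spending_insights (category, total_30d) VALUES (?, ?)", rows
    )
    conn.commit()
    conn.close()

if __name__ == "__main__":
    run_nightly_analysis()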

This approach lets me optimize resource usage on my single server while maintaining service independence.

Inter-Service Communication Patterns

With everything broken apart, I needed to figure out how services would talk to each other. ML services have unique communication needs that differ from typical web APIs.

HTTP for Real-Time Requests

When I need immediate results, services communicate directly via HTTP. The Core Processing service makes API calls to the Security service for real-time fraud checking on new transactions.
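
That call is nothing more than a plain HTTP request with a tight timeout and a safe fallback; the URL and response shape here are hypothetical:

import requests

SECURITY_SERVICE_URL = "http://security-service:8080/check"  # hypothetical internal address

def check_transaction(transaction: dict) -> bool:
    """Ask the Security service whether a new transaction looks fraudulent."""
    try:
        resp = requests.post(SECURITY_SERVICE_URL, json=transaction, timeout=2)
        resp.raise_for_status()
        return resp.json().get("is_fraudulent", False)
    except requests.RequestException:
        # If the fraud check is slow or down, don't block transaction processing;
        # let the transaction through and flag it for a later batch re-check.
        return False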

Message Queues for Background Processing

For non-urgent tasks, I use Redis as a message queue. New transactions trigger background analysis jobs in the Spending Analysis service without blocking the main transaction processing flow.
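
A Redis list is enough to act as that queue. In this sketch (the queue name and payload are my own), Core Processing pushes a job and a Spending Analysis worker blocks until one arrives:

import json
import redis

r = redis.Redis(host="localhost", port=6379, db=0)
QUEUE = "spending-analysis-jobs"  # hypothetical queue name

# Producer (Core Processing): enqueue work without waiting for the result.
def enqueue_analysis(user_id: str) -> None:
    r.lpush(QUEUE, json.dumps({"user_id": user_id}))

# Consumer (Spending Analysis worker): block until a job arrives, then process it.
def worker() -> None:
    while True:
        _, payload = r.brpop(QUEUE)
        job = json.loads(payload)
        print(f"Recomputing spending insights for user {job['user_id']}")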

Shared Cache for Performance

Since I'm running on limited resources, caching became crucial. Services share computed features through Redis to avoid redundant calculations. The User Data service caches user profiles, and the Spending Analysis service caches recent insights.
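
The caching itself stays simple: computed features sit in Redis under a predictable key with a TTL, so whichever service needs them next can skip the recomputation. The key format and expiry below are my own choices:

import json
import redis

r = redis.Redis(host="localhost", port=6379, db=0)
FEATURE_TTL_SECONDS = 6 * 60 * 60  # recompute at most every six hours

def compute_user_features(user_id: str) -> dict:
    # Placeholder for the real feature computation (embeddings, rolling averages, ...).
    return {"user_id": user_id, "avg_daily_spend": 42.0}

def get_user_features(user_id: str) -> dict:
    key = f"features:user:{user_id}"
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)  # another service already did the work
    features = compute_user_features(user_id)
    r.set(key, json.dumps(features), ex=FEATURE_TTL_SECONDS)
    return features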

A Practical Roadmap for Your Projects

If you're struggling with a monolithic ML service in your own projects, here's a practical approach:

  1. List all your ML capabilities and group them by what they accomplish, not what technology they use. Look for natural functional boundaries.
  2. Map out all the data, features, and infrastructure that multiple models depend on. 
  3. Pick the service that causes the most problems, whether it's the most unstable, the most resource-hungry, or the most frequently modified. Extract this one first.
  4. Before writing any code, figure out how services will share data. This includes both real-time communication and batch data processing.
  5. Extract one service at a time. Use the strangler fig pattern — gradually migrate functionality rather than attempting a big bang rewrite.

Final Thoughts

Decomposing a monolithic ML service isn't just an enterprise concern. Even personal projects can benefit from the clarity and maintainability that comes with focused, single-purpose services.

The key is approaching it pragmatically. You don't need enterprise-grade infrastructure or monitoring tools. Simple patterns, clear boundaries, and gradual migration can transform your project from a maintenance nightmare into something you actually enjoy working on.

The journey from monolith to microservices is worth it.

Tags: Data processing, Data (computing), Microservices

Opinions expressed by DZone contributors are their own.
