Migrating Legacy Microservices to Modern Java and TypeScript

Incremental strangler fig migration — containerize first, route traffic gradually, and validate with shadow mode testing.

Mar. 30, 26 · Analysis


"Modernize the legacy stack" is a phrase that strikes dread into every senior engineer's heart — and for good reason. Migration projects fail at a notoriously high rate. They balloon in scope, break running systems, and produce tech debt that rivals what they replaced. I led successful migrations of critical microservices to modern runtimes, containerized deployments, and event-driven architectures — on time, without downtime, and with measurable gains in performance and reliability.

This article distills the frameworks, patterns, and hard lessons from those engagements into a practical guide for teams facing similar challenges.

Why Migrations Fail: The Common Traps

Before discussing what works, it's worth naming what doesn't:

The Big Bang rewrite: Halting feature development to rebuild from scratch. Systems become outdated before they ship. Teams lose institutional knowledge. This almost always fails.
The framework upgrade without architecture change: Upgrading Java 8 → Java 17 without rethinking the monolithic service structure just ships a faster monolith. The underlying problems remain.
Ignoring the database layer: Migrating application services while leaving tightly-coupled schemas in place creates a false sense of progress. The database becomes the new bottleneck.
The missing Strangler Fig: Attempting to migrate everything simultaneously instead of routing traffic incrementally.

The pattern that works: incremental strangler fig migration with continuous deployment verification.

Phase 0: Characterize Before You Modernize

The first step — before writing a single line of new code — is deep characterization of the existing system.

Build a Dependency Map

    Shell
   
# For Maven projects: visualize the dependency tree
mvn dependency:tree -Dverbose | grep -E "(INFO|WARN)" > dep-tree.txt

# For Node.js microservices: check for outdated dependencies
npm outdated

This analysis revealed 23 transitive dependencies that were unmaintained, 4 services using Spring Boot 1.5 (EOL), and 3 services sharing a database schema — a classic anti-pattern in microservice architectures.

Profile the Current System Under Load

You need a baseline to measure progress against. We captured:

P50/P95/P99 response times per service endpoint
Memory and CPU utilization under typical load
Database query execution plans for the top 20 slowest queries
Error rates and types by service

This data becomes your migration contract: the new system must at minimum match these metrics, and ideally exceed them.
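One way to make that contract checkable is to compute the percentiles from raw latency samples and compare a candidate run against the recorded baseline. The sketch below assumes samples in milliseconds, a nearest-rank percentile definition, and a 5% tolerance. `LatencyBaseline`, `summarize`, and `findRegressions` are illustrative names, not part of any library.

```typescript
// Latency percentiles for one endpoint, in milliseconds.
interface LatencyBaseline {
  p50: number;
  p95: number;
  p99: number;
}

// Nearest-rank percentile: the smallest sample such that at least
// `pct` percent of all samples are less than or equal to it.
function percentile(samplesMs: number[], pct: number): number {
  if (samplesMs.length === 0) {
    throw new Error('percentile() needs at least one sample');
  }
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const rank = Math.ceil((pct * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

function summarize(samplesMs: number[]): LatencyBaseline {
  return {
    p50: percentile(samplesMs, 50),
    p95: percentile(samplesMs, 95),
    p99: percentile(samplesMs, 99),
  };
}

// Returns the percentiles where the candidate is slower than the baseline
// by more than `tolerance` (0.05 = 5% slack, an assumed default).
function findRegressions(
  baseline: LatencyBaseline,
  candidate: LatencyBaseline,
  tolerance = 0.05,
): string[] {
  const keys: (keyof LatencyBaseline)[] = ['p50', 'p95', 'p99'];
  return keys.filter((k) => candidate[k] > baseline[k] * (1 + tolerance));
}
```

A check like `findRegressions(legacyBaseline, summarize(newSamples))` could run in CI after a load test and fail the build when the returned list is not empty.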

    TypeScript
   
 

   // Capture response time metrics using a Node.js middleware
import { Request, Response, NextFunction } from 'express';
import { Histogram } from 'prom-client';

const httpDuration = new Histogram({
  name: 'http_request_duration_seconds',
  help: 'Duration of HTTP requests in seconds',
  labelNames: ['method', 'route', 'status_code'],
  buckets: [0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5],
});

export function metricsMiddleware(req: Request, res: Response, next: NextFunction) {
  const end = httpDuration.startTimer();
  res.on('finish', () => {
    end({ method: req.method, route: req.route?.path ?? req.path, status_code: res.statusCode });
  });
  next();
}

Phase 1: Containerize Without Changing Logic
  

The safest first migration step is containerizing existing services without changing their code. This gives you several advantages:

Establishes Docker/Kubernetes as the deployment standard
Removes environment-specific configuration from the application
Exposes hidden environment dependencies (hardcoded paths, implicit file system assumptions)
Lets the team practice the deployment pipeline before the high-risk code changes
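For the Node.js services, moving configuration out of the application can be as simple as reading everything from the environment at startup and failing fast when something is missing. The sketch below is a minimal illustration of that idea. The variable names (`DATABASE_URL`, `STORAGE_PATH`, `PORT`) and the `ServiceConfig` shape are assumptions, not taken from the original services.

```typescript
interface ServiceConfig {
  databaseUrl: string;
  port: number;
  storagePath: string;
}

// Environment passed in explicitly so the loader is easy to test.
type Env = Record<string, string | undefined>;

function loadConfig(env: Env): ServiceConfig {
  const missing: string[] = [];

  // Collect every missing key so the error lists all of them at once.
  const required = (key: string): string => {
    const value = env[key];
    if (value === undefined || value === '') {
      missing.push(key);
      return '';
    }
    return value;
  };

  const databaseUrl = required('DATABASE_URL');
  const storagePath = required('STORAGE_PATH');
  const port = Number(env['PORT'] ?? '8080');

  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(', ')}`);
  }
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`PORT must be an integer between 1 and 65535, got "${env['PORT']}"`);
  }
  return { databaseUrl, port, storagePath };
}
```

In a Node.js service this would be called once at startup as `loadConfig(process.env)`, so a hardcoded path that was never exported as a variable surfaces as a startup error in the container rather than as a runtime failure.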

Multi-Stage Dockerfile for Spring Boot

    Dockerfile
   
 

# Stage 1: Build
FROM maven:3.9.4-eclipse-temurin-17 AS build
WORKDIR /app
COPY pom.xml .
# Cache dependencies separately from source code
RUN mvn dependency:go-offline -q
COPY src ./src
RUN mvn clean package -DskipTests

# Stage 2: Runtime
FROM eclipse-temurin:17-jre-jammy AS runtime
WORKDIR /app

# Create a non-root user to run the service
RUN addgroup --system appgroup && adduser --system --ingroup appgroup appuser

COPY --from=build --chown=appuser:appgroup /app/target/*.jar app.jar
USER appuser

# JVM tuning for containerized environments
# HeapDumpPath points at /tmp because /app is not writable by appuser
ENV JAVA_OPTS="-XX:+UseContainerSupport \
               -XX:MaxRAMPercentage=75.0 \
               -XX:+UseG1GC \
               -XX:+HeapDumpOnOutOfMemoryError \
               -XX:HeapDumpPath=/tmp"

EXPOSE 8080

# Run through a shell so $JAVA_OPTS is expanded; exec lets java receive SIGTERM directly
ENTRYPOINT ["sh", "-c", "exec java $JAVA_OPTS -jar app.jar"]
  

Critical flags explained:

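-XX:+UseContainerSupport: tells the JVM to size itself from the container's cgroup memory limit rather than the host's total RAM. It has been enabled by default since JDK 10 (and 8u191), so on Java 17 the flag mainly documents intent. On older JVMs without container support, the default heap would be 25% of the host's RAM (16GB on a 64GB host) even though the container limit is 2GB.
-XX:MaxRAMPercentage=75.0: caps the maximum heap at 75% of the container's memory limit, up from the 25% default. The remainder is left for metaspace, thread stacks, and other off-heap memory.
-XX:+HeapDumpOnOutOfMemoryError: writes a heap dump file on OOM for post-mortem analysis.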
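The arithmetic behind the heap flags is worth working through once. The sketch below is only a calculator for the percentage rule, assuming the JVM default of 25% when `MaxRAMPercentage` is not set and ignoring the special cases the JVM applies to very small memory sizes. `maxHeapBytes` is an illustrative helper, not a JVM API.

```typescript
const MIB = 1024 * 1024;

// Approximate maximum heap for a given memory limit and MaxRAMPercentage.
// 25 is the JVM default when the flag is not set.
function maxHeapBytes(memoryLimitBytes: number, maxRamPercentage = 25): number {
  return Math.floor(memoryLimitBytes * (maxRamPercentage / 100));
}

const containerLimitBytes = 1024 * MIB; // a 1Gi container limit
const hostMemoryBytes = 64 * 1024 * MIB; // a 64GiB host

// With MaxRAMPercentage=75 the heap can grow to 768 MiB of the 1Gi limit.
const tunedHeapMib = maxHeapBytes(containerLimitBytes, 75) / MIB; // 768

// With the default percentage the same container gets only 256 MiB of heap.
const defaultHeapMib = maxHeapBytes(containerLimitBytes) / MIB; // 256

// A JVM that sized itself from the host instead would aim for 16384 MiB,
// far above what the container is allowed to use.
const hostSizedHeapMib = maxHeapBytes(hostMemoryBytes) / MIB; // 16384
```

The gap between the heap ceiling and the container limit is deliberate: if the heap alone can consume the full limit, off-heap allocations push the process over it and the container is killed before the JVM can report an OutOfMemoryError.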

Kubernetes Deployment With Resource Limits

    YAML
   
 

apiVersion: apps/v1
kind: Deployment
metadata:
  name: document-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: document-service
  template:
    metadata:
      labels:
        app: document-service  # must match spec.selector.matchLabels
    spec:
      containers:
        - name: document-service
          image: registry.internal/document-service:1.0.0
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "1Gi"
              cpu: "1000m"
          readinessProbe:
            httpGet:
              path: /actuator/health/readiness
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
          livenessProbe:
            httpGet:
              path: /actuator/health/liveness
              port: 8080
            initialDelaySeconds: 60
            periodSeconds: 20
          env:
            - name: SPRING_DATASOURCE_URL
              valueFrom:
                secretKeyRef:
                  name: db-credentials
                  key: url  # hypothetical key; use the key defined in your Secret
  

Phase 2: The Strangler Fig Pattern in Practice

The Strangler Fig pattern, named after the fig tree that grows around a host tree and gradually replaces it, lets you replace a system piece by piece while it keeps serving traffic. In our experience it is the most dependable way to keep risk low in a large-scale migration.

The Routing Proxy

We deployed an NGINX proxy in front of all legacy services. New endpoints are progressively routed to the new service; legacy endpoints remain on the old system until they are fully replaced and validated.

    Nginx
   
 

upstream legacy_document_service {
    server legacy-docs:8080;
}

upstream new_document_service {
    server new-docs:8080;
}

# Percentage split keyed on the request ID: 5% of requests go to the new service
split_clients "${request_id}" $generate_upstream {
    5%      new_document_service;
    *       legacy_document_service;
}

server {
    # Exact-match location: it takes precedence over the regex locations below,
    # which would otherwise capture /api/v1/documents/generate first
    location = /api/v1/documents/generate {
        proxy_pass http://$generate_upstream;
    }

    location ~ ^/api/v1/documents/(.*)$ {
        # Legacy routes still served by old service
        proxy_pass http://legacy_document_service;
    }

    location ~ ^/api/v2/documents/(.*)$ {
        # New endpoints served by migrated service
        proxy_pass http://new_document_service;
    }
}
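The percentage split is easy to reason about in isolation: hash a stable routing key into a fixed number of buckets and send the lowest buckets to the new service. The sketch below illustrates that idea in TypeScript. `chooseUpstream` and the FNV-1a hash are illustrative choices, not what NGINX uses internally, and the routing key could be a request ID, a tenant ID, or any other stable value.

```typescript
// FNV-1a 32-bit hash: small, deterministic, and adequate for bucketing.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

type Upstream = 'legacy_document_service' | 'new_document_service';

// Sends roughly `percentToNew` percent of routing keys to the new service.
// The same key always lands in the same bucket, so raising the percentage
// only ever moves keys from legacy to new, never back.
function chooseUpstream(routingKey: string, percentToNew: number): Upstream {
  const bucket = fnv1a(routingKey) % 10000; // 0..9999, i.e. 0.01% steps
  return bucket < percentToNew * 100
    ? 'new_document_service'
    : 'legacy_document_service';
}
```

Keying on something stable per client rather than per request keeps a given caller on one implementation for the whole rollout, which makes discrepancies much easier to trace.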
  

Shadow Mode Testing

Before cutting over a migrated endpoint, we ran it in shadow mode: the request was sent to both the old and new service simultaneously, but only the old service's response was returned to the client. We logged and compared both responses.

    TypeScript
   
 

// Shadow mode middleware for validation
import axios, { AxiosResponse } from 'axios';
import { Request, Response, NextFunction } from 'express';
// buildLegacyRequest, buildNewServiceRequest, isEquivalentResponse, logger, and metrics
// are application-specific helpers defined elsewhere in the service.

async function shadowTest(req: Request, res: Response, next: NextFunction) {
  try {
    // Send request to legacy system; its response is the one the client sees
    const legacyResponse = await axios(buildLegacyRequest(req));

    // Asynchronously compare with new service (fire-and-forget)
    shadowCompare(req, legacyResponse).catch((err) =>
      logger.warn('Shadow test failed', { path: req.path, error: err.message })
    );

    // Return legacy response to client
    res.status(legacyResponse.status).json(legacyResponse.data);
  } catch (err) {
    // Legacy failures go through the normal Express error handler
    next(err);
  }
}

async function shadowCompare(req: Request, legacyResponse: AxiosResponse) {
  const newResponse = await axios(buildNewServiceRequest(req));

  const match =
    legacyResponse.status === newResponse.status &&
    isEquivalentResponse(legacyResponse.data, newResponse.data);

  await metrics.record({
    endpoint: req.path,
    match,
    legacyDuration: legacyResponse.headers['x-response-time'],
    newDuration: newResponse.headers['x-response-time'],
  });
}
  

This approach let us identify 14 behavioral discrepancies in the new service before any real traffic hit it — issues that would have been production incidents under a hard cutover.
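Most of the value in shadow testing comes from how responses are compared: a byte-for-byte comparison flags every timestamp and generated ID as a mismatch. The sketch below shows one possible shape for the `isEquivalentResponse` helper, assuming JSON payloads and a hypothetical list of volatile fields to ignore. It is not the implementation used in the original project.

```typescript
// Fields expected to differ between two otherwise identical responses.
// These names are hypothetical; list whatever your payloads contain.
const VOLATILE_FIELDS = ['generatedAt', 'requestId', 'traceId'];

function isPlainObject(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Returns a human-readable list of differences, each prefixed with its JSON path.
function diffResponses(a: unknown, b: unknown, path = '$'): string[] {
  if (Array.isArray(a) && Array.isArray(b)) {
    if (a.length !== b.length) {
      return [`${path}: array length ${a.length} vs ${b.length}`];
    }
    const diffs: string[] = [];
    for (let i = 0; i < a.length; i++) {
      diffs.push(...diffResponses(a[i], b[i], `${path}[${i}]`));
    }
    return diffs;
  }
  if (isPlainObject(a) && isPlainObject(b)) {
    const keys = Object.keys(a).concat(
      Object.keys(b).filter((key) => !(key in a)),
    );
    const diffs: string[] = [];
    for (const key of keys) {
      if (VOLATILE_FIELDS.includes(key)) continue;
      diffs.push(...diffResponses(a[key], b[key], `${path}.${key}`));
    }
    return diffs;
  }
  return a === b ? [] : [`${path}: ${JSON.stringify(a)} vs ${JSON.stringify(b)}`];
}

function isEquivalentResponse(a: unknown, b: unknown): boolean {
  return diffResponses(a, b).length === 0;
}
```

Logging the output of `diffResponses` alongside the boolean match makes each discrepancy actionable, because the path points at the exact field where the new service diverges.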

Phase 3: Database Decoupling

The trickiest part of microservice migration is the database. Three services shared a single PostgreSQL schema. Decoupling them required the following sequence:

1. Introduce an Anti-Corruption Layer (ACL)

Before splitting the schema, each service accesses the shared database through a dedicated adapter module. This creates a seam for future extraction.

    TypeScript
   
// Before: direct shared DB access
const user = await db.query('SELECT * FROM shared.users WHERE id = $1', [userId]);

// After: routed through the ACL
import { UserRepository } from '@domain/users/repository';

// Assumes UserRepository wraps the shared database connection
const userRepository = new UserRepository(db);
const user = await userRepository.findById(userId);
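What makes this a seam is that callers depend on an interface and a domain model, while the SQL and the legacy column names live in exactly one adapter. The sketch below shows one way to structure that. The column names (`email_addr`, `first_nm`, `last_nm`), the `QueryFn` type, and the class name are all hypothetical.

```typescript
// Domain model the rest of the service depends on.
interface User {
  id: string;
  email: string;
  displayName: string;
}

// Row shape as stored in the shared schema (hypothetical column names).
interface SharedUserRow {
  id: string;
  email_addr: string;
  first_nm: string;
  last_nm: string;
}

// The seam: callers depend on this interface, not on the schema.
interface UserRepository {
  findById(id: string): Promise<User | undefined>;
}

// Minimal query function type so the adapter is not tied to one driver.
type QueryFn = (sql: string, params: unknown[]) => Promise<SharedUserRow[]>;

// Translation from legacy row to domain model happens in exactly one place.
function toUser(row: SharedUserRow): User {
  return {
    id: row.id,
    email: row.email_addr.toLowerCase(),
    displayName: `${row.first_nm} ${row.last_nm}`.trim(),
  };
}

class SharedSchemaUserRepository implements UserRepository {
  private readonly query: QueryFn;

  constructor(query: QueryFn) {
    this.query = query;
  }

  async findById(id: string): Promise<User | undefined> {
    const rows = await this.query(
      'SELECT id, email_addr, first_nm, last_nm FROM shared.users WHERE id = $1',
      [id],
    );
    return rows.length > 0 ? toUser(rows[0]) : undefined;
  }
}
```

When the users table later moves to its own schema or service, only a new `UserRepository` implementation is needed. Code that calls `findById` does not change.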

2. Schema Versioning With Flyway

Every schema change goes through Flyway migrations, versioned and reviewed as code:

    SQL
   
 

-- V2.1.3__extract_document_metadata_to_service_schema.sql
-- Create the new schema owned by document-service
CREATE SCHEMA IF NOT EXISTS document_service;

-- Copy data (non-destructive)
CREATE TABLE document_service.metadata AS
SELECT id, document_id, created_at, created_by, file_size
FROM shared.document_metadata;

-- Add constraints to new table
ALTER TABLE document_service.metadata
  ADD CONSTRAINT pk_metadata PRIMARY KEY (id),
  ADD CONSTRAINT fk_document FOREIGN KEY (document_id)
      REFERENCES document_service.documents(id);

-- Dual-write trigger during migration window (dropped after cutover)
CREATE OR REPLACE FUNCTION sync_metadata_to_new_schema()
RETURNS TRIGGER AS $$
BEGIN
    INSERT INTO document_service.metadata (id, document_id, created_at, created_by, file_size)
    VALUES (NEW.id, NEW.document_id, NEW.created_at, NEW.created_by, NEW.file_size)
    ON CONFLICT (id) DO UPDATE
    SET document_id = EXCLUDED.document_id,
        created_at  = EXCLUDED.created_at,
        created_by  = EXCLUDED.created_by,
        file_size   = EXCLUDED.file_size;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER sync_metadata
AFTER INSERT OR UPDATE ON shared.document_metadata
FOR EACH ROW
EXECUTE FUNCTION sync_metadata_to_new_schema();  -- PostgreSQL 11+; use EXECUTE PROCEDURE on older versions
  

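The trigger mirrors every insert and update on the shared table into the new schema for the duration of the migration window. Because the shared table remains the source of truth until cutover, rollback is immediate: point the service back at the shared schema. Deletes are not mirrored by this trigger, so handle them separately if the table sees any.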
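Before cutting over, it helps to verify that the two tables agree rather than trusting the sync blindly. The sketch below compares rows already fetched from both schemas and reports what is missing, mismatched, or left over. The column names follow the migration above, while the TypeScript types and the `compareSchemas` name are assumptions.

```typescript
interface MetadataRow {
  id: number;
  document_id: number;
  created_at: string;
  created_by: string;
  file_size: number;
}

interface SyncReport {
  missingInNew: number[]; // present in shared, absent in the new schema
  mismatched: number[];   // present in both, but column values differ
  extraInNew: number[];   // present only in the new schema (e.g. deleted in shared)
}

function sameRow(a: MetadataRow, b: MetadataRow): boolean {
  return (
    a.document_id === b.document_id &&
    a.created_at === b.created_at &&
    a.created_by === b.created_by &&
    a.file_size === b.file_size
  );
}

function compareSchemas(sharedRows: MetadataRow[], newRows: MetadataRow[]): SyncReport {
  const newById = new Map<number, MetadataRow>();
  for (const row of newRows) newById.set(row.id, row);

  const report: SyncReport = { missingInNew: [], mismatched: [], extraInNew: [] };
  for (const row of sharedRows) {
    const copy = newById.get(row.id);
    if (copy === undefined) {
      report.missingInNew.push(row.id);
    } else if (!sameRow(row, copy)) {
      report.mismatched.push(row.id);
    }
    newById.delete(row.id);
  }
  // Whatever is left was never matched against a shared row.
  report.extraInNew = Array.from(newById.keys());
  return report;
}
```

An empty report across all three lists is a reasonable gate for cutover. For large tables the same comparison would be done in batches or pushed into SQL, which this sketch does not cover.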

Phase 4: Event-Driven Decoupling With Kafka

Synchronous service-to-service HTTP calls were causing cascading failures. When the quote-pricing service had elevated latency, the entire quote journey degraded. The solution: replace synchronous calls with asynchronous events via Kafka.

Before: Synchronous Chain

    Shell
   
   Quote Request → QuoteService → [HTTP] → PricingService → [HTTP] → EligibilityService

A 2-second latency spike in the EligibilityService propagated the full 2 seconds to the user.

After: Event-Driven Quote Journey

    TypeScript
   
 

   // QuoteService publishes an event and returns immediately
async function initiateQuote(request: QuoteRequest): Promise<QuoteAcknowledgement> {
  const quoteId = generateQuoteId();

  await kafkaProducer.send({
    topic: 'quote.initiated',
    messages: [{
      key: quoteId,
      value: JSON.stringify({ quoteId, ...request, timestamp: Date.now() }),
    }],
  });

  // Return immediately — processing is async
  return { quoteId, status: 'processing', estimatedCompletion: Date.now() + 3000 };
}

// PricingService subscribes to quote.initiated and publishes quote.priced
kafkaConsumer.on('quote.initiated', async (event) => {
  const price = await calculatePrice(event);
  await kafkaProducer.send({
    topic: 'quote.priced',
    messages: [{ key: event.quoteId, value: JSON.stringify({ ...event, price }) }],
  });
});

// EligibilityService subscribes to quote.priced
kafkaConsumer.on('quote.priced', async (event) => {
  const eligibility = await checkEligibility(event);
  await kafkaProducer.send({
    topic: 'quote.ready',
    messages: [{ key: event.quoteId, value: JSON.stringify({ ...event, eligibility }) }],
  });
});
  

The client receives a quote ID immediately and polls for completion (or receives a WebSocket push when quote.ready fires). EligibilityService latency no longer affects the user-perceived response time.
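For polling to work, something has to turn the event stream into a queryable status per quote. The sketch below is a minimal in-memory projection of the three topics, assuming `price` is a number and `eligibility` a boolean. The `QuoteStatusStore` class and the `priced` intermediate status are illustrative, and a real service would persist this state rather than hold it in memory.

```typescript
type QuoteEvent =
  | { topic: 'quote.initiated'; quoteId: string }
  | { topic: 'quote.priced'; quoteId: string; price: number }
  | { topic: 'quote.ready'; quoteId: string; price: number; eligibility: boolean };

type QuoteStatus = 'processing' | 'priced' | 'ready';

interface QuoteView {
  quoteId: string;
  status: QuoteStatus;
  price?: number;
  eligibility?: boolean;
}

const STATUS_ORDER: Record<QuoteStatus, number> = { processing: 0, priced: 1, ready: 2 };

function toView(event: QuoteEvent): QuoteView {
  switch (event.topic) {
    case 'quote.initiated':
      return { quoteId: event.quoteId, status: 'processing' };
    case 'quote.priced':
      return { quoteId: event.quoteId, status: 'priced', price: event.price };
    case 'quote.ready':
      return {
        quoteId: event.quoteId,
        status: 'ready',
        price: event.price,
        eligibility: event.eligibility,
      };
  }
}

class QuoteStatusStore {
  private readonly quotes = new Map<string, QuoteView>();

  // Messages can be redelivered or arrive late, so duplicates and
  // events older than the current status are ignored.
  apply(event: QuoteEvent): void {
    const incoming = toView(event);
    const current = this.quotes.get(event.quoteId);
    if (current !== undefined && STATUS_ORDER[current.status] >= STATUS_ORDER[incoming.status]) {
      return;
    }
    this.quotes.set(event.quoteId, incoming);
  }

  // What a polling endpoint such as GET /quotes/:id would return.
  get(quoteId: string): QuoteView | undefined {
    return this.quotes.get(quoteId);
  }
}
```

The client keeps polling until `status` is `ready`. Because each later event carries the earlier fields forward, the store only ever needs the most recent event for a quote.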

Kafka Consumer Error Handling With Dead Letter Queue

    TypeScript
   
 

await consumer.run({
  eachMessage: async ({ topic, partition, message }) => {
    try {
      await processMessage(message);
    } catch (error) {
      const reason = error instanceof Error ? error.message : String(error);
      const retryCount = parseInt(message.headers?.['retry-count']?.toString() ?? '0', 10);

      if (retryCount < MAX_RETRIES) {
        // Publish to retry topic with exponential backoff metadata
        await producer.send({
          topic: `${topic}.retry`,
          messages: [{
            key: message.key,
            value: message.value,
            // Keep the original headers and update the retry bookkeeping
            headers: {
              ...message.headers,
              'retry-count': String(retryCount + 1),
              'retry-after': String(Date.now() + 2 ** retryCount * 1000),
            },
          }],
        });
      } else {
        // Exhausted retries: send to DLQ for manual investigation
        await producer.send({
          topic: `${topic}.dlq`,
          messages: [{
            key: message.key,
            value: message.value,
            headers: { ...message.headers, error: reason },
          }],
        });
        logger.error('Message sent to DLQ', { topic, partition, offset: message.offset, error: reason });
      }
    }
  },
});
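The retry-or-DLQ decision is easier to test when it is separated from the Kafka client. The sketch below extracts it into pure functions, assuming a 1-second base delay that doubles with each attempt. `planRetry`, `isDue`, and `BASE_DELAY_MS` are illustrative names, and the header names follow the handler above.

```typescript
interface RetryDecision {
  topic: string;
  headers: Record<string, string>;
}

const BASE_DELAY_MS = 1000;

// Exponential backoff: 1s, 2s, 4s, ... for retryCount 0, 1, 2, ...
function backoffDelayMs(retryCount: number): number {
  return 2 ** retryCount * BASE_DELAY_MS;
}

// Decides where a failed message goes next and which headers it carries.
function planRetry(
  sourceTopic: string,
  retryCount: number,
  maxRetries: number,
  nowMs: number,
  errorMessage: string,
): RetryDecision {
  if (retryCount < maxRetries) {
    return {
      topic: `${sourceTopic}.retry`,
      headers: {
        'retry-count': String(retryCount + 1),
        'retry-after': String(nowMs + backoffDelayMs(retryCount)),
      },
    };
  }
  return {
    topic: `${sourceTopic}.dlq`,
    headers: { 'retry-count': String(retryCount), error: errorMessage },
  };
}

// A consumer of the retry topic can use this to check whether the
// backoff period recorded in the headers has elapsed.
function isDue(headers: Record<string, string | undefined>, nowMs: number): boolean {
  const retryAfter = Number(headers['retry-after'] ?? '0');
  return nowMs >= retryAfter;
}
```

The `retry-after` header is only metadata: Kafka does not delay delivery on its own, so whatever consumes the retry topic has to check `isDue` and wait or pause until the message is due.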
  

Results Across Projects

Cloud migration:

Document generation latency: P95 reduced from 4.2s → 0.9s
Service deployment time: reduced from 45 minutes → 6 minutes (containerized CI/CD)
Zero production incidents during migration due to shadow testing and the strangler fig approach

Health quote journey:

Quote journey error rate: reduced from 1.8% → 0.12%
P99 quote initiation latency: reduced from 8.1s → 320ms (async decoupling)
Infrastructure cost: reduced by 31% through right-sized containers vs. over-provisioned VMs

Migration Playbook: Summary

Phase	Goal	Key Techniques
0: Characterize	Establish baseline	Dependency mapping, performance profiling
1: Containerize	Remove environment coupling	Multi-stage Docker, Kubernetes with resource limits
2: Strangle	Risk-free incremental migration	Routing proxy, shadow mode testing
3: Decouple DB	Eliminate shared schema anti-pattern	ACL, Flyway versioning, dual-write triggers
4: Go async	Eliminate cascade failures	Kafka event streams, DLQ for error resilience

Conclusion

Microservice modernization is not a technology problem — it's a sequencing problem. The technologies (containers, Kafka, modern JVM runtimes) are mature and well-documented. The challenge is doing it without breaking production systems, maintaining team velocity, and building confidence incrementally. The strangler fig pattern, shadow mode testing, and phased database decoupling are the tools that make the difference between a successful modernization and a multi-year failed rewrite.

GitHub: Microservices-Migration-Framework-Java-TypeScript
