DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Related

  • When Events Move Faster Than Your Database: A Resilient Design Pattern
  • Clock Synchronization and Ordering Events in Distributed Systems: Lamport Clocks vs. Vector Clocks
  • Building a Deterministic Event Correlation Engine in Go for High-Volume Alert Systems
  • Queues Don't Absorb Load — They Delay Bankruptcy

Trending

  • If You Can Survive a Toddler, You Can Ship LLMs in Production
  • Getting Started With Agentic Workflows in Java and Quarkus
  • Good Data, Bad Metric: A Mutation Testing Pattern for Analytics Engineering
  • Liquid Glass, Material 3, and a Lot of Plumbing
  1. DZone
  2. Data Engineering
  3. Databases
  4. Solving Real-Time Event Correlation in Distributed Systems

Solving Real-Time Event Correlation in Distributed Systems

Learn how to join streaming events in real time using Spring Boot with event-time windows, watermarking, and in-memory state — no Kafka or Flink needed.

By 
Rahul Tewari user avatar
Rahul Tewari
·
Nov. 27, 25 · Analysis
Likes (0)
Comment
Save
Tweet
Share
2.0K Views

Join the DZone community and get the full member experience.

Join For Free

Modern digital platforms operate as distributed ecosystems — microservices emitting events, APIs exchanging data, and asynchronous communication becoming the norm. In such environments, correlating events across multiple sources in real time becomes a critical requirement.

Think of payments, orders, customer metadata, IoT sensors, logistics tracking — all flowing continuously.

These events rarely arrive in order, rarely originate from one system, and often must be joined before downstream processing. This leads to a crucial question:

How do we join two streaming event sources (left and right streams) by key and time window using plain Spring Boot, without Kafka Streams, Spark, or Flink?

This blog explains a clean, production-ready way to implement:

  • Real-time stream join
  • Event-time based correlation
  • Watermarking to drop late events
  • Sliding/Tumbling window joins
  • In-memory state store
  • REST API-based ingestion

Why Do We Need a Streaming Join Engine?

In enterprise systems, operations often depend on combining information from multiple event producers.

1. Order + Customer Metadata

  • Left stream: Order event from OMS.
  • Right stream: Customer country/geo from User Service. This is needed for real-time enrichment before invoicing or fraud detection.

2. Payment Event + Fraud Score Event

  • Payment attempt arrives.
  • Fraud score arrives from ML model. Join is required within 5–10 seconds to allow real-time approval decisions.

3. IoT Sensor Fusion

  • Temperature event
  • Humidity event. Must be joined within milliseconds for accurate predictions.

4. Telecom Real-Time Monitoring

  • Subscriber movement.
  • Tower signal. Join is needed for real-time network-performance metrics.

These joins cannot rely on nightly batch jobs, because decisions must be:

  • Real-time
  • Event-driven
  • Low-latency
  • Correct, even if events arrive out of order

Core Challenge: Event-Time Disorder and Late Arrivals

In real systems:

  • Network jitter delays events
  • Services are produced at different speeds
  • Clock skew causes timestamp variations
  • Event bursts cause uneven flow
  • Events can arrive seconds to minutes late

A naïve "join the last event you saw" strategy will break.

Example

Left Event (Order)

  • t = 10:00:05

Right Event (CustomerMeta)

  • t = 10:00:02 (arrives late at 10:00:08)

→ Without event-time logic, this join fails.
→ Business logic says it should match.

This is why we need:

  • Event-time based windows
  • Watermarking
  • Late-event handling

Architecture: Event-Time Join Engine With Watermarking

Event-time join engine with watermarking

Event Models

Left Event

Java
 
public class LeftEvent {
    private String key;
    private long eventTime; // epoch millis
    private String orderId;
    private double amount;
}


Right Event

Java
 
public class RightEvent {
    private String key;
    private long eventTime; // epoch millis
    private String country;
}


How the Join Works: Step-by-Step

1. Accept Event via REST

Java
 
@PostMapping("/left")  → engineService.acceptLeft(e);
@PostMapping("/right") → engineService.acceptRight(e);


Events are added to:

  • leftStore
  • rightStore

2. Update Maximum Event Time

Python
 
maxEventTime = max(maxEventTime, event.eventTime)


3. Compute Watermark

Allowed lateness = 10 seconds → 10000 ms

Python
 
watermark = maxEventTime - 10000


Events older than this are discarded.

4. Perform Join

On the arrival of a new Left Event:

Python
 
scan RightStore[key]
match if |leftTime - rightTime| <= joinWindow


Same for Right Event.

5. Emit Joined Output

A sample join result:

JSON
 
{
  "key": "CUST-1002",
  "orderId": "ORD-9002",
  "amount": 1989.75,
  "country": "India",
  "joinedAt": 1700000105000
}


6. Cleanup Using Watermark

Any event older than the watermark is removed from memory.

Example API Requests and Outputs

POST /api/left

JSON
 
{
  "key": "CUST-1002",
  "eventTime": 1700649000202,
  "orderId": "ORD-9002",
  "amount": 1989.75
}


Response: accepted.

Response: accepted

State:

Python
 
leftStore["CUST-1002"] = [{ORD-9002, 1989.75}]
rightStore["CUST-1002"] = []


POST /api/right

JSON
 
{
  "key": "CUST-1002",
  "eventTime": 1700649000152,
  "country": "India"
}

POST /api/right

Joined Output

JSON
 
{
  "key": "CUST-1002",
  "orderId": "ORD-9002",
  "amount": 1989.75,
  "country": "India",
  "joinedAt": 1700000105000
}


Watermark Example

  • Max event time: 1700000105000
  • Allowed lateness: 10s 
  • Watermark: 1700000095000

API endpoint: GET /api/watermark

Response: 1700000095000 

REST Controller (Final Version)

Java
 
@RestController
@RequestMapping("/api")
public class EventController {

    private final EngineService engineService;

    public EventController(EngineService engineService) {
        this.engineService = engineService;
    }

    @PostMapping("/left")
    public ResponseEntity<?> postLeft(@RequestBody LeftEvent e) {
        engineService.acceptLeft(e);
        return ResponseEntity.ok().body("accepted");
    }

    @PostMapping("/right")
    public ResponseEntity<?> postRight(@RequestBody RightEvent e) {
        engineService.acceptRight(e);
        return ResponseEntity.ok().body("accepted");
    }

    @GetMapping("/watermark")
    public ResponseEntity<?> watermark() {
        return ResponseEntity.ok().body(engineService.getWatermark());
    }
}


Endpoints

Endpoint description

POST /api/left

Ingest Left events

POST /api/right

Ingest Right events

GET /api/watermark

Shows computed watermark


Why Not Use a Database JOIN?

Because database joins are:

  • Based on stored/batch data
  • High latency (seconds to minutes)
  • Order-dependent (arrival order matters)
  • Cannot handle late events.
  • Cannot scale for streaming workloads

Our Spring Boot join engine provides:

  • Millisecond joins
  • Accurate event-time semantics
  • In-memory performance
  • Watermark-based cleanup
  • No external streaming infrastructure

Practical Real-Time Use Case: E-Commerce Order Enrichment

Business Flow

1. Order Service emits:

  • orderId
  • amount
  • eventTime
  • customerId (key)

2. Customer Profile Service emits:

  • customerId
  • country
  • eventTime

Business Need

Before routing an order to:

  • Tax engine
  • Fraud decisioning
  • Payment gateway
  • Invoicing pipeline

It must be enriched with the customer’s country.

Join Logic

  • Join window: 10 seconds
  • Join based on: event-time, not arrival time

Why?

  • The customer profile may arrive late.
  • Order service may be faster.
  • Network latency varies.
  • A message queue (Kafka/SQS) might delay only one stream.

Example Timeline

Time (Actual) event arrival time

10:00:02

RightEvent(country=“IN”)

arrives at 10:00:05

10:00:05

LeftEvent(order=550)

arrives at 10:00:05


Both timestamps fall within the 10-second window.

→ Join event created.

Real-Time Benefits

  • Sub-second join processing. Accurate even when events arrive out of order.
  • No Kafka, Flink, Spark required.
  • Stateless REST producers → State held centrally.
  • Runs on any environment (VM, on-prem, Kubernetes).
  • Production-ready with watermark-based cleanup

Where This Is Used in the Real World

  • Fraud detection: Join payment events with risk-score events.
  • Real-time order enrichment: Join orders with customer metadata (country, geo, segment).
  • Logistics/fleet tracking: Join GPS location with route updates.
  • IoT sensor fusion: Combine temperature + humidity into a unified sensor reading.
  • Telecom analytics: Join subscriber session with tower signal events.
  • Online gaming: Join user action events with current session metadata.

Final Summary

You now have a complete:

  • Real-time streaming join engine
  • Event-time window correlation
  • Watermarking and cleanup
  • REST-based ingestion
  • In-memory state store
  • Real-world business use case
  • Ready-to-deploy Spring Boot implementation

This is a lightweight, efficient alternative to Kafka Streams/Flink for small-to-medium-scale real-time joins.

Event Joins (concurrency library) systems

Opinions expressed by DZone contributors are their own.

Related

  • When Events Move Faster Than Your Database: A Resilient Design Pattern
  • Clock Synchronization and Ordering Events in Distributed Systems: Lamport Clocks vs. Vector Clocks
  • Building a Deterministic Event Correlation Engine in Go for High-Volume Alert Systems
  • Queues Don't Absorb Load — They Delay Bankruptcy

Partner Resources

×

Comments

The likes didn't load as expected. Please refresh the page and try again.

  • RSS
  • X
  • Facebook

ABOUT US

  • About DZone
  • Support and feedback
  • Community research

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 215
  • Nashville, TN 37211
  • [email protected]

Let's be friends:

  • RSS
  • X
  • Facebook