Implementing Asynchronous Communication Between Microservices Using Kafka and Spring Boot

Kafka decouples services, buffers spikes, and routes failures to a DLT. Schemas are contracts; consumers must be idempotent.

Jun. 24, 26 · Tutorial

Likes (1)

Comment

Save

2.2K Views

In a microservices system, that tight coupling turns a small hiccup into a cascading slowdown. Thread pools fill, retries amplify traffic, and suddenly your simple request is blocked on half the fleet. My executive summary: asynchronous messaging with Kafka helps systems keep moving when individual components inevitably slow down or fail. It does this by decoupling producers from consumers, absorbing traffic spikes, and allowing services to evolve without tying their availability directly to one another.

Code Patterns in Spring Boot With Kafka

Spring for Apache Kafka gives me two primitives that feel pleasantly old Spring KafkaTemplate for sending and @KafkaListener for receiving. That template/listener model is intentionally similar to other Spring integration tech, which keeps application code focused on domain logic instead of raw client plumbing.

Below is a compact (but production-shaped) pattern: externalized config via @ConfigurationProperties, a service port for publishing, a REST command endpoint, a consumer with a real error strategy (DLT), and a REST error advice.

    Java
   
 

   // === Messaging config (externalized, type-safe) ===
@ConfigurationProperties(prefix = "messaging.orders")
@Validated
record OrdersMessagingProps(
    @NotBlank String topic,
    @NotBlank String dltTopic
) {}

// === DTO (event contract) ===
public record OrderCreatedEvent(UUID orderId, UUID userId, BigDecimal total, Instant createdAt) {}

// === Service port (keeps domain testable, Kafka swappable) ===
public interface OrderEventPublisher {
  void publishOrderCreated(OrderCreatedEvent event);
}

// === Adapter: Kafka producer ===
@Component
class KafkaOrderEventPublisher implements OrderEventPublisher {
  private final KafkaTemplate<String, OrderCreatedEvent> template;
  private final OrdersMessagingProps props;

  KafkaOrderEventPublisher(KafkaTemplate<String, OrderCreatedEvent> template, OrdersMessagingProps props) {
    this.template = template;
    this.props = props;
  }

  @Override
  public void publishOrderCreated(OrderCreatedEvent event) {
    // Keying by orderId keeps per-order ordering and drives partitioning decisions.
    template.send(props.topic(), event.orderId().toString(), event);
  }
}

// === REST command API (synchronous edge, async core) ===
@RestController
@RequestMapping("/v1/orders")
class OrdersController {
  private final OrderService orderService; // domain port

  OrdersController(OrderService orderService) { this.orderService = orderService; }

  @PostMapping
  public ResponseEntity<Map<String, Object>> create(@Valid @RequestBody CreateOrderRequest req) {
    UUID orderId = orderService.create(req.userId(), req.total()); // persists + publishes event
    return ResponseEntity.accepted().body(Map.of("orderId", orderId, "status", "ACCEPTED"));
  }

  record CreateOrderRequest(@NotNull UUID userId, @NotNull @Positive BigDecimal total) {}
}

// === Domain service port (implementation can use outbox, transactions, etc.) ===
public interface OrderService {
  UUID create(UUID userId, BigDecimal total);
}

// === Consumer: downstream service reacts to events ===
@Component
class BillingListener {
  @KafkaListener(topics = "${messaging.orders.topic}", groupId = "${spring.kafka.consumer.group-id}")
  void onOrderCreated(OrderCreatedEvent event) {
    // Idempotency belongs here: process-by-key + store processed eventId/orderId to avoid duplicates.
    // Do work (charge card, create invoice, etc.)
  }
}

// === Kafka consumer error handling: retries + DLT ===
@Configuration
class KafkaErrorHandlingConfig {

  @Bean
  DefaultErrorHandler defaultErrorHandler(KafkaTemplate<Object, Object> template,
                                         OrdersMessagingProps props) {
    var recoverer = new DeadLetterPublishingRecoverer(template,
        (rec, ex) -> new TopicPartition(props.dltTopic(), rec.partition()));
    // Backoff and retry policy are configurable; keep it finite to avoid poison-pill loops.
    return new DefaultErrorHandler(recoverer, new FixedBackOff(1000L, 3));
  }
}

// === REST error handling (ProblemDetail) ===
@RestControllerAdvice
class ApiErrors {

  @ExceptionHandler(IllegalArgumentException.class)
  @ResponseStatus(HttpStatus.BAD_REQUEST)
  ProblemDetail badRequest(IllegalArgumentException ex) {
    var pd = ProblemDetail.forStatusAndDetail(HttpStatus.BAD_REQUEST, ex.getMessage());
    pd.setTitle("Invalid request");
    return pd;
  }
}

  

A few been-burned-before notes on the code above. Spring Kafka’s reference docs are explicit that KafkaTemplate is the convenience wrapper for producing, and DefaultErrorHandler + DeadLetterPublishingRecoverer is a first-class way to route failed records to dead-letter topics after retries.

If we want non-blocking retries, Spring Kafka also provides @RetryableTopic, which orchestrates retry topics and a DLT automatically useful when transient failures are common and you want predictable retry delay semantics.

Containers and Local Dev With Docker Compose

When I’m chasing down event flow bugs, I like local environments that feel like the old days: one command, deterministic startup order, and no mystery dependencies. Docker Compose is still the quickest way to stand up Kafka alongside your services, and Confluent publishes straightforward Docker-based tutorials and compose examples for running Kafka locally.

For the service image itself, multi-stage builds are the modern classic compile in a builder stage, and copy the artifact into a slimmer runtime stage. Docker documents multi-stage builds as a way to reduce the final image contents and keep build dependencies out of production.

    Dockerfile
   
 

   # Multi-stage Dockerfile for a Spring Boot service (orders-service)
FROM eclipse-temurin:21-jdk AS build
WORKDIR /workspace
COPY mvnw pom.xml ./
COPY .mvn .mvn
RUN ./mvnw -q -DskipTests dependency:go-offline
COPY src src
RUN ./mvnw -q -DskipTests package

FROM eclipse-temurin:21-jre
WORKDIR /app
COPY --from=build /workspace/target/*.jar app.jar
EXPOSE 8080
ENTRYPOINT ["java","-jar","/app/app.jar"]
  

And here’s a Compose file that wires up Kafka and Schema Registry, plus an example Spring Boot service. The exact image choices are illustrative. Your production choices are unspecified and should reflect your standards and security posture.

    YAML
   
 

   # compose.yaml (local/dev)
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:7.6.0
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181

  kafka:
    image: confluentinc/cp-kafka:7.6.0
    depends_on: [zookeeper]
    ports: ["9092:9092"]
    environment:
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092,PLAINTEXT_HOST://localhost:9092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1

  schema-registry:
    image: confluentinc/cp-schema-registry:7.6.0
    depends_on: [kafka]
    ports: ["8081:8081"]
    environment:
      SCHEMA_REGISTRY_HOST_NAME: schema-registry
      SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: PLAINTEXT://kafka:9092

  orders:
    build: ./orders-service
    depends_on: [kafka]
    ports: ["8080:8080"]
    environment:
      SPRING_KAFKA_BOOTSTRAP_SERVERS: kafka:9092
      MESSAGING_ORDERS_TOPIC: orders.events
      MESSAGING_ORDERS_DLTTOPIC: orders.events.dlt
      SCHEMA_REGISTRY_URL: http://schema-registry:8081
  

Deploying on Kubernetes or AWS

On AWS, the Kafka decision is usually managed or self-managed. If you choose Amazon MSK, the cluster lives in your VPC, pick subnets across distinct Availability Zones, and connect clients using the cluster’s bootstrap brokers. That’s the networking baseline, and it’s not optional. MSK is VPC-first by design.

For authentication/authorization, MSK supports IAM access control. AWS documents the client configuration for IAM mechanisms. In EKS, I typically pair MSK IAM with IRSA so pods can obtain AWS credentials the AWS way, while ECS services would use task roles instead. Both patterns are documented by AWS, and your choice here is unspecified.

Kubernetes service discovery is usually the easy part. Services and Pods get DNS names so workloads can call each other by name rather than IP. Kafka itself is reached via bootstrap broker endpoints or via internal Services, but either way, you want the strings in externalized config, not hardcoded.

Here’s a minimal Kubernetes Deployment/Service for a Kafka client service. Values like region, account IDs, and MSK endpoints are unspecified placeholders.

    YAML
   
 

   apiVersion: apps/v1
kind: Deployment
metadata:
  name: orders
  namespace: apps
spec:
  replicas: 2
  selector:
    matchLabels: { app: orders }
  template:
    metadata:
      labels: { app: orders }
    spec:
      serviceAccountName: orders-sa  # IRSA-bound (role ARN unspecified)
      containers:
        - name: orders
          image: <UNSPECIFIED_AWS_ACCOUNT_ID>.dkr.ecr.<UNSPECIFIED_REGION>.amazonaws.com/orders:<TAG>
          ports: [{ containerPort: 8080 }]
          env:
            - name: SPRING_KAFKA_BOOTSTRAP_SERVERS
              value: "<UNSPECIFIED_MSK_BOOTSTRAP_BROKERS>"
            - name: MESSAGING_ORDERS_TOPIC
              value: "orders.events"
            - name: MESSAGING_ORDERS_DLTTOPIC
              value: "orders.events.dlt"
          readinessProbe:
            httpGet: { path: /actuator/health/readiness, port: 8080 }
            initialDelaySeconds: 10
---
apiVersion: v1
kind: Service
metadata:
  name: orders
  namespace: apps
spec:
  selector: { app: orders }
  ports:
    - port: 80
      targetPort: 8080
  

Operationally, MSK exposes metrics into CloudWatch (AWS/Kafka), and broker logs can be delivered to CloudWatch Logs (or S3/Firehose). That combination gives you the classic visibility loop: throughput, lag, under-replicated partitions, and error logs without running your own monitoring plane.

For distributed tracing in async flows, OpenTelemetry is my default vocabulary now. Spring Boot supports OpenTelemetry export via OTLP, and OpenTelemetry defines Kafka semantic conventions so your producer/consumer spans and attributes stay consistent across tools.

CI/CD and the Hard-Earned Field Notes

For CI/CD, I keep it boring: build once, push an immutable image, deploy via a declarative mechanism. AWS Prescriptive Guidance provides a clear GitHub Actions pattern for building Docker images and pushing to Amazon ECR, which is a solid baseline when your region/account is unspecified until configured.

    YAML
   
 

   # .github/workflows/orders.yml
name: orders

on:
  push:
    branches: ["main"]

jobs:
  build_push_deploy:
    runs-on: ubuntu-latest
    permissions:
      id-token: write
      contents: read
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-java@v4
        with:
          distribution: temurin
          java-version: "21"

      - name: Build & test
        run: ./mvnw -q test package

      - name: Configure AWS credentials (OIDC)
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::<UNSPECIFIED_AWS_ACCOUNT_ID>:role/<UNSPECIFIED_GHA_ROLE>
          aws-region: <UNSPECIFIED_REGION>

      - name: Login to ECR
        run: |
          aws ecr get-login-password --region <UNSPECIFIED_REGION> \
            | docker login --username AWS --password-stdin <UNSPECIFIED_AWS_ACCOUNT_ID>.dkr.ecr.<UNSPECIFIED_REGION>.amazonaws.com

      - name: Build & push image
        run: |
          IMAGE=<UNSPECIFIED_AWS_ACCOUNT_ID>.dkr.ecr.<UNSPECIFIED_REGION>.amazonaws.com/orders:${{ github.sha }}
          docker build -t $IMAGE ./orders-service
          docker push $IMAGE

      - name: Deploy to EKS (example)
        run: |
          aws eks update-kubeconfig --name <UNSPECIFIED_EKS_CLUSTER> --region <UNSPECIFIED_REGION>
          kubectl -n apps set image deploy/orders orders=$IMAGE
  

Now, the part I wish someone had handed me in 2016: Kafka gives you strong tools, but it does not remove distributed-systems truths. You still need safeguards on the consumer side: idempotent processing, disciplined schema management, and clearly defined retry and dead-letter topic behavior. Kafka’s documentation is careful about the limits of “exactly once” guarantees. Idempotent producers and transactions can strengthen delivery semantics, but achieving true end-to-end exactly-once behavior, especially when external side effects are involved, still depends on deliberate system design.

For schema governance, Kafka itself doesn’t ship a schema registry, but acknowledges third-party registries; in practice, Confluent Schema Registry and Apicurio Registry are common choices. Both store schemas out-of-band, so messages carry only a schema identifier, and both support evolvable contracts across Avro/JSON Schema/Protobuf depending on your ecosystem.

Conclusion and Best Practices

If you take one lesson from my legacy brain into modern event-driven systems, let it be this: asynchrony is a reliability feature, not a performance trick. Kafka’s durable log and consumer group model decouples uptime and absorbs spikes, but you only get the real benefit when you treat schemas as contracts, consumers as idempotent processors, and failure handling as first-class application behavior.

On AWS, the operational baseline is non-negotiable. MSK lives in your VPC across AZ subnets, clients connect via bootstrap brokers, IAM auth is configured explicitly, and observability lives in CloudWatch. Do those fundamentals early, and Kafka stops feeling like a mysterious black box and starts feeling like the dependable workhorse it was built to be.

AWS Contextual design REST Virtual private cloud Docker (software) kafka Schema Spring Boot microservices identity and access management

Opinions expressed by DZone contributors are their own.

Related

Trending