Zero-Downtime Deployments for Java Apps on Kubernetes

Achieve zero-downtime deployments for Java applications on Kubernetes using rolling updates, readiness/liveness probes, and graceful shutdown strategies.

Ramya vani Rayala

May. 29, 26 · Analysis

This article provides a comprehensive guide to achieving zero-downtime deployments for Java-based applications on Kubernetes.

We cover deployment strategies, Kubernetes primitives, Java-specific considerations, session state handling, database migrations, traffic shifting techniques, CI/CD pipelines (GitHub Actions, Jenkins) with automated rollbacks, observability (Prometheus, Grafana, Jaeger), Helm/ArgoCD examples, testing strategies (canary analysis, chaos, smoke tests), and troubleshooting.

Deployment Strategies

Kubernetes offers several strategies for deploying new versions without downtime:

Rolling Update

Incrementally replace old pods with new ones while maintaining availability. The Kubernetes Deployment object uses rolling updates by default. You can set maxUnavailable and maxSurge to tune the rollout.
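
To make the effect of maxSurge and maxUnavailable concrete, here is a minimal sketch of the arithmetic, assuming percentage values (Kubernetes rounds maxSurge up and maxUnavailable down when converting a percentage to a pod count). The class name RolloutBounds and the replica count of 10 are illustrative assumptions, not part of any Kubernetes API.

```java
// Sketch: pod-count bounds during a rolling update.
// Percentages are resolved against the desired replica count:
// maxSurge rounds up, maxUnavailable rounds down.
final class RolloutBounds {
    static int maxSurgePods(int replicas, int maxSurgePercent) {
        return (int) Math.ceil(replicas * maxSurgePercent / 100.0);
    }

    static int maxUnavailablePods(int replicas, int maxUnavailablePercent) {
        return (int) Math.floor(replicas * maxUnavailablePercent / 100.0);
    }

    public static void main(String[] args) {
        int replicas = 10;                       // hypothetical Deployment size
        int surge = maxSurgePods(replicas, 25);  // 25% is the Kubernetes default
        int unavailable = maxUnavailablePods(replicas, 25);
        System.out.println("max pods during rollout: " + (replicas + surge));       // 13
        System.out.println("min available pods: " + (replicas - unavailable));      // 8
    }
}
```

With small replica counts the rounding matters: at 3 replicas and 25%, maxUnavailable resolves to 0, so the rollout can only proceed by surging.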

Blue-Green Deployment

Run two separate environments: Blue (current) and Green (new). Only one serves live traffic at a time. Once the Green version is verified, switch the Service or Ingress to point at Green, then scale down Blue. This allows instant rollback by redirecting traffic back to Blue. Argo Rollouts defines a blue/green strategy with an active and a preview Service; traffic flows only to the active version until promotion.

Canary Deployment

Gradually shift a small percentage of traffic to the new version. Start with a few pods of v2, monitor, then incrementally increase. Tools like Istio or Argo Rollouts can control traffic weights. Without such tools, you can approximate a weight with replica counts: running 9 v1 pods and 1 v2 pod behind the same Service sends roughly 10% of traffic to v2. Argo Rollouts defines a canary rollout with setWeight steps and pauses for analysis.
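
The replica-based approximation is plain arithmetic. The sketch below assumes a Service that spreads requests evenly over all ready pods; CanarySplit and its method names are hypothetical helpers for illustration.

```java
// Sketch: replica-count arithmetic for a canary without a service mesh.
final class CanarySplit {
    /** Percent of traffic reaching v2 when requests are spread evenly over all pods. */
    static double canaryPercent(int v1Pods, int v2Pods) {
        return 100.0 * v2Pods / (v1Pods + v2Pods);
    }

    /** Number of v2 pods, out of totalPods, needed for v2 to receive at least targetPercent. */
    static int canaryPods(int totalPods, int targetPercent) {
        return (int) Math.ceil(totalPods * targetPercent / 100.0);
    }

    public static void main(String[] args) {
        System.out.println(canaryPercent(9, 1));  // 10.0
        System.out.println(canaryPods(10, 25));   // 3
    }
}
```

The granularity is limited by the pod count: with 4 pods in total, the smallest possible canary step is 25%, which is one reason weight-based routing in Istio or Argo Rollouts is often preferred.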

Shadow/Mirroring

The new version receives a copy of live requests for testing under real load, but its responses are not returned to users. This is low risk but does not assist in rollback decisions since users don’t see the new behavior.

Kubernetes Primitives for Zero Downtime

Deployment

A Deployment performs rolling updates by default. It creates a new ReplicaSet and scales it up while scaling down the old one, at a pace controlled by maxUnavailable and maxSurge. This ensures some pods always serve traffic. To use blue/green, you would deploy two separate Deployments (e.g., app-blue, app-green) and switch the Service between them.

Service and Ingress

A Service fronts pods. For blue/green, you can point a single Service at either the blue or the green pods. Ingress can also switch between backend Services. For example, the Service's label selector can be changed from version: blue to version: green to redirect traffic to the green pods.
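
The selector switch can be modeled in a few lines. This is a simplified sketch of equality-based label matching only, with hypothetical pod names and labels; it does not call the Kubernetes API.

```java
import java.util.List;
import java.util.Map;

// Sketch: how changing a Service selector changes which pods receive traffic.
final class SelectorSwitch {
    record Pod(String name, Map<String, String> labels) {}

    /** A pod matches when its labels contain every key/value pair of the selector. */
    static List<String> endpoints(Map<String, String> selector, List<Pod> pods) {
        return pods.stream()
                .filter(p -> p.labels().entrySet().containsAll(selector.entrySet()))
                .map(Pod::name)
                .toList();
    }

    public static void main(String[] args) {
        List<Pod> pods = List.of(
                new Pod("app-blue-1", Map.of("app", "myapp", "version", "blue")),
                new Pod("app-green-1", Map.of("app", "myapp", "version", "green")));
        // Before the switch the Service selects blue, afterwards green.
        System.out.println(endpoints(Map.of("app", "myapp", "version", "blue"), pods));
        System.out.println(endpoints(Map.of("app", "myapp", "version", "green"), pods));
    }
}
```

Because both sets of pods keep running, switching back is the same one-field change in reverse.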

PodDisruptionBudget

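Ensures a minimum number of pods stay running during voluntary disruptions such as node drains. For instance, setting minAvailable: 1 ensures at least one pod remains while nodes are drained, which avoids complete downtime during cluster maintenance. Note that a Deployment's own rolling update is not limited by a PodDisruptionBudget; its pace is governed by maxUnavailable and maxSurge.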

Horizontal Pod Autoscaler (HPA)

Scales pods based on CPU/memory or custom metrics. It automatically updates a workload to match demand. An HPA can be attached to the Deployment so that if traffic spikes during a rollout, new pods will be created to handle the load. Example:

    YAML
   
 

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
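
To show how such a target turns into a replica count, here is a sketch of the basic scaling rule from the Kubernetes documentation, desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric), clamped to the configured bounds. It deliberately ignores the controller's tolerance band and stabilization window; HpaMath is a hypothetical name.

```java
// Sketch: the core HPA scaling rule, without tolerance or stabilization.
final class HpaMath {
    static int desiredReplicas(int currentReplicas, double currentUtilization,
                               double targetUtilization, int minReplicas, int maxReplicas) {
        int desired = (int) Math.ceil(currentReplicas * (currentUtilization / targetUtilization));
        // Clamp to the minReplicas/maxReplicas range of the HPA spec.
        return Math.max(minReplicas, Math.min(maxReplicas, desired));
    }

    public static void main(String[] args) {
        // 4 pods at 100% CPU against a 50% target, bounds 2..10
        System.out.println(desiredReplicas(4, 100, 50, 2, 10)); // 8
    }
}
```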
  

Liveness and Readiness Probes

Probes are critical for zero downtime. A liveness probe checks whether the app is alive; if it fails, Kubernetes restarts the container. A readiness probe reports whether the app is ready to serve traffic. During startup or shutdown, the readiness probe should fail so that the pod is removed from the Service's endpoints. Spring Boot Actuator provides /actuator/health for this. In the Kubernetes YAML:

    YAML
   
 

livenessProbe:
  httpGet:
    path: /actuator/health/liveness
    port: 8080
  initialDelaySeconds: 15
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /actuator/health/readiness
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5
  

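Spring Boot exposes the health/liveness and health/readiness groups automatically when it detects that it is running on Kubernetes; in other environments, set management.endpoint.health.probes.enabled=true. Quarkus and Micronaut have similar health endpoints.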
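
To illustrate what the two probes distinguish, the following framework-free sketch uses the JDK's built-in HttpServer. It is not how Spring Boot, Quarkus, or Micronaut implement their endpoints; the class name ProbeServer and the /health/... paths are assumptions chosen for the example.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch: separate liveness and readiness endpoints on the JDK HTTP server.
final class ProbeServer {
    private final AtomicBoolean ready = new AtomicBoolean(false);
    private final HttpServer server;

    ProbeServer(int port) throws IOException {
        server = HttpServer.create(new InetSocketAddress(port), 0);
        // Liveness: the process is up and able to answer at all.
        server.createContext("/health/liveness", ex -> {
            ex.sendResponseHeaders(200, -1); // -1 means no response body
            ex.close();
        });
        // Readiness: 503 until the app declares itself ready, and again during shutdown.
        server.createContext("/health/readiness", ex -> {
            ex.sendResponseHeaders(ready.get() ? 200 : 503, -1);
            ex.close();
        });
        server.start();
    }

    void setReady(boolean value) { ready.set(value); }
    int port() { return server.getAddress().getPort(); }
    void stop() { server.stop(0); }
}
```

An application would call setReady(true) once caches and connections are warmed up, and setReady(false) as the first step of shutdown, so that traffic stops arriving before in-flight requests are drained.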

Spring Boot supports graceful shutdown by setting server.shutdown=graceful and tuning spring.lifecycle.timeout-per-shutdown-phase. This causes the embedded server (Tomcat, Jetty, or Undertow) to stop accepting new requests and wait up to the timeout for active requests to complete. For custom cleanup, a bean can implement SmartLifecycle:

    Java
   
 

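import org.springframework.context.SmartLifecycle;
import org.springframework.stereotype.Component;

@Component
public class ShutdownListener implements SmartLifecycle {
    // volatile: lifecycle callbacks may run on a different thread than readers
    private volatile boolean running = false;

    @Override public void start() {
        running = true;
    }
    // Called while the context shuts down; release resources here.
    @Override public void stop() {
        running = false;
    }
    @Override public boolean isRunning() {
        return running;
    }
}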
  

Quarkus provides graceful shutdown configuration. By setting quarkus.shutdown.timeout=10s, Quarkus will wait up to 10 seconds for current requests to finish before exiting. You can annotate a bean method with @Shutdown to run cleanup code.
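
Independent of the framework, the underlying pattern is the same: stop taking new work, then wait a bounded time for in-flight work. Here is a minimal JDK-only sketch of that pattern; GracefulDrain and its methods are hypothetical names, and the worker pool stands in for a server's request threads.

```java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;

// Sketch: drain in-flight work with a timeout, as graceful shutdown does.
final class GracefulDrain {
    /**
     * Stops accepting new tasks, then waits up to the timeout for running ones.
     * Returns true if everything finished in time; otherwise interrupts the rest.
     */
    static boolean drain(ExecutorService workers, Duration timeout) {
        workers.shutdown(); // reject new work, let running tasks continue
        try {
            if (workers.awaitTermination(timeout.toMillis(), TimeUnit.MILLISECONDS)) {
                return true;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        workers.shutdownNow(); // timeout hit: interrupt what is left
        return false;
    }

    /** The JVM runs shutdown hooks when it receives SIGTERM. */
    static void installHook(ExecutorService workers, Duration timeout) {
        Runtime.getRuntime().addShutdownHook(new Thread(() -> drain(workers, timeout)));
    }
}
```

Calling installHook(pool, Duration.ofSeconds(10)) at startup mirrors the 10-second timeout mentioned above. The timeout should stay below the pod's terminationGracePeriodSeconds, otherwise the container is killed before the drain completes.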

Micronaut has @EventListener for ShutdownEvent:

    Java
   
 

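import io.micronaut.context.event.ShutdownEvent;
import io.micronaut.runtime.event.annotation.EventListener;
import jakarta.inject.Singleton;

@Singleton
public class ShutdownBean {
    @EventListener
    void onShutdown(ShutdownEvent event) {
        // Release resources here (close connections, flush buffers).
    }
}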
  

Kubernetes Hooks

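You can use a preStop hook in the Deployment's pod template to run a command before the container receives SIGTERM. A short sleep gives the endpoint removal time to propagate, so that no new requests reach a pod that is about to shut down.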

    YAML
   
 

# Container-level field
lifecycle:
  preStop:
    exec:
      command: ["/bin/sh","-c","sleep 5"]
# Pod-level field (sibling of containers)
terminationGracePeriodSeconds: 30
  

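The grace period (default 30s) should be tuned to let the app finish. The Kubernetes documentation describes the sequence: the pod enters Terminating, the preStop hook runs, the container receives SIGTERM, and if it is still running when terminationGracePeriodSeconds expires, it receives SIGKILL. The grace period starts counting when termination begins, so it must cover both the preStop hook and the application's own shutdown.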
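
A quick way to sanity-check these settings is to add them up. The sketch below assumes the preStop duration and the application's shutdown timeout are known; TerminationBudget is a hypothetical helper, and the values mirror the examples in this article (30s grace period, 5s preStop sleep, 10s shutdown timeout).

```java
import java.time.Duration;

// Sketch: does preStop plus the app's shutdown timeout fit in the grace period?
final class TerminationBudget {
    /** Time left for the app's own shutdown after the preStop hook has run. */
    static Duration remainingAfterPreStop(Duration gracePeriod, Duration preStop) {
        return gracePeriod.minus(preStop);
    }

    /** True when preStop plus the app's shutdown timeout fit inside the grace period. */
    static boolean fits(Duration gracePeriod, Duration preStop, Duration appShutdownTimeout) {
        return preStop.plus(appShutdownTimeout).compareTo(gracePeriod) <= 0;
    }

    public static void main(String[] args) {
        Duration grace = Duration.ofSeconds(30);
        Duration preStop = Duration.ofSeconds(5);
        System.out.println(remainingAfterPreStop(grace, preStop).getSeconds()); // 25
        System.out.println(fits(grace, preStop, Duration.ofSeconds(10)));       // true
        System.out.println(fits(grace, preStop, Duration.ofSeconds(30)));       // false
    }
}
```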

JVM Tuning

Set -XX:+ExitOnOutOfMemoryError so the JVM exits on an OutOfMemoryError instead of hanging, which lets Kubernetes restart the container. Tune thread pools so they drain quickly. Monitor GC pause times, and consider a low-latency collector to minimize pauses before shutdown.
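
For a first look at GC behavior from inside the application, the JDK's management beans expose collection counts and accumulated collection time. This is a coarse proxy for pause time rather than a per-pause measurement; GcStats is a hypothetical class name.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Sketch: read accumulated GC counts and times from the platform MXBeans.
final class GcStats {
    /** Sum of accumulated collection time across all collectors, in milliseconds. */
    static long totalCollectionTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: count=%d timeMs=%d%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
        System.out.println("total GC time (ms): " + totalCollectionTimeMillis());
    }
}
```

In production these values would typically be exported as metrics and graphed rather than printed.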

Session and State Handling

To maintain zero downtime when pods switch:

Stateless services: Best practice is to keep services stateless. Store session state or user data in an external store, such as Redis or a database. This way, any pod can handle any request, and pods can be replaced without losing the user session.
Sticky sessions: If an app uses in-memory sessions, you can enforce sticky sessions in one of two ways:
Service affinity: Set sessionAffinity: ClientIP on the Service. Kubernetes routes requests from the same client IP to the same pod.
Ingress affinity: Use Ingress annotations to bind a user's requests to one pod.
Sticky sessions introduce risk, however: in-memory sessions are lost when their pod is replaced, and affinity works poorly with autoscaling.
StatefulSets: For truly stateful workloads, use a StatefulSet with stable identities. StatefulSets pair pods with PersistentVolumes, but they do not provide zero downtime by themselves.
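
The stateless approach can be sketched as follows. The SessionStore interface is a hypothetical stand-in for an external store such as Redis, and the in-memory implementation exists only to keep the example runnable; a real deployment would use a client for the actual store.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: pods hold no session state, so a replacement pod sees the same sessions.
final class ExternalSessions {
    /** Stand-in for an external store such as Redis (hypothetical interface). */
    interface SessionStore {
        void put(String sessionId, String user);
        Optional<String> get(String sessionId);
    }

    /** In-memory implementation, used only to keep the sketch runnable. */
    static final class InMemoryStore implements SessionStore {
        private final Map<String, String> data = new ConcurrentHashMap<>();
        public void put(String sessionId, String user) { data.put(sessionId, user); }
        public Optional<String> get(String sessionId) { return Optional.ofNullable(data.get(sessionId)); }
    }

    /** A "pod" keeps only a reference to the shared store. */
    static final class Pod {
        private final String name;
        private final SessionStore store;
        Pod(String name, SessionStore store) { this.name = name; this.store = store; }
        void login(String sessionId, String user) { store.put(sessionId, user); }
        String whoAmI(String sessionId) {
            return store.get(sessionId).map(u -> u + " (served by " + name + ")").orElse("anonymous");
        }
    }

    public static void main(String[] args) {
        SessionStore shared = new InMemoryStore();
        new Pod("myapp-v1", shared).login("s1", "alice");
        Pod replacement = new Pod("myapp-v2", shared); // new pod after a rollout
        System.out.println(replacement.whoAmI("s1"));  // alice (served by myapp-v2)
    }
}
```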

CI/CD Pipeline with GitHub Actions

A GitHub Actions workflow for a zero-downtime rollout:

    YAML
   
 

name: Deploy

on:
  push:
    branches: [ main ]

jobs:
  build:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write   # needed to push to ghcr.io with GITHUB_TOKEN
    outputs:
      image: ${{ steps.tag.outputs.image }}
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-java@v4
        with: { distribution: 'temurin', java-version: '17' }
      - name: Build
        run: mvn clean package -DskipTests
      - name: Docker Build & Push
        run: |
          docker build -t ghcr.io/myorg/myapp:${{ github.sha }} .
          echo ${{ secrets.GITHUB_TOKEN }} | docker login ghcr.io -u ${{ github.actor }} --password-stdin
          docker push ghcr.io/myorg/myapp:${{ github.sha }}
      - name: Set image tag
        id: tag
        run: echo "image=ghcr.io/myorg/myapp:${{ github.sha }}" >> "$GITHUB_OUTPUT"

  deploy:
    needs: build
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with: { path: manifests }
      - name: Set up kubectl
        uses: azure/setup-kubectl@v3
      # Assumes cluster credentials (kubeconfig) are configured in an earlier step.
      - name: Deploy to Kubernetes
        run: |
          kubectl set image deployment/myapp-deployment myapp=${{ needs.build.outputs.image }}
          kubectl rollout status deployment/myapp-deployment
  

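This workflow builds the image, pushes it, and updates the Deployment. The rollout status command waits for all new pods to become ready. If the new pods never pass their readiness checks, the rollout stalls and the command exits with an error once the Deployment's progress deadline is exceeded; the old pods keep serving traffic in the meantime. Kubernetes does not roll back on its own, so the pipeline should run kubectl rollout undo when the step fails.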
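
The wait-then-roll-back logic can also live in a small deployment tool. The sketch below assumes kubectl is on the PATH and already configured for the cluster; RolloutGuard and CommandRunner are hypothetical names, and the runner is abstracted so the decision logic can be exercised without a cluster. It relies on kubectl rollout status returning a non-zero exit code when the rollout does not complete within --timeout.

```java
import java.io.IOException;
import java.util.List;

// Sketch: wait for a rollout and undo it if it does not complete.
final class RolloutGuard {
    /** Runs a command and returns its exit code. */
    interface CommandRunner {
        int run(List<String> command) throws IOException, InterruptedException;
    }

    /** Real runner: delegates to the local kubectl binary. */
    static final CommandRunner PROCESS = command ->
            new ProcessBuilder(command).inheritIO().start().waitFor();

    /** Returns true if the new version is live, false if it was rolled back. */
    static boolean waitOrRollback(CommandRunner runner, String deployment, String timeout)
            throws IOException, InterruptedException {
        int status = runner.run(List.of("kubectl", "rollout", "status",
                "deployment/" + deployment, "--timeout=" + timeout));
        if (status == 0) {
            return true;
        }
        // Rollout failed or timed out: return to the previous revision.
        runner.run(List.of("kubectl", "rollout", "undo", "deployment/" + deployment));
        return false;
    }
}
```

A call such as RolloutGuard.waitOrRollback(RolloutGuard.PROCESS, "myapp-deployment", "120s") would run against the real cluster; the 120s timeout is an arbitrary example value.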

Conclusion

Zero-downtime deployment on Kubernetes combines careful architecture with automation: rolling updates and progressive delivery strategies, graceful shutdown and health checks in your Java apps, externalized state, managed database changes, and CI/CD pipelines that orchestrate the rollout. Kubernetes primitives like Deployments, Services, probes, and the HPA, along with tools like Istio or Argo Rollouts, provide the building blocks.
