Rolling Deployments: Safe by Default If You Do Them Right
by Eric Hanson, Backend Developer at Clean Systems Consulting
The Default That's Not Configured
Your Kubernetes deployment uses RollingUpdate strategy. You didn't set it explicitly — it's the default. Kubernetes replaces old pods with new pods, a few at a time, until the rollout is complete. This looks correct from the outside: the deployment progresses, health checks pass, the new version is live.
What you didn't configure: how quickly old pods are replaced, what "healthy" means before traffic is sent to a new pod, how long the old pod continues receiving requests after the new one starts, and whether the new version is actually compatible with requests already in flight. The defaults for most of these are permissive in ways that can cause real problems.
What the Defaults Actually Do
In Kubernetes with default RollingUpdate settings:
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 25%        # Allow up to 25% extra pods during rollout
    maxUnavailable: 25%  # Allow up to 25% of pods to be unavailable
With 8 replicas, Kubernetes can take down 2 pods simultaneously and start 2 new ones. If the new pods start quickly and pass the health check, this moves fast. If the health check is misconfigured — checking TCP connectivity rather than actual readiness — pods receive traffic before they're truly ready.
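The percentage budgets resolve against the replica count with different rounding rules: Kubernetes rounds maxSurge up and maxUnavailable down. A small sketch of that arithmetic (plain Java; the helper names are mine, not a Kubernetes API):

```java
// Sketch of how Kubernetes resolves percentage rollout budgets.
// maxSurge rounds UP, maxUnavailable rounds DOWN (helper names are hypothetical).
public class RolloutBudget {
    static int maxSurgePods(int replicas, int percent) {
        return (int) Math.ceil(replicas * percent / 100.0);
    }

    static int maxUnavailablePods(int replicas, int percent) {
        return (int) Math.floor(replicas * percent / 100.0);
    }

    public static void main(String[] args) {
        // 8 replicas at 25%: up to 2 extra pods, up to 2 pods down at once
        System.out.println(maxSurgePods(8, 25) + " " + maxUnavailablePods(8, 25));   // 2 2
        // 10 replicas at 25%: surge rounds 2.5 up to 3, unavailable rounds down to 2
        System.out.println(maxSurgePods(10, 25) + " " + maxUnavailablePods(10, 25)); // 3 2
    }
}
```

The asymmetric rounding is why odd replica counts can churn faster than you expect: the surge budget is always at least 1 pod, while the unavailability budget can round to 0.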
The missing configuration that makes rolling deployments actually safe:
# Deployment spec
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0  # Never go below desired capacity during rollout
  template:
    spec:
      containers:
      - name: myapp
        # Readiness probe: traffic only routes here when this passes
        readinessProbe:
          httpGet:
            path: /actuator/health/readiness
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 5
          failureThreshold: 3
          successThreshold: 1
        # Liveness probe: restart the container if this fails
        livenessProbe:
          httpGet:
            path: /actuator/health/liveness
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10
          failureThreshold: 3
        # Graceful shutdown: finish in-flight requests before terminating
        lifecycle:
          preStop:
            exec:
              command: ["sh", "-c", "sleep 5"]  # Drain window
      terminationGracePeriodSeconds: 60  # Pod-level field, a sibling of containers
maxUnavailable: 0 ensures capacity never drops below the desired count — a new pod must be ready before an old one is removed. Combined with a real readiness probe (not just TCP), this prevents traffic from reaching pods before they're actually handling requests correctly.
The Readiness Probe Is the Most Important Knob
The readiness probe decides whether a pod receives traffic. If it's wrong, the rest of the configuration doesn't matter.
A common mistake: using the liveness probe path for readiness. Liveness checks whether the process is alive. Readiness checks whether it's ready to serve traffic — which means all connection pools are initialized, caches are warm (if required), and downstream dependencies are reachable.
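The distinction can be sketched as two separate checks (plain Java rather than Spring; the class and method names are illustrative):

```java
import java.util.List;
import java.util.function.BooleanSupplier;

// Illustrative sketch: liveness and readiness answer different questions.
public class HealthModel {
    // Liveness: is the process itself functional? Deliberately does NOT
    // check dependencies; a flaky database would otherwise cause restart loops.
    static boolean alive(boolean processResponsive) {
        return processResponsive;
    }

    // Readiness: can this instance serve traffic right now? Pools initialized,
    // required caches warm, downstream dependencies reachable.
    static boolean ready(List<BooleanSupplier> dependencyChecks) {
        return dependencyChecks.stream().allMatch(BooleanSupplier::getAsBoolean);
    }

    public static void main(String[] args) {
        BooleanSupplier poolInitialized = () -> true;
        BooleanSupplier cacheWarm = () -> false; // still warming up
        // Alive but not yet ready: keep the pod running, but don't route traffic
        System.out.println(alive(true) + " " + ready(List.of(poolInitialized, cacheWarm)));
    }
}
```

Note the asymmetry: failing readiness only removes the pod from the Service endpoints, while failing liveness restarts the container. That is why dependency checks belong in readiness, not liveness.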
For Spring Boot, use the built-in readiness and liveness actuator endpoints that integrate with Spring's ApplicationContext lifecycle:
# application.yml
management:
  endpoint:
    health:
      probes:
        enabled: true
  health:
    readinessState:
      enabled: true
    livenessState:
      enabled: true
Spring Boot reports the readiness endpoint as OUT_OF_SERVICE until the application context is fully started and the ApplicationReadyEvent has fired; after that, the application can change its own state by publishing an AvailabilityChangeEvent. This means Kubernetes won't route traffic to a pod that's still running @PostConstruct initialization or warming connection pools.
Handling In-Flight Requests During Rollout
When Kubernetes terminates an old pod, requests that are already in progress on that pod need time to complete. Without a graceful shutdown window, those requests get abruptly terminated.
The preStop sleep combined with terminationGracePeriodSeconds creates that window. The sequence on pod termination:
- Pod is marked Terminating and removed from the Service endpoints (no new requests routed here)
- preStop hook runs (sleep 5 seconds, giving load balancers time to propagate the removal)
- SIGTERM is sent to the container
- Application handles SIGTERM by stopping accepting new requests and finishing in-flight ones
- After terminationGracePeriodSeconds, SIGKILL is sent if the process hasn't exited
For Spring Boot, configure graceful shutdown:
# application.yml
server:
  shutdown: graceful
spring:
  lifecycle:
    timeout-per-shutdown-phase: 30s
This tells Spring to drain active requests before completing the shutdown, up to 30 seconds.
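The drain behavior amounts to: stop accepting new work, let in-flight work finish, enforce a deadline. A minimal sketch of that pattern with an ExecutorService (plain Java, not Spring's actual shutdown code):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Sketch of graceful-drain semantics: reject new work, finish in-flight work,
// give up after a deadline (mirrors timeout-per-shutdown-phase).
public class GracefulDrain {
    static boolean drain(ExecutorService pool, long timeoutSeconds) throws InterruptedException {
        pool.shutdown(); // no new tasks accepted; in-flight tasks keep running
        return pool.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        pool.submit(() -> { // simulated in-flight request
            try { Thread.sleep(200); } catch (InterruptedException e) { }
        });
        // true: the request finished well inside the 30-second window
        System.out.println(drain(pool, 30));
    }
}
```

If drain returns false, the deadline expired with work still running, which is the point at which Kubernetes would follow up with SIGKILL.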
The Compatibility Requirement
Rolling deployments mean both versions run simultaneously during the rollout window. Any API contract, database schema, or message format must be compatible with both versions during that window.
A v1.3 pod writing records to the database that v1.2 pods can't read will cause errors for requests that land on v1.2 pods during the overlap period. The expand-contract migration pattern (add nullable column in one release, make it required in the next) prevents this.
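The reader-side half of that compatibility requirement can be sketched like this (column names are hypothetical; the point is that the old version must tolerate data written by the new one):

```java
import java.util.Map;

// Sketch: during the rollout overlap, v1.2 code may read rows written by v1.3.
// Expand-contract works because the old reader ignores the new nullable column.
public class LegacyReader {
    static String readName(Map<String, String> row) {
        // v1.2 only knows "name"; an extra "nickname" column (added by v1.3,
        // nullable in the expand step) is simply ignored.
        return row.getOrDefault("name", "");
    }

    public static void main(String[] args) {
        Map<String, String> v13Row = Map.of("name", "alice", "nickname", "al");
        System.out.println(readName(v13Row)); // old reader still works: alice
    }
}
```

The contract step (making the column required, or removing the old one) only ships after no v1.2 pods remain, so there is never a moment when a running version sees data it can't handle.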
The key question before every rolling deployment: is my new version compatible with the currently deployed version, both ways? If not, consider blue-green instead — it eliminates the mixed-version window entirely. Rolling deployment safety depends on that compatibility invariant. Don't violate it silently.