Rolling Deployments: Safe by Default If You Do Them Right

by Eric Hanson, Backend Developer at Clean Systems Consulting

The Default That's Not Configured

Your Kubernetes deployment uses RollingUpdate strategy. You didn't set it explicitly — it's the default. Kubernetes replaces old pods with new pods, a few at a time, until the rollout is complete. This looks correct from the outside: the deployment progresses, health checks pass, the new version is live.

What you didn't configure: how quickly old pods are replaced, what "healthy" means before traffic is sent to a new pod, how long the old pod continues receiving requests after the new one starts, and whether the new version is actually compatible with requests already in flight. The defaults for most of these are permissive in ways that can cause real problems.

What the Defaults Actually Do

In Kubernetes with default RollingUpdate settings:

strategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 25%        # Allow up to 25% extra pods during rollout
    maxUnavailable: 25%  # Allow up to 25% of pods to be unavailable

With 8 replicas, Kubernetes can take down 2 pods simultaneously and start 2 new ones. If the new pods start quickly and pass the health check, this moves fast. But a Deployment has no readiness probe unless you define one, and without one a pod counts as ready as soon as its containers have started. If the probe exists but is misconfigured, checking TCP connectivity rather than actual readiness, pods still receive traffic before they're truly ready.
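The pod counts above come from how Kubernetes turns percentages into absolute numbers: a percentage maxSurge is rounded up and a percentage maxUnavailable is rounded down. A minimal Python sketch of that arithmetic follows; the helper name rollout_bounds is hypothetical and not part of any Kubernetes client.

```python
def rollout_bounds(replicas, max_surge_pct, max_unavailable_pct):
    """Return (extra pods allowed, pods allowed to be unavailable).

    Illustrative helper only. Kubernetes rounds a percentage maxSurge up
    and a percentage maxUnavailable down when converting to pod counts.
    """
    surge = -(-replicas * max_surge_pct // 100)          # ceiling division
    unavailable = replicas * max_unavailable_pct // 100  # floor division
    return surge, unavailable


for replicas in (3, 8, 10):
    surge, unavailable = rollout_bounds(replicas, 25, 25)
    print(f"{replicas} replicas: up to {replicas + surge} pods total, "
          f"at least {replicas - unavailable} available")
```

Note that the two values diverge when the replica count is not a multiple of 4: with 10 replicas the defaults allow 3 extra pods but only 2 unavailable ones.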

The missing configuration that makes rolling deployments actually safe:

# Deployment spec
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0    # Never go below desired capacity during rollout

  template:
    spec:
      containers:
        - name: myapp

          # Readiness probe: traffic only routes here when this passes
          readinessProbe:
            httpGet:
              path: /actuator/health/readiness
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 5
            failureThreshold: 3
            successThreshold: 1

          # Liveness probe: restart the container if this fails
          livenessProbe:
            httpGet:
              path: /actuator/health/liveness
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
            failureThreshold: 3

          # Graceful shutdown: finish in-flight requests before terminating
          lifecycle:
            preStop:
              exec:
                command: ["sh", "-c", "sleep 5"]  # Drain window

      terminationGracePeriodSeconds: 60

maxUnavailable: 0 ensures capacity never drops below the desired count — a new pod must be ready before an old one is removed. Combined with a real readiness probe (not just TCP), this prevents traffic from reaching pods before they're actually handling requests correctly.
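To make the capacity guarantee concrete, here is a toy simulation of a rollout with maxSurge: 1 and maxUnavailable: 0. It is a deliberately simplified model, not how the Deployment controller is implemented: it assumes exactly one pod changes per step and that every new pod passes its readiness probe.

```python
def simulate_rollout(replicas):
    """Simulate a rollout with maxSurge=1 and maxUnavailable=0.

    Simplified model (an assumption of this sketch): one pod changes per
    step, and each new pod becomes ready before an old one is removed.
    Returns the (old_ready, new_ready) counts after every step.
    """
    old_ready, new_ready = replicas, 0
    history = [(old_ready, new_ready)]
    while old_ready > 0:
        new_ready += 1   # surge: one extra pod starts and passes readiness
        history.append((old_ready, new_ready))
        old_ready -= 1   # only now is an old pod removed
        history.append((old_ready, new_ready))
    return history


steps = simulate_rollout(3)
print(steps)
print("minimum ready pods:", min(old + new for old, new in steps))
print("maximum ready pods:", max(old + new for old, new in steps))
```

The ready count oscillates between the desired count and the desired count plus one, and never dips below it. The cost is a slower rollout and room in the cluster for one extra pod.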

The Readiness Probe Is the Most Important Knob

The readiness probe decides whether a pod receives traffic. If it's wrong, the rest of the configuration doesn't matter.

A common mistake: using the liveness probe path for readiness. Liveness checks whether the process is alive. Readiness checks whether it's ready to serve traffic — which means all connection pools are initialized, caches are warm (if required), and downstream dependencies are reachable.
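The distinction can be sketched in a few lines of Python. This is a language-neutral illustration rather than Spring code, and the check names (db_pool, cache) are hypothetical placeholders for whatever your service actually depends on.

```python
from typing import Callable, Dict


def liveness_status() -> int:
    """Liveness: the process is running and able to answer at all."""
    return 200


def readiness_status(checks: Dict[str, Callable[[], bool]]) -> int:
    """Readiness: every named check must pass before traffic is accepted."""
    failed = [name for name, check in checks.items() if not check()]
    return 200 if not failed else 503


# Hypothetical startup state; a real service would inspect its own pool, cache, etc.
state = {"pool_initialized": False, "cache_warm": True}
checks = {
    "db_pool": lambda: state["pool_initialized"],
    "cache": lambda: state["cache_warm"],
}

print(liveness_status(), readiness_status(checks))  # while still starting up
state["pool_initialized"] = True
print(liveness_status(), readiness_status(checks))  # after initialization
```

During startup the process is alive but not ready. Pointing the readiness probe at the liveness path hides exactly that window.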

For Spring Boot, use the built-in readiness and liveness actuator endpoints that integrate with Spring's ApplicationContext lifecycle:

# application.yml
management:
  endpoint:
    health:
      probes:
        enabled: true
  health:
    readinessState:
      enabled: true
    livenessState:
      enabled: true

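Spring Boot reports the readiness endpoint as OUT_OF_SERVICE (ReadinessState.REFUSING_TRAFFIC) until the application context has been refreshed and every ApplicationRunner and CommandLineRunner has completed. It then publishes an AvailabilityChangeEvent that switches the state to ACCEPTING_TRAFFIC; ApplicationListener<AvailabilityChangeEvent> beans are notified of the change rather than asked to confirm it. This means Kubernetes won't route traffic to a pod that's still running @PostConstruct initialization, or warming connection pools if that warm-up happens during startup.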

Handling In-Flight Requests During Rollout

When Kubernetes terminates an old pod, requests that are already in progress on that pod need time to complete. Without a graceful shutdown window, those requests get abruptly terminated.

The preStop sleep combined with terminationGracePeriodSeconds creates that window. The sequence on pod termination:

  1. Pod is marked Terminating; the terminationGracePeriodSeconds countdown starts, and removal of the pod from the Service endpoints begins in parallel
  2. preStop hook runs (sleep 5 seconds, giving kube-proxy and load balancers time to pick up the removal and stop routing new requests here)
  3. SIGTERM is sent to the container once the preStop hook completes
  4. Application handles SIGTERM by no longer accepting new requests and finishing in-flight ones
  5. When terminationGracePeriodSeconds expires (counted from step 1, so it includes the preStop sleep), SIGKILL is sent if the process hasn't exited

For Spring Boot, configure graceful shutdown:

# application.yml
server:
  shutdown: graceful
spring:
  lifecycle:
    timeout-per-shutdown-phase: 30s

This tells Spring to drain active requests before completing the shutdown, up to 30 seconds.
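It is worth checking that these timers fit inside each other. The sketch below computes the shutdown timeline for the values used in this article (5 second preStop sleep, 30 second drain, 60 second grace period). It assumes the grace period countdown starts when termination begins, so the preStop sleep consumes part of it, and it treats the 30 seconds as the whole drain budget, which is a simplification because Spring applies the timeout per shutdown phase. The function name is hypothetical.

```python
def shutdown_timeline(pre_stop_s, drain_timeout_s, grace_period_s):
    """Compute when each shutdown event happens, in seconds after termination starts.

    Assumes the grace period countdown begins when the pod is marked
    Terminating, so time spent in the preStop hook counts against it.
    """
    sigterm_at = pre_stop_s                       # SIGTERM follows the preStop hook
    drain_deadline = sigterm_at + drain_timeout_s # latest point the app stops draining
    return {
        "sigterm_at": sigterm_at,
        "drain_deadline": drain_deadline,
        "sigkill_at": grace_period_s,
        "headroom": grace_period_s - drain_deadline,
    }


timeline = shutdown_timeline(pre_stop_s=5, drain_timeout_s=30, grace_period_s=60)
print(timeline)
if timeline["headroom"] < 0:
    print("in-flight requests may be killed before the drain finishes")
```

With these values the drain deadline lands at 35 seconds, leaving 25 seconds of headroom before SIGKILL. If you raise the drain timeout or the preStop sleep, raise terminationGracePeriodSeconds with them.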

The Compatibility Requirement

Rolling deployments mean both versions run simultaneously during the rollout window. Any API contract, database schema, or message format must be compatible with both versions during that window.

A v1.3 pod writing records to the database that v1.2 pods can't read will cause errors for requests that land on v1.2 pods during the overlap period. The expand-contract migration pattern (add nullable column in one release, make it required in the next) prevents this.
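A small sketch of the expand step, using Python's built-in sqlite3 module so it runs anywhere. The users table, the display_name column, and the reader and writer functions are hypothetical stand-ins for the two application versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")

# Expand step, shipped with v1.3: the new column is nullable,
# so inserts from v1.2 pods that omit it still succeed.
conn.execute("ALTER TABLE users ADD COLUMN display_name TEXT")


def write_v12(email):
    """Old version: unaware of display_name."""
    conn.execute("INSERT INTO users (email) VALUES (?)", (email,))


def write_v13(email, display_name):
    """New version: populates the new column."""
    conn.execute(
        "INSERT INTO users (email, display_name) VALUES (?, ?)",
        (email, display_name),
    )


def read_v12():
    """Old version: selects only the columns it knows about."""
    return conn.execute("SELECT email FROM users ORDER BY id").fetchall()


def read_v13():
    """New version: tolerates NULL in rows written by v1.2 pods."""
    rows = conn.execute(
        "SELECT email, display_name FROM users ORDER BY id"
    ).fetchall()
    return [(email, name if name is not None else email) for email, name in rows]


write_v12("a@example.com")
write_v13("b@example.com", "Bee")
print(read_v12())
print(read_v13())
```

Both versions can read and write every row during the overlap. The contract step (adding NOT NULL after backfilling) belongs in a later release, once no v1.2 pods remain.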

The key question before every rolling deployment: is my new version compatible with the currently deployed version, both ways? If not, consider blue-green instead — it eliminates the mixed-version window entirely. Rolling deployment safety depends on that compatibility invariant. Don't violate it silently.
