Rolling Deployments: Safe by Default If You Do Them Right
by Eric Hanson, Backend Developer at Clean Systems Consulting
The Default That's Not Configured
Your Kubernetes deployment uses RollingUpdate strategy. You didn't set it explicitly — it's the default. Kubernetes replaces old pods with new pods, a few at a time, until the rollout is complete. This looks correct from the outside: the deployment progresses, health checks pass, the new version is live.
What you didn't configure: how quickly old pods are replaced, what "healthy" means before traffic is sent to a new pod, how long the old pod continues receiving requests after the new one starts, and whether the new version is actually compatible with requests already in flight. The defaults for most of these are permissive in ways that can cause real problems.
What the Defaults Actually Do
In Kubernetes with default RollingUpdate settings:
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 25%        # Allow up to 25% extra pods during rollout
    maxUnavailable: 25%  # Allow up to 25% of pods to be unavailable
With 8 replicas, Kubernetes can take down 2 pods simultaneously and start 2 new ones. If the new pods start quickly and pass the health check, this moves fast. If the health check is misconfigured — checking TCP connectivity rather than actual readiness — pods receive traffic before they're truly ready.
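The percentage budgets resolve against the replica count with different rounding rules: Kubernetes rounds maxSurge up and maxUnavailable down. A small sketch of that arithmetic (plain Java; the helper names are mine, not a Kubernetes API):

```java
// Sketch of how Kubernetes resolves percentage rollout budgets.
// maxSurge rounds UP, maxUnavailable rounds DOWN (helper names are hypothetical).
public class RolloutBudget {
    static int maxSurgePods(int replicas, int percent) {
        return (int) Math.ceil(replicas * percent / 100.0);
    }

    static int maxUnavailablePods(int replicas, int percent) {
        return (int) Math.floor(replicas * percent / 100.0);
    }

    public static void main(String[] args) {
        // 8 replicas at 25%: up to 2 extra pods, up to 2 pods down at once
        System.out.println(maxSurgePods(8, 25) + " " + maxUnavailablePods(8, 25));   // 2 2
        // 10 replicas at 25%: surge rounds 2.5 up to 3, unavailable rounds down to 2
        System.out.println(maxSurgePods(10, 25) + " " + maxUnavailablePods(10, 25)); // 3 2
    }
}
```

The asymmetric rounding is why odd replica counts can churn faster than you expect: the surge budget is always at least 1 pod, while the unavailability budget can round to 0.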
The missing configuration that makes rolling deployments actually safe:
# Deployment spec
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0  # Never go below desired capacity during rollout
  template:
    spec:
      containers:
      - name: myapp
        # Readiness probe: traffic only routes here when this passes
        readinessProbe:
          httpGet:
            path: /actuator/health/readiness
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 5
          failureThreshold: 3
          successThreshold: 1
        # Liveness probe: restart the container if this fails
        livenessProbe:
          httpGet:
            path: /actuator/health/liveness
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10
          failureThreshold: 3
        # Graceful shutdown: finish in-flight requests before terminating
        lifecycle:
          preStop:
            exec:
              command: ["sh", "-c", "sleep 5"]  # Drain window
      terminationGracePeriodSeconds: 60  # Pod-level field, a sibling of containers
maxUnavailable: 0 ensures capacity never drops below the desired count — a new pod must be ready before an old one is removed. Combined with a real readiness probe (not just TCP), this prevents traffic from reaching pods before they're actually handling requests correctly.
The Readiness Probe Is the Most Important Knob
The readiness probe decides whether a pod receives traffic. If it's wrong, the rest of the configuration doesn't matter.
A common mistake: using the liveness probe path for readiness. Liveness checks whether the process is alive. Readiness checks whether it's ready to serve traffic — which means all connection pools are initialized, caches are warm (if required), and downstream dependencies are reachable.
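The distinction can be sketched as two separate checks (plain Java rather than Spring; the class and method names are illustrative):

```java
import java.util.List;
import java.util.function.BooleanSupplier;

// Illustrative sketch: liveness and readiness answer different questions.
public class HealthModel {
    // Liveness: is the process itself functional? Deliberately does NOT
    // check dependencies; a flaky database would otherwise cause restart loops.
    static boolean alive(boolean processResponsive) {
        return processResponsive;
    }

    // Readiness: can this instance serve traffic right now? Pools initialized,
    // required caches warm, downstream dependencies reachable.
    static boolean ready(List<BooleanSupplier> dependencyChecks) {
        return dependencyChecks.stream().allMatch(BooleanSupplier::getAsBoolean);
    }

    public static void main(String[] args) {
        BooleanSupplier poolInitialized = () -> true;
        BooleanSupplier cacheWarm = () -> false; // still warming up
        // Alive but not yet ready: keep the pod running, but don't route traffic
        System.out.println(alive(true) + " " + ready(List.of(poolInitialized, cacheWarm)));
    }
}
```

Note the asymmetry: failing readiness only removes the pod from the Service endpoints, while failing liveness restarts the container. That is why dependency checks belong in readiness, not liveness.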
For Spring Boot, use the built-in readiness and liveness actuator endpoints that integrate with Spring's ApplicationContext lifecycle:
# application.yml
management:
  endpoint:
    health:
      probes:
        enabled: true
  health:
    readinessState:
      enabled: true
    livenessState:
      enabled: true
Spring Boot reports the readiness endpoint as OUT_OF_SERVICE until the application context is fully started and the ApplicationReadyEvent has fired; after that, the application can change its own state by publishing an AvailabilityChangeEvent. This means Kubernetes won't route traffic to a pod that's still running @PostConstruct initialization or warming connection pools.
Handling In-Flight Requests During Rollout
When Kubernetes terminates an old pod, requests that are already in progress on that pod need time to complete. Without a graceful shutdown window, those requests get abruptly terminated.
The preStop sleep combined with terminationGracePeriodSeconds creates that window. The sequence on pod termination:
- Pod is marked Terminating and removed from the Service endpoints (no new requests routed here)
- preStop hook runs (sleep 5 seconds, giving load balancers time to propagate the removal)
- SIGTERM is sent to the container
- Application handles SIGTERM by stopping accepting new requests and finishing in-flight ones
- After terminationGracePeriodSeconds, SIGKILL is sent if the process hasn't exited
For Spring Boot, configure graceful shutdown:
# application.yml
server:
  shutdown: graceful
spring:
  lifecycle:
    timeout-per-shutdown-phase: 30s
This tells Spring to drain active requests before completing the shutdown, up to 30 seconds.
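The drain behavior amounts to: stop accepting new work, let in-flight work finish, enforce a deadline. A minimal sketch of that pattern with an ExecutorService (plain Java, not Spring's actual shutdown code):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Sketch of graceful-drain semantics: reject new work, finish in-flight work,
// give up after a deadline (mirrors timeout-per-shutdown-phase).
public class GracefulDrain {
    static boolean drain(ExecutorService pool, long timeoutSeconds) throws InterruptedException {
        pool.shutdown(); // no new tasks accepted; in-flight tasks keep running
        return pool.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        pool.submit(() -> { // simulated in-flight request
            try { Thread.sleep(200); } catch (InterruptedException e) { }
        });
        // true: the request finished well inside the 30-second window
        System.out.println(drain(pool, 30));
    }
}
```

If drain returns false, the deadline expired with work still running, which is the point at which Kubernetes would follow up with SIGKILL.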
The Compatibility Requirement
Rolling deployments mean both versions run simultaneously during the rollout window. Any API contract, database schema, or message format must be compatible with both versions during that window.
A v1.3 pod writing records to the database that v1.2 pods can't read will cause errors for requests that land on v1.2 pods during the overlap period. The expand-contract migration pattern (add nullable column in one release, make it required in the next) prevents this.
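The reader-side half of that compatibility requirement can be sketched like this (column names are hypothetical; the point is that the old version must tolerate data written by the new one):

```java
import java.util.Map;

// Sketch: during the rollout overlap, v1.2 code may read rows written by v1.3.
// Expand-contract works because the old reader ignores the new nullable column.
public class LegacyReader {
    static String readName(Map<String, String> row) {
        // v1.2 only knows "name"; an extra "nickname" column (added by v1.3,
        // nullable in the expand step) is simply ignored.
        return row.getOrDefault("name", "");
    }

    public static void main(String[] args) {
        Map<String, String> v13Row = Map.of("name", "alice", "nickname", "al");
        System.out.println(readName(v13Row)); // old reader still works: alice
    }
}
```

The contract step (making the column required, or removing the old one) only ships after no v1.2 pods remain, so there is never a moment when a running version sees data it can't handle.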
The key question before every rolling deployment: is my new version compatible with the currently deployed version, both ways? If not, consider blue-green instead — it eliminates the mixed-version window entirely. Rolling deployment safety depends on that compatibility invariant. Don't violate it silently.