Load Balancing Is Not Just Distributing Traffic. Here Is What It Really Does.

by Eric Hanson, Backend Developer at Clean Systems Consulting

What Engineers Think Load Balancers Do

Ask most engineers what a load balancer does and you get: "It distributes traffic across multiple servers." That is accurate the way "a database stores data" is accurate — technically correct, operationally incomplete.

The mental model of a load balancer as a simple traffic splitter causes real problems. Teams configure round-robin across three instances and consider the problem solved. Then they get surprised when a backend instance dies and requests continue hitting it for 30 seconds. Or when sticky sessions cause one instance to handle 70% of the traffic. Or when TLS configuration at the load balancer does not match their security requirements. These are not edge cases — they are the operational reality of running a load balancer in production.

What Load Balancers Actually Do

Health checking and failure removal. A load balancer continuously checks backend health and removes failing instances from the pool. The critical configuration is the health check parameters: interval (how often to check), threshold (how many consecutive failures before removal), and timeout (how long to wait for a response). A health check with a 30-second interval and a threshold of three failures means a dead backend keeps receiving traffic for up to roughly 90 seconds before removal, plus the timeout on the final check. If that backend's share of traffic is 500 req/s, that is about 45,000 failed requests during that window.

# ALB health check configuration
# Conservative settings fail slowly:
HealthCheckIntervalSeconds: 30   # checks every 30s
HealthyThresholdCount: 3         # needs 3 successful checks
UnhealthyThresholdCount: 3       # needs 3 failed checks to remove
# Worst case: 90 seconds of traffic to a dead backend

# Aggressive settings for faster failover:
HealthCheckIntervalSeconds: 10
UnhealthyThresholdCount: 2
# Worst case: 20 seconds -- much more acceptable
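
The arithmetic behind those worst-case comments can be sketched in a few lines of Python. This is a rough model, not an ALB API: the `HealthCheck` class and `failed_requests` helper are hypothetical names, and the model assumes the backend dies just after a passing check and that the 500 req/s figure is the traffic reaching that one backend.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HealthCheck:
    interval_s: float          # seconds between checks
    unhealthy_threshold: int   # consecutive failures before removal
    timeout_s: float = 0.0     # seconds to wait for each check response

    def worst_case_removal_s(self) -> float:
        # Rough upper bound: the backend dies right after a passing check,
        # each required failing check costs one full interval, and the
        # final check may wait for its timeout before counting as failed.
        return self.interval_s * self.unhealthy_threshold + self.timeout_s


def failed_requests(check: HealthCheck, backend_rps: float) -> int:
    # Requests sent to the dead backend during the detection window.
    return int(check.worst_case_removal_s() * backend_rps)


slow = HealthCheck(interval_s=30, unhealthy_threshold=3)
fast = HealthCheck(interval_s=10, unhealthy_threshold=2)

print(slow.worst_case_removal_s(), failed_requests(slow, backend_rps=500))
print(fast.worst_case_removal_s(), failed_requests(fast, backend_rps=500))
```

Passing a non-zero `timeout_s` shows how the check timeout extends the window beyond interval times threshold.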

TLS termination. The load balancer handles the TLS handshake with the client and forwards decrypted traffic to backends over the internal network. This offloads CPU-intensive cryptographic operations from application servers. It also centralizes certificate management: you renew the certificate in one place rather than on every instance. The tradeoff: traffic between the load balancer and backends is unencrypted unless you configure the load balancer to re-encrypt it (TLS to the backends, optionally mutual TLS so each side authenticates the other), which adds complexity.

Connection pooling and HTTP/2 multiplexing. Modern load balancers like nginx and AWS ALB can maintain persistent connection pools to backends (in nginx this requires enabling upstream keepalive). A client makes a request; the load balancer may reuse an existing connection to the backend rather than opening a new one. With HTTP/2 on the client side, many requests share a single client connection, and the load balancer spreads them across its pooled backend connections. This matters significantly for high-concurrency workloads where connection setup overhead is non-trivial.
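
To make the reuse concrete, here is a toy pool in Python. It opens no real sockets; the `BackendPool` class is a hypothetical illustration of the bookkeeping, not how nginx or ALB implement pooling.

```python
from collections import deque


class BackendPool:
    """Toy pool of reusable backend connections (no real sockets)."""

    def __init__(self) -> None:
        self._idle = deque()   # connection ids ready for reuse
        self.opened = 0        # how many connections were ever created

    def acquire(self) -> int:
        if self._idle:
            return self._idle.popleft()   # reuse: no setup cost
        # A real proxy would pay for TCP (and possibly TLS) setup here.
        self.opened += 1
        return self.opened                # use the counter as a connection id

    def release(self, conn: int) -> None:
        self._idle.append(conn)


pool = BackendPool()
for _ in range(1000):          # 1000 sequential requests
    conn = pool.acquire()
    pool.release(conn)

print(pool.opened)             # connections opened for 1000 requests
```

In this sketch, sequential requests share one connection, and the pool only grows when requests overlap.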

Session affinity (sticky sessions). The load balancer can route all requests from the same client to the same backend instance, typically based on a cookie. This is required when backends hold session state locally. The problem: it creates uneven load distribution. A client that makes 10x more requests than average disproportionately loads one backend. It also complicates failover: if the pinned backend dies, that client's session is lost. The better solution is stateless backends with centralized session storage (for example, Redis), making sticky sessions unnecessary.
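
A small simulation shows the skew. Everything here is assumed for illustration: the instance names, the client mix, and the use of a hash as a stand-in for cookie-based pinning.

```python
from collections import Counter
from zlib import crc32

BACKENDS = ["app-1", "app-2", "app-3"]   # hypothetical instance names


def sticky_backend(client_id: str) -> str:
    # Stand-in for cookie-based affinity: a client always maps to one backend.
    return BACKENDS[crc32(client_id.encode()) % len(BACKENDS)]


# 30 clients send 10 requests each; one hypothetical heavy client sends 1000.
requests_per_client = {f"client-{i}": 10 for i in range(30)}
requests_per_client["heavy-client"] = 1000

sticky = Counter()
for client, n in requests_per_client.items():
    sticky[sticky_backend(client)] += n

# Without affinity, requests can be spread one by one (round robin here).
total = sum(requests_per_client.values())
stateless = Counter(BACKENDS[i % len(BACKENDS)] for i in range(total))

print("sticky:   ", dict(sticky))
print("stateless:", dict(stateless))
```

Whichever backend the heavy client is pinned to carries at least its 1000 requests, while the stateless spread stays within one request of even.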

Balancing Algorithms That Matter

Round robin: requests cycle through backends in order. Simple, even distribution when all requests have similar cost. Fails when backends have different capacities or when request cost varies significantly.

Least connections: new requests go to the backend with the fewest active connections. Better for variable-cost requests — a long-running query on one backend does not disproportionately load that backend because new short requests route elsewhere. AWS ALB's "least outstanding requests" algorithm is a variant of this.
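
The difference between the two algorithms shows up in a small simulation. The sketch below assumes a made-up workload (one request per tick, every third one slow) and tracks the peak number of in-flight requests per backend; the function names are illustrative only.

```python
import heapq
from itertools import cycle


def simulate(pick, durations, n_backends=3):
    """Send one request per tick; return the peak in-flight count per backend."""
    active = [0] * n_backends
    peak = [0] * n_backends
    finishing = []  # min-heap of (finish_tick, backend)
    for tick, duration in enumerate(durations):
        # Retire requests that have completed by this tick.
        while finishing and finishing[0][0] <= tick:
            _, done = heapq.heappop(finishing)
            active[done] -= 1
        backend = pick(active)
        active[backend] += 1
        peak[backend] = max(peak[backend], active[backend])
        heapq.heappush(finishing, (tick + duration, backend))
    return peak


def round_robin(n_backends=3):
    order = cycle(range(n_backends))
    return lambda active: next(order)   # ignores current load


def least_connections(active):
    # Choose the backend with the fewest in-flight requests.
    return min(range(len(active)), key=active.__getitem__)


# Hypothetical workload: every third request is slow (30 ticks), the rest take 1 tick.
durations = [30 if i % 3 == 0 else 1 for i in range(90)]

print("round robin peak:      ", simulate(round_robin(), durations))
print("least connections peak:", simulate(least_connections, durations))
```

With this workload, round robin happens to send every slow request to the same backend, while least connections spreads them out because it looks at current load rather than position in the cycle.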

Weighted: backends receive traffic proportional to assigned weights. Used during deployments — gradually shift traffic from old to new instances by adjusting weights rather than a hard cutover.
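
One way to implement weighted selection is a smooth weighted round robin, which interleaves picks instead of sending bursts to the heaviest backend. The sketch below is one possible approach, with hypothetical "old" and "new" pools standing in for a deployment shift.

```python
from collections import Counter


def smooth_weighted_round_robin(weights, n):
    """Return n backend names, spread in proportion to their weights."""
    current = {name: 0 for name in weights}
    total = sum(weights.values())
    picks = []
    for _ in range(n):
        # Every backend gains its weight; the current leader is chosen
        # and then pushed back by the total weight.
        for name, weight in weights.items():
            current[name] += weight
        chosen = max(current, key=current.get)
        current[chosen] -= total
        picks.append(chosen)
    return picks


# Hypothetical deployment: start with 10% of traffic on the new instances,
# then move to an even split once they look healthy.
canary = smooth_weighted_round_robin({"old": 9, "new": 1}, 10)
even = smooth_weighted_round_robin({"old": 1, "new": 1}, 10)

print(Counter(canary))
print(Counter(even))
```

Shifting traffic is then a matter of changing the weights between steps rather than swapping the pool in one cutover.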

What This Changes About Your Design

If you design your system assuming the load balancer is a dumb traffic splitter, you will be surprised by health check lag, session distribution problems, and TLS configuration gaps. Design instead with the assumption that the load balancer is a configurable policy layer between clients and your application.

Configure health checks aggressively. Use least-connections for variable-cost workloads. Remove local session state from application servers. Configure appropriate connection draining (the time the load balancer gives in-flight requests to complete before removing an instance during a deployment). These are not advanced configurations — they are the baseline for a load balancer configuration that behaves correctly under real conditions.
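
As a rough way to reason about the draining timeout, the sketch below splits in-flight requests into those that finish within the drain window and those that get cut off. The `drain` helper and the sample durations are hypothetical; a real load balancer also stops sending new requests to the instance while it drains.

```python
def drain(in_flight_remaining_s, drain_timeout_s):
    """Split in-flight requests into those that finish and those cut off."""
    completed = [r for r in in_flight_remaining_s if r <= drain_timeout_s]
    cut_off = [r for r in in_flight_remaining_s if r > drain_timeout_s]
    return completed, cut_off


# Remaining time (seconds) for each request still running on the instance.
remaining = [1, 5, 40, 120]

completed, cut_off = drain(remaining, drain_timeout_s=30)
print("completed:", completed)
print("cut off:  ", cut_off)
```

Setting the timeout a little above your slowest expected request keeps deployments from dropping work, at the cost of slower instance removal.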
