Synchronous Communication in Microservices Is a Trap
by Eric Hanson, Backend Developer at Clean Systems Consulting
How you end up with a system worse than what you started with
You split the monolith. Each service has its own database, its own deployment pipeline, its own team. On paper it looks like you've achieved the independence microservices promise. In production, when the Inventory Service has a 30-second GC pause, your Order Service starts timing out, your API Gateway starts returning 503s, and your users see checkout failures.
This is not an Inventory Service problem. It's an architecture problem. You've built a system where the availability of a single service determines the availability of every upstream service that depends on it — the same failure coupling you had in the monolith, now expressed through network calls instead of function calls. The monolith at least failed fast. Now your failure cascades slowly through retry queues and thread pool exhaustion.
Why synchronous chains are so dangerous
The math is straightforward. If each service has 99.9% availability (the three-nines SLA many teams consider acceptable), a chain of three synchronous dependencies gives you 99.9% × 99.9% × 99.9% ≈ 99.7% combined availability. That's roughly 26 hours of downtime per year from a chain of individually reliable services. A chain of ten services — not unusual in complex microservices architectures — drops to 99.0%, or about 87 hours a year.
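If you want to run the numbers for your own chain depth, here's a quick sketch of the arithmetic in plain Java, using the same three-nines figure from above:
// Combined availability and yearly downtime for a chain of equally reliable services
public class ChainAvailability {
    public static void main(String[] args) {
        double perService = 0.999;                       // three nines per service
        for (int depth : new int[] {1, 3, 10}) {
            double combined = Math.pow(perService, depth);
            double downtimeHours = (1 - combined) * 365 * 24;
            System.out.printf("depth=%d combined=%.3f%% downtime=%.0f h/year%n",
                    depth, combined * 100, downtimeHours);
        }
    }
}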
More practically: synchronous coupling means that any slow service in the chain becomes a slow service for every caller. If Inventory Service has a database query that degrades from 20ms to 2,000ms under high load, every Order Service request that calls it now holds a thread a hundred times longer. If Order Service uses a fixed-size thread pool (common with Spring's default Tomcat connector or traditional servlet containers), those threads fill up, requests queue, and Order Service itself becomes unavailable — not because Order Service has a bug, but because Inventory Service is slow.
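One baseline defense is an explicit, aggressive timeout on every outbound call, so a degraded downstream releases the thread instead of holding it for the full 2,000ms. A minimal sketch using the JDK's built-in HTTP client; the service URL and the 200ms/400ms budgets are illustrative assumptions, not recommendations:
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

class InventoryClient {
    // connectTimeout bounds connection setup; the per-request timeout bounds the whole call
    private final HttpClient http = HttpClient.newBuilder()
            .connectTimeout(Duration.ofMillis(200))
            .build();

    String getStatus(String itemId) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(
                        URI.create("http://inventory-service/items/" + itemId))
                .timeout(Duration.ofMillis(400))   // give up long before the 2,000ms degraded case
                .GET()
                .build();
        return http.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }
}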
This is cascading failure. It's a distributed systems property, not a bug you can fix in any single service.
The latency addition problem
Even without failures, synchronous service chains add latency at every hop. A request that touches five services with average response times of 50ms each, where each call depends on the result of the previous, takes at minimum 250ms of pure network and processing time before your service adds any of its own. Parallelizing independent calls helps, but sequential dependencies put a hard floor under the total.
At 250ms, you're already above many UX guidelines for perceived responsiveness in interactive applications. For mobile clients with higher round-trip latency, it compounds further.
The specific failure modes synchronous calls introduce
Thread pool saturation: Threads blocked waiting for downstream responses still consume memory and a slot in a bounded pool. When a downstream slows, the number of in-flight requests grows even if incoming traffic doesn't (concurrency is roughly arrival rate × latency), so pools saturate without any spike in load. This is why even modest traffic during a downstream degradation can take an upstream service fully offline.
Retry amplification: If Service A retries failed calls to Service B three times, each original request becomes up to four attempts, and a brief B outage means those queued retries land on B just as it recovers. If multiple services retry simultaneously (the thundering herd problem), the recovering service gets hammered with amplified retry traffic before it can stabilize; the sketch after this list shows one way to soften it.
Timeout misconfiguration: If A calls B with a 5-second timeout and B calls C with a 5-second timeout, B can spend A's entire budget waiting on C and still need time for its own work, so A gives up even on requests B would eventually complete; add a single retry at each hop and the work still in flight stretches well past 10 seconds. Timeout values rarely account for the full call chain depth; budgets should shrink, not repeat, as you go down the chain.
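The usual way to soften retry amplification is to cap attempts and add jittered exponential backoff, so a recovering service sees retries spread out rather than arriving in synchronized waves. A rough sketch in plain Java; the base delay, cap, and attempt count are illustrative assumptions:
import java.util.concurrent.Callable;
import java.util.concurrent.ThreadLocalRandom;

class RetryWithJitter {
    // Retries a call a bounded number of times, sleeping a randomized,
    // exponentially growing delay between attempts ("full jitter").
    static <T> T callWithBackoff(Callable<T> call, int maxAttempts) throws Exception {
        long baseDelayMs = 100;
        for (int attempt = 1; ; attempt++) {
            try {
                return call.call();
            } catch (Exception e) {
                if (attempt >= maxAttempts) {
                    throw e;                   // give up; let the caller's fallback handle it
                }
                long cappedExponential = Math.min(5_000, baseDelayMs << (attempt - 1));
                long jitteredDelay = ThreadLocalRandom.current().nextLong(cappedExponential + 1);
                Thread.sleep(jitteredDelay);   // sleep somewhere in 0..cappedExponential ms
            }
        }
    }
}
Capping attempts bounds the worst-case amplification, and the randomized delay keeps independent callers from retrying in lockstep.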
Moving interactions to async where possible
The fundamental fix is to identify which synchronous interactions are not actually synchronous by necessity and convert them to event-driven patterns.
An Order Service that synchronously calls a Notification Service to send a confirmation email has no business doing so. The user doesn't wait for the email before their checkout completes. The interaction should be:
// Before: synchronous, adds latency and couples availability
notificationService.sendOrderConfirmation(order); // HTTP call
// After: publish event, notification service handles asynchronously
eventPublisher.publish(new OrderConfirmedEvent(order.getId(), order.getUserId()));
// Notification service consumes from Kafka and sends email independently
The Notification Service can be down for six hours. Orders are not affected. When it recovers, it processes the backlog of events. Nothing is lost. No retries needed at the Order Service level.
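For concreteness, the consuming side could look something like this. It's a sketch assuming spring-kafka with JSON deserialization already configured; the topic name, group id, and EmailSender collaborator are hypothetical:
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Component;

@Component
class OrderConfirmationListener {

    private final EmailSender emailSender;   // hypothetical mail-sending collaborator

    OrderConfirmationListener(EmailSender emailSender) {
        this.emailSender = emailSender;
    }

    // Consumes events at its own pace; a backlog simply accumulates
    // in the topic while this service is down.
    @KafkaListener(topics = "order-confirmed", groupId = "notification-service")
    void onOrderConfirmed(OrderConfirmedEvent event) {
        emailSender.sendOrderConfirmation(event);
    }
}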
Designing for partial availability
For interactions that are genuinely synchronous — where the response is needed before proceeding — you need to design each call around the assumption that the downstream will sometimes be slow or unavailable.
Circuit breakers (Resilience4j in Java, gobreaker in Go, Polly in .NET) wrap synchronous clients and stop calls to unhealthy downstreams before thread pools saturate:
import io.github.resilience4j.circuitbreaker.annotation.CircuitBreaker;

@CircuitBreaker(name = "inventoryService", fallbackMethod = "getInventoryFallback")
public InventoryStatus getInventory(String itemId) {
    return inventoryClient.getStatus(itemId);
}

// Fallback takes the same arguments, plus the triggering exception last
private InventoryStatus getInventoryFallback(String itemId, Exception e) {
    // Return cached data, or degrade gracefully
    return InventoryStatus.assumeAvailable(itemId);
}
The fallback is where you earn your keep architecturally. "Return cached data" is fine for catalog information that changes slowly. "Assume available" is a business risk decision — you might accept orders for out-of-stock items and deal with fulfillment failures downstream. Know what you're trading.
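None of this works until the breaker itself has thresholds that match your traffic. A minimal sketch of programmatic Resilience4j configuration; the specific values are illustrative assumptions, not recommendations:
import io.github.resilience4j.circuitbreaker.CircuitBreaker;
import io.github.resilience4j.circuitbreaker.CircuitBreakerConfig;
import io.github.resilience4j.circuitbreaker.CircuitBreakerRegistry;
import java.time.Duration;

CircuitBreakerConfig config = CircuitBreakerConfig.custom()
        .failureRateThreshold(50)                          // open when >=50% of recent calls fail
        .slowCallDurationThreshold(Duration.ofMillis(500)) // count calls slower than 500ms as failures
        .slowCallRateThreshold(50)
        .slidingWindowSize(20)                             // judge health over the last 20 calls
        .waitDurationInOpenState(Duration.ofSeconds(30))   // probe the downstream again after 30s
        .build();

CircuitBreakerRegistry registry = CircuitBreakerRegistry.of(config);
CircuitBreaker inventoryBreaker = registry.circuitBreaker("inventoryService");
With resilience4j-spring-boot, the same knobs typically live in configuration under resilience4j.circuitbreaker.instances.inventoryService, keyed to the name used in the annotation.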
Synchronous calls you can't eliminate should be short, independently resilient, and backed by a fallback that degrades gracefully rather than failing hard. For the rest — the notifications, the audit logs, the analytics, the downstream fulfillment triggers — publish events and stop waiting for responses you don't need.