Why Your Services Can't Stop Talking to Each Other

by Eric Hanson, Backend Developer at Clean Systems Consulting

What chatty services are telling you

Your order service calls the user service for profile data, the credit service for limit checks, the inventory service for availability, and the shipping service for rate calculations, all within a single request. You've added aggressive caching, tightened timeouts, and deployed an Envoy-based service mesh, and the latency is still unacceptable. The problem is not the network. The problem is that you've drawn your service boundaries in the wrong places and are now compensating with infrastructure.

Chatty services (services that can't serve a request without making multiple synchronous calls to other services) are a consistent indicator of one or more of these underlying issues: bounded contexts that were cut along technical layers rather than business domains, data that belongs in one service but lives in another, or orchestration logic that should be event-driven but is synchronous by design.

The layered architecture trap

The most common cause of chatty services is drawing service boundaries along technical layers rather than business capabilities. Teams coming from layered monolith architecture (presentation, business logic, data access) replicate that structure as services: a "data service," a "business logic service," an "API gateway service." This is backwards.

A "data service" that just wraps database access for other services is not a microservice. It's a remote repository layer. Every business operation requires calling it, which means every service is permanently coupled to it. Adding caching doesn't fix this — it just trades staleness risk for latency improvement while keeping the fundamental coupling intact.

Services should own their data and expose business capabilities, not raw data access:

❌ Layered (creates chatty services)
Request → Order Logic Service
           → GET /data/users/{id}       (User Data Service)
           → GET /data/inventory/{id}   (Inventory Data Service)
           → GET /data/prices/{id}      (Price Data Service)
           → do logic locally
           → POST /data/orders          (Order Data Service)

✅ Domain-oriented (services own their data)
Request → Order Service
           → GET /users/{id}/order-context   (User Service — returns only what ordering needs)
           → POST /orders/initiate            (Order Service does its own writes)
         Async: publishes OrderInitiated event
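
As a rough illustration of "returns only what ordering needs", the User Service could project its full profile into a narrow view before responding. This is a sketch assuming Java 17 or later; the record names and every field in them are hypothetical, since the article does not specify what the order context contains.

```java
import java.util.List;

// Sketch only: all type and field names below are assumptions for illustration.
class OrderContextSketch {

    // Hypothetical full profile owned by the User Service.
    record UserProfile(String id, String name, String email, String shippingCountry,
                       boolean active, List<String> marketingPreferences) {}

    // Hypothetical narrow view served by GET /users/{id}/order-context.
    record OrderContext(String userId, String shippingCountry, boolean active) {}

    // The User Service decides what ordering may see; the caller never
    // receives email addresses or marketing preferences.
    static OrderContext toOrderContext(UserProfile profile) {
        return new OrderContext(profile.id(), profile.shippingCountry(), profile.active());
    }

    public static void main(String[] args) {
        UserProfile profile = new UserProfile(
            "u-1", "Ada", "ada@example.com", "DK", true, List.of("newsletter"));
        System.out.println(toOrderContext(profile));
    }
}
```

The point of the narrow type is that the User Service can change the rest of its profile model without breaking the Order Service.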

Domain data replication as a coupling solution

When a service legitimately needs data from another domain for its own operations, the answer is often not a synchronous call — it's a local copy of the relevant data, kept current via events.

The Order Service needing to check whether a user is in good standing (active account, no fraud flags) does not require a synchronous call to the User Service on every order request. The User Service can publish UserStatusChanged events to a Kafka topic. The Order Service maintains a local user_order_eligibility table, consuming those events:

CREATE TABLE user_order_eligibility (
  user_id       UUID PRIMARY KEY,
  is_eligible   BOOLEAN NOT NULL DEFAULT TRUE,
  reason        VARCHAR(255),
  updated_at    TIMESTAMP NOT NULL
);

Now the Order Service checks eligibility locally with a single DB read. No network call. No dependency on User Service uptime. The trade-off is that the data is eventually consistent: if a user is flagged for fraud, there's a short window where they could still place orders. For most systems, that window (milliseconds to seconds, depending on event processing lag) is acceptable. If it's not acceptable, you have a synchronous query requirement, and you should model it that way explicitly.
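
The consuming side of that table might look roughly like the sketch below. It assumes Java 17 or later and uses an in-memory map as a stand-in for user_order_eligibility so it runs without Kafka or a database. The UserStatusChanged fields, the class names, and the rule that a user with no row is treated as eligible are all assumptions, not something the design above prescribes.

```java
import java.time.Instant;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the Order Service's local eligibility replica.
// All names and event fields here are hypothetical.
class EligibilityReplica {

    // Assumed payload of the event published by the User Service.
    record UserStatusChanged(UUID userId, boolean eligible, String reason, Instant occurredAt) {}

    // Stand-in for one row of user_order_eligibility.
    record Row(boolean eligible, String reason, Instant updatedAt) {}

    private final Map<UUID, Row> rows = new ConcurrentHashMap<>();

    // Called by the event consumer for each event. An event that is not
    // newer than the stored row is ignored, so redelivered or out-of-order
    // events cannot overwrite newer state.
    void apply(UserStatusChanged event) {
        rows.merge(
            event.userId(),
            new Row(event.eligible(), event.reason(), event.occurredAt()),
            (current, incoming) ->
                incoming.updatedAt().isAfter(current.updatedAt()) ? incoming : current);
    }

    // Local read on the order path: no call to the User Service.
    // Assumption: a user with no row yet is treated as eligible.
    boolean isEligible(UUID userId) {
        Row row = rows.get(userId);
        return row == null || row.eligible();
    }

    public static void main(String[] args) {
        EligibilityReplica replica = new EligibilityReplica();
        UUID user = UUID.randomUUID();
        replica.apply(new UserStatusChanged(user, false, "fraud_flag", Instant.now()));
        System.out.println("eligible: " + replica.isEligible(user));
    }
}
```

The timestamp guard matters because consumers generally have to tolerate redelivery; without it, a replayed older event could silently re-enable a flagged user.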

Orchestration versus choreography

Another source of chatty services is orchestration-heavy design: one service calling a sequence of other services to drive a workflow. The Order Service calls Inventory Service to reserve stock, calls Payment Service to charge the card, and calls Fulfillment Service to schedule delivery. Every step is a synchronous dependency, and every failure cascades.

Choreography — event-driven coordination — reduces this coupling. Each service reacts to events from the previous step without being called:

Order Service publishes: OrderConfirmed
  → Inventory Service consumes OrderConfirmed: reserves stock, publishes: StockReserved
  → Payment Service consumes StockReserved: charges card, publishes: PaymentCollected
  → Fulfillment Service consumes PaymentCollected: schedules delivery, publishes: ShipmentScheduled

No service calls another directly. The workflow emerges from event subscriptions. Adding a new step at the end of the chain means adding a consumer. Inserting one in the middle (a fraud check between order confirmation and inventory reservation) means a new consumer plus a change to the event Inventory Service subscribes to, but no change to Order Service. Removing a step works the same way in reverse. The coupling is to the event schema, not to other services' APIs.

The downside: workflow state is distributed. Debugging a failed workflow requires correlating events across multiple topics and services. You need distributed tracing and event correlation IDs from the start, not as an afterthought.
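
The order workflow and its correlation ID can be sketched with an in-memory event bus standing in for the broker. This assumes Java 17 or later; EventBus, Event, and the wiring are hypothetical names for illustration, and each handler only republishes the next event rather than doing real work.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// In-memory stand-in for a broker; all class and method names are hypothetical.
class ChoreographySketch {

    // Every event carries the correlation ID of the workflow it belongs to.
    record Event(String type, String correlationId, String orderId) {}

    static class EventBus {
        private final Map<String, List<Consumer<Event>>> subscribers = new HashMap<>();
        final List<Event> log = new ArrayList<>(); // everything published, in order

        void subscribe(String type, Consumer<Event> handler) {
            subscribers.computeIfAbsent(type, k -> new ArrayList<>()).add(handler);
        }

        void publish(Event event) {
            log.add(event);
            for (Consumer<Event> handler : subscribers.getOrDefault(event.type(), List.of())) {
                handler.accept(event);
            }
        }
    }

    // Each service subscribes to the previous step's event.
    // No service holds a reference to another service.
    static EventBus wire() {
        EventBus bus = new EventBus();
        bus.subscribe("OrderConfirmed", e -> bus.publish(next(e, "StockReserved")));       // Inventory
        bus.subscribe("StockReserved", e -> bus.publish(next(e, "PaymentCollected")));     // Payment
        bus.subscribe("PaymentCollected", e -> bus.publish(next(e, "ShipmentScheduled"))); // Fulfillment
        return bus;
    }

    // Copies the correlation ID from the causing event onto the new one.
    static Event next(Event cause, String type) {
        return new Event(type, cause.correlationId(), cause.orderId());
    }

    public static void main(String[] args) {
        EventBus bus = wire();
        bus.publish(new Event("OrderConfirmed", "corr-123", "order-42"));
        bus.log.forEach(e -> System.out.println(e.correlationId() + " " + e.type()));
    }
}
```

Because every event copies the ID from the event that caused it, filtering logs or traces on that one value reconstructs the whole workflow across services.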

When synchronous calls are unavoidable

Some inter-service calls are genuinely synchronous requirements: real-time credit decisions, inventory availability at checkout, pricing at point of sale. These should be the exception, not the default, and they should be designed with the assumption that the downstream service will sometimes be slow or unavailable.
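
One way to encode "will sometimes be slow or unavailable" is to bound every remaining synchronous call with a timeout and an explicit fallback. The sketch below uses CompletableFuture.completeOnTimeout (Java 9 or later); callWithFallback and the "UNKNOWN" sentinel are hypothetical, and whether a fallback is acceptable at all depends on the call (a credit decision probably should fail closed).

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Sketch only: helper and sentinel names are assumptions for illustration.
class BoundedCall {

    // Runs the downstream call asynchronously and returns the fallback if
    // it takes longer than timeoutMillis or throws. The timed-out call is
    // not interrupted; it keeps running in the background. For simplicity
    // this uses the common pool; a real service would likely pass a
    // dedicated executor for blocking I/O.
    static <T> T callWithFallback(Supplier<T> call, long timeoutMillis, T fallback) {
        return CompletableFuture.supplyAsync(call)
            .completeOnTimeout(fallback, timeoutMillis, TimeUnit.MILLISECONDS)
            .exceptionally(error -> fallback)
            .join();
    }

    // Hypothetical slow downstream call.
    static String slowInventoryLookup() {
        try {
            Thread.sleep(500);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "IN_STOCK";
    }

    public static void main(String[] args) {
        String status = callWithFallback(BoundedCall::slowInventoryLookup, 50, "UNKNOWN");
        System.out.println(status);
    }
}
```

Making the fallback a parameter forces each call site to decide what "downstream is unavailable" means for that request.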

If after restructuring your domain model you still have five synchronous calls per request, look at whether those calls can be parallelized. If they're independent, fan them out concurrently:

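// Start both calls at once; neither depends on the other's result.
CompletableFuture<UserContext> userFuture =
    CompletableFuture.supplyAsync(() -> userClient.getOrderContext(userId));
CompletableFuture<InventoryStatus> inventoryFuture =
    CompletableFuture.supplyAsync(() -> inventoryClient.getStatus(itemIds));

// Wait for both; join() throws CompletionException if either call failed.
CompletableFuture.allOf(userFuture, inventoryFuture).join();
UserContext user = userFuture.join();
InventoryStatus inventory = inventoryFuture.join();
// total latency ≈ max(user latency, inventory latency), not the sum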

But if you find yourself doing this routinely, it's still a signal that the domain model is wrong — you're compensating for a boundary problem with concurrency tricks.

The right question when services won't stop talking: which of these calls could be eliminated by moving data ownership to the service that needs it? Answer that first. Then optimize the calls that remain.
