Breaking Up a Monolith Without Breaking Everything

by Arif Ikhsanudin, Backend Developer

Why big-bang migrations fail

Your team has a plan: spend six months extracting five services from the monolith, then cut over. Six months later, three services are "done" but untested under production load, two are still in progress, the monolith has continued accumulating new features throughout, and the cutover date keeps slipping because nobody is confident the new services actually work correctly.

This is the big-bang migration pattern, and it fails consistently. The failure mode is not technical incompetence — it's that you've deferred all the risk to a cutover event that is now six months away and enormous in scope. Big-bang cutovers have no feedback loop: you won't know if the extracted services work correctly under production conditions until you flip the switch, at which point rolling back is painful.

The alternative is incremental extraction with production validation at each step. Small, reversible changes. Continuous production traffic on extracted components before committing to them.

Defining the right first extraction

The first service you extract sets the precedent for how your team works and validates your infrastructure (deployment pipeline, monitoring, service mesh). Choose it carefully.

The ideal first extraction candidate:

  • Has clear boundaries in the monolith — minimal cross-cutting concerns with other modules
  • Is not on the critical path for your most important user flows (so a bug doesn't cause a high-severity incident)
  • Has meaningful traffic volume (so you can validate behavior under real load, not just synthetic tests)
  • Has a team that is motivated and capable of owning a service

Notification systems, reporting services, and admin tooling are common good first candidates. Payment processing and user authentication are bad first candidates — too high-risk, too many implicit dependencies, too much blast radius if something goes wrong.

The parallel running pattern

Before routing any production traffic to the new service, run it in parallel with the monolith: the monolith handles the request authoritatively, and the new service also handles it, with results compared but not returned to users.

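This pattern (called dark launching or shadow testing) gives you production validation with very little production risk, as long as the shadow call has no user-visible side effects. For a write such as order creation, that means the new service writes to its own isolated store and suppresses emails, charges, and other outbound calls. The monolith side looks like this: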

// In the monolith: parallel call for comparison
public OrderResult createOrder(OrderRequest request) {
    // Primary path: the monolith handles the request authoritatively
    OrderResult monolithResult = monolithOrderService.create(request);

    // Shadow path: the new service runs asynchronously; its result is compared, then discarded.
    // shadowExecutor is assumed to be a dedicated, bounded pool so that blocking shadow calls
    // don't occupy the common ForkJoinPool that runAsync uses by default.
    CompletableFuture.runAsync(() -> {
        try {
            OrderResult serviceResult = newOrderService.create(request);
            compareResults(monolithResult, serviceResult, request.getOrderId());
        } catch (Exception e) {
            // A shadow failure must never affect the user-facing response
            log.warn("Shadow service error for order {}", request.getOrderId(), e);
        }
    }, shadowExecutor);

    return monolithResult;
}

Run this for one to two weeks. Monitor the comparison results for discrepancies. Fix bugs in the new service. When the discrepancy rate drops to near zero and you're confident in the new service's behavior, you're ready to route actual traffic.
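The shadow code calls compareResults without showing it. As a minimal sketch of what such a comparison could track, assuming the results can be reduced to a status and a total: the OrderSnapshot record, the ShadowComparator class, and the compared fields are hypothetical names chosen for illustration, not part of the original code.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical, simplified view of the fields worth comparing.
record OrderSnapshot(String status, BigDecimal total) {}

// Counts shadow comparisons and mismatches so the discrepancy rate can be monitored.
class ShadowComparator {
    private final AtomicLong compared = new AtomicLong();
    private final AtomicLong mismatched = new AtomicLong();

    // Returns the names of the fields that differ (empty when the results agree).
    List<String> compare(OrderSnapshot monolith, OrderSnapshot service) {
        List<String> diffs = new ArrayList<>();
        if (!monolith.status().equals(service.status())) {
            diffs.add("status");
        }
        // compareTo ignores scale, so 10.0 and 10.00 count as equal.
        if (monolith.total().compareTo(service.total()) != 0) {
            diffs.add("total");
        }
        compared.incrementAndGet();
        if (!diffs.isEmpty()) {
            mismatched.incrementAndGet();
        }
        return diffs;
    }

    // Fraction of comparisons that disagreed; 0.0 before any comparison has run.
    double discrepancyRate() {
        long total = compared.get();
        return total == 0 ? 0.0 : (double) mismatched.get() / total;
    }
}
```

Logging the differing field names alongside the order ID makes each discrepancy actionable, and exporting discrepancyRate() as a metric gives you the "near zero" signal to watch.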

Incremental traffic migration

Don't cut over 100% of traffic to the new service in one step. Use feature flags or weighted routing to migrate gradually:

  • 1% of traffic to new service, 99% to monolith → validate for 24 hours
  • 10% → validate for 48 hours
  • 50% → validate for one week
  • 100% → keep monolith code paths available for rollback
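If you take the feature-flag route, the percentage check can live in the application. The following is a rough sketch, assuming requests carry a stable key such as a customer or order ID; the RolloutGate class and its method names are hypothetical. Hashing the key keeps each caller on the same side of the split as the percentage grows.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical application-level rollout gate with deterministic bucketing.
class RolloutGate {
    private volatile int percentToNewService; // 0..100, adjustable at runtime

    RolloutGate(int initialPercent) {
        setPercent(initialPercent);
    }

    // Raising or lowering the percentage is the rollout (or rollback) lever.
    final void setPercent(int percent) {
        if (percent < 0 || percent > 100) {
            throw new IllegalArgumentException("percent must be between 0 and 100");
        }
        this.percentToNewService = percent;
    }

    // Maps a key to a stable bucket in [0, 100).
    static int bucketOf(String key) {
        CRC32 crc = new CRC32();
        crc.update(key.getBytes(StandardCharsets.UTF_8));
        return (int) (crc.getValue() % 100);
    }

    // True when this key falls inside the share currently sent to the new service.
    boolean routeToNewService(String key) {
        return bucketOf(key) < percentToNewService;
    }
}
```

Because a key's bucket never changes, anything routed to the new service at 10% is still routed there at 50%, which keeps a given user's experience consistent during the ramp.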

Weighted routing at the API gateway (Kong, NGINX, or your service mesh) makes this infrastructure-level rather than application-level:

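# Kong declarative config: weighted routing between monolith and new service.
# Kong balances requests across an upstream's targets in proportion to their weights.
# Host names and ports below are placeholders.
_format_version: "3.0"
upstreams:
- name: order-upstream
  targets:
  - target: monolith.internal:8080
    weight: 90
  - target: new-order-service.internal:8080
    weight: 10
services:
- name: order-service-routing
  host: order-upstream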

Monitor error rates and latency for both targets during the migration. If the new service's error rate exceeds the monolith's, roll back the weight. The rollback should take thirty seconds, not a deployment pipeline run.
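The rollback rule can be made mechanical. Here is one possible sketch, assuming you can read request and error counts per target from your metrics system; TargetStats, RollbackGuard, the minimum sample size, and the tolerance are all hypothetical choices, not values from any particular tool.

```java
// Hypothetical request and error counts for one routing target over a time window.
record TargetStats(long requests, long errors) {
    double errorRate() {
        return requests == 0 ? 0.0 : (double) errors / requests;
    }
}

// Decides whether the new service's weight should be rolled back.
class RollbackGuard {
    private final long minRequests;  // ignore windows with too little traffic
    private final double tolerance;  // allowed excess over the monolith's error rate

    RollbackGuard(long minRequests, double tolerance) {
        this.minRequests = minRequests;
        this.tolerance = tolerance;
    }

    boolean shouldRollBack(TargetStats monolith, TargetStats newService) {
        if (newService.requests() < minRequests) {
            return false; // not enough data to judge either way
        }
        return newService.errorRate() > monolith.errorRate() + tolerance;
    }
}
```

The minimum sample size matters most at the 1% stage, where a handful of errors on very little traffic would otherwise trigger a rollback on noise.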

Data migration strategy

The hardest part of any service extraction is not the code — it's the data. The new service needs its own database. Getting data out of the monolith's database and into the new service's database, consistently and without downtime, requires a strategy.

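Dual write: during migration, write to both the monolith database and the new service's database, and keep reading from the monolith. The two writes are not atomic, so treat the monolith as the source of truth and record any failed write to the new store for later repair. When you're ready to cut over reads to the new service, the data is already there and, once repairs have been applied, in sync.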
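A minimal sketch of the dual-write idea follows, using in-memory maps in place of real databases. OrderStore, InMemoryStore, and DualWriter are hypothetical names, and the repair list stands in for whatever durable retry mechanism you would use in practice.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical storage abstraction; a real one would wrap a database client.
interface OrderStore {
    void save(String orderId, String payload);
}

// In-memory stand-in for a database, for illustration only.
class InMemoryStore implements OrderStore {
    final Map<String, String> rows = new ConcurrentHashMap<>();

    @Override
    public void save(String orderId, String payload) {
        rows.put(orderId, payload);
    }
}

class DualWriter {
    private final OrderStore primary;   // monolith database: source of truth
    private final OrderStore secondary; // new service's database
    private final List<String> needsRepair = new CopyOnWriteArrayList<>();

    DualWriter(OrderStore primary, OrderStore secondary) {
        this.primary = primary;
        this.secondary = secondary;
    }

    void save(String orderId, String payload) {
        // A failure here propagates: the request fails exactly as it would have before.
        primary.save(orderId, payload);
        try {
            secondary.save(orderId, payload);
        } catch (RuntimeException e) {
            // The secondary write is best effort; remember the ID so it can be re-copied.
            needsRepair.add(orderId);
        }
    }

    // IDs whose secondary write failed and still need to be reconciled.
    List<String> pendingRepairs() {
        return List.copyOf(needsRepair);
    }
}
```

Writing to the monolith first means a failure in the new store can never cause a user-facing error, at the cost of needing a repair pass for the recorded IDs.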

Event-based sync: the monolith publishes change events to Kafka. The new service consumes those events and maintains its own data store. This works well when the monolith already has event publishing or can be instrumented to add it.
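On the consuming side, events can arrive more than once or out of order, so the new service's update logic should be idempotent. The sketch below leaves out the Kafka client wiring entirely and shows only that logic. It assumes each event carries a per-order version that increases with every change; the OrderChanged record and OrderProjection class are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical change event; assumes the monolith attaches an increasing version per order.
record OrderChanged(String orderId, long version, String status) {}

// The new service's view of orders, built purely from consumed events.
class OrderProjection {
    private final Map<String, OrderChanged> latest = new HashMap<>();

    // Applies the event if it is newer than what is stored.
    // Returns false for duplicates and stale (out-of-order) events.
    synchronized boolean apply(OrderChanged event) {
        OrderChanged current = latest.get(event.orderId());
        if (current != null && current.version() >= event.version()) {
            return false;
        }
        latest.put(event.orderId(), event);
        return true;
    }

    // Current status for an order, or null if no event has been seen for it.
    synchronized String statusOf(String orderId) {
        OrderChanged current = latest.get(orderId);
        return current == null ? null : current.status();
    }
}
```

With this check in place, replaying a topic from an earlier offset is safe, which is useful when you need to rebuild the new service's store after fixing a bug.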

Database-level replication with transformation: for initial data load, use a one-time migration script to copy and transform existing data into the new service's schema. For ongoing sync during the migration period, use Debezium (change data capture from the monolith's database) to replicate new changes.

Whichever strategy you use: validate data consistency between both stores throughout the migration period. A comparison job that runs hourly and logs discrepancies will catch problems before they reach users.
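The comparison job itself can start out very simple. The following sketch assumes both stores can be read as maps from order ID to a comparable value (for example a row checksum); ConsistencyCheck and its message strings are hypothetical, and a real job would page through the data in batches rather than load everything at once.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical consistency check between the monolith's data and the new service's data.
class ConsistencyCheck {
    // Returns orderId -> description for every row that differs between the two stores.
    static Map<String, String> diff(Map<String, String> monolith, Map<String, String> service) {
        Map<String, String> discrepancies = new TreeMap<>(); // sorted for readable logs
        for (Map.Entry<String, String> entry : monolith.entrySet()) {
            String other = service.get(entry.getKey());
            if (other == null) {
                discrepancies.put(entry.getKey(), "missing in new service");
            } else if (!other.equals(entry.getValue())) {
                discrepancies.put(entry.getKey(), "value mismatch");
            }
        }
        for (String key : service.keySet()) {
            if (!monolith.containsKey(key)) {
                discrepancies.put(key, "missing in monolith");
            }
        }
        return discrepancies;
    }
}
```

To run it hourly, one option is to schedule the call with a ScheduledExecutorService and log the returned map. Rows written during the comparison window can show up as false positives, so it helps to re-check a discrepancy before alerting on it.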

Keeping the monolith clean during extraction

While you're extracting services, the monolith continues receiving new features. Without discipline, new code in the monolith creates new dependencies that make future extraction harder.

Enforce the module boundaries being extracted: if you're extracting the Order module, add ArchUnit rules that prevent other monolith code from newly importing from the Order module. New features that would have gone into Order go into the new service instead. The monolith's Order code becomes read-only — no new development — while migration is in progress.

This requires team coordination but prevents the migration from being a treadmill where each extraction is followed by new entanglement.
