Read Replicas: The Database Scaling Strategy Most Teams Discover Too Late

by Arif Ikhsanudin, Backend Developer

The Wall You Hit

Your application's read traffic has grown steadily. The primary database CPU sits at 70% during peak hours. p95 read latency is climbing. You are close to the ceiling of what the current instance can handle, and you are starting to feel it. The instinct is to size up the instance. The better move — which you could have made three months ago with less urgency — is to add a read replica.

Most web applications are read-heavy. A typical ratio is 80–90% reads, 10–20% writes. A single primary database handles all of it. By adding a read replica — a continuously synchronized copy of the primary that serves read traffic — you shift the read load off the primary, leaving it to handle writes and critical reads with significantly more headroom.

How Replication Works

PostgreSQL and MySQL both implement streaming replication: the primary writes changes to a write-ahead log (WAL in PostgreSQL, binary log in MySQL), and replicas continuously consume that log, applying changes to maintain a synchronized copy.

The replica is eventually consistent with the primary. Under normal conditions, replication lag is sub-second — typically 10–100ms. Under write pressure or network congestion, lag can grow. A replica with 5 seconds of replication lag serves data that is up to 5 seconds old.

This has important implications for which reads can safely route to replicas:

# Safe to route to replica:
- Product catalog reads (price/inventory updates are infrequent)
- User profile reads (doesn't need to be millisecond-current)
- Analytics and reporting queries
- Historical data access
- Admin dashboard reads

# Must route to primary:
- Reads immediately after a write in the same transaction or session
  (read-your-writes consistency)
- Inventory checks during checkout (stale read could oversell)
- Balance checks before financial operations
- Any read where staleness causes a correctness problem

The Mechanics of Routing

The simplest routing is connection-based: configure your ORM or database driver to use the replica connection for reads and the primary connection for writes. Rails has a built-in connected_to with role-based routing. Django supports multiple database configuration with query routers.

More sophisticated routing handles the read-your-writes case: after a write, tag the session with a "prefer primary" flag for a short window (1–2 seconds), long enough for replication lag to clear. Subsequent reads in that window go to the primary; reads after the window route to the replica.

# Simplified read-your-writes routing:
def get_db_connection(user_session, operation_type):
    if operation_type == "write":
        user_session["last_write"] = time.time()
        return primary_conn

    # Route to primary briefly after a write for read-your-writes
    if user_session.get("last_write") and time.time() - user_session["last_write"] < 2:
        return primary_conn

    return replica_conn

Replicas for More Than Read Scaling

Failover. A replica can be promoted to primary in the event of a primary failure. RDS Multi-AZ and Cloud SQL High Availability use a synchronous standby replica (replication with no lag, lower write throughput) for automatic failover with sub-60-second recovery time.

Analytics and reporting. Long-running analytical queries — the kind that might take 30 seconds and scan millions of rows — will devastate the primary during peak hours. Running them against a replica (possibly a dedicated read replica not used for application reads) isolates the analytical workload from the production read path.

Blue/green database migration. Replicas can be used as the target for a migration — run the migration on a replica, verify it, promote it. Reduces the risk window for major schema changes.

When to Add One

Add a read replica before you are forced to. The moment your primary exceeds 60% CPU during normal operating hours, you have lost the headroom to absorb spikes without a problem. A replica is a managed service add-on in most cloud environments — RDS, Cloud SQL, Azure Database all offer one-click replica provisioning. The cost is roughly the instance cost of the replica.

The mistake is waiting until you are at 90% CPU and in an incident to make the decision. At that point, the replica takes time to provision and sync — minutes to hours depending on data volume — during which you are still degraded. Add it proactively.

Scale Your Backend - Need an Experienced Backend Developer?

We provide backend engineers who join your team as contractors to help build, improve, and scale your backend systems.

We focus on clean backend design, clear documentation, and systems that remain reliable as products grow. Our goal is to strengthen your team and deliver backend systems that are easy to operate and maintain.

We work from our own development environments and support teams across US, EU, and APAC timezones. Our workflow emphasizes documentation and asynchronous collaboration to keep development efficient and focused.

  • Production Backend Experience. Experience building and maintaining backend systems, APIs, and databases used in production.
  • Scalable Architecture. Design backend systems that stay reliable as your product and traffic grow.
  • Contractor Friendly. Flexible engagement for short projects, long-term support, or extra help during releases.
  • Focus on Backend Reliability. Improve API performance, database stability, and overall backend reliability.
  • Documentation-Driven Development. Development guided by clear documentation so teams stay aligned and work efficiently.
  • Domain-Driven Design. Design backend systems around real business processes and product needs.

Tell us about your project

Our offices

  • Copenhagen
    1 Carlsberg Gate
    1260, København, Denmark
  • Magelang
    12 Jalan Bligo
    56485, Magelang, Indonesia

More articles

How to Evaluate a Backend Project Before Accepting the Work

“Seems doable… but something feels off.” That hesitation is worth listening to — especially in backend work.

Read more

When Laptops Are Domain-Locked and Developers Can’t Install Tools

Nothing kills momentum faster than a laptop you can’t fully control. As contractors, working on a domain-locked machine is often a recipe for frustration.

Read more

Abstractions Are Powerful Until They Hide Too Much

Abstractions reduce cognitive load by hiding irrelevant details. They become a liability when they hide details that are actually relevant — leaving developers confused about system behavior and unable to reason about failures.

Read more

New Zealand's Tech Talent Pool Is Small. Async Remote Contractors Are How Startups Close the Gap

You've been looking for a senior backend engineer for three months. You've seen every relevant CV in Auckland twice. The pool isn't refreshing — it's the same twelve people.

Read more