Read Replicas: The Database Scaling Strategy Most Teams Discover Too Late
by Eric Hanson, Backend Developer at Clean Systems Consulting
The Wall You Hit
Your application's read traffic has grown steadily. The primary database CPU sits at 70% during peak hours. p95 read latency is climbing. You are close to the ceiling of what the current instance can handle, and you are starting to feel it. The instinct is to size up the instance. The better move — which you could have made three months ago with less urgency — is to add a read replica.
Most web applications are read-heavy. A typical ratio is 80–90% reads, 10–20% writes. A single primary database handles all of it. By adding a read replica — a continuously synchronized copy of the primary that serves read traffic — you shift the read load off the primary, leaving it to handle writes and critical reads with significantly more headroom.
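As back-of-envelope arithmetic (the traffic numbers are illustrative, using the read ratio above):

```python
# Rough load split for a read-heavy workload at an assumed 10k qps peak.
total_qps = 10_000
read_pct = 85  # within the typical 80-90% range

reads_offloaded = total_qps * read_pct // 100  # qps the replica absorbs
writes_remaining = total_qps - reads_offloaded  # qps left on the primary

print(reads_offloaded, writes_remaining)
```

Moving the reads leaves the primary serving 15% of its former load, which is the headroom the rest of this article is about.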
How Replication Works
PostgreSQL and MySQL both implement streaming replication: the primary writes changes to a write-ahead log (WAL in PostgreSQL, binary log in MySQL), and replicas continuously consume that log, applying changes to maintain a synchronized copy.
The replica is eventually consistent with the primary. Under normal conditions, replication lag is sub-second — typically 10–100ms. Under write pressure or network congestion, lag can grow. A replica with 5 seconds of replication lag serves data that is up to 5 seconds old.
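Lag is observable, not just theoretical. On PostgreSQL, the built-in `pg_last_xact_replay_timestamp()` reports, on the replica, the commit time of the last replayed transaction; a minimal sketch of turning that into a staleness number (how you run the query against your replica is up to your driver):

```python
from datetime import datetime, timezone
from typing import Optional

# Run this on the replica; pg_last_xact_replay_timestamp() is a
# PostgreSQL built-in returning the commit time of the last replayed txn.
REPLICA_LAG_QUERY = "SELECT pg_last_xact_replay_timestamp();"

def replication_lag_seconds(last_replay: datetime,
                            now: Optional[datetime] = None) -> float:
    """Approximate replica staleness from its last replayed commit time."""
    now = now or datetime.now(timezone.utc)
    return max(0.0, (now - last_replay).total_seconds())
```

A health check can take a replica out of the read pool when this value exceeds a threshold, say 1–2 seconds.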
This has important implications for which reads can safely route to replicas:
# Safe to route to replica:
- Product catalog reads (price/inventory updates are infrequent)
- User profile reads (don't need to be millisecond-current)
- Analytics and reporting queries
- Historical data access
- Admin dashboard reads
# Must route to primary:
- Reads immediately after a write in the same transaction or session
(read-your-writes consistency)
- Inventory checks during checkout (stale read could oversell)
- Balance checks before financial operations
- Any read where staleness causes a correctness problem
The Mechanics of Routing
The simplest routing is connection-based: configure your ORM or database driver to use the replica connection for reads and the primary connection for writes. Rails provides built-in role-based routing via connected_to. Django supports multiple-database configurations with query routers.
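In Django, that routing lives in a router class registered in settings. A minimal sketch, assuming settings.DATABASES defines a "default" primary and the replica aliases below (the alias names are illustrative):

```python
import random

class PrimaryReplicaRouter:
    """Django-style router: writes to the primary, reads to a replica.

    Assumes settings.DATABASES defines 'default' plus the replica
    aliases listed in REPLICAS.
    """
    REPLICAS = ["replica1", "replica2"]

    def db_for_read(self, model, **hints):
        # Spread read traffic across available replicas
        return random.choice(self.REPLICAS)

    def db_for_write(self, model, **hints):
        return "default"

    def allow_relation(self, obj1, obj2, **hints):
        # Primary and replicas hold the same data, so relations are safe
        return True
```

Registered via DATABASE_ROUTERS, this moves every ORM read off the primary; note it does nothing yet for read-your-writes, which the next section addresses.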
More sophisticated routing handles the read-your-writes case: after a write, tag the session with a "prefer primary" flag for a short window (1–2 seconds), long enough for replication lag to clear. Subsequent reads in that window go to the primary; reads after the window route to the replica.
# Simplified read-your-writes routing:
import time

def get_db_connection(user_session, operation_type):
    if operation_type == "write":
        # Record the write so this session's reads stick to the primary
        user_session["last_write"] = time.time()
        return primary_conn
    # Route to primary briefly after a write for read-your-writes
    if user_session.get("last_write") and time.time() - user_session["last_write"] < 2:
        return primary_conn
    return replica_conn
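The same idea is easier to exercise with an injectable clock; here connection objects are replaced by string labels purely for illustration:

```python
import time
from typing import Optional

PRIMARY, REPLICA = "primary", "replica"
STICKY_SECONDS = 2.0  # keep reads on the primary this long after a write

def route(session: dict, operation: str, now: Optional[float] = None) -> str:
    """Decide which connection an operation should use."""
    now = time.time() if now is None else now
    if operation == "write":
        session["last_write"] = now
        return PRIMARY
    # Read-your-writes: recent writers keep reading from the primary
    if now - session.get("last_write", float("-inf")) < STICKY_SECONDS:
        return PRIMARY
    return REPLICA
```

A write at t=100 pins that session's reads to the primary until t=102; by t=103 its reads fall back to the replica, and sessions that never wrote go straight to the replica.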
Replicas for More Than Read Scaling
Failover. A replica can be promoted to primary in the event of a primary failure. RDS Multi-AZ and Cloud SQL High Availability use a synchronous standby replica (the primary waits for the standby to acknowledge each commit, eliminating lag at the cost of write latency) for automatic failover with sub-60-second recovery time.
Analytics and reporting. Long-running analytical queries — the kind that might take 30 seconds and scan millions of rows — will devastate the primary during peak hours. Running them against a replica (possibly a dedicated read replica not used for application reads) isolates the analytical workload from the production read path.
Blue/green database migration. Replicas can be used as the target for a migration — run the migration on a replica, verify it, promote it. Reduces the risk window for major schema changes.
When to Add One
Add a read replica before you are forced to. The moment your primary exceeds 60% CPU during normal operating hours, you have lost the headroom to absorb traffic spikes. A replica is a managed-service add-on in most cloud environments — RDS, Cloud SQL, and Azure Database all offer one-click replica provisioning. The cost is roughly the instance cost of the replica.
The mistake is waiting until you are at 90% CPU and in an incident to make the decision. At that point, the replica takes time to provision and sync — minutes to hours depending on data volume — during which you are still degraded. Add it proactively.