What Happens When Your Cache and Your Database Disagree

by Eric Hanson, Backend Developer at Clean Systems Consulting

When Two Sources of Truth Diverge

A user updates their shipping address. Your application writes the new address to the database and deletes the cache entry. Half a second later, another request comes in. The cache miss triggers a database read — but due to replication lag, the read replica that serves this query still has the old address. The new cache entry is populated with the old value. For the next five minutes, until the TTL expires, every request sees the old address even though the database primary has the correct one.

This is not a hypothetical failure mode. It is a specific, predictable consequence of using read replicas in combination with a write-invalidate caching pattern. Understanding the mechanics of how cache and database diverge tells you how to structure your reads and writes to minimize the window.

The Four Ways Divergence Happens

Replication lag on invalidation. You invalidate the cache after a write, but the subsequent cache miss reads from a replica that has not yet received the write. The cache is repopulated with stale data from the replica. This is the scenario above.

Mitigation: on a cache miss that follows a write, read from the primary. A simple way to implement this is to set a short-lived flag in the user's session or request context meaning "this user just wrote data; route their reads to the primary." Alternatively, write a short-lived marker into the cache at invalidation time; a miss that finds the marker reads from the primary instead of a replica, and once the marker expires, normal replica reads resume.
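A minimal sketch of the session-flag approach. The names (`RECENT_WRITES`, `mark_recent_write`, `pick_database`) and the dict-based stand-ins for the session store and databases are illustrative assumptions, not a specific library's API:

```python
import time

# Illustrative in-memory stand-in for a per-user "recently wrote" record.
RECENT_WRITES = {}          # user_id -> timestamp of last write
PIN_TO_PRIMARY_SECONDS = 5  # must exceed worst-case replication lag

def mark_recent_write(user_id):
    """Call this immediately after a successful database write."""
    RECENT_WRITES[user_id] = time.monotonic()

def pick_database(user_id, primary, replica):
    """Route reads to the primary while the user's write may still be replicating."""
    wrote_at = RECENT_WRITES.get(user_id)
    if wrote_at is not None and time.monotonic() - wrote_at < PIN_TO_PRIMARY_SECONDS:
        return primary
    return replica
```

The pinning window only needs to cover worst-case replication lag, so it can be much shorter than the cache TTL.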

Failed invalidation. The database write succeeds. The cache invalidation fails — Redis is momentarily unavailable, the network drops, the process crashes. The stale cache entry persists until TTL.

Mitigation: keep TTLs short enough that the backstop is meaningful. Accept that write-through invalidation without a TTL backstop is not a reliable strategy.
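A sketch of write-invalidate with a TTL backstop, using plain dicts in place of the database and Redis (the function names `save_user` and `load_user` are illustrative):

```python
CACHE_TTL_SECONDS = 60  # backstop: staleness is bounded even if invalidation fails

def save_user(db, cache, user_id, data):
    db[user_id] = data                      # the database write must succeed
    try:
        cache.pop(f"user:{user_id}", None)  # best-effort invalidation
    except ConnectionError:
        pass  # tolerate the failure: the TTL set at populate time bounds staleness

def load_user(db, cache, user_id):
    key = f"user:{user_id}"
    if key not in cache:
        # Never populate without a TTL; with Redis this would be
        # cache.set(key, value, ex=CACHE_TTL_SECONDS)
        cache[key] = db[user_id]
    return cache[key]
```

The essential rule is in `load_user`: every populate carries a TTL, so a lost invalidation degrades to bounded staleness rather than permanent divergence.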

Race between two concurrent writes. Writer A updates the record and deletes the cache; Writer B does the same a moment later. The next cache miss reads from a replica, and depending on replica lag, the repopulated value may predate Writer B's change, or even both writes.

# Timeline of a write-write race:
t=0ms: Writer A: UPDATE users SET name = 'Alice' WHERE id = 1
t=1ms: Writer A: DELETE cache["user:1"]
t=2ms: Writer B: UPDATE users SET name = 'Alicia' WHERE id = 1
t=3ms: Writer B: DELETE cache["user:1"]
t=4ms: Cache miss -- read from replica
       Replica lag = 20ms -- replica still has original value "Alex"
t=4ms: cache["user:1"] = "Alex"  <- both "Alice" and "Alicia" are now wrong

Resolving this scenario requires either optimistic locking with version numbers, or accepting the eventual consistency window and using the TTL to bound it.
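A sketch of the optimistic-locking variant, assuming each row carries a monotonically increasing version column. A dict stands in for the cache; in real Redis, this check-and-set would need a Lua script or WATCH/MULTI to be atomic:

```python
def cache_set_if_newer(cache, key, version, value):
    """Populate the cache only if this value is at least as new as what's there.

    Each cache entry is stored as (version, value), so a repopulate carrying
    a stale replica read loses to an entry written from a newer version.
    """
    current = cache.get(key)
    if current is None or version > current[0]:
        cache[key] = (version, value)
        return True
    return False
```

With this guard, the write-write race in the timeline above becomes harmless: the stale repopulate from the lagging replica carries an old version number and is simply rejected.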

Manual database changes. A developer runs an UPDATE directly on the database to fix a data issue. The cache is not invalidated, so it keeps serving the pre-fix value. This is the most common cause of "but I fixed it in the database, so why is the application still showing the old value?"

Mitigation: make manual database fixes include a corresponding cache invalidation step. For critical data, use an internal admin API that handles both operations atomically rather than direct database access.
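One shape such an admin entry point could take (a hedged sketch; `admin_fix_user` and the dict-based stand-ins are illustrative):

```python
def admin_fix_user(db, cache, user_id, data, audit_log):
    """Single entry point for manual data fixes.

    The cache invalidation cannot be forgotten because it is part of the
    same operation as the write, and the fix leaves an audit trail.
    """
    db[user_id] = data
    cache.pop(f"user:{user_id}", None)
    audit_log.append((user_id, data))
```

The point is not the three lines themselves but the constraint: once manual fixes must go through one function, "forgot the cache" stops being a possible failure mode.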

Patterns That Reduce Divergence Risk

Read-your-writes consistency. Route the read that immediately follows a write to the same primary the write went to. This eliminates the replication lag window for the writing user.
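Besides time-based pinning, read-your-writes can be implemented by tracking the replication position (LSN in PostgreSQL, GTID in MySQL) of each user's last write and serving from a replica only once it has caught up. A dict-based sketch, with `read_user` and the LSN plumbing as illustrative assumptions:

```python
def read_user(user_id, primary, replica, last_write_lsn, replica_lsn):
    """Serve from the replica only if it has applied this user's last write.

    last_write_lsn: dict of user_id -> replication position of their last write
    replica_lsn: callable returning the replica's current applied position
    """
    if replica_lsn() >= last_write_lsn.get(user_id, 0):
        return replica.get(user_id)
    return primary.get(user_id)
```

Unlike a fixed pinning window, this routes to the primary for exactly as long as the replica is actually behind, and not a moment longer.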

Versioned cache keys. Instead of invalidating a cache entry, write a new entry with a new version key and update a pointer to the current version. Readers always look up the current version pointer, then fetch the versioned entry.

# Versioned cache key pattern:
def get_user(user_id):
    version = cache.get(f"user:{user_id}:version") or 1
    user = cache.get(f"user:{user_id}:v{version}")
    if user is None:
        user = db.get_user(user_id)  # fall back to the database on a miss
    return user

def update_user(user_id, data):
    db.update(user_id, data)
    new_version = db.get_version(user_id)
    # Write the new entry first, then move the pointer, so readers
    # never follow a pointer to an entry that does not exist yet
    cache.set(f"user:{user_id}:v{new_version}", data, ex=300)
    cache.set(f"user:{user_id}:version", new_version, ex=300)
    # Old versioned entries simply expire; nothing is ever invalidated

Short TTL as the consistency bound. Accept eventual consistency with a defined bound. A 30-second TTL means divergence resolves within 30 seconds. This is an explicit choice, not a failure — document it, ensure stakeholders understand it, and verify it is acceptable for the data type.
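A dict-based sketch of a TTL-bounded read-through helper (the name `cached_get` and the `(expires_at, value)` layout are illustrative; Redis handles expiry natively via `ex=`):

```python
import time

def cached_get(cache, key, loader, ttl):
    """Read-through cache with an explicit staleness bound.

    Each entry stores (expires_at, value); once expired, the next read
    reloads from the source, so divergence lasts at most `ttl` seconds.
    """
    entry = cache.get(key)
    now = time.monotonic()
    if entry is None or entry[0] <= now:
        value = loader(key)
        cache[key] = (now + ttl, value)
        return value
    return entry[1]
```

Picking `ttl` per data type makes the consistency bound an explicit, reviewable number rather than an accident of configuration.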

The goal is not eliminating divergence; that would require a transaction spanning the cache and the database, which is impractical. The goal is understanding the divergence window, bounding it, and ensuring the system fails safely when divergence occurs.
