Caching Is Not a Silver Bullet. It Is a Trade-off.

February 24, 2026

by Arif Ikhsanudin, Backend Developer

The Instinct Is Right but Incomplete

When a database query is slow, the instinct to cache the result is correct. Read the data once, store it, serve subsequent reads from memory. Response time drops. Database load drops. Everything looks better.

What the dashboard does not show: you now have two copies of the data in two different systems with no automatic mechanism to keep them in sync. Every write to the database creates a potential inconsistency with the cache. Every cache entry has a lifetime after which it may be stale. Every cache miss falls through to the database you were trying to protect.

Caching trades consistency for performance. That trade is often worth making. The mistake is not seeing it as a trade at all.

What You Are Actually Buying

Caching buys reduced read latency and reduced load on the origin (database, external service). The performance benefit is real and often dramatic. A Redis lookup that returns in about 1ms, serving a result that would take 200ms to compute, is roughly a 200x improvement on cache hits, and more if the lookup is genuinely sub-millisecond.

What you are paying: staleness window, consistency complexity, an additional failure mode (cache unavailability), and memory cost for the cached data.

The staleness window is the most underappreciated cost. A cache entry with a 60-second TTL means any client can see data that is up to 60 seconds old. For a product catalog, that is acceptable — prices change infrequently, and a 60-second lag has no business impact. For an account balance, it is not — a user who just transferred money expects to see the updated balance immediately.
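The staleness window can be made concrete with a small sketch. The `TTLCache` class, the `read_price` helper, and the simulated clock below are all hypothetical names invented for illustration; a dictionary stands in for the database so the example runs without any external service.

```python
import time


class TTLCache:
    """Tiny in-memory cache with per-entry expiry (illustrative only)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key, value, ttl):
        self._entries[key] = (value, self._clock() + ttl)


def read_price(cache, db, product_id, ttl=60):
    """Cache-aside read: serve from the cache, fall back to the origin on a miss."""
    key = f"price:{product_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = db[product_id]
    cache.set(key, value, ttl)
    return value


# Simulated clock so the example runs instantly instead of sleeping.
now = [0.0]
cache = TTLCache(clock=lambda: now[0])
db = {"sku-1": 100}

first = read_price(cache, db, "sku-1")  # miss: reads 100 from the origin
db["sku-1"] = 120                       # a write the cache never hears about
now[0] = 59.0
stale = read_price(cache, db, "sku-1")  # still inside the TTL: old value
now[0] = 60.0
fresh = read_price(cache, db, "sku-1")  # entry expired: re-read from the origin
print(first, stale, fresh)              # 100 100 120
```

Nothing in the read path is wrong here. The stale read at second 59 is the cache working as designed, which is why the tolerance has to be decided per kind of data.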

Cache what changes infrequently and is expensive to compute. Do not cache what changes frequently and must be current.

The Failure Modes

Cache stampede (thundering herd). A popular cache entry expires. At the moment of expiration, 500 concurrent requests all miss the cache, all hit the database simultaneously, and all attempt to populate the cache simultaneously. The database receives 500 queries for data it was previously being shielded from.

Mitigation: probabilistic early expiration (refresh cache before it expires, with probability increasing as expiration approaches) or a distributed lock on cache population — the first miss acquires the lock and populates, others wait for the new value.

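# Cache stampede prevention with a simple lock:
import time
import uuid

import redis

cache = redis.Redis(decode_responses=True)  # assumes redis-py and a Redis server on localhost


def get_with_lock(key, ttl, compute_fn, retries=20, wait=0.1):
    value = cache.get(key)
    if value is not None:
        return value

    lock_key = f"lock:{key}"
    token = str(uuid.uuid4())  # identifies this particular lock holder
    acquired = cache.set(lock_key, token, nx=True, ex=5)  # 5s lock timeout

    if acquired:
        try:
            value = compute_fn()
            cache.set(key, value, ex=ttl)
            return value
        finally:
            # Release only our own lock: it may have expired and been taken by someone else.
            if cache.get(lock_key) == token:  # get + delete is not atomic; a Lua script closes the gap
                cache.delete(lock_key)

    # Another process is computing: poll briefly for the new value
    for _ in range(retries):
        time.sleep(wait)
        value = cache.get(key)
        if value is not None:
            return value
    return compute_fn()  # Lock holder was too slow or failed: compute without caching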
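The other mitigation mentioned above, probabilistic early expiration, can be sketched in the same spirit. The version below follows the approach sometimes referred to as XFetch: each reader occasionally treats the entry as expired a little early, and how early scales with how long the last recomputation took. The `fetch` function, the module-level `_store` dictionary standing in for a real cache, and the `beta`, `now`, and `rng` parameters are assumptions made for illustration, not part of any library.

```python
import math
import random
import time

_store = {}  # key -> (value, delta, expiry); stand-in for a real cache


def fetch(key, ttl, compute_fn, beta=1.0, now=time.time, rng=random.random):
    """Return the cached value, recomputing early with rising probability near expiry."""
    entry = _store.get(key)
    if entry is not None:
        value, delta, expiry = entry
        # random() is in [0, 1); clamp away from 0 so log() is defined.
        u = max(rng(), 1e-12)
        # -log(u) is a non-negative random number, so this shifts "now" forward
        # by a random amount proportional to delta (the last recompute duration).
        if now() - delta * beta * math.log(u) < expiry:
            return value
    start = now()
    value = compute_fn()
    delta = now() - start
    _store[key] = (value, delta, now() + ttl)
    return value
```

With `beta=1.0` most readers keep using the cached value until close to expiry, and typically one of them refreshes it shortly before the deadline; a larger `beta` refreshes earlier. Unlike the lock, this needs no coordination between processes, but it does not strictly guarantee a single recomputation.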

Cache as a crutch for a slow query. If the query behind the cache is slow, cache misses are expensive. When the hit rate drops (new users, cache restarts, invalidation events), the database is exposed. A 100ms query behind a 99% hit rate looks harmless, but during a cache restart the hit rate is zero: 100 concurrent requests become 100 concurrent 100ms queries, and they slow each other down. Fix slow queries; do not hide them.
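The exposure described above can be put in rough numbers. The sketch below assumes a hypothetical 1,000 requests per second and uses Little's law (average concurrency equals arrival rate times time in the system) to estimate how many queries the database holds in flight. It also assumes the query stays at 100ms under load, which is optimistic: in practice latency rises as concurrency does.

```python
def origin_load(requests_per_second, hit_rate, query_seconds):
    """Return (queries per second reaching the database, average queries in flight)."""
    misses_per_second = requests_per_second * (1.0 - hit_rate)
    # Little's law: average concurrency = arrival rate * time in system.
    in_flight = misses_per_second * query_seconds
    return misses_per_second, in_flight


warm = origin_load(1000, 0.99, 0.1)  # steady state with a warm cache
cold = origin_load(1000, 0.0, 0.1)   # right after a cache restart

for label, (misses, in_flight) in (("warm", warm), ("cold", cold)):
    print(f"{label}: {misses:.0f} queries/s, {in_flight:.0f} in flight")
# warm: 10 queries/s, 1 in flight
# cold: 1000 queries/s, 100 in flight
```

A database sized for the warm case sees a hundredfold jump in the cold case, which is the argument for making the underlying query fast enough to survive a cold cache.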

Over-caching writes. Write-through caching (every write updates the cache and the database together, before the write is acknowledged) seems safe but adds latency to every write. Write-behind caching (writes land in the cache first and are persisted to the database asynchronously) risks data loss if the cache fails before the write is persisted. Neither is a good default for write-heavy data.
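The difference between the two write strategies is easiest to see side by side. The following is a minimal in-memory sketch, with plain dictionaries standing in for the cache and the database; the class names and the `crash` method are hypothetical and exist only to simulate losing the cache before pending writes are persisted.

```python
from collections import deque


class WriteThroughCache:
    """Every write goes to the database and the cache before returning."""

    def __init__(self, db):
        self.db = db
        self.cache = {}

    def write(self, key, value):
        self.db[key] = value     # caller waits for the durable write...
        self.cache[key] = value  # ...and for the cache update


class WriteBehindCache:
    """Writes land in the cache first and reach the database on a later flush."""

    def __init__(self, db):
        self.db = db
        self.cache = {}
        self.pending = deque()   # writes not yet persisted

    def write(self, key, value):
        self.cache[key] = value  # fast: only memory is touched
        self.pending.append((key, value))

    def flush(self):
        while self.pending:
            key, value = self.pending.popleft()
            self.db[key] = value

    def crash(self):
        # Simulate losing the cache process before the flush runs.
        self.cache.clear()
        self.pending.clear()


db = {}
behind = WriteBehindCache(db)
behind.write("balance:42", 50)
behind.crash()
print("balance:42" in db)  # False: the acknowledged write is gone
```

The write-through version never loses an acknowledged write, but every caller pays for both updates. The write-behind version is fast precisely because it defers the step that makes the write durable.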

When Not to Cache

Do not cache user-specific data at a shared cache layer without scoping it carefully by user ID — caching the wrong user's data for another user is a serious security incident. Do not cache data with regulatory requirements for freshness (financial balances, health records). Do not cache computed results from operations with side effects.
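One way to make user scoping hard to forget is to build every key for user-specific data through a single helper that refuses to run without a user ID. The `scoped_key` function and the key layout below are hypothetical, shown only as a sketch of the idea.

```python
def scoped_key(user_id, resource, *parts):
    """Build a cache key that always carries the owning user's ID."""
    if user_id is None or str(user_id) == "":
        raise ValueError("refusing to build a shared key for user-specific data")
    return ":".join(["user", str(user_id), resource, *map(str, parts)])


print(scoped_key(42, "orders", "page", 1))  # user:42:orders:page:1
```

This assumes IDs and key parts do not themselves contain the `:` separator; if they can, they would need escaping so that two different users can never produce the same key.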

Cache reads. Be careful about caching writes. Be explicit about staleness tolerance before you choose a TTL. These decisions belong in the design, not as an afterthought when performance problems surface.
