Caching Is Not a Silver Bullet. It Is a Trade-off.

by Arif Ikhsanudin, Backend Developer

The Instinct Is Right but Incomplete

When a database query is slow, the instinct to cache the result is correct. Read the data once, store it, serve subsequent reads from memory. Response time drops. Database load drops. Everything looks better.

What the dashboard does not show: you now have two copies of the data in two different systems with no automatic mechanism to keep them in sync. Every write to the database creates a potential inconsistency with the cache. Every cache entry has a lifetime after which it may be stale. Every cache miss falls through to the database you were trying to protect.

Caching trades consistency for performance. That trade is often worth making. The mistake is not seeing it as a trade at all.

What You Are Actually Buying

Caching buys reduced read latency and reduced load on the origin (database, external service). The performance benefit is real and often dramatic: a Redis cache hit served in under a millisecond, for a result that would take 200ms to compute, is an improvement of 200x or more.

What you are paying: a staleness window, added consistency complexity, an additional failure mode (cache unavailability), and the memory cost of the cached data.

The staleness window is the most underappreciated cost. A cache entry with a 60-second TTL means any client can see data that is up to 60 seconds old. For a product catalog, that is acceptable — prices change infrequently, and a 60-second lag has no business impact. For an account balance, it is not — a user who just transferred money expects to see the updated balance immediately.

Cache what changes infrequently and is expensive to compute. Do not cache what changes frequently and must be current.
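To make the staleness window concrete, here is a minimal in-memory sketch of a cache-aside read with a TTL. The names `TTLCache` and `read_price`, the `sku-1` key, and the simulated clock are all assumptions made for illustration; a real deployment would use Redis or similar rather than a Python dict.

```python
import time


class TTLCache:
    """Tiny in-memory TTL cache (illustrative only; not thread-safe)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._store = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._store[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key, value, ttl):
        self._store[key] = (value, self._clock() + ttl)


def read_price(cache, db, sku, ttl=60):
    """Cache-aside read: serve from the cache, fall back to the origin on a miss."""
    price = cache.get(sku)
    if price is None:
        price = db[sku]
        cache.set(sku, price, ttl)
    return price


# Simulated clock so the example runs instantly.
now = [0.0]
cache = TTLCache(clock=lambda: now[0])
db = {"sku-1": 100}  # hypothetical origin data

first = read_price(cache, db, "sku-1")  # miss: reads 100 from db
db["sku-1"] = 120                       # the write goes to the database only
now[0] = 30.0
stale = read_price(cache, db, "sku-1")  # hit: still 100, now 30 seconds stale
now[0] = 61.0
fresh = read_price(cache, db, "sku-1")  # entry expired: reads 120 from db
```

Nothing in the read path knows the price changed. The only thing that bounds the error is the TTL, which is why the tolerance has to be chosen per kind of data.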

The Failure Modes

Cache stampede (thundering herd). A popular cache entry expires. At the moment of expiration, 500 concurrent requests all miss the cache, all hit the database simultaneously, and all attempt to populate the cache simultaneously. The database receives 500 queries for data it was previously being shielded from.

Mitigation: probabilistic early expiration (refresh cache before it expires, with probability increasing as expiration approaches) or a distributed lock on cache population — the first miss acquires the lock and populates, others wait for the new value.

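# Cache stampede prevention with a simple lock.
# Assumes the redis-py client; cached values must be str or bytes (serialize first).
import time

import redis

cache = redis.Redis()  # assumption: Redis reachable on localhost:6379


def get_with_lock(key, ttl, compute_fn, retries=20):
    value = cache.get(key)
    if value is not None:
        return value

    lock_key = f"lock:{key}"
    # nx=True sets the lock only if absent; ex=5 expires it if the holder crashes
    acquired = cache.set(lock_key, "1", nx=True, ex=5)

    if acquired:
        try:
            value = compute_fn()
            cache.set(key, value, ex=ttl)
            return value
        finally:
            # Release the lock even if compute_fn raises
            cache.delete(lock_key)

    # Another process is computing: poll briefly for the new value
    for _ in range(retries):
        time.sleep(0.1)
        value = cache.get(key)
        if value is not None:
            return value
    # Still nothing after waiting: compute directly rather than return None
    return compute_fn()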
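The other mitigation mentioned above, probabilistic early expiration, can be sketched as follows. This follows one common formulation (often called XFetch), in which each request draws a random number and refreshes early if a randomized "head start", scaled by how long the value takes to compute, reaches past the expiry time. The dict `_store`, the function names, and the `beta` tuning knob are assumptions for illustration, not part of any particular library.

```python
import math
import random
import time

# Stand-in for a real cache: key -> (value, compute_seconds, expires_at)
_store = {}


def should_refresh(now, expires_at, compute_seconds, beta=1.0, rand=random.random):
    """Return True if this request should recompute the value early."""
    # random.random() is in [0, 1), so u is in (0, 1] and log(u) is defined.
    u = 1.0 - rand()
    # -log(u) is a non-negative random number, so the chance of refreshing
    # grows as `now` approaches `expires_at` and is certain once it has passed.
    return now - compute_seconds * beta * math.log(u) >= expires_at


def get_early(key, ttl, compute_fn, beta=1.0, clock=time.monotonic):
    now = clock()
    entry = _store.get(key)
    if entry is not None:
        value, compute_seconds, expires_at = entry
        if not should_refresh(now, expires_at, compute_seconds, beta):
            return value
    # Miss, expired, or selected for early refresh: recompute and time it.
    start = clock()
    value = compute_fn()
    compute_seconds = clock() - start
    _store[key] = (value, compute_seconds, clock() + ttl)
    return value
```

Because each request decides independently, a popular key tends to be refreshed by a single early request shortly before expiry rather than by every request at once. A larger `beta` refreshes earlier; expensive values also refresh earlier because `compute_seconds` scales the head start.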

Cache as a crutch for a slow query. If the query behind the cache is slow, every cache miss is expensive. When the hit rate drops (new users, cache restarts, invalidation events), the database is exposed. A 100ms query looks harmless at a 99% hit rate because only 1 request in 100 pays for it. During a cache restart the hit rate is 0%, every request pays, and the database sees 100 times its usual query load at 100ms each. Fix slow queries; do not hide them.
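The arithmetic is worth running with your own numbers. The sketch below uses Little's law (average queries in flight = arrival rate times time per query) to estimate how many database connections are busy at a given hit rate. The 1,000 requests per second figure is an assumption for illustration; the 100ms query and 99% hit rate come from the example above.

```python
def origin_qps(request_qps, hit_rate):
    """Queries per second that miss the cache and reach the database."""
    return request_qps * (1.0 - hit_rate)


def busy_connections(request_qps, hit_rate, query_seconds):
    """Little's law: average in-flight queries = arrival rate x time per query."""
    return origin_qps(request_qps, hit_rate) * query_seconds


# Warm cache: about 10 queries/s reach the database, about 1 connection busy.
warm = busy_connections(1000, 0.99, 0.100)

# Cold cache after a restart: all 1000 queries/s reach it, about 100 busy.
cold = busy_connections(1000, 0.0, 0.100)
```

If the connection pool holds fewer than the cold figure, requests queue behind it and latency climbs well past 100ms. Making the query faster lowers the cold figure directly; a larger cache does not.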

Over-caching writes. Write-through caching (update the cache on every write) seems safe but adds latency to every write. Write-behind caching (update the database asynchronously from the cache) risks data loss if the cache fails before the write is persisted. Neither is a good default for write-heavy data.
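The difference between the two strategies is easiest to see side by side. This is a minimal in-memory sketch: the class names, the `pending` queue, and the `crash` method are hypothetical, and plain dicts stand in for the database and the cache.

```python
from collections import deque


class WriteThrough:
    """Every write goes to the database and the cache before returning."""

    def __init__(self, db):
        self.db = db
        self.cache = {}

    def write(self, key, value):
        self.db[key] = value      # synchronous origin write: the added latency
        self.cache[key] = value


class WriteBehind:
    """Writes land in the cache first and reach the database on a later flush."""

    def __init__(self, db):
        self.db = db
        self.cache = {}
        self.pending = deque()    # writes not yet persisted

    def write(self, key, value):
        self.cache[key] = value
        self.pending.append((key, value))

    def flush(self):
        while self.pending:
            key, value = self.pending.popleft()
            self.db[key] = value

    def crash(self):
        """Simulate losing the cache node before a flush."""
        self.cache.clear()
        self.pending.clear()


db = {}
wb = WriteBehind(db)
wb.write("balance:1", 50)
persisted_before_crash = "balance:1" in db   # False: not flushed yet
wb.crash()
wb.flush()
persisted_after_crash = "balance:1" in db    # False: the write is gone
```

In the write-through class the caller waits for both stores on every write. In the write-behind class the caller returns quickly, but anything still in `pending` when the cache node dies never reaches the database.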

When Not to Cache

Do not cache user-specific data at a shared cache layer without scoping it carefully by user ID — caching the wrong user's data for another user is a serious security incident. Do not cache data with regulatory requirements for freshness (financial balances, health records). Do not cache computed results from operations with side effects.
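One way to make user scoping hard to forget is to build every user-specific key through a single helper that refuses to run without a user ID. The helper name and the key layout below are assumptions for illustration, not a standard.

```python
def user_cache_key(user_id, resource, *parts):
    """Build a cache key that always carries the owning user's ID."""
    if user_id is None or str(user_id) == "":
        raise ValueError("user_id is required for user-specific cache entries")
    return ":".join(["user", str(user_id), resource, *map(str, parts)])


key_a = user_cache_key(42, "orders", 7)   # "user:42:orders:7"
key_b = user_cache_key(43, "orders", 7)   # "user:43:orders:7"
```

Two users asking for the same resource get different keys, so one can never be served the other's entry. A missing user ID fails loudly instead of silently producing a shared key.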

Cache reads. Be careful about caching writes. Be explicit about staleness tolerance before you choose a TTL. These decisions belong in the design, not as an afterthought when performance problems surface.
