Caching Is Not a Performance Fix. It Is a Performance Tool.

by Eric Hanson, Backend Developer at Clean Systems Consulting

The Cache That Made Things Worse

An e-commerce platform was experiencing slow product pages. Someone added Redis caching for product data with a 24-hour TTL. Response times improved dramatically. Six months later, a support ticket: customers were seeing outdated prices and stock levels — sometimes for hours after a price change. The cache was behaving exactly as configured. The business wasn't getting what it needed.

This is not a story about caching being bad. It's a story about caching being applied without fully thinking through the consistency implications. Caching always trades freshness for speed. Whether that tradeoff is acceptable depends entirely on what the data is and how it's used.

What Caching Actually Solves

Caching is the right tool for a specific set of problems:

Repeated reads of data that changes infrequently relative to how often it's read. Country codes, product categories, configuration values, user preferences — these are read thousands of times per second and change at most a few times per day. The cache hit rate is high. The staleness window is short relative to the change frequency.

Expensive computations that produce deterministic results for the same input. If generating a report takes 8 seconds, the underlying data changes hourly, and the report is requested roughly once per second, caching the result for an hour means paying the 8-second cost once instead of roughly 3,600 times.
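As a minimal sketch of this pattern, the result of a deterministic computation can be memoized in-process with a concurrent map. The `ReportCache` name, the string-to-string signature, and the use of `computeIfAbsent` are illustrative choices, not the platform's actual code; TTL handling is omitted here to keep the sketch focused on paying the cost once per input:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Memoize a deterministic, expensive computation keyed by its input.
public class ReportCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Function<String, String> compute;

    public ReportCache(Function<String, String> compute) {
        this.compute = compute;
    }

    public String get(String input) {
        // computeIfAbsent runs the expensive function once per distinct input;
        // every subsequent read for that input is a map lookup
        return cache.computeIfAbsent(input, compute);
    }
}
```

In a real deployment the entry would also carry an expiry timestamp so the hourly data change is picked up; that part appears in the TTL discussion below.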

Results of third-party API calls where the external data changes slowly and the API has rate limits or latency you can't control.

What caching does not solve: slow queries caused by missing indexes, N+1 query patterns, connection pool exhaustion, or anything where the data changes at the same frequency it's read.

The Consistency Problem Is Not Optional

Every cache introduces a window of inconsistency between the cache and the source of truth. The size of that window is your TTL, plus however long it takes for invalidation to propagate. During that window, readers may see stale data. Whether this is acceptable is not a technical question — it is a product question.

For product page pricing, 24-hour staleness is probably not acceptable. For a dashboard showing "total users registered," staleness of five minutes is likely fine.

The discipline is to have this conversation before implementing the cache, not after the first customer complaint.

Cache Invalidation: The Hard Part

There are two hard problems in computer science: naming things, cache invalidation, and off-by-one errors. Cache invalidation is listed second but is arguably hardest in practice.

The basic strategies:

TTL-based expiry: Simple, predictable, always produces some staleness. Appropriate when you can tolerate the staleness window and when explicit invalidation is difficult.
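TTL-based expiry can be sketched in a few lines for a single-process cache. This toy version (the `TtlCache` and `Entry` names and the wall-clock-millisecond approach are illustrative, not a production design) makes the staleness bound visible: a reader can see a value at most `ttlMillis` old before the loader runs again:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// A minimal TTL cache: each entry remembers when it expires.
public class TtlCache<K, V> {
    private record Entry<V>(V value, long expiresAtMillis) {}

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;

    public TtlCache(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    public V get(K key, Supplier<V> loader) {
        Entry<V> e = entries.get(key);
        long now = System.currentTimeMillis();
        if (e == null || e.expiresAtMillis() <= now) {
            // Miss or expired: reload from the source of truth and reset the clock.
            // The staleness window is bounded by ttlMillis.
            V value = loader.get();
            entries.put(key, new Entry<>(value, now + ttlMillis));
            return value;
        }
        return e.value();
    }
}
```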

Event-driven invalidation: Write operations publish events that delete or update cache entries. This approach can achieve near-zero staleness but requires tight coordination between write and cache paths. A missed invalidation event leaves stale data indefinitely.

// Invalidate on write: delete the cache entry whenever the source of truth changes
public void updateProduct(Product product) {
    productRepository.save(product);
    // Delete rather than update; the next read repopulates from the database,
    // which avoids two concurrent writers racing to leave the cache out of order
    cache.delete("product:" + product.getId());
}

Version-based keys: Include a version number or content hash in the cache key. Stale entries are never served — instead, they're orphaned and expire naturally. This is safe but requires managing key space growth.

product:42:v7  -> { price: 29.99, ... }

When product 42 is updated, the write creates key product:42:v8. Old key expires on TTL. Zero chance of serving stale data. Downside: the first read after an update always misses, and you need to clean up orphaned keys.
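The version-based scheme can be sketched as follows. Here a map stands in for the database's version column, and `VersionedKeys` and the payload strings are illustrative names; the point is that a write only bumps the version and populates a new key, never touching existing entries:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Version-based cache keys: the version lives with the source of truth,
// and the cache key embeds it, so stale entries are orphaned, not served.
public class VersionedKeys {
    private final Map<Long, Long> productVersions = new ConcurrentHashMap<>(); // productId -> version
    private final Map<String, String> cache = new ConcurrentHashMap<>();       // versioned key -> payload

    public String cacheKey(long productId) {
        long version = productVersions.getOrDefault(productId, 0L);
        return "product:" + productId + ":v" + version;
    }

    public void update(long productId, String payload) {
        // Bump the version and write the new key; old keys expire on TTL
        long next = productVersions.merge(productId, 1L, Long::sum);
        cache.put("product:" + productId + ":v" + next, payload);
    }

    public String read(long productId) {
        // null means a miss: repopulate from the database under the current key
        return cache.get(cacheKey(productId));
    }
}
```

Note that in a distributed setup the version lookup itself must come from (or be kept consistent with) the source of truth; if readers cache the version number, the scheme degrades back into a TTL-bounded staleness window.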

The Thundering Herd Problem

A cache that expires all entries at the same time (or where a highly-requested entry expires while traffic is high) causes a thundering herd: all concurrent requests for that data hit the database simultaneously. For a key that was being served from cache at 10,000 reads/second, the cache expiration becomes an instant load spike.

Mitigations: jitter (randomize TTLs within a range to spread expiration), probabilistic early recomputation (the approach described in the XFetch algorithm — refresh the cache probabilistically as the TTL approaches), or a distributed lock that allows only one thread to recompute while others wait for the updated value.
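Of these, jitter is the cheapest to add. A sketch of the idea (the 20% spread is an illustrative choice, not a recommendation): instead of a fixed TTL, draw each entry's TTL uniformly from a window, so entries cached at the same moment do not all expire together:

```java
import java.util.concurrent.ThreadLocalRandom;

// TTL jitter: spread expirations across [base, base * 1.2) so a burst of
// cache fills does not become a synchronized burst of cache misses later.
public class TtlJitter {
    public static long jitteredTtlMillis(long baseTtlMillis) {
        long spread = Math.max(1, baseTtlMillis / 5); // 20% of the base TTL
        return baseTtlMillis + ThreadLocalRandom.current().nextLong(spread);
    }
}
```

Jitter spreads out mass expiry but does not protect a single hot key; for that, the per-key lock or probabilistic early refresh described above is still needed.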

Local Cache vs. Distributed Cache

An in-process cache (Caffeine for JVM, functools.lru_cache for Python) is fast — reads are in nanoseconds, no network. It is also per-instance: in a horizontally scaled service, each instance has a different cache state. A write to instance A doesn't invalidate instance B's cache. This is acceptable for data that is effectively immutable or when per-instance staleness is tolerable. It is not acceptable for user-specific data that changes across requests.

Redis or Memcached provides a shared cache. Reads take ~1ms over the network. All instances share the same view. The consistency story is better; the latency is worse than local memory by three to four orders of magnitude.

The Practical Takeaway

Before implementing any cache: write down the TTL you're considering and then ask whether staleness of that duration is acceptable for the specific data being cached. If you can't answer that without involving a product decision-maker, involve one. Then decide whether event-driven invalidation is worth the implementation cost. If it is, implement it before the cache goes to production — retrofitting invalidation logic into an existing cache is significantly harder than building it in from the start.
