HTTP Response Caching in Spring Boot — Cache-Control Headers, ETags, and CDN Integration
by Eric Hanson, Backend Developer at Clean Systems Consulting
The difference between application caching and HTTP caching
@Cacheable is application-layer caching — it prevents repeated database queries within the application. The response is still assembled, serialized, and sent over the network on every request.
HTTP caching is network-layer caching — it prevents repeated requests from reaching the application at all. A browser, CDN, or reverse proxy holds the cached response and serves it without contacting the origin server. Done correctly, HTTP caching can serve thousands of requests per second with zero application load.
The two layers are complementary. @Cacheable for a product query reduces database calls. Cache-Control: max-age=300 on the product endpoint prevents the same browser from requesting the product again within five minutes.
Cache-Control — the primary directive
Cache-Control is the HTTP/1.1 header that controls caching behavior. The directives that matter for API responses:
max-age=N: the response is fresh for N seconds. Caches (browsers, CDNs) can serve the stored response without contacting the origin for this duration.
no-cache: the response can be cached, but the cache must revalidate with the origin before serving it. The origin can respond with 304 Not Modified (the cached response is still valid) or with a new response. This is not "don't cache"; it's "cache but always check freshness."
no-store: the response must not be stored by any cache. Use it for sensitive data (authentication responses, financial data, sensitive user-specific data).
private: the response can be cached by the client (browser) but not by shared caches (CDNs, reverse proxies). Use it for user-specific data that should be cached browser-side but not in a CDN that serves all users.
public: the response can be cached by any cache, including CDNs.
s-maxage=N: like max-age, but applies only to shared caches (CDNs), where it overrides max-age. Allows different TTLs for browser cache and CDN cache.
stale-while-revalidate=N: once the response expires, a cache may keep serving the stale copy for up to N more seconds while it fetches a fresh one in the background. This removes the latency spike when a cached response expires.
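To make the interaction between these directives concrete, here is a minimal sketch of how a cache might read a Cache-Control value and decide whether it may store the response and for how long. The class name CacheControlSketch and its methods are illustrative assumptions, not part of Spring or any CDN. The parser is deliberately naive: it ignores quoted directive arguments, heuristic freshness, and malformed numbers.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Illustrative reader for a Cache-Control response header; not a complete RFC 9111 implementation. */
final class CacheControlSketch {
    private final Map<String, String> directives = new LinkedHashMap<>();

    CacheControlSketch(String headerValue) {
        // Naive split: assumes no directive carries a quoted, comma-containing argument.
        for (String part : headerValue.split(",")) {
            String token = part.trim();
            if (token.isEmpty()) {
                continue;
            }
            int eq = token.indexOf('=');
            String name = (eq < 0 ? token : token.substring(0, eq)).trim().toLowerCase(Locale.ROOT);
            String value = eq < 0 ? "" : token.substring(eq + 1).trim();
            directives.put(name, value);
        }
    }

    /** May a cache of the given kind store this response at all? */
    boolean isStorable(boolean sharedCache) {
        if (directives.containsKey("no-store")) {
            return false;
        }
        return !(sharedCache && directives.containsKey("private"));
    }

    /** Seconds the response may be served without revalidation; 0 means "revalidate first". */
    long freshnessSeconds(boolean sharedCache) {
        if (!isStorable(sharedCache) || directives.containsKey("no-cache")) {
            return 0;
        }
        // s-maxage takes precedence over max-age, but only for shared caches.
        if (sharedCache && directives.containsKey("s-maxage")) {
            return Long.parseLong(directives.get("s-maxage"));
        }
        String maxAge = directives.get("max-age");
        return maxAge == null ? 0 : Long.parseLong(maxAge);
    }
}
```

For "max-age=60, s-maxage=3600, public", a browser cache (sharedCache = false) gets 60 seconds of freshness and a CDN (sharedCache = true) gets 3600.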
Setting these in Spring Boot:
@GetMapping("/products/{id}")
public ResponseEntity<ProductResponse> getProduct(@PathVariable Long id) {
    Product product = productService.findProduct(id);
    return ResponseEntity.ok()
            .cacheControl(CacheControl.maxAge(Duration.ofMinutes(10))
                    .cachePublic()
                    .staleWhileRevalidate(Duration.ofMinutes(2)))
            .body(ProductResponse.from(product));
}
This produces: Cache-Control: max-age=600, public, stale-while-revalidate=120
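How a cache treats that response over time can be sketched as a three-state classification by age. This is an illustration of the max-age and stale-while-revalidate windows only; the class and enum names are assumptions, and real caches also account for validators, request directives, and clock skew.

```java
/** Classifies a cached response by its age, using the max-age and stale-while-revalidate windows. */
final class StaleWhileRevalidateSketch {
    enum State { FRESH, STALE_BUT_SERVABLE, MUST_FETCH }

    static State classify(long ageSeconds, long maxAgeSeconds, long staleWhileRevalidateSeconds) {
        if (ageSeconds < maxAgeSeconds) {
            // Within max-age: serve from cache, no origin contact.
            return State.FRESH;
        }
        if (ageSeconds < maxAgeSeconds + staleWhileRevalidateSeconds) {
            // Expired, but inside the grace window: serve the stale copy and refresh in the background.
            return State.STALE_BUT_SERVABLE;
        }
        // Past both windows: the request has to wait for the origin.
        return State.MUST_FETCH;
    }
}
```

With max-age=600 and stale-while-revalidate=120, an age of 599 seconds is FRESH, 600 through 719 is STALE_BUT_SERVABLE, and 720 or more is MUST_FETCH.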
For user-specific data:
@GetMapping("/account/profile")
public ResponseEntity<ProfileResponse> getProfile(
        @AuthenticationPrincipal UserDetails user) {
    return ResponseEntity.ok()
            .cacheControl(CacheControl.maxAge(Duration.ofMinutes(5)).cachePrivate())
            .body(profileService.findProfile(user.getUsername()));
}
Cache-Control: max-age=300, private — browser caches for 5 minutes, CDN does not cache.
For sensitive data that must never be cached:
@GetMapping("/account/payment-methods")
public ResponseEntity<List<PaymentMethod>> getPaymentMethods(...) {
    return ResponseEntity.ok()
            .cacheControl(CacheControl.noStore())
            .body(paymentService.findPaymentMethods(userId));
}
ETags — conditional requests and validation
An ETag (entity tag) is a fingerprint of the response content. When a client has a cached response with an ETag, it sends If-None-Match: <etag> on subsequent requests. The server compares the ETag — if the content hasn't changed, it returns 304 Not Modified with no body. The client uses its cached copy.
A 304 Not Modified response carries no body, which is a significant saving for large responses. The application still has to determine the current ETag, and how much work that takes depends on how the ETag is computed, but the response body is not transmitted.
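The server-side comparison is slightly more involved than a string equality check, because If-None-Match may carry several ETags, a weak prefix (W/), or the wildcard *. A minimal sketch of that comparison, with the class name IfNoneMatchSketch as an illustrative assumption (Spring performs the equivalent check internally):

```java
/** Weak comparison of an If-None-Match header value against the current ETag. */
final class IfNoneMatchSketch {

    /** Strips the weak-validator prefix so that W/"x" and "x" compare equal. */
    private static String opaque(String etag) {
        String trimmed = etag.trim();
        return trimmed.startsWith("W/") ? trimmed.substring(2) : trimmed;
    }

    /** Returns true when the server may answer 304 Not Modified. */
    static boolean matches(String ifNoneMatch, String currentEtag) {
        if (ifNoneMatch == null || currentEtag == null) {
            return false;
        }
        if (ifNoneMatch.trim().equals("*")) {
            return true; // wildcard: any current representation matches
        }
        // Naive split: assumes the quoted tags themselves contain no commas.
        for (String candidate : ifNoneMatch.split(",")) {
            if (opaque(candidate).equals(opaque(currentEtag))) {
                return true;
            }
        }
        return false;
    }
}
```

Note that both sides are compared with their surrounding double quotes intact, since the quotes are part of the ETag syntax on the wire.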
Spring's ShallowEtagHeaderFilter generates ETags automatically by hashing the response body:
@Bean
public FilterRegistrationBean<ShallowEtagHeaderFilter> shallowEtagHeaderFilter() {
    FilterRegistrationBean<ShallowEtagHeaderFilter> filterRegistrationBean =
            new FilterRegistrationBean<>(new ShallowEtagHeaderFilter());
    filterRegistrationBean.addUrlPatterns("/api/*");
    filterRegistrationBean.setName("etagFilter");
    return filterRegistrationBean;
}
ShallowEtagHeaderFilter intercepts the response, computes an MD5 hash of the body, adds it as an ETag header, and returns 304 if the client's If-None-Match matches the computed hash.
The limitation: ShallowEtagHeaderFilter still executes the full handler and serializes the response to compute the hash — it only avoids sending the body. The database query, application logic, and serialization all run. The saving is network bandwidth, not application load.
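The reason the full body must exist before the ETag can be known is visible in a sketch of the hashing step. The code below is not Spring's implementation; it is an assumed, minimal equivalent that hashes the serialized bytes with MD5 and wraps the hex digest in quotes.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Derives a quoted ETag from the bytes of an already-serialized response body. */
final class ShallowEtagSketch {

    static String etagFor(byte[] body) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(body);
            StringBuilder etag = new StringBuilder("\"");
            for (byte b : digest) {
                etag.append(String.format("%02x", b)); // two lowercase hex digits per byte
            }
            return etag.append('"').toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform is required to provide.
            throw new IllegalStateException("MD5 not available", e);
        }
    }
}
```

Because the input is the serialized body, every request pays for the database query and serialization before the hash can be compared. MD5 is used here as a change detector, not for security.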
Deep ETags compute the ETag from the resource state (version number, last-modified timestamp) without executing the full request:
@GetMapping("/products/{id}")
public ResponseEntity<ProductResponse> getProduct(
        @PathVariable Long id,
        WebRequest webRequest) {
    // Check ETag from lightweight metadata query, no full fetch
    String currentETag = productService.getETag(id);
    // A plain equals() against the raw If-None-Match header misses quoted,
    // weak (W/"...") and comma-separated values; checkNotModified handles
    // those and prepares the 304 response on a match
    if (webRequest.checkNotModified(currentETag)) {
        return null; // Spring completes the 304 response
    }
    Product product = productService.findProduct(id);
    return ResponseEntity.ok()
            .eTag(currentETag)
            .cacheControl(CacheControl.maxAge(Duration.ofMinutes(10)))
            .body(ProductResponse.from(product));
}
productService.getETag(id) runs a lightweight query for just the version or hash — SELECT version FROM products WHERE id = ?. Only on a mismatch does the full product load execute. This reduces database load for frequently-checked resources.
The ETag value should be deterministic for the same content and change when content changes. A resource's version number, a hash of key fields, or a timestamp of last modification all work.
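As a rough illustration of those options, the sketch below derives an ETag from a version counter or from a modification timestamp. The class name, method names, and the id-version format are assumptions made for this example; any format works as long as it changes whenever the content does.

```java
import java.time.Instant;

/** Builds ETags from resource metadata instead of from the response body. */
final class DeepEtagSketch {

    /** ETag from a version counter, e.g. an optimistic-locking column (hypothetical). */
    static String fromVersion(long id, long version) {
        return "\"" + id + "-" + version + "\"";
    }

    /** ETag from a last-modified timestamp; only as precise as the stored timestamp. */
    static String fromLastModified(long id, Instant updatedAt) {
        return "\"" + id + "-" + Long.toHexString(updatedAt.toEpochMilli()) + "\"";
    }
}
```

A version counter is the safer choice when two updates can land within the same timestamp tick, since it changes on every write.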
Last-Modified and conditional GET
Last-Modified / If-Modified-Since is the timestamp-based equivalent of ETag / If-None-Match. Less precise (second granularity) but simpler to implement:
@GetMapping("/products/{id}")
public ResponseEntity<ProductResponse> getProduct(
        @PathVariable Long id,
        WebRequest webRequest) {
    Product product = productService.findProduct(id);
    long lastModifiedMs = product.getUpdatedAt().toEpochMilli();
    // Returns 304 if client's If-Modified-Since >= lastModifiedMs
    if (webRequest.checkNotModified(lastModifiedMs)) {
        return null; // Spring handles 304 response
    }
    return ResponseEntity.ok()
            .lastModified(product.getUpdatedAt())
            .cacheControl(CacheControl.maxAge(Duration.ofMinutes(10)))
            .body(ProductResponse.from(product));
}
webRequest.checkNotModified() handles the comparison and sets the response status to 304 if the content hasn't changed. Returning null from the handler when checkNotModified returns true is the Spring MVC convention — the framework completes the 304 response.
Use ETag when content changes can't be captured by timestamp alone (sub-second changes, same-second multiple updates). Use Last-Modified for simpler cases where modification time is sufficient.
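The second-granularity limitation follows from the HTTP date format itself, which has no sub-second field. A minimal sketch of the comparison, assuming the client sends a well-formed RFC 1123 date (the class name is illustrative, and a real server would ignore a malformed header rather than throw):

```java
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;

/** Timestamp-based validation at the precision HTTP dates allow. */
final class IfModifiedSinceSketch {

    /** True when the resource has not changed since the date the client sent. */
    static boolean notModified(String ifModifiedSince, Instant lastModified) {
        Instant clientCopy = ZonedDateTime
                .parse(ifModifiedSince, DateTimeFormatter.RFC_1123_DATE_TIME)
                .toInstant();
        // HTTP dates carry whole seconds only, so drop the sub-second part before comparing.
        Instant resource = lastModified.truncatedTo(ChronoUnit.SECONDS);
        return !resource.isAfter(clientCopy);
    }
}
```

An update at 07:28:00.900 compared against "Wed, 21 Oct 2015 07:28:00 GMT" is reported as not modified, which is exactly the same-second case where an ETag is the better validator.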
CDN integration
CDNs (CloudFront, Fastly, Cloudflare) sit in front of your application and cache responses per URL. They honor Cache-Control headers — a response with Cache-Control: max-age=600, public is cached at CDN edge nodes for 10 minutes, serving all subsequent requests within that window without hitting your origin.
The CDN-specific mechanisms:
s-maxage sets the CDN TTL independently of the browser TTL:
CacheControl.maxAge(Duration.ofMinutes(1))   // browser: 1 minute
        .sMaxAge(Duration.ofHours(1))        // CDN: 1 hour
        .cachePublic()
The browser revalidates frequently (protecting users from stale data). The CDN holds the response for an hour (protecting the origin from load). Combined with stale-while-revalidate, CDN nodes can serve slightly stale content while refreshing in the background.
Surrogate keys (Fastly) and cache tags (Cloudflare) let you tag responses with logical identifiers and purge by tag when data changes. The header differs per CDN; the example below targets Fastly, which reads space-separated keys from the Surrogate-Key header:
@GetMapping("/products/{id}")
public ResponseEntity<ProductResponse> getProduct(@PathVariable Long id) {
    Product product = productService.findProduct(id);
    return ResponseEntity.ok()
            // Fastly: space-separated keys in Surrogate-Key
            .header("Surrogate-Key", "product-" + id + " category-" + product.getCategoryId())
            .cacheControl(CacheControl.maxAge(Duration.ofHours(1)).cachePublic())
            .body(ProductResponse.from(product));
}

// When a product changes:
@EventListener
public void onProductUpdated(ProductUpdatedEvent event) {
    fastlyClient.purgeByTag("product-" + event.getProductId());
}
Purging by tag invalidates every cached response that carries that tag, without knowing the exact URLs. Product pages, category listings that contain the product, and search results can all be invalidated with one purge, provided each of those responses was tagged with the product's identifier when it was served.
CloudFront has no tag-based purge; the closest equivalent is invalidation by path pattern (/products/123*). Cloudflare supports purge by tag through the Cache-Tag header, which takes a comma-separated list of tags.
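Conceptually, tag-based purging needs a reverse index from tag to cached URLs. The sketch below models that with in-memory maps. It is an assumed illustration of the idea, not how any particular CDN stores its index, and the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** In-memory cache with a tag-to-URL index, illustrating purge-by-tag. */
final class TaggedCacheSketch {
    private final Map<String, String> bodiesByUrl = new HashMap<>();
    private final Map<String, Set<String>> urlsByTag = new HashMap<>();

    /** Stores a response body under its URL and records the URL under each tag. */
    void put(String url, String body, String... tags) {
        bodiesByUrl.put(url, body);
        for (String tag : tags) {
            urlsByTag.computeIfAbsent(tag, t -> new HashSet<>()).add(url);
        }
    }

    /** Returns the cached body, or null if the URL is not cached. */
    String get(String url) {
        return bodiesByUrl.get(url);
    }

    /** Evicts every entry stored under the tag and returns how many were removed. */
    int purgeByTag(String tag) {
        Set<String> urls = urlsByTag.remove(tag);
        if (urls == null) {
            return 0;
        }
        int removed = 0;
        for (String url : urls) {
            if (bodiesByUrl.remove(url) != null) {
                removed++;
            }
        }
        return removed;
    }
}
```

A category listing is only evicted by a product purge if it was stored with that product's tag, which is why every response that embeds a product needs to carry the product's tag.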
Vary header — correct CDN caching for content negotiation
Vary tells CDNs that the response varies by certain request headers — the CDN should cache separate responses for different values of those headers:
@GetMapping(value = "/products/{id}", produces = {
        MediaType.APPLICATION_JSON_VALUE,
        "application/vnd.api+json"
})
public ResponseEntity<ProductResponse> getProduct(@PathVariable Long id) {
    return ResponseEntity.ok()
            .varyBy("Accept") // different Accept headers get different cache entries
            .cacheControl(CacheControl.maxAge(Duration.ofMinutes(10)).cachePublic())
            .body(ProductResponse.from(productService.findProduct(id)));
}
Without Vary: Accept, a CDN might serve a JSON response to a client requesting application/vnd.api+json. With it, separate cache entries exist per Accept value.
Vary: Authorization is a trap — effectively disables CDN caching because every user has a different Authorization header, producing unique cache entries that are never reused. For user-specific endpoints, use Cache-Control: private instead of Vary: Authorization.
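One way to picture what Vary does inside a cache is as an extension of the cache key. The sketch below is an assumption-laden simplification: real CDNs normalize header values in their own ways, and the class name and key format are invented for illustration.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

/** Builds a cache key from the URL plus the request's values for each header named in Vary. */
final class VaryKeySketch {

    static String cacheKey(String url, String varyHeader, Map<String, String> requestHeaders) {
        // HTTP header names are case-insensitive, so normalize them first.
        Map<String, String> normalized = new TreeMap<>();
        for (Map.Entry<String, String> entry : requestHeaders.entrySet()) {
            normalized.put(entry.getKey().toLowerCase(Locale.ROOT), entry.getValue());
        }
        StringBuilder key = new StringBuilder(url);
        if (varyHeader == null) {
            return key.toString(); // no Vary: one entry per URL
        }
        // Sort the names so the order in which Vary lists them does not change the key.
        TreeSet<String> names = new TreeSet<>();
        for (String name : varyHeader.split(",")) {
            String trimmed = name.trim().toLowerCase(Locale.ROOT);
            if (!trimmed.isEmpty()) {
                names.add(trimmed);
            }
        }
        for (String name : names) {
            key.append('|').append(name).append('=').append(normalized.getOrDefault(name, ""));
        }
        return key.toString();
    }
}
```

With Vary: Accept, requests for application/json and application/vnd.api+json produce different keys. With Vary: Authorization, every distinct token produces its own key, which is why hit rates collapse.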
What to cache and what not to
Cache at the HTTP layer:
- Static product catalogs, configuration endpoints, public content
- Read-heavy endpoints with infrequent updates
- Responses that are identical for all users (or identifiable user segments)
Don't cache at the HTTP layer:
- User-specific data (use Cache-Control: private if at all, not CDN caching)
- Endpoints that reflect real-time state (inventory counts, live prices)
- Financial and transactional responses (no-store)
- Endpoints whose freshness requirements are tighter than the minimum CDN TTL you'd set
The question for each endpoint: what is the acceptable staleness window? If the answer is zero ("users must always see the latest data"), time-based HTTP caching is wrong for this endpoint; at most, use no-cache with an ETag so that every request revalidates and only the body transfer is saved. If the answer is "5 minutes is fine," max-age=300 is appropriate.
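That decision can be written down as a small rule of thumb. The sketch below is one possible encoding of it, not a definitive policy. The method name and parameters are assumptions, and mapping a zero staleness window to no-cache (cache, but revalidate on every request) is this sketch's choice.

```java
/** Maps an endpoint's sensitivity, audience, and staleness tolerance to a Cache-Control value. */
final class CachePolicySketch {

    static String cacheControlFor(boolean sensitive, boolean userSpecific, long acceptableStalenessSeconds) {
        if (sensitive) {
            return "no-store"; // financial and transactional responses are never stored
        }
        String scope = userSpecific ? "private" : "public";
        if (acceptableStalenessSeconds <= 0) {
            // No staleness tolerated: allow storage but force revalidation on every use.
            return "no-cache, " + scope;
        }
        return "max-age=" + acceptableStalenessSeconds + ", " + scope;
    }
}
```

A public product endpoint that tolerates five minutes of staleness yields "max-age=300, public"; a user profile with the same tolerance yields "max-age=300, private".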