Spring Boot Request Processing Overhead — Filter Chains, Serialization, and What's Worth Measuring

by Eric Hanson, Backend Developer at Clean Systems Consulting

The layers a request passes through

Before your controller method runs and after it returns, every Spring Boot request passes through:

  • Tomcat connector — thread selection, socket I/O, HTTP parsing
  • Servlet filter chain — Spring Security, CORS, request logging, tracing, compression, custom filters
  • DispatcherServlet — handler mapping, content negotiation
  • HandlerInterceptors — pre/post processing hooks
  • Argument resolution — @RequestBody deserialization, @PathVariable extraction, @AuthenticationPrincipal resolution
  • Controller execution — your code
  • Return value handling — response serialization, @ResponseBody processing
  • Filter chain post-processing — response headers, tracing completion

For a service where requests take 50ms on average, 48ms of that is probably the database and 2ms is everything above. Optimizing the filter chain saves fractions of a millisecond. For a service where requests are inherently fast (sub-millisecond computations, mostly cached data), the 2ms overhead is the dominant cost.

Profile before optimizing. async-profiler with -e cpu on a service under load shows which of these layers consumes meaningful CPU. Without measurement, you're guessing.

Filter chain overhead

Each servlet filter in the chain processes every request — pre-processing before the controller and post-processing after the response. The number of filters and their individual cost determines the chain overhead.

Inspect the security filter chain:

@Component
public class FilterChainLogger implements ApplicationContextAware {

    @Override
    public void setApplicationContext(ApplicationContext ctx) {
        FilterChainProxy securityFilterChain = ctx.getBean(FilterChainProxy.class);
        securityFilterChain.getFilterChains().forEach(chain -> {
            log.info("Security filter chain: {}", chain.getFilters().stream()
                .map(f -> f.getClass().getSimpleName())
                .collect(Collectors.joining(" -> ")));
        });
    }
}

Or inspect via Actuator:

curl http://localhost:8080/actuator/mappings | jq '.contexts.application.mappings.servletFilters'

A default Spring Boot application with Spring Security has 15–20 security filters, each executing on every request. Most are lightweight. A few warrant attention:

SessionManagementFilter — for stateless REST APIs, Spring Security's session management adds overhead maintaining session state that's never used. Disable it explicitly for stateless APIs:

@Configuration
public class SecurityConfig {

    @Bean
    public SecurityFilterChain filterChain(HttpSecurity http) throws Exception {
        return http
            .sessionManagement(session ->
                session.sessionCreationPolicy(SessionCreationPolicy.STATELESS))
            .csrf(csrf -> csrf.disable())  // stateless APIs don't need CSRF
            .build();
    }
}

STATELESS policy prevents session creation and disables session-related filters. For a REST API that uses JWT or API keys, this is the correct configuration regardless of performance.

CsrfFilter — computes and validates CSRF tokens. Disabled for stateless APIs (above). For server-rendered apps that need CSRF, the overhead is inherent.

Custom filters that do I/O. A request logging filter that writes to a database or a filter that validates API keys against Redis adds I/O latency to every request before the controller runs. Evaluate whether the validation belongs in the filter chain or in a Spring Security AuthenticationProvider where it can be cached.
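The caching idea is independent of Spring. A minimal sketch in plain Java, where the hypothetical remoteCheck predicate stands in for the Redis or database lookup and successful results are held in memory for a TTL:

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Predicate;

// Sketch: cache API-key validation results so the remote lookup (Redis,
// database) runs once per key per TTL window instead of once per request.
// A production version would bound the map and evict expired entries
// (e.g. with Caffeine); this illustrates only the hot-path shape.
public class CachingApiKeyValidator {

    private record CachedResult(boolean valid, Instant expiresAt) {}

    private final Map<String, CachedResult> cache = new ConcurrentHashMap<>();
    private final Duration ttl;
    private final Predicate<String> remoteCheck;  // the expensive I/O call

    public CachingApiKeyValidator(Duration ttl, Predicate<String> remoteCheck) {
        this.ttl = ttl;
        this.remoteCheck = remoteCheck;
    }

    public boolean isValid(String apiKey) {
        CachedResult hit = cache.get(apiKey);
        if (hit != null && hit.expiresAt().isAfter(Instant.now())) {
            return hit.valid();  // served from memory — no I/O on the hot path
        }
        boolean valid = remoteCheck.test(apiKey);  // remote lookup, once per TTL
        cache.put(apiKey, new CachedResult(valid, Instant.now().plus(ttl)));
        return valid;
    }
}
```

With a TTL of a few minutes, a key validated thousands of times per second costs one remote round trip per window instead of thousands.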

Measuring filter chain cost

Add timing to identify which filters are expensive:

@Component
@Order(Ordered.HIGHEST_PRECEDENCE)  // runs first
public class RequestTimingFilter extends OncePerRequestFilter {

    @Override
    protected void doFilterInternal(HttpServletRequest request,
            HttpServletResponse response, FilterChain chain)
            throws ServletException, IOException {

        long start = System.nanoTime();
        try {
            chain.doFilter(request, response);
        } finally {
            long elapsed = System.nanoTime() - start;
            // Entire filter chain + controller + response writing
            log.debug("Request {} {} completed in {}ms",
                request.getMethod(), request.getRequestURI(),
                TimeUnit.NANOSECONDS.toMillis(elapsed));
        }
    }
}

Place this filter at the highest precedence (runs before all others) — it measures total request processing time including all other filters. Compare this against the time recorded closer to the controller to isolate filter chain overhead.

Micrometer's http.server.requests metric is recorded by a servlet filter registered near the front of the chain (WebMvcMetricsFilter in Spring Boot 2, ServerHttpObservationFilter in Spring Boot 3), so its timings include the security filters and most other filter overhead. If the p99 latency in your APM or load balancer is much higher than the http.server.requests p99, the delta is mostly connector work, TLS, and network time rather than the filter chain.

Argument resolution — @RequestBody deserialization

@RequestBody deserializes the request body using Jackson. For large request bodies or complex object graphs, this is measurable overhead.

Jackson's deserialization is fast for simple types — a few microseconds for a small JSON object. It becomes significant for:

Large arrays. A request body with 10,000 items being deserialized into List<Order> allocates one Order object per item, plus the intermediate JSON tokens. For bulk API endpoints, streaming deserialization (reading the JSON stream without building the full list in memory) is more efficient:

@PostMapping("/orders/bulk")
public ResponseEntity<BulkResult> bulkCreateOrders(HttpServletRequest request)
        throws IOException {
    try (JsonParser parser = objectMapper.getFactory()
            .createParser(request.getInputStream())) {

        // Read and process each order as it's parsed — no full list in memory
        MappingIterator<CreateOrderRequest> orders =
            objectMapper.readValues(parser, CreateOrderRequest.class);

        BulkResult result = new BulkResult();
        while (orders.hasNext()) {
            CreateOrderRequest order = orders.next();
            result.add(orderService.createOrder(order));
        }
        return ResponseEntity.ok(result);
    }
}

Unknown field scanning. With DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES enabled, Jackson checks every JSON field against the target type's known properties and fails on the first mismatch. For JSON with many unknown fields, this adds overhead and hard failures. Spring Boot disables the feature by default — verify that no ObjectMapper customization re-enables it.
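One way to pin the default explicitly is via Spring Boot's spring.jackson.* property binding in application.yml:

```yaml
spring:
  jackson:
    deserialization:
      # Spring Boot's default — unknown JSON fields are ignored.
      # Stating it here guards against a library or customizer re-enabling the feature.
      fail-on-unknown-properties: false
```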

Response serialization overhead

@ResponseBody serializes the return value to JSON using Jackson. For endpoints returning large responses, serialization is significant.

Measure serialization cost in isolation:

@GetMapping("/orders")
public List<OrderSummary> listOrders() {
    long queryStart = System.nanoTime();
    List<OrderSummary> orders = orderService.findOrders();
    long queryEnd = System.nanoTime();

    // Serialization happens after this method returns in return value handling
    // Use @ResponseBody with a ResponseBodyAdvice to measure serialization separately
    log.debug("Query: {}ms, count: {}", 
        TimeUnit.NANOSECONDS.toMillis(queryEnd - queryStart), orders.size());
    return orders;
}

Streaming for large responses. StreamingResponseBody writes the response incrementally, releasing the Tomcat thread while writing. For very large responses, this prevents thread pool exhaustion during slow network writes:

@GetMapping(value = "/orders/export", produces = MediaType.APPLICATION_JSON_VALUE)
public StreamingResponseBody exportOrders() {
    return outputStream -> {
        try (JsonGenerator generator = objectMapper.getFactory()
                .createGenerator(outputStream)) {
            generator.writeStartArray();
            orderRepository.streamAll().forEach(order -> {
                try {
                    objectMapper.writeValue(generator, OrderExportRow.from(order));
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            generator.writeEndArray();
        }
    };
}

StreamingResponseBody runs on a different thread pool — the Tomcat request thread is released immediately, allowing new requests to be accepted while the response is being written.
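Which pool that is depends on configuration: unless an executor is registered for MVC async support, Spring falls back to a SimpleAsyncTaskExecutor, which creates a new thread per async request. A configuration sketch — pool sizes and timeout are illustrative, not recommendations:

```java
// Sketch: give StreamingResponseBody writes a bounded pool instead of the
// default SimpleAsyncTaskExecutor (one new thread per async request).
@Configuration
public class AsyncMvcConfig implements WebMvcConfigurer {

    @Override
    public void configureAsyncSupport(AsyncSupportConfigurer configurer) {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(8);        // illustrative — tune under load
        executor.setMaxPoolSize(32);
        executor.setThreadNamePrefix("mvc-async-");
        executor.initialize();
        configurer.setTaskExecutor(executor);
        configurer.setDefaultTimeout(30_000);  // ms before async requests time out
    }
}
```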

Response compression

HTTP response compression (gzip/Brotli) reduces network transfer at the cost of CPU. Spring Boot's embedded Tomcat compresses responses when configured:

server:
  compression:
    enabled: true
    mime-types: application/json, application/xml, text/html, text/plain
    min-response-size: 2048  # only compress responses larger than 2KB

Compression is worth enabling for JSON API responses over 2KB — JSON compresses well (60–80% size reduction typical). For small responses (< 1KB), compression overhead exceeds the network savings. The min-response-size threshold handles this automatically.

CPU cost of gzip: roughly 1–3ms of CPU per response for a 10KB JSON payload on modern hardware. For a service handling 10,000 requests per second, that is 10–30 seconds of CPU per second of wall time — 10 to 30 cores dedicated to compression. Measure CPU headroom at your traffic levels before enabling it.
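The ratio claim is easy to sanity-check with the JDK's own GZIPOutputStream. A rough standalone sketch — absolute timings vary by hardware and payload shape:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

// Sketch: gzip a JSON-like payload and report size reduction and CPU cost.
public class GzipCost {

    static byte[] gzip(byte[] input) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
                gz.write(input);
            }
            return bos.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Repetitive JSON compresses well — build a ~10KB payload of order rows.
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < 200; i++) {
            sb.append("{\"orderId\":").append(i)
              .append(",\"status\":\"SHIPPED\",\"total\":199.99},");
        }
        sb.setCharAt(sb.length() - 1, ']');
        byte[] payload = sb.toString().getBytes(StandardCharsets.UTF_8);

        long start = System.nanoTime();
        byte[] compressed = gzip(payload);
        long micros = (System.nanoTime() - start) / 1_000;

        System.out.printf("original=%dB compressed=%dB (%.0f%% smaller) in %dus%n",
            payload.length, compressed.length,
            100.0 * (payload.length - compressed.length) / payload.length,
            micros);
    }
}
```

Typical repetitive API payloads shrink well past the 60–80% range quoted above; highly random data (already-compressed images, encrypted blobs) does not, which is why the mime-types whitelist matters.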

HTTP/2 and connection multiplexing

HTTP/2 is worth enabling for APIs consumed by browsers or clients that make multiple concurrent requests to the same host. Multiplexing multiple requests over a single TCP connection reduces connection establishment overhead:

server:
  http2:
    enabled: true

For most backend-to-backend API calls, HTTP/2's benefit is minimal — the client typically makes sequential requests or maintains a persistent connection pool anyway. For public APIs consumed by many different clients, HTTP/2 reduces connection overhead significantly.

What's actually worth optimizing

The filter chain, argument resolution, and serialization together typically add 1–5ms to requests where the database takes 20–100ms. Optimizing them saves a small percentage of total latency.

The cases where these layers dominate:

  • Endpoints that return cached data (database time is near zero — framework overhead is all that's left)
  • Validation-heavy endpoints where the request body is large and complex
  • High-frequency lightweight operations (heartbeat endpoints, metrics endpoints, health checks)

For these cases, the optimizations above — stateless session policy, streaming for large bodies, response compression, HTTP/2 — are worth applying. For endpoints that are database-bound, the same effort applied to query optimization returns more.

The measurement that tells you where to invest: http.server.requests is recorded near the front of the filter chain, so comparing it against time measured at the controller (an interceptor, or timestamps inside the handler) isolates filter chain plus serialization overhead, while comparing it against total request duration at your load balancer isolates connector and network time. Small deltas on both comparisons confirm that the framework layers are not the bottleneck.
