Spring Boot Performance in Production — Where the Time Actually Goes

by Eric Hanson, Backend Developer at Clean Systems Consulting

The request lifecycle overhead

A request in a Spring Boot application passes through more layers than the business code suggests. From the time a request arrives to when the response leaves:

  1. Tomcat thread pool accepts the connection
  2. Servlet filter chain executes (Spring Security, logging, tracing, CORS, custom filters)
  3. DispatcherServlet routes to the controller
  4. AOP interceptors run (transaction proxy, caching proxy, security proxy, custom aspects)
  5. Controller method executes
  6. Service layer executes
  7. Repository layer executes (JPA, JDBC)
  8. Database query executes and returns
  9. Response is serialized (Jackson)
  10. Filter chain post-processing
  11. Response is written to the socket

Steps 1–4 and 9–11 are Spring Boot overhead. In a well-tuned application, they consume 1–5ms per request. In a poorly configured application — too many filters, eager AOP proxies for everything, slow Jackson serialization — they can consume 20–50ms regardless of what the business logic does.
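The framework layers can be pictured as nested handlers, each wrapping the one below it the way a servlet filter wraps the rest of the chain. A minimal plain-Java sketch (hypothetical names, no Spring) that times each layer:

```java
import java.util.function.Supplier;

public class LayerTimingSketch {
    // A "layer" wraps the handler below it, like a servlet filter wraps the chain
    public interface Layer { String handle(Supplier<String> next); }

    // Wrap a layer with timing, appending its elapsed time to a log
    public static Layer timed(String name, StringBuilder log) {
        return next -> {
            long start = System.nanoTime();
            String result = next.get();
            log.append(name).append(": ").append((System.nanoTime() - start) / 1_000).append(" us\n");
            return result;
        };
    }

    public static String run(StringBuilder log) {
        Layer filterChain = timed("filter-chain", log);
        Layer aopProxies  = timed("aop-proxies", log);
        // Outer layers measure their own work plus everything beneath them,
        // which is exactly what a flamegraph frame width shows
        return filterChain.handle(() -> aopProxies.handle(() -> "business result"));
    }

    public static void main(String[] args) {
        StringBuilder log = new StringBuilder();
        System.out.println(run(log));
        System.out.print(log);
    }
}
```

Each frame in a real flamegraph corresponds to one of these nested calls; the widest frame is the layer worth optimizing first.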

Profile before optimizing. async-profiler on a running Spring Boot application under load produces a flamegraph that shows exactly which of these layers is consuming time. The rule: find the widest frame, optimize it, measure the improvement, repeat.

Database — usually where most of the time goes

For most Spring Boot web services, 60–80% of request latency is database time. The Spring Boot overhead is real but secondary. The database work is where optimization pays off most.

N+1 queries are the most common and most impactful database performance problem in JPA applications. Spring Data JPA's lazy loading is the usual cause:

// Innocent-looking loop that generates N+1 queries
List<Order> orders = orderRepository.findByStatus(OrderStatus.PENDING);
orders.forEach(order -> {
    // Each access to order.getUser() fires a SELECT on the users table
    sendNotification(order.getUser().getEmail(), order);
});

order.getUser() returns a lazy-loaded proxy. The first access to each uninitialized proxy fires a SELECT ... FROM users WHERE id = ?. For 100 orders with distinct users, this is 101 queries: 1 for the orders, 100 for the users.
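The pattern and its fix can be sketched without JPA. A plain-Java simulation (hypothetical in-memory tables, a counter standing in for SQL statements) contrasts per-row lookups with a single batched IN query, which is the plain-SQL shape of what JOIN FETCH produces:

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

public class NPlusOneDemo {
    // Hypothetical in-memory "tables"; queryCount stands in for SQL statements issued
    static int queryCount = 0;
    static final Map<Long, String> USERS = Map.of(1L, "a@example.com", 2L, "b@example.com");

    public record Order(long id, long userId) {}

    static List<Order> selectOrders() {
        queryCount++; // SELECT * FROM orders WHERE status = 'PENDING'
        return List.of(new Order(10, 1), new Order(11, 2), new Order(12, 1));
    }

    static String selectUserEmail(long userId) {
        queryCount++; // SELECT email FROM users WHERE id = ?
        return USERS.get(userId);
    }

    static Map<Long, String> selectUsersByIds(Set<Long> ids) {
        queryCount++; // SELECT id, email FROM users WHERE id IN (...)
        return ids.stream().collect(Collectors.toMap(id -> id, USERS::get));
    }

    // Lazy-loading shape: one user query per order (the N+1 pattern)
    public static int lazyPattern() {
        queryCount = 0;
        for (Order o : selectOrders()) selectUserEmail(o.userId());
        return queryCount;
    }

    // JOIN FETCH / batching shape: load all needed users in one query
    public static int batchedPattern() {
        queryCount = 0;
        Set<Long> ids = selectOrders().stream().map(Order::userId).collect(Collectors.toSet());
        selectUsersByIds(ids);
        return queryCount;
    }

    public static void main(String[] args) {
        System.out.println("lazy: " + lazyPattern() + " queries");       // 4 (1 + 3)
        System.out.println("batched: " + batchedPattern() + " queries"); // 2
    }
}
```

Three orders produce four queries in the lazy shape but only two in the batched shape; with 100 orders the gap is 101 versus 2.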

Fix: JOIN FETCH in the repository query:

@Query("SELECT o FROM Order o JOIN FETCH o.user WHERE o.status = :status")
List<Order> findByStatusWithUser(@Param("status") OrderStatus status);

Or use @EntityGraph to specify eager loading per query without modifying the entity mapping:

@EntityGraph(attributePaths = {"user", "lineItems"})
List<Order> findByStatus(OrderStatus status);

Enable query logging in development to catch N+1 before it reaches production:

spring:
  jpa:
    show-sql: true
logging:
  level:
    org.hibernate.SQL: DEBUG
    org.hibernate.type.descriptor.sql: TRACE  # bind parameters (Hibernate 5; Hibernate 6 uses org.hibernate.orm.jdbc.bind)

HikariCP — the connection pool configuration that matters:

spring:
  datasource:
    hikari:
      maximum-pool-size: 20         # start here, tune based on metrics
      minimum-idle: 5               # keep connections warm
      connection-timeout: 30000     # wait up to 30s for a free connection, then throw
      idle-timeout: 600000          # close idle connections after 10 minutes
      max-lifetime: 1800000         # rotate connections every 30 minutes
      leak-detection-threshold: 60000  # log if connection held > 60s

leak-detection-threshold is the most useful diagnostic option. It logs a stack trace when a connection is held longer than the threshold — surfaces slow queries and forgotten transaction boundaries immediately.

The right maximum-pool-size is not "as large as possible." The formula from HikariCP's documentation: connections = ((core_count * 2) + effective_spindle_count). For a 4-core server with SSD storage, roughly 9–10 connections. Monitor hikaricp.connections.pending: sustained pending connections signal pool exhaustion. Exhaustion does not automatically mean the pool is too small; if the database itself is the bottleneck, adding connections only lengthens the queue.
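The formula is easy to sanity-check in code. A tiny sketch of the HikariCP wiki calculation (function name hypothetical):

```java
public class PoolSize {
    // HikariCP wiki formula: connections = (core_count * 2) + effective_spindle_count
    // An SSD counts as roughly one "spindle" for this purpose
    public static int poolSize(int cores, int spindles) {
        return cores * 2 + spindles;
    }

    public static void main(String[] args) {
        System.out.println(poolSize(4, 1));  // 4-core server with SSD -> 9
        System.out.println(poolSize(8, 1));  // 8-core server with SSD -> 17
    }
}
```

Treat the result as a starting point, then tune against hikaricp.connections.pending and database-side saturation metrics rather than raising the number blindly.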

JPA and Hibernate overhead

Persistence context accumulation. In a long-running @Transactional method that loads many entities, the persistence context (first-level cache) accumulates all loaded entities and tracks changes. For bulk operations, this has two costs: memory for tracking all entities, and the dirty-checking pass at transaction commit that examines every tracked entity.

For bulk reads that don't need change tracking:

// Read-only hint — no dirty checking, reduced memory
@Transactional(readOnly = true)
public List<OrderSummary> getOrderSummaries() {
    return orderRepository.findAllProjectedBy();
}

@Transactional(readOnly = true) sets the Hibernate session to read-only — dirty checking is disabled, the flush mode is MANUAL, and Hibernate can optimize memory usage. For query-heavy endpoints, this is a meaningful improvement.

Projections over full entity loading. Loading full entities to return a subset of fields is wasteful — Hibernate hydrates all mapped columns:

// Loads entire Order entity — all columns, all lazy associations initialized if accessed
List<Order> orders = orderRepository.findAll();
return orders.stream().map(o -> new OrderSummary(o.getId(), o.getTotal())).toList();

// Projection — only fetches id and total columns
public interface OrderSummary {
    Long getId();
    Long getTotal();
}
List<OrderSummary> summaries = orderRepository.findAllProjectedBy();

Spring Data JPA's projection interfaces translate to SELECT id, total FROM orders instead of SELECT * FROM orders. For wide tables with many columns, this reduces data transfer significantly.

Serialization overhead

Jackson serialization is measurable overhead at high throughput. For a service serializing thousands of responses per second, serialization can consume 5–15% of CPU.

ObjectMapper is expensive to create. A new ObjectMapper per request is a common mistake:

// Wrong — creates ObjectMapper on every call
public String serialize(Order order) {
    return new ObjectMapper().writeValueAsString(order); // DON'T
}

// Correct — inject the configured singleton
@Autowired ObjectMapper objectMapper;

Spring Boot auto-configures a singleton ObjectMapper with sensible defaults. Inject it everywhere.
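The cost difference is in construction, not usage. A plain-Java sketch with a stand-in Mapper class (hypothetical, no Jackson dependency) that counts constructions to show why the singleton matters:

```java
public class SingletonMapper {
    public static int constructions = 0;

    // Stand-in for ObjectMapper: expensive to build, cheap to use, thread-safe once configured
    public static class Mapper {
        Mapper() { constructions++; }
        String write(Object o) { return "\"" + o + "\""; }
    }

    // Built once at startup, like Spring Boot's auto-configured ObjectMapper bean
    static final Mapper MAPPER = new Mapper();

    public static String serializeWrong(Object o) { return new Mapper().write(o); } // new instance per call
    public static String serializeRight(Object o) { return MAPPER.write(o); }       // reuse the singleton

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) serializeRight(i);
        System.out.println("constructions after singleton path: " + constructions);
        for (int i = 0; i < 1000; i++) serializeWrong(i);
        System.out.println("constructions after per-call path: " + constructions);
    }
}
```

The real ObjectMapper builds serializer caches and module registrations on construction, so the per-call version pays that setup on every request and also discards the warmed caches.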

Jackson configuration for performance:

@Bean
public ObjectMapper objectMapper() {
    return Jackson2ObjectMapperBuilder.json()
        .featuresToDisable(
            SerializationFeature.WRITE_DATES_AS_TIMESTAMPS,  // ISO strings, not longs
            DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES  // ignore extra fields
        )
        .featuresToEnable(
            MapperFeature.DEFAULT_VIEW_INCLUSION
        )
        .build();
}

Spring Boot disables FAIL_ON_UNKNOWN_PROPERTIES by default, so clients can send extra fields without breaking deserialization. For high-throughput APIs where clients send large payloads with many unknown fields, quietly skipping those fields is also cheaper than aborting the request and forcing a retry.

@JsonView for response shaping. Selectively serializing fields by view avoids creating separate DTO classes for each response variant:

public class Order {
    @JsonView(Views.Summary.class)
    private Long id;

    @JsonView(Views.Summary.class)
    private OrderStatus status;

    @JsonView(Views.Detail.class)  // only in detail view
    private List<LineItem> lineItems;
}

The serialization cost is proportional to the data serialized — fewer fields means less work.

AOP proxy overhead

Spring's AOP proxy chains — @Transactional, @Cacheable, @Async, @PreAuthorize — add method interception overhead. Each annotation on a method adds one proxy call in the chain.

The overhead per proxy call is small (microseconds) but compounds when annotations accumulate on hot methods. A method annotated with @Transactional, @Cacheable, @PreAuthorize, and a custom audit annotation has four proxy interceptors.

The case that matters: @Transactional on a method called millions of times per second. The proxy interception overhead — checking for existing transactions, creating transaction context, restoring context after — is non-trivial at that frequency. For truly hot paths with simple operations, consider programmatic transaction management via TransactionTemplate which avoids proxy overhead:

// Bypasses @Transactional proxy
transactionTemplate.execute(status -> {
    repository.save(entity);
    return entity;
});

More importantly: @Transactional on a private method does nothing. Spring AOP works through proxies, and a proxy can only intercept calls that arrive from outside the bean. A private method, or any method invoked from another method of the same class, is called directly on this, bypassing the proxy; the annotation is silently ignored. Move transaction boundaries to public methods that are called through the proxy.
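The proxy mechanics can be demonstrated with a plain JDK dynamic proxy (no Spring; the interceptor below stands in for the transaction advice):

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

public class ProxyBypassDemo {
    public interface OrderService { void placeOrder(); void placeMany(); }

    public static class OrderServiceImpl implements OrderService {
        public void placeOrder() { /* imagine @Transactional here */ }
        public void placeMany() {
            // Self-invocation: goes straight to this.placeOrder(), never through the proxy
            placeOrder();
        }
    }

    public static int intercepted = 0;

    public static OrderService proxy(OrderService target) {
        InvocationHandler handler = (p, method, args) -> {
            intercepted++; // stand-in for transaction begin/commit around the call
            return method.invoke(target, args);
        };
        return (OrderService) Proxy.newProxyInstance(
                OrderService.class.getClassLoader(),
                new Class<?>[] { OrderService.class },
                handler);
    }

    public static void main(String[] args) {
        OrderService svc = proxy(new OrderServiceImpl());
        svc.placeOrder(); // external call: intercepted
        svc.placeMany();  // external call intercepted, but the nested placeOrder() is not
        System.out.println("intercepted calls: " + intercepted); // 2, not 3
    }
}
```

Two external calls are intercepted, but the nested placeOrder() inside placeMany() is invoked on the target object directly, which is exactly why a self-invoked @Transactional method runs without a transaction.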

Startup time vs runtime performance

Spring Boot startup time and runtime performance are different concerns with different tools. Startup time is primarily affected by:

  • Component scanning scope — limit @ComponentScan to application packages, not the entire classpath
  • Eager bean initialization — spring.main.lazy-initialization=true defers to first use (useful in development, evaluate for production)
  • Auto-configuration evaluation — exclude auto-configurations you don't use; separately, the spring-context-indexer speeds component scanning by pre-computing a candidate index at build time

Runtime performance is affected by JIT compilation warmup. The first few thousand requests on a fresh JVM run slower as the JIT compiles hot methods. In production:

-XX:+TieredCompilation            # on by default since Java 8 — gradual JIT optimization
-XX:CompileThreshold=1000         # applies only when tiered compilation is disabled; with tiering on, the Tier*InvocationThreshold flags govern

For canary deployments and rolling restarts, gradually routing traffic to new instances rather than immediate full traffic allows JIT warmup before peak load.

The metrics that tell you where to look

Before profiling, use the metrics from the previous observability article to identify the layer to profile:

  • http.server.requests p99 high + hikaricp.connections.pending > 0 → database or connection pool
  • http.server.requests p99 high + hikaricp.connections.pending = 0 → CPU or application code
  • jvm.gc.pause.seconds.max > 100ms → GC tuning needed
  • High CPU + low http.server.requests throughput → serialization, AOP overhead, or application logic

The metric combination tells you which layer to profile. async-profiler then shows the specific code. Fix the top frame in the flamegraph, re-measure, repeat.

Performance optimization without measurement is expensive and usually wrong. The layers above each have specific, measurable overhead. Profile them in production-scale conditions — synthetic benchmarks with development databases and empty caches are not representative.
