Spring Boot Performance in Production — Where the Time Actually Goes

by Eric Hanson, Backend Developer at Clean Systems Consulting

The request lifecycle overhead

A request in a Spring Boot application passes through more layers than the business code suggests. From the time a request arrives to when the response leaves:

  1. Tomcat thread pool accepts the connection
  2. Servlet filter chain executes (Spring Security, logging, tracing, CORS, custom filters)
  3. DispatcherServlet routes to the controller
  4. AOP interceptors run (transaction proxy, caching proxy, security proxy, custom aspects)
  5. Controller method executes
  6. Service layer executes
  7. Repository layer executes (JPA, JDBC)
  8. Database query executes and returns
  9. Response is serialized (Jackson)
  10. Filter chain post-processing
  11. Response is written to the socket

Steps 1–4 and 9–11 are Spring Boot overhead. In a well-tuned application, they consume 1–5ms per request. In a poorly configured application — too many filters, eager AOP proxies for everything, slow Jackson serialization — they can consume 20–50ms regardless of what the business logic does.
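The framework layers can be pictured as nested handlers, each wrapping the one below it the way a servlet filter wraps the rest of the chain. A minimal plain-Java sketch (hypothetical names, no Spring) that times each layer:

```java
import java.util.function.Supplier;

public class LayerTimingSketch {
    // A "layer" wraps the handler below it, like a servlet filter wraps the chain
    public interface Layer { String handle(Supplier<String> next); }

    // Wrap a layer with timing, appending its elapsed time to a log
    public static Layer timed(String name, StringBuilder log) {
        return next -> {
            long start = System.nanoTime();
            String result = next.get();
            log.append(name).append(": ").append((System.nanoTime() - start) / 1_000).append(" us\n");
            return result;
        };
    }

    public static String run(StringBuilder log) {
        Layer filterChain = timed("filter-chain", log);
        Layer aopProxies  = timed("aop-proxies", log);
        // Outer layers measure their own work plus everything beneath them,
        // which is exactly what a flamegraph frame width shows
        return filterChain.handle(() -> aopProxies.handle(() -> "business result"));
    }

    public static void main(String[] args) {
        StringBuilder log = new StringBuilder();
        System.out.println(run(log));
        System.out.print(log);
    }
}
```

Each frame in a real flamegraph corresponds to one of these nested calls; the widest frame is the layer worth optimizing first.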

Profile before optimizing. async-profiler on a running Spring Boot application under load produces a flamegraph that shows exactly which of these layers is consuming time. The rule: find the widest frame, optimize it, measure the improvement, repeat.

Database — usually where most of the time goes

For most Spring Boot web services, 60–80% of request latency is database time. The Spring Boot overhead is real but secondary. The database work is where optimization pays off most.

N+1 queries are the most common and most impactful database performance problem in JPA applications. Spring Data JPA's lazy loading is the usual cause:

// Innocent-looking loop that generates N+1 queries
List<Order> orders = orderRepository.findByStatus(OrderStatus.PENDING);
orders.forEach(order -> {
    // Each access to order.getUser() fires a SELECT on the users table
    sendNotification(order.getUser().getEmail(), order);
});

order.getUser() returns a lazy-loaded proxy. The first access to each uninitialized proxy fires a SELECT ... FROM users WHERE id = ?. For 100 orders with distinct users, this is 101 queries: 1 for the orders, 100 for the users.
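The pattern and its fix can be sketched without JPA. A plain-Java simulation (hypothetical in-memory tables, a counter standing in for SQL statements) contrasts per-row lookups with a single batched IN query, which is the plain-SQL shape of what JOIN FETCH produces:

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

public class NPlusOneDemo {
    // Hypothetical in-memory "tables"; queryCount stands in for SQL statements issued
    static int queryCount = 0;
    static final Map<Long, String> USERS = Map.of(1L, "a@example.com", 2L, "b@example.com");

    public record Order(long id, long userId) {}

    static List<Order> selectOrders() {
        queryCount++; // SELECT * FROM orders WHERE status = 'PENDING'
        return List.of(new Order(10, 1), new Order(11, 2), new Order(12, 1));
    }

    static String selectUserEmail(long userId) {
        queryCount++; // SELECT email FROM users WHERE id = ?
        return USERS.get(userId);
    }

    static Map<Long, String> selectUsersByIds(Set<Long> ids) {
        queryCount++; // SELECT id, email FROM users WHERE id IN (...)
        return ids.stream().collect(Collectors.toMap(id -> id, USERS::get));
    }

    // Lazy-loading shape: one user query per order (the N+1 pattern)
    public static int lazyPattern() {
        queryCount = 0;
        for (Order o : selectOrders()) selectUserEmail(o.userId());
        return queryCount;
    }

    // JOIN FETCH / batching shape: load all needed users in one query
    public static int batchedPattern() {
        queryCount = 0;
        Set<Long> ids = selectOrders().stream().map(Order::userId).collect(Collectors.toSet());
        selectUsersByIds(ids);
        return queryCount;
    }

    public static void main(String[] args) {
        System.out.println("lazy: " + lazyPattern() + " queries");       // 4 (1 + 3)
        System.out.println("batched: " + batchedPattern() + " queries"); // 2
    }
}
```

Three orders produce four queries in the lazy shape but only two in the batched shape; with 100 orders the gap is 101 versus 2.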

Fix: JOIN FETCH in the repository query:

@Query("SELECT o FROM Order o JOIN FETCH o.user WHERE o.status = :status")
List<Order> findByStatusWithUser(@Param("status") OrderStatus status);

Or use @EntityGraph to specify eager loading per query without modifying the entity mapping:

@EntityGraph(attributePaths = {"user", "lineItems"})
List<Order> findByStatus(OrderStatus status);

Enable query logging in development to catch N+1 before it reaches production:

spring:
  jpa:
    show-sql: true
logging:
  level:
    org.hibernate.SQL: DEBUG
    org.hibernate.type.descriptor.sql: TRACE  # bind parameters (Hibernate 5; Hibernate 6 uses org.hibernate.orm.jdbc.bind)

HikariCP — the connection pool configuration that matters:

spring:
  datasource:
    hikari:
      maximum-pool-size: 20         # start here, tune based on metrics
      minimum-idle: 5               # keep connections warm
      connection-timeout: 30000     # wait up to 30s for a free connection, then throw
      idle-timeout: 600000          # close idle connections after 10 minutes
      max-lifetime: 1800000         # rotate connections every 30 minutes
      leak-detection-threshold: 60000  # log if connection held > 60s

leak-detection-threshold is the most useful diagnostic option. It logs a stack trace when a connection is held longer than the threshold — surfaces slow queries and forgotten transaction boundaries immediately.

The right maximum-pool-size is not "as large as possible." The formula from HikariCP's documentation: connections = ((core_count * 2) + effective_spindle_count). For a 4-core server with SSD storage, roughly 9–10 connections. Monitor hikaricp.connections.pending: sustained pending connections signal pool exhaustion. Exhaustion does not automatically mean the pool is too small; if the database itself is the bottleneck, adding connections only lengthens the queue.
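The formula is easy to sanity-check in code. A tiny sketch of the HikariCP wiki calculation (function name hypothetical):

```java
public class PoolSize {
    // HikariCP wiki formula: connections = (core_count * 2) + effective_spindle_count
    // An SSD counts as roughly one "spindle" for this purpose
    public static int poolSize(int cores, int spindles) {
        return cores * 2 + spindles;
    }

    public static void main(String[] args) {
        System.out.println(poolSize(4, 1));  // 4-core server with SSD -> 9
        System.out.println(poolSize(8, 1));  // 8-core server with SSD -> 17
    }
}
```

Treat the result as a starting point, then tune against hikaricp.connections.pending and database-side saturation metrics rather than raising the number blindly.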

JPA and Hibernate overhead

Persistence context accumulation. In a long-running @Transactional method that loads many entities, the persistence context (first-level cache) accumulates all loaded entities and tracks changes. For bulk operations, this has two costs: memory for tracking all entities, and the dirty-checking pass at transaction commit that examines every tracked entity.

For bulk reads that don't need change tracking:

// Read-only hint — no dirty checking, reduced memory
@Transactional(readOnly = true)
public List<OrderSummary> getOrderSummaries() {
    return orderRepository.findAllProjectedBy();
}

@Transactional(readOnly = true) sets the Hibernate session to read-only — dirty checking is disabled, the flush mode is MANUAL, and Hibernate can optimize memory usage. For query-heavy endpoints, this is a meaningful improvement.

Projections over full entity loading. Loading full entities to return a subset of fields is wasteful — Hibernate hydrates all mapped columns:

// Loads entire Order entity — all columns, all lazy associations initialized if accessed
List<Order> orders = orderRepository.findAll();
return orders.stream().map(o -> new OrderSummary(o.getId(), o.getTotal())).toList();

// Projection — only fetches id and total columns
public interface OrderSummary {
    Long getId();
    Long getTotal();
}
List<OrderSummary> summaries = orderRepository.findAllProjectedBy();

Spring Data JPA's projection interfaces translate to SELECT id, total FROM orders instead of SELECT * FROM orders. For wide tables with many columns, this reduces data transfer significantly.

Serialization overhead

Jackson serialization is measurable overhead at high throughput. For a service serializing thousands of responses per second, serialization can consume 5–15% of CPU.

ObjectMapper is expensive to create. A new ObjectMapper per request is a common mistake:

// Wrong — creates ObjectMapper on every call
public String serialize(Order order) {
    return new ObjectMapper().writeValueAsString(order); // DON'T
}

// Correct — inject the configured singleton
@Autowired ObjectMapper objectMapper;

Spring Boot auto-configures a singleton ObjectMapper with sensible defaults. Inject it everywhere.
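The cost difference is in construction, not usage. A plain-Java sketch with a stand-in Mapper class (hypothetical, no Jackson dependency) that counts constructions to show why the singleton matters:

```java
public class SingletonMapper {
    public static int constructions = 0;

    // Stand-in for ObjectMapper: expensive to build, cheap to use, thread-safe once configured
    public static class Mapper {
        Mapper() { constructions++; }
        String write(Object o) { return "\"" + o + "\""; }
    }

    // Built once at startup, like Spring Boot's auto-configured ObjectMapper bean
    static final Mapper MAPPER = new Mapper();

    public static String serializeWrong(Object o) { return new Mapper().write(o); } // new instance per call
    public static String serializeRight(Object o) { return MAPPER.write(o); }       // reuse the singleton

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) serializeRight(i);
        System.out.println("constructions after singleton path: " + constructions);
        for (int i = 0; i < 1000; i++) serializeWrong(i);
        System.out.println("constructions after per-call path: " + constructions);
    }
}
```

The real ObjectMapper builds serializer caches and module registrations on construction, so the per-call version pays that setup on every request and also discards the warmed caches.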

Jackson configuration for performance:

@Bean
public ObjectMapper objectMapper() {
    return Jackson2ObjectMapperBuilder.json()
        .featuresToDisable(
            SerializationFeature.WRITE_DATES_AS_TIMESTAMPS,  // ISO strings, not longs
            DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES  // ignore extra fields
        )
        .featuresToEnable(
            MapperFeature.DEFAULT_VIEW_INCLUSION
        )
        .build();
}

Spring Boot disables FAIL_ON_UNKNOWN_PROPERTIES by default, so clients can send extra fields without breaking deserialization. For high-throughput APIs where clients send large payloads with many unknown fields, quietly skipping those fields is also cheaper than aborting the request and forcing a retry.

@JsonView for response shaping. Selectively serializing fields by view avoids creating separate DTO classes for each response variant:

public class Order {
    @JsonView(Views.Summary.class)
    private Long id;

    @JsonView(Views.Summary.class)
    private OrderStatus status;

    @JsonView(Views.Detail.class)  // only in detail view
    private List<LineItem> lineItems;
}

The serialization cost is proportional to the data serialized — fewer fields means less work.

AOP proxy overhead

Spring's AOP proxy chains — @Transactional, @Cacheable, @Async, @PreAuthorize — add method interception overhead. Each annotation on a method adds one proxy call in the chain.

The overhead per proxy call is small (microseconds) but compounds when annotations accumulate on hot methods. A method annotated with @Transactional, @Cacheable, @PreAuthorize, and a custom audit annotation has four proxy interceptors.

The case that matters: @Transactional on a method called millions of times per second. The proxy interception overhead — checking for existing transactions, creating transaction context, restoring context after — is non-trivial at that frequency. For truly hot paths with simple operations, consider programmatic transaction management via TransactionTemplate which avoids proxy overhead:

// Bypasses @Transactional proxy
transactionTemplate.execute(status -> {
    repository.save(entity);
    return entity;
});

More importantly: @Transactional on a private method does nothing. Spring AOP works through proxies, and a proxy can only intercept calls that arrive from outside the bean. A private method, or any method invoked from another method of the same class, is called directly on this, bypassing the proxy; the annotation is silently ignored. Move transaction boundaries to public methods that are called through the proxy.
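The proxy mechanics can be demonstrated with a plain JDK dynamic proxy (no Spring; the interceptor below stands in for the transaction advice):

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

public class ProxyBypassDemo {
    public interface OrderService { void placeOrder(); void placeMany(); }

    public static class OrderServiceImpl implements OrderService {
        public void placeOrder() { /* imagine @Transactional here */ }
        public void placeMany() {
            // Self-invocation: goes straight to this.placeOrder(), never through the proxy
            placeOrder();
        }
    }

    public static int intercepted = 0;

    public static OrderService proxy(OrderService target) {
        InvocationHandler handler = (p, method, args) -> {
            intercepted++; // stand-in for transaction begin/commit around the call
            return method.invoke(target, args);
        };
        return (OrderService) Proxy.newProxyInstance(
                OrderService.class.getClassLoader(),
                new Class<?>[] { OrderService.class },
                handler);
    }

    public static void main(String[] args) {
        OrderService svc = proxy(new OrderServiceImpl());
        svc.placeOrder(); // external call: intercepted
        svc.placeMany();  // external call intercepted, but the nested placeOrder() is not
        System.out.println("intercepted calls: " + intercepted); // 2, not 3
    }
}
```

Two external calls are intercepted, but the nested placeOrder() inside placeMany() is invoked on the target object directly, which is exactly why a self-invoked @Transactional method runs without a transaction.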

Startup time vs runtime performance

Spring Boot startup time and runtime performance are different concerns with different tools. Startup time is primarily affected by:

  • Component scanning scope — limit @ComponentScan to application packages, not the entire classpath
  • Eager bean initialization — spring.main.lazy-initialization=true defers to first use (useful in development, evaluate for production)
  • Auto-configuration evaluation — exclude auto-configurations you don't use; separately, the spring-context-indexer speeds component scanning by pre-computing a candidate index at build time

Runtime performance is affected by JIT compilation warmup. The first few thousand requests on a fresh JVM run slower as the JIT compiles hot methods. In production:

-XX:+TieredCompilation            # on by default since Java 8 — gradual JIT optimization
-XX:CompileThreshold=1000         # applies only when tiered compilation is disabled; with tiering on, the Tier*InvocationThreshold flags govern

For canary deployments and rolling restarts, gradually routing traffic to new instances rather than immediate full traffic allows JIT warmup before peak load.

The metrics that tell you where to look

Before profiling, use the metrics from the previous observability article to identify the layer to profile:

  • http.server.requests p99 high + hikaricp.connections.pending > 0 → database or connection pool
  • http.server.requests p99 high + hikaricp.connections.pending = 0 → CPU or application code
  • jvm.gc.pause.seconds.max > 100ms → GC tuning needed
  • High CPU + low http.server.requests throughput → serialization, AOP overhead, or application logic

The metric combination tells you which layer to profile. async-profiler then shows the specific code. Fix the top frame in the flamegraph, re-measure, repeat.

Performance optimization without measurement is expensive and usually wrong. The layers above each have specific, measurable overhead. Profile them in production-scale conditions — synthetic benchmarks with development databases and empty caches are not representative.
