Writing Efficient JPA Queries — Fetch Strategies, Projections, and Native Queries

by Eric Hanson, Backend Developer at Clean Systems Consulting

Fetch strategies — the decision that determines query count

Every association in a JPA entity has a fetch type: EAGER (loaded with the parent) or LAZY (loaded on first access). The default differs by association type:

  • @ManyToOne, @OneToOne — default EAGER
  • @OneToMany, @ManyToMany — default LAZY

The @ManyToOne eager default is the primary source of unintended JOINs. Every query for Order automatically JOINs User because of the eager user association — even when the user is never accessed in the response.

Set all associations to LAZY globally, then load eagerly per query:

@Entity
public class Order {
    @ManyToOne(fetch = FetchType.LAZY)    // override the default
    private User user;

    @OneToMany(mappedBy = "order", fetch = FetchType.LAZY)  // already default
    private List<LineItem> lineItems;

    @OneToOne(fetch = FetchType.LAZY)     // override the default
    private ShippingAddress shippingAddress;
}

With all associations lazy, a query for orders never automatically joins related tables. Load what you need explicitly per query.

JOIN FETCH — eager loading for a specific query

JOIN FETCH in JPQL tells Hibernate to include the association in the same SQL query:

// Loads order + user + line items in one SQL statement
@Query("SELECT o FROM Order o " +
       "JOIN FETCH o.user " +
       "LEFT JOIN FETCH o.lineItems " +
       "WHERE o.status = :status")
List<Order> findByStatusWithUserAndItems(@Param("status") OrderStatus status);

Generated SQL:

SELECT o.*, u.*, li.*
FROM orders o
JOIN users u ON o.user_id = u.id
LEFT JOIN line_items li ON li.order_id = o.id
WHERE o.status = ?

One query. No N+1. The tradeoff: JOIN FETCH with @OneToMany produces a Cartesian product — each order row is duplicated for each line item. An order with 10 line items returns 10 rows, all with identical order and user columns.

For small, bounded collections this is acceptable. For large collections (hundreds of items per parent), the row duplication is wasteful. Use DISTINCT in JPQL to deduplicate at the Hibernate level:

@Query("SELECT DISTINCT o FROM Order o LEFT JOIN FETCH o.lineItems WHERE o.status = :status")
List<Order> findByStatusWithItems(@Param("status") OrderStatus status);

DISTINCT in JPQL filters duplicates in memory after the query — it does not add DISTINCT to the SQL (Hibernate adds it to the SQL only in specific scenarios). The result set is deduplicated by entity identity.

@EntityGraph — declarative fetch paths

@EntityGraph achieves the same result as JOIN FETCH without modifying the JPQL:

@EntityGraph(attributePaths = {"user", "lineItems"})
List<Order> findByStatus(OrderStatus status);

// Generates the same JOIN FETCH query as above

@EntityGraph is cleaner for repository-level control without explicit JPQL. Its limitation: it always generates a JOIN, which is not appropriate for large @OneToMany collections where batch fetching is more efficient.

For named entity graphs defined on the entity:

@Entity
@NamedEntityGraph(
    name = "Order.withDetails",
    attributeNodes = {
        @NamedAttributeNode("user"),
        @NamedAttributeNode(value = "lineItems", subgraph = "lineItems.product")
    },
    subgraphs = {
        @NamedSubgraph(name = "lineItems.product",
                       attributeNodes = @NamedAttributeNode("product"))
    }
)
public class Order { ... }

// Repository
@EntityGraph("Order.withDetails")
Optional<Order> findById(Long id);

Named entity graphs express complex fetch paths including subgraphs. The definition lives on the entity and can be reused across multiple repository methods.

Batch fetching — the alternative for large collections

JOIN FETCH for @OneToMany collections causes row duplication. Batch fetching avoids this by loading collections in IN clause batches after the parent query:

// On the entity
@OneToMany(mappedBy = "order")
@BatchSize(size = 50)
private List<LineItem> lineItems;

With @BatchSize(50), loading 200 orders and accessing their line items generates:

  • 1 query for orders
  • 4 queries for line items: WHERE order_id IN (1..50), IN (51..100), etc.

Total: 5 queries instead of 201. No row duplication. For large collections accessed across many parents, batch fetching is significantly more efficient than JOIN FETCH.

Configure batch fetching globally for all associations:

spring:
  jpa:
    properties:
      hibernate:
        default_batch_fetch_size: 50

The global setting applies to all lazy associations unless overridden per-entity. This is often the single configuration change with the most impact on N+1 query reduction across an entire application.

Projections — selecting only what you need

Full entity loading returns all mapped columns. For read-only views, summary lists, and API responses that need a subset of fields, projections reduce data transfer and entity tracking overhead.

Interface projections — Spring Data generates a SELECT for only the projected columns:

public interface OrderSummary {
    Long getId();
    OrderStatus getStatus();
    BigDecimal getTotal();

    // Nested projection — joins to user table
    UserEmail getUser();

    interface UserEmail {
        String getEmail();
    }
}

List<OrderSummary> findByStatus(OrderStatus status);
// SELECT o.id, o.status, o.total, u.email FROM orders o JOIN users u ON ...

Interface projections create proxy objects — Spring generates an implementation at runtime. They're convenient but carry the overhead of proxy creation for each result row.

Record projections — no proxy, direct object construction:

public record OrderRow(Long id, OrderStatus status, BigDecimal total, String userEmail) {}

@Query("SELECT new com.example.OrderRow(o.id, o.status, o.total, u.email) " +
       "FROM Order o JOIN o.user u WHERE o.status = :status")
List<OrderRow> findOrderRows(@Param("status") OrderStatus status);

Constructor expressions in JPQL create value objects directly — no entity lifecycle, no dirty checking, no persistence context tracking. Faster for high-volume read operations.

The projection vs entity tradeoff. Projections are faster for reads but can't be used for writes — they're not managed entities. The pattern: use projections for queries that feed API responses or reports; use full entities for queries that lead to updates.

When JPQL is not enough

JPQL covers most query patterns. Three cases where it falls short:

Database-specific functions and operators. Window functions (ROW_NUMBER(), LAG(), LEAD()), LATERAL joins, RETURNING, PostgreSQL's tsvector full-text search — JPQL has no syntax for these.

Complex aggregations. GROUPING SETS, CUBE, ROLLUP, conditional aggregation across multiple columns with complex filter conditions.

Performance-critical queries. When you need specific index hints, join order control, or query plan determinism.

For these cases, native SQL:

@Query(value = """
    WITH monthly_totals AS (
        SELECT user_id,
               DATE_TRUNC('month', created_at) AS month,
               SUM(total) AS monthly_total,
               COUNT(*) AS order_count
        FROM orders
        WHERE status = 'COMPLETED'
        GROUP BY user_id, DATE_TRUNC('month', created_at)
    )
    SELECT u.email,
           mt.month,
           mt.monthly_total,
           mt.order_count,
           RANK() OVER (PARTITION BY mt.month ORDER BY mt.monthly_total DESC) AS rank
    FROM monthly_totals mt
    JOIN users u ON mt.user_id = u.id
    WHERE :minTotal IS NULL OR mt.monthly_total > :minTotal
    """, nativeQuery = true)
List<Object[]> findMonthlyTopSpenders(@Param("minTotal") BigDecimal minTotal);

Native queries return Object[] by default. Map to a projection interface for cleaner access:

public interface MonthlySpenderSummary {
    String getEmail();
    Instant getMonth();
    BigDecimal getMonthlyTotal();
    Long getOrderCount();
    Long getRank();
}

@Query(value = "...", nativeQuery = true)
List<MonthlySpenderSummary> findMonthlyTopSpenders(@Param("minTotal") BigDecimal minTotal);

Spring Data maps native query column names to interface getter names (case-insensitive, underscore-to-camelCase conversion). monthly_total maps to getMonthlyTotal().

Pagination — controlling result set size

Unbounded queries on large tables are a reliability risk. Always apply pagination or explicit limits:

// Pageable — adds LIMIT and OFFSET
Page<Order> findByStatus(OrderStatus status, Pageable pageable);

// Slice — adds LIMIT only, no COUNT query
Slice<Order> findByStatus(OrderStatus status, Pageable pageable);

// Top N — implicit limit
List<Order> findTop10ByStatusOrderByCreatedAtDesc(OrderStatus status);

Page<T> executes two queries: the data query with LIMIT/OFFSET and a COUNT(*) for total element count. For large tables, the count query is expensive — avoid it when clients don't need the total.

Slice<T> executes only the data query, fetching pageSize + 1 rows to determine if more pages exist. Use Slice for infinite scroll and cursor-based pagination where total count is irrelevant.

For the data query itself, OFFSET-based pagination degrades with large offsets — OFFSET 100000 requires the database to read and discard 100,000 rows before returning results. For large datasets, cursor-based pagination (keyset pagination) is more efficient:

@Query("SELECT o FROM Order o WHERE o.createdAt < :cursor ORDER BY o.createdAt DESC")
List<Order> findBeforeCursor(@Param("cursor") Instant cursor, Pageable pageable);

The cursor is the createdAt value of the last item on the previous page. The database uses the index on created_at to find the starting position directly — no row skipping.

The query audit before production

Before any endpoint goes to production:

Enable SQL logging and exercise the endpoint with a realistic data volume:

logging:
  level:
    org.hibernate.SQL: DEBUG
    org.hibernate.type.descriptor.sql: TRACE

Check for:

  • Query count per request — more than 3–5 queries for a single request warrants investigation
  • Full entity loads where projections would suffice — look for SELECT * where only some columns are used
  • Missing WHERE clause — unbounded queries that grow with table size
  • OFFSET on queries where cursor pagination would be more appropriate

The SQL log shows exactly what Hibernate generates. The gap between what you intended and what was generated is where most JPA performance problems live.

Scale Your Backend - Need an Experienced Backend Developer?

We provide backend engineers who join your team as contractors to help build, improve, and scale your backend systems.

We focus on clean backend design, clear documentation, and systems that remain reliable as products grow. Our goal is to strengthen your team and deliver backend systems that are easy to operate and maintain.

We work from our own development environments and support teams across US, EU, and APAC timezones. Our workflow emphasizes documentation and asynchronous collaboration to keep development efficient and focused.

  • Production Backend Experience. Experience building and maintaining backend systems, APIs, and databases used in production.
  • Scalable Architecture. Design backend systems that stay reliable as your product and traffic grow.
  • Contractor Friendly. Flexible engagement for short projects, long-term support, or extra help during releases.
  • Focus on Backend Reliability. Improve API performance, database stability, and overall backend reliability.
  • Documentation-Driven Development. Development guided by clear documentation so teams stay aligned and work efficiently.
  • Domain-Driven Design. Design backend systems around real business processes and product needs.

Tell us about your project

Our offices

  • Copenhagen
    1 Carlsberg Gate
    1260, København, Denmark
  • Magelang
    12 Jalan Bligo
    56485, Magelang, Indonesia

More articles

The Business Side of Software Engineering

Software isn’t just built — it’s funded, prioritized, and traded off. Behind every technical decision, there’s a business decision hiding.

Read more

How Singapore Scaleups Reduce Backend Overhead Efficiently

Your engineering team doubled last year. Your backend output didn't. Somewhere between the new hires and the new meetings, the actual building slowed down.

Read more

Dell, Apple, Tesla Are in Austin — and They Are Hiring the Same Developers You Need

When the biggest companies in the world set up in your city, the hiring market doesn't get easier. Here's how startups are staying in the game.

Read more

Canada's Big Banks Are Winning the Toronto Backend Talent War — Here Is How Startups Fight Back

Toronto's financial institutions have deep pockets, stable careers, and a head start on recruiting. Startups need a different playbook.

Read more