Stop Returning Everything When the Client Only Needs a Few Fields

April 1, 2026

by Arif Ikhsanudin, Backend Developer

The list endpoint that returns full objects

A product list page needs to display 50 items, each showing a name, thumbnail URL, and price. Your API returns the full product object: name, description, price, SKU, weight, dimensions, inventory count, supplier ID, cost basis, tax class, 12 image URLs, and 40 other fields.

The server serializes 50 of these and transmits them over the network. The client deserializes them, uses three fields, and discards the rest. The work multiplies across every mobile client, every request, every page load.

This is over-fetching. It wastes bandwidth, adds latency on slow connections, raises memory pressure on mobile clients, and exposes fields the client probably should not have (cost basis, supplier ID, inventory count).
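To get a rough feel for the cost, the sketch below builds a made-up product with a handful of the fields listed above and compares the JSON size of 50 full objects against 50 three-field objects. The field values and the `make_product` helper are hypothetical, so treat the ratio as illustrative rather than as a measurement.

```python
import json

# Hypothetical product record; all values are invented for illustration.
def make_product(i: int) -> dict:
    return {
        "id": f"prod_{i:04d}",
        "name": f"Widget {i}",
        "price": 29.99,
        "thumbnail_url": f"https://example.com/thumbs/{i}.jpg",
        "description": "x" * 2048,  # assume roughly 2KB of description text
        "sku": f"SKU-{i:06d}",
        "weight_grams": 350,
        "inventory_count": 120,
        "supplier_id": "sup_0001",
        "cost_basis": 11.40,
        "image_urls": [f"https://example.com/img/{i}/{n}.jpg" for n in range(12)],
    }

# The three fields the list page actually displays.
LIST_FIELDS = ("name", "thumbnail_url", "price")

def payload_size(items: list) -> int:
    """Size in bytes of the JSON-encoded list."""
    return len(json.dumps(items).encode("utf-8"))

full = [make_product(i) for i in range(50)]
slim = [{k: p[k] for k in LIST_FIELDS} for p in full]

full_bytes = payload_size(full)
slim_bytes = payload_size(slim)
print(f"full: {full_bytes} bytes, slim: {slim_bytes} bytes")
```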

The REST approach: sparse fieldsets

The JSON:API spec defines a fields query parameter for requesting specific fields:

GET /products?fields[products]=name,price,thumbnail_url

A more common REST convention is a flat, comma-separated query parameter, usually named fields (some APIs call it include, a name JSON:API reserves for related resources):

GET /products?fields=name,price,thumbnail_url

The response includes only the requested fields:

{
  "data": [
    { "id": "prod_01HZ", "name": "Widget Pro", "price": 29.99, "thumbnail_url": "https://..." },
    ...
  ]
}

Implementation in a typical REST framework:

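from typing import Optional

@app.get("/products")
def list_products(fields: Optional[str] = None):
    # Parse the comma-separated list, falling back to the default field set.
    field_list = [f.strip() for f in fields.split(",")] if fields else DEFAULT_FIELDS
    # Never allow fields outside the whitelist.
    requested = set(field_list) & set(PUBLIC_PRODUCT_FIELDS)
    if not requested:
        # Nothing valid was requested; fall back rather than return empty objects.
        requested = set(DEFAULT_FIELDS)

    products = db.query(Product).limit(50).all()
    return [serialize(p, fields=requested) for p in products]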

The whitelist is non-negotiable. Do not let clients request any field they can name; enforce an explicit set of fields the client is permitted to see. Otherwise the sparse fieldset mechanism becomes a data exposure vector.
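One way to apply the whitelist, sketched without any framework. `PUBLIC_PRODUCT_FIELDS` and `DEFAULT_FIELDS` are named in the snippet above, but their contents here are assumptions, as are the `parse_fields` and `serialize` helpers and the `UnknownFieldError` exception. This version rejects unknown or private field names outright instead of silently dropping them; either policy is defensible as long as nothing outside the whitelist is returned.

```python
from typing import Optional

# Assumed contents; the article does not specify these sets.
PUBLIC_PRODUCT_FIELDS = frozenset({"id", "name", "price", "thumbnail_url", "description"})
DEFAULT_FIELDS = ("id", "name", "price", "thumbnail_url")

class UnknownFieldError(ValueError):
    """Raised when a client asks for a field outside the whitelist."""

def parse_fields(raw: Optional[str]) -> list:
    """Turn '?fields=a,b' into an ordered, validated list of field names."""
    names = [name.strip() for name in (raw or "").split(",") if name.strip()]
    if not names:
        return list(DEFAULT_FIELDS)
    unknown = sorted(set(names) - PUBLIC_PRODUCT_FIELDS)
    if unknown:
        # A request handler would typically map this to a 400 response.
        raise UnknownFieldError(f"unsupported fields: {', '.join(unknown)}")
    # Always include the identifier, keep request order, drop duplicates.
    return list(dict.fromkeys(["id", *names]))

def serialize(product: dict, fields: list) -> dict:
    """Copy only the requested keys; private keys never reach the output."""
    return {name: product[name] for name in fields}

product = {
    "id": "prod_01HZ", "name": "Widget Pro", "price": 29.99,
    "thumbnail_url": "https://example.com/t.jpg",
    "description": "A widget.", "cost_basis": 11.40,
}

print(serialize(product, parse_fields("name,price")))
```

Returning an error for a private field such as `cost_basis` tells the caller the request was wrong. Silently dropping it, as the intersection in the endpoint above does, is more forgiving but can hide client bugs.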

GraphQL as structural projection

GraphQL solves the same problem at the protocol level. Clients specify exactly the shape of the data they need:

query {
  products(first: 50) {
    nodes {
      name
      price
      thumbnailUrl
    }
  }
}

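The server runs resolvers only for the requested fields. Related objects that were not requested are never resolved, so their queries never run, and a dataloader batches the lookups that remain so that 50 products do not trigger 50 separate queries. Field selection does not by itself narrow the columns a resolver reads from the database; that still depends on how the resolver builds its query.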
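The resolver model can be illustrated without a GraphQL library. In this sketch each field has its own resolver function and only the fields named in the selection are invoked. `RESOLVERS`, `resolve`, and the `calls` counter are hypothetical names, and a real server would derive the selection from the parsed query rather than from a list of strings.

```python
from collections import Counter
from typing import Any, Callable, Dict, List

calls: Counter = Counter()  # records which resolvers actually ran

def resolve_name(row: dict) -> str:
    calls["name"] += 1
    return row["name"]

def resolve_price(row: dict) -> float:
    calls["price"] += 1
    return row["price"]

def resolve_supplier(row: dict) -> dict:
    # Stands in for an expensive lookup of a related object.
    calls["supplier"] += 1
    return {"id": row["supplier_id"]}

RESOLVERS: Dict[str, Callable[[dict], Any]] = {
    "name": resolve_name,
    "price": resolve_price,
    "supplier": resolve_supplier,
}

def resolve(rows: List[dict], selection: List[str]) -> List[dict]:
    """Run only the resolvers for the selected fields."""
    return [{field: RESOLVERS[field](row) for field in selection} for row in rows]

rows = [
    {"name": "Widget Pro", "price": 29.99, "supplier_id": "sup_1"},
    {"name": "Widget Mini", "price": 9.99, "supplier_id": "sup_2"},
]

result = resolve(rows, ["name", "price"])
print(result)
print(dict(calls))  # the supplier resolver was never called
```

The rows themselves were still loaded in full here, which is the gap that query-level projection closes.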

GraphQL makes sense when: clients have highly variable data needs, you have multiple clients (mobile, web, partner) with different field requirements, and you are willing to invest in the schema definition and resolver architecture.

It does not make sense when: your API has a small number of well-defined resources with relatively stable shapes, you do not control the clients (public API), or you need HTTP-level caching (GraphQL is usually served as POST requests to a single endpoint, which bypasses standard HTTP caches unless you add GET support or persisted queries).

The database query implication

Returning fewer fields is most valuable when it also reduces the database query. If you run SELECT * and then filter in application code, you save network bandwidth but not database I/O.

Use projection at the query level:

# Before: fetches all columns
products = db.query(Product).all()

# After: only fetches requested columns
products = db.query(
    Product.id, Product.name, Product.price, Product.thumbnail_url
).all()

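In PostgreSQL, the biggest savings come from wide rows whose unrequested columns hold large text or JSONB values. Values that large are typically stored out of line (TOAST) and are read only when a query references the column, so a projection that skips them avoids that I/O. Small inline columns live on the same page as the rest of the row, so for those the savings are mostly in the data sent from the database to the application.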

For list endpoints with many rows, the effects compound: fewer bytes from the database, fewer bytes over the network, and less serialization work. Take a product table whose description column averages 2KB per row. A list of 50 products that requests everything reads 50 × 2KB = 100KB for descriptions alone. A projection that excludes description drops that to near zero.
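To tie the requested fields to the query itself, the column list can be built from the validated field names. The sketch below uses the standard-library `sqlite3` module and an in-memory table instead of the ORM session shown above, purely so that it runs on its own. The table layout, `ALLOWED_COLUMNS`, and `fetch_products` are assumptions. Column names are interpolated into the SQL only after being checked against the whitelist, since identifiers cannot be bound as query parameters.

```python
import sqlite3

# Assumed whitelist of selectable columns.
ALLOWED_COLUMNS = ("id", "name", "price", "thumbnail_url", "description")

def fetch_products(conn: sqlite3.Connection, fields: list, limit: int = 50) -> list:
    """Select only the requested columns, after validating every name."""
    columns = [f for f in fields if f in ALLOWED_COLUMNS]
    if not columns:
        raise ValueError("no permitted columns requested")
    # Safe to interpolate: every name came from ALLOWED_COLUMNS.
    sql = f"SELECT {', '.join(columns)} FROM products ORDER BY id LIMIT ?"
    cursor = conn.execute(sql, (limit,))
    return [dict(zip(columns, row)) for row in cursor.fetchall()]

# Demo data: 60 rows, each with a 2KB description and a private cost_basis.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price REAL,"
    " thumbnail_url TEXT, description TEXT, cost_basis REAL)"
)
conn.executemany(
    "INSERT INTO products (name, price, thumbnail_url, description, cost_basis)"
    " VALUES (?, ?, ?, ?, ?)",
    [(f"Widget {i}", 9.99 + i, f"https://example.com/{i}.jpg", "x" * 2048, 4.0)
     for i in range(60)],
)

rows = fetch_products(conn, ["id", "name", "price", "thumbnail_url"])
print(len(rows), rows[0])
```

With an ORM session like the one in the snippets above, the same idea would be to pass only the whitelisted column attributes to the query, for example by looking each validated name up on the model with `getattr`.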

ETags and conditional requests for change detection

When a client needs to check whether data has changed (polling pattern), returning the full object on every poll is wasteful even if the data is the same.

Use ETag and If-None-Match for conditional responses:

First request

GET /products/42
→ 200 OK
ETag: "v1-abc123"
{full product object}

Subsequent request

GET /products/42
If-None-Match: "v1-abc123"
→ 304 Not Modified
(empty body)

The client's cached copy is still valid, and no response body is transferred. This requires the server to compute a validator, either a hash of the response content or a version field stored with the record, but the bandwidth savings for polling-heavy clients are significant.
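A minimal sketch of the server side, assuming the ETag is a hash of the serialized body. `make_etag`, `etag_matches`, and `conditional_get` are hypothetical helpers that return a status, headers, and body tuple rather than a real framework response. The header parsing is deliberately simple and assumes ETag values that contain no commas.

```python
import hashlib
import json
from typing import Optional

def make_etag(body: bytes) -> str:
    """Quoted ETag derived from a SHA-256 hash of the response body."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'

def etag_matches(if_none_match: Optional[str], etag: str) -> bool:
    """Weak comparison against each entry of an If-None-Match header."""
    if not if_none_match:
        return False
    candidates = [c.strip() for c in if_none_match.split(",")]
    if "*" in candidates:
        return True
    # Ignore the weak-validator prefix (W/) when comparing.
    return etag in {c[2:] if c.startswith("W/") else c for c in candidates}

def conditional_get(product: dict, if_none_match: Optional[str] = None):
    """Return (status, headers, body) for a GET with an optional If-None-Match."""
    body = json.dumps(product, sort_keys=True).encode("utf-8")
    etag = make_etag(body)
    if etag_matches(if_none_match, etag):
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag, "Content-Type": "application/json"}, body

product = {"id": 42, "name": "Widget Pro", "price": 29.99}
status, headers, body = conditional_get(product)
status2, _, body2 = conditional_get(product, headers["ETag"])
product["price"] = 24.99  # a change invalidates the old ETag
status3, _, _ = conditional_get(product, headers["ETag"])
print(status, status2, status3)  # prints: 200 304 200
```

Hashing the body still means building the full response on every poll. Deriving the ETag from a version or last-modified column lets the server answer 304 without serializing anything, at the cost of keeping that column accurate.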

What the defaults should be

Every list endpoint should have a defined default field set: the fields returned when no fields parameter is specified. Make the default minimal and useful rather than comprehensive. Add fields on explicit request, not by default.

This is a contract. Once you ship a default field set, removing fields from it is a breaking change. Be conservative about what goes in the default from the start.
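One way to keep that contract honest is a small check that runs in the test suite. The sketch assumes a hypothetical `SHIPPED_DEFAULTS` snapshot, recorded when the API was released, and compares it against the current `DEFAULT_FIELDS`. All names and field sets here are illustrative.

```python
# Snapshot of the default field set at release time (assumed values).
SHIPPED_DEFAULTS = frozenset({"id", "name", "price", "thumbnail_url"})

# Current configuration (assumed values).
DEFAULT_FIELDS = frozenset({"id", "name", "price", "thumbnail_url"})
PUBLIC_PRODUCT_FIELDS = DEFAULT_FIELDS | {"description", "sku"}

def check_default_contract(shipped, current, public) -> list:
    """Return a list of contract violations; an empty list means no problems."""
    problems = []
    removed = sorted(shipped - current)
    if removed:
        problems.append(f"removed from defaults (breaking): {', '.join(removed)}")
    private = sorted(current - public)
    if private:
        problems.append(f"defaults outside the public whitelist: {', '.join(private)}")
    return problems

print(check_default_contract(SHIPPED_DEFAULTS, DEFAULT_FIELDS, PUBLIC_PRODUCT_FIELDS))
```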
