Stop Guessing Why Your Query Is Slow. Use EXPLAIN.

by Eric Hanson, Backend Developer at Clean Systems Consulting

The optimization that didn't help

A query is running in 6 seconds. A developer looks at it, decides the users join is the problem, and adds an index on users.email. The query still runs in 6 seconds. They add another index. Still slow. They escalate.

An hour later, someone runs EXPLAIN ANALYZE and discovers the bottleneck is a sort on 2 million rows that's spilling to disk because work_mem is set to 4MB. The indexes were irrelevant. The problem was never the join.

This is the standard pattern when teams optimize without EXPLAIN. They act on intuition and get lucky or waste time. EXPLAIN takes 10 seconds to run and tells you exactly what the database is doing. Use it first, always.

EXPLAIN vs EXPLAIN ANALYZE: the critical difference

-- EXPLAIN: shows the plan the optimizer would use, WITHOUT executing the query
EXPLAIN SELECT * FROM orders WHERE user_id = 42;

-- EXPLAIN ANALYZE: actually executes the query and shows real timings and row counts
EXPLAIN ANALYZE SELECT * FROM orders WHERE user_id = 42;

EXPLAIN alone shows estimated costs and rows. It does not tell you how long the query actually took or how many rows were actually returned. For diagnosis, this is almost never enough.

EXPLAIN ANALYZE executes the query. For a slow SELECT, this means you're running the slow query twice — once to diagnose. That's fine and necessary. For destructive queries (UPDATE, DELETE), wrap in a transaction you roll back:

BEGIN;
EXPLAIN ANALYZE DELETE FROM orders WHERE created_at < '2023-01-01';
ROLLBACK;

The full diagnostic command

For PostgreSQL, this is what you should run:

EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT
  o.id,
  u.name,
  SUM(oi.quantity * oi.unit_price) AS total
FROM orders o
JOIN users u ON o.user_id = u.id
JOIN order_items oi ON o.id = oi.order_id
WHERE o.created_at > NOW() - INTERVAL '30 days'
  AND o.status = 'completed'
GROUP BY o.id, u.name
ORDER BY total DESC;

The BUFFERS option adds buffer cache statistics — how many pages were read from cache (shared hit) vs disk (shared read). High shared reads on a hot query is a sign of cache pressure.

For MySQL:

EXPLAIN FORMAT=JSON
SELECT ...;

MySQL's JSON format is more information-dense than the default tabular output. Look for "access_type": "ALL" (full table scan) and "rows_examined_per_scan" for each table.

Reading the output: the one-minute version

Start at the bottom of the plan (the leaf nodes — where data is read) and work up:

Sort  (actual time=3821.43..3894.23 rows=289341 width=72)
  Sort Key: (sum((oi.quantity * oi.unit_price))) DESC
  Sort Method: external merge  Disk: 28640kB          ← SPILLING TO DISK
  ->  HashAggregate  (actual time=2341.12..2589.44 rows=289341)
        ->  Hash Join  (actual time=234.33..1923.44 rows=1247891)
              ->  Seq Scan on order_items oi
              ->  Hash
                    ->  Index Scan on orders o
                          Filter: (status = 'completed' AND created_at > ...)
                          Rows Removed by Filter: 1823441  ← POOR SELECTIVITY

Two immediate findings:

  1. Sort Method: external merge Disk: 28640kB — the sort is spilling 28MB to disk. Fix: increase work_mem for this session (SET work_mem = '256MB' before the query).

  2. Rows Removed by Filter: 1823441 on the index scan — the index on orders is being used, but after following index entries, 1.8 million rows are being discarded by the filter. The index might not be selective enough for this query, or a composite index covering (status, created_at) would reduce the rows fetched from the heap.

What each finding points to

FindingLikely causeFix
Seq Scan on large tableMissing index, or function on columnAdd index, rewrite filter
High rows estimate vs actualStale statisticsANALYZE <table>
External merge diskInsufficient work_memIncrease work_mem
Nested Loop + large outerWrong join type chosenCheck statistics, consider enable_nestloop = off temporarily to test
High shared read vs hitWorking set not in buffer poolIncrease shared_buffers, add caching layer
Index scan with many filtered rowsIndex not selective enoughComposite index or partial index

The habit to build

Before you write an index migration, before you refactor a query, before you escalate a slow query report — run EXPLAIN (ANALYZE, BUFFERS). Read the actual row counts. Find the node with the highest actual time. Fix that node. Run the plan again to confirm improvement. This process, done consistently, catches 90% of query performance problems at the root cause, not at the symptom.

Scale Your Backend - Need an Experienced Backend Developer?

We provide backend engineers who join your team as contractors to help build, improve, and scale your backend systems.

We focus on clean backend design, clear documentation, and systems that remain reliable as products grow. Our goal is to strengthen your team and deliver backend systems that are easy to operate and maintain.

We work from our own development environments and support teams across US, EU, and APAC timezones. Our workflow emphasizes documentation and asynchronous collaboration to keep development efficient and focused.

  • Production Backend Experience. Experience building and maintaining backend systems, APIs, and databases used in production.
  • Scalable Architecture. Design backend systems that stay reliable as your product and traffic grow.
  • Contractor Friendly. Flexible engagement for short projects, long-term support, or extra help during releases.
  • Focus on Backend Reliability. Improve API performance, database stability, and overall backend reliability.
  • Documentation-Driven Development. Development guided by clear documentation so teams stay aligned and work efficiently.
  • Domain-Driven Design. Design backend systems around real business processes and product needs.

Tell us about your project

Our offices

  • Copenhagen
    1 Carlsberg Gate
    1260, København, Denmark
  • Magelang
    12 Jalan Bligo
    56485, Magelang, Indonesia

More articles

You're Missing Indexes Where It Matters Most

Most teams add indexes reactively — after a slow query surfaces in production. The indexes that matter most are the ones you plan for upfront, based on your actual query patterns rather than generic advice.

Read more

System Design Is Not About Drawing Pretty Diagrams

Most system design conversations produce polished diagrams that look great in a slide deck and fall apart in production. The diagram is not the design — the decisions behind it are.

Read more

OpenAPI Specs: The Documentation Format Worth Getting Right From the Start

An OpenAPI spec done well is a contract, a test harness, and an SDK generator. An OpenAPI spec done poorly is a documentation burden that diverges from reality within weeks.

Read more

How I Make Architecture Decisions Without Endless Meetings

Architecture decisions don't need a calendar invite — they need a clear process, the right people, and a bias toward writing things down. Here's the framework I actually use.

Read more