Why Your Query Is Slow Even Though You Have an Index
by Eric Hanson, Backend Developer at Clean Systems Consulting
The index that isn't helping
You've confirmed the index exists. \d orders shows it clearly. The query is still slow. You run EXPLAIN and see Seq Scan on orders where you expected Index Scan. This is one of the most frustrating situations in database debugging, and it has a small, identifiable set of root causes.
Cause 1: A function or expression wraps the indexed column
This is the most common cause. Any transformation applied to an indexed column in a WHERE clause defeats the index, because the index stores the raw column values, not the transformed values.
-- Index exists on: created_at
-- Query does NOT use the index:
WHERE DATE(created_at) = '2024-01-15'
WHERE EXTRACT(YEAR FROM created_at) = 2024
WHERE LOWER(email) = 'user@example.com'
WHERE CAST(user_id AS TEXT) = '42'
WHERE created_at::DATE = CURRENT_DATE
In each case, the database must compute the function for every row before comparing, making the index useless.
Fix: rearrange the condition so the column is bare:
-- Use range condition instead of function
WHERE created_at >= '2024-01-15' AND created_at < '2024-01-16'
WHERE created_at >= '2024-01-01' AND created_at < '2025-01-01'
-- Normalize at write time (store emails lowercase)
WHERE email = 'user@example.com' -- after normalizing during insert
-- Fix the type mismatch upstream
WHERE user_id = 42 -- not CAST(user_id AS TEXT)
Alternatively, create a functional index that stores the pre-computed expression:
-- PostgreSQL: functional index
CREATE INDEX idx_users_lower_email ON users(LOWER(email));
-- Now this query uses the index:
WHERE LOWER(email) = 'user@example.com'
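You can confirm the planner actually picks up the expression index with EXPLAIN. A minimal sketch, reusing the users table and idx_users_lower_email index from the example above:

```sql
-- Confirm the expression index is chosen for the rewritten predicate
EXPLAIN SELECT * FROM users WHERE LOWER(email) = 'user@example.com';
-- Expect a plan node like:
--   Index Scan using idx_users_lower_email on users
-- Note: the query expression must match the index expression exactly;
-- LOWER(email) matches, but LOWER(TRIM(email)) would not.
```

On a tiny or freshly created table the planner may still prefer a seq scan (see Cause 6), so test against realistically sized data.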
Cause 2: A type mismatch causes implicit casting
-- user_id is BIGINT in the table
-- But the application passes a string:
WHERE user_id = '42' -- implicit CAST(user_id AS TEXT) or CAST('42' AS BIGINT)
The behavior depends on the database and the types involved. In PostgreSQL, comparing a BIGINT column to the string literal '42' casts the literal to BIGINT, so the index is still used. MySQL, however, compares a VARCHAR column to a numeric literal by converting the column value in every row, which defeats the index; PostgreSQL rejects that comparison with an operator error instead.
The safe rule: always match the literal type to the column type in queries. In ORM code, ensure the parameter type matches the column type — mismatches are a common ORM bug.
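One way to see what the planner is comparing is pg_typeof, and the safe fix is to put any explicit cast on the parameter side. A sketch using the orders/user_id names from the running example:

```sql
-- Inspect the column's actual type
SELECT pg_typeof(user_id) FROM orders LIMIT 1;  -- e.g. bigint

-- Cast the parameter, never the column: the column stays bare,
-- so the index on user_id remains usable
WHERE user_id = CAST('42' AS BIGINT)
```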
Cause 3: Low selectivity — the optimizer knows better
An index on a column with few distinct values (like status with values active, inactive) is only useful when the filtered value is rare. If 95% of rows have status = 'active', the optimizer correctly determines that reading the entire table sequentially is cheaper than following index pointers to 95% of rows.
-- If 90% of orders are 'completed', this may (correctly) use a seq scan:
WHERE status = 'completed'
-- If 0.1% of orders are 'pending', this will use an index:
WHERE status = 'pending'
You can verify the optimizer's estimate vs reality with EXPLAIN ANALYZE:
Seq Scan on orders (cost=0.00..45231.00 rows=9823451 width=...) (actual rows=9412088 ...)
If the estimate is close to actual and the optimizer chose a seq scan, it's probably correct. If the estimate is wildly off (e.g., estimated 100 rows, actual 1,000,000), stale statistics are the problem — run ANALYZE orders.
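When the rare value is the one you query constantly, PostgreSQL offers a standard remedy not shown above: a partial index that contains only the rare rows, so it stays small and cheap to use. A sketch (the index name and created_at column are illustrative):

```sql
-- Partial index: covers only the ~0.1% of rows that are 'pending',
-- so the planner happily uses it for that filter
CREATE INDEX idx_orders_pending ON orders(created_at)
WHERE status = 'pending';

-- Used by queries whose WHERE clause implies the index predicate, e.g.:
-- WHERE status = 'pending' AND created_at > ...
```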
Cause 4: Stale or missing statistics
The query optimizer relies on column statistics to estimate how many rows a filter will return. If those statistics are stale (table has grown significantly since the last ANALYZE run), the optimizer may underestimate or overestimate cardinality and choose a suboptimal plan.
-- PostgreSQL: manually update statistics
ANALYZE orders;
-- Check when statistics were last collected
SELECT
schemaname,
tablename,
last_analyze,
last_autoanalyze,
n_live_tup,
n_dead_tup
FROM pg_stat_user_tables
WHERE tablename = 'orders';
If last_autoanalyze is days or weeks old on a heavily written table, autovacuum is not keeping up. Tune autovacuum_analyze_scale_factor and autovacuum_analyze_threshold for that specific table:
ALTER TABLE orders SET (
autovacuum_analyze_scale_factor = 0.01, -- Analyze after 1% of rows change (default 10%)
autovacuum_analyze_threshold = 1000
);
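To double-check that the per-table override took effect, the stored settings can be read back from the pg_class catalog:

```sql
-- Per-table autovacuum overrides are stored as reloptions
SELECT relname, reloptions
FROM pg_class
WHERE relname = 'orders';
-- reloptions should list the overrides,
-- e.g. {autovacuum_analyze_scale_factor=0.01,autovacuum_analyze_threshold=1000}
```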
Cause 5: The index doesn't match the query's column order (composite indexes)
-- Composite index on (user_id, status)
CREATE INDEX idx_orders_user_status ON orders(user_id, status);
-- Uses the index (leftmost prefix matched):
WHERE user_id = 42
WHERE user_id = 42 AND status = 'pending'
-- Does NOT use the index:
WHERE status = 'pending' -- 'status' is not the leftmost column
If your query filters only on the second column of a composite index, the B-tree cannot be searched efficiently, because its entries are sorted by user_id first. You either need a separate index on status, or the composite index needs to be reordered.
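Both remedies look like this (index names are illustrative):

```sql
-- Option 1: a separate single-column index for status-only filters
CREATE INDEX idx_orders_status ON orders(status);

-- Option 2: reorder the composite so status is leftmost; this serves
-- both status-only and status + user_id filters
CREATE INDEX idx_orders_status_user ON orders(status, user_id);
```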
Cause 6: The index exists but the table is tiny
The optimizer will almost always choose a sequential scan for tables under a few hundred rows, regardless of index availability. It's cheaper to read 3 heap pages sequentially than to traverse a B-tree, fetch row pointers, and then do random page reads. This is correct behavior — don't force index use on small tables.
The debugging sequence
When a query is slow despite having an index:
1. Run EXPLAIN ANALYZE — confirm whether the index is being used.
2. If you see a seq scan: check whether the WHERE clause applies any function to the indexed column.
3. Check selectivity — is the filtered value rare enough to justify an index scan?
4. Run ANALYZE <tablename> to refresh statistics, then re-run EXPLAIN ANALYZE.
5. Check for type mismatches between column type and query parameter.
6. If it is a composite index: verify the leftmost prefix rule is satisfied.
Most cases are explained by steps 1-4.
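The sequence above, as a minimal psql session (the orders table and status filter follow the running example; substitute your own query):

```sql
-- Step 1: confirm the plan
EXPLAIN ANALYZE SELECT * FROM orders WHERE status = 'pending';

-- Steps 2-3: if it's a seq scan, check your real query for wrapped
-- columns, then estimate how selective the filtered value is
SELECT status, count(*) FROM orders GROUP BY status;

-- Step 4: refresh statistics, then re-check the plan
ANALYZE orders;
EXPLAIN ANALYZE SELECT * FROM orders WHERE status = 'pending';
```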