The SQL Mistake That Looks Correct But Returns Wrong Data
by Eric Hanson, Backend Developer at Clean Systems Consulting
The query that passes review and corrupts your reports
A billing report has been running in production for eight months. Finance notices the totals are slightly off. Someone looks at the query, spots a LEFT JOIN that's multiplying rows before the SUM, and realizes the revenue figures have been overstated by 12-15% depending on the month. The query looked right. The results looked plausible. Nobody caught it until the numbers were audited.
SQL correctness bugs are worse than performance bugs because they're silent. A slow query is obvious. A query that returns wrong data can run for months before anyone notices — especially for aggregations where "plausible" is not the same as "correct."
Mistake 1: Aggregating after a multiplying JOIN
This is the most common SQL correctness bug in reporting queries:
-- Orders table: one row per order
-- Order items table: multiple rows per order
-- Tags table: multiple rows per order (order can have multiple tags)
SELECT
o.id,
SUM(oi.amount) AS order_total
FROM orders o
JOIN order_items oi ON o.id = oi.order_id
JOIN order_tags ot ON o.id = ot.order_id -- This multiplies rows!
WHERE o.created_at >= '2024-01-01'
GROUP BY o.id;
If an order has 3 items and 4 tags, the join produces 12 rows for that order (3 × 4). SUM(oi.amount) adds each item amount 4 times — once per tag. The total is 4x too large.
The fix: aggregate before joining, or use subqueries to pre-aggregate:
-- Aggregate items before joining to tags
WITH order_totals AS (
SELECT order_id, SUM(amount) AS total
FROM order_items
GROUP BY order_id
)
SELECT
o.id,
ot_agg.total AS order_total,
ARRAY_AGG(t.name) AS tags
FROM orders o
JOIN order_totals ot_agg ON o.id = ot_agg.order_id
LEFT JOIN order_tags otag ON o.id = otag.order_id
LEFT JOIN tags t ON otag.tag_id = t.id
WHERE o.created_at >= '2024-01-01'
GROUP BY o.id, ot_agg.total;
Whenever you have a one-to-many JOIN combined with an aggregation, trace the cardinality before writing the GROUP BY. Ask: how many rows does each join produce per parent row? If the answer is "many," you need to aggregate before or after the join, not during.
Mistake 2: INNER JOIN silently excluding records
-- Intended: total revenue for all users, including users with no orders
SELECT u.id, u.name, SUM(o.total) AS revenue
FROM users u
JOIN orders o ON u.id = o.user_id -- INNER JOIN excludes users with zero orders
GROUP BY u.id, u.name;
Users with no orders are dropped from the result. If you're building a "revenue by user including $0" report, this is silently wrong. The fix is a LEFT JOIN:
SELECT u.id, u.name, COALESCE(SUM(o.total), 0) AS revenue
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
GROUP BY u.id, u.name;
COALESCE(SUM(o.total), 0) converts the NULL that SUM returns for an all-NULL group into 0.
The bug is subtle because the query still returns correct data for users who have orders — the error only manifests for users with no orders. If all your test users have orders (as they usually do in dev/staging), you'll never see it.
Mistake 3: WHERE filters that should be ON conditions in outer joins
-- Intended: all orders, with discount info where available (discount is optional)
SELECT o.id, o.total, d.discount_amount
FROM orders o
LEFT JOIN discounts d ON o.id = d.order_id
WHERE d.campaign_id = 5; -- This converts the LEFT JOIN to an INNER JOIN!
The LEFT JOIN returns NULL for d columns when no discount exists. But the WHERE clause d.campaign_id = 5 filters out rows where d.campaign_id IS NULL — which includes all rows with no discount. The result is identical to an INNER JOIN.
When filtering on a LEFT-joined table's columns, the filter belongs in the ON clause, not WHERE:
-- Only bring in discounts from campaign 5 (still returns all orders)
SELECT o.id, o.total, d.discount_amount
FROM orders o
LEFT JOIN discounts d ON o.id = d.order_id AND d.campaign_id = 5;
Now orders without a campaign-5 discount still appear, with NULL for discount_amount.
Mistake 4: DISTINCT as a band-aid for a bad join
-- Slow, and masks the real problem
SELECT DISTINCT u.id, u.name
FROM users u
JOIN user_roles ur ON u.id = ur.user_id
JOIN permissions p ON ur.role_id = p.role_id
WHERE p.name = 'admin';
If DISTINCT is the only thing preventing duplicate users, there's a fan-out in the join (a user with multiple roles, each with the 'admin' permission, appears multiple times). DISTINCT hides this, but it also does a full sort or hash deduplication on the entire result set.
The explicit fix:
-- Use EXISTS or a subquery to check permission without fan-out
SELECT u.id, u.name
FROM users u
WHERE EXISTS (
SELECT 1 FROM user_roles ur
JOIN permissions p ON ur.role_id = p.role_id
WHERE ur.user_id = u.id AND p.name = 'admin'
);
EXISTS short-circuits on the first match — no fan-out, no deduplication needed.
Mistake 5: COUNT(*) vs COUNT(column) confusion
SELECT COUNT(*) AS total, COUNT(cancelled_at) AS cancelled_count
FROM orders;
COUNT(*) counts all rows including NULLs. COUNT(column) counts only rows where that column is NOT NULL. If cancelled_at is NULL for non-cancelled orders, COUNT(cancelled_at) correctly counts only cancelled orders. This is often the intended behavior — but developers who expect COUNT(column) to work like COUNT(*) get confused when the numbers differ.
The explicit and readable equivalent:
SELECT
COUNT(*) AS total_orders,
COUNT(*) FILTER (WHERE cancelled_at IS NOT NULL) AS cancelled_orders,
COUNT(*) FILTER (WHERE cancelled_at IS NULL) AS active_orders
FROM orders;
The verification habit
For any reporting query, generate a small test dataset where the expected result is manually calculable. Run the query. Compare. Pay special attention to edge cases: records with no children in a JOIN, records that match multiple join conditions, NULL values in filtered columns. If the query involves aggregation over multiple joined tables, trace the row cardinality through every join step before trusting the output.