GROUP BY Is More Powerful Than Most Developers Use It For

March 15, 2026

by Arif Ikhsanudin, Backend Developer

The reporting query that should be one statement

You need a revenue report: total revenue by region, by product category, and a grand total — all in one result set. Your current solution is three queries, a JOIN in the application layer, and a Python loop to compute the grand total. The entire thing runs in 4 seconds. It could be one SQL query running in 400ms.

GROUP BY combined with SQL's grouping extensions (ROLLUP, CUBE, GROUPING SETS) can produce multi-dimensional aggregations in a single pass over the data. Most developers don't know these exist because they're not in the basic SQL tutorials.

The HAVING clause: filtering on aggregated values

WHERE filters rows before aggregation. HAVING filters on the result of aggregation — a distinction that confuses developers who try to put aggregate functions in WHERE and get an error.

-- Wrong: aggregate function in WHERE
SELECT user_id, SUM(total) AS revenue
FROM orders
WHERE SUM(total) > 1000   -- ERROR: aggregate functions not allowed in WHERE
GROUP BY user_id;

-- Right: aggregate condition in HAVING
SELECT user_id, SUM(total) AS revenue
FROM orders
GROUP BY user_id
HAVING SUM(total) > 1000;

HAVING runs after aggregation, so it can reference aggregate expressions. You can combine both:

-- Only high-value completed orders, grouped by user, showing only users with > $5k revenue
SELECT
  user_id,
  COUNT(*) AS order_count,
  SUM(total) AS total_revenue
FROM orders
WHERE status = 'completed'        -- Filters rows before grouping
GROUP BY user_id
HAVING SUM(total) > 5000          -- Filters groups after aggregation
ORDER BY total_revenue DESC;

ROLLUP: subtotals and grand totals

GROUP BY ROLLUP(a, b) generates all subtotal combinations: grouped by (a, b), grouped by (a) alone, and a grand total row. This replaces multiple queries with UNION ALL.

SELECT
  region,
  category,
  SUM(revenue) AS total_revenue,
  GROUPING(region) AS is_region_subtotal,    -- 1 if this row is a subtotal
  GROUPING(category) AS is_category_subtotal
FROM sales
GROUP BY ROLLUP(region, category)
ORDER BY region NULLS LAST, category NULLS LAST;

Result includes:

One row per (region, category) combination
One row per region (category = NULL), summing all categories in that region
One grand total row (region = NULL, category = NULL)

GROUPING() returns 1 when the column is rolled up (NULL due to subtotal), 0 otherwise. This lets you distinguish a row where category is genuinely NULL from a subtotal row.

CUBE: all combinations

GROUP BY CUBE(a, b, c) generates subtotals for all 2ⁿ combinations of the grouped columns. For three columns, that's 8 grouping sets.

SELECT
  year,
  quarter,
  region,
  SUM(revenue)
FROM sales
GROUP BY CUBE(year, quarter, region);
-- Generates: (year,quarter,region), (year,quarter), (year,region),
--            (quarter,region), (year), (quarter), (region), ()

CUBE is expensive on large datasets — it's doing multiple aggregation passes. Use it for fixed-dimension OLAP-style reports, not for high-traffic OLTP queries.

GROUPING SETS: explicit control

GROUPING SETS lets you specify exactly which combinations you want, without generating all possible combinations:

-- Only want: by region, by category, and grand total — not by (region, category)
SELECT
  region,
  category,
  SUM(revenue)
FROM sales
GROUP BY GROUPING SETS (
  (region),
  (category),
  ()         -- grand total
);

This replaces three separate queries with UNION ALL:

-- Equivalent (less efficient — three scans)
SELECT region, NULL as category, SUM(revenue) FROM sales GROUP BY region
UNION ALL
SELECT NULL, category, SUM(revenue) FROM sales GROUP BY category
UNION ALL
SELECT NULL, NULL, SUM(revenue) FROM sales;

The GROUPING SETS version reads the table once. The UNION ALL version reads it three times.

Conditional aggregation: pivoting rows to columns

A classic reporting requirement: show each month's revenue as a separate column for the current year.

SELECT
  product_id,
  SUM(CASE WHEN EXTRACT(MONTH FROM order_date) = 1  THEN total ELSE 0 END) AS jan,
  SUM(CASE WHEN EXTRACT(MONTH FROM order_date) = 2  THEN total ELSE 0 END) AS feb,
  SUM(CASE WHEN EXTRACT(MONTH FROM order_date) = 3  THEN total ELSE 0 END) AS mar,
  -- ... through December
  SUM(total) AS annual_total
FROM orders
WHERE EXTRACT(YEAR FROM order_date) = 2024
GROUP BY product_id;

One scan, one result set, one row per product. The alternative — 12 queries or a complex pivot in application code — is slower and harder to maintain.

The efficiency principle

All these features — ROLLUP, CUBE, GROUPING SETS, conditional aggregation — exploit the same underlying advantage: one table scan to produce multiple aggregation levels. The database processes data in blocks from storage; every additional query adds another full scan. Expressing multi-dimensional requirements in a single SQL statement is not just more elegant — it's faster by a constant factor equal to the number of queries it replaces.

Our offices

Follow us

GROUP BY Is More Powerful Than Most Developers Use It For

The reporting query that should be one statement

The HAVING clause: filtering on aggregated values

ROLLUP: subtotals and grand totals

CUBE: all combinations

GROUPING SETS: explicit control

Conditional aggregation: pivoting rows to columns

The efficiency principle

Scale Your Backend - Need an Experienced Backend Developer?

Tell us about your project

Our offices

More articles

Designing for Failure Is Not Optional in Distributed Systems

What Happens When You Accidentally Delete the Production Database

Java Optional — What It's For, What It's Not For, and How to Use It Well

The Difference Between a Message Queue and an Event Stream