Window Functions: The SQL Feature That Changes How You Think About Data

by Eric Hanson, Backend Developer at Clean Systems Consulting

The query that requires a self-join until it doesn't

You need to rank customers by revenue within each region. Without window functions, you write a self-join: join the orders table to itself, count how many customers in the same region have higher revenue, add one. The query is convoluted, slow, and hard to maintain. Someone adds a window function and the query becomes three lines. This is the moment window functions click for most developers.

Window functions compute a value for each row based on a set of related rows — the "window" — without aggregating the rows into one. The output has the same number of rows as the input. Each row keeps its own data plus the computed window value.

The OVER clause: defining the window

The OVER clause is what makes a function a window function:

-- Regular aggregate: one row per group
SELECT department_id, AVG(salary) FROM employees GROUP BY department_id;

-- Window function: one row per employee, with department average alongside
SELECT
  employee_id,
  name,
  salary,
  AVG(salary) OVER (PARTITION BY department_id) AS dept_avg_salary,
  salary - AVG(salary) OVER (PARTITION BY department_id) AS diff_from_avg
FROM employees;

PARTITION BY divides rows into groups (like GROUP BY) but without collapsing them. ORDER BY within the OVER clause defines row order within the partition — critical for ranking and cumulative functions.

Ranking functions

SELECT
  user_id,
  revenue,
  -- ROW_NUMBER: unique rank, no ties (arbitrary tiebreak)
  ROW_NUMBER() OVER (ORDER BY revenue DESC) AS row_num,

  -- RANK: same rank for ties, gaps after ties (1, 1, 3, 4)
  RANK() OVER (ORDER BY revenue DESC) AS rank,

  -- DENSE_RANK: same rank for ties, no gaps (1, 1, 2, 3)
  DENSE_RANK() OVER (ORDER BY revenue DESC) AS dense_rank,

  -- Percentile rank (0 to 1)
  PERCENT_RANK() OVER (ORDER BY revenue DESC) AS pct_rank
FROM user_revenue;

Ranking within a partition (e.g., top customer per region):

WITH ranked AS (
  SELECT
    user_id,
    region,
    revenue,
    ROW_NUMBER() OVER (PARTITION BY region ORDER BY revenue DESC) AS rank_in_region
  FROM user_revenue
)
SELECT user_id, region, revenue
FROM ranked
WHERE rank_in_region = 1;

This replaces a self-join or a correlated subquery. It's readable, and it runs in one pass over the data.

Running totals and cumulative aggregations

SELECT
  order_date,
  daily_revenue,
  SUM(daily_revenue) OVER (ORDER BY order_date) AS cumulative_revenue,
  AVG(daily_revenue) OVER (
    ORDER BY order_date
    ROWS BETWEEN 6 PRECEDING AND CURRENT ROW   -- 7-day moving average
  ) AS seven_day_avg
FROM daily_revenue_summary
ORDER BY order_date;

The ROWS BETWEEN clause defines the frame — which rows around the current row to include in the window. Without it, the default frame is "all rows from the partition start to the current row" for ordered windows, which is what gives you the running total.

Frame specifications:

  • ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW: from partition start to current row (running total)
  • ROWS BETWEEN 6 PRECEDING AND CURRENT ROW: current row and 6 preceding (7-row sliding window)
  • ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING: entire partition (same as partition aggregate)

LAG and LEAD: comparing to adjacent rows

Without window functions, comparing a row to the previous row requires a self-join on offset row numbers — awkward and slow. LAG and LEAD do this directly:

SELECT
  order_date,
  daily_revenue,
  LAG(daily_revenue, 1) OVER (ORDER BY order_date) AS prev_day_revenue,
  daily_revenue - LAG(daily_revenue, 1) OVER (ORDER BY order_date) AS day_over_day_change,
  LEAD(daily_revenue, 7) OVER (ORDER BY order_date) AS revenue_next_week
FROM daily_revenue_summary;

LAG(expr, n) returns the value of expr from n rows before the current row in the ordered window. LEAD(expr, n) returns n rows after. Both accept a default value for when the offset is out of bounds.

NTILE: bucketing rows into groups

-- Divide customers into 4 revenue quartiles
SELECT
  user_id,
  revenue,
  NTILE(4) OVER (ORDER BY revenue DESC) AS quartile
FROM user_revenue;
-- Quartile 1: top 25%, Quartile 4: bottom 25%

FIRST_VALUE and LAST_VALUE: anchored window values

SELECT
  user_id,
  order_date,
  total,
  FIRST_VALUE(total) OVER (
    PARTITION BY user_id
    ORDER BY order_date
    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
  ) AS first_order_total,
  LAST_VALUE(total) OVER (
    PARTITION BY user_id
    ORDER BY order_date
    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
  ) AS latest_order_total
FROM orders;

Note: LAST_VALUE requires the explicit ROWS BETWEEN ... UNBOUNDED FOLLOWING frame — the default frame stops at the current row, so without it, LAST_VALUE returns the current row's value.

Performance characteristics

Window functions are executed after the WHERE and GROUP BY clauses but before the final ORDER BY and LIMIT. They're applied in a single pass over the result set. For large result sets with multiple OVER clauses using different PARTITION BY combinations, the database may need multiple sort operations. Covering indexes that match the PARTITION BY and ORDER BY columns can eliminate these sorts.

The practical improvement over self-joins is substantial. A self-join on a 1M-row table for row-comparison purposes requires a full join product before filtering — often O(n²) in the worst case. A window function processes rows in a single ordered pass — O(n log n) for the sort, O(n) for the computation.

Scale Your Backend - Need an Experienced Backend Developer?

We provide backend engineers who join your team as contractors to help build, improve, and scale your backend systems.

We focus on clean backend design, clear documentation, and systems that remain reliable as products grow. Our goal is to strengthen your team and deliver backend systems that are easy to operate and maintain.

We work from our own development environments and support teams across US, EU, and APAC timezones. Our workflow emphasizes documentation and asynchronous collaboration to keep development efficient and focused.

  • Production Backend Experience. Experience building and maintaining backend systems, APIs, and databases used in production.
  • Scalable Architecture. Design backend systems that stay reliable as your product and traffic grow.
  • Contractor Friendly. Flexible engagement for short projects, long-term support, or extra help during releases.
  • Focus on Backend Reliability. Improve API performance, database stability, and overall backend reliability.
  • Documentation-Driven Development. Development guided by clear documentation so teams stay aligned and work efficiently.
  • Domain-Driven Design. Design backend systems around real business processes and product needs.

Tell us about your project

Our offices

  • Copenhagen
    1 Carlsberg Gate
    1260, København, Denmark
  • Magelang
    12 Jalan Bligo
    56485, Magelang, Indonesia

More articles

Confessions of a Junior Contractor: Learning the Hard Way

Being a junior contractor isn’t just about coding—it’s about surviving mistakes, awkward emails, and learning faster than you thought possible. Here’s what I learned the hard way.

Read more

Amazon and Microsoft Pay US Salaries in Vancouver — Local Startups Are Competing in the Wrong Currency

Vancouver's tech giants pay their engineers in US dollars at US rates. Canadian startups are making offers in a currency that's already at a structural disadvantage.

Read more

Why Seoul's Startup Scene Is Thriving But Its Backend Talent Is Locked Up in Chaebols

Seoul's startup ecosystem has real momentum. The backend engineers who could staff it are mostly somewhere else.

Read more

Why Contractors Thrive When Given Autonomy, Not Office Orders

“Just be in the office from 9 to 6 and follow our process.” That’s usually where contractor performance starts losing its edge.

Read more