Writing SQL That Still Makes Sense Six Months Later

by Arif Ikhsanudin, Backend Developer

The query you wrote in a hurry

You're six months into a project. A bug report comes in about incorrect revenue totals. You open the query responsible and see 80 lines of SQL with five subqueries, aliases like t1, t2, and x, and a comment that says -- TODO: clean this up. The person who wrote it was you. You have no memory of why any of it exists.

SQL readability is underinvested across the industry. Developers who would never write a 200-line method with no comments routinely write opaque queries that are impossible to debug under pressure. The database doesn't care about formatting. Future you does.

Formatting is not optional

Consistent formatting is the baseline. It costs nothing in runtime and makes queries scannable by someone under pressure.

-- Hard to scan
SELECT u.id,u.name,SUM(o.total) as revenue FROM users u LEFT JOIN orders o ON u.id=o.user_id WHERE u.created_at>'2024-01-01' AND o.status='completed' GROUP BY u.id,u.name HAVING SUM(o.total)>1000 ORDER BY revenue DESC;

-- Readable
SELECT
  u.id,
  u.name,
  SUM(o.total) AS revenue
FROM users u
LEFT JOIN orders o
  ON u.id = o.user_id
WHERE
  u.created_at > '2024-01-01'
  AND o.status = 'completed'
GROUP BY
  u.id,
  u.name
HAVING SUM(o.total) > 1000
ORDER BY revenue DESC;

Each clause gets its own line. Each selected column gets its own line. JOIN conditions are indented under the JOIN. This sounds obvious. It is rarely practiced.

Name things like you'll read them cold

Aliases are the biggest readability debt in SQL. t1, a, sub — these are meaningless. Use abbreviated but descriptive aliases:

-- Meaningless
SELECT t1.id, t2.name, t3.amount
FROM orders t1
JOIN users t2 ON t1.user_id = t2.id
JOIN invoices t3 ON t1.invoice_id = t3.id;

-- Self-documenting
SELECT ord.id, usr.name, inv.amount
FROM orders ord
JOIN users usr ON ord.user_id = usr.id
JOIN invoices inv ON ord.invoice_id = inv.id;

Column aliases should reflect what the value means in business terms, not just what the expression is:

-- Expression alias (describes the computation)
SUM(line_item.quantity * line_item.unit_price) AS total_line_item_revenue

-- Not just
SUM(li.qty * li.price) AS s

Use CTEs to break complex logic into named steps

Common Table Expressions (CTEs) — the WITH clause — let you assign a name to an intermediate result. This is the single highest-leverage readability tool in SQL.

-- Without CTEs: one giant query, hard to reason about
SELECT
  u.name,
  cohort_data.first_order_date,
  revenue_data.total_revenue
FROM users u
JOIN (
  SELECT user_id, MIN(created_at) AS first_order_date
  FROM orders
  GROUP BY user_id
) cohort_data ON u.id = cohort_data.user_id
JOIN (
  SELECT user_id, SUM(total) AS total_revenue
  FROM orders
  WHERE status = 'completed'
  GROUP BY user_id
) revenue_data ON u.id = revenue_data.user_id;

-- With CTEs: each step has a name and a purpose
WITH user_first_orders AS (
  SELECT
    user_id,
    MIN(created_at) AS first_order_date
  FROM orders
  GROUP BY user_id
),
user_revenue AS (
  SELECT
    user_id,
    SUM(total) AS total_revenue
  FROM orders
  WHERE status = 'completed'
  GROUP BY user_id
)
SELECT
  u.name,
  ufo.first_order_date,
  ur.total_revenue
FROM users u
JOIN user_first_orders ufo ON u.id = ufo.user_id
JOIN user_revenue ur ON u.id = ur.user_id;

The CTE version is slightly longer. It is far easier to debug, modify, and hand off to someone else.

One caveat: some database engines (MySQL pre-8.0, older versions of SQL Server) materialize CTEs as temp tables, which can cause performance regressions if the optimizer can't push predicates into them. In PostgreSQL, CTEs prior to version 12 were optimization fences by default. Know your engine's behavior before using deeply nested CTEs in hot paths.

Comment the why, not the what

SQL comments should explain the business reason for a filter or join, not describe what the clause does syntactically.

-- Bad comment (describes syntax)
-- Join to orders table
JOIN orders o ON u.id = o.user_id

-- Good comment (explains business reason)
-- Include only users who have placed at least one order; users without orders
-- are handled by a separate onboarding flow and excluded from revenue reports
JOIN orders o ON u.id = o.user_id

If there's a WHERE clause that filters out a specific edge case, say why. Future maintainers will thank you when they're tempted to remove it.

Organize query files like code

If you're storing queries in files (in a /sql directory, in migration files, in a reporting repo), treat them like code:

  • One query per file with a descriptive filename (monthly_cohort_revenue_by_region.sql)
  • A header comment block with: what the query returns, who uses it, any known edge cases
  • Avoid copy-paste variants of the same query — refactor shared logic into views or CTEs

The practical takeaway

Find the most complex query your team owns — the one that's been modified five times without ever being refactored. Rewrite it using CTEs, explicit column lists, and descriptive aliases. Don't change the logic. Just restructure for readability. Run both versions through EXPLAIN to confirm the plans match. Then commit the readable version. You'll spend 30 minutes now and save hours later when something breaks at 11pm.

Scale Your Backend - Need an Experienced Backend Developer?

We provide backend engineers who join your team as contractors to help build, improve, and scale your backend systems.

We focus on clean backend design, clear documentation, and systems that remain reliable as products grow. Our goal is to strengthen your team and deliver backend systems that are easy to operate and maintain.

We work from our own development environments and support teams across US, EU, and APAC timezones. Our workflow emphasizes documentation and asynchronous collaboration to keep development efficient and focused.

  • Production Backend Experience. Experience building and maintaining backend systems, APIs, and databases used in production.
  • Scalable Architecture. Design backend systems that stay reliable as your product and traffic grow.
  • Contractor Friendly. Flexible engagement for short projects, long-term support, or extra help during releases.
  • Focus on Backend Reliability. Improve API performance, database stability, and overall backend reliability.
  • Documentation-Driven Development. Development guided by clear documentation so teams stay aligned and work efficiently.
  • Domain-Driven Design. Design backend systems around real business processes and product needs.

Tell us about your project

Our offices

  • Copenhagen
    1 Carlsberg Gate
    1260, København, Denmark
  • Magelang
    12 Jalan Bligo
    56485, Magelang, Indonesia

More articles

If Your API Needs a Long Explanation It Is Probably Too Complex

An API that requires extensive documentation to use is an API whose complexity has been transferred to the consumer. Simplicity is a design goal, not a constraint.

Read more

Observability Is Not Just Logging. Here Is What You Are Missing.

Logs tell you what happened in one service. Metrics tell you how a service is performing over time. Traces tell you how a request traveled across services. You need all three, connected, to understand a distributed system.

Read more

Getting Paid on Time Is a System. Here Is How to Build One.

Late payments do not happen because clients are malicious — they happen because the contractor never built a system that made paying on time the path of least resistance.

Read more

Why Your API Returns 200 Even When Something Goes Wrong

Returning HTTP 200 for failed operations hides errors, breaks client logic, and makes systems harder to debug. Using proper status codes is not pedantry—it’s critical for correctness and reliability.

Read more