How to Model Relationships in SQL Without Regretting It Later

by Eric Hanson, Backend Developer at Clean Systems Consulting

The relationship that seemed obvious

You're building a tagging system. Tags can be applied to articles, products, and users. The quick approach: one tags table, and a taggable_id + taggable_type column to point at whatever is being tagged. You've seen Rails call this "polymorphic associations." You ship it.

A year later, you can't add a foreign key constraint. Your database has orphaned tags pointing at deleted records. Querying "all articles with tag X" requires a WHERE taggable_type = 'Article' filter that can't use an index on taggable_id alone. And your ORM generates N+1 queries because it can't join polymorphically.

This is the cost of modeling a relationship by convention rather than structure.

One-to-many: the baseline

The standard foreign key relationship. Each order belongs to one user. Each user has many orders.

CREATE TABLE users (
  id BIGINT PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
  email VARCHAR(255) NOT NULL UNIQUE
);

CREATE TABLE orders (
  id BIGINT PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
  user_id BIGINT NOT NULL REFERENCES users(id) ON DELETE RESTRICT,
  total DECIMAL(10,2) NOT NULL,
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE INDEX idx_orders_user_id ON orders(user_id);

Two things worth being deliberate about:

ON DELETE behavior: RESTRICT prevents deleting a user who has orders (the safest default). CASCADE deletes orders when the user is deleted (risky without audit log). SET NULL nullifies the FK (only valid if user_id is nullable, which you should have a reason for). Choose explicitly — the default is engine-dependent and often wrong.

The index on the FK column: It's not automatic in PostgreSQL or MySQL. Without it, every lookup of orders by user requires a full table scan.

Many-to-many: use a junction table

Articles and tags. A article has many tags; a tag applies to many articles. The correct model is a junction table:

CREATE TABLE tags (
  id BIGINT PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
  name VARCHAR(100) NOT NULL UNIQUE
);

CREATE TABLE article_tags (
  article_id BIGINT NOT NULL REFERENCES articles(id) ON DELETE CASCADE,
  tag_id BIGINT NOT NULL REFERENCES tags(id) ON DELETE CASCADE,
  PRIMARY KEY (article_id, tag_id)
);

CREATE INDEX idx_article_tags_tag_id ON article_tags(tag_id);

The composite primary key prevents duplicate associations. The index on tag_id makes "find all articles with this tag" fast. This is the pattern. Don't deviate from it unless you have a clear reason.

When to add columns to the junction table

Junction tables aren't just for foreign keys. They model the relationship itself, which sometimes has its own attributes:

-- A user can be a member of multiple organizations, with a role per membership
CREATE TABLE organization_members (
  organization_id BIGINT NOT NULL REFERENCES organizations(id) ON DELETE CASCADE,
  user_id BIGINT NOT NULL REFERENCES users(id) ON DELETE CASCADE,
  role VARCHAR(50) NOT NULL DEFAULT 'member',
  joined_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  PRIMARY KEY (organization_id, user_id)
);

The role and join date belong to the relationship (membership), not to either entity (organization or user). This is the right place for them.

The polymorphic association problem

The pattern:

CREATE TABLE comments (
  id BIGINT PRIMARY KEY,
  commentable_type VARCHAR(50),  -- 'Article', 'Product', 'Video'
  commentable_id BIGINT,
  body TEXT
);

This breaks relational integrity. commentable_id = 42 means nothing without commentable_type. You can't add a foreign key constraint. You can't enforce referential integrity. You'll have orphaned comments when the referenced record is deleted.

The correct alternatives:

Separate junction tables (preferred for few types):

CREATE TABLE article_comments (
  comment_id BIGINT REFERENCES comments(id) ON DELETE CASCADE,
  article_id BIGINT REFERENCES articles(id) ON DELETE CASCADE,
  PRIMARY KEY (comment_id)
);

CREATE TABLE product_comments (
  comment_id BIGINT REFERENCES comments(id) ON DELETE CASCADE,
  product_id BIGINT REFERENCES products(id) ON DELETE CASCADE,
  PRIMARY KEY (comment_id)
);

Shared supertype table (for many types with shared attributes):

CREATE TABLE content_items (
  id BIGINT PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
  type VARCHAR(50) NOT NULL,  -- discriminator
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE TABLE articles (
  id BIGINT PRIMARY KEY REFERENCES content_items(id),
  title TEXT NOT NULL
);

CREATE TABLE products (
  id BIGINT PRIMARY KEY REFERENCES content_items(id),
  price DECIMAL(10,2) NOT NULL
);

CREATE TABLE comments (
  id BIGINT PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
  content_item_id BIGINT NOT NULL REFERENCES content_items(id) ON DELETE CASCADE,
  body TEXT NOT NULL
);

Now comments.content_item_id has a proper foreign key. Referential integrity is enforced. The tradeoff is one extra join to get the specific type's attributes.

Self-referential relationships

For hierarchical data (org charts, category trees, threaded comments):

CREATE TABLE categories (
  id BIGINT PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
  parent_id BIGINT REFERENCES categories(id) ON DELETE SET NULL,
  name VARCHAR(255) NOT NULL
);

Querying hierarchies in standard SQL requires recursive CTEs:

WITH RECURSIVE category_tree AS (
  SELECT id, name, parent_id, 0 AS depth
  FROM categories WHERE parent_id IS NULL
  UNION ALL
  SELECT c.id, c.name, c.parent_id, ct.depth + 1
  FROM categories c
  JOIN category_tree ct ON c.parent_id = ct.id
)
SELECT * FROM category_tree ORDER BY depth, name;

For deeply nested or frequently traversed hierarchies, consider the Closure Table pattern or PostgreSQL's ltree extension, which provides path-based queries without recursive CTEs.

The decision you should make explicitly

Every relationship in your domain is either one-to-many (foreign key), many-to-many (junction table), or hierarchical (self-referential). Choose the right structure before writing any queries against it, and enforce referential integrity at the database level with foreign keys. Application-level enforcement alone is not sufficient — it breaks on direct database modifications, bulk imports, and bugs.

Scale Your Backend - Need an Experienced Backend Developer?

We provide backend engineers who join your team as contractors to help build, improve, and scale your backend systems.

We focus on clean backend design, clear documentation, and systems that remain reliable as products grow. Our goal is to strengthen your team and deliver backend systems that are easy to operate and maintain.

We work from our own development environments and support teams across US, EU, and APAC timezones. Our workflow emphasizes documentation and asynchronous collaboration to keep development efficient and focused.

  • Production Backend Experience. Experience building and maintaining backend systems, APIs, and databases used in production.
  • Scalable Architecture. Design backend systems that stay reliable as your product and traffic grow.
  • Contractor Friendly. Flexible engagement for short projects, long-term support, or extra help during releases.
  • Focus on Backend Reliability. Improve API performance, database stability, and overall backend reliability.
  • Documentation-Driven Development. Development guided by clear documentation so teams stay aligned and work efficiently.
  • Domain-Driven Design. Design backend systems around real business processes and product needs.

Tell us about your project

Our offices

  • Copenhagen
    1 Carlsberg Gate
    1260, København, Denmark
  • Magelang
    12 Jalan Bligo
    56485, Magelang, Indonesia

More articles

The Status Update That Keeps Clients Calm Without Wasting Your Time

A well-structured status update takes five minutes to write and saves hours of unnecessary back-and-forth. Most contractors write them too long, too rarely, or not at all.

Read more

How to Run Your Spring Boot App and Database Together With Docker Compose

Getting a Spring Boot application and PostgreSQL to start together correctly in Docker Compose requires more than just listing both services — you need health checks, proper dependency ordering, and connection URL configuration that works inside a container network.

Read more

The Unit Test That Passes Locally and Fails in CI Is a Design Problem

Tests that behave differently depending on where they run are not environment problems — they are design problems. The test is exposing hidden dependencies on the execution environment that the production code carries too.

Read more

Timeouts in Microservices: The Setting Most Developers Never Configure

Unconfigured or incorrectly configured timeouts are one of the most common root causes of cascading failures in microservices. Default HTTP client timeouts are designed for general-purpose use, not for production inter-service communication.

Read more