How to Model Relationships in SQL Without Regretting It Later
by Eric Hanson, Backend Developer at Clean Systems Consulting
The relationship that seemed obvious
You're building a tagging system. Tags can be applied to articles, products, and users. The quick approach: one tags table, and a taggable_id + taggable_type column to point at whatever is being tagged. You've seen Rails call this "polymorphic associations." You ship it.
A year later, you can't add a foreign key constraint. Your database has orphaned tags pointing at deleted records. Querying "all articles with tag X" requires a WHERE taggable_type = 'Article' filter that can't use an index on taggable_id alone. And your ORM generates N+1 queries because it can't join polymorphically.
This is the cost of modeling a relationship by convention rather than structure.
One-to-many: the baseline
The standard foreign key relationship. Each order belongs to one user. Each user has many orders.
CREATE TABLE users (
id BIGINT PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
email VARCHAR(255) NOT NULL UNIQUE
);
CREATE TABLE orders (
id BIGINT PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
user_id BIGINT NOT NULL REFERENCES users(id) ON DELETE RESTRICT,
total DECIMAL(10,2) NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE INDEX idx_orders_user_id ON orders(user_id);
Two things worth being deliberate about:
ON DELETE behavior: RESTRICT prevents deleting a user who has orders (the safest default). CASCADE deletes orders when the user is deleted (risky without audit log). SET NULL nullifies the FK (only valid if user_id is nullable, which you should have a reason for). Choose explicitly — the default is engine-dependent and often wrong.
The index on the FK column: It's not automatic in PostgreSQL or MySQL. Without it, every lookup of orders by user requires a full table scan.
Many-to-many: use a junction table
Articles and tags. A article has many tags; a tag applies to many articles. The correct model is a junction table:
CREATE TABLE tags (
id BIGINT PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
name VARCHAR(100) NOT NULL UNIQUE
);
CREATE TABLE article_tags (
article_id BIGINT NOT NULL REFERENCES articles(id) ON DELETE CASCADE,
tag_id BIGINT NOT NULL REFERENCES tags(id) ON DELETE CASCADE,
PRIMARY KEY (article_id, tag_id)
);
CREATE INDEX idx_article_tags_tag_id ON article_tags(tag_id);
The composite primary key prevents duplicate associations. The index on tag_id makes "find all articles with this tag" fast. This is the pattern. Don't deviate from it unless you have a clear reason.
When to add columns to the junction table
Junction tables aren't just for foreign keys. They model the relationship itself, which sometimes has its own attributes:
-- A user can be a member of multiple organizations, with a role per membership
CREATE TABLE organization_members (
organization_id BIGINT NOT NULL REFERENCES organizations(id) ON DELETE CASCADE,
user_id BIGINT NOT NULL REFERENCES users(id) ON DELETE CASCADE,
role VARCHAR(50) NOT NULL DEFAULT 'member',
joined_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
PRIMARY KEY (organization_id, user_id)
);
The role and join date belong to the relationship (membership), not to either entity (organization or user). This is the right place for them.
The polymorphic association problem
The pattern:
CREATE TABLE comments (
id BIGINT PRIMARY KEY,
commentable_type VARCHAR(50), -- 'Article', 'Product', 'Video'
commentable_id BIGINT,
body TEXT
);
This breaks relational integrity. commentable_id = 42 means nothing without commentable_type. You can't add a foreign key constraint. You can't enforce referential integrity. You'll have orphaned comments when the referenced record is deleted.
The correct alternatives:
Separate junction tables (preferred for few types):
CREATE TABLE article_comments (
comment_id BIGINT REFERENCES comments(id) ON DELETE CASCADE,
article_id BIGINT REFERENCES articles(id) ON DELETE CASCADE,
PRIMARY KEY (comment_id)
);
CREATE TABLE product_comments (
comment_id BIGINT REFERENCES comments(id) ON DELETE CASCADE,
product_id BIGINT REFERENCES products(id) ON DELETE CASCADE,
PRIMARY KEY (comment_id)
);
Shared supertype table (for many types with shared attributes):
CREATE TABLE content_items (
id BIGINT PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
type VARCHAR(50) NOT NULL, -- discriminator
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE TABLE articles (
id BIGINT PRIMARY KEY REFERENCES content_items(id),
title TEXT NOT NULL
);
CREATE TABLE products (
id BIGINT PRIMARY KEY REFERENCES content_items(id),
price DECIMAL(10,2) NOT NULL
);
CREATE TABLE comments (
id BIGINT PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
content_item_id BIGINT NOT NULL REFERENCES content_items(id) ON DELETE CASCADE,
body TEXT NOT NULL
);
Now comments.content_item_id has a proper foreign key. Referential integrity is enforced. The tradeoff is one extra join to get the specific type's attributes.
Self-referential relationships
For hierarchical data (org charts, category trees, threaded comments):
CREATE TABLE categories (
id BIGINT PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
parent_id BIGINT REFERENCES categories(id) ON DELETE SET NULL,
name VARCHAR(255) NOT NULL
);
Querying hierarchies in standard SQL requires recursive CTEs:
WITH RECURSIVE category_tree AS (
SELECT id, name, parent_id, 0 AS depth
FROM categories WHERE parent_id IS NULL
UNION ALL
SELECT c.id, c.name, c.parent_id, ct.depth + 1
FROM categories c
JOIN category_tree ct ON c.parent_id = ct.id
)
SELECT * FROM category_tree ORDER BY depth, name;
For deeply nested or frequently traversed hierarchies, consider the Closure Table pattern or PostgreSQL's ltree extension, which provides path-based queries without recursive CTEs.
The decision you should make explicitly
Every relationship in your domain is either one-to-many (foreign key), many-to-many (junction table), or hierarchical (self-referential). Choose the right structure before writing any queries against it, and enforce referential integrity at the database level with foreign keys. Application-level enforcement alone is not sufficient — it breaks on direct database modifications, bulk imports, and bugs.