Database Indexing in Rails — What I Check Before Every Deploy

by Eric Hanson, Backend Developer at Clean Systems Consulting

Why indexes get missed

Rails makes it easy to add columns and forget indexes. add_column :orders, :status, :string works perfectly. Queries against it work perfectly — until the table has 500k rows and the WHERE status = 'pending' query in your background job goes from 2ms to 8 seconds because it's doing a full sequential scan.

The reasons indexes get missed are structural: the developer adding the column is thinking about the feature, not the query plan. The feature works correctly in staging with a small dataset. The index is only discovered missing when production performance degrades enough to trigger an alert.

The checklist below runs before a PR merges, when the migration is visible and the cost of fixing it is zero.

Check 1: Every foreign key has an index

Foreign keys without indexes are the most common missing-index problem in Rails codebases. When you query order.line_items, Rails executes SELECT * FROM line_items WHERE order_id = ?. Without an index on order_id, this is a sequential scan of the entire line_items table for every order you access.

Rails does not create indexes on foreign key columns automatically. You must add them explicitly. If you use add_reference, the index is optional and off by default in older Rails versions:

# Adds foreign key column but NO index unless specified
add_reference :line_items, :order

# Correct — explicit index
add_reference :line_items, :order, index: true

# Or separately
add_column :line_items, :order_id, :bigint, null: false
add_index  :line_items, :order_id
add_foreign_key :line_items, :orders

In Rails 5+, add_reference creates the index by default, so the explicit index: true matters mainly on Rails 4.x — and note that foreign_key: true adds the constraint, not the index. Either way, trust your actual schema.rb over the migration: the index appears as add_index :line_items, :order_id and is visually distinct from the column declaration.

The audit query for missing foreign key indexes in PostgreSQL:

SELECT
  tc.table_name,
  kcu.column_name
FROM information_schema.table_constraints tc
JOIN information_schema.key_column_usage kcu
  ON tc.constraint_name = kcu.constraint_name
LEFT JOIN pg_indexes pi
  ON pi.tablename = tc.table_name
  AND pi.indexdef LIKE '%' || kcu.column_name || '%'
WHERE tc.constraint_type = 'FOREIGN KEY'
  AND pi.indexname IS NULL;

Run this against your production database. Every row in the result is a candidate missing index that may be causing sequential scans — the LIKE match on indexdef is approximate and can mismatch similarly named columns, so confirm each hit before writing the migration.

Check 2: Columns used in WHERE, ORDER BY, or JOIN have indexes

Foreign keys are the automatic case. The less obvious case: any column that appears in a frequent WHERE, ORDER BY, or JOIN ON clause needs an index if the table is large or the query is frequent.

Walk through the new queries introduced in a PR. For each where, order, and joins call, verify the column is indexed. The definitive test is EXPLAIN ANALYZE against production-scale data, but the pre-deploy check is simpler: grep the query and check schema.rb:

# New scope added in this PR
scope :recently_active, -> { where("last_active_at > ?", 7.days.ago).order(:last_active_at) }

The WHERE and the ORDER BY reference the same column, so a single add_index :users, :last_active_at covers both.
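That schema.rb check can be scripted. A rough sketch — the helper name and regex are mine, not a real Rails API — that scans schema.rb text for a standalone add_index whose leading column matches:

```ruby
# Hypothetical audit helper: does schema.rb declare a standalone
# add_index on this table whose leading column is the one we care
# about? (Ignores the t.index form inside create_table blocks —
# a real audit would parse those too.)
def indexed?(schema_text, table, column)
  schema_text.each_line.any? do |line|
    m = line.match(/add_index\s+"#{Regexp.escape(table)}",\s*\[([^\]]*)\]/)
    next false unless m
    leading = m[1].split(",").first.to_s.gsub(/["'\s]/, "")
    leading == column
  end
end

schema = <<~SCHEMA
  add_index "users", ["last_active_at"], name: "index_users_on_last_active_at"
  add_index "orders", ["user_id", "created_at"]
SCHEMA

indexed?(schema, "users", "last_active_at")  # => true
indexed?(schema, "orders", "created_at")     # => false (not the leading column)
```

Only the leading column counts as a match, which anticipates the composite-index prefix rule in the next check.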

The column types that most frequently miss indexes: status columns (string enums that appear in where clauses constantly), created_at and updated_at when used for range queries, user_id on any join table that isn't the primary model, and any column used as a group-by key in reporting queries.

Check 3: Composite index column order matches query patterns

A composite index is only useful if the query uses a prefix of the indexed columns. An index on (user_id, status) satisfies these queries:

  • WHERE user_id = ?
  • WHERE user_id = ? AND status = ?

But not:

  • WHERE status = ? — the index's leading column (user_id) isn't constrained, so the planner can't use the index

Column order in a composite index follows the most-specific-first rule: the column used in equality conditions goes first, the column used in range conditions or ordering goes last:

# For queries: WHERE user_id = ? ORDER BY created_at DESC
add_index :orders, [:user_id, :created_at]

# For queries: WHERE user_id = ? AND status = ?
add_index :orders, [:user_id, :status]

# Not this — status alone won't use the index
add_index :orders, [:status, :user_id]  # only if queries lead with status

Verify each composite index against actual query patterns. An index with wrong column order is present in schema.rb and looks correct but provides no benefit for the queries that need it.
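The prefix rule is mechanical enough to express as a tiny predicate. A sketch with a made-up helper, not a real planner:

```ruby
# Hypothetical helper: count how many leading columns of a composite
# index are pinned by equality conditions. The planner can reorder
# equality predicates, so the order inside the WHERE doesn't matter —
# only which columns are constrained. The index helps only when > 0.
def usable_prefix(index_cols, equality_cols)
  index_cols.take_while { |col| equality_cols.include?(col) }.length
end

usable_prefix([:user_id, :status], [:user_id])           # => 1
usable_prefix([:user_id, :status], [:status])            # => 0  (leading column unpinned)
usable_prefix([:user_id, :status], [:status, :user_id])  # => 2
```

A query that doesn't pin the leading column scores zero — exactly the (status, user_id) mistake flagged above.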

Check 4: Partial indexes for selective columns

A partial index indexes only rows matching a condition. For columns where most rows have a common value you never query for, a partial index is smaller, faster to build, and faster to scan than a full index:

# Most orders are completed — you only query pending ones
add_index :orders, :created_at, where: "status = 'pending'", name: "idx_orders_pending_created_at"

# Soft-delete pattern — you almost never query deleted records
add_index :users, :email, where: "deleted_at IS NULL", name: "idx_users_active_email"

Partial indexes are the right default for soft-delete tables, status-based filtering, and any column where the "interesting" subset is a small fraction of total rows.

The caveat: PostgreSQL uses a partial index only when it can prove the query's WHERE clause implies the index's WHERE clause. WHERE status = 'pending' uses the partial index; WHERE status != 'completed' does not, even if the result sets are similar.

Check 5: Unique indexes enforce uniqueness at the database level

Model-level uniqueness validations (validates :email, uniqueness: true) are not sufficient. They're checked in Ruby, and between the check and the write, another process can insert the same value — the classic time-of-check-to-time-of-use (TOCTOU) race condition. Under concurrent load, this happens regularly.

The database-level enforcement is a unique index:

add_index :users, :email, unique: true

Without this, validates uniqueness: true is a best-effort check, not a guarantee. Every uniqueness validation in your models should have a corresponding unique index in schema.rb.
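The race is easy to reproduce in miniature. A toy sketch — EmailTable and DuplicateKeyError are mine, not ActiveRecord — showing two requests that both pass the validation before either writes:

```ruby
require "set"

class DuplicateKeyError < StandardError; end

# Toy stand-in for a table with a unique index: the constraint fires
# at write time, while the "validation" only reads current state.
class EmailTable
  def initialize
    @emails = Set.new
  end

  def exists?(email)          # what a model-level validation checks
    @emails.include?(email)
  end

  def insert!(email)          # what a unique index enforces
    raise DuplicateKeyError, email if @emails.include?(email)
    @emails << email
  end
end

table = EmailTable.new

# Two concurrent requests both validate against the same pre-insert state:
request_a_valid = !table.exists?("bob@example.com")   # true
request_b_valid = !table.exists?("bob@example.com")   # true — stale check
table.insert!("bob@example.com") if request_a_valid

duplicate_blocked = false
begin
  table.insert!("bob@example.com") if request_b_valid
rescue DuplicateKeyError
  duplicate_blocked = true    # only the write-time constraint caught it
end
```

In Rails, the analogous write-time signal is ActiveRecord::RecordNotUnique raised from the unique index; the validation alone never sees the race.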

For compound uniqueness — a user can only have one active subscription per plan:

add_index :subscriptions, [:user_id, :plan_id], unique: true, where: "status = 'active'"

The partial unique index enforces the constraint only on active subscriptions, allowing cancelled subscriptions to exist with the same (user_id, plan_id) combination.

Check 6: Indexes on polymorphic associations

Polymorphic associations need composite indexes on both the type and id columns:

# Polymorphic belongs_to
belongs_to :commentable, polymorphic: true
# Generates: commentable_type (string) + commentable_id (bigint)

An index on commentable_id alone is not useful — the query always filters by both type and id:

SELECT * FROM comments WHERE commentable_type = 'Post' AND commentable_id = 42

The index must cover both:

add_index :comments, [:commentable_type, :commentable_id]

add_reference with polymorphic: true and index: true does this correctly. If you added the columns manually, verify both columns appear in a single composite index, not two separate single-column indexes.

The schema.rb audit before merging

The migration checklist above can be partially automated. The lol_dba gem analyzes your models and schema to detect missing indexes on foreign keys and commonly queried columns:

rails db:find_indexes

It produces a list of suggested add_index calls based on belongs_to declarations and query patterns it can infer from associations. It's not comprehensive — it can't see all query patterns — but it catches the systematic gaps.

The DatabaseConsistency gem goes further: it checks that uniqueness validations have corresponding unique indexes, that null: false columns have presence validations, and that foreign key constraints exist for belongs_to associations. Run it in CI:

bundle exec database_consistency

A clean database_consistency run means your schema and model validations are in sync. It doesn't replace reviewing new migrations manually, but it catches the gaps that slip through.

The deploy-time check that's too late

EXPLAIN ANALYZE is not a pre-deploy check — it requires production-scale data to be meaningful. An EXPLAIN on a staging database with 1,000 rows shows a sequential scan taking 0.5ms and reports no problem. The same query on production with 5 million rows shows a sequential scan taking 12 seconds.

For this reason, the pre-deploy checklist focuses on structural correctness: are the columns that will be queried indexed? The performance verification — EXPLAIN ANALYZE in production — happens after deploy but before the feature carries significant traffic. Use pg_stat_statements to find the slowest queries by total execution time after a feature has been live for 24 hours:

SELECT query, calls, mean_exec_time, total_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 20;

This surfaces the queries worth optimizing next. The pre-deploy check prevents the obvious misses. The post-deploy monitoring finds the subtle ones.

