Your Transactions Are Bigger Than They Need to Be

by Eric Hanson, Backend Developer at Clean Systems Consulting

The transaction that does too much

You're reviewing a checkout flow and find one database transaction that: validates the cart, locks inventory rows, creates an order, creates order items, charges a payment (via an internal stored procedure that calls an external service), updates inventory, and fires a confirmation-email trigger. The payment step can take 1–3 seconds, and the transaction holds row locks on inventory for that entire time.

Every other user trying to purchase those same items queues behind the inventory locks for up to 3 seconds per checkout. At 50 concurrent checkouts, you have a lock queue problem. The fix is not more database capacity — it's narrowing the transaction scope.

What belongs inside a transaction

A transaction should contain only the operations that must be atomic together. Ask: if some of these operations commit and the rest fail, is the data left in an invalid state that can't be recovered? If yes, they belong in one transaction. If not, they probably don't.

Correct reason: debiting account A and crediting account B must be atomic
Incorrect reason: "they're related, so they should be in one transaction"
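The debit/credit case makes the test concrete. A minimal sketch, using sqlite3 as a stand-in for any database; the `transfer` function and `accounts` table are illustrative, not from any particular codebase:

```python
# If the debit commits but the credit fails, money disappears — so both
# writes (and the funds check between them) share one transaction.
import sqlite3

def transfer(conn: sqlite3.Connection, src: int, dst: int, amount: int) -> None:
    with conn:  # opens a transaction; commits on success, rolls back on error
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?",
            (amount, src),
        )
        cur = conn.execute("SELECT balance FROM accounts WHERE id = ?", (src,))
        if cur.fetchone()[0] < 0:
            # Raising here rolls the debit back too — no partial state survives.
            raise ValueError("insufficient funds")
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?",
            (amount, dst),
        )
```

Note that the transaction contains only the two writes and the check that makes them safe; nothing slow lives inside it.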

The inventory example, restructured:

Transaction 1 (fast, ~10ms):
  - Validate cart items exist and have sufficient stock
  - Reserve inventory (UPDATE stock SET reserved = reserved + qty WHERE ...)
  - Create order record in 'pending' state
  - Create order items

External call (outside any transaction):
  - Charge payment (~1–3 seconds)

Transaction 2 (fast, ~5ms):
  - If payment succeeded: update order to 'confirmed', decrement actual stock
  - If payment failed: update order to 'failed', release inventory reservation

Now inventory row locks are held for roughly 10ms per transaction instead of 3 seconds, and checkout throughput is no longer bounded by payment latency.
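The restructured flow can be sketched end to end. This is an illustration, not a production implementation: sqlite3 stands in for the database, and `checkout`, `charge_payment`, and the `stock`/`orders` tables are assumed names:

```python
# Two short transactions bracket the slow external call; no locks are held
# while the payment service is in flight.
import sqlite3

def checkout(conn, product_id, qty, charge_payment):
    # Transaction 1 (fast): reserve stock, create a pending order.
    with conn:
        cur = conn.execute(
            "UPDATE stock SET reserved = reserved + ? "
            "WHERE product_id = ? AND on_hand - reserved >= ?",
            (qty, product_id, qty),
        )
        if cur.rowcount == 0:
            raise ValueError("insufficient stock")
        cur = conn.execute(
            "INSERT INTO orders (product_id, qty, status) VALUES (?, ?, 'pending')",
            (product_id, qty),
        )
        order_id = cur.lastrowid

    # External call: no transaction open during the 1-3 second charge.
    paid = charge_payment(order_id)

    # Transaction 2 (fast): settle the reservation either way.
    with conn:
        if paid:
            conn.execute(
                "UPDATE stock SET reserved = reserved - ?, on_hand = on_hand - ? "
                "WHERE product_id = ?",
                (qty, qty, product_id),
            )
            conn.execute("UPDATE orders SET status = 'confirmed' WHERE id = ?",
                         (order_id,))
        else:
            conn.execute("UPDATE stock SET reserved = reserved - ? WHERE product_id = ?",
                         (qty, product_id))
            conn.execute("UPDATE orders SET status = 'failed' WHERE id = ?",
                         (order_id,))
    return order_id
```

The guard in the reservation UPDATE (`on_hand - reserved >= qty`) is what makes transaction 1 safe under concurrency: two simultaneous checkouts cannot both reserve the last unit.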

The anti-patterns that inflate transaction scope

Including network calls inside transactions

# Wrong: HTTP call inside an open transaction
with db.transaction():
    order = Order.create(user_id=user.id, total=cart.total)
    payment_result = payment_service.charge(user.card, cart.total)  # 1–3 seconds
    if payment_result.success:
        order.update(status='confirmed')

The database transaction holds locks for the duration of the HTTP call. If the payment service is slow or times out, your transaction stays open, locks and all, until a query timeout fires or the connection drops.

Fetching data inside a write transaction when it's not needed

# Wrong: read-only lookup inside the write transaction
with db.transaction():
    product = Product.find(product_id)     # Just a read
    price = PricingRule.calculate(product) # Just a read, may be slow
    Order.create(product=product, price=price)  # The actual write

The reads don't need to be inside the transaction unless you need to lock the rows for consistency (e.g., you need to ensure the price doesn't change between reading it and using it — in which case use SELECT FOR UPDATE explicitly). In most cases, reads can happen before the transaction opens.
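Hoisting the reads out looks like this. A sketch only: sqlite3 replaces the article's ORM-style names, and the `products` table, markup factor, and `create_order` helper are illustrative:

```python
# Reads first, then a write transaction that holds locks for milliseconds.
import sqlite3

def create_order(conn, product_id):
    # Reads: no transaction open, no locks held during slow lookups.
    name, base_price = conn.execute(
        "SELECT name, base_price FROM products WHERE id = ?", (product_id,)
    ).fetchone()
    price = round(base_price * 1.2, 2)  # stand-in for a slow pricing calculation

    # Write transaction: only the insert.
    with conn:
        cur = conn.execute(
            "INSERT INTO orders (product_id, price) VALUES (?, ?)",
            (product_id, price),
        )
        return cur.lastrowid
```

The trade-off is explicit: the price computed outside the transaction may be stale by commit time. If that matters, lock the row (SELECT FOR UPDATE) and accept the longer hold; if it doesn't, keep the reads out.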

Batch operations in a single transaction

# Wrong: one transaction for all 100,000 records
with db.transaction():
    for record in large_dataset:
        Record.create(record)

If this fails at record 80,000, everything inserted so far rolls back. The transaction is open for the full duration of the batch, which might be minutes: every row it writes stays locked until commit, replication lag can accumulate, and the transaction's open snapshot blocks VACUUM from reclaiming dead rows for as long as it runs.

Batch in chunks:

from itertools import islice

def chunks(iterable, size):
    it = iter(iterable)
    while batch := list(islice(it, size)):
        yield batch

BATCH_SIZE = 500
for chunk in chunks(large_dataset, BATCH_SIZE):
    with db.transaction():
        for record in chunk:
            Record.create(record)
    # Checkpoint: if this fails, restart from the last successful chunk

500-row transactions commit every few hundred milliseconds. WAL is flushed incrementally. Replication stays current. A failure loses at most one batch.
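The checkpoint comment can be made concrete by committing a progress marker inside the same transaction as each chunk, so a restart resumes exactly where the last commit left off. A sketch with sqlite3; the `batch_progress` table and `load_batch` helper are illustrative assumptions:

```python
# One short transaction per chunk; the offset commits atomically with the
# chunk's rows, so a crash never loses or duplicates work.
import sqlite3
from itertools import islice

def chunks(iterable, size):
    it = iter(iterable)
    while batch := list(islice(it, size)):
        yield batch

def load_batch(conn, job_id, records, batch_size=500):
    # Resume from the offset committed by the last successful chunk.
    row = conn.execute(
        "SELECT next_offset FROM batch_progress WHERE job_id = ?", (job_id,)
    ).fetchone()
    offset = row[0] if row else 0

    for chunk in chunks(records[offset:], batch_size):
        with conn:  # rows and new offset commit together
            conn.executemany(
                "INSERT INTO records (value) VALUES (?)",
                [(r,) for r in chunk],
            )
            offset += len(chunk)
            conn.execute(
                "INSERT INTO batch_progress (job_id, next_offset) VALUES (?, ?) "
                "ON CONFLICT(job_id) DO UPDATE SET next_offset = excluded.next_offset",
                (job_id, offset),
            )
```

Re-running the loader after a failure is then safe: chunks that already committed are skipped, and only the interrupted chunk is retried.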

Long-running transactions and VACUUM interference

In PostgreSQL, VACUUM cannot remove row versions that are still visible to any open transaction. A transaction open for 30 minutes prevents VACUUM from reclaiming 30 minutes' worth of dead rows, and not just in the tables it touches: the cleanup horizon is the oldest open snapshot, so dead rows accumulate across the whole database.

Check for long-running transactions:

SELECT
  pid,
  now() - pg_stat_activity.xact_start AS duration,
  query,
  state
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
  AND now() - pg_stat_activity.xact_start > INTERVAL '5 minutes'
ORDER BY duration DESC;

Any transaction open for more than a few minutes in an OLTP system is suspicious. Either it's doing something it shouldn't (external calls, user interaction, large batch), or it's an idle-in-transaction connection that forgot to commit.

Set idle_in_transaction_session_timeout in PostgreSQL to automatically terminate sessions that start a transaction and then go idle:

-- Terminate sessions idle in a transaction for more than 60 seconds.
-- SET applies to the current session only:
SET idle_in_transaction_session_timeout = '60s';
-- Or cluster-wide, instead of editing postgresql.conf by hand:
ALTER SYSTEM SET idle_in_transaction_session_timeout = '60s';
SELECT pg_reload_conf();

The design rule

Before writing any transaction, identify the minimum set of operations that must be atomic. Put only those inside the transaction. Do reads, external calls, and non-critical notifications outside of it. If atomicity requires including something slow, find an architectural pattern (two-phase commit, saga, outbox pattern) rather than holding database locks across that slow operation.
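Of the patterns named above, the outbox is the simplest to sketch: the business write and a pending-event row commit atomically, and a separate worker delivers events with no business transaction open. The table layout and the `deliver` callback are illustrative assumptions:

```python
# Transactional outbox: the order update and its outgoing event commit or
# roll back together; delivery happens later, outside any transaction.
import json
import sqlite3

def confirm_order(conn, order_id):
    with conn:  # business write + event row are atomic
        conn.execute("UPDATE orders SET status = 'confirmed' WHERE id = ?",
                     (order_id,))
        conn.execute(
            "INSERT INTO outbox (payload, sent) VALUES (?, 0)",
            (json.dumps({"event": "order_confirmed", "order_id": order_id}),),
        )

def drain_outbox(conn, deliver):
    # The slow external call happens here, with no locks held.
    for event_id, payload in conn.execute(
        "SELECT id, payload FROM outbox WHERE sent = 0 ORDER BY id"
    ).fetchall():
        deliver(json.loads(payload))  # e.g. email service, message broker
        with conn:  # mark sent only after successful delivery
            conn.execute("UPDATE outbox SET sent = 1 WHERE id = ?", (event_id,))
```

Delivery is at-least-once (a crash between `deliver` and the `sent` update replays the event), so the consumer should be idempotent; that is the standard trade-off of the pattern.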
