Your Transactions Are Bigger Than They Need to Be

by Arif Ikhsanudin, Backend Developer

The transaction that does too much

You're reviewing a checkout flow and find one database transaction that: validates the cart, locks inventory rows, creates an order, creates order items, charges a payment (via an internal stored procedure that calls an external service), updates inventory, and sends a confirmation email trigger. The payment step can take 1–3 seconds. The entire transaction holds row locks on inventory for that entire time.

Every other user trying to purchase items in that inventory is queued for up to 3 seconds per checkout. At 50 concurrent checkouts, you have a lock queue problem. The fix is not more database capacity — it's narrowing the transaction scope.

What belongs inside a transaction

A transaction should contain only the operations that must be atomic together. Ask: if these two operations execute and the third one fails, is the data in an invalid state that can't be recovered? If yes, they belong together. If not, they probably don't need to be in the same transaction.

Correct: debit account A + credit account B must be atomic
Incorrect reason: "they're related, so they should be in one transaction"

The inventory example, restructured:

Transaction 1 (fast, ~10ms):
  - Validate cart items exist and have sufficient stock
  - Reserve inventory (UPDATE stock SET reserved = reserved + qty WHERE ...)
  - Create order record in 'pending' state
  - Create order items

External call (outside any transaction):
  - Charge payment (~1–3 seconds)

Transaction 2 (fast, ~5ms):
  - If payment succeeded: update order to 'confirmed', decrement actual stock
  - If payment failed: update order to 'failed', release inventory reservation

Now inventory row locks are held for ~10ms per transaction instead of 3 seconds. Concurrency scales linearly.

The anti-patterns that inflate transaction scope

Including network calls inside transactions

# Wrong: HTTP call inside an open transaction
with db.transaction():
    order = Order.create(user_id=user.id, total=cart.total)
    payment_result = payment_service.charge(user.card, cart.total)  # 1–3 seconds
    if payment_result.success:
        order.update(status='confirmed')

The database transaction holds locks for the duration of the HTTP call. If the payment service is slow or times out, your transaction is held open indefinitely (until the query timeout or the connection drops).

Fetching data inside a write transaction when it's not needed

# Wrong: read-only lookup inside the write transaction
with db.transaction():
    product = Product.find(product_id)     # Just a read
    price = PricingRule.calculate(product) # Just a read, may be slow
    Order.create(product=product, price=price)  # The actual write

The reads don't need to be inside the transaction unless you need to lock the rows for consistency (e.g., you need to ensure the price doesn't change between reading it and using it — in which case use SELECT FOR UPDATE explicitly). In most cases, reads can happen before the transaction opens.

Batch operations in a single transaction

# Wrong: one transaction for all 100,000 records
with db.transaction():
    for record in large_dataset:
        Record.create(record)

If this fails at record 80,000, it rolls back all 100,000. The transaction is open for the full duration of the batch — which might be minutes. Every write page modified is held in the WAL. Replication lag accumulates. Autovacuum is blocked on those tables.

Batch in chunks:

BATCH_SIZE = 500
for chunk in chunks(large_dataset, BATCH_SIZE):
    with db.transaction():
        for record in chunk:
            Record.create(record)
    # Checkpoint: if this fails, restart from last successful chunk

500-row transactions commit every few hundred milliseconds. WAL is flushed incrementally. Replication stays current. A failure loses at most one batch.

Long-running transactions and VACUUM interference

In PostgreSQL, VACUUM cannot remove row versions that are still visible to any open transaction. A transaction open for 30 minutes prevents VACUUM from reclaiming 30 minutes' worth of dead rows across every table it has queried — not just the tables it's writing to.

Check for long-running transactions:

SELECT
  pid,
  now() - pg_stat_activity.xact_start AS duration,
  query,
  state
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
  AND now() - pg_stat_activity.xact_start > INTERVAL '5 minutes'
ORDER BY duration DESC;

Any transaction open for more than a few minutes in an OLTP system is suspicious. Either it's doing something it shouldn't (external calls, user interaction, large batch), or it's an idle-in-transaction connection that forgot to commit.

Set idle_in_transaction_session_timeout in PostgreSQL to automatically terminate sessions that start a transaction and then go idle:

-- Terminate connections idle in a transaction for more than 60 seconds
SET idle_in_transaction_session_timeout = '60s';
-- Or set globally in postgresql.conf

The design rule

Before writing any transaction, identify the minimum set of operations that must be atomic. Put only those inside the transaction. Do reads, external calls, and non-critical notifications outside of it. If atomicity requires including something slow, find an architectural pattern (two-phase commit, saga, outbox pattern) rather than holding database locks across that slow operation.

Scale Your Backend - Need an Experienced Backend Developer?

We provide backend engineers who join your team as contractors to help build, improve, and scale your backend systems.

We focus on clean backend design, clear documentation, and systems that remain reliable as products grow. Our goal is to strengthen your team and deliver backend systems that are easy to operate and maintain.

We work from our own development environments and support teams across US, EU, and APAC timezones. Our workflow emphasizes documentation and asynchronous collaboration to keep development efficient and focused.

  • Production Backend Experience. Experience building and maintaining backend systems, APIs, and databases used in production.
  • Scalable Architecture. Design backend systems that stay reliable as your product and traffic grow.
  • Contractor Friendly. Flexible engagement for short projects, long-term support, or extra help during releases.
  • Focus on Backend Reliability. Improve API performance, database stability, and overall backend reliability.
  • Documentation-Driven Development. Development guided by clear documentation so teams stay aligned and work efficiently.
  • Domain-Driven Design. Design backend systems around real business processes and product needs.

Tell us about your project

Our offices

  • Copenhagen
    1 Carlsberg Gate
    1260, København, Denmark
  • Magelang
    12 Jalan Bligo
    56485, Magelang, Indonesia

More articles

Your Unit Tests Are Testing the Wrong Thing

Most unit test suites are heavily weighted toward testing implementation details rather than behavior. When the implementation changes — even correctly — the tests break, and the suite becomes a maintenance burden instead of a safety net.

Read more

How to Build a Network Without Feeling Fake

Networking often feels like acting—smiling, small talk, pretending to care. But real connections don’t come from performance; they come from being human.

Read more

Why Seattle Startups Lose Every Backend Hiring War to FAANG — and What Actually Works Instead

Your final-round candidate said they were excited. Then they went quiet for a week. Then the recruiter told you they took a return offer from Google.

Read more

When Laptops Are Domain-Locked and Developers Can’t Install Tools

Nothing kills momentum faster than a laptop you can’t fully control. As contractors, working on a domain-locked machine is often a recipe for frustration.

Read more