Background Jobs in Rails — Sidekiq Patterns I Rely On in Production

by Eric Hanson, Backend Developer at Clean Systems Consulting

The failure modes that show up first

Sidekiq jobs fail in predictable ways. They fail because the data they were passed no longer exists when the job runs. They fail because an external API is temporarily unavailable. They fail because two jobs running simultaneously race to modify the same record. They retry and cause the same external request to fire three times.

None of these are Sidekiq bugs. They're consequences of asynchronous execution that require explicit design decisions. The patterns below address each one.

Pass IDs, not objects

The single most important Sidekiq pattern. Job arguments are serialized to JSON and stored in Redis. ActiveRecord objects don't serialize cleanly: a plain Sidekiq worker (perform_async) dumps the object's state at enqueue time, which is stale by execution time, and ActiveJob's GlobalID serialization raises a DeserializationError before your perform method ever runs if the record has been deleted. Pass the ID and load the record yourself:

# Wrong — hands object serialization to the framework
UserWelcomeJob.perform_later(@user)

# Correct — loads a fresh record at execution time, under your control
UserWelcomeJob.perform_later(@user.id)

In the job:

class UserWelcomeJob < ApplicationJob
  queue_as :default

  def perform(user_id)
    user = User.find(user_id)
    WelcomeMailer.welcome(user).deliver_now
  end
end

The consequence of loading by ID: the user may not exist when the job runs. A user can be deleted between job enqueue and job execution. find raises ActiveRecord::RecordNotFound — the job fails and retries, which is wrong because the user is gone and the job will never succeed.

Handle the missing record case explicitly:

def perform(user_id)
  user = User.find_by(id: user_id)
  return unless user  # record gone — stop processing, don't retry

  WelcomeMailer.welcome(user).deliver_now
end

return unless user stops the job cleanly without raising. Sidekiq marks it complete. No retry. This is the correct behavior when the target record no longer exists.

Idempotency — design for safe retries

Sidekiq retries failed jobs automatically (up to 25 times by default, with exponential backoff). Every job must be safe to execute multiple times with the same arguments. If it isn't, a retry causes duplicate effects — double charges, duplicate emails, double inventory deductions.

Idempotency strategies by effect type:

Database writes — use upsert or check-then-write with a unique constraint:

def perform(order_id, status)
  order = Order.find_by(id: order_id)
  return if order.nil? || order.status == status  # already applied — skip

  order.update!(status: status)
end

# Or better — let the database enforce it
Order.upsert(
  { id: order_id, status: status },
  unique_by: :id,
  update_only: [:status]
)

External API calls — pass an idempotency key to the API:

def perform(order_id)
  order = Order.find(order_id)
  PaymentGateway.charge(
    amount:          order.total,
    idempotency_key: "charge-#{order_id}-#{order.payment_attempt}"
  )
end

The idempotency key must be stable across retries — derived from the job's inputs, not from SecureRandom. "charge-#{order_id}" is safe; "charge-#{SecureRandom.hex}" is not.
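
The distinction can be shown in plain Ruby (SecureRandom is stdlib; the key format mirrors the example above):

```ruby
require "securerandom"

# Stable key: a pure function of the job's arguments, so every retry of
# the same job sends the same key and the gateway dedupes the charge.
def stable_key(order_id, payment_attempt)
  "charge-#{order_id}-#{payment_attempt}"
end

first_try = stable_key(42, 1)
on_retry  = stable_key(42, 1)
# first_try == on_retry — the gateway sees a duplicate and charges once

# Unstable key: a fresh value on every call, so each retry looks like a
# brand-new charge to the gateway.
unstable_key = -> { "charge-#{SecureRandom.hex(8)}" }
# unstable_key.call != unstable_key.call — every retry is a new charge
```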

Email sends — track sent status in the database:

def perform(user_id, email_type)
  return if SentEmail.exists?(user_id: user_id, email_type: email_type)

  user = User.find(user_id)
  UserMailer.send(email_type, user).deliver_now
  SentEmail.create!(user_id: user_id, email_type: email_type)
end

The check-then-create has a race condition — two jobs could pass the check simultaneously. Add a unique index on (user_id, email_type) in sent_emails and rescue the uniqueness violation:

begin
  SentEmail.create!(user_id: user_id, email_type: email_type)
rescue ActiveRecord::RecordNotUnique
  # another job beat us — email already sent, nothing to do
end
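
The race and the fix can be simulated without a database. SentEmailTable below is a hypothetical in-memory stand-in for a table with a unique index on (user_id, email_type) — real code uses ActiveRecord and lets the database raise:

```ruby
# In-memory stand-in for a table with a unique index (a sketch, not ActiveRecord).
class SentEmailTable
  DuplicateKey = Class.new(StandardError)

  def initialize
    @rows  = {}
    @mutex = Mutex.new
  end

  def create!(user_id:, email_type:)
    @mutex.synchronize do
      key = [user_id, email_type]
      raise DuplicateKey if @rows.key?(key)  # unique index violation
      @rows[key] = true
    end
  end
end

table = SentEmailTable.new
sends = Queue.new  # thread-safe record of actual email sends

# Two workers race to send the same welcome email.
threads = Array.new(2) do
  Thread.new do
    table.create!(user_id: 1, email_type: "welcome")
    sends << 1  # only the insert winner sends
  rescue SentEmailTable::DuplicateKey
    # another job beat us — email already sent, nothing to do
  end
end
threads.each(&:join)
# sends.size == 1 no matter how the threads interleave
```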

Retry configuration — match retries to error type

The default retry count of 25 with exponential backoff means a job might retry over 21 days before hitting the dead queue. That's appropriate for transient infrastructure failures. It's not appropriate for logic errors that will never succeed regardless of how many times you retry.
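
Where the 21-day figure comes from: Sidekiq's documented backoff between retries is roughly (retry_count ** 4) + 15 seconds plus a random jitter. Summing the base delay over the default 25 retries:

```ruby
# Approximate cumulative delay for Sidekiq's default 25 retries, using the
# documented base formula (count ** 4) + 15 seconds (jitter omitted).
total_seconds = (0..24).sum { |count| (count ** 4) + 15 }
total_days    = total_seconds / 86_400.0
# total_days ≈ 20.4 — about three weeks before the dead queue
```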

Configure retries per job class:

class ExternalApiJob < ApplicationJob
  sidekiq_options retry: 5  # transient failures — 5 retries sufficient

  def perform(resource_id)
    # ...
  end
end

class DataProcessingJob < ApplicationJob
  sidekiq_options retry: 0  # logic errors — fail immediately to dead queue

  def perform(data_id)
    # ...
  end
end

Distinguish between retry-worthy errors and non-retry-worthy errors within a job:

def perform(order_id)
  order = Order.find_by(id: order_id)
  return unless order

  result = PaymentGateway.charge(order)

  if result.network_error?
    raise result.error  # retry — transient
  elsif result.declined?
    order.update!(status: :payment_failed, decline_reason: result.reason)
    # don't raise — no point retrying a declined card
  end
end

Raising re-enqueues the job for retry. Not raising marks the job complete regardless of the outcome. Intentional non-raising is how you handle expected failure states that shouldn't retry.
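
The contract is easy to demonstrate with a tiny hypothetical runner that mimics Sidekiq's behavior — re-run on raise, mark complete on normal return:

```ruby
# Hypothetical retry runner — re-executes the block when it raises, up to
# max_retries extra attempts; a normal return ends the job.
def run_job(max_retries:)
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue StandardError
    retry if attempts <= max_retries
  end
  attempts
end

# Transient failure: raise on the first two attempts, succeed on the third.
transient = run_job(max_retries: 5) { |n| raise "network error" if n < 3 }
# transient == 3 — raising bought us the retries we needed

# Expected failure: a declined card returns without raising — one attempt.
declined = run_job(max_retries: 5) { |_n| :payment_failed }
# declined == 1 — no pointless retries of a declined card
```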

Dead queue monitoring

Jobs that exhaust retries go to the dead queue. Left unmonitored, the dead queue is where production problems hide. Set up alerting:

# config/initializers/sidekiq.rb
Sidekiq.configure_server do |config|
  config.death_handlers << ->(job, exception) do
    Sentry.capture_exception(
      exception,
      extra: { job_class: job["class"], job_args: job["args"] }
    )
    # or PagerDuty, Honeybadger, Datadog, etc.
  end
end

The death handler fires when a job enters the dead queue. Alert on it. A dead job is a production failure that requires human intervention — it shouldn't be discoverable only by manually browsing Sidekiq's web UI.

Set a dead queue size limit and expiry to prevent unbounded growth:

Sidekiq.configure_server do |config|
  # Sidekiq 7+ sets options directly on the config object
  # (older versions used config.options[:key])
  config[:dead_max_jobs] = 10_000
  config[:dead_timeout_in_seconds] = 90.days.to_i
end

Preventing concurrency races with Sidekiq::Limiter or Redis locks

When multiple job instances process records concurrently, race conditions arise on shared state. Two workers processing the same order simultaneously can double-charge a card:

class ProcessOrderJob < ApplicationJob
  def perform(order_id)
    # Two workers can both start here before either updates the status
    order = Order.find(order_id)
    return if order.processed?

    charge_card(order)
    order.mark_processed!
  end
end

The fix is a distributed lock. The open-source redlock gem or Sidekiq Enterprise's Sidekiq::Limiter provides this:

# Using the redlock gem — open source
def perform(order_id)
  lock_key     = "process_order:#{order_id}"
  lock_manager = Redlock::Client.new(["redis://localhost:6379"])

  lock_manager.lock!(lock_key, 30_000) do  # 30-second TTL, in milliseconds
    order = Order.find_by(id: order_id)
    return if order.nil? || order.processed?

    charge_card(order)
    order.mark_processed!
  end
rescue Redlock::LockError
  # Another worker holds the lock — raise so the job retries
  raise
end

The lock TTL must exceed the expected job duration. If the job takes 5 seconds and the TTL is 3 seconds, the lock expires and a second worker acquires it while the first is still running — the race condition you were trying to prevent.
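
A toy in-memory lock with expiry makes the failure mode concrete — the real mechanism is Redis SET NX PX, which is what redlock uses under the hood:

```ruby
# Hypothetical expiring lock (a sketch, not Redis).
class ExpiringLock
  def initialize
    @held_until = {}
  end

  # Acquire succeeds if the key is free or its TTL has lapsed.
  def acquire(key, ttl_seconds, now:)
    expires_at = @held_until[key]
    return false if expires_at && expires_at > now
    @held_until[key] = now + ttl_seconds
    true
  end
end

short = ExpiringLock.new
short.acquire("order:1", 3, now: 0)            # worker A: 3s TTL, job takes 5s
overlap = short.acquire("order:1", 3, now: 4)  # worker B at t=4 — TTL lapsed!

long = ExpiringLock.new
long.acquire("order:1", 30, now: 0)            # worker A: 30s TTL
blocked = long.acquire("order:1", 30, now: 4)  # worker B at t=4 — still held
# overlap is true (the race), blocked is false (the fix)
```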

The open-source sidekiq-unique-jobs gem (or the unique jobs feature built into Sidekiq Enterprise) prevents duplicate jobs from being enqueued in the first place, which eliminates the race at the queue level rather than the execution level.

Queue configuration and priority

A flat queue structure where all jobs share one queue means a batch job that enqueues 100,000 records blocks user-facing jobs for hours. Separate queues by priority:

# config/sidekiq.yml
:queues:
  - [critical, 10]  # user-facing, real-time
  - [default, 5]    # standard background work
  - [bulk, 1]       # batch processing, low priority

The numbers are relative weights — on each fetch, a worker is ten times as likely to check critical as bulk. Assign queues at the job level:

class UserNotificationJob < ApplicationJob
  queue_as :critical
end

class InvoiceGenerationJob < ApplicationJob
  queue_as :bulk
end

This ensures user-facing jobs like notification sends aren't blocked behind a CSV export that enqueued a million rows.
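
The weighted polling is easy to simulate in plain Ruby (weights taken from the sidekiq.yml above):

```ruby
# Each fetch picks a queue at random with probability proportional to its
# weight — a sketch of Sidekiq's weighted queue polling.
weights = { critical: 10, default: 5, bulk: 1 }
pool    = weights.flat_map { |name, weight| [name] * weight }

counts = Hash.new(0)
100_000.times { counts[pool.sample] += 1 }

# Over many fetches, critical is checked ~10x as often as bulk and
# ~2x as often as default.
```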

The job that should be a service object

The cleanest Sidekiq pattern is a thin job wrapper around a service object:

class ProcessOrderJob < ApplicationJob
  queue_as :default
  sidekiq_options retry: 3

  def perform(order_id)
    order = Order.find_by(id: order_id)
    return unless order

    ProcessOrder.call(order: order)
  end
end

The job handles Sidekiq concerns: loading the record, guard-returning on missing records, retry configuration, queue assignment. The service object handles business logic. The service object is independently testable, callable from the console, and reusable from controllers or rake tasks.

Testing a job is then two tests: one for the job (that it calls the service, handles missing records) and one for the service (that it does the right thing). Neither requires Sidekiq to be running.

RSpec.describe ProcessOrderJob do
  it "calls the service with the order" do
    order = create(:order)
    expect(ProcessOrder).to receive(:call).with(order: order)
    described_class.new.perform(order.id)
  end

  it "returns early when the order does not exist" do
    expect(ProcessOrder).not_to receive(:call)
    described_class.new.perform(-1)
  end
end

described_class.new.perform(id) calls the job synchronously without Sidekiq. No perform_enqueued_jobs wrapper needed in unit tests.
