Background Jobs in Rails — Sidekiq Patterns I Rely On in Production
by Eric Hanson, Backend Developer at Clean Systems Consulting
The failure modes that show up first
Sidekiq jobs fail in predictable ways. They fail because the data they were passed no longer exists when the job runs. They fail because an external API is temporarily unavailable. They fail because two jobs running simultaneously race to modify the same record. They retry and cause the same external request to fire three times.
None of these are Sidekiq bugs. They're consequences of asynchronous execution that require explicit design decisions. The patterns below address each one.
Pass IDs, not objects
The single most important Sidekiq pattern. Job arguments are serialized to JSON and stored in Redis. ActiveRecord objects don't serialize cleanly: a plain Sidekiq worker stores a hash snapshot of the object that is stale by execution time, and even ActiveJob's GlobalID support, which reloads the record for you, raises ActiveJob::DeserializationError if the record is deleted before the job runs. Pass the ID and load a fresh record inside the job:
# Wrong — serializes object state at enqueue time, stale by execution time
UserWelcomeJob.perform_later(@user)
# Correct — loads fresh record at execution time
UserWelcomeJob.perform_later(@user.id)
In the job:
class UserWelcomeJob < ApplicationJob
  queue_as :default

  def perform(user_id)
    user = User.find(user_id)
    WelcomeMailer.welcome(user).deliver_now
  end
end
The consequence of loading by ID: the user may not exist when the job runs. A user can be deleted between job enqueue and job execution. find raises ActiveRecord::RecordNotFound — the job fails and retries, which is wrong because the user is gone and the job will never succeed.
Handle the missing record case explicitly:
def perform(user_id)
  user = User.find_by(id: user_id)
  return unless user # record gone — stop processing, don't retry
  WelcomeMailer.welcome(user).deliver_now
end
return unless user stops the job cleanly without raising. Sidekiq marks it complete. No retry. This is the correct behavior when the target record no longer exists.
Idempotency — design for safe retries
Sidekiq retries failed jobs automatically (up to 25 times by default, with exponential backoff). Every job must be safe to execute multiple times with the same arguments. If it isn't, a retry causes duplicate effects — double charges, duplicate emails, double inventory deductions.
Idempotency strategies by effect type:
Database writes — use upsert or check-then-write with a unique constraint:
def perform(order_id, status)
  return if Order.exists?(id: order_id, status: status) # already applied
  Order.where(id: order_id).update_all(
    status: status,
    updated_at: Time.current
  )
end

# Or better — let the database enforce it (update_only requires Rails 7+)
Order.upsert(
  { id: order_id, status: status },
  unique_by: :id,
  update_only: [:status]
)
External API calls — pass an idempotency key to the API:
def perform(order_id)
  order = Order.find(order_id)
  PaymentGateway.charge(
    amount: order.total,
    idempotency_key: "charge-#{order_id}-#{order.payment_attempt}"
  )
end
The idempotency key must be stable across retries — derived from the job's inputs, not from SecureRandom. "charge-#{order_id}" is safe; "charge-#{SecureRandom.hex}" is not.
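A quick console check of the distinction (SecureRandom is from Ruby's standard library):

```ruby
require "securerandom"

# Stable key: derived from the job's inputs, identical on every retry,
# so the gateway can deduplicate the charge.
stable_key = ->(order_id, attempt) { "charge-#{order_id}-#{attempt}" }

# Unstable key: a fresh value per execution defeats the gateway's dedup.
unstable_key = -> { "charge-#{SecureRandom.hex}" }

stable_key.call(42, 1) == stable_key.call(42, 1) # same key on retry
unstable_key.call == unstable_key.call           # different key every time
```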
Email sends — track sent status in the database:
def perform(user_id, email_type)
  return if SentEmail.exists?(user_id: user_id, email_type: email_type)
  user = User.find(user_id)
  UserMailer.public_send(email_type, user).deliver_now
  SentEmail.create!(user_id: user_id, email_type: email_type)
end
The check-then-create has a race condition — two jobs could pass the exists? check simultaneously and both send. Add a unique index on (user_id, email_type) in sent_emails, create the SentEmail row before sending so the index gates the send itself, and rescue the uniqueness violation. The trade-off: if delivery fails after the insert, a retry will skip the email.

def perform(user_id, email_type)
  user = User.find_by(id: user_id)
  return unless user

  # claim the send first — the unique index is the gate
  SentEmail.create!(user_id: user_id, email_type: email_type)
  UserMailer.public_send(email_type, user).deliver_now
rescue ActiveRecord::RecordNotUnique
  # another job beat us — email already sent or in flight, nothing to do
end
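The supporting index, as a Rails migration sketch (the sent_emails table and column names are assumed from the job above):

```ruby
class AddUniqueIndexToSentEmails < ActiveRecord::Migration[7.1]
  def change
    add_index :sent_emails, [:user_id, :email_type], unique: true
  end
end
```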
Retry configuration — match retries to error type
The default retry count of 25 with exponential backoff means a job can keep retrying for roughly 20 days before hitting the dead queue. That's appropriate for transient infrastructure failures. It's not appropriate for logic errors that will never succeed no matter how many times you retry.
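Where the ~20 days comes from: Sidekiq's documented default delay before retry N is roughly (N ** 4) + 15 seconds, plus a small random jitter. Summing the base delay over 25 retries — a back-of-the-envelope sketch, jitter omitted:

```ruby
# Base delay before retry number `count`, per Sidekiq's documented default
# (the real scheduler adds random jitter on top of this).
base_delay = ->(count) { (count ** 4) + 15 }

total_seconds = (0...25).sum { |count| base_delay.call(count) }
total_days = total_seconds / 86_400.0
# roughly 20 days of retries before the job lands in the dead queue
```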
Configure retries per job class (with the Sidekiq adapter, sidekiq_options works inside ActiveJob classes on Sidekiq 6.0.1 and later):
class ExternalApiJob < ApplicationJob
  sidekiq_options retry: 5 # transient failures — 5 retries sufficient

  def perform(resource_id)
    # ...
  end
end

class DataProcessingJob < ApplicationJob
  sidekiq_options retry: 0 # logic errors — fail immediately to dead queue

  def perform(data_id)
    # ...
  end
end
Distinguish between retry-worthy errors and non-retry-worthy errors within a job:
def perform(order_id)
  order = Order.find_by(id: order_id)
  return unless order

  result = PaymentGateway.charge(order)
  if result.network_error?
    raise result.error # retry — transient
  elsif result.declined?
    order.update!(status: :payment_failed, decline_reason: result.reason)
    # don't raise — no point retrying a declined card
  end
end
Raising re-enqueues the job for retry. Not raising marks the job complete regardless of the outcome. Intentional non-raising is how you handle expected failure states that shouldn't retry.
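Since these jobs inherit from ApplicationJob, the same distinction can also be declared at the class level with ActiveJob's retry_on and discard_on — a sketch, assuming a hypothetical PaymentGateway::NetworkError exception class:

```ruby
class ChargePaymentJob < ApplicationJob
  # transient — retry with backoff, then give up after 5 attempts
  retry_on PaymentGateway::NetworkError, wait: :polynomially_longer, attempts: 5

  # permanent — record is gone, mark complete without retrying
  discard_on ActiveRecord::RecordNotFound

  def perform(order_id)
    order = Order.find(order_id)
    # charge logic as above
  end
end
```

(:polynomially_longer is the Rails 7.1 name; earlier versions call it :exponentially_longer.)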
Dead queue monitoring
Jobs that exhaust retries go to the dead queue. Left unmonitored, the dead queue is where production problems hide. Set up alerting:
# config/initializers/sidekiq.rb
Sidekiq.configure_server do |config|
  config.death_handlers << ->(job, exception) do
    Sentry.capture_exception(
      exception,
      extra: { job_class: job["class"], job_args: job["args"] }
    )
    # or PagerDuty, Honeybadger, Datadog, etc.
  end
end
The death handler fires when a job enters the dead queue. Alert on it. A dead job is a production failure that requires human intervention — it shouldn't be discoverable only by manually browsing Sidekiq's web UI.
Set a dead queue size limit and expiry to prevent unbounded growth:
# Sidekiq 7+ syntax; on Sidekiq 6 use config.options[:dead_max_jobs] = ...
Sidekiq.configure_server do |config|
  config[:dead_max_jobs] = 10_000
  config[:dead_timeout_in_seconds] = 90.days.to_i
end
Preventing concurrency races with Sidekiq::Limiter or Redis locks
When multiple job instances process records concurrently, race conditions arise on shared state. Two workers processing the same order simultaneously can double-charge a card:
class ProcessOrderJob < ApplicationJob
  def perform(order_id)
    # Two workers can both start here before either updates the status
    order = Order.find(order_id)
    return if order.processed?

    charge_card(order)
    order.mark_processed!
  end
end
The fix is a distributed lock. The open-source redlock gem or Sidekiq Enterprise's Sidekiq::Limiter provides this:
# Using the redlock gem — open source
def perform(order_id)
  lock_key = "process_order:#{order_id}"
  # Redis.current was removed in redis-rb 5 — pass a URL instead
  lock_manager = Redlock::Client.new([ENV.fetch("REDIS_URL", "redis://localhost:6379")])

  lock_manager.lock!(lock_key, 30_000) do # 30-second TTL
    order = Order.find_by(id: order_id)
    return unless order && !order.processed?

    charge_card(order)
    order.mark_processed!
  end
rescue Redlock::LockError
  # Another worker holds the lock — job will retry
  raise
end
The lock TTL must exceed the expected job duration. If the job takes 5 seconds and the TTL is 3 seconds, the lock expires and a second worker acquires it while the first is still running — the race condition you were trying to prevent.
To eliminate the race at the queue level rather than the execution level, prevent the duplicate from being enqueued in the first place: Sidekiq Enterprise ships a unique jobs feature, and the open-source sidekiq-unique-jobs gem does the same thing.
Queue configuration and priority
A flat queue structure where all jobs share one queue means a batch job that enqueues 100,000 records blocks user-facing jobs for hours. Separate queues by priority:
# config/sidekiq.yml
:queues:
  - [critical, 10] # user-facing, real-time
  - [default, 5]   # standard background work
  - [bulk, 1]      # batch processing, low priority
The numbers are relative weights — on each fetch, critical is 10x more likely than bulk to be checked first. Assign queues at the job level:
class UserNotificationJob < ApplicationJob
  queue_as :critical
end

class InvoiceGenerationJob < ApplicationJob
  queue_as :bulk
end
This ensures user-facing jobs like notification sends aren't blocked behind a CSV export that enqueued a million rows.
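A rough simulation of what the weights mean — Sidekiq effectively repeats each queue name by its weight when deciding which queue to check first, so critical wins about 10 of every 16 fetches (a sketch of the behavior, not Sidekiq's actual fetch code):

```ruby
weights = { "critical" => 10, "default" => 5, "bulk" => 1 }

# Repeat each queue name by its weight, then sample which queue
# gets checked first on each simulated fetch.
pool = weights.flat_map { |name, weight| [name] * weight }
picks = Array.new(100_000) { pool.sample }

share = picks.count("critical") / 100_000.0
# critical is checked first ~62.5% of the time (10 of 16 weight units)
```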
The job that should be a service object
The cleanest Sidekiq pattern is a thin job wrapper around a service object:
class ProcessOrderJob < ApplicationJob
  queue_as :default
  sidekiq_options retry: 3

  def perform(order_id)
    order = Order.find_by(id: order_id)
    return unless order

    ProcessOrder.call(order: order)
  end
end
The job handles Sidekiq concerns: loading the record, guard-returning on missing records, retry configuration, queue assignment. The service object handles business logic. The service object is independently testable, callable from the console, and reusable from controllers or rake tasks.
Testing a job is then two tests: one for the job (that it calls the service, handles missing records) and one for the service (that it does the right thing). Neither requires Sidekiq to be running.
RSpec.describe ProcessOrderJob do
  it "calls the service with the order" do
    order = create(:order)
    expect(ProcessOrder).to receive(:call).with(order: order)
    described_class.new.perform(order.id)
  end

  it "returns early when the order does not exist" do
    expect(ProcessOrder).not_to receive(:call)
    described_class.new.perform(-1)
  end
end
described_class.new.perform(id) calls the job synchronously without Sidekiq. No perform_enqueued_jobs wrapper needed in unit tests.