Migrating from Sidekiq to Solid Queue in Rails: A Zero-Downtime Guide
Step-by-step guide to migrating from Sidekiq to Solid Queue without downtime. Covers incremental rollout, retry semantics, recurring jobs, and production rollback strategies.
Migrating background job systems in production is nerve-wracking. One mistake and you’re losing jobs, double-processing payments, or watching your queue explode.
In my previous post on Solid Queue, I covered why you’d want to use it. This post is about how to actually migrate from Sidekiq without surprises. I’ll share the exact runbook I use, the gotchas that cost me debugging time, and how to roll back instantly if things go wrong.
Why switch:
- Eliminate Redis dependency (one less service to manage)
- First-class Rails 8 defaults with better integration
- Simpler ops (unified database, easier monitoring)
- Fewer moving parts in your infrastructure
What changes:
- Queue adapter configuration
- Retry semantics (explicit vs implicit)
- Recurring job setup (config file vs gem)
- Concurrency controls (different mental model)
- Deploy and shutdown flow
Safe migration path:
- Incremental per-job rollout
- Run both systems side-by-side
- Clean rollback at any step
- Zero-downtime cutover
Before You Start: Inventory & Risk Map
Don’t touch code yet. Spend a few hours mapping what you have.
Catalogue Current Jobs
I use this script to inventory my Sidekiq jobs:
# lib/tasks/job_inventory.rake
namespace :jobs do
  desc "Inventory all background jobs"
  task inventory: :environment do
    # Eager load so descendants/ObjectSpace actually see every job class
    Rails.application.eager_load!

    puts "=== Active Job Classes ==="
    ApplicationJob.descendants.each do |klass|
      queue = klass.queue_name
      adapter = klass.queue_adapter.class.name
      puts "#{klass.name}: queue=#{queue}, adapter=#{adapter}"
    end

    puts "\n=== Native Sidekiq::Worker Classes ==="
    if defined?(Sidekiq::Worker)
      sidekiq_workers = ObjectSpace.each_object(Class).select { |k| k < Sidekiq::Worker }
      sidekiq_workers.each do |klass|
        options = klass.get_sidekiq_options
        puts "#{klass.name}: #{options.inspect}"
      end
    end

    puts "\n=== Sidekiq-Cron Jobs ==="
    if defined?(Sidekiq::Cron::Job)
      Sidekiq::Cron::Job.all.each do |job|
        puts "#{job.name}: #{job.cron} -> #{job.klass}"
      end
    end
  end
end
Look for:
- Native Sidekiq::Worker classes - need a rewrite to Active Job
- Custom sidekiq_options - queues, retries, backtrace limits
- Sidekiq middleware - custom behavior that needs porting
- Pro/Enterprise features - unique jobs, rate limiting, batches
- Complex retry logic - death handlers, custom backoff
Map Scheduling Sources
Document everywhere jobs get scheduled:
Direct scheduling:
# Find all perform_later/perform_at calls
grep -r "perform_later\|perform_at\|perform_in" app/
Cron jobs:
# Check sidekiq-cron configuration
# config/initializers/sidekiq_cron.rb or
# config/schedule.yml
Enterprise periodic jobs:
# Sidekiq Enterprise periodic jobs (usually in an initializer)
Sidekiq.configure_server do |config|
  config.periodic do |mgr|
    # mgr.register(cron, WorkerClass) calls live here - document each one
  end
end
Current Ops Footprint
Document how you operate Sidekiq today:
Graceful shutdown:
# Current deploy process
kill -TSTP <sidekiq_pid> # Quiet (stops accepting new jobs)
# Wait for jobs to finish
kill -TERM <sidekiq_pid> # Terminate
Monitoring:
- Sidekiq Web dashboard location
- Alert thresholds (queue depth, latency, failure rate)
- Metrics collection (AppSignal, New Relic, etc.)
Capacity:
# Current sidekiq.yml
:concurrency: 25
:queues:
- [critical, 5]
- [default, 3]
- [mailers, 2]
- [low_priority, 1]
Save this documentation. You’ll need it to configure Solid Queue equivalently.
Solid Queue in a Nutshell: What Changes Conceptually
Architecture Differences
Sidekiq:
- Redis stores job data
- Workers poll Redis
- Fast (in-memory)
- Separate service
Solid Queue:
- PostgreSQL stores job data
- Workers poll with FOR UPDATE SKIP LOCKED
- Slower (disk-backed, but still fast)
- Same database
Queues & Priorities
This is where mental models differ.
Sidekiq uses queue weights:
# sidekiq.yml
:queues:
- [critical, 5] # 5x more likely to process
- [default, 3] # 3x more likely
- [low_priority, 1]
Solid Queue uses queue order or per-job priority (priority only orders jobs within a queue; queue order wins across queues):
# config/queue.yml - Queue order approach
production:
workers:
- queues: [critical, default, low_priority] # Processes in order
threads: 5
Or priority-based:
# Job-level priority (smaller number = higher priority)
class UrgentJob < ApplicationJob
queue_with_priority 0 # Highest
end
class NormalJob < ApplicationJob
queue_with_priority 50 # Medium
end
class BackgroundJob < ApplicationJob
queue_with_priority 100 # Lowest
end
Recommendation: Use queue order for simplicity. Only use priorities if you need fine-grained control within a single queue.
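Queue order also accepts a wildcard, which is handy if you want one worker to drain the important queue first and then sweep everything else. A config sketch (double-check the wildcard semantics against the Solid Queue README for your version):
# config/queue.yml
production:
  workers:
    - queues: [critical, "*"] # drain critical first, then everything else
      threads: 5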
Recurring Jobs
Sidekiq-Cron:
# config/schedule.yml
daily_summary:
cron: "0 9 * * *"
class: "DailySummaryJob"
queue: mailers
Solid Queue:
# config/recurring.yml
production:
daily_summary:
class: DailySummaryJob
schedule: every day at 9am
queue: mailers
Solid Queue parses schedules with Fugit, so you can write readable natural-language schedules like "every day at 9am" or stick with standard cron expressions.
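If you want to sanity-check a schedule string before deploying, you can parse it with Fugit in a console. Fugit ships as a Solid Queue dependency, so this is a quick sketch you can run as-is:
require "fugit"

Fugit.parse("every day at 9am").next_time.to_s # next 09:00 occurrence
Fugit.parse("0 9 * * *").next_time.to_s        # the same schedule in cron syntax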
Signals & Shutdown
Sidekiq:
TSTP # Quiet mode (finish current jobs, don't start new ones)
TERM # Graceful shutdown (wait for jobs, then exit)
INT # Graceful shutdown (same as TERM)
QUIT # Immediate shutdown (kill jobs)
Solid Queue:
TERM # Graceful shutdown (wait for jobs)
INT # Graceful shutdown (same as TERM)
QUIT # Immediate shutdown (kill jobs)
Note: No “quiet mode” in Solid Queue. Just stop the supervisor.
Incremental Adoption Plan: Side-by-Side, Low Blast Radius
This is the key to safe migration. Don’t flip everything at once.
Phase 1: Install & Scaffold (No Traffic Yet)
# Add gem
bundle add solid_queue
# Install
bin/rails solid_queue:install
# This creates:
# - config/queue.yml
# - config/recurring.yml (empty)
# - db/queue_schema.rb (loaded with db:prepare; there is no migration)
# - bin/jobs executable to run the supervisor
Configure separate queue database (recommended):
# config/database.yml
production:
primary:
<<: *default
database: myapp_production
queue:
<<: *default
database: myapp_queue_production
migrations_paths: db/queue_migrate
# Create the queue database and load db/queue_schema.rb into it
RAILS_ENV=production bin/rails db:prepare
Start Solid Queue worker (separate from Sidekiq):
# In a separate process/container
bin/jobs
At this point, Solid Queue is running but processing nothing. All jobs still go through Sidekiq.
Phase 2: Per-Job Opt-In
This is the magic. Migrate one job at a time.
# app/jobs/low_risk_job.rb
class LowRiskJob < ApplicationJob
# Override adapter for just this job
self.queue_adapter = :solid_queue
queue_as :default
def perform(user_id)
# Job logic
end
end
Keep global adapter as Sidekiq:
# config/application.rb
config.active_job.queue_adapter = :sidekiq # Still default
Now LowRiskJob.perform_later(123) goes to Solid Queue. Everything else goes to Sidekiq.
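A quick console check confirms which adapter each class resolves to after the override (class names are the examples above):
LowRiskJob.queue_adapter_name     # => "solid_queue"
ApplicationJob.queue_adapter_name # => "sidekiq" (still the global default)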
Start with safe jobs:
- Non-critical background tasks
- Jobs that can be retried safely
- Low-volume jobs
- Jobs with good monitoring
Avoid migrating first:
- Payment processing
- Critical notifications
- High-volume jobs
- Jobs with complex retry logic
Phase 3: Flip Active Job Globally
After a few weeks of incremental migration:
# config/application.rb
config.active_job.queue_adapter = :solid_queue # Now default
For jobs that must stay on Sidekiq (temporarily):
class LegacyJob < ApplicationJob
self.queue_adapter = :sidekiq # Explicit override
def perform
# Will migrate later
end
end
Native Sidekiq::Worker classes keep running until you rewrite them:
# This still works, processes via Sidekiq
class OldSidekiqWorker
include Sidekiq::Worker
def perform
# Legacy code
end
end
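When you do get to those native workers, the rewrite is usually mechanical. A sketch, using a hypothetical worker with explicit sidekiq_options:
# Before: native Sidekiq worker
class OldSidekiqWorker
  include Sidekiq::Worker
  sidekiq_options queue: :low_priority, retry: 5

  def perform(user_id)
    # Legacy logic
  end
end

# After: Active Job equivalent (runs on whichever adapter is configured)
class OldSidekiqJob < ApplicationJob
  queue_as :low_priority
  retry_on StandardError, wait: :polynomially_longer, attempts: 5

  def perform(user_id)
    # Same logic as before
  end
end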
Queue Naming & Routing: Keep Behavior the Same
Map your Sidekiq topology to Solid Queue.
Queue Mapping
Sidekiq queues (from earlier inventory):
:queues:
- [critical, 5] # ~45% of cycles (5/11)
- [default, 3] # ~27%
- [mailers, 2] # ~18%
- [low_priority, 1] # ~9%
Equivalent Solid Queue topology:
# config/queue.yml
production:
dispatchers:
- polling_interval: 1
batch_size: 500
workers:
# Critical: 2 processes, 5 threads each = 10 workers
- queues: critical
threads: 5
processes: 2
polling_interval: 0.1
# Default: 2 processes, 3 threads each = 6 workers
- queues: default
threads: 3
processes: 2
polling_interval: 1
# Mailers: 1 process, 4 threads = 4 workers (I/O bound)
- queues: mailers
threads: 4
processes: 1
polling_interval: 2
# Low priority: 1 process, 2 threads = 2 workers
- queues: low_priority
threads: 2
processes: 1
polling_interval: 5
Capacity comparison:
- Sidekiq: 25 concurrent jobs (from :concurrency: 25)
- Solid Queue: 10 + 6 + 4 + 2 = 22 concurrent jobs
Adjust threads/processes to match your capacity needs.
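A quick way to double-check the configured capacity is to sum threads × processes straight from config/queue.yml. A console sketch (assumes the file has no ERB; the fallback defaults are only for this snippet):
require "yaml"

workers = YAML.load_file("config/queue.yml", aliases: true)["production"]["workers"]
total = workers.sum { |w| w.fetch("threads", 1) * w.fetch("processes", 1) }
puts "Total concurrent jobs: #{total}" # => 22 for the topology above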
Keep Queue Names Stable
# DON'T change queue names during migration
class ImportantJob < ApplicationJob
queue_as :critical # Keep existing name
def perform
# ...
end
end
Changing queue names during migration causes confusion. Keep names identical.
Retries & Error Handling: Match Semantics Explicitly
This is where most migrations go wrong.
Sidekiq Default Retry Behavior
Sidekiq automatically retries failed jobs ~25 times over ~21 days:
# Sidekiq's default retry schedule (you don't write this)
# Retry delays: 15s, 16s, 31s, 96s, 271s, ... up to ~21 days
# After 25 failures, job moves to "Dead" queue
Solid Queue Has No Automatic Retries
Important: Solid Queue doesn’t retry by default. You must configure retries in Active Job.
# app/jobs/application_job.rb
class ApplicationJob < ActiveJob::Base
  # Match Sidekiq's retry behavior
  # (:polynomially_longer is the Rails 7.1+ name for the old :exponentially_longer;
  # the delay grows roughly as executions**4 seconds, plus jitter)
  retry_on StandardError,
    wait: :polynomially_longer,
    attempts: 25

  # Or be more specific
  retry_on Timeout::Error,
    wait: :polynomially_longer,
    attempts: 5

  retry_on ActiveRecord::Deadlocked,
    wait: 5.seconds,
    attempts: 3

  # Discard jobs that shouldn't retry
  discard_on ActiveJob::DeserializationError
  discard_on ActiveRecord::RecordNotFound
end
Real-World Retry Configuration
Here’s what I use in production:
# app/jobs/application_job.rb
class ApplicationJob < ActiveJob::Base
  # Rescue handlers are matched in reverse declaration order, so declare the
  # broad catch-all first and the more specific policies after it.

  # Default catch-all (like Sidekiq)
  retry_on StandardError,
    wait: :polynomially_longer,
    attempts: 25

  # Retry transient errors with tighter policies
  retry_on ActiveRecord::Deadlocked,
    wait: 5.seconds,
    attempts: 3

  retry_on Redis::ConnectionError,
    wait: :polynomially_longer,
    attempts: 5

  retry_on Net::HTTPFatalError, # raised by Net::HTTP for 5xx responses
    wait: :polynomially_longer,
    attempts: 10

  # Don't retry if the record was deleted or the job can't be deserialized
  discard_on ActiveRecord::RecordNotFound
  discard_on ActiveJob::DeserializationError

  # Logging
  before_perform do |job|
    Rails.logger.info "Starting #{job.class.name} with #{job.arguments}"
  end

  after_perform do |job|
    Rails.logger.info "Completed #{job.class.name}"
  end

  # Report errors without swallowing them. A rescue_from(StandardError) here
  # would shadow retry_on/discard_on and break retries, so wrap perform instead.
  around_perform do |job, block|
    block.call
  rescue StandardError => exception
    Rails.logger.error "Job failed: #{exception.message}"
    Rails.error.report(exception, handled: true, context: {
      job_class: job.class.name,
      job_id: job.job_id,
      arguments: job.arguments
    })
    raise # Re-raise so retry_on/discard_on still apply
  end
end
Per-Job Retry Overrides
# app/jobs/payment_processor_job.rb
class PaymentProcessorJob < ApplicationJob
# Override global retry for payment-specific errors
  retry_on PaymentGateway::TemporaryError,
    wait: :polynomially_longer,
    attempts: 5
discard_on PaymentGateway::CardDeclined # Don't retry declined cards
def perform(transaction_id)
# Process payment
end
end
Custom Retry Delays
class ApiSyncJob < ApplicationJob
# Custom backoff: 1s, 4s, 9s, 16s, 25s...
retry_on ApiError,
wait: ->(executions) { executions ** 2 },
attempts: 10
def perform
# Sync with external API
end
end
Failed Job Inspection
Sidekiq Web:
- Click “Dead” tab
- View error and backtrace
- Retry or delete
Mission Control - Jobs:
# Mount dashboard
mount MissionControl::Jobs::Engine, at: "/jobs"
# Visit /jobs/failed
# - View error details
# - Retry individually or in bulk
# - Discard permanently
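You can also inspect and retry failed jobs from a Rails console without the dashboard. A sketch against Solid Queue's models (double-check the API on the version you install):
SolidQueue::FailedExecution.count

failed = SolidQueue::FailedExecution.order(:created_at).last
failed.error # serialized exception class, message, and backtrace
failed.retry # re-dispatch the job for another attempt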
Scheduling & Recurring Jobs: Cron Migration
Migrating from sidekiq-cron to Solid Queue recurring jobs.
From sidekiq-cron
Current setup (Sidekiq):
# config/schedule.yml
daily_summary:
cron: "0 9 * * *"
class: "DailySummaryJob"
queue: mailers
description: "Send daily summary emails"
cleanup_sessions:
cron: "0 */6 * * *"
class: "SessionCleanupJob"
queue: low_priority
process_subscriptions:
cron: "0 2 * * *"
class: "SubscriptionChargeJob"
queue: critical
args:
force: true
To Solid Queue recurring.yml
# config/recurring.yml
production:
daily_summary:
class: DailySummaryJob
schedule: every day at 9am
queue: mailers
# description: "Send daily summary emails" # Not supported, use comments
cleanup_sessions:
class: SessionCleanupJob
schedule: every 6 hours
queue: low_priority
process_subscriptions:
class: SubscriptionChargeJob
schedule: every day at 2am
queue: critical
args: [{ force: true }]
Cron Syntax Translation
sidekiq-cron uses standard cron:
0 9 * * * # Daily at 9am
*/15 * * * * # Every 15 minutes
0 */4 * * * # Every 4 hours
Solid Queue uses Fugit (more readable):
schedule: every day at 9am
schedule: every 15 minutes
schedule: every 4 hours
schedule: "0 9 * * *" # Can still use cron syntax
Real-World Migration Example
# config/recurring.yml
production:
# FinTech reconciliation (was 0 1 * * *)
daily_reconciliation:
class: TransactionReconciliationJob
schedule: every day at 1am
queue: critical
# Report generation (was 0 6 * * 1)
weekly_reports:
class: WeeklyReportJob
schedule: every monday at 6am
queue: default
# Cleanup old data (was 0 3 * * *)
cleanup_old_records:
class: DataCleanupJob
schedule: every day at 3am
queue: low_priority
# Sync with external API (was */30 * * * *)
api_sync:
class: ExternalApiSyncJob
schedule: every 30 minutes
queue: default
# Send digest emails (was 0 8 * * 1,3,5)
digest_emails:
class: DigestEmailJob
schedule: "0 8 * * 1,3,5" # Mon, Wed, Fri at 8am
queue: mailers
One Source of Truth During Cutover
Critical: Avoid double-enqueueing during migration.
Bad approach (causes duplicates):
# Day 1: Both systems enabled
# sidekiq-cron runs DailySummaryJob at 9am
# Solid Queue also runs DailySummaryJob at 9am
# Users get TWO summary emails
Good approach:
Deploy N (disable sidekiq-cron):
# config/initializers/sidekiq_cron.rb
unless ENV['ENABLE_SIDEKIQ_CRON'] == 'true'
# Don't load sidekiq-cron schedule
Rails.logger.info "Sidekiq-cron disabled"
end
Deploy N+1 (enable Solid Queue scheduler):
# config/recurring.yml is now active
# Scheduler starts on next deploy
Verification:
# Check Solid Queue scheduled jobs
SolidQueue::RecurringTask.all.each do |task|
puts "#{task.key}: #{task.schedule}"
end
Concurrency, Throttling & Uniqueness: The Gotchas
Different mental models between Sidekiq and Solid Queue.
Sidekiq Concurrency
Process-level:
# sidekiq.yml
:concurrency: 25 # 25 threads per Sidekiq process
Tuning:
# Run multiple processes for more concurrency
bundle exec sidekiq -c 25 # Process 1
bundle exec sidekiq -c 25 # Process 2
# Total: 50 concurrent jobs
Solid Queue Concurrency
Per-queue configuration:
# config/queue.yml
production:
workers:
- queues: default
threads: 5 # 5 jobs at once
processes: 3 # Across 3 processes
# Total: 15 concurrent jobs for 'default' queue
Calculation: threads × processes = concurrent jobs per queue
Job-Level Concurrency Controls
Solid Queue offers per-job concurrency limits:
# app/jobs/report_generation_job.rb
class ReportGenerationJob < ApplicationJob
  # Only 3 report jobs run at once (across ALL workers).
  # The key proc is called with the job's arguments, so accept and ignore them here.
  limits_concurrency to: 3, key: ->(*) { "reports" }

  def perform(user_id, report_type)
    # CPU-intensive report generation
  end
end
Per-resource concurrency:
# app/jobs/invoice_export_job.rb
class InvoiceExportJob < ApplicationJob
# Only 1 export per account at a time
limits_concurrency to: 1, key: -> (account_id) {
"invoice_export_#{account_id}"
}
def perform(account_id)
account = Account.find(account_id)
InvoiceExporter.generate_all(account)
end
end
When to use concurrency controls:
- Protecting external APIs from rate limits
- Preventing database lock contention
- Limiting resource-intensive operations
- Avoiding race conditions on shared resources
When NOT to use:
- General throughput control (use queue topology instead)
- Simple prioritization (use queue order)
- Most jobs don’t need this
Unique Jobs: The Gap
Sidekiq Enterprise has built-in unique jobs:
class UniqueJob
include Sidekiq::Worker
sidekiq_options unique_for: 10.minutes
def perform(user_id)
# Only one instance per user_id in 10 minutes
end
end
Solid Queue doesn’t have built-in uniqueness yet. Workarounds:
Option 1: Concurrency controls
class ProcessUserJob < ApplicationJob
limits_concurrency to: 1, key: -> (user_id) { "process_user_#{user_id}" }
def perform(user_id)
# Only one job per user at a time
end
end
Option 2: Database-backed idempotency
class ProcessPaymentJob < ApplicationJob
def perform(payment_id)
# Use database lock
payment = Payment.lock.find(payment_id)
return if payment.processed? # Already done
process_payment(payment)
payment.update!(processed: true)
end
end
Option 3: Redis-backed deduplication
class DeduplicatedJob < ApplicationJob
  # Redis.current was removed in redis-rb 5.x, so hold an explicit connection
  REDIS = Redis.new

  def perform(user_id)
    key = "job:#{self.class.name}:#{user_id}"
    # Try to acquire lock (NX: only set if absent, EX: expire after 10 minutes)
    acquired = REDIS.set(key, "1", nx: true, ex: 600)
    return unless acquired # Another job already running

    begin
      # Do work
      process_user(user_id)
    ensure
      REDIS.del(key)
    end
  end
end
This is an area where Sidekiq Enterprise is more mature. Plan accordingly.
Observability & Dashboards
Replace Sidekiq Web with Mission Control.
Mounting Dashboards
Before (Sidekiq Web):
# config/routes.rb
require 'sidekiq/web'
authenticate :user, ->(user) { user.admin? } do
mount Sidekiq::Web, at: '/sidekiq'
end
After (Mission Control - Jobs):
# Gemfile
gem 'mission_control-jobs'
# config/routes.rb
authenticate :user, ->(user) { user.admin? } do
mount MissionControl::Jobs::Engine, at: '/jobs'
end
Dashboard Features Comparison
| Feature | Sidekiq Web | Mission Control |
|---|---|---|
| Active jobs | ✓ | ✓ |
| Failed jobs | ✓ | ✓ |
| Scheduled jobs | ✓ | ✓ |
| Retry/delete | ✓ | ✓ |
| Real-time stats | ✓ | ✓ |
| Historical graphs | ✓ | Limited |
| Job details | ✓ | ✓ |
| Recurring jobs | Via sidekiq-cron | ✓ Built-in |
Metrics Integration
AppSignal (works with both):
# Gemfile
gem 'appsignal'
# Automatically tracks Active Job metrics:
# - Job duration
# - Success/failure rates
# - Queue depth
# - Error details
Custom metrics:
# app/jobs/application_job.rb
class ApplicationJob < ActiveJob::Base
around_perform do |job, block|
start_time = Time.current
begin
block.call
# Track success
ActiveSupport::Notifications.instrument(
'job.success',
job_class: job.class.name,
duration: Time.current - start_time
)
rescue => error
# Track failure
ActiveSupport::Notifications.instrument(
'job.failure',
job_class: job.class.name,
error: error.class.name,
duration: Time.current - start_time
)
raise
end
end
end
# Subscribe to notifications
ActiveSupport::Notifications.subscribe('job.success') do |name, start, finish, id, payload|
# Send to your metrics system
Metrics.increment('jobs.success', tags: ["job:#{payload[:job_class]}"])
Metrics.histogram('jobs.duration', payload[:duration], tags: ["job:#{payload[:job_class]}"])
end
Rolling Deploys & Zero-Downtime Cutovers
How to deploy without losing jobs or causing downtime.
Current Sidekiq Deploy Process
Typical flow:
# 1. Quiet Sidekiq (stop accepting new jobs)
kill -TSTP $(cat tmp/pids/sidekiq.pid)
# 2. Wait for current jobs to finish (with timeout)
timeout 60 bash -c 'while kill -0 $(cat tmp/pids/sidekiq.pid) 2>/dev/null; do sleep 1; done'
# 3. Deploy new code
git pull
bundle install
# ... restart app
# 4. Start new Sidekiq (the -d daemonize flag was removed in Sidekiq 6;
#    run it under your process supervisor instead)
bundle exec sidekiq -C config/sidekiq.yml
# 5. Terminate the old Sidekiq process (if one is still running)
kill -TERM $(cat tmp/pids/sidekiq.pid)
Solid Queue Deploy Process
Simpler flow:
# 1. Send TERM to supervisor (graceful shutdown)
kill -TERM $(cat tmp/pids/solid_queue.pid)
# Wait for shutdown (respects shutdown_timeout)
# Default: 60 seconds
# 2. Deploy new code
git pull
bundle install
# 3. Start new Solid Queue
bin/jobs
Configure shutdown timeout:
# config/environments/production.rb
# (shutdown_timeout is a global Solid Queue setting, not a per-worker key in queue.yml)
config.solid_queue.shutdown_timeout = 60.seconds # Wait up to 60s for in-flight jobs
Blue/Green Migration Strategy
Run both systems during transition:
Week 1-2: Sidekiq primary, Solid Queue testing
# Most jobs on Sidekiq
config.active_job.queue_adapter = :sidekiq
# Test jobs on Solid Queue
class TestJob < ApplicationJob
self.queue_adapter = :solid_queue
end
Week 3: Split traffic
# Move the bulk of jobs to Solid Queue by overriding the base class
class ApplicationJob < ActiveJob::Base
  self.queue_adapter = :solid_queue # Everything inheriting from here uses Solid Queue
end
# Keep critical jobs on Sidekiq temporarily
class PaymentJob < ApplicationJob
self.queue_adapter = :sidekiq
end
Week 4: Solid Queue primary
# All jobs on Solid Queue
config.active_job.queue_adapter = :solid_queue
# Disable sidekiq-cron
# Keep Sidekiq running to drain old jobs
Week 5+: Decommission Sidekiq
# Verify no jobs in Sidekiq
Sidekiq::Queue.all.each do |queue|
puts "#{queue.name}: #{queue.size}"
end
# Verify no scheduled jobs
puts "Scheduled: #{Sidekiq::ScheduledSet.new.size}"
puts "Retries: #{Sidekiq::RetrySet.new.size}"
puts "Dead: #{Sidekiq::DeadSet.new.size}"
# All zeros? Safe to shut down Sidekiq
kill -TERM $(cat tmp/pids/sidekiq.pid)
# Remove from systemd/Docker/Procfile
Puma Plugin Caveat
Don’t use Puma plugin in production for Solid Queue:
# config/puma.rb
# DON'T DO THIS IN PRODUCTION
plugin :solid_queue # Doesn't support phased restarts
Why: Puma’s phased restart doesn’t gracefully shut down Solid Queue workers.
Better: Run bin/jobs as separate service (systemd, Docker, Kamal).
Step-by-Step Migration Runbook
Copy-paste this into your migration plan.
Preparation
- Run job inventory script
- Document all Sidekiq queues and concurrency settings
- List all sidekiq-cron jobs
- Identify jobs with unique/rate-limit requirements
- Review retry and error handling logic
- Plan rollback strategy
- Set up staging environment for testing
Phase 1: Install
# Install Solid Queue
bundle add solid_queue
bin/rails solid_queue:install
# Configure separate database (optional but recommended)
# Edit config/database.yml
# Create the queue database and load its schema
RAILS_ENV=production bin/rails db:prepare
# Configure worker topology
# Edit config/queue.yml
# Start Solid Queue (separate process)
bin/jobs
Verify:
# Check Solid Queue is running
ps aux | grep solid_queue
# Check database
rails console
SolidQueue::Job.count # Should be 0
Phase 2: Migrate Low-Risk Jobs
Pick 2-3 non-critical jobs:
# app/jobs/cleanup_job.rb
class CleanupJob < ApplicationJob
self.queue_adapter = :solid_queue # Add this line
queue_as :low_priority
def perform
# Existing logic
end
end
Deploy and verify:
# Enqueue test job
CleanupJob.perform_later
# Check Mission Control
# Visit /jobs and verify job appears
Monitor for a week or two:
- Check error rates
- Verify jobs complete successfully
- Compare performance with Sidekiq
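During this monitoring window, a few direct queries against the Solid Queue tables give you throughput, failures, and rough latency at a glance (a console sketch; column names follow the solid_queue schema):
# Throughput: jobs finished in the last hour
SolidQueue::Job.where("finished_at > ?", 1.hour.ago).count

# Failures waiting for attention
SolidQueue::FailedExecution.count

# Rough queue latency: oldest job still waiting to be claimed
SolidQueue::ReadyExecution.minimum(:created_at)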
Phase 3: Align Retry Semantics
Add explicit retry configuration:
# app/jobs/application_job.rb
class ApplicationJob < ActiveJob::Base
  # Match Sidekiq behavior
  retry_on StandardError,
    wait: :polynomially_longer,
    attempts: 25

  discard_on ActiveJob::DeserializationError
  discard_on ActiveRecord::RecordNotFound

  # Add logging and error reporting without shadowing retry_on
  # (a rescue_from(StandardError) block here would take precedence and break retries)
  around_perform do |job, block|
    block.call
  rescue StandardError => exception
    Rails.error.report(exception, handled: true, context: {
      job_class: job.class.name,
      job_id: job.job_id,
      arguments: job.arguments
    })
    raise
  end
end
Test failure scenarios:
# Create job that fails
class TestFailureJob < ApplicationJob
self.queue_adapter = :solid_queue
def perform
raise "Test error"
end
end
TestFailureJob.perform_later
# Check Mission Control /jobs/failed
# Verify retry behavior
# Verify error reporting
Phase 4: Migrate Recurring Jobs
Create config/recurring.yml:
production:
daily_summary:
class: DailySummaryJob
schedule: every day at 9am
queue: mailers
cleanup_sessions:
class: SessionCleanupJob
schedule: every 6 hours
queue: low_priority
Deploy with sidekiq-cron disabled:
# config/initializers/sidekiq_cron.rb
if ENV['ENABLE_SIDEKIQ_CRON'] == 'true'
# Load schedule
else
Rails.logger.info "Sidekiq-cron disabled, using Solid Queue recurring jobs"
end
Verify recurring jobs:
SolidQueue::RecurringTask.all.each do |task|
puts "#{task.key}: next run at #{task.next_time}"
end
Monitor for a week:
- Verify jobs run at correct times
- Check for duplicates (should be none)
- Verify no missed executions
Phase 5: Match Throughput
Tune config/queue.yml to match Sidekiq capacity:
production:
workers:
# Calculate: Sidekiq concurrency = 25
# Distribute across Solid Queue workers
- queues: critical
threads: 5
processes: 2 # 10 workers
- queues: default
threads: 5
processes: 2 # 10 workers
- queues: [mailers, low_priority]
threads: 5
processes: 1 # 5 workers
# Total: 25 concurrent jobs (matches Sidekiq)
Load test:
# Enqueue 1000 jobs
1000.times do |i|
SomeJob.perform_later(i)
end
# Monitor processing rate
# Compare with Sidekiq baseline
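To put a number on the processing rate during the load test, you can sample finished jobs from the console while the workers churn through the backlog (a sketch; adjust the sampling window to taste):
# Sample the finished-job rate every 10 seconds for a minute
6.times do
  finished = SolidQueue::Job.where("finished_at > ?", 1.minute.ago).count
  puts "#{Time.current}: ~#{finished} jobs finished in the last minute"
  sleep 10
end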
Phase 6: Flip Global Adapter
# config/application.rb
config.active_job.queue_adapter = :solid_queue # Change from :sidekiq
Keep override for critical jobs (if needed):
class CriticalPaymentJob < ApplicationJob
self.queue_adapter = :sidekiq # Temporary, migrate later
end
Deploy and monitor closely:
- Watch error rates
- Monitor queue depths
- Check job latency
- Verify no jobs stuck
Phase 7: Decommission Sidekiq
After 1-2 weeks of stable Solid Queue operation:
# 1. Verify Sidekiq queues empty
Sidekiq::Queue.all.map(&:size).sum # Should be 0
# 2. Verify no scheduled jobs
Sidekiq::ScheduledSet.new.size +
Sidekiq::RetrySet.new.size +
Sidekiq::DeadSet.new.size # Should be 0
# 3. Stop Sidekiq
systemctl stop sidekiq
# or
kill -TERM $(cat tmp/pids/sidekiq.pid)
# 4. Remove from deploy config
# - Remove from Procfile/systemd
# - Remove sidekiq.yml
# - Remove config/initializers/sidekiq.rb
# 5. Remove gems
# Gemfile
# gem 'sidekiq'
# gem 'sidekiq-cron'
bundle install
Archive Sidekiq metrics and configuration for reference.
Rollback Plan: Practice It Once
You need a tested rollback plan. Practice before migration.
Immediate Rollback
Scenario: Solid Queue is causing issues, need to revert NOW.
# 1. Revert adapter change
git revert <commit-hash> # Revert queue adapter change
# 2. Deploy immediately
git push
# Trigger deploy
# 3. Restart Sidekiq (if stopped)
systemctl start sidekiq
# or
bundle exec sidekiq -C config/sidekiq.yml # -d was removed in Sidekiq 6; run under a supervisor
# 4. Keep Solid Queue running
# Let it drain already-enqueued jobs
# Or explicitly fail and re-enqueue later
Re-enqueue failed Solid Queue jobs to Sidekiq:
# In a Rails console. Failed jobs live in SolidQueue::FailedExecution, and the
# job's arguments column holds the full serialized Active Job payload.
SolidQueue::FailedExecution.includes(:job).find_each do |execution|
  payload = execution.job.arguments
  payload = JSON.parse(payload) if payload.is_a?(String)

  job_class = payload["job_class"].constantize
  args = ActiveJob::Arguments.deserialize(payload["arguments"])

  # Re-enqueue to Sidekiq (the global adapter is :sidekiq again after the revert)
  job_class.set(queue: payload["queue_name"]).perform_later(*args)
end
Graceful Rollback
Scenario: Issues discovered, want controlled rollback.
Phase 1:
# Move jobs back to Sidekiq one by one
class SomeJob < ApplicationJob
self.queue_adapter = :sidekiq # Add override
end
# Deploy incrementally
Phase 2:
# Revert global adapter
config.active_job.queue_adapter = :sidekiq
# Re-enable sidekiq-cron
ENV['ENABLE_SIDEKIQ_CRON'] = 'true'
# Stop Solid Queue
kill -TERM $(cat tmp/pids/solid_queue.pid)
Practice Rollback in Staging
Before production migration:
# 1. Set up staging with both systems
# 2. Migrate to Solid Queue
# 3. Run production-like load
# 4. Practice rollback
# 5. Verify all jobs processed correctly
Measure rollback time. Should be quick and reliable.
Testing & CI Safety Nets
Automated tests to catch migration issues.
Active Job Test Helpers
# spec/jobs/my_job_spec.rb
require 'rails_helper'
RSpec.describe MyJob, type: :job do
describe '#perform' do
it 'enqueues job to correct queue' do
MyJob.perform_later(123)
expect(MyJob).to have_been_enqueued.with(123)
expect(MyJob).to have_been_enqueued.on_queue('default')
end
it 'schedules job for future' do
MyJob.set(wait: 1.hour).perform_later(123)
expect(MyJob).to have_been_enqueued.at(1.hour.from_now).with(123)
end
it 'retries on errors' do
allow_any_instance_of(MyJob).to receive(:perform).and_raise(StandardError)
MyJob.perform_later(123)
perform_enqueued_jobs
# Should retry based on retry_on configuration
expect(MyJob).to have_been_enqueued.at_least(:twice)
end
end
end
Migration-Specific Tests
# spec/jobs/migration_spec.rb
require 'rails_helper'
RSpec.describe 'Job migration to Solid Queue' do
before do
  # ActiveJob::TestHelper (perform_enqueued_jobs, have_been_performed) only works
  # with the :test adapter; these specs verify the Active Job semantics that
  # Solid Queue relies on (retries, discards, queue routing).
  ActiveJob::Base.queue_adapter = :test
end
it 'processes jobs successfully' do
expect {
MyJob.perform_later(123)
perform_enqueued_jobs
}.not_to raise_error
end
it 'retries failed jobs correctly' do
  # Fail the first attempt only, then let the retry succeed
  attempts = 0
  allow_any_instance_of(MyJob).to receive(:perform) do
    attempts += 1
    raise StandardError, "transient failure" if attempts == 1
  end

  MyJob.perform_later(123)
  perform_enqueued_jobs # first attempt fails, retry_on re-enqueues
  perform_enqueued_jobs # retry runs and succeeds

  expect(attempts).to eq(2)
end
it 'respects concurrency limits' do
# Test job-level concurrency controls
jobs = 5.times.map { ConcurrencyLimitedJob.perform_later }
# Only configured number should run simultaneously
# Implementation depends on your concurrency setup
end
end
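One way to make that last spec concrete is to assert that over-limit jobs end up blocked. This sketch assumes ConcurrencyLimitedJob declares limits_concurrency to: 1 and that the solid_queue tables exist in your test database:
it 'blocks jobs over the concurrency limit' do
  # The real adapter is needed here; the :test adapter skips concurrency controls
  ActiveJob::Base.queue_adapter = :solid_queue

  3.times { ConcurrencyLimitedJob.perform_later }

  # With a limit of 1, one job is ready and the other two are blocked
  expect(SolidQueue::ReadyExecution.count).to eq(1)
  expect(SolidQueue::BlockedExecution.count).to eq(2)
end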
Canary Job
Add a recurring canary to verify scheduler health:
# config/recurring.yml
production:
canary_health_check:
class: CanaryJob
schedule: every 5 minutes
queue: default
# app/jobs/canary_job.rb
class CanaryJob < ApplicationJob
queue_as :default
def perform
# Record successful execution
Rails.cache.write(
'canary_last_run',
Time.current,
expires_in: 10.minutes
)
# Send metric
ActiveSupport::Notifications.instrument(
'canary.success',
timestamp: Time.current
)
end
end
Monitor canary in production:
# Health check endpoint
def jobs_health
last_canary = Rails.cache.read('canary_last_run')
if last_canary && last_canary > 10.minutes.ago
render json: { status: 'ok', last_canary: last_canary }
else
render json: { status: 'unhealthy', last_canary: last_canary }, status: 503
end
end
Alert if canary hasn’t run in > 10 minutes.
Common Pitfalls & How to Avoid Them
Issues I encountered during migrations.
1. Assuming Sidekiq Retry Semantics Carry Over
Problem:
# This job worked in Sidekiq (automatic retries)
class ImportantJob < ApplicationJob
def perform
ExternalAPI.call # Sometimes fails
end
end
# In Solid Queue: fails once, goes to failed queue, never retries
Solution: Explicit retry configuration
class ImportantJob < ApplicationJob
retry_on StandardError, wait: :polynomially_longer, attempts: 25
def perform
ExternalAPI.call
end
end
2. Queue Weighting Mental Model
Problem:
# Sidekiq mental model: weights
# [critical, 5], [default, 1]
# = Critical gets ~83% of resources
# Solid Queue config attempt (WRONG):
workers:
- queues: [critical, default] # Both processed equally
threads: 5
Solution: Separate workers or queue order
# Option 1: Separate workers
workers:
- queues: critical
threads: 8 # 80% of resources
- queues: default
threads: 2 # 20% of resources
# Option 2: Queue order (processes critical first)
workers:
- queues: [critical, default]
threads: 10
3. Cron Duplication (Double Enqueues)
Problem:
# Both systems running same cron job
# sidekiq-cron: DailySummaryJob every day at 9am
# Solid Queue recurring.yml: DailySummaryJob every day at 9am
# Result: Users get 2 emails
Solution: Single source of truth
# Deploy N: Disable sidekiq-cron
if ENV['ENABLE_SIDEKIQ_CRON'] != 'true'
Rails.logger.info "Sidekiq-cron disabled"
# Don't load schedule
end
# Deploy N+1: Enable Solid Queue recurring jobs
# config/recurring.yml now active
4. Overusing Concurrency Controls
Problem:
# Every job has concurrency control
class Job1 < ApplicationJob
limits_concurrency to: 5, key: -> { "job1" }
end
class Job2 < ApplicationJob
limits_concurrency to: 10, key: -> { "job2" }
end
# Complex, hard to reason about, debugging nightmare
Solution: Use topology first
# Simple and clear
workers:
- queues: job1_queue
threads: 5
- queues: job2_queue
threads: 10
Only use concurrency controls for:
- Per-resource limits (e.g., one export per account)
- Protecting external APIs
- Preventing race conditions
5. Not Testing Rollback
Problem: Production issues, attempt rollback, discover:
- Rollback process unclear
- Jobs lost during transition
- Sidekiq configuration deleted
- Team doesn’t know how to re-enqueue jobs
Solution: Practice rollback in staging
- Document exact steps
- Test re-enqueueing failed jobs
- Keep Sidekiq config until fully decommissioned
- Time the rollback
6. Connection Pool Exhaustion
Problem:
# Solid Queue workers
workers:
- queues: default
threads: 25
processes: 4
# Total: 100 concurrent jobs
# But database pool:
production:
pool: 5 # Not enough!
Each worker thread needs a database connection, so 100 threads need a pool of at least 100 (plus a few connections for the dispatcher and supervisor).
Solution:
# config/database.yml
production:
queue:
pool: <%= ENV.fetch("SOLID_QUEUE_POOL_SIZE", 110) %>
Production Deployment Configurations
Copy-paste configs for different deployment methods.
Systemd Service
# /etc/systemd/system/solid-queue.service
[Unit]
Description=Solid Queue Worker
After=network.target postgresql.service
[Service]
Type=simple
User=deploy
WorkingDirectory=/var/www/myapp/current
Environment=RAILS_ENV=production
Environment=SOLID_QUEUE_POOL_SIZE=50
ExecStart=/usr/local/bin/bundle exec bin/jobs
# Graceful shutdown
KillSignal=SIGTERM
TimeoutStopSec=60
KillMode=mixed
# Restart on failure
Restart=on-failure
RestartSec=5
# Logging
StandardOutput=append:/var/log/solid-queue/stdout.log
StandardError=append:/var/log/solid-queue/stderr.log
[Install]
WantedBy=multi-user.target
# Enable and start
sudo systemctl enable solid-queue
sudo systemctl start solid-queue
# Check status
sudo systemctl status solid-queue
# View logs
sudo journalctl -u solid-queue -f
# Restart (graceful; TERM plus TimeoutStopSec lets in-flight jobs finish)
sudo systemctl restart solid-queue
# Stop
sudo systemctl stop solid-queue
Docker Compose
# docker-compose.yml
version: '3.8'
services:
web:
image: myapp:latest
command: bundle exec puma
ports:
- "3000:3000"
environment:
- DATABASE_URL=postgresql://postgres:password@db:5432/myapp_production
- QUEUE_DATABASE_URL=postgresql://postgres:password@db:5432/myapp_queue_production
- RAILS_ENV=production
depends_on:
- db
jobs:
image: myapp:latest
command: bundle exec bin/jobs
environment:
- DATABASE_URL=postgresql://postgres:password@db:5432/myapp_production
- QUEUE_DATABASE_URL=postgresql://postgres:password@db:5432/myapp_queue_production
- RAILS_ENV=production
- SOLID_QUEUE_POOL_SIZE=50
depends_on:
- db
restart: unless-stopped
db:
image: postgres:16
environment:
- POSTGRES_PASSWORD=password
volumes:
- postgres-data:/var/lib/postgresql/data
volumes:
postgres-data:
Kamal Configuration
# config/deploy.yml
service: myapp
image: username/myapp
servers:
web:
hosts:
- 192.168.1.1
options:
network: "private"
jobs:
cmd: bin/jobs
hosts:
- 192.168.1.1
options:
network: "private"
env:
clear:
SOLID_QUEUE_POOL_SIZE: 50
registry:
username: username
password:
- KAMAL_REGISTRY_PASSWORD
env:
secret:
- DATABASE_URL
- QUEUE_DATABASE_URL
- SECRET_KEY_BASE
accessories:
postgres:
image: postgres:16
host: 192.168.1.1
port: 5432
env:
secret:
- POSTGRES_PASSWORD
directories:
- data:/var/lib/postgresql/data
options:
network: "private"
# Deploy
kamal deploy
# Restart jobs only
kamal app boot --roles jobs
# View logs
kamal app logs --roles jobs
# SSH to jobs container
kamal app exec --roles jobs sh
Procfile (Heroku/Render)
# Procfile
web: bundle exec puma -C config/puma.rb
jobs: bundle exec bin/jobs
Heroku:
# Scale jobs
heroku ps:scale jobs=2
# View logs
heroku logs --ps jobs --tail
# Restart jobs
heroku ps:restart jobs
Solid Queue Tradeoffs
Latency increase: 45ms → 180ms
- Impact: Acceptable. Jobs aren’t user-facing.
- Mitigation: None needed. Users don’t notice.
Throughput reduction: 600 → 550 jobs/min
- Impact: Still well above our 350 jobs/min average.
- Mitigation: Can increase worker threads if needed.
Feature loss: No built-in unique jobs
- Impact: Added manual deduplication logic (3 jobs affected).
- Mitigation: Database-backed idempotency keys.
Engineering Wins
Simplicity: One less service to manage, monitor, upgrade
Reliability: Fewer moving parts = fewer failure modes
Developer experience: Better Rails integration, nicer dashboard
Reduced operational overhead: Single database to maintain
The Bottom Line
Migrating from Sidekiq to Solid Queue is straightforward if you:
- Plan incrementally - Per-job migration, not big bang
- Match retry semantics - Explicit Active Job configuration
- Test rollback - Practice before production
- Monitor closely - First 2 weeks are critical
- Accept trade-offs - Slightly higher latency for simpler ops
You should migrate if:
- Job volume < 1M/day
- Latency requirements > 100ms
- Team values operational simplicity
- Using PostgreSQL already
Stick with Sidekiq if:
- Job volume > 5M/day
- Need < 50ms latency
- Heavily using Pro/Enterprise features (batches, unique jobs)
- Already have mature Sidekiq setup working well
For most Rails applications, Solid Queue’s simplicity outweighs the small latency increase. The migration is less scary than it seems. Take it one job at a time, test thoroughly, and keep a rollback plan ready.
Need help migrating your Rails application to Solid Queue? I’ve successfully migrated production applications to Solid Queue and can guide your team through the process. From planning to deployment to monitoring, I’ll ensure a smooth, zero-downtime transition.
Let’s discuss your migration: nikita.sinenko@gmail.com
Need help with your Rails project?
I'm Nikita Sinenko, a Senior Ruby on Rails Engineer with 15+ years of experience. Based in Dubai, working with clients worldwide on contract and consulting projects.
Let's Talk