
Migrating from Sidekiq to Solid Queue in Rails: A Zero-Downtime Guide

Ruby on Rails, Rails 8, Solid Queue, Sidekiq, Migration, Background Jobs

Step-by-step guide to migrating from Sidekiq to Solid Queue without downtime. Covers incremental rollout, retry semantics, recurring jobs, and production rollback strategies.

Migrating background job systems in production is nerve-wracking. One mistake and you’re losing jobs, double-processing payments, or watching your queue explode.

In my previous post on Solid Queue, I covered why you’d want to use it. This post is about how to actually migrate from Sidekiq without surprises. I’ll share the exact runbook I use, the gotchas that cost me debugging time, and how to roll back instantly if things go wrong.

Why switch:

  • Eliminate Redis dependency (one less service to manage)
  • First-class Rails 8 defaults with better integration
  • Simpler ops (unified database, easier monitoring)
  • Fewer moving parts in your infrastructure

What changes:

  • Queue adapter configuration
  • Retry semantics (explicit vs implicit)
  • Recurring job setup (config file vs gem)
  • Concurrency controls (different mental model)
  • Deploy and shutdown flow

Safe migration path:

  • Incremental per-job rollout
  • Run both systems side-by-side
  • Clean rollback at any step
  • Zero-downtime cutover

Before You Start: Inventory & Risk Map

Don’t touch code yet. Spend a few hours mapping what you have.

Catalogue Current Jobs

I use this script to inventory my Sidekiq jobs:

# lib/tasks/job_inventory.rake
namespace :jobs do
  desc "Inventory all background jobs"
  task inventory: :environment do
    # Eager load so ApplicationJob.descendants and ObjectSpace see every class
    Rails.application.eager_load!

    puts "=== Active Job Classes ==="
    active_job_classes = ApplicationJob.descendants
    active_job_classes.each do |klass|
      queue = klass.queue_name
      adapter = klass.queue_adapter.class.name
      puts "#{klass.name}: queue=#{queue}, adapter=#{adapter}"
    end

    puts "\n=== Native Sidekiq::Worker Classes ==="
    if defined?(Sidekiq::Worker)
      sidekiq_workers = ObjectSpace.each_object(Class).select { |k| k < Sidekiq::Worker }
      sidekiq_workers.each do |klass|
        options = klass.get_sidekiq_options
        puts "#{klass.name}: #{options.inspect}"
      end
    end

    puts "\n=== Sidekiq-Cron Jobs ==="
    if defined?(Sidekiq::Cron::Job)
      Sidekiq::Cron::Job.all.each do |job|
        puts "#{job.name}: #{job.cron} -> #{job.klass}"
      end
    end
  end
end

Look for:

  • Native Sidekiq::Worker classes - Need rewrite to Active Job
  • Custom sidekiq_options - Queues, retries, backtrace limits
  • Sidekiq middleware - Custom behavior that needs porting
  • Pro/Enterprise features - Unique jobs, rate limiting, batches
  • Complex retry logic - Death handlers, custom backoff

Map Scheduling Sources

Document everywhere jobs get scheduled:

Direct scheduling:

# Find all perform_later/perform_at calls
grep -r "perform_later\|perform_at\|perform_in" app/

Cron jobs:

# Check sidekiq-cron configuration
# config/initializers/sidekiq_cron.rb or
# config/schedule.yml

Enterprise periodic jobs:

# In your Sidekiq initializer (Sidekiq Enterprise periodic jobs)
Sidekiq.configure_server do |config|
  config.periodic do |periodic|
    # Document each registration here, e.g.
    # periodic.register("0 * * * *", SomeJobClass)
  end
end

Current Ops Footprint

Document how you operate Sidekiq today:

Graceful shutdown:

# Current deploy process
kill -TSTP <sidekiq_pid>  # Quiet (stops accepting new jobs)
# Wait for jobs to finish
kill -TERM <sidekiq_pid>  # Terminate

Monitoring:

  • Sidekiq Web dashboard location
  • Alert thresholds (queue depth, latency, failure rate)
  • Metrics collection (AppSignal, New Relic, etc.)

Capacity:

# Current sidekiq.yml
:concurrency: 25
:queues:
  - [critical, 5]
  - [default, 3]
  - [mailers, 2]
  - [low_priority, 1]

Save this documentation. You’ll need it to configure Solid Queue equivalently.

Solid Queue in a Nutshell: What Changes Conceptually

Architecture Differences

Sidekiq:

  • Redis stores job data
  • Workers poll Redis
  • Fast (in-memory)
  • Separate service

Solid Queue:

  • Your relational database stores job data (PostgreSQL here; MySQL and SQLite work too)
  • Workers claim jobs with FOR UPDATE SKIP LOCKED
  • Slower (disk-backed, but still fast)
  • Same database
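
The claim query behind that FOR UPDATE SKIP LOCKED bullet looks roughly like this, expressed as an Active Record relation (a simplified sketch of Solid Queue's internals, not its exact SQL):

# Roughly what a worker does to claim ready jobs for one queue;
# Solid Queue wraps this in a transaction and moves claimed rows to claimed_executions
SolidQueue::ReadyExecution
  .where(queue_name: "default")
  .order(:priority, :job_id)
  .limit(5)
  .lock("FOR UPDATE SKIP LOCKED")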

Queues & Priorities

This is where mental models differ.

Sidekiq uses queue weights:

# sidekiq.yml
:queues:
  - [critical, 5]    # 5x more likely to process
  - [default, 3]     # 3x more likely
  - [low_priority, 1]

Solid Queue uses queue order across queues, plus an optional numeric priority within a queue:

# config/queue.yml - Queue order approach
production:
  workers:
    - queues: [critical, default, low_priority]  # Processes in order
      threads: 5

Or priority-based:

# Job-level priority (smaller number = higher priority)
class UrgentJob < ApplicationJob
  queue_with_priority 0  # Highest
end

class NormalJob < ApplicationJob
  queue_with_priority 50  # Medium
end

class BackgroundJob < ApplicationJob
  queue_with_priority 100  # Lowest
end

Recommendation: Use queue order for simplicity. Only use priorities if you need fine-grained control within a single queue.

Recurring Jobs

Sidekiq-Cron:

# config/schedule.yml
daily_summary:
  cron: "0 9 * * *"
  class: "DailySummaryJob"
  queue: mailers

Solid Queue:

# config/recurring.yml
production:
  daily_summary:
    class: DailySummaryJob
    schedule: every day at 9am
    queue: mailers

Solid Queue parses schedules with Fugit, which accepts both standard cron strings and the more readable natural-language form shown above.
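
You can sanity-check a schedule string from the console before relying on it (a quick sketch; Fugit ships as a Solid Queue dependency):

require "fugit"

Fugit.parse("every day at 9am")&.next_time  # nil means Fugit couldn't parse the expression
Fugit.parse("0 9 * * *")&.next_time         # cron strings work too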

Signals & Shutdown

Sidekiq:

TSTP  # Quiet mode (finish current jobs, don't start new ones)
TERM  # Graceful shutdown (wait for jobs, then exit)
INT   # Graceful shutdown (same as TERM)
QUIT  # Immediate shutdown (kill jobs)

Solid Queue:

TERM  # Graceful shutdown (wait for jobs)
INT   # Graceful shutdown (same as TERM)
QUIT  # Immediate shutdown (kill jobs)

Note: No “quiet mode” in Solid Queue. Just stop the supervisor.

Incremental Adoption Plan: Side-by-Side, Low Blast Radius

This is the key to safe migration. Don’t flip everything at once.

Phase 1: Install & Scaffold (No Traffic Yet)

# Add gem
bundle add solid_queue

# Install
bin/rails solid_queue:install

# This creates:
# - config/queue.yml
# - config/recurring.yml (empty)
# - db/queue_schema.rb (defines the solid_queue tables)
# - bin/jobs executable

Configure separate queue database (recommended):

# config/database.yml
production:
  primary:
    <<: *default
    database: myapp_production

  queue:
    <<: *default
    database: myapp_queue_production
    migrations_paths: db/queue_migrate

# Create the queue database and load db/queue_schema.rb
RAILS_ENV=production bin/rails db:prepare

Start Solid Queue worker (separate from Sidekiq):

# In a separate process/container
bin/jobs

At this point, Solid Queue is running but processing nothing. All jobs still go through Sidekiq.
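
You can confirm the processes registered themselves with a quick console sketch (Solid Queue records every supervisor, dispatcher, and worker in solid_queue_processes):

SolidQueue::Process.all.each do |process|
  puts "#{process.kind} pid=#{process.pid} host=#{process.hostname} last_heartbeat=#{process.last_heartbeat_at}"
end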

Phase 2: Per-Job Opt-In

This is the magic. Migrate one job at a time.

# app/jobs/low_risk_job.rb
class LowRiskJob < ApplicationJob
  # Override adapter for just this job
  self.queue_adapter = :solid_queue

  queue_as :default

  def perform(user_id)
    # Job logic
  end
end

Keep global adapter as Sidekiq:

# config/application.rb
config.active_job.queue_adapter = :sidekiq  # Still default

Now LowRiskJob.perform_later(123) goes to Solid Queue. Everything else goes to Sidekiq.
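
A quick way to verify the split from the Rails console (a sketch; class names match the examples above):

LowRiskJob.queue_adapter.class.name      # => "ActiveJob::QueueAdapters::SolidQueueAdapter"
ApplicationJob.queue_adapter.class.name  # => "ActiveJob::QueueAdapters::SidekiqAdapter"

LowRiskJob.perform_later(123)
SolidQueue::Job.where(class_name: "LowRiskJob").count  # should have grown by one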

Start with safe jobs:

  • Non-critical background tasks
  • Jobs that can be retried safely
  • Low-volume jobs
  • Jobs with good monitoring

Avoid migrating first:

  • Payment processing
  • Critical notifications
  • High-volume jobs
  • Jobs with complex retry logic

Phase 3: Flip Active Job Globally

After a few weeks of incremental migration:

# config/application.rb
config.active_job.queue_adapter = :solid_queue  # Now default

For jobs that must stay on Sidekiq (temporarily):

class LegacyJob < ApplicationJob
  self.queue_adapter = :sidekiq  # Explicit override

  def perform
    # Will migrate later
  end
end

Native Sidekiq::Worker classes keep running until you rewrite them:

# This still works, processes via Sidekiq
class OldSidekiqWorker
  include Sidekiq::Worker

  def perform
    # Legacy code
  end
end
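
When you do rewrite one, the translation is usually mechanical. A sketch (names are illustrative; map your old sidekiq_options onto the Active Job equivalents):

# Before: native Sidekiq worker
class OldSidekiqWorker
  include Sidekiq::Worker
  sidekiq_options queue: :default, retry: 5

  def perform(user_id)
    # Legacy code
  end
end

# After: Active Job class, free to run on either adapter
class OldSidekiqWorkerJob < ApplicationJob
  queue_as :default
  retry_on StandardError, wait: :polynomially_longer, attempts: 6  # retry: 5 in Sidekiq = 5 retries after the first attempt

  def perform(user_id)
    # Same logic; arguments must be Active Job-serializable (use GlobalID for records)
  end
end
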

Queue Naming & Routing: Keep Behavior the Same

Map your Sidekiq topology to Solid Queue.

Queue Mapping

Sidekiq queues (from earlier inventory):

:queues:
  - [critical, 5]    # ~42% of cycles
  - [default, 3]     # ~25%
  - [mailers, 2]     # ~17%
  - [low_priority, 1]  # ~8%

Equivalent Solid Queue topology:

# config/queue.yml
production:
  dispatchers:
    - polling_interval: 1
      batch_size: 500

  workers:
    # Critical: 2 processes, 5 threads each = 10 workers
    - queues: critical
      threads: 5
      processes: 2
      polling_interval: 0.1

    # Default: 2 processes, 3 threads each = 6 workers
    - queues: default
      threads: 3
      processes: 2
      polling_interval: 1

    # Mailers: 1 process, 4 threads = 4 workers (I/O bound)
    - queues: mailers
      threads: 4
      processes: 1
      polling_interval: 2

    # Low priority: 1 process, 2 threads = 2 workers
    - queues: low_priority
      threads: 2
      processes: 1
      polling_interval: 5

Capacity comparison:

  • Sidekiq: 25 concurrent jobs (from :concurrency: 25)
  • Solid Queue: 10 + 6 + 4 + 2 = 22 concurrent jobs

Adjust threads/processes to match your capacity needs.

Keep Queue Names Stable

# DON'T change queue names during migration
class ImportantJob < ApplicationJob
  queue_as :critical  # Keep existing name

  def perform
    # ...
  end
end

Changing queue names during migration causes confusion. Keep names identical.

Retries & Error Handling: Match Semantics Explicitly

This is where most migrations go wrong.

Sidekiq Default Retry Behavior

Sidekiq automatically retries failed jobs ~25 times over ~21 days:

# Sidekiq's default retry schedule (you don't write this)
# Retry delays: 15s, 16s, 31s, 96s, 271s, ... up to ~21 days
# After 25 failures, job moves to "Dead" queue

Solid Queue Has No Automatic Retries

Important: Solid Queue doesn’t retry by default. You must configure retries in Active Job.

# app/jobs/application_job.rb
class ApplicationJob < ActiveJob::Base
  # Match Sidekiq's retry behavior
  # (:polynomially_longer is the Rails 7.1+ name for the old :exponentially_longer)
  retry_on StandardError,
           wait: :polynomially_longer,
           attempts: 25

  # Or be more specific
  retry_on Timeout::Error,
           wait: :polynomially_longer,
           attempts: 5

  retry_on ActiveRecord::Deadlocked,
           wait: 5.seconds,
           attempts: 3

  # Discard jobs that shouldn't retry
  discard_on ActiveJob::DeserializationError
  discard_on ActiveRecord::RecordNotFound
end

Real-World Retry Configuration

Here’s what I use in production:

# app/jobs/application_job.rb
class ApplicationJob < ActiveJob::Base
  # Active Job checks retry_on/discard_on handlers from the most recently declared
  # one backwards, so declare the generic catch-all first and the most specific rules last.

  # Default catch-all (approximates Sidekiq's 25 retries)
  retry_on StandardError,
           wait: :polynomially_longer,
           attempts: 25

  # Retry transient errors
  retry_on ActiveRecord::Deadlocked,
           wait: 5.seconds,
           attempts: 3

  retry_on Redis::ConnectionError,
           wait: :polynomially_longer,
           attempts: 5

  retry_on Net::HTTPFatalError,  # raised for 5xx responses by Net::HTTPResponse#error!
           wait: :polynomially_longer,
           attempts: 10

  # Don't retry jobs whose records or arguments no longer exist
  discard_on ActiveJob::DeserializationError
  discard_on ActiveRecord::RecordNotFound

  # Logging
  before_perform do |job|
    Rails.logger.info "Starting #{job.class.name} with #{job.arguments}"
  end

  after_perform do |job|
    Rails.logger.info "Completed #{job.class.name}"
  end

  # Report every failure, then re-raise so retry_on/discard_on still apply.
  # (A rescue_from StandardError block would shadow retry_on and disable retries;
  # around_perform runs inside the retry handling, so it's safe.)
  around_perform do |job, block|
    begin
      block.call
    rescue StandardError => exception
      Rails.logger.error "Job failed: #{exception.message}"
      Rails.error.report(exception, handled: true, context: {
        job_class: job.class.name,
        job_id: job.job_id,
        arguments: job.arguments
      })
      raise
    end
  end
end

Per-Job Retry Overrides

# app/jobs/payment_processor_job.rb
class PaymentProcessorJob < ApplicationJob
  # Override global retry for payment-specific errors
  retry_on PaymentGateway::TemporaryError,
           wait: :polynomially_longer,
           attempts: 5

  discard_on PaymentGateway::CardDeclined  # Don't retry declined cards

  def perform(transaction_id)
    # Process payment
  end
end

Custom Retry Delays

class ApiSyncJob < ApplicationJob
  # Custom backoff: 1s, 4s, 9s, 16s, 25s...
  retry_on ApiError,
           wait: ->(executions) { executions ** 2 },
           attempts: 10

  def perform
    # Sync with external API
  end
end

Failed Job Inspection

Sidekiq Web:

  • Click “Dead” tab
  • View error and backtrace
  • Retry or delete

Mission Control - Jobs:

# Mount dashboard
mount MissionControl::Jobs::Engine, at: "/jobs"

# Visit /jobs/failed
# - View error details
# - Retry individually or in bulk
# - Discard permanently

Scheduling & Recurring Jobs: Cron Migration

Migrating from sidekiq-cron to Solid Queue recurring jobs.

From sidekiq-cron

Current setup (Sidekiq):

# config/schedule.yml
daily_summary:
  cron: "0 9 * * *"
  class: "DailySummaryJob"
  queue: mailers
  description: "Send daily summary emails"

cleanup_sessions:
  cron: "0 */6 * * *"
  class: "SessionCleanupJob"
  queue: low_priority

process_subscriptions:
  cron: "0 2 * * *"
  class: "SubscriptionChargeJob"
  queue: critical
  args:
    force: true

To Solid Queue recurring.yml

# config/recurring.yml
production:
  daily_summary:
    class: DailySummaryJob
    schedule: every day at 9am
    queue: mailers
    # description: "Send daily summary emails"  # Not supported, use comments

  cleanup_sessions:
    class: SessionCleanupJob
    schedule: every 6 hours
    queue: low_priority

  process_subscriptions:
    class: SubscriptionChargeJob
    schedule: every day at 2am
    queue: critical
    args: [{ force: true }]

Cron Syntax Translation

sidekiq-cron uses standard cron:

0 9 * * *     # Daily at 9am
*/15 * * * *  # Every 15 minutes
0 */4 * * *   # Every 4 hours

Solid Queue uses Fugit (more readable):

schedule: every day at 9am
schedule: every 15 minutes
schedule: every 4 hours
schedule: "0 9 * * *"  # Can still use cron syntax

Real-World Migration Example

# config/recurring.yml
production:
  # FinTech reconciliation (was 0 1 * * *)
  daily_reconciliation:
    class: TransactionReconciliationJob
    schedule: every day at 1am
    queue: critical

  # Report generation (was 0 6 * * 1)
  weekly_reports:
    class: WeeklyReportJob
    schedule: every monday at 6am
    queue: default

  # Cleanup old data (was 0 3 * * *)
  cleanup_old_records:
    class: DataCleanupJob
    schedule: every day at 3am
    queue: low_priority

  # Sync with external API (was */30 * * * *)
  api_sync:
    class: ExternalApiSyncJob
    schedule: every 30 minutes
    queue: default

  # Send digest emails (was 0 8 * * 1,3,5)
  digest_emails:
    class: DigestEmailJob
    schedule: "0 8 * * 1,3,5"  # Mon, Wed, Fri at 8am
    queue: mailers

One Source of Truth During Cutover

Critical: Avoid double-enqueueing during migration.

Bad approach (causes duplicates):

# Day 1: Both systems enabled
# sidekiq-cron runs DailySummaryJob at 9am
# Solid Queue also runs DailySummaryJob at 9am
# Users get TWO summary emails

Good approach:

Deploy N (disable sidekiq-cron):

# config/initializers/sidekiq_cron.rb
if ENV['ENABLE_SIDEKIQ_CRON'] == 'true' && defined?(Sidekiq::Cron::Job)
  Sidekiq::Cron::Job.load_from_hash YAML.load_file(Rails.root.join("config/schedule.yml"))
else
  Rails.logger.info "Sidekiq-cron disabled"
end

Deploy N+1 (enable Solid Queue scheduler):

# config/recurring.yml is now active
# Scheduler starts on next deploy

Verification:

# Check Solid Queue scheduled jobs
SolidQueue::RecurringTask.all.each do |task|
  puts "#{task.key}: #{task.schedule}"
end

Concurrency, Throttling & Uniqueness: The Gotchas

Different mental models between Sidekiq and Solid Queue.

Sidekiq Concurrency

Process-level:

# sidekiq.yml
:concurrency: 25  # 25 threads per Sidekiq process

Tuning:

# Run multiple processes for more concurrency
bundle exec sidekiq -c 25  # Process 1
bundle exec sidekiq -c 25  # Process 2
# Total: 50 concurrent jobs

Solid Queue Concurrency

Per-queue configuration:

# config/queue.yml
production:
  workers:
    - queues: default
      threads: 5      # 5 jobs at once
      processes: 3    # Across 3 processes
      # Total: 15 concurrent jobs for 'default' queue

Calculation: threads × processes = concurrent jobs per queue

Job-Level Concurrency Controls

Solid Queue offers per-job concurrency limits:

# app/jobs/report_generation_job.rb
class ReportGenerationJob < ApplicationJob
  # Only 3 report jobs run at once (across ALL workers).
  # The key proc is called with the job's arguments, so accept (and ignore) them.
  limits_concurrency to: 3, key: ->(*) { "reports" }

  def perform(user_id, report_type)
    # CPU-intensive report generation
  end
end

Per-resource concurrency:

# app/jobs/invoice_export_job.rb
class InvoiceExportJob < ApplicationJob
  # Only 1 export per account at a time
  limits_concurrency to: 1, key: -> (account_id) {
    "invoice_export_#{account_id}"
  }

  def perform(account_id)
    account = Account.find(account_id)
    InvoiceExporter.generate_all(account)
  end
end

When to use concurrency controls:

  • Protecting external APIs from rate limits
  • Preventing database lock contention
  • Limiting resource-intensive operations
  • Avoiding race conditions on shared resources

When NOT to use:

  • General throughput control (use queue topology instead)
  • Simple prioritization (use queue order)
  • Most jobs don’t need this

Unique Jobs: The Gap

Sidekiq Enterprise has built-in unique jobs:

class UniqueJob
  include Sidekiq::Worker
  sidekiq_options unique_for: 10.minutes

  def perform(user_id)
    # Only one instance per user_id in 10 minutes
  end
end

Solid Queue doesn’t have built-in uniqueness yet. Workarounds:

Option 1: Concurrency controls

class ProcessUserJob < ApplicationJob
  limits_concurrency to: 1, key: -> (user_id) { "process_user_#{user_id}" }

  def perform(user_id)
    # Only one job per user at a time
  end
end

Option 2: Database-backed idempotency

class ProcessPaymentJob < ApplicationJob
  def perform(payment_id)
    # Use database lock
    payment = Payment.lock.find(payment_id)

    return if payment.processed?  # Already done

    process_payment(payment)
    payment.update!(processed: true)
  end
end

Option 3: Redis-backed deduplication

class DeduplicatedJob < ApplicationJob
  # Assumes a shared connection object, e.g. REDIS = Redis.new(url: ENV["REDIS_URL"])
  # (Redis.current was removed in redis-rb 5)
  def perform(user_id)
    key = "job:#{self.class.name}:#{user_id}"

    # Try to acquire a 10-minute lock (SET NX EX)
    acquired = REDIS.set(key, "1", nx: true, ex: 600)
    return unless acquired  # Another job already holds the lock

    begin
      # Do work
      process_user(user_id)
    ensure
      REDIS.del(key)
    end
  end
end

This is an area where Sidekiq Enterprise is more mature. Plan accordingly.

Observability & Dashboards

Replace Sidekiq Web with Mission Control.

Mounting Dashboards

Before (Sidekiq Web):

# config/routes.rb
require 'sidekiq/web'

authenticate :user, ->(user) { user.admin? } do
  mount Sidekiq::Web, at: '/sidekiq'
end

After (Mission Control - Jobs):

# Gemfile
gem 'mission_control-jobs'

# config/routes.rb
authenticate :user, ->(user) { user.admin? } do
  mount MissionControl::Jobs::Engine, at: '/jobs'
end

Dashboard Features Comparison

Feature           | Sidekiq Web      | Mission Control
Active jobs       | ✓                | ✓
Failed jobs       | ✓                | ✓
Scheduled jobs    | ✓                | ✓
Retry/delete      | ✓                | ✓
Real-time stats   | ✓                | ✓
Historical graphs | ✓                | Limited
Job details       | ✓                | ✓
Recurring jobs    | Via sidekiq-cron | ✓ Built-in

Metrics Integration

AppSignal (works with both):

# Gemfile
gem 'appsignal'

# Automatically tracks Active Job metrics:
# - Job duration
# - Success/failure rates
# - Queue depth
# - Error details

Custom metrics:

# app/jobs/application_job.rb
class ApplicationJob < ActiveJob::Base
  around_perform do |job, block|
    start_time = Time.current

    begin
      block.call

      # Track success
      ActiveSupport::Notifications.instrument(
        'job.success',
        job_class: job.class.name,
        duration: Time.current - start_time
      )
    rescue => error
      # Track failure
      ActiveSupport::Notifications.instrument(
        'job.failure',
        job_class: job.class.name,
        error: error.class.name,
        duration: Time.current - start_time
      )
      raise
    end
  end
end

# Subscribe to notifications
ActiveSupport::Notifications.subscribe('job.success') do |name, start, finish, id, payload|
  # Send to your metrics system
  Metrics.increment('jobs.success', tags: ["job:#{payload[:job_class]}"])
  Metrics.histogram('jobs.duration', payload[:duration], tags: ["job:#{payload[:job_class]}"])
end

Rolling Deploys & Zero-Downtime Cutovers

How to deploy without losing jobs or causing downtime.

Current Sidekiq Deploy Process

Typical flow:

# 1. Quiet Sidekiq (stop accepting new jobs)
kill -TSTP $(cat tmp/pids/sidekiq.pid)

# 2. Wait for current jobs to finish (with timeout)
timeout 60 bash -c 'while kill -0 $(cat tmp/pids/sidekiq.pid) 2>/dev/null; do sleep 1; done'

# 3. Deploy new code
git pull
bundle install
# ... restart app

# 4. Start new Sidekiq (Sidekiq 6+ removed the -d daemonize flag; run it under your process supervisor)
bundle exec sidekiq -C config/sidekiq.yml

# 5. Terminate old Sidekiq (if still running)
kill -TERM $(cat tmp/pids/sidekiq.pid.oldbin)

Solid Queue Deploy Process

Simpler flow:

# 1. Send TERM to supervisor (graceful shutdown)
kill -TERM $(cat tmp/pids/solid_queue.pid)

# Wait for shutdown (respects shutdown_timeout)
# Default: 60 seconds

# 2. Deploy new code
git pull
bundle install

# 3. Start new Solid Queue
bin/jobs

Configure shutdown timeout:

# config/environments/production.rb
# How long the supervisor waits for in-flight jobs before force-terminating workers
config.solid_queue.shutdown_timeout = 60.seconds

Blue/Green Migration Strategy

Run both systems during transition:

Week 1-2: Sidekiq primary, Solid Queue testing

# Most jobs on Sidekiq
config.active_job.queue_adapter = :sidekiq

# Test jobs on Solid Queue
class TestJob < ApplicationJob
  self.queue_adapter = :solid_queue
end

Week 3: Split traffic

# Default most jobs to Solid Queue at the base class
class ApplicationJob < ActiveJob::Base
  self.queue_adapter = :solid_queue
end

# Keep critical jobs on Sidekiq temporarily
class PaymentJob < ApplicationJob
  self.queue_adapter = :sidekiq
end

Week 4: Solid Queue primary

# All jobs on Solid Queue
config.active_job.queue_adapter = :solid_queue

# Disable sidekiq-cron
# Keep Sidekiq running to drain old jobs

Week 5+: Decommission Sidekiq

# Verify no jobs in Sidekiq
Sidekiq::Queue.all.each do |queue|
  puts "#{queue.name}: #{queue.size}"
end

# Verify no scheduled jobs
puts "Scheduled: #{Sidekiq::ScheduledSet.new.size}"
puts "Retries: #{Sidekiq::RetrySet.new.size}"
puts "Dead: #{Sidekiq::DeadSet.new.size}"

# All zeros? Safe to shut down Sidekiq
kill -TERM $(cat tmp/pids/sidekiq.pid)

# Remove from systemd/Docker/Procfile

Puma Plugin Caveat

Don’t use Puma plugin in production for Solid Queue:

# config/puma.rb
# DON'T DO THIS IN PRODUCTION
plugin :solid_queue  # Doesn't support phased restarts

Why: Puma’s phased restart doesn’t gracefully shut down Solid Queue workers.

Better: Run bin/jobs as separate service (systemd, Docker, Kamal).

Step-by-Step Migration Runbook

Copy-paste this into your migration plan.

Preparation

  • Run job inventory script
  • Document all Sidekiq queues and concurrency settings
  • List all sidekiq-cron jobs
  • Identify jobs with unique/rate-limit requirements
  • Review retry and error handling logic
  • Plan rollback strategy
  • Set up staging environment for testing

Phase 1: Install

# Install Solid Queue
bundle add solid_queue
bin/rails solid_queue:install

# Configure separate database (optional but recommended)
# Edit config/database.yml

# Create the queue database and load db/queue_schema.rb
RAILS_ENV=production bin/rails db:prepare

# Configure worker topology
# Edit config/queue.yml

# Start Solid Queue (separate process)
bin/jobs

Verify:

# Check Solid Queue is running
ps aux | grep solid_queue

# Check database
rails console
SolidQueue::Job.count  # Should be 0

Phase 2: Migrate Low-Risk Jobs

Pick 2-3 non-critical jobs:

# app/jobs/cleanup_job.rb
class CleanupJob < ApplicationJob
  self.queue_adapter = :solid_queue  # Add this line

  queue_as :low_priority

  def perform
    # Existing logic
  end
end

Deploy and verify:

# Enqueue test job
CleanupJob.perform_later

# Check Mission Control
# Visit /jobs and verify job appears

Monitor for a week or two:

  • Check error rates
  • Verify jobs complete successfully
  • Compare performance with Sidekiq
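
A quick console sketch for those checks, using Solid Queue's own models (adjust to your queues):

{
  ready:              SolidQueue::ReadyExecution.count,
  scheduled:          SolidQueue::ScheduledExecution.count,
  failed:             SolidQueue::FailedExecution.count,
  finished_last_hour: SolidQueue::Job.where(finished_at: 1.hour.ago..).count
}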

Phase 3: Align Retry Semantics

Add explicit retry configuration:

# app/jobs/application_job.rb
class ApplicationJob < ActiveJob::Base
  # Match Sidekiq behavior
  retry_on StandardError,
           wait: :polynomially_longer,
           attempts: 25

  discard_on ActiveJob::DeserializationError
  discard_on ActiveRecord::RecordNotFound

  # Report errors, then re-raise so retry_on/discard_on still apply
  # (a rescue_from StandardError handler would shadow retry_on and disable retries)
  around_perform do |job, block|
    begin
      block.call
    rescue StandardError => exception
      Rails.error.report(exception, handled: true, context: {
        job_class: job.class.name,
        job_id: job.job_id,
        arguments: job.arguments
      })
      raise
    end
  end
end

Test failure scenarios:

# Create job that fails
class TestFailureJob < ApplicationJob
  self.queue_adapter = :solid_queue

  def perform
    raise "Test error"
  end
end

TestFailureJob.perform_later

# Check Mission Control /jobs/failed
# Verify retry behavior
# Verify error reporting

Phase 4: Migrate Recurring Jobs

Create config/recurring.yml:

production:
  daily_summary:
    class: DailySummaryJob
    schedule: every day at 9am
    queue: mailers

  cleanup_sessions:
    class: SessionCleanupJob
    schedule: every 6 hours
    queue: low_priority

Deploy with sidekiq-cron disabled:

# config/initializers/sidekiq_cron.rb
if ENV['ENABLE_SIDEKIQ_CRON'] == 'true' && defined?(Sidekiq::Cron::Job)
  Sidekiq::Cron::Job.load_from_hash YAML.load_file(Rails.root.join("config/schedule.yml"))
else
  Rails.logger.info "Sidekiq-cron disabled, using Solid Queue recurring jobs"
end

Verify recurring jobs:

SolidQueue::RecurringTask.all.each do |task|
  puts "#{task.key}: next run at #{task.next_time}"
end

Monitor for a week:

  • Verify jobs run at correct times
  • Check for duplicates (should be none)
  • Verify no missed executions
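
A console sketch for those checks (SolidQueue::RecurringExecution records one row per run, keyed by task):

SolidQueue::RecurringExecution.order(run_at: :desc).limit(10).each do |execution|
  puts "#{execution.task_key} ran at #{execution.run_at} (job #{execution.job_id})"
end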

Phase 5: Match Throughput

Tune config/queue.yml to match Sidekiq capacity:

production:
  workers:
    # Calculate: Sidekiq concurrency = 25
    # Distribute across Solid Queue workers

    - queues: critical
      threads: 5
      processes: 2  # 10 workers

    - queues: default
      threads: 5
      processes: 2  # 10 workers

    - queues: [mailers, low_priority]
      threads: 5
      processes: 1  # 5 workers

    # Total: 25 concurrent jobs (matches Sidekiq)

Load test:

# Enqueue 1000 jobs
1000.times do |i|
  SomeJob.perform_later(i)
end

# Monitor processing rate
# Compare with Sidekiq baseline

Phase 6: Flip Global Adapter

# config/application.rb
config.active_job.queue_adapter = :solid_queue  # Change from :sidekiq

Keep override for critical jobs (if needed):

class CriticalPaymentJob < ApplicationJob
  self.queue_adapter = :sidekiq  # Temporary, migrate later
end

Deploy and monitor closely:

  • Watch error rates
  • Monitor queue depths
  • Check job latency
  • Verify no jobs stuck
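
Queue depth and latency are easy to spot-check from the console (a sketch; ready executions are jobs waiting to be claimed):

SolidQueue::ReadyExecution.group(:queue_name).count
# => {"critical"=>0, "default"=>12, ...}

oldest = SolidQueue::ReadyExecution.order(:created_at).first
puts "Oldest waiting job enqueued at #{oldest.created_at}" if oldest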

Phase 7: Decommission Sidekiq

After 1-2 weeks of stable Solid Queue operation:

# 1. Verify Sidekiq queues empty
Sidekiq::Queue.all.map(&:size).sum  # Should be 0

# 2. Verify no scheduled jobs
Sidekiq::ScheduledSet.new.size +
Sidekiq::RetrySet.new.size +
Sidekiq::DeadSet.new.size  # Should be 0

# 3. Stop Sidekiq
systemctl stop sidekiq
# or
kill -TERM $(cat tmp/pids/sidekiq.pid)

# 4. Remove from deploy config
# - Remove from Procfile/systemd
# - Remove sidekiq.yml
# - Remove config/initializers/sidekiq.rb

# 5. Remove gems
# Gemfile
# gem 'sidekiq'
# gem 'sidekiq-cron'

bundle install

Archive Sidekiq metrics and configuration for reference.

Rollback Plan: Practice It Once

You need a tested rollback plan. Practice before migration.

Immediate Rollback

Scenario: Solid Queue is causing issues, need to revert NOW.

# 1. Revert adapter change
git revert <commit-hash>  # Revert queue adapter change

# 2. Deploy immediately
git push
# Trigger deploy

# 3. Restart Sidekiq (if stopped)
systemctl start sidekiq
# or, under your process supervisor:
bundle exec sidekiq -C config/sidekiq.yml

# 4. Keep Solid Queue running
# Let it drain already-enqueued jobs
# Or explicitly fail and re-enqueue later

Re-enqueue failed Solid Queue jobs to Sidekiq:

# In Rails console (a sketch; adjust to the Solid Queue version you run)
# Failed jobs live in solid_queue_failed_executions, each linked to its job
SolidQueue::FailedExecution.includes(:job).find_each do |failed|
  job = failed.job
  payload = job.arguments  # the serialized Active Job payload

  job_class = job.class_name.constantize
  job_args  = ActiveJob::Arguments.deserialize(payload["arguments"])

  # Re-enqueue to Sidekiq (global adapter already reverted)
  job_class.set(queue: job.queue_name).perform_later(*job_args)
end

Graceful Rollback

Scenario: Issues discovered, want controlled rollback.

Phase 1:

# Move jobs back to Sidekiq one by one
class SomeJob < ApplicationJob
  self.queue_adapter = :sidekiq  # Add override
end

# Deploy incrementally

Phase 2:

# Revert global adapter
config.active_job.queue_adapter = :sidekiq

# Re-enable sidekiq-cron
ENV['ENABLE_SIDEKIQ_CRON'] = 'true'

# Stop Solid Queue
kill -TERM $(cat tmp/pids/solid_queue.pid)

Practice Rollback in Staging

Before production migration:

# 1. Set up staging with both systems
# 2. Migrate to Solid Queue
# 3. Run production-like load
# 4. Practice rollback
# 5. Verify all jobs processed correctly

Measure rollback time. Should be quick and reliable.

Testing & CI Safety Nets

Automated tests to catch migration issues.

Active Job Test Helpers

# spec/jobs/my_job_spec.rb
require 'rails_helper'

RSpec.describe MyJob, type: :job do
  describe '#perform' do
    it 'enqueues job to correct queue' do
      MyJob.perform_later(123)

      expect(MyJob).to have_been_enqueued.with(123)
      expect(MyJob).to have_been_enqueued.on_queue('default')
    end

    it 'schedules job for future' do
      MyJob.set(wait: 1.hour).perform_later(123)

      expect(MyJob).to have_been_enqueued.at(1.hour.from_now).with(123)
    end

    it 'retries on errors' do
      allow_any_instance_of(MyJob).to receive(:perform).and_raise(StandardError)

      MyJob.perform_later(123)

      perform_enqueued_jobs

      # Should retry based on retry_on configuration
      expect(MyJob).to have_been_enqueued.at_least(:twice)
    end
  end
end

Migration-Specific Tests

# spec/jobs/migration_spec.rb
require 'rails_helper'

RSpec.describe 'Job migration to Solid Queue' do
  # perform_enqueued_jobs relies on the Active Job :test adapter, so don't swap the
  # adapter here; exercise the real Solid Queue adapter in staging and via the canary job

  it 'processes jobs successfully' do
    expect {
      MyJob.perform_later(123)
      perform_enqueued_jobs
    }.not_to raise_error
  end

  it 'retries failed jobs correctly' do
    attempts = 0
    # Fail the first attempt, succeed afterwards
    allow_any_instance_of(MyJob).to receive(:perform) do
      attempts += 1
      raise StandardError, "transient failure" if attempts == 1
    end

    MyJob.perform_later(123)

    perform_enqueued_jobs

    # Should succeed on retry
    expect(MyJob).to have_been_performed
  end

  it 'respects concurrency limits' do
    # Test job-level concurrency controls
    jobs = 5.times.map { ConcurrencyLimitedJob.perform_later }

    # Only configured number should run simultaneously
    # Implementation depends on your concurrency setup
  end
end

Canary Job

Add a recurring canary to verify scheduler health:

# config/recurring.yml
production:
  canary_health_check:
    class: CanaryJob
    schedule: every 5 minutes
    queue: default

# app/jobs/canary_job.rb
class CanaryJob < ApplicationJob
  queue_as :default

  def perform
    # Record successful execution
    Rails.cache.write(
      'canary_last_run',
      Time.current,
      expires_in: 10.minutes
    )

    # Send metric
    ActiveSupport::Notifications.instrument(
      'canary.success',
      timestamp: Time.current
    )
  end
end

Monitor canary in production:

# Health check endpoint
def jobs_health
  last_canary = Rails.cache.read('canary_last_run')

  if last_canary && last_canary > 10.minutes.ago
    render json: { status: 'ok', last_canary: last_canary }
  else
    render json: { status: 'unhealthy', last_canary: last_canary }, status: 503
  end
end

Alert if canary hasn’t run in > 10 minutes.

Common Pitfalls & How to Avoid Them

Issues I encountered during migrations.

1. Assuming Sidekiq Retry Semantics Carry Over

Problem:

# This job worked in Sidekiq (automatic retries)
class ImportantJob < ApplicationJob
  def perform
    ExternalAPI.call  # Sometimes fails
  end
end

# In Solid Queue: fails once, goes to failed queue, never retries

Solution: Explicit retry configuration

class ImportantJob < ApplicationJob
  retry_on StandardError, wait: :polynomially_longer, attempts: 25

  def perform
    ExternalAPI.call
  end
end

2. Queue Weighting Mental Model

Problem:

# Sidekiq mental model: weights
# [critical, 5], [default, 1]
# = Critical gets ~83% of resources

# Solid Queue config attempt (WRONG):
workers:
  - queues: [critical, default]  # Both processed equally
    threads: 5

Solution: Separate workers or queue order

# Option 1: Separate workers
workers:
  - queues: critical
    threads: 8  # 80% of resources

  - queues: default
    threads: 2  # 20% of resources

# Option 2: Queue order (processes critical first)
workers:
  - queues: [critical, default]
    threads: 10

3. Cron Duplication (Double Enqueues)

Problem:

# Both systems running same cron job
# sidekiq-cron: DailySummaryJob every day at 9am
# Solid Queue recurring.yml: DailySummaryJob every day at 9am
# Result: Users get 2 emails

Solution: Single source of truth

# Deploy N: Disable sidekiq-cron
if ENV['ENABLE_SIDEKIQ_CRON'] != 'true'
  Rails.logger.info "Sidekiq-cron disabled"
  # Don't load schedule
end

# Deploy N+1: Enable Solid Queue recurring jobs
# config/recurring.yml now active

4. Overusing Concurrency Controls

Problem:

# Every job has concurrency control
class Job1 < ApplicationJob
  limits_concurrency to: 5, key: -> { "job1" }
end

class Job2 < ApplicationJob
  limits_concurrency to: 10, key: -> { "job2" }
end

# Complex, hard to reason about, debugging nightmare

Solution: Use topology first

# Simple and clear
workers:
  - queues: job1_queue
    threads: 5

  - queues: job2_queue
    threads: 10

Only use concurrency controls for:

  • Per-resource limits (e.g., one export per account)
  • Protecting external APIs
  • Preventing race conditions

5. Not Testing Rollback

Problem: Production issues, attempt rollback, discover:

  • Rollback process unclear
  • Jobs lost during transition
  • Sidekiq configuration deleted
  • Team doesn’t know how to re-enqueue jobs

Solution: Practice rollback in staging

  • Document exact steps
  • Test re-enqueueing failed jobs
  • Keep Sidekiq config until fully decommissioned
  • Time the rollback

6. Connection Pool Exhaustion

Problem:

# Solid Queue workers
workers:
  - queues: default
    threads: 25
    processes: 4
# Total: 100 concurrent jobs

# But database pool:
production:
  pool: 5  # Not enough!

Each worker thread holds a database connection while it runs a job, and the pool is per process: 25 threads per process against a pool of 5 exhausts it immediately. Size the queue pool to at least the threads per process, plus a few connections for Solid Queue's polling and heartbeat threads.

Solution:

# config/database.yml
production:
  queue:
    # Pool is per process: cover threads per worker plus Solid Queue's own threads
    # (connections are created lazily, so a generous value is safe)
    pool: <%= ENV.fetch("SOLID_QUEUE_POOL_SIZE", 110) %>

Production Deployment Configurations

Copy-paste configs for different deployment methods.

Systemd Service

# /etc/systemd/system/solid-queue.service
[Unit]
Description=Solid Queue Worker
After=network.target postgresql.service

[Service]
Type=simple
User=deploy
WorkingDirectory=/var/www/myapp/current
Environment=RAILS_ENV=production
Environment=SOLID_QUEUE_POOL_SIZE=50

ExecStart=/usr/local/bin/bundle exec bin/jobs

# Graceful shutdown
KillSignal=SIGTERM
TimeoutStopSec=60
KillMode=mixed

# Restart on failure
Restart=on-failure
RestartSec=5

# Logging
StandardOutput=append:/var/log/solid-queue/stdout.log
StandardError=append:/var/log/solid-queue/stderr.log

[Install]
WantedBy=multi-user.target

# Enable and start
sudo systemctl enable solid-queue
sudo systemctl start solid-queue

# Check status
sudo systemctl status solid-queue

# View logs
sudo journalctl -u solid-queue -f

# Restart (graceful: SIGTERM, wait up to TimeoutStopSec, then start again)
sudo systemctl restart solid-queue

# Stop
sudo systemctl stop solid-queue

Docker Compose

# docker-compose.yml
version: '3.8'

services:
  web:
    image: myapp:latest
    command: bundle exec puma
    ports:
      - "3000:3000"
    environment:
      - DATABASE_URL=postgresql://postgres:password@db:5432/myapp_production
      - QUEUE_DATABASE_URL=postgresql://postgres:password@db:5432/myapp_queue_production
      - RAILS_ENV=production
    depends_on:
      - db

  jobs:
    image: myapp:latest
    command: bundle exec bin/jobs
    environment:
      - DATABASE_URL=postgresql://postgres:password@db:5432/myapp_production
      - QUEUE_DATABASE_URL=postgresql://postgres:password@db:5432/myapp_queue_production
      - RAILS_ENV=production
      - SOLID_QUEUE_POOL_SIZE=50
    depends_on:
      - db
    restart: unless-stopped

  db:
    image: postgres:16
    environment:
      - POSTGRES_PASSWORD=password
    volumes:
      - postgres-data:/var/lib/postgresql/data

volumes:
  postgres-data:

Kamal Configuration

# config/deploy.yml
service: myapp

image: username/myapp

servers:
  web:
    hosts:
      - 192.168.1.1
    options:
      network: "private"

  jobs:
    cmd: bin/jobs
    hosts:
      - 192.168.1.1
    options:
      network: "private"
    env:
      clear:
        SOLID_QUEUE_POOL_SIZE: 50

registry:
  username: username
  password:
    - KAMAL_REGISTRY_PASSWORD

env:
  secret:
    - DATABASE_URL
    - QUEUE_DATABASE_URL
    - SECRET_KEY_BASE

accessories:
  postgres:
    image: postgres:16
    host: 192.168.1.1
    port: 5432
    env:
      secret:
        - POSTGRES_PASSWORD
    directories:
      - data:/var/lib/postgresql/data
    options:
      network: "private"

# Deploy
kamal deploy

# Restart jobs only
kamal app boot --roles jobs

# View logs
kamal app logs --roles jobs

# SSH to jobs container
kamal app exec --roles jobs sh

Procfile (Heroku/Render)

# Procfile
web: bundle exec puma -C config/puma.rb
jobs: bundle exec bin/jobs

Heroku:

# Scale jobs
heroku ps:scale jobs=2

# View logs
heroku logs --ps jobs --tail

# Restart jobs
heroku ps:restart jobs

Solid Queue Tradeoffs

Latency increase: 45ms → 180ms

  • Impact: Acceptable. Jobs aren’t user-facing.
  • Mitigation: None needed. Users don’t notice.

Throughput reduction: 600 → 550 jobs/min

  • Impact: Still well above our 350 jobs/min average.
  • Mitigation: Can increase worker threads if needed.

Feature loss: No built-in unique jobs

  • Impact: Added manual deduplication logic (3 jobs affected).
  • Mitigation: Database-backed idempotency keys.

Engineering Wins

Simplicity: One less service to manage, monitor, upgrade

Reliability: Fewer moving parts = fewer failure modes

Developer experience: Better Rails integration, nicer dashboard

Reduced operational overhead: Single database to maintain

The Bottom Line

Migrating from Sidekiq to Solid Queue is straightforward if you:

  1. Plan incrementally - Per-job migration, not big bang
  2. Match retry semantics - Explicit Active Job configuration
  3. Test rollback - Practice before production
  4. Monitor closely - First 2 weeks are critical
  5. Accept trade-offs - Slightly higher latency for simpler ops

You should migrate if:

  • Job volume < 1M/day
  • Latency requirements > 100ms
  • Team values operational simplicity
  • Using PostgreSQL already

Stick with Sidekiq if:

  • Job volume > 5M/day
  • Need < 50ms latency
  • Heavily using Pro/Enterprise features (batches, unique jobs)
  • Already have mature Sidekiq setup working well

For most Rails applications, Solid Queue’s simplicity outweighs the small latency increase. The migration is less scary than it seems. Take it one job at a time, test thoroughly, and keep a rollback plan ready.


Need help migrating your Rails application to Solid Queue? I’ve successfully migrated production applications to Solid Queue and can guide your team through the process. From planning to deployment to monitoring, I’ll ensure a smooth, zero-downtime transition.

Let’s discuss your migration: nikita.sinenko@gmail.com
