PostgreSQL Database Optimization in Rails: How to Cut Query Times by 95%

Deep dive into database optimization techniques that reduced query times by 95% in production applications. Practical strategies for indexing, query optimization, and eliminating N+1 queries.

Database Optimization in Rails

Database optimization is where you get the most dramatic performance wins in Rails applications. A well-optimized database can mean the difference between a 5-second page load and a 50ms one.

This guide covers techniques used to optimize databases in production applications processing millions of records, handling thousands of concurrent users, and managing high-volume transactions. These aren’t theoretical concepts—they’re battle-tested patterns that have saved millions of database queries in production.

The Foundation: Understanding Query Cost

Before optimizing, you need to measure. Here’s the recommended workflow:

1. Enable Query Logging in Development

# config/environments/development.rb
config.active_record.logger = Logger.new(STDOUT)
config.log_level = :debug

# Or use the bullet gem
gem 'bullet', group: :development

# config/environments/development.rb
config.after_initialize do
  Bullet.enable = true
  Bullet.alert = true
  Bullet.console = true
end

2. Use EXPLAIN ANALYZE

# In Rails console
User.joins(:transactions).where(created_at: 1.year.ago..).explain

# Or raw SQL
ActiveRecord::Base.connection.execute("
  EXPLAIN ANALYZE
  SELECT users.*, COUNT(transactions.id)
  FROM users
  LEFT JOIN transactions ON transactions.user_id = users.id
  GROUP BY users.id
").to_a

Look for:

  • Sequential scans on large tables (bad)
  • High execution time
  • Rows scanned vs. rows returned (massive difference = inefficient)
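
For reference, here is roughly what the two plan shapes look like in EXPLAIN ANALYZE output (abridged; the numbers and index name are illustrative).

A sequential scan (no usable index):

  Seq Scan on transactions  (cost=0.00..4325.00 rows=1200 width=64) (actual time=0.021..45.112 rows=1180 loops=1)
    Filter: (user_id = 123)
    Rows Removed by Filter: 198820
  Execution Time: 45.690 ms

An index scan (after adding an index on user_id):

  Index Scan using index_transactions_on_user_id on transactions  (cost=0.42..95.20 rows=1200 width=64) (actual time=0.031..0.480 rows=1180 loops=1)
    Index Cond: (user_id = 123)
  Execution Time: 0.520 ms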

3. Monitor in Production

# Use rack-mini-profiler
gem 'rack-mini-profiler'

# Or APM tools
gem 'skylight'  # My favorite for Rails apps
gem 'newrelic_rpm'

Problem #1: The N+1 Query Epidemic

This is the #1 performance killer in Rails apps.

The Problem

# app/controllers/users_controller.rb
def index
  @users = User.all
end
<!-- app/views/users/index.html.erb -->
<% @users.each do |user| %>
  <tr>
    <td><%= user.name %></td>
    <td><%= user.transactions.count %></td>  <!-- N+1! -->
    <td><%= user.latest_transaction&.amount %></td>  <!-- N+1! -->
  </tr>
<% end %>

Database queries: 1 (fetch users) + N (count transactions for each) + N (fetch latest transaction) = 1 + 2N queries

For 100 users: 201 queries. For 1000 users: 2001 queries.

The Solution: Eager Loading

# app/controllers/users_controller.rb
def index
  @users = User
    .includes(:transactions)  # Load transactions
    .left_joins(:transactions)
    .select('users.*, COUNT(transactions.id) as transactions_count')
    .group('users.id')
end

Database queries: 2 (users + transactions). Always 2, regardless of user count.

Advanced Pattern: Preload with Scope

class User < ApplicationRecord
  has_many :transactions
  has_one :latest_transaction,
          -> { order(created_at: :desc) },
          class_name: 'Transaction'
end

# Controller
def index
  @users = User.includes(:transactions, :latest_transaction)
end
<!-- View -->
<% @users.each do |user| %>
  <tr>
    <td><%= user.name %></td>
    <td><%= user.transactions.size %></td>  <!-- No query! -->
    <td><%= user.latest_transaction&.amount %></td>  <!-- No query! -->
  </tr>
<% end %>

Even Better: Counter Caches

For counts, use counter caches:

# Migration
class AddTransactionsCountToUsers < ActiveRecord::Migration[8.0]
  def change
    add_column :users, :transactions_count, :integer, default: 0, null: false

    # Backfill existing data
    User.find_each do |user|
      User.reset_counters(user.id, :transactions)
    end
  end
end

# Model
class Transaction < ApplicationRecord
  belongs_to :user, counter_cache: true
end

Now user.transactions.size is instant: it just reads the column. Note that .count still issues a SQL COUNT, so use .size when you want the cached value.

Problem #2: Missing Indexes

Indexes are your database’s superpower. Without them, queries scan entire tables.

Identify Missing Indexes

# This query is slow
Transaction.where(user_id: 123, status: 'completed').order(created_at: :desc)

# Check execution plan
Transaction.where(user_id: 123, status: 'completed').order(created_at: :desc).explain

# Look for "Seq Scan" (bad) vs "Index Scan" (good)

Add the Right Index

class AddIndexToTransactions < ActiveRecord::Migration[8.0]
  def change
    # Compound index for WHERE + ORDER BY
    add_index :transactions,
              [:user_id, :status, :created_at],
              name: 'index_transactions_on_user_status_date'
  end
end

Rule of thumb: Index columns used in:

  • WHERE clauses
  • JOIN conditions
  • ORDER BY clauses
  • GROUP BY clauses

Index Column Order Matters

# If you query by user_id AND status
add_index :transactions, [:user_id, :status]

# But if you also query by user_id alone
add_index :transactions, [:user_id, :status]  # Works for both!

# But NOT if you query by status alone
# This index won't be used for: where(status: 'completed')
# You'd need a separate index on [:status]

Left-prefix rule: An index on [a, b, c] can be used for queries filtering:

  • a
  • a, b
  • a, b, c

But NOT for queries filtering only on b, only on c, or on b and c without a.
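
Concretely, with the three-column index from the migration above:

# Can use index_transactions_on_user_status_date (left prefix)
Transaction.where(user_id: 123)
Transaction.where(user_id: 123, status: 'completed')
Transaction.where(user_id: 123, status: 'completed').order(created_at: :desc)

# Cannot use it (no leading user_id); this needs its own index
Transaction.where(status: 'completed')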

Partial Indexes

For columns with many NULLs or specific values you query frequently:

# Only index completed transactions
add_index :transactions,
          [:user_id, :created_at],
          where: "status = 'completed'",
          name: 'index_completed_transactions'

# Much smaller index, faster queries on completed transactions
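
Queries whose conditions match the index predicate can use it:

# WHERE includes status = 'completed', so the partial index applies
Transaction.where(user_id: 123, status: 'completed').order(created_at: :desc)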

Unique Indexes

Enforce uniqueness at the database level:

add_index :users, :email, unique: true
add_index :transactions, [:user_id, :external_id], unique: true
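
The unique index is the real guarantee. At the application layer you would typically pair it with a validation and rescue the race-condition case; a minimal sketch (the email_param input is illustrative):

# app/models/user.rb
class User < ApplicationRecord
  validates :email, presence: true, uniqueness: true  # friendly errors for the common case
end

# The validation can still race under concurrent writes; the index cannot.
# Rescue the constraint violation wherever records are created:
begin
  User.create!(email: email_param)  # email_param: illustrative input
rescue ActiveRecord::RecordNotUnique
  # handle the duplicate, e.g. re-render the form with an error
end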

Problem #3: Inefficient Queries

Sometimes the query itself is the problem.

Use EXISTS Instead of COUNT

# Slow: counts all matching records
if user.transactions.where(status: 'pending').count > 0
  # ...
end

# Fast: stops at first match
if user.transactions.where(status: 'pending').exists?
  # ...
end

Use SELECT to Limit Columns

# Loads all columns (including JSONB, TEXT, etc.)
@users = User.all

# Loads only what you need
@users = User.select(:id, :name, :email)

For large tables or wide rows, this saves significant memory and network bandwidth.
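
When you don't need model objects at all, pluck goes one step further and returns plain values in a single query:

# Skips ActiveRecord instantiation entirely (output illustrative)
User.where(active: true).pluck(:id, :email)
# => [[1, "jane@example.com"], [2, "sam@example.com"], ...]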

Use LIMIT

# Loads every matching row into memory
User.where('created_at > ?', 1.year.ago).to_a

# Stops after finding 100
User.where('created_at > ?', 1.year.ago).limit(100).to_a

Always use LIMIT for:

  • Autocomplete dropdowns
  • “Recent items” lists
  • Preview queries
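
For example, an autocomplete endpoint should never return an unbounded result set (the params[:q] access is illustrative):

# Top 10 name matches, nothing more
User.where("name ILIKE ?", "#{params[:q]}%").order(:name).limit(10)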

Batch Processing

# BAD: Loads all records into memory
User.all.each do |user|
  user.process_data
end

# GOOD: Processes in batches of 1000
User.find_each(batch_size: 1000) do |user|
  user.process_data
end

# For custom queries
User.where(active: true).in_batches(of: 500) do |batch|
  batch.update_all(last_checked: Time.current)
end

Problem #4: Slow Aggregations

Use Database Aggregations

# Slow: Loads all records into Ruby memory
transactions = user.transactions.to_a
total = transactions.sum(&:amount)
average = transactions.map(&:amount).sum / transactions.size

# Fast: Database does the math
total = user.transactions.sum(:amount)
average = user.transactions.average(:amount)
count = user.transactions.count

PostgreSQL is optimized for aggregations. Use it.
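
Grouped aggregations follow the same rule. For example, monthly totals can be computed entirely in SQL (output illustrative):

# One query; PostgreSQL groups and sums, Ruby just receives a hash
user.transactions.group("DATE_TRUNC('month', created_at)").sum(:amount)
# => { 2025-01-01 00:00:00 UTC => 1234.50, 2025-02-01 00:00:00 UTC => 987.00, ... }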

Materialized Views for Complex Reports

For reports that are expensive to calculate:

# db/migrate/20250101_create_user_statistics_view.rb
class CreateUserStatisticsView < ActiveRecord::Migration[8.0]
  def up
    execute <<-SQL
      CREATE MATERIALIZED VIEW user_statistics AS
      SELECT
        users.id AS user_id,
        users.name,
        COUNT(DISTINCT transactions.id) AS transaction_count,
        SUM(transactions.amount) AS total_amount,
        AVG(transactions.amount) AS average_amount,
        MAX(transactions.created_at) AS last_transaction_at
      FROM users
      LEFT JOIN transactions
        ON transactions.user_id = users.id
        AND transactions.created_at > NOW() - INTERVAL '1 year'
      GROUP BY users.id, users.name;

      CREATE UNIQUE INDEX ON user_statistics (user_id);
    SQL
  end

  def down
    execute "DROP MATERIALIZED VIEW IF EXISTS user_statistics"
  end
end
# app/models/user_statistic.rb
class UserStatistic < ApplicationRecord
  self.primary_key = 'user_id'

  # Refresh materialized view
  def self.refresh
    connection.execute('REFRESH MATERIALIZED VIEW CONCURRENTLY user_statistics')
  end
end

# Schedule refresh (e.g., every hour)
# config/schedule.rb (with whenever gem)
every 1.hour do
  runner "UserStatistic.refresh"
end

Now complex reports are instant—just SELECT * FROM user_statistics.
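
Reads against the view are now ordinary ActiveRecord calls (current_user is assumed from the surrounding controller context):

# Top users by volume, straight from the precomputed view
UserStatistic.order(total_amount: :desc).limit(10)

# Stats for the signed-in user
UserStatistic.find_by(user_id: current_user.id)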

Problem #5: JSON Columns Performance

JSON columns are powerful but can be slow if not indexed properly.

# Schema
create_table :transactions do |t|
  t.jsonb :metadata, default: {}, null: false
end

# Add GIN index for JSONB
add_index :transactions, :metadata, using: :gin

# The GIN index makes containment queries fast
Transaction.where("metadata @> ?", { payment_method: 'credit_card' }.to_json)

# Key lookups also work; note ->> returns text (-> returns jsonb), so compare against strings.
# Plain ->> equality is not served by the GIN index (see the expression index below).
Transaction.where("metadata ->> 'payment_method' = ?", 'credit_card')
Transaction.where("metadata ->> 'amount' = ?", '100.00')

Specific Key Indexes

# If you frequently compare a specific JSON key as text, add a btree expression index on ->>
add_index :transactions, "(metadata ->> 'payment_method')", name: 'index_transactions_on_payment_method'

Problem #6: Connection Pool Exhaustion

# config/database.yml
production:
  pool: <%= ENV.fetch("RAILS_MAX_THREADS") { 5 } %>

Symptoms:

  • could not obtain a database connection within 5 seconds
  • Requests timing out under load

Solutions:

1. Increase Pool Size

production:
  pool: 20  # At least the threads per Puma worker (each worker process has its own pool), plus headroom for background threads
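
For reference, a matching Puma setup (the env var names follow Rails' generated defaults; total PostgreSQL connections will be roughly workers * pool):

# config/puma.rb
workers ENV.fetch("WEB_CONCURRENCY") { 2 }
max_threads = ENV.fetch("RAILS_MAX_THREADS") { 5 }
threads max_threads, max_threads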

2. Use PgBouncer

PgBouncer pools connections at the database level:

# pgbouncer.ini
[databases]
myapp = host=localhost dbname=myapp_production

[pgbouncer]
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 20
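
One caveat: in transaction pooling mode a session is not pinned to a single client, so Rails' prepared statements generally need to be disabled:

# config/database.yml
production:
  prepared_statements: false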

3. Cap Long-Running Queries

A statement_timeout cancels queries that hog connections and returns them to the pool. Setting it on a single checked-out connection does little; set it for every connection in database.yml:

# config/database.yml
production:
  variables:
    statement_timeout: 30000  # milliseconds (30 seconds)

Production Case Study: Trading Dashboard

Here’s a complete optimization example for a trading dashboard.

Before Optimization

# app/controllers/dashboards_controller.rb
def show
  @user = current_user
  @transactions = @user.transactions.order(created_at: :desc).limit(10)
  @total_value = @user.portfolio.sum(&:current_value)
  @total_gain = @user.portfolio.sum(&:total_gain)
  @watchlist = @user.watchlist_items.map(&:symbol)
end

Performance:

  • Page load: 4.2 seconds
  • Database queries: 127
  • Database time: 3.8 seconds

After Optimization

# app/models/user.rb
class User < ApplicationRecord
  has_many :transactions
  has_many :portfolio_items
  has_many :watchlist_items

  # Cached calculations
  def portfolio_summary
    Rails.cache.fetch("user_#{id}_portfolio_summary", expires_in: 5.minutes) do
      {
        total_value: portfolio_items.sum(&:current_value),
        total_gain: portfolio_items.sum(&:total_gain),
        items_count: portfolio_items.count
      }
    end
  end
end

# app/controllers/dashboards_controller.rb
def show
  @user = current_user

  # Eager load with limits
  @transactions = @user.transactions
                       .includes(:related_order)
                       .order(created_at: :desc)
                       .limit(10)

  # Cached aggregation
  @portfolio_summary = @user.portfolio_summary

  # Single query for watchlist
  @watchlist = @user.watchlist_items.pluck(:symbol)
end

# Add indexes
add_index :transactions, [:user_id, :created_at]
add_index :portfolio_items, :user_id
add_index :watchlist_items, :user_id

Performance After:

  • Page load: 180ms (23x faster)
  • Database queries: 4
  • Database time: 45ms

Monitoring & Ongoing Optimization

1. Identify Slow Queries

# PostgreSQL slow query log
# postgresql.conf
log_min_duration_statement = 100  # Log queries slower than 100ms

-- Or use pg_stat_statements (requires shared_preload_libraries = 'pg_stat_statements')
CREATE EXTENSION pg_stat_statements;

-- Then query (column names for PostgreSQL 13+; older versions use total_time/mean_time/max_time)
SELECT
  query,
  calls,
  total_exec_time,
  mean_exec_time,
  max_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 20;

2. Set Up Alerts

# config/initializers/notifications.rb
ActiveSupport::Notifications.subscribe('sql.active_record') do |name, start, finish, id, payload|
  duration = (finish - start) * 1000

  if duration > 1000  # Queries over 1 second
    Rails.logger.warn "Slow Query (#{duration.round}ms): #{payload[:sql]}"

    # Optional: Send to error tracking
    Rollbar.warning("Slow database query", {
      duration: duration,
      sql: payload[:sql]
    })
  end
end

3. Regular VACUUM

PostgreSQL needs maintenance:

-- Manual
VACUUM ANALYZE;

-- Automatic (on by default; keep it enabled and tune thresholds for busy tables)
-- postgresql.conf
autovacuum = on

Optimization Checklist

Before deploying to production, check:

  • All foreign keys have indexes
  • Queries use eager loading (no N+1)
  • Counter caches for counts
  • Indexes on WHERE/ORDER BY columns
  • Partial indexes where appropriate
  • Database-level aggregations (not Ruby)
  • SELECT only needed columns
  • LIMIT on all lists
  • Batch processing for large datasets
  • Connection pool sized correctly
  • Slow query logging enabled
  • EXPLAIN ANALYZE on critical queries
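
For the first item, a quick check in the Rails console can flag foreign key columns without an index. A minimal sketch (it only sees foreign keys declared with add_foreign_key constraints):

conn = ActiveRecord::Base.connection
conn.tables.each do |table|
  # The first column of each index counts, thanks to the left-prefix rule
  indexed_firsts = conn.indexes(table).map { |index| Array(index.columns).first }
  conn.foreign_keys(table).each do |fk|
    next if indexed_firsts.include?(fk.column)
    puts "Missing index: #{table}.#{fk.column}"
  end
end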

Advanced Techniques

1. Composite Primary Keys (Rails 7.1+)

# For many-to-many tables
create_table :user_roles, primary_key: [:user_id, :role_id] do |t|
  t.integer :user_id, null: false
  t.integer :role_id, null: false
end

2. CTEs (Common Table Expressions, Rails 7.1+)

User.with(
  active_users: User.where(active: true),
  recent_transactions: Transaction.where('created_at > ?', 30.days.ago)
).joins('INNER JOIN active_users ON...')

3. Window Functions

# Rank users by transaction volume
User
  .joins(:transactions)
  .group('users.id')
  .select('users.*, RANK() OVER (ORDER BY COUNT(transactions.id) DESC) AS volume_rank')

The Bottom Line

Database optimization isn’t about premature optimization—it’s about:

  1. Measuring before optimizing
  2. Fixing the biggest bottlenecks first
  3. Testing impact of changes
  4. Monitoring continuously

The techniques in this post have:

  • Reduced page load times from seconds to milliseconds
  • Cut database costs by 60-70%
  • Enabled handling 10x more users with same hardware
  • Prevented production outages

Start with the basics:

  • Eliminate N+1 queries
  • Add proper indexes
  • Use database aggregations

Then iterate based on production data.

What’s Next?

Want to dive deeper? Potential future topics include:

  • Redis caching strategies for Rails
  • Scaling PostgreSQL with read replicas
  • Database sharding for multi-tenant apps
  • Time-series data optimization with TimescaleDB

Vote in the comments or reach out!


Tools & Resources