Beginners Guide to Ruby on Rails Performance: Part 2
This is the continuation of my post about simple changes to improve the performance of a Ruby on Rails application.
Part 1 is here: Beginners Guide to Ruby on Rails Performance: Part 1
Part 1 dealt with fairly straightforward fixes. Part 2 is a bit more complicated, but not much more.
Nested loops
When you first learn the Rails Way™ you're taught to write nested loops because of the way associations are defined on models. For example, suppose we have a set of models with has_many associations:
class Customer < ActiveRecord::Base
  has_many :invoices
end

class Invoice < ActiveRecord::Base
  has_many :products
  belongs_to :customer
end

class Product < ActiveRecord::Base
  belongs_to :invoice
  belongs_to :refund, optional: true
  has_many :charges
end

class Charge < ActiveRecord::Base
  belongs_to :product
  scope :discounted, -> { where(discount: true) }
end

class Refund < ActiveRecord::Base
  has_one :product
end
It's easy and tempting (and kind of encouraged) to write code like this:
discounts = []
Customer.find_by(name: 'Jane').invoices.each do |invoice|
  invoice.products.each do |product|
    product.charges.each do |charge|
      discounts << charge if charge.discount?
    end
  end
end
We iterate through each association, looking at each record and its associations as we go. It's easy to understand because it aligns with the models, but it's really inefficient.
[Lots of queries]
Memory Usage: 41.44 MB
GC Runs: 12
Objects Created: 245,969
The garbage collector ran twelve times! That's not great. This code loads every record in the database to find what could be a small subset of the data. To fix this we want to turn the loops into joins. This can be a bit hard to reason about, but the way to think about it is that we turn the loops inside out. That is, start with the object "type" that you are looking for, usually the deepest loop, and join your way back up to the outer loop. An example will make it clearer what I mean. For the above code we want to return a list of charges that have discounts, so we start from there and join each association as we move out of the loops, adding any where clauses as we go:
Charge.where(discount: true)
      .joins(product: { invoice: :customer })
      .where(customers: { name: 'Jane' }) # <--- table name
I'd argue that this version is more readable. It's definitely more declarative, though it requires you to understand more SQL. One thing to note when doing joins like this is the table name in the where clause. The association from Invoice to Customer is customer, but we have to use the table name customers. This is especially important if the table name doesn't follow the Rails convention.
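As a hypothetical illustration (the clients table name is made up for this sketch): if Customer were backed by a legacy table, joins would still take the association names while where needs the actual table name.

# Hypothetical: Customer backed by a non-conventional table name.
class Customer < ActiveRecord::Base
  self.table_name = 'clients'
  has_many :invoices
end

# joins uses association names; where must use the real table name.
Charge.where(discount: true)
      .joins(product: { invoice: :customer })
      .where(clients: { name: 'Jane' })

Back to our conventional schema, the join version produces this query: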
Charge Load (6.8ms) SELECT "charges".* FROM "charges" INNER JOIN "products" ON "products"."id" = "charges"."product_id" INNER JOIN "invoices" ON "invoices"."id" = "products"."invoice_id" INNER JOIN "customers" ON "customers"."id" = "invoices"."customer_id" WHERE "charges"."discount" = ? AND "customers"."name" = ? [["discount", 1], ["name", "Jane"]]
Memory Usage: 2.88 MB
GC Runs: 0
Objects Created: 12,643
Comparison:
query: 195.4 i/s
nest loop with includes: 5.9 i/s - 32.99x slower
nest loop: 1.7 i/s - 113.85x slower
Whereas the first version loads every Charge, even the ones we immediately throw away, the new version only loads the Charges we know we need. It executes a single SQL query and will be much faster even for small data sets. At the risk of spoiling my next post, I benchmarked a version of the nested loop code using includes. That made four queries rather than the hundred-thousand-plus the naive version makes. Memory consumption didn't change. See my next post for details.
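For reference, the includes version I benchmarked looked something like this (a sketch, not the exact benchmark code):

discounts = []
Customer.includes(invoices: { products: :charges })
        .find_by(name: 'Jane')
        .invoices.each do |invoice|
  invoice.products.each do |product|
    product.charges.each do |charge|
      discounts << charge if charge.discount?
    end
  end
end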
This isn't exactly a nested loop problem, but don't forget about update_all, insert_all, delete_all and destroy_all:
Customer.find_by(name: 'Jane').invoices.each do |invoice|
  invoice.products.each do |product|
    product.charges.each do |charge|
      charge.update(discount_amount: 0) if charge.discount?
    end
  end
end
Charge.where(discount: true)
      .joins(product: { invoice: :customer })
      .where(customers: { name: 'Jane' })
      .update_all(discount_amount: 0)
This will make a single UPDATE query. It's not always possible, but it's useful when it is.
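While we're on the *_all methods, it's worth remembering the difference between delete_all and destroy_all. A quick sketch:

# delete_all issues a single DELETE; no callbacks, no objects instantiated.
Charge.where(discount: true).delete_all

# destroy_all loads every Charge and runs its callbacks, one DELETE per record.
Charge.where(discount: true).destroy_all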
Queries
Earlier I had this code example.
warm_hues = Colour.where(name: %w|red orange yellow|).pluck(:id)
warm_records = TestRecord.where(colour_id: warm_hues)
Colour Pluck (0.3ms) SELECT "colours"."id" FROM "colours" WHERE "colours"."name" IN (?, ?, ?) [["name", "red"], ["name", "orange"], ["name", "yellow"]]
TestRecord Load (344.4ms) SELECT "test_records".* FROM "test_records" WHERE "test_records"."colour_id" IN (?, ?, ?) [["colour_id", 1], ["colour_id", 2], ["colour_id", 3]]
Memory Usage: 83.25 MB
GC Runs: 6
Objects Created: 3,000,272
This executes two queries when it could easily be reduced to one:
warm_records = TestRecord.joins(:colour).where(colours: {name: %w|red orange yellow|})
TestRecord Load (425.0ms) SELECT "test_records".* FROM "test_records" INNER JOIN "colours" ON "colours"."id" = "test_records"."colour_id" WHERE "colours"."name" IN (?, ?, ?) [["name", "red"], ["name", "orange"], ["name", "yellow"]]
Memory Usage: 88.98 MB
GC Runs: 5
Objects Created: 3,000,281
We're not saving anything here, but not all ActiveRecord query chains are this simple.
Here's a more complex example that will benefit more from a bit of tuning. This is actual code from a system that I worked on. The names have been changed to protect the innocent (me). This code was spread across several functions, but inlining it like this hasn't changed the behaviour.
count_objects do
  Customer.find_by(name: 'Jane')
          .invoices
          .flat_map(&:products)
          .select { |product| product.refund_id.nil? }
          .uniq
          .flat_map(&:charges)
          .select(&:discount?)
          .sum(&:total_price_incl_gst)
end
[Lots of queries]
Memory Usage: 38.05 MB
GC Runs: 14
Objects Created: 279,756
To optimise these sorts of complex queries we use the same technique as we did with nested loops: turn the query inside out. What we are after is the charges, so we start with those and join in reverse back to the customer, adding the where clauses as we go:
count_objects do
  Charge.where(discount: true)
        .joins(product: { invoice: :customer })
        .where(products: { refund_id: nil }) # <--- table name
        .where(customers: { name: 'Jane' })  # <--- table name
        .sum(&:total_price_incl_gst)
end
Charge Load (7.2ms) SELECT "charges".* FROM "charges" INNER JOIN "products" ON "products"."id" = "charges"."product_id" INNER JOIN "invoices" ON "invoices"."id" = "products"."invoice_id" INNER JOIN "customers" ON "customers"."id" = "invoices"."customer_id" WHERE "charges"."discount" = ? AND "products"."refund_id" IS NULL AND "customers"."name" = ? [["discount", 1], ["name", "Jane"]]
Memory Usage: 1.42 MB
GC Runs: 0
Objects Created: 15,449
Comparison:
query: 156.2 i/s
iterate: 1.8 i/s - 85.71x slower
This time we have made significant savings. It should be possible to get the database to sum the fields too; I'll leave that as an exercise for the reader. I mentioned that the original code was spread across multiple functions. There's nothing stopping you doing that here either. You just have to pass ActiveRecord::Relations around instead of arrays. There's an example of that below.
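If you want a head start on that exercise: passing sum a column name rather than a block should push the aggregation into the database, so the Charge records are never loaded at all.

Charge.where(discount: true)
      .joins(product: { invoice: :customer })
      .where(products: { refund_id: nil })
      .where(customers: { name: 'Jane' })
      .sum(:total_price_incl_gst)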
Here are a couple more examples to help you spot patterns.
Filters are a common place where these sorts of improvements can be made. Once again, here's some real production code. Names have been changed.
scope = customer.invoices
                .order(processed_at: :desc)
                .filter { |invoice| invoice.processed_at.present? }

selected = scope
           .filter { |invoice| invoice.processed_at >= 5.days.ago }
           .filter { |invoice| invoice.status == params[:filter][:status] }

selected.map { |invoice| Invoice.find_by(id: invoice.id) }
In this case all the records will be loaded by the first filter, so the next two operate on records in memory. Then we load them all again on the last line.
[Lots of queries]
Memory Usage: 0.13 MB
GC Runs: 0
Objects Created: 8,367
And the refactored code:
Invoice.where(customer: customer)
       .order(processed_at: :desc)
       .where.not(processed_at: nil)
       .where(processed_at: 5.days.ago..)
       .where(status: params[:filter][:status])
Invoice Load (0.5ms) SELECT "invoices".* FROM "invoices" WHERE "invoices"."customer_id" = ? AND "invoices"."processed_at" IS NOT NULL AND "invoices"."processed_at" >= ? AND "invoices"."status" = ? ORDER BY "invoices"."processed_at" [["customer_id", 1], ["processed_at", "2025-02-22 23:38:49.125166"], ["status", "pending"]]
Memory Usage: 0.0 MB
GC Runs: 0
Objects Created: 164
Comparison:
scopes: 3355.8 i/s
filter: 156.5 i/s - 21.45x slower
Memory consumption isn't that different on this data set, but the improved version is much faster.
Often we inadvertently pass arrays around, as well as loading too many records. The new_invoices method returns a potentially large array.
def new_invoices
  invoices.select { |invoice| invoice.status == 'processed' }
end

def recalculate_estimated_invoices(reason)
  new_invoices.each do |invoice|
    invoice.recalculate!(reason) if invoice.estimated?
  end
end
select and each
Invoice Load (15.6ms) SELECT "invoices".* FROM "invoices"
Memory Usage: 1.86 MB
GC Runs: 0
Objects Created: 8,523
This can be fixed in a couple of ways. We could write it as a single expression:
Invoice.where(status: 'processed', estimated: true).each do |invoice|
  invoice.recalculate!(reason)
end
Or, if we want to keep the methods, we can pass ActiveRecord::Relations instead of arrays. Remember, ActiveRecord::Relations aren't evaluated until they are accessed, so they can be passed around with (almost) zero cost.
def new_invoices
  invoices.where(status: 'processed')
end

def recalculate_estimated_invoices(reason)
  new_invoices.where(estimated: true).each do |invoice|
    invoice.recalculate!(reason)
  end
end
single query
Invoice Load (3.6ms) SELECT "invoices".* FROM "invoices" WHERE "invoices"."status" = ? AND "invoices"."estimated" = ? [["status", "processed"], ["estimated", 1]]
Memory Usage: 0.0 MB
GC Runs: 0
Objects Created: 1,016
Even better is to write scopes, but now we're getting a bit off topic:
def recalculate_estimated_invoices(reason)
  invoices.processed.estimated.each do |invoice|
    invoice.recalculate!(reason)
  end
end
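For that last version to work, Invoice needs matching scopes. A minimal sketch, assuming the status and estimated columns from the earlier examples:

class Invoice < ActiveRecord::Base
  scope :processed, -> { where(status: 'processed') }
  scope :estimated, -> { where(estimated: true) }
end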
N+1
Some of you are probably shouting at the screen, "What about N+1 and includes!?" I haven't forgotten. I have a separate post about that coming soon.
Stuff to think about when deciding to make these changes
How many objects do you expect?
If you know that the collection will always be small, then there is no harm in grabbing the entire set. PostgreSQL (and probably other databases) takes the same approach: if your dataset is small, PostgreSQL will often ignore an index because it's quicker to just scan the table.
TestRecord.limit(10).length
TestRecord Load (0.3ms) SELECT "test_records".* FROM "test_records" LIMIT ? [["LIMIT", 10]]
Memory Usage: 22.63 MB
GC Runs: 0
Objects Created: 216
Just make sure you consider the future possibility that your app goes viral!
Has the data already been loaded?
As we saw in the discussion about size and blank?, if the data has already been loaded then it might be quicker to use Ruby to process it; otherwise you might trigger a second, unnecessary trip to the database. It's worth tracking backwards to find out why the data has been loaded. Can you defer the load? Should you memoize the collection to make sure it doesn't get reloaded?
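As a sketch of what that can look like (the helper methods here are hypothetical): ActiveRecord's loaded? tells you whether an association is already in memory, and memoization stops a collection being rebuilt on every call.

# Filter in Ruby only if the data is already in memory,
# otherwise let the database do the filtering.
def processed_invoices
  if invoices.loaded?
    invoices.select { |invoice| invoice.status == 'processed' }
  else
    invoices.where(status: 'processed')
  end
end

# Memoize so repeated calls don't reload the collection.
def recent_invoices
  @recent_invoices ||= invoices.where(processed_at: 5.days.ago..).to_a
end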
Conclusion
Hopefully you got this far and learnt something along the way. As I said in the introduction, I think the goal is to make this way of coding your default so you don't have to think about it. You might get some raised eyebrows at code review time because even now folks still code the Rails Way™. But if you show them the stats there's really no argument. And, with the exception of the nested loops and queries, these problems are easy to find in your code: a simple grep will find uses of enumerable methods. Nested loops and queries can be harder to track down, especially if you're passing arrays around without realising it. That's when you want to start using more dedicated performance techniques like profilers and application performance monitoring (APM) tools such as Sentry.
Don't just look in your application code for these improvements. Tests are a place where people take a lot of shortcuts, and your tests are probably run a lot more than your production code. You might save a lot of developer time by speeding up your tests. On the other hand, your tests probably aren't dealing with millions of rows, so YMMV.
When this way of coding becomes second nature you will start from a position of strength, and you will save a lot of time in the future, both for you and your customers.
Code
Here is the code I used to generate the results in this post.
First, the script to generate the database of sample data. I asked Claude to write this for me; it's the sort of thing LLMs are quite good at. Using SQLite makes the pure SQL versions look a bit better because there's no network latency, but not by much.
# frozen_string_literal: true
require 'sqlite3'
# Create a new SQLite database (or open if it exists)
db = SQLite3::Database.new "test.db"
# Drop the table if it exists
db.execute("DROP TABLE IF EXISTS test_records;")
db.execute("DROP TABLE IF EXISTS colours;")
# Create the table
db.execute(<<~SQL)
  CREATE TABLE test_records (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    data INTEGER,
    colour_id INTEGER
  );
SQL
# Create the colours table
db.execute(<<~SQL)
  CREATE TABLE IF NOT EXISTS colours (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT,
    code TEXT
  );
SQL
# Use a transaction to speed up insertion
db.execute("BEGIN TRANSACTION;")
COLOURS = {
  'red' => '#FF0000',
  'orange' => '#FFA500',
  'yellow' => '#FFFF00',
  'green' => '#008000',
  'blue' => '#0000FF',
  'indigo' => '#4B0082',
  'violet' => '#8F00FF'
}.freeze
colour_ids = COLOURS.map do |name, hex|
  db.execute("INSERT INTO colours (name, code) VALUES (?, ?);", [name, hex])
  db.get_first_value("SELECT last_insert_rowid()")
end
1_000_000.times do
  db.execute("INSERT INTO test_records (data, colour_id) VALUES (?, ?);", [rand(1..1_000_000), colour_ids.sample])
end
# Commit the transaction
db.execute("COMMIT;")
db.close
puts "Database and table created, 1,000,000 rows inserted."
I had a second script to generate the data for the nested loop and query tests. That's just more of the same, only bigger.
This is the code used to measure memory consumption. It isn't a rigorous way of gathering performance metrics. To do it properly you should run the tests in a loop multiple times and take averages (see the benchmarking section below), and you should be more careful about how you record memory consumption. But for this post we're not measuring per se; we're just comparing different methods, and we expect the differences to be orders of magnitude, not a few percentage points, so this code is fine. I did run each test multiple times to get the stats used in the post. I picked stats that were roughly at the mean, and the object counts were verified using a more rigorous method¹. You can use this to test any Ruby code. If you are testing Rails you'll need to define your models and the database connection somewhere.
Update 14 Mar 2025: The original code had an object counting bug. This is the fixed version. I also swapped GC.start for ObjectSpace.garbage_collect; it seems to give more stable numbers.
# frozen_string_literal: true
require 'active_record'
require 'benchmark/ips'
ActiveRecord::Base.logger = Logger.new($stdout)
EXCLUDED_TYPES = %i[FREE TOTAL T_IMEMO].freeze
def format_number_with_commas(number)
  number.to_s.gsub(/(\d)(?=(\d{3})+(?!\d))/, '\1,')
end

def memory_usage
  `ps -o rss= -p #{Process.pid}`.strip.to_i # Get memory usage in KB
end

def count_objects(message = nil)
  puts message if message

  # Run GC multiple times to ensure a clean baseline
  3.times { ObjectSpace.garbage_collect }
  sleep(1)

  # Capture baseline metrics
  start_objects = ObjectSpace.count_objects
  start_memory = memory_usage
  start_gc_count = GC.count
  start_time = Process.clock_gettime(Process::CLOCK_MONOTONIC)

  result = yield

  # Measure after execution
  end_time = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end_objects = ObjectSpace.count_objects
  end_memory = memory_usage

  # Calculate differences
  stats = {
    wall_time: end_time - start_time,
    memory_diff_mb: (end_memory - start_memory).to_f / 1024,
    gc_runs: GC.count - start_gc_count
  }

  object_diffs = end_objects.to_h do |key, count|
    [key, count - (start_objects[key] || 0)]
  end
  p object_diffs

  stats[:total_objects_created] = object_diffs
                                  .except(*EXCLUDED_TYPES)
                                  .sum { |_, diff| [diff, 0].max }

  print_stats(stats)
  result
end

def print_stats(stats)
  puts <<~STATS
    Time: #{stats[:wall_time].round(6)} seconds
    Memory Usage: #{stats[:memory_diff_mb].round(2)} MB
    GC Runs: #{stats[:gc_runs]}
    Objects Created: #{format_number_with_commas(stats[:total_objects_created])}
  STATS
end
This is a handy function to have lying around. Feel free to add it to your tool bag. You use it like this:
count_objects "map(&:id)" do
TestRecord.where(colour_id: red_id).map(&:id)
end
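If you're testing Rails code like the snippets in this post, you'll also need the database connection and model definitions mentioned above. A minimal sketch against the generated test.db:

require 'active_record'

ActiveRecord::Base.establish_connection(
  adapter: 'sqlite3',
  database: 'test.db'
)

class Colour < ActiveRecord::Base; end

class TestRecord < ActiveRecord::Base
  belongs_to :colour
end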
The count_objects function does record some timing information, but it's better to use a proper tool for that, like the excellent benchmark-ips gem.
Benchmark.ips do |x|
  x.report('map') { TestRecord.where(colour_id: red_id).map(&:id) }
  x.report('pluck') { TestRecord.where(colour_id: red_id).pluck(:id) }
  x.report('ids') { TestRecord.where(colour_id: red_id).ids }
  x.report('select') { TestRecord.where(colour_id: red_id).select(:id).load }
  x.compare!
end
ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +PRISM [arm64-darwin24]
Warming up --------------------------------------
map 1.000 i/100ms
pluck 1.000 i/100ms
ids 1.000 i/100ms
select 1.000 i/100ms
Calculating -------------------------------------
map 2.868 (± 0.0%) i/s (348.66 ms/i) - 15.000 in 5.244710s
pluck 15.291 (± 6.5%) i/s (65.40 ms/i) - 77.000 in 5.051319s
ids 15.689 (± 6.4%) i/s (63.74 ms/i) - 79.000 in 5.043278s
select 3.534 (± 0.0%) i/s (283.00 ms/i) - 18.000 in 5.186966s
Comparison:
ids: 15.7 i/s
pluck: 15.3 i/s - same-ish: difference falls within error
select: 3.5 i/s - 4.44x slower
map: 2.9 i/s - 5.47x slower
This gem runs the benchmark in a couple of phases. It "warms up" the code and gets a rough measurement of speed. Then it runs the code for a fixed period of time to get the timings. Finally it spits out the comparison, which is the part I included in the main post. It's a solid benchmarking technique. The gem has a heap more functionality, so if you're interested in benchmarking Ruby have a read of the docs.
¹ I verified the object counts and memory consumption with the MemoryProfiler gem. ↩
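Basic usage of MemoryProfiler looks like this (a sketch, wrapping one of the queries from this post):

require 'memory_profiler'

report = MemoryProfiler.report do
  TestRecord.where(colour_id: red_id).map(&:id)
end
report.pretty_print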