Benchmarking Ruby on Rails End-to-End Page Performance

There’s a lot of information available about ways to improve performance in Ruby on Rails applications. I’ve even written about it myself. But one thing that’s often skipped over is exactly how you are supposed to benchmark your performance improvements, particularly if you want to do a full end-to-end speed comparison.

Benchmarking vs Profiling

A quick refresher: to profile code (in performance terms) is to figure out which parts of the code are causing a slowdown. This is how you work out which part of your code needs attention: is it the view, application, or database layer? The rack-mini-profiler gem is a nice way to add some of this to a Rails application, and you can configure it to produce flamegraphs as well for more detailed analysis.

Benchmarking, on the other hand, enables you to compare two different versions of the code to see what, if any, difference has been made. Imagine laying a piece of lumber on your benchtop, taking a pen, and drawing a line on the bench at the end of the lumber. This line then becomes your “bench mark”, against which you can compare other pieces of lumber to see if they are longer or shorter. The shape of the lumber (its “profile”, if you will) does not matter during the comparison.

Benchmarking requirements

When benchmarking code I have some requirements that I try to achieve. These are not hard-and-fast rules but rather my “ideal” scenario for benchmarking:

1. It should be fast to execute, since we’ll be running it often
2. It should require no (or minimal) setup
3. It should be consistent
4. Results should be easy to understand without much analysis

Benchmark-ips

Using the benchmark-ips gem takes care of some of these for us. It handles the warmup to prep any OS or language caching, runs for a set amount of time, and prints out results in an easy “A is X times slower than B” format. A simple example of its usage:

require 'benchmark/ips'

@a = "abc"
@b = "def"
Benchmark.ips do |x|
  x.report("Concatenation") { @a + @b }
  x.report("Interpolation") { "#{@a}#{@b}" }
  x.compare!
end

Warming up --------------------------------------
       Concatenation   316.022k i/100ms
       Interpolation   282.422k i/100ms
Calculating -------------------------------------
       Concatenation     10.353M (± 7.4%) i/s -     51.512M in   5.016567s
       Interpolation      6.615M (± 6.8%) i/s -     33.043M in   5.023636s

Comparison:
       Concatenation: 10112435.3 i/s
       Interpolation:  6721867.3 i/s - 1.50x  slower

This ticks off requirements 3 and 4 (consistency and easy-to-understand results). The warmup and runtime give us fairly consistent answers (the gem will even handle ignoring differences that are within the margin of error), and the results are easy to understand.
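To make the warmup-then-measure idea concrete, here is a rough sketch of that kind of harness in plain Ruby. It is not how benchmark-ips is actually implemented; measure_ips is a hypothetical helper, and the short warmup and run times are assumptions chosen only so the example finishes quickly.

```ruby
# Rough illustration of a warmup + timed-run harness.
# NOT benchmark-ips's implementation; `measure_ips` is a hypothetical helper.
def measure_ips(warmup: 0.1, time: 0.3)
  clock = -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) }

  # Warmup: run the block without recording anything so caches are primed.
  deadline = clock.call + warmup
  yield while clock.call < deadline

  # Timed run: count how many iterations complete within the time budget.
  iterations = 0
  started = clock.call
  deadline = started + time
  while clock.call < deadline
    yield
    iterations += 1
  end

  # Iterations per second, based on the time that actually elapsed.
  iterations / (clock.call - started)
end

a = "abc"
b = "def"
concat = measure_ips { a + b }
interp = measure_ips { "#{a}#{b}" }

slower, faster = [concat, interp].minmax
puts format("Concatenation: %.1f i/s", concat)
puts format("Interpolation: %.1f i/s", interp)
puts format("Difference: %.2fx", faster / slower)
```

The real gem does considerably more (error margins, statistically aware comparison), which is why I reach for it rather than rolling my own.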

Side-by-side method testing

As noted in the requirements, I want this to be fast and to execute with no setup (ideally). My standard approach is to duplicate the method I’m trying to improve, so I have an old un-touched method and the new-and-improved one side-by-side. This makes it easy to run both variations without having to change anything in the test. For example:

class User < ApplicationRecord
  # original method
  def name
    "#{first_name} #{last_name}"
  end

  # new method, optimistically prefixed 'fast'
  def fast_name
    "" # placeholder: the optimised implementation goes here
  end
end

Pro-tip: Replacing your known-or-suspected-slow method with a no-op (e.g. returning a fixed value) can give you an absolute best-case performance scenario. This can help confirm that the method you’re looking at is definitely the one causing the performance issue.

You can then run a simple comparison with benchmark-ips:

# intentionally done outside the block so it's not part of the test
@user = User.first

Benchmark.ips do |x|
  x.report("Old") { @user.name }
  x.report("New") { @user.fast_name }
  x.compare!
end

For models or isolated application code, I would often just run this via the Rails console to compare the two methods. If other methods are calling your method-under-test and you need to redirect them, you can just turn the original method into a shim:

class User < ApplicationRecord
  def name
    if $use_new_method
      fast_name
    else
      old_name
    end
  end

  def old_name
    ... # contents of original name method
  end
end

In simple cases like this, we can just use a global $use_new_method and set it inside the benchmark block before calling the method. To save myself even more work I will often create a throwaway self.benchmark method inside the object I’m testing that has the Benchmark.ips setup in it. This way I just need to run reload! and Thing.benchmark in the Rails console to test my latest change.
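As a minimal sketch of how the shim, the global toggle, and a throwaway benchmark method might fit together: the User class here is a plain-Ruby stand-in rather than an ActiveRecord model, the memoised fast_name is a made-up "improvement", and the standard library's Benchmark.realtime stands in for Benchmark.ips so the example runs without extra gems.

```ruby
require 'benchmark'

# Plain-Ruby stand-in for the ActiveRecord model so the sketch runs anywhere.
class User
  attr_reader :first_name, :last_name

  def initialize(first_name, last_name)
    @first_name = first_name
    @last_name = last_name
  end

  # Shim: callers keep using `name`; the global picks the implementation.
  def name
    $use_new_method ? fast_name : old_name
  end

  def old_name
    "#{first_name} #{last_name}"
  end

  # Hypothetical "improved" version, purely for illustration.
  def fast_name
    @fast_name ||= "#{first_name} #{last_name}"
  end

  # Throwaway helper: times both code paths through the shim.
  def self.benchmark(iterations: 100_000)
    user = new("Ada", "Lovelace") # in Rails this might be User.first
    [false, true].to_h do |flag|
      $use_new_method = flag
      seconds = Benchmark.realtime { iterations.times { user.name } }
      [flag ? "New" : "Old", seconds]
    end
  ensure
    # Always leave the toggle in its default state.
    $use_new_method = false
  end
end

User.benchmark.each { |label, seconds| puts format("%s: %.4fs", label, seconds) }
```

Here the toggle is flipped just outside the timed block so that setting it is not part of the measurement. In a real app I would swap the body of self.benchmark for the Benchmark.ips setup shown earlier.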

Testing end-to-end page performance

The simplest way I can think of testing page performance is: run the Rails server, open the page in a browser, and record how long it takes. Of course, we don’t want to do this manually as it’s time-consuming and error-prone; we want to script it instead. Luckily we can use Capybara to do the dirty work for us:

require 'pry'
require 'benchmark/ips'
require 'webdrivers/chromedriver'
require 'capybara/dsl'
include Capybara::DSL

... # may need to set up Capybara driver here

Capybara.app_host = "http://localhost:3000"

Benchmark.ips do |x|
  x.time = 30

  x.report("Homepage") { visit("/") }
  x.report("Cached") { visit("/?cache=true") }

  x.compare!
end

This simple script loads the homepage repeatedly for 30 seconds, then does the same for another 30 seconds with a cache=true parameter (you could drop the time lower, say to 10s), and compares the results. This is where things start to get a bit hacky.

Remember, while the performance changes we make are production code, the setup we are creating to test it does not have to be. What do I mean? I mean we can add whatever throwaway code we need so that when the cache=true parameter exists in params, we call the ‘new and improved’ method. If we want to access this setting everywhere we can add a simple $use_cached = params[:cache].present? in the controller. If the page needs a logged-in user, we can add a current_user method that returns User.first, etc.
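A minimal sketch of that kind of throwaway toggle, written in plain Ruby so it runs outside Rails. BenchmarkHacks, apply_benchmark_flags and FakeController are hypothetical names; in a real app the module could be included in ApplicationController and called from a before_action, and the emptiness check stands in for ActiveSupport's present?.

```ruby
# Hypothetical throwaway module; in a Rails app it could be included in
# ApplicationController and called from a before_action.
module BenchmarkHacks
  # Sets the global toggle from the request parameters.
  def apply_benchmark_flags(params)
    # Plain-Ruby equivalent of params[:cache].present?
    $use_cached = !params[:cache].to_s.strip.empty?
  end
end

# Stand-in for a controller so the sketch runs outside Rails.
class FakeController
  include BenchmarkHacks
end

controller = FakeController.new

controller.apply_benchmark_flags({ cache: "true" })
puts $use_cached # prints true

controller.apply_benchmark_flags({})
puts $use_cached # prints false
```

None of this should survive past the benchmarking session; it exists only so the same page can exercise both code paths.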

External system tests

As a bonus of this approach, you can change Capybara.app_host to be whatever you want. So if you need to get more ‘real-world’ results you can push the code up to a temporary or staging server and point the benchmark at it instead (though you will be adding network latency to the mix at that point).

Conclusion

For most benchmarking work I’ve done, running in the console or via a throwaway rake task is enough to get the job done. Even with the end-to-end approach available, I still tend to use the console for its faster feedback cycle once I’ve zeroed in on a particular method that needs optimisation. If you have an overall target, however, like “I want the homepage to load within 100ms”, then it makes sense to have an overall benchmark so you can monitor your progress as you chip away at it method-by-method.
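Since benchmark-ips reports iterations per second while a target like this is expressed in milliseconds, a small conversion helps when checking progress. This is a minimal sketch: ms_per_iteration is a hypothetical helper and the i/s readings are made-up values for illustration, not measurements.

```ruby
# Hypothetical helper: converts iterations per second into
# milliseconds per iteration (here, per page load).
def ms_per_iteration(ips)
  1000.0 / ips
end

target_ms = 100 # the example target from the text

# Made-up i/s readings, for illustration only.
[4.0, 10.0, 25.0].each do |ips|
  ms = ms_per_iteration(ips)
  status = ms <= target_ms ? "within target" : "over target"
  puts format("%5.1f i/s = %6.1f ms per page load (%s)", ips, ms, status)
end
```

In other words, a 100ms target corresponds to at least 10 i/s in the benchmark output.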