There’s a lot of information available about ways to improve performance in Ruby on Rails applications. I’ve even written about it myself. But one thing that’s often skipped over is exactly how you are supposed to benchmark your performance improvements, particularly if you want to do a full end-to-end speed comparison.
Benchmarking vs Profiling
A quick refresher: to profile code (in performance terms) is to figure out which parts of the code are causing a slowdown. This is how you figure out which part of your code needs attention - is it the view, application, or database layer? The rack-mini-profiler gem is a nice way to add some of this to a Rails application, and you can configure it to produce flamegraphs as well for more detailed analysis.
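As a rough sketch, getting that set up is usually just a couple of gems in your Gemfile (current versions of rack-mini-profiler use the stackprof gem for flamegraphs; check the gem’s README if your setup differs):

# Gemfile
gem 'rack-mini-profiler'
gem 'stackprof' # used by rack-mini-profiler to generate flamegraphs

With those installed, appending ?pp=flamegraph to a URL in development renders a flamegraph for that request.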
Benchmarking, on the other hand, enables you to compare two different versions of the code to see what difference, if any, has been made. Imagine laying a piece of lumber on your benchtop, taking a pen, and drawing a line on the bench at the end of the lumber. This line then becomes your “bench mark”, against which you can compare other pieces of lumber to see if they are longer or shorter. The shape of the lumber (its “profile”, if you will) does not matter during the comparison.
Benchmarking requirements
When benchmarking code, I have some requirements that I try to meet. These are not hard-and-fast rules but rather my “ideal” scenario for benchmarking:
- It should be fast to execute, since we’ll be running it often
- It should require no (or minimal) setup
- It should be consistent
- Results should be easy to understand without much analysis
Benchmark-ips
Using the benchmark-ips gem takes care of some of these for us. It handles a warmup phase to prime any OS or language caches, runs for a set amount of time, and prints results in an easy “A is X times slower than B” format. A simple example of its usage:
require 'benchmark/ips'

@a = "abc"
@b = "def"

Benchmark.ips do |x|
  x.report("Concatenation") { @a + @b }
  x.report("Interpolation") { "#{@a}#{@b}" }
  x.compare!
end
Warming up --------------------------------------
       Concatenation   316.022k i/100ms
       Interpolation   282.422k i/100ms
Calculating -------------------------------------
       Concatenation     10.353M (± 7.4%) i/s -     51.512M in   5.016567s
       Interpolation      6.615M (± 6.8%) i/s -     33.043M in   5.023636s

Comparison:
       Concatenation: 10112435.3 i/s
       Interpolation:  6721867.3 i/s - 1.50x slower
This ticks off requirements 3 and 4. The warmup and fixed runtime give us fairly consistent answers (the gem will even point out when the difference between results is within the margin of error), and the output is easy to understand.
Side-by-side method testing
As noted in the requirements, I want this to be fast and to execute with no setup (ideally). My standard approach is to duplicate the method I’m trying to improve, so I have the old untouched method and the new-and-improved one side-by-side. This makes it easy to run both variations without having to change anything in the test. For example:
class User < ApplicationRecord
  # original method
  def name
    "#{first_name} #{last_name}"
  end

  # new method, optimistically prefixed 'fast'
  def fast_name
    ""
  end
end
Pro-tip: Replacing your known-or-suspected-slow method with a no-op (e.g. returning a fixed value, as above) can give you an absolute best-case performance scenario. This can help confirm that the method you’re looking at is definitely the one causing the performance issue.
You can then run a simple comparison with benchmark-ips:
# intentionally done outside the block so it's not part of the test
@user = User.first

Benchmark.ips do |x|
  x.report("Old") { @user.name }
  x.report("New") { @user.fast_name }
  x.compare!
end
For models or isolated application code, I would often just run this via the Rails console to compare the two methods. If other methods are calling your method-under-test and you need to redirect them, you can just turn the original method into a shim:
class User < ApplicationRecord
  def name
    if $use_new_method
      fast_name
    else
      old_name
    end
  end

  def old_name
    ... # contents of original name method
  end
end
In simple cases like this, we can just use a global $use_new_method and set it inside the benchmark block before calling the method. To save myself even more work I will often create a throwaway self.benchmark method inside the object I’m testing that has the Benchmark.ips setup in it. This way I just need to run reload! and Thing.benchmark in the Rails console to test my latest change.
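Here’s a minimal sketch of what that throwaway method might look like, reusing the User example from above (the $use_new_method global and method names are just the ones we set up earlier):

class User < ApplicationRecord
  # Throwaway harness: exercises both code paths via the shimmed #name method
  def self.benchmark
    require 'benchmark/ips'

    user = first # load the record up front so the query isn't part of the test

    Benchmark.ips do |x|
      x.report("Old") do
        $use_new_method = false
        user.name
      end

      x.report("New") do
        $use_new_method = true
        user.name
      end

      x.compare!
    end
  end
end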
Testing end-to-end page performance
The simplest way I can think of to test page performance is: run the Rails server, open the page in a browser, and record how long it takes. Of course, we don’t want to do this manually as it’s time-consuming and error-prone; instead, we want to script it. Luckily we can use Capybara to do the dirty work for us:
require 'pry'
require 'benchmark/ips'
require 'webdrivers/chromedriver'
require 'capybara/dsl'

include Capybara::DSL

... # may need to set up Capybara driver here

Capybara.app_host = "http://localhost:3000"

Benchmark.ips do |x|
  x.time = 30

  x.report("Homepage") { visit("/") }
  x.report("Cached") { visit("/?cache=true") }
  x.compare!
end
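If you don’t already have a driver configured, something like the following should work in place of that comment. This is one possible setup, assuming the selenium-webdriver gem and a local Chrome install:

# don't boot Capybara's own app instance - we're hitting an already-running server
Capybara.run_server = false
# headless Chrome driver that ships registered with modern Capybara
Capybara.default_driver = :selenium_chrome_headless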
This is a simple script that tries to load the homepage for 30 seconds, then tries to load it for another 30 seconds (you could drop that down lower, say 10s) with a cache=true parameter, and compares the results. This is where things start to get a bit hacky.
Remember, while the performance changes we make are production code, the setup we are creating to test them does not have to be. What do I mean? I mean we can add whatever throwaway code we need so that when the cache=true parameter exists in params, we call the ‘new and improved’ method. If we want to access this setting everywhere, we can add a simple $use_cached = params[:cache].present? in the controller. If the page needs a logged-in user, we can add a current_user method that returns User.first, etc.
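As a concrete (and entirely hypothetical) sketch, both of those hacks might land in the controller like this - PagesController and its home action are assumptions, not code from a real app:

class PagesController < ApplicationController
  # throwaway switch: ?cache=true turns on the new code path everywhere
  before_action { $use_cached = params[:cache].present? }

  def home
    # renders the homepage as usual
  end

  private

  # throwaway override so the benchmark doesn't have to log in
  def current_user
    @current_user ||= User.first
  end
end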
External system tests
As a bonus of this approach, you can change Capybara.app_host to be whatever you want. So if you need to get more ‘real-world’ results, you can push the code up to a temporary or staging server and point the benchmark at it instead (though you will be including network latency in the mix at that point).
Conclusion
For most benchmarking work I’ve done, running in the console or via a throwaway rake task is enough to get the job done. Even when I have an end-to-end benchmark set up, I would still tend to drop back to the console for the faster feedback cycle once I’ve zeroed in on a particular method that needs optimisation. If you have an overall target, however, like “I want the homepage to load within 100ms”, then it makes sense to have an overall benchmark so you can monitor your progress as you chip away at it method-by-method.
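If you go the rake-task route, a throwaway task only needs a few lines. Here’s a hypothetical one reusing the User example from earlier (the bench:name task name is made up):

# lib/tasks/benchmark.rake
namespace :bench do
  desc "Compare old and new User#name implementations"
  task name: :environment do
    require 'benchmark/ips'

    user = User.first

    Benchmark.ips do |x|
      x.report("Old") { user.name }
      x.report("New") { user.fast_name }
      x.compare!
    end
  end
end

Run it with bin/rails bench:name, and delete it once you’re done.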