benchmark | Technical Debt

[notice]
If you use this sort of technique, you may want to read on.

node = nodes.first rescue return

[/notice]

[important]

Nov 2012 Update:

Though this post was about the performance cost of using a ‘rescue’ statement, there is a more insidious problem with the overall impact of such syntax. The pros and cons of using an inline rescue are well laid out in Avdi’s free RubyTapas episode: Inline Rescue.

[/important]
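
To make that concern a little more concrete, here is a small sketch of my own (it is not taken from the RubyTapas episode). The inline rescue modifier catches any StandardError, so it hides a genuine bug just as readily as the nil case it was meant for. The FlakyNodes class and the first_or_nil helper are hypothetical names invented for this illustration.

```ruby
# Hypothetical collection whose #first contains a real bug.
class FlakyNodes
  def first
    raise ArgumentError, 'bug inside #first'
  end
end

# Hypothetical helper using an inline rescue in the style discussed in this post.
def first_or_nil(nodes)
  nodes.first rescue nil
end

p first_or_nil([1, 2, 3])       # => 1   (normal case)
p first_or_nil(nil)             # => nil (NoMethodError rescued, as intended)
p first_or_nil(FlakyNodes.new)  # => nil (the ArgumentError is silently swallowed too)
```

The third call is the troubling one: the caller cannot tell a missing collection from a broken one.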

Code like this:

if nodes.nil?
  return
else
  node = nodes.first
end

Can be written in a seemingly more elegant way with this Ruby trick:

node = nodes.first rescue return
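
For readers who have not met the modifier form, the sketch below spells out roughly what the one-liner relies on: calling first on nil raises NoMethodError, which is a StandardError, and a rescue without an explicit class catches StandardError. The method name first_longhand is an assumption of mine, and it returns nil explicitly where the one-liner uses a bare return.

```ruby
# Long-hand approximation of `node = nodes.first rescue return`.
# first_longhand is a hypothetical name used only for this sketch.
def first_longhand(nodes)
  nodes.first
rescue StandardError # same class a bare `rescue` (or the modifier form) catches
  nil
end

# NoMethodError, raised by nil.first, descends from StandardError.
puts NoMethodError.ancestors.include?(StandardError) # => true

p first_longhand([1, 2, 3]) # => 1
p first_longhand(nil)       # => nil
```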

But that got me thinking… In many languages I have used in the past (e.g., Java and C++), exception handling is an expensive endeavor.

So, though the rescue solution works, I decided to explore the pros and cons of allowing a “rescue” to act as logic.

Here are the two methods I benchmarked, one with “if” logic, and one with “rescue” logic:

def without_rescue(nodes)
  return nil if nodes.nil?
  node = nodes.first
end

def with_rescue(nodes)
  node = nodes.first rescue return
end

Using method_1 (below) with nodes always nil, I got the following results looping 1 million times:

                  user     system      total        real
W/out rescue  0.520000   0.010000   0.530000 (  0.551359)
With rescue  22.490000   0.940000  23.430000 ( 26.487543)

Yikes. Obviously, rescue is an expensive choice by comparison!

But if the code runs just once, or maybe 10 times, the difference is imperceptible.

Conclusion #1 (Normal Usage)

It doesn’t matter which method you choose to use if the logic is invoked infrequently.
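
To put "imperceptible" into numbers, this back-of-the-envelope sketch divides the "real" times from the 1,000,000-iteration run above by the iteration count. The constant and method names are mine; only the timing figures come from the results in this post.

```ruby
# Figures copied from the "real" column of the worst-case run (nodes always nil).
ITERATIONS        = 1_000_000
REAL_WITHOUT_SECS = 0.551359
REAL_WITH_SECS    = 26.487543

# Average cost of a single call, in microseconds.
def micros_per_call(total_secs, iterations)
  total_secs / iterations * 1_000_000
end

without_us = micros_per_call(REAL_WITHOUT_SECS, ITERATIONS)
with_us    = micros_per_call(REAL_WITH_SECS, ITERATIONS)

puts format('without rescue: %.2f microseconds per call', without_us)
puts format('with rescue:    %.2f microseconds per call', with_us)
puts format('ten rescued calls: %.3f milliseconds', with_us * 10 / 1000)
```

On those figures a rescued call costs on the order of tens of microseconds, so a handful of calls adds well under a millisecond.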

Looking a bit Deeper

But being a curious engineer at heart, I wanted to know more… The above results are the worst case, where nodes is always nil. If nodes is never nil, the rescue clause is never invoked, which yields this (rather obvious) timing where the rescue technique (with less code) is faster:

                  user     system      total        real
W/out rescue  0.590000   0.000000   0.590000 (  0.601803)
With rescue   0.460000   0.000000   0.460000 (  0.461810)

However, what if nodes were only nil some percentage of the time? What does the shape of the performance curve look like? Linear? Exponential? Geometric progression? Well, it turns out that the response (see method_2, below) is linear (R² = 0.99668):

Figure: Rescue Logic is Expensive
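
As a rough check on that linearity, the sketch below runs an ordinary least-squares fit over the "With rescue" real times from the method_2 sample output at the end of this post. The helper name linear_fit is my own, and since I am assuming which column and which points to fit, the R² it prints need not match the value quoted above.

```ruby
# "real" seconds for the "With rescue" rows of the method_2 output,
# keyed by the percentage label of each row.
WITH_RESCUE_REAL = {
  10 => 3.825074, 20 => 6.158173, 30 => 8.827783, 40 => 11.028588,
  50 => 14.080979, 60 => 15.838733, 70 => 18.229777, 80 => 21.634664,
  90 => 25.112102, 100 => 25.327054
}.freeze

# Ordinary least-squares fit of y = slope * x + intercept, plus R squared.
# points is an array of [x, y] pairs.
def linear_fit(points)
  n      = points.size.to_f
  mean_x = points.sum { |x, _| x } / n
  mean_y = points.sum { |_, y| y } / n
  sxx    = points.sum { |x, _| (x - mean_x)**2 }
  sxy    = points.sum { |x, y| (x - mean_x) * (y - mean_y) }
  syy    = points.sum { |_, y| (y - mean_y)**2 }
  slope  = sxy / sxx
  { slope: slope, intercept: mean_y - slope * mean_x, r_squared: sxy**2 / (sxx * syy) }
end

fit = linear_fit(WITH_RESCUE_REAL.to_a)
puts format('slope:     %.4f s per percentage point', fit[:slope])
puts format('intercept: %.4f s', fit[:intercept])
puts format('R squared: %.5f', fit[:r_squared])
```

The slope is the useful number here: it estimates how many seconds each extra percentage point of nil inputs adds to a million calls.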

Conclusion #2 (Large Data Set):

In this example of over a million calls, the decision on whether you should use “rescue” as logic boils down to this:

If the condition is truly rare (like a real exception), then you can use rescue.
If the condition is going to occur 5% of the time or more, then do not use the rescue technique!

In general, it would seem that there is considerable cost to using rescue as pseudo logic over large data sets. Caveat emptor!
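
If the nil case is common, there are ways to stay on one line without raising anything. The sketch below shows three; the method names are hypothetical. Note that the safe navigation operator needs Ruby 2.3 or later, which is newer than this post, and I have not benchmarked any of these here.

```ruby
# Explicit guard clause, as in without_rescue.
def first_with_guard(nodes)
  return nil if nodes.nil?
  nodes.first
end

# Safe navigation operator (Ruby 2.3+): skips the call when nodes is nil.
def first_with_safe_navigation(nodes)
  nodes&.first
end

# Kernel#Array turns nil into [], and [].first is nil.
def first_with_array_coercion(nodes)
  Array(nodes).first
end

[nil, [1, 2, 3]].each do |nodes|
  p [first_with_guard(nodes),
     first_with_safe_navigation(nodes),
     first_with_array_coercion(nodes)]
end
```

Unlike the inline rescue, none of these hides an unrelated exception raised inside first.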

Sample Code:

My benchmarking code looked like this:

require 'benchmark'

include Benchmark

def without_rescue(nodes)
  return nil if nodes.nil?
  node = nodes.first
end

def with_rescue(nodes)
  node = nodes.first rescue return
end

TEST_COUNT = 1_000_000

def method_1
  [nil, [1,2,3]].each do |nodes|
    puts "nodes = #{nodes.inspect}"
    GC.start
    bm(12) do |test|
      test.report("W/out rescue") do
        TEST_COUNT.times do |n|
          without_rescue(nodes)
        end
      end
      test.report("With rescue") do
        TEST_COUNT.times do |n|
          with_rescue(nodes)
        end
      end
    end
  end
end

def method_2
  GC.start
  bm(18) do |test|
    nil_nodes = nil
    real_nodes = [1, 2, 3]
    likely_pct = 0
    # Step the share of nil inputs from 10% up to 100%.
    10.times do
      likely_pct += 10
      test.report("#{likely_pct}% W/out rescue") do
        TEST_COUNT.times do
          # rand(100) returns 0..99, so >= makes nodes nil likely_pct% of the time
          # (a plain > would make it likely_pct + 1 percent).
          nodes = rand(100) >= likely_pct ? real_nodes : nil_nodes
          without_rescue(nodes)
        end
      end
      test.report("#{likely_pct}% With rescue") do
        TEST_COUNT.times do
          nodes = rand(100) >= likely_pct ? real_nodes : nil_nodes
          with_rescue(nodes)
        end
      end
    end
  end
end

method_1
method_2
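
One thing method_2 does is call rand inside the timed loop, so both reports include the cost of generating the random number. The variant below is a sketch of how that could be separated out, not what I ran for this post: it builds the inputs first with a seeded generator and then times only the calls. The names plain_first, rescued_first, build_inputs and SAMPLE_COUNT are assumptions made for the sketch, and rescued_first uses `rescue nil` as a stand-in for the `rescue return` form.

```ruby
require 'benchmark'

# Stand-ins for the post's without_rescue / with_rescue methods.
def plain_first(nodes)
  return nil if nodes.nil?
  nodes.first
end

def rescued_first(nodes)
  nodes.first rescue nil
end

# Build every input before timing so rand stays out of the measured loops.
# A seeded Random gives the same sequence of inputs on every run.
def build_inputs(count, nil_pct, seed)
  rng = Random.new(seed)
  Array.new(count) { rng.rand(100) < nil_pct ? nil : [1, 2, 3] }
end

SAMPLE_COUNT = 100_000 # smaller than the post's 1,000,000 to keep the sketch quick
inputs = build_inputs(SAMPLE_COUNT, 10, 42)

plain_secs   = Benchmark.realtime { inputs.each { |nodes| plain_first(nodes) } }
rescued_secs = Benchmark.realtime { inputs.each { |nodes| rescued_first(nodes) } }

puts format('10%% nil, plain:   %.4f s', plain_secs)
puts format('10%% nil, rescued: %.4f s', rescued_secs)
```

The sample output that follows comes from the original listing, not from this variant.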

Sample Output

nodes = nil
                  user     system      total        real
W/out rescue  0.520000   0.010000   0.530000 (  0.551359)
With rescue  22.490000   0.940000  23.430000 ( 26.487543)
nodes = [1, 2, 3]
                  user     system      total        real
W/out rescue  0.590000   0.000000   0.590000 (  0.601803)
With rescue   0.460000   0.000000   0.460000 (  0.461810)
                        user     system      total        real
10% W/out rescue    1.020000   0.000000   1.020000 (  1.087103)
10% With rescue     3.320000   0.120000   3.440000 (  3.825074)
20% W/out rescue    1.020000   0.000000   1.020000 (  1.036359)
20% With rescue     5.550000   0.200000   5.750000 (  6.158173)
30% W/out rescue    1.020000   0.010000   1.030000 (  1.105184)
30% With rescue     7.800000   0.300000   8.100000 (  8.827783)
40% W/out rescue    1.030000   0.010000   1.040000 (  1.090960)
40% With rescue    10.020000   0.400000  10.420000 ( 11.028588)
50% W/out rescue    1.020000   0.000000   1.020000 (  1.138765)
50% With rescue    12.210000   0.510000  12.720000 ( 14.080979)
60% W/out rescue    1.020000   0.000000   1.020000 (  1.051054)
60% With rescue    14.260000   0.590000  14.850000 ( 15.838733)
70% W/out rescue    1.020000   0.000000   1.020000 (  1.066648)
70% With rescue    16.510000   0.690000  17.200000 ( 18.229777)
80% W/out rescue    0.990000   0.010000   1.000000 (  1.099977)
80% With rescue    18.830000   0.800000  19.630000 ( 21.634664)
90% W/out rescue    0.980000   0.000000   0.980000 (  1.325569)
90% With rescue    21.150000   0.910000  22.060000 ( 25.112102)
100% W/out rescue   0.950000   0.000000   0.950000 (  0.963324)
100% With rescue   22.830000   0.940000  23.770000 ( 25.327054)

Technical Debt

Jon Kern's ramblings on software development

The Cost of Using Ruby’s Rescue as Logic