Performance benchmarks

The project is getting more and more visible, so I feel it’s important that people have an accurate assessment of where we are. Most of us have seen these old benchmarks, so let’s see some more recent results.

I’ve only compared rubinius against MRI so far. These runs were done on my PowerBook G4 1.67GHz laptop. I’ll run them again soon on a big beefy server so we can all compare and contrast.


                Test                  MRI             Rubinius
    bm_app_answer.rb             4.727393             0.619218 
 bm_app_factorial.rb             1.703541             1.268409 
       bm_app_fib.rb            18.739502             5.309634 
bm_app_mandelbrot.rb            11.175115                Error 
 bm_app_pentomino.rb              Timeout              Timeout 
     bm_app_raise.rb             6.825073             5.744597 
 bm_app_strconcat.rb              4.46694             4.086749 
       bm_app_tak.rb            24.502033             7.291604 
     bm_app_tarai.rb            21.071624            11.594766 
    bm_loop_times.rb            16.028508              Timeout 
bm_loop_whileloop.rb            29.844976             4.767554 
bm_loop_whileloop2.rb            5.232881             1.103926 
  bm_so_ackermann.rb              Timeout             6.626395 
      bm_so_array.rb            18.276808              Timeout 
bm_so_concatenate.rb             5.597246              Timeout 
bm_so_count_words.rb             0.099502                Error 
  bm_so_exception.rb             9.262875             7.553756 
      bm_so_lists.rb             3.372947              Timeout 
     bm_so_matrix.rb             5.605184            23.462452 
bm_so_nested_loop.rb            17.162697            54.288111 
     bm_so_object.rb            19.170284            21.458258 
     bm_so_random.rb             7.168005            24.018174 
      bm_so_sieve.rb             4.870225            15.966473 
     bm_vm1_block.rb              Timeout              Timeout 
     bm_vm1_const.rb            47.509964             11.05456 
    bm_vm1_ensure.rb            45.855905             5.395634 
    bm_vm1_length.rb            49.408877            24.633311 
    bm_vm1_rescue.rb            35.682676             6.107035 
bm_vm1_simplereturn.rb          47.935916            15.641316 
      bm_vm1_swap.rb              Timeout             8.041324 
     bm_vm2_array.rb            18.284155            12.393373 
    bm_vm2_method.rb             34.70641            16.282999 
bm_vm2_poly_method.rb            45.63907            23.423347 
bm_vm2_poly_method_ov.rb        11.462504             6.403511 
      bm_vm2_proc.rb            22.432265            27.886953 
    bm_vm2_regexp.rb            11.580949              Timeout 
      bm_vm2_send.rb            11.223254            21.448888 
     bm_vm2_super.rb            12.599854             4.998904 
     bm_vm2_unif1.rb            10.726272             3.185878 
    bm_vm2_zsuper.rb            14.236729             4.948068 
bm_vm3_thread_create_join.rb     0.086081              Timeout 
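
For anyone who wants a Rubinius/MRI ratio next to each row, here is a rough sketch of how it could be derived. It assumes the figures are seconds (smaller is better), as confirmed in the comments below. The helper names (parse_seconds, ratio, format_rows) are made up for illustration and are not part of the benchmark suite; the three sample rows are copied from the table above.

```ruby
# Illustrative only: helper names and output layout are assumptions,
# not part of the actual benchmark runner.

# Parse a result cell: seconds as a Float, or nil for "Timeout"/"Error".
def parse_seconds(cell)
  Float(cell)
rescue ArgumentError
  nil
end

# Rubinius/MRI ratio, or nil when either run did not finish.
# A ratio below 1.0 means rubinius was faster.
def ratio(mri_cell, rbx_cell)
  mri = parse_seconds(mri_cell)
  rbx = parse_seconds(rbx_cell)
  return nil if mri.nil? || rbx.nil?
  rbx / mri
end

# Sample rows copied from the table: [test, MRI, Rubinius].
ROWS = [
  ["bm_app_fib.rb",    "18.739502", "5.309634"],
  ["bm_so_matrix.rb",  "5.605184",  "23.462452"],
  ["bm_loop_times.rb", "16.028508", "Timeout"],
]

# Build one formatted line per row, with "n/a" where no ratio exists.
def format_rows(rows)
  rows.map do |name, mri, rbx|
    r = ratio(mri, rbx)
    format("%-24s %12s %12s %8s", name, mri, rbx, r ? format("%.2f", r) : "n/a")
  end
end

puts format_rows(ROWS)
```

Rows where either implementation timed out or errored get no ratio, since there is no meaningful number to divide.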


There are a few trends in the data I’d like to point out.

  • Of the tests that rubinius finished without an error or timeout, it was faster in 24 of 31 (counting the 2 where MRI timed out and rubinius did not). Wow, first off, that’s a huge improvement over the previous run.
  • The slowest section was bm_so. Rubinius was only faster in 2 of 11, and actually errored or timed out on 4. If you look at those benchmarks, you’ll see that they are basically tests of a few core methods, mainly things like String#<<. So it makes sense that at this stage, we’re slower on those. We haven’t yet tuned those at all.
  • One big trend is that tests that only stressed the VM architecture came out WAY faster. Two examples are bm_vm1_swap and bm_vm1_simplereturn. The first swaps two local variables using a, b = b, a a few million times. This is a good example of where the bytecode VM is much faster than the tree walker in MRI. Next, bm_vm1_simplereturn shows off rubinius’s ability to create a method context quickly and return to the sender quickly. I’m thrilled about this number because even though rubinius MethodContexts are first class, method calls are still 3 times faster with no loss of programming power.
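
To give a feel for what the two VM tests mentioned above exercise, here is a minimal sketch in the same spirit. These are not the actual bm_vm1_swap.rb or bm_vm1_simplereturn.rb sources; the iteration count, the method names and the time_it helper are all assumptions made for illustration.

```ruby
# Illustrative stand-ins for the two VM micro-benchmarks; the real
# benchmark files may differ in loop shape and iteration count.
ITERATIONS = 1_000_000

# Swap two locals with parallel assignment on every pass.
def swap_loop(n)
  a = 1
  b = 2
  i = 0
  while i < n
    a, b = b, a
    i += 1
  end
  [a, b]
end

# A method that does nothing but return, to stress call/return overhead.
def simple_return
  1
end

# Call simple_return n times and report how many calls were made.
def simple_return_loop(n)
  i = 0
  while i < n
    simple_return
    i += 1
  end
  i
end

# Hypothetical timing helper: elapsed wall-clock seconds for the block.
def time_it
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

swap_time = time_it { swap_loop(ITERATIONS) }
return_time = time_it { simple_return_loop(ITERATIONS) }

puts format("swap:          %.6f s", swap_time)
puts format("simple return: %.6f s", return_time)
```

Both loops use a plain while so that the measured work is mostly the swap or the call itself rather than block dispatch.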

All in all, I’m very happy with these results. They show that the project is advancing and is a viable ruby implementation, not just a toy.

Update: More Data

Here’s the data from running on a 64-bit Xeon. The ratios are not the same because I haven’t yet got direct threading working properly on 64-bit platforms, so that impacts rubinius performance negatively in this case.

                Test                  MRI             Rubinius
    bm_app_answer.rb             0.674141             0.357815 
 bm_app_factorial.rb                Error              0.40302 
       bm_app_fib.rb             6.023813              2.86666 
bm_app_mandelbrot.rb             2.305716                Error 
 bm_app_pentomino.rb              Timeout              Timeout 
     bm_app_raise.rb             1.634094             2.681252 
 bm_app_strconcat.rb             1.541677             1.466644 
       bm_app_tak.rb             7.749194              4.02251 
     bm_app_tarai.rb             6.194152             6.621082 
    bm_loop_times.rb             3.520025            32.848938 
bm_loop_whileloop.rb             8.091596             2.464447 
bm_loop_whileloop2.rb            1.736418             0.609515 
  bm_so_ackermann.rb                Error             3.579584 
      bm_so_array.rb              4.89737              Timeout 
bm_so_concatenate.rb             1.573779              Timeout 
bm_so_count_words.rb             0.145074                Error 
  bm_so_exception.rb             3.179525             3.461771 
      bm_so_lists.rb             1.429547              Timeout 
     bm_so_matrix.rb             1.842544            10.748483 
bm_so_nested_loop.rb             5.337045            18.885963 
     bm_so_object.rb             5.428432             9.856728 
     bm_so_random.rb             2.612983            11.789056 
      bm_so_sieve.rb             0.711854             5.268267 
     bm_vm1_block.rb            26.471025             37.74646 
     bm_vm1_const.rb            14.004854             5.930651 
    bm_vm1_ensure.rb            14.199208             2.854205 
    bm_vm1_length.rb            16.117594            13.692691 
    bm_vm1_rescue.rb            11.509271             2.859993 
bm_vm1_simplereturn.rb           14.78014             8.154143 
      bm_vm1_swap.rb             22.52124             5.419691 
     bm_vm2_array.rb             6.238171             4.938015 
    bm_vm2_method.rb             9.747336             9.017925 
bm_vm2_poly_method.rb           12.513751            11.709814 
bm_vm2_poly_method_ov.rb         3.961969             2.150468 
      bm_vm2_proc.rb             6.393898            10.224857 
    bm_vm2_regexp.rb             3.224304            54.639602 
      bm_vm2_send.rb             3.375969            12.034005 
     bm_vm2_super.rb             4.012679             2.795978 
     bm_vm2_unif1.rb             3.005257             1.716368 
    bm_vm2_zsuper.rb             4.336752              2.83723 
bm_vm3_thread_create_join.rb     0.14954                Error 

9 thoughts on “Performance benchmarks”

  1. That’s seriously impressive.
    I tried to contribute a little in the beginning, but had trouble, and since I was working in an odd environment it was hard for me to post patches. I’d forgotten about it since then and worked on some interpreters of my own, but the progress is really inspiring and I’m pretty impressed with it, so I’m going to make a point of taking another swing at helping out.

    Keep up the good work!

  2. These numbers look great. For a first cut this is very acceptable. Now if it could only run Rails apps. But, that will come in time.

  3. It would be nice to see a Rubinius/MRI ratio column. Also I’m guessing that the result units are seconds (vs. iterations, work done, etc.) and smaller numbers are better, but this isn’t stated anywhere (on this page).

  4. I’ll probably start running these more often, and I’ll add a ratio column. Yep, the unit is seconds to run and smaller is better. I’ll remember to put a legend up next time.

  5. Any chance you could just add the script to generate the comparison to the rubinius distribution?

    I guess it might actually belong in whatever shared repository of specifications exists between rubinius and jruby.
