evan.musing << current

life and tech stuff by Evan Phoenix

Archive for August 2007

Performance benchmarks

with 9 comments

The project is getting more and more visible, so I feel it’s important that people have an accurate assessment of where we are. Most of us have seen these old benchmarks, so let’s see some more recent results.

I’ve only compared rubinius against MRI. These were run on my 1.67GHz PowerBook G4 laptop. I’ll run them again soon on a big beefy server so we can all compare and contrast.

Data

                Test                  MRI             Rubinius
    bm_app_answer.rb             4.727393             0.619218
 bm_app_factorial.rb             1.703541             1.268409
       bm_app_fib.rb            18.739502             5.309634
bm_app_mandelbrot.rb            11.175115                Error
 bm_app_pentomino.rb              Timeout              Timeout
     bm_app_raise.rb             6.825073             5.744597
 bm_app_strconcat.rb              4.46694             4.086749
       bm_app_tak.rb            24.502033             7.291604
     bm_app_tarai.rb            21.071624            11.594766
    bm_loop_times.rb            16.028508              Timeout
bm_loop_whileloop.rb            29.844976             4.767554
bm_loop_whileloop2.rb            5.232881             1.103926
  bm_so_ackermann.rb              Timeout             6.626395
      bm_so_array.rb            18.276808              Timeout
bm_so_concatenate.rb             5.597246              Timeout
bm_so_count_words.rb             0.099502                Error
  bm_so_exception.rb             9.262875             7.553756
      bm_so_lists.rb             3.372947              Timeout
     bm_so_matrix.rb             5.605184            23.462452
bm_so_nested_loop.rb            17.162697            54.288111
     bm_so_object.rb            19.170284            21.458258
     bm_so_random.rb             7.168005            24.018174
      bm_so_sieve.rb             4.870225            15.966473
     bm_vm1_block.rb              Timeout              Timeout
     bm_vm1_const.rb            47.509964             11.05456
    bm_vm1_ensure.rb            45.855905             5.395634
    bm_vm1_length.rb            49.408877            24.633311
    bm_vm1_rescue.rb            35.682676             6.107035
bm_vm1_simplereturn.rb          47.935916            15.641316
      bm_vm1_swap.rb              Timeout             8.041324
     bm_vm2_array.rb            18.284155            12.393373
    bm_vm2_method.rb             34.70641            16.282999
bm_vm2_poly_method.rb            45.63907            23.423347
bm_vm2_poly_method_ov.rb        11.462504             6.403511
      bm_vm2_proc.rb            22.432265            27.886953
    bm_vm2_regexp.rb            11.580949              Timeout
      bm_vm2_send.rb            11.223254            21.448888
     bm_vm2_super.rb            12.599854             4.998904
     bm_vm2_unif1.rb            10.726272             3.185878
    bm_vm2_zsuper.rb            14.236729             4.948068
bm_vm3_thread_create_join.rb     0.086081              Timeout

Analysis

There are a few trends in the data I’d like to point out.

  • Of the tests that rubinius completed without an error or timeout, it was faster in 24 of 31. Wow, first off, that’s a huge improvement over the previous run.
  • The slowest section was bm_so. Rubinius was only faster in 2 of 11, and actually errored or timed out on 4. If you look at those benchmarks, you’ll see that they are basically tests of a few core methods, mainly things like String#<<. So it makes sense that at this stage, we’re slower on those. We haven’t yet tuned those at all.
  • One big trend is that tests that only stress the VM architecture came out WAY faster. Two examples are bm_vm1_swap and bm_vm1_simplereturn. The first swaps two local variables using a, b = b, a a few million times. This is a good example where the bytecode VM is much faster than the tree walker in MRI. Next, bm_vm1_simplereturn shows off rubinius’s ability to create a method context quickly and return to the sender quickly. I’m thrilled about this number because even though rubinius MethodContexts are first class, they’re still 3 times faster, with no loss of programming power.
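
For a sense of what this kind of micro-benchmark measures, here is a minimal sketch of a swap loop in the spirit of bm_vm1_swap. It is not the contents of the actual benchmark file; the method name swap_loop and the iteration count are assumptions made up for illustration.

```ruby
require "benchmark"

# Hypothetical stand-in for a swap micro-benchmark (not the real bm_vm1_swap.rb).
# Swaps two locals n times using parallel assignment and returns the final pair.
def swap_loop(n)
  a = 1
  b = 2
  i = 0
  while i < n
    a, b = b, a   # the operation being stressed
    i += 1
  end
  [a, b]
end

# Iteration count is an arbitrary assumption; raise it for more stable timings.
elapsed = Benchmark.realtime { swap_loop(1_000_000) }
puts format("1,000,000 swaps took %.3f seconds", elapsed)
```

Nearly all of the time in a loop like this goes to local variable access and the loop itself, which is why it says more about the VM than about the core library.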

All in all, I’m very happy with these results. They show that the project is advancing and is a viable ruby implementation, not just a toy.

Update: More Data

Here’s the data from running on a 64-bit Xeon. The ratios are not the same because I haven’t yet got direct threading working properly on 64-bit platforms, so that impacts rubinius performance negatively in this case.

                Test                  MRI             Rubinius
    bm_app_answer.rb             0.674141             0.357815
 bm_app_factorial.rb                Error              0.40302
       bm_app_fib.rb             6.023813              2.86666
bm_app_mandelbrot.rb             2.305716                Error
 bm_app_pentomino.rb              Timeout              Timeout
     bm_app_raise.rb             1.634094             2.681252
 bm_app_strconcat.rb             1.541677             1.466644
       bm_app_tak.rb             7.749194              4.02251
     bm_app_tarai.rb             6.194152             6.621082
    bm_loop_times.rb             3.520025            32.848938
bm_loop_whileloop.rb             8.091596             2.464447
bm_loop_whileloop2.rb            1.736418             0.609515
  bm_so_ackermann.rb                Error             3.579584
      bm_so_array.rb              4.89737              Timeout
bm_so_concatenate.rb             1.573779              Timeout
bm_so_count_words.rb             0.145074                Error
  bm_so_exception.rb             3.179525             3.461771
      bm_so_lists.rb             1.429547              Timeout
     bm_so_matrix.rb             1.842544            10.748483
bm_so_nested_loop.rb             5.337045            18.885963
     bm_so_object.rb             5.428432             9.856728
     bm_so_random.rb             2.612983            11.789056
      bm_so_sieve.rb             0.711854             5.268267
     bm_vm1_block.rb            26.471025             37.74646
     bm_vm1_const.rb            14.004854             5.930651
    bm_vm1_ensure.rb            14.199208             2.854205
    bm_vm1_length.rb            16.117594            13.692691
    bm_vm1_rescue.rb            11.509271             2.859993
bm_vm1_simplereturn.rb           14.78014             8.154143
      bm_vm1_swap.rb             22.52124             5.419691
     bm_vm2_array.rb             6.238171             4.938015
    bm_vm2_method.rb             9.747336             9.017925
bm_vm2_poly_method.rb           12.513751            11.709814
bm_vm2_poly_method_ov.rb         3.961969             2.150468
      bm_vm2_proc.rb             6.393898            10.224857
    bm_vm2_regexp.rb             3.224304            54.639602
      bm_vm2_send.rb             3.375969            12.034005
     bm_vm2_super.rb             4.012679             2.795978
     bm_vm2_unif1.rb             3.005257             1.716368
    bm_vm2_zsuper.rb             4.336752              2.83723
bm_vm3_thread_create_join.rb     0.14954                Error 
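
One way to compare the two machines is to turn each row into a speedup ratio (MRI time divided by rubinius time). The sketch below does that for a few rows copied from the tables above. The helper name speedup and the choice to return nil for Timeout or Error entries are assumptions made for illustration, not part of the benchmark harness.

```ruby
# Hypothetical helper: how many times faster rubinius ran than MRI.
# Returns nil when either entry is not a number (e.g. "Timeout" or "Error").
def speedup(mri, rbx)
  return nil unless mri.is_a?(Numeric) && rbx.is_a?(Numeric)
  mri / rbx.to_f
end

# [MRI, Rubinius] timings copied from the two tables above.
rows = {
  "bm_vm1_simplereturn.rb (G4)"   => [47.935916, 15.641316],
  "bm_vm1_simplereturn.rb (Xeon)" => [14.78014, 8.154143],
  "bm_so_array.rb (G4)"           => [18.276808, "Timeout"],
}

rows.each do |name, (mri, rbx)|
  ratio = speedup(mri, rbx)
  puts ratio ? format("%-32s %.2fx", name, ratio) : format("%-32s n/a", name)
end
```

A ratio above 1 means rubinius was faster. For bm_vm1_simplereturn the ratio is roughly 3x on the G4 and under 2x on the Xeon, which is the kind of difference between machines mentioned above.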

Written by evanphx

August 24, 2007 at 11:10 am

Posted in rubinius, ruby

Maintaining an svn mirror directly from git

with 4 comments

When rubinius switched to git recently, we wanted the ability to keep a read-only svn repo running that had the same changes in it. This would let casual users who don’t wish to use git at least check out the latest code easily. So with some jiggering, I came up with the following recipe.

1) Set up tailor

Tailor is a python program which is used to translate changes in one version control system into another. It does this by using the native tools for the systems and a working copy of the code. When there is a change, it simply updates the working copy from the source, then checks the changes into the target. It’s basically a brute force way, but works quite well.

I use the following tailor config file for rubinius:


[DEFAULT]
verbose = True

[rbx]
target = svn:target
start-revision = c6f4d90df72b103884fa5470a433f5513d2c524d
root-directory = /home/evan/work/tailor/output
state-file = tailor.state
source = git:source
subdir = .

[git:source]
repository = /var/cache/git/code

[svn:target]
module = /rubinius/trunk
repository = file:///home/evan/work/rbx-git-tailor

  • start-revision was about 10 commits back from HEAD at the time I did the import. I did this so I didn’t have to wait for tailor to replay all of the commits in git, but still included the last 10. Moving forward, it does all commits.
  • root-directory is the working copy directory tailor uses to pull in and commit changes. Make sure it’s an empty directory.
  • /var/cache/git/code is a bare git repo, so be sure that repository points to a bare git repo. In fact, most people will tell you to only use a bare git repo (not one that also contains a working copy) on servers which you push to. push does not update the working copy and it can get quite confusing otherwise.
  • The svn target repository should be a path that doesn’t exist. Tailor will create the repo the first time it runs. Do NOT point it at an existing one!

2) Git hook

Next, I used the post-update hook in git to automatically run tailor. Here’s my post-update hook currently:


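# Messages shown to whoever is pushing
echo ""
echo "Updating http://git.rubini.us/svn for the less git inclined"

# Wait for the lock, retrying every second
lockfile -1 ~/tailor.lock

# Replay the new git commits into svn, logging tailor's output
/usr/bin/tailor -c /home/evan/work/tailor/config > ~/tailor.log

# Tell the irc bot there are new commits
sudo -u evan /bin/kill -USR2 `cat /home/evan/work/matzbot/matzbot.pid` > /dev/null 2>&1 || true

rm -f ~/tailor.lock

exec git-update-server-info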

The output of the post-update hook goes to the client doing the push, so the echoes are for the git developers’ benefit (they’ll probably wonder why their push pauses at the end if git is sooooo fast otherwise).

I use the lockfile program so that two pushes don’t try to run tailor at the same time. I don’t know what would happen and personally don’t want to know. Better safe than sorry.
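
The same serialize-the-hook idea can be sketched without the lockfile program. The hook above uses lockfile; the version below swaps in an advisory lock via Ruby’s File#flock instead, purely as an illustration. The method name with_lock and the lock path are assumptions, not part of the setup described here.

```ruby
require "tmpdir"

# Hypothetical helper: run the block while holding an exclusive advisory lock.
# This uses File#flock rather than the lockfile program from the hook above.
def with_lock(path)
  File.open(path, File::RDWR | File::CREAT, 0o644) do |f|
    f.flock(File::LOCK_EX)      # blocks until any other holder releases the lock
    begin
      yield
    ensure
      f.flock(File::LOCK_UN)    # released even if the block raises
    end
  end
end

# Usage: the lock path here is a made-up example location.
lock_path = File.join(Dir.tmpdir, "tailor-example.lock")
with_lock(lock_path) do
  puts "only one process at a time runs this section"
end
```

One difference from the hook above: the lock here is dropped in an ensure clause, so it is released even when the protected section fails partway through.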

The kill -USR2 tells an irc bot we run in #rubinius that there are new commits to show people. That’s available in git.

Caveats

I have all my git developers pushing via ssh, all as the git user, i.e. git clone git@git.rubini.us/code. This means the post-update hook is run as the git user. So while the tailor working copy is in my home directory, I’ve chgrp’d it to git and run chmod g+r -R so the git user can properly use it.

I have the output from running tailor redirected into a file so I can monitor it. So far, the only problem I’ve had with this is that my git user didn’t have a name in /etc/passwd, so git complained about not being able to properly form the author field.

This is read only. Do NOT let people check directly into the svn repo tailor is updating.

Written by evanphx

August 17, 2007 at 1:11 pm

Posted in git, rubinius

Switched to wordpress.com

I got fed up trying to get wordpress to behave nicely on my server, so I’ve moved my blog to wordpress.com, where they’ve got everything already set up nicely.

Let me know if anything isn’t working.

Written by evanphx

August 13, 2007 at 3:51 pm

Posted in Uncategorized

another rubinius preview release

I’ve cut 0.8, another developer preview for people to play with.

It’s available on rubin.us

Rubinius is still very much a work in progress, so there are a lot of rough corners you’re bound to run into.

Written by evanphx

August 13, 2007 at 2:12 pm

Posted in rubinius

An overdue update

with one comment

A long overdue update about my big project rubinius.

Scheduling

The team is still moving along nicely, still aiming for a 1.0 release by RubyConf 2007. We’ve still got a ways to go, but I’m confident. A part of the team is getting together for a sprint in mid September. Other projects have used sprints to really pull away, productivity wise, and I’m hoping we can do the same.

Git

In the last few days, I decided to migrate the project off Subversion to Git, the DSCM. While I’m certain some will see this as a complete waste of time, I feel that it’s important for the project in the long term, and the developers in the short term.

Long Term

As many people are aware, the mainline ruby interpreter (MRI) suffers from a lack of transparency. The perception (I can only speak for rubyists in the US, and perhaps a few in Europe) is that the ruby-core team works at their own pace and doesn’t accept much input into the process, nor does it report on the process much.

Now, whether or not you agree with that assessment is not what I’m concerned about. It’s the long term perception and the possibility that this same thing could happen to rubinius. So rather than wait and see, I’ve decided that the best way to avert this is to make it as easy as possible to contribute to and progress rubinius. Again, some will argue that git was not required to reach this goal, and that is a valid argument. Part of it was just an irrational decision: I’ve been interested in git for a while and wanted to play with it more.

Short Term

Local branches, sane merging, a toolkit interface, oh my! I’ve already fallen in love with the parts of git that svn lacks, which in my book was a reason to switch anyway. The tools are richer and more powerful. The code is cleaner. Nuff said.

Application Push

We’re currently in a phase of development I’ve been calling Application Push, which is just a fancy term for try to run shit. The existing body of ruby code is quite dense and provides an excellent proving ground to flesh out rubinius. The project has finally progressed enough that this level of proving can be done. Charles has talked about how this style of dev is what really pushed JRuby to 1.0, so we’re hoping to follow in those same footsteps.

Currently playing in iTunes: The Outernationalist by Thievery Corporation

Written by evanphx

August 13, 2007 at 12:44 am

Posted in git, rubinius

A return to blogging

with one comment

It’s again been too long since the last update. If anyone has a good way to keep yourself disciplined on writing regular posts, please comment and let me know the secret.

I’m going to keep this one personal, then write another with tech stuff. Abby and I are adjusting to life in LA nicely. Lots of restaurants to explore as always, and our new favorite game: Spot the celebrity! I’m ahead in the standings due to an uncanny ability to pick up the subtle clues they give off.

Most recently, I spotted Jane Lynch (40 Year Old Virgin) at Urth Caffe with Courtney. Abby has recently started to do well, spotting Cassidy Lehrman (Sarah Gold, Entourage) at the Santa Monica pier, and Ben McKenzie (Ryan Atwood, The OC) at Stardust last night.

Hopefully we don’t seem vapid and fame-obsessed, but this is LA, where this happens quite a bit. Think of it like a post-modern Slug Bug.

As always, Fog, our muse around the house, has provided much entertainment. Since I’ve begun working at home, we’ve started to get on a routine: Feed her, play a little, she sleeps for 4 hours, we play more, watch some TV, sleep a bit more, feed again, repeat. It’s tough being a Felis domesticus.

Update: I have evidence

Written by evanphx

August 13, 2007 at 12:06 am

Posted in life, los angeles
