Welcome to the Club

I’d like to formally welcome the maglev development team over at Gemstone to the Ruby environment club.

For those of you that haven’t yet heard of maglev, it’s a brand new Ruby VM being developed by the folks over at Gemstone. Gemstone are the makers of probably the most advanced object-oriented database in use today, and have traditionally been a Smalltalk shop until recently.
With the tide rising on Ruby, I’m happy to see another player enter the field. This can only mean that Ruby is continuing to mature and that the community is healthy.

I was personally excited to read an interview with Bob Walker and Avi Bryant concerning maglev, because Rubinius is mentioned more than a few times. They’re looking at Rubinius for a couple of reasons. For one, there is the RubySpec suite we’ve been developing and are about to spin off. The more people that we see using the suite and depending on it, the more mature it will become. Not having a spec for Ruby is commonly touted as a reason that it’s a toy, immature language, and anything we can do to dispel that thinking is good for the community.

The other reason that I’m excited about maglev is that they’re taking a very similar approach to the problem of building a Ruby environment. Like Rubinius, the VM is minimal and most of the kernel is implemented in Ruby.
My hope is that the kernel of Rubinius can be refactored and developed to be generic enough for other environments to use. While I know little about maglev’s current environment, it seems a natural fit for them to build off the work in the Rubinius kernel. I’d hate to see people develop the core functionality of a Ruby environment yet again (I count 5 code bases to this effect currently: MRI, JRuby, Ruby.NET, IronRuby, and Rubinius).

Being able to use a generic Ruby kernel is not unique to a smalltalk style VM. With some luck, it could be used by the folks in other environments as well. In my eyes, this is a big win for everyone. For one, this would mean a common code base that consists of the primary Ruby functionality, and thus would mean a vastly reduced worry of fragmentation. Plus it would alleviate the need for this code to be written again, letting future environment developers focus on taking Ruby to the next level in terms of platform integration, performance, etc.

Rubinius Retort

By now, a good deal of you have read Charles’ breakdown of Ruby implementations.
If you haven’t, please go read at least the Rubinius section before reading the rest of this post, as it is largely a response to that.

Now, on to Charles’ section on Rubinius:

Evan Phoenix’s Rubinius project is an effort to implement Ruby using as much Ruby code as possible. It is not, as professed, “Ruby in Ruby” anymore. Rubinius started out as a 100% Ruby implementation of Ruby that bootstrapped and ran on top of MatzRuby. Over time, though the “Ruby in Ruby” moniker has stuck, Rubinius has become more or less half C and half Ruby. It boasts a stackless bytecode-based VM (compare with Ruby 1.9, which does use the C stack), a “better” generational, compacting garbage collector, and a good bit more Ruby code in the core libraries, making several of the core methods easier to understand, maintain, and implement in the first place.

A little background is in order, to put things straight. Rubinius began as a hobby, back in February of 2006 (Same month I got married, that’s how I recall).
At RubyConf 2006, I gave a presentation on what was then the initial work, which at that point constituted 3 bodies of work.

  1. A VM written in ruby, using RubyInline to access some raw operations. Slower than you can imagine.
  2. A VM written in C, created by hand translating the ruby code into C. Parts of this work were originally done using a translator program I’d written, which tried to convert the VM in ruby into C mechanically. That proved beyond my skill and time level, and I felt it was more important to have something running.
  3. A kernel of ruby code, implementing 95% of the core library / kernel / class library of 1.8. The terminology for this part has always been fuzzy in the Ruby community. Rubinius calls this the kernel, some call it the standard library, some the class library. It’s the implementations of the builtin classes such as Array, Hash, etc.

It’s plainly true that today, the VM is about 22,000 lines, the kernel 23,000 lines. I’ve never hidden this fact from anyone; in fact I’ve put those numbers directly into presentations. That’s been true for pretty much the entire life of the project in the public. The initial ruby prototype was only ever run by me.

I do though believe that it still can claim “Ruby in Ruby”. When I present on Rubinius or am asked about this, the response I give is:
What is Ruby?
The typical response is that Ruby is 3 things:

  • A syntax
  • An execution model
  • A kernel

Again, let’s have some context. When I began this project, there was buzz about improving things like String and Array. In 1.8, this requires diving down into C right off the bat. Plus, consider languages such as C++ and Java. Java largely claims to be written in Java, since almost the entire class library is written in Java. This lets it evolve faster, because there is no mismatch between Java user code and the Java class library.
It is this that we typically mean when we talk about “Ruby in Ruby”. If I’ve not explained this well enough in person and in type, I take full responsibility for this misunderstanding.
There is the long term goal of having a VM which is mechanically generated from Ruby code, in the same way Squeak’s VM is written. Since RubyConf 2006 there has been no additional work on this, but there is a very good reason for that.

Rubinius today has around 150 people who have received commit rights. The vast, vast majority of their work has been in the kernel, because this is the largest part of the whole system. And probably 95% of that work has been writing Ruby code. This means that for pretty much all contributors, helping with Rubinius means writing Ruby code. And thus to them, it is Ruby in Ruby.

The promise of Rubinius is pretty large. If it can be made compatible, and made to run fast, it might represent a better Ruby VM than YARV. Because a fair portion of Rubinius is actually implemented in Ruby, being able to run Ruby code fast would mean all code runs faster. And the improved GC would solve some of the scaling issues Ruby 1.8 and Ruby 1.9 will face.

Rubinius also brings some other innovations. The one most likely to see general visibility is Rubinius’s Multiple-VM API. JRuby has supported MVM from the beginning, since a JRuby runtime is “just another Java object”. But Evan has built simple MVM support in Rubinius and put a pretty nice API on it. That API is the one we’re currently looking at improving and making standard for user-land MVM in JRuby and Ruby 1.9. Rubinius has also shown that taking a somewhat more Smalltalk-like approach to Ruby implementation is feasible.

But here be dragons.

In the 1.5 years since Rubinius was officially named and born into the Ruby world, it has not yet met any of these promises. It is not generally faster than Ruby 1.8, though it performs pretty well on some low-level microbenchmarks. It is not implemented in Ruby: the current VM is written in C and the codebase hosts as much C code as it does Ruby code. Evan’s work on a C++ rewrite of the VM will make Rubinius the first C++-based Ruby implementation. It has not reached the Rails singularity yet, though they may achieve it for RailsConf (probably in the same cobbled-together state JRuby did at JavaOne 2006…or maybe a bit better). And the second Rails inflection point–running Rails faster than Ruby 1.8–is still far away.

Charles once again gets my hackles up, though his points are true. We’ve yet to run Rails. We’ve yet to run significant Ruby code faster than 1.8. I am finishing up a C++ rewrite of the VM.

I’ve addressed the Ruby in Ruby phraseology above, so let’s move past that.

Performance is improving at a slow, regular pace. This is because of 2 factors:

  • VM improvements. Adding more caches, more VM logic to make its constructs faster. This happens far more infrequently than:
  • Ruby code improvements. This happens quite often, because we have so many people working in the kernel. These kinds of improvements will get us a long way, but not the entire way to 1.8 performance. That’s where VM improvements help.

Again, he brings up the sizes of the VM in comparison to the kernel. This will be the last time I address this in this post. Ruby is a dynamic language, which boasts a very rich, featureful kernel. Its syntax and constructs allow for short, succinct algorithms.
So while, yes, the kernel is the same number of lines as the VM, it’s not unreasonable to say that it probably constitutes 10x the functionality. This is because the written Ruby code is shorter and easier to understand. That’s the whole point of this project, to make the core of it easier to work on and evolve.
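
To make the “kernel in Ruby” idea concrete, here is a minimal sketch. It is not Rubinius source: `TinyList` is a hypothetical toy class, and treating `size` and `at` as the only VM-level primitives is an assumption made purely for illustration. Everything else is layered on top in plain Ruby.

```ruby
# Hypothetical toy collection; NOT Rubinius source. It stands in for a
# builtin class whose only "primitive" operations are size and at(index).
class TinyList
  def initialize(*items)
    @items = items
  end

  # Pretend these two methods are the VM-level primitives.
  def size
    @items.size
  end

  def at(index)
    @items[index]
  end

  # Everything below is ordinary Ruby built on the two primitives.
  def each
    i = 0
    while i < size
      yield at(i)
      i += 1
    end
    self
  end

  def map
    out = []
    each { |item| out << yield(item) }
    out
  end

  def include?(target)
    each { |item| return true if item == target }
    false
  end
end

list = TinyList.new(1, 2, 3)
p list.map { |n| n * 2 }   # [2, 4, 6]
p list.include?(2)         # true
p list.include?(9)         # false
```

Each higher-level method is only a few lines, which is the readability argument. The flip side is that a single `include?` call fans out into several method sends per element, which is the dispatch cost Charles raises later in this post.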

Compatibility is not going to be a problem for Rubinius. They’ve worked very hard from the beginning to match Ruby behavior, even launching a Ruby specification suite project to officially test that behavior using Ruby 1.8 as the standard. I have no doubt Rubinius will be able to run Rails and most other Ruby apps people throw at it. And despite Evan’s frequent cowboy attitude to language compatibility (such as his early refusal to implement left-to-right evaluation ordering, a fatal decision that led to the current VM rework), compatibility is likely to be a simple matter of time and effort, driven by the spec suite and by actual applications, as people start running real code on Rubinius.

A quick personal response to a personal attack. I never once refused to implement left-to-right evaluation ordering; that is a bald-faced lie. It’s totally true that Rubinius today is right-to-left, because that was much easier to implement way back in the day when the project began. As we started to work on ActiveRecord, we found that there was code that appeared to depend on left-to-right ordering, so I brought it up with matz. And now I’m in the midst of changing it. Truth be told, I should have done my research back when the project started; it would have been easier to fix this then than now.
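
For readers wondering how code can depend on evaluation order at all, here is a small sketch. The helper names `note` and `pair` are made up for illustration. The arguments have side effects, so the order in which they are evaluated is observable.

```ruby
# Records the order in which argument expressions are evaluated.
def note(log, label)
  log << label
  label
end

def pair(first, second)
  [first, second]
end

log = []
result = pair(note(log, :left), note(log, :right))

p result  # [:left, :right] whatever the evaluation order is
p log     # [:left, :right] on MRI, which evaluates arguments left to right
```

A VM that evaluates arguments right to left would still return the same `result`, but `log` would come out reversed, and any code relying on those side effects would behave differently.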

But I take issue with Charles’ statement that I’m operating fast and loose with language compatibility. We have an awesome team working on RubySpecs, which will end up being a definitive reference for 1.8 behavior. I will always be the first one to defend their behavior, and get Rubinius implementing it properly.

That’s not to say that Rubinius has never made temporary pragmatic decisions in implementation. We absolutely have, and in time those are corrected.
Perhaps Charles mistakes my pragmatism and Montana upbringing for a cowboy attitude.

Performance is going to be a much harder problem for Rubinius. In order for Rubinius to perform well, method invocation must be extremely fast. Not just faster than Ruby 1.8 or Ruby 1.9, but perhaps an order of magnitude faster than the fastest Ruby implementations. The simple reason for this is that with so much of the core classes implemented in Ruby, Rubinius is doing many times more dynamic invocations than any other implementation. If a given String method represents one or two dynamic calls in JRuby or Ruby 1.8, it may represent twenty in Rubinius…and sometimes more. All that dispatch has a severe cost, and on most benchmarks involving heavily Ruby-based classes Rubinius has absolutely dismal performance–even with call-site optimizations that finally pushed JRuby’s performance to Ruby 1.9 levels. A few benchmarks I’ve run from JRuby’s suite must be ratcheted down a couple orders of magnitude to even complete.

He’s absolutely correct. We have a ways to go, but I believe we can get there. Others before us have made it work, and I think so shall we.

And the Rubinius team knows this. Over the past few months, more and more core methods have been reimplemented in C as “primitives”, sometimes because they have to be to interact with C-level memory and VM constructs, but frequently for performance reasons. So the “Ruby in Ruby” implementation has evolved away from that ideal rather than towards it, and performance is still not acceptable for most applications. In theory, none of this should be insurmountable. Smalltalk VMs run significantly faster than most Ruby implementations and still implement all or most of the core in Smalltalk. Even the JVM, largely associated with the statically-typed Java language, is essentially an optimized dynamic language VM, and the majority of Java’s core is implemented in Java…often behind interfaces and abstractions that require a good dynamic runtime. But these projects have hundreds of man-years behind them, where Rubinius has only a handful of full-time and part-time enthusiastic Rubyists, most with no experience in implementing high-performance language runtimes. And Evan is still primarily responsible for everything at the VM level.

Of course, it would be folly to suggest that the Rubinius team should focus on performance before compatibility. The “Ruby in Ruby” meme needs to die (seriously!), but other than that Rubinius is an extremely promising implementation of Ruby. Its performance is terrible for most apps, but not all that much worse than JRuby’s performance was when we reached the Rails singularity ourselves. And its design is going to be easier to evolve than comparable C implementations, assuming that people other than Evan learn to really understand the VM core. I believe the promise of Rubinius is certainly great enough to continue the project, even if the perils are going to present some truly epic challenges for Evan and company to overcome.

Thank you for the kind words of encouragement, Charles. We’re getting there.
I want to say briefly as well that Charles and I are good friends; I just wanted to clear the air slightly and get everyone on the same page.

super is your friend

Sitting here in Copenhagen, at RubyFools, I thought I’d share a technique that I’ve known about for some time, but that seems not to have made it into the normal ruby vernacular.

This trick is the use of super in methods contained in a Module. Consider the following code:

module N
  def go
    puts "N#go"
    super
  end
end

class B
  def go
    puts "B#go"
  end
end

class A < B
  include N
end

A.new.go

Will print:

N#go
B#go

This is HIGHLY useful when implementing rails plugins, where normally you’d use alias_method_chain to change a method directly inside ActiveRecord::Base. Instead, simply call super in the method that provides the new functionality. When your module is included in your model class (which is a subclass of ActiveRecord::Base), the main ActiveRecord::Base implementation will be called.

NOTE: This trick only works if the method you wish to wrap is located in a superclass of the class you have included the module in, i.e., if N were included in B directly rather than A, N#go would never be called.
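
The NOTE is easiest to see by looking at the lookup chain. The sketch below reuses the names N, B and A from above, but includes N in B directly. As an adaptation for this example, the methods return arrays rather than calling puts, so the result can be inspected.

```ruby
module N
  def go
    ["N#go"] + super
  end
end

class B
  include N   # N now sits *behind* B in the method lookup chain

  def go
    ["B#go"]
  end
end

class A < B
end

p A.ancestors.take(3)  # [A, B, N]
p A.new.go             # ["B#go"]
```

Because B#go is found before N#go and does not call super, the module's version is never reached. Including N in the subclass A instead puts N ahead of B in the chain, which is why the original example works.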

Ode to Airport Security

Oh how I love, to stand in line
I stood here so long, I came up with this rhyme

We curse and we shuffle, from left to right
Seems like we’ll be here all damn night

The problem is hard, it’s NP complete
But after seven years, you’d think it’d be beat

I’m finally through, waiting here at the gate
When I arrive, I won’t know the date.

Apple TV 2.0

I have to gush about my favorite feature of the Apple TV upgrade. It’s a feature I’ve wanted since Apple TV came out. You can now use an Apple TV as remote AirTunes speakers!!

I love it.

Day to day rubinius

The venerable Eric Hodel (drbrain) has whipped the rubinius team into blogging more, so this should be the first of many posts to come.

Development front

We continue to push forward, getting rubinius running everyday ruby code. I continue to use irb under rubinius daily, and it’s proved quite stable.

We have, though, begun to hit standard library code that is quite tricky. Case in point: mathn.rb. mathn.rb adds some new methods to Fixnum and Bignum, as well as redefining some very core methods, like Fixnum#/ (divide). We’re currently working through how to support this because, as is, when mathn redefines that method to start returning Floats, most of the rubinius system goes south. That’s because the kernel currently assumes that if it uses Fixnum#/, a Fixnum is returned, and it can pass that to other primitives and such.
We’re still unsure about the long term solution for this, but it does bring up a problem with having a very open, dynamic language, where the kernel of the language uses that same open, dynamic runtime. A user can change the behavior of a core system method, and affect a lot more than they could in MRI. This is the famous double-edged sword.
My hope is that we’re able to code the kernel a little defensively and really stabilize the kernel to the point that these kinds of changes won’t cause the whole system to go south.
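
Here is a rough sketch of the failure mode, written for current Ruby, where Fixnum has been folded into Integer. `half_steps` is a hypothetical stand-in for a kernel method implemented in Ruby, and the override is a simplified imitation of the change described above, not mathn’s actual code. The original method is restored afterwards so the damage stays contained.

```ruby
# Hypothetical stand-in for a kernel method written in Ruby. It assumes
# Integer#/ returns an Integer and calls Integer#times on the result.
def half_steps(n)
  steps = []
  (n / 2).times { |i| steps << i }
  steps
end

before = half_steps(6)

class Integer
  alias_method :__original_div, :/

  # Simplified override (an assumption for illustration): division now
  # returns a Float instead of an Integer.
  def /(other)
    to_f / other
  end
end

after =
  begin
    half_steps(6)
  rescue NoMethodError => e
    e.class
  ensure
    # Put the original method back so the rest of the program is unaffected.
    class Integer
      alias_method :/, :__original_div
      remove_method :__original_div
    end
  end

p before  # [0, 1, 2]
p after   # NoMethodError, because Float has no #times
p 6 / 2   # 3 again after the restore
```

Nothing in `half_steps` changed, yet it broke as soon as user code redefined a method it relied on. Coding the kernel defensively would mean not trusting the return type of such calls, or bypassing the user-visible method entirely.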

Conference front

Seems that 2008 is going to be the year of conferences for me.

  • On Feb 8th and 9th, Ezra and I will be at acts_as_conference, giving a charity tutorial talk about how to be a better ruby developer.
  • Next, I’ve been invited to give a talk about rubinius, as well as the “Party” keynote (not sure what the party part is actually) at RubyFools over in Copenhagen, Denmark. This should be a lot of fun for a few reasons.
    1. I’ll be giving a non-Rubinius specific talk for once (the keynote), which is something I’ve wanted to do for a while.
    2. I’ve never been to Denmark, so it’s always fun to visit a new city.
    3. Abby is coming too! She’s going to sightsee and such while I’m at the conference, and we’ll have a few days before and after to take day trips and such. She loves Europe, so it should be a great time.
  • There are a bunch after this, but I’m not yet sure which I’m going to. The total is something like 5 or 6 before June 1st, so it’s going to be busy busy busy no matter what.

More to come!

p = mv (aka Momentum)

There have been some amazing things happening in the rubinius world in the last week.

  • compiler2 (aka. c2) became stable enough that we began using it for everyday usage. This is big in my mind because c2 is vastly superior to compiler1. Also, it shows the power of the approach we’re taking. To my knowledge, we’re the only project to write 2 completely separate compilers for our project. We were able to do this primarily because both compilers have been written entirely in ruby.
  • Springboarding off c2 being stable, Kernel#eval() got implemented in really an incredibly small number of lines and in a short amount of time. I began working on it at midnight one night. Worked til 3am that night, then again the rest of the next day. So in about 12 hours, we had an eval that passes the craziest of eval specs. Most of the reason for that was again the approach. Having MethodContexts be first class objects made getting all the proper info for eval simple. The architecture for c2 made it super easy to add in the custom logic to make eval operate properly. Again, all done 100% in ruby.
  • Again, coming off eval working, irb became functional. Getting it running flushed out a few bugs (5 I think), and it runs perfectly. This is big for anyone that has looked at the irb codebase. It contains a lot of meta-magic as well as a large, complicated ruby tokenizer for parsing the code the user types in. This is giant news for me because it means that the project has finally built the codebase and momentum to run real world ruby code. I almost wet my pants when I fired up irb and it worked.
  • Galvanized by the success of irb running, I started in on rake. Rake contains a great set of tests to use to make sure rubinius is actually working properly, so I started in on those. I first found that rake’s tests use flexmock, another library that Jim wrote. So I started on flexmock’s tests instead. That exposed a few bugs, but after about an hour, it was running at 100%. Flexmock uses a TON of meta-magic, so having it work validates a lot of work we’ve been doing. On to rake’s tests, again flushing out bugs. I managed to get everything working except for a few things that depend on our currently broken Dir.glob.
    But in the end, I was able to have rake process our Rakefile properly and spit out the list of tasks and run a few of them. So awesome.
  • Lastly, we removed compiler1 completely and renamed compiler2 to compiler. The compiler is dead! Long live the compiler!
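
The eval item above relies on execution contexts being ordinary objects. In standard Ruby the closest user-visible analogue is Binding, so the sketch below uses that as a stand-in. It does not show Rubinius’s MethodContext API, and `make_context` is a made-up helper.

```ruby
# Returns the execution context of this method call as an object.
def make_context
  secret = 41
  binding
end

context = make_context

# The context outlives the call and can be inspected or evaluated against.
p context.local_variables              # [:secret]
p eval("secret + 1", context)          # 42
p context.local_variable_get(:secret)  # 41
```

Once a context can be captured and passed around like this, eval mostly reduces to compiling a string and running it against the right context object.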

Man, it’s been a busy last 7 days. I’m so excited about these developments because as I said earlier, it means that the momentum on the project has built up to the point that we’re making real progress running real world code. This is a seminal moment for the project. Up to now, we’ve been running uphill, slogging through specs and trying to get enough code in place. We’ve finally reached the top of the first hill and now we’re running downhill, headed for 1.0. There are a lot more hills in front of us, but clearing the first one means that there is no going back.

My hat is off to every single rubinius committer and person that has helped out on the project, most of all EngineYard. We’d never have reached this milestone in the project this early if it weren’t for all of you. Find me on the street and I’ll buy you a coffee, or in a bar and I’ll buy you a beer.

Enjoy!

EY Rubinius sprint

Brian, Wilson, Ryan, Eric, myself, and a rotating cast of dozens are having the first of many Rubinius sprints here in beautiful San Francisco.

In addition to getting all kinds of great code done, we’ve been filling out a lot of paperwork, because I’m proud to officially announce that Ryan Davis and Eric Hodel are now EY employees, working on Rubinius!!!
This has been a few months in the works, and I’m thrilled that it has worked out as well as it has. Eric has been working hard on getting rubygems running, while Ryan has continued to plug along on his new ruby parser. I can’t say enough about the skills that these guys bring to the project.

In addition to Ryan and Eric, Wilson Bilkovich and Brian Ford will be starting with EY, paid to work on Rubinius, in January. Again, I’m so amazed and thrilled that EY is providing Rubinius with the funds to let these guys work on a project they love fulltime.

More to come on the details of the sprint. We’re deep into day 3 and getting a lot done.

Rubyconf wrap-em-up

I’ve arrived back in Los Angeles after another amazing RubyConf. Everything that people say about the conference is true, it’s really a wonderful atmosphere and I enjoyed every minute. Tons of great talks and it’s so much fun to see all the fun ruby people all at the same time.

We (Wilson, Brian, and myself) didn’t get a lot of rubinius code written, but that’s OK. We discussed our next steps and gave a lot of demos to people interested in the project.

Off hand, I’d guess that 6 or 7 patches were submitted by people at RubyConf (drnic, Tom Macklin, and I’m sure more), which is awesome. Last year, it took a week or so before people began contacting me about getting into the project. My hope is that that happens again this year.

I know I’ve said it before, but I have to say it again. I’m so happy with the momentum of the project and the community response. It means to me that the project will be a success because there is enough will and brain power behind it to get it over any potential barriers.

Well, thanks to everyone for the generous praise on the talk and the project.

Happy Tsar Bomba + 1 day!

I missed that yesterday was Tsar Bomba day.

The fact that humanity created 1% of the sun’s output for 39 nanoseconds is... scary and interesting at the same time. Now if we could only use something like that for good rather than killing the planet (humans are included under that), we’d be set.

It’s really too bad that the Cold War was about who could have the biggest, baddest weapons arsenal and not which country had the highest standard of living. What a bizarre and wonderful world that would have been.

That concludes today’s science/humanity lesson, now back to your regularly scheduled quagmire.