1. We are here at RubyConf 2007 in Charlotte, North Carolina. I am here with Evan Phoenix of the Rubinius project. How about we get started with you introducing yourself, what you do, and how you got to Rubinius? Sure. Right now I am employed by Engine Yard, who have graciously allowed me to work on Rubinius. Originally I was working about 50% of the time on Rubinius and 50% of the time on support issues for them. In the intervening months I have moved more and more toward working on Rubinius; currently it's about 100% Rubinius.

2. How did you get started with Rubinius? Did you get started with Smalltalk? It all started a few years ago when I was working on a project called Sydney, which got me interested in how the Ruby interpreter worked. I went and did a big cleanup of the existing interpreter, learnt the internals, and found that there were a lot of hairy corners and a lot of things that I wanted to experiment with. So I started to write little experiments here and there. I got to a point where the experiments outgrew the existing architecture, and it was about then that I thought maybe I could write my own Ruby interpreter, and I took it from there. I had been aware of Smalltalk, so I used it as a springboard during research, getting Smalltalk books and immersing myself in that world, trying to figure out how they solved similar problems. Ironically, I have never written any Smalltalk code for work, just played around here and there, so I have merely a textbook knowledge of it.

3. Were you using Squeak Smalltalk or any other Smalltalks? Yes, I have experimented with Squeak, but that's pretty much it. In doing research and looking for ways of improving Rubinius I looked at other things: Smalltalk/X is one, and there are a lot of other Smalltalk implementations that you can find.

4. Strongtalk? Yes, Strongtalk. Unfortunately the Strongtalk source is very hard to get into, so I have used it not so much for taking code out of there as for ideas about how they solved some problems.

5. Could you quickly explain the basics of Rubinius? How does it relate to Smalltalk? What does it do? Sure, it uses the same architectural decisions that the first Smalltalk VMs, and a lot of the current Smalltalk VMs, use. There are a couple of things: basically make everything first class, keep the internals very small, provide a very small number of primitive operations, and build everything on top of that. It's not so much like a real Smalltalk; it's architecturally similar.

6. I think you based most of your work on Squeak Smalltalk. Are other Smalltalks implemented the same way, as I think they are implemented mostly in C or C++? There are some differences, and unfortunately for the Smalltalk world it is hard to find the source for a lot of Smalltalk VMs. There are some that have been abandoned where you can find the source, and with Squeak being open source it is really easy to find the source for it. But I have also used documentation from things like VisualWorks that describes doing certain optimizations, like stack frame allocation and that kind of thing, because they write those documents up. And I have used those to get ideas for how to build up Rubinius.

7. The basic VM of Rubinius is written in C. Is that true? In the beginning there were plans for writing even the shotgun VM, which is the VM written in C, in Ruby, with a language called Cuby. Yes, it is written in C. We never came up with a good name for it; it was like Cuby, and we talked about having a name for it. Actually the eventual goal is to reduce the amount of C code that we have written by hand and start to generate more of the C code. We already do some kind of C metaprogramming: we have two files in shotgun that, when run as Ruby files, output large snippets of C code which are then included. That's mainly for stuff like the primitives and the instructions, and it allows us to keep them organized and also to do some preprocessing on the actual code before it gets compiled. That is a lot more preprocessing than you can do with CPP, the normal C preprocessor. But as we move forward to the end of 1.0 and into 2.0, whatever that might be, the aim is actually to stop writing C and build up that tool so we can really write as much as we can in Ruby, or something Ruby-esque, and have it output C.
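To make the idea of Ruby files that print C snippets concrete, here is a minimal sketch. It is not taken from the shotgun sources: the instruction names, the C bodies, and the `generate_dispatch` helper are all assumptions made up for illustration.

```ruby
# Hypothetical instruction table: name => C body. None of these names or
# bodies come from the real shotgun sources.
INSTRUCTIONS = {
  "push_nil" => "stack_push(Qnil);",
  "pop"      => "(void)stack_pop();",
  "dup_top"  => "stack_push(stack_top());"
}.freeze

# Emit one `case` arm per instruction, numbering opcodes by table order.
def generate_dispatch(instructions)
  arms = instructions.each_with_index.map do |(name, body), opcode|
    "  case #{opcode}: /* #{name} */\n    #{body}\n    break;"
  end
  "switch (op) {\n#{arms.join("\n")}\n}\n"
end

# Running this file prints a C fragment that a build step could #include.
puts generate_dispatch(INSTRUCTIONS)
```

Keeping the table in Ruby means the opcode numbers, the dispatch code, and any documentation can all be derived from a single definition, which is the kind of organization the answer above describes.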

8. Cuby or whatever it is called, would that be Ruby with features cut? That's funny, part of the reason we haven't used it is that I haven't decided on a really good implementation for it. We have gone back and forth on how we think it should work. Initially there was the idea that you would write some code and it would look just like normal Ruby and function exactly the same as normal Ruby, so you could run it. It could use all that core class stuff, array bracket and fixnum plus and all that kind of thing, and we may still do that. Another approach that I have been batting around is the idea that it would be syntactically correct Ruby, but it would in fact use a completely different hierarchy of classes. You would call methods and do normal operations, but there would be no real expectation that they are functioning on normal Ruby strings or Ruby arrays or fixnums; they would actually be operating on a totally different class hierarchy. That way you can implement that class hierarchy in normal Ruby and still run this code, but you can change the assumptions so that it is easier to generate the C code. One thing that I struggled with initially was type inference: the way I had written it required some fairly sophisticated type inference to generate the code, and I couldn't really get it working well, which is why I decided to handwrite the C for the time being. So anything that we can do to make the type inference stage simpler is better. Having code that looks like Ruby but has completely different assumptions lets us do things like inferring types from method names, and maybe disallowing operator overloading, so you can assume certain things about what the code implies.

9. We just recently heard here at RubyConf that you rewrote the Rubinius compiler that compiles Ruby source code to Rubinius bytecode. Can we talk about the motivations for that? Sure. The compiler that we use now, which is sort of like 'Compiler One', grew out of my original prototype compiler. That one ran on 1.8, and the idea was just to generate some kind of bytecode that we could then load into the shotgun VM that we were writing separately. For about 6 months the only way to get code into Rubinius was to compile it in 1.8, save it to a file, and load that file into Rubinius; the compiler wasn't working inside Rubinius yet. It was cobbled together from my initial idea of what the compiler should look like. They were all my initial assumptions and all my initial ideas, and being a prototype means that probably more than half of them were wrong. As we have gone further down the road I have become uneasy with the state that the compiler is in. As we worked on more syntax and got to more edge cases, the complexity of the compiler started to increase nonlinearly with the number of lines of code. If there were 3 lines that we needed to add for a feature, it seemed that the complexity went up by 2x or 3x. There was something wrong with the initial architecture. So I took a step back and decided to reevaluate the architecture, and that is why I have started to rewrite it. Being a rewrite of the prototype, it has gone fairly quickly because I can use the existing one as the basis for the new one.

10. Talking about compilers: there has been a discussion about supporting different languages on top of Rubinius. What's your impression of that, what's your idea about that? I would love to. In fact I have talked with Ola Bini a few times about developing a simple Lisp that would run on top of the Rubinius VM. I think that there is a real big, market isn't the right word, mindshare maybe, for the idea that you can do this, based on the fact that the compiler is very approachable, being all written in Ruby. If you can write some kind of parser for your language, it is very simple to use the built-in compiler classes: we have some classes called "Assemblers" and "Encoders", and you basically drive these two classes to generate bytecode for you. So if you are compiling some new language and you decide "Ok, I need to call this method", it is very simple; we have a whole class hierarchy with a fairly well defined interface for saying "Ok, I want to generate some bytecode that does these operations". That is very simple, and it's all in Ruby. So I think there is a big opportunity there for that to be pushed forward.

11. How easy would it be, what would you have to do to create a new compiler? Could you do that at runtime? With irb? Or do you have to rebuild Rubinius? Either, really. I mean, it being a normal built-in class, you have to require it to bring in the compiler, which might not be loaded yet. Then you can do things like creating a method directly: let's say you decided that there are some operations that you want, in some order that you can't represent properly in Ruby. You can actually drive the assembler directly and say: "Send this method to this object and if that doesn't work then jump to this bytecode". You can drive it all directly right there. Because it is so simple, it is very easy to write. If you had a parser for a new language (we saw a demonstration of Packrat at RubyConf, which is a PEG parser), you could easily have a Packrat kind of parser that, as it completed each rule, called the Rubinius assembler and said "Ok, I am doing this operation, I need you to call these methods". And it would build the bytecode up as it went through the source. Being fairly simple, it is not a very complicated process.
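As a rough illustration of driving an assembler directly with a send followed by a conditional jump, here is a small sketch. The `TinyAssembler` class, its methods, and the opcode names are hypothetical; the interview does not show the real Rubinius Assembler or Encoder interfaces, and they certainly differ from this.

```ruby
# Hypothetical mini-assembler, for illustration only.
class TinyAssembler
  def initialize
    @ops = []      # emitted [opcode, operand] pairs
    @labels = {}   # label name => instruction index
  end

  # Append one instruction; returns self so calls can be chained.
  def emit(opcode, operand = nil)
    @ops << [opcode, operand]
    self
  end

  # Mark the position of the next instruction to be emitted.
  def label(name)
    @labels[name] = @ops.size
    self
  end

  # Resolve symbolic jump targets to instruction indexes.
  def assemble
    @ops.map do |opcode, operand|
      if opcode.to_s.start_with?("goto")
        [opcode, @labels.fetch(operand)]
      else
        [opcode, operand]
      end
    end
  end
end

# "Send this method to this object, and if that doesn't work, jump there."
asm = TinyAssembler.new
asm.emit(:push_self)
   .emit(:send, :ready?)
   .emit(:goto_if_false, :fallback)
   .emit(:send, :run)
   .emit(:ret)
   .label(:fallback)
   .emit(:send, :recover)
   .emit(:ret)

p asm.assemble   # the jump operand is resolved to index 5
```

A parser for another language could call `emit` from its rule actions in the same way, building the instruction list as it recognizes each construct.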

13. So there is no Ruby interpreter, there's a bytecode interpreter? Yes. I have played around with the idea of having an interpreter that was written entirely in Ruby and would just basically live on top of the VM, the bytecode VM.

14. One of the interesting features you added is very fast debugging support. Could you talk about how that works? Sure. One thing is that the Ruby community unfortunately doesn't have a very good track record of using a debugger, unlike other programming languages, and some people have attributed that to the fact that the debugger itself slows down the program you are debugging by orders of magnitude. And I know for a fact that other runtimes, even C, which is actually where I got the idea for the Rubinius debugger infrastructure, and even the Smalltalk ones, can do things at totally full speed. That is where I got the inspiration for it. The way that it works is as follows: the current debugger in 1.8 and 1.9 sets a function to be called every time a certain operation occurs, so that every time a line is hit you can check "Oh, am I supposed to break on this line? If so, then run some stuff". Unfortunately that means that on every line you have to be checking "Should I be breaking here?". The way around that is a thing called bytecode replacement.



There is a special bytecode in Rubinius called "invoke debugger". Whenever you want to set a breakpoint, you basically go find the method in question that you want to set it in; how to find it is a completely different problem. Once you have found it, you can easily figure out which operations correspond to which line, or which method call, or whatever you want to locate. Then you swap out the current instruction with the invoke debugger one, and you sit back and wait. What happens is that when the VM is executing the bytecode and hits an invoke debugger, it will actually send a message. Rubinius has this concept of channels in the VM. The channels control when threads run: if a channel is empty and a thread does a receive on it, the thread will go to sleep and the VM will wait, and if someone sends something down that channel it will wake up the thread so that it can receive that object. That's exactly how the debugger works.



The thread that has the code that is being debugged, when it hits invoke debugger, sends the method context object for where it is over a channel to the debugger. So the debugger wakes up and says "Oh, someone is being debugged now", it gets a method context object, and it can figure out from there "Oh, ok, I am here, I am at this location, this is the instruction pointer, and that means I am at this line", and then you can debug from there. It works very well because you don't do any work until you need to, just with the bytecode replacement.
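The mechanism just described can be modeled in a few lines of plain Ruby. This is only a toy sketch: `Queue` stands in for a Rubinius channel, the "bytecode" is an array of symbols, and `ToyVM`, `set_breakpoint`, and the instruction names are hypothetical.

```ruby
# Toy model only: Queue stands in for a Rubinius channel. All names here
# are hypothetical and not taken from Rubinius.
class ToyVM
  attr_reader :trace

  def initialize(code, debug_channel)
    @code = code.dup
    @saved = {}                     # ip => original instruction
    @debug_channel = debug_channel
    @trace = []                     # instructions actually executed
  end

  # Bytecode replacement: swap the instruction at ip for :invoke_debugger.
  def set_breakpoint(ip)
    @saved[ip] = @code[ip]
    @code[ip] = :invoke_debugger
  end

  def run
    @code.each_index do |ip|
      op = @code[ip]
      if op == :invoke_debugger
        reply = Queue.new
        @debug_channel << { ip: ip, reply: reply }  # hand over our context
        reply.pop                                   # sleep until resumed
        op = @saved.fetch(ip)                       # then run the original
      end
      @trace << op
    end
  end
end

channel = Queue.new
hits = []
debugger = Thread.new do
  context = channel.pop          # asleep until a thread hits a breakpoint
  hits << context[:ip]
  context[:reply] << :continue
end

vm = ToyVM.new(%i[push_self send_foo pop ret], channel)
vm.set_breakpoint(1)
vm.run
debugger.join
p hits       # the instruction pointer where the breakpoint fired
p vm.trace   # the original instructions still ran, in order
```

Instructions without a breakpoint pay no extra cost beyond the normal dispatch, which is the point of the approach: no per-line check is needed.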

15. Is this also how you implemented the profiler? No, although we had plans for that. Right now the only profiler we have is the sampling profiler. We haven't implemented an instrumenting profiler, but one plan is to implement it the same way that 1.8 has its profiler, which is to have the ability to add hooks, so that whenever you enter or exit a method you can run some code. Something better that I'm thinking about now again works the same way as C does: when you compile a file you indicate "Yes, I want to compile this for profiling", and it actually puts in some code before the method runs and when the method exits. You don't really want to profile the kernel, you want to profile just your library. You compile your library with profiling support, and then you run your library and it calls the profiler as it needs to, and later on you can easily dump the results. That's for the future.



What we have now is a sampling profiler, which is very simple. All you do is turn it on and say how often you would like the sampling profiler to see what's going on. Say 10 milliseconds. Every 10 milliseconds it looks at what the VM is doing and saves the current method context object in an array, and it does that every 10 milliseconds, 100 milliseconds, however long you want. Then when you call "stop" on it, it gives you back this giant list of method context objects. The method context objects are really rich; you can go and look at them all and figure out "Ok, out of a thousand samples, 500 of them were in one method. Obviously I am spending a lot of time in that method, and for 490 of those I was in one section of its code. So obviously that section of code is the one that we are spending a lot of time in". And you can work from there. It is not as accurate as an instrumenting profiler, but it provides a really low impact way of profiling, because the method context objects are already first class. The only thing the VM has to do is record what was going on, and in C I used some research that I found to make sure that it's basically just one memory write. Every time it comes around it just does one memory write and then it is done.
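The same sampling idea can be sketched on stock Ruby. This is not the Rubinius profiler: it records frame labels from `Thread#backtrace_locations` rather than method context objects, and the `SamplingProfiler` class and `busy_work` helper are names invented for the example.

```ruby
# Minimal sketch, not the Rubinius profiler: a background thread records
# the target thread's innermost frame label at a fixed interval.
class SamplingProfiler
  def initialize(target, interval)
    @target = target
    @interval = interval
    @samples = []
    @running = false
  end

  def start
    @running = true
    @thread = Thread.new do
      while @running
        frame = @target.backtrace_locations&.first
        @samples << frame.label if frame
        sleep @interval
      end
    end
  end

  # Stop sampling and return the collected samples.
  def stop
    @running = false
    @thread.join
    @samples
  end

  # Count samples per label, most frequent first.
  def self.tally(samples)
    counts = samples.each_with_object(Hash.new(0)) { |label, acc| acc[label] += 1 }
    counts.sort_by { |_label, count| -count }
  end
end

# Hypothetical workload: spin for a fraction of a second.
def busy_work
  deadline = Time.now + 0.2
  nil while Time.now < deadline
end

profiler = SamplingProfiler.new(Thread.current, 0.01)
profiler.start
busy_work
SamplingProfiler.tally(profiler.stop).each do |label, count|
  puts "#{count}\t#{label}"
end
```

How many samples are collected depends on thread scheduling, so the counts are only indicative. As in the description above, the analysis happens after the fact on the saved samples, not while the program runs.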

16. Is there also a way to save the context object? I think context objects are serializable in Rubinius. Is that true? The plan is to make them so. I haven't done it yet, just because we had other issues, but they are architected to be serializable, so that you can dump the entire spaghetti stack of what was executing at one time to disk, read it back, and continue executing.

17. Would this be kind of a bit like Smalltalk images in a way? Yes, you could actually engineer your application that way, so you could start it up, let it get to a certain point, and then save it out to disk. And the next time you unserialize it, you just continue from where you saved it out to disk. Well, there are some issues with that: right now it is just the stack, not the whole heap, that gets saved out to disk, and the heap is part of what you need in order to do stuff like fast initialization. Say your program takes 10 minutes to initialize, but once it is running it runs great. Unless we can serialize the heap, you can't really save the whole thing after those 10 minutes. But I have read the code and done some research on how to dump the heap properly and how to read it back in, so you can do stuff like implementing Smalltalk images, and I have every reason to think that eventually Rubinius will have an image-based option, simply because it handles a certain class of problems better than the file-based one.

18. There is a company called Gemstone which provides object oriented databases and has done so for a long time. They do them for Smalltalk, they have their own Smalltalk version where they build their objects in the database, and they have expressed an interest in using Rubinius. Could you elaborate on that? Yes, sure. They approached me, I think at RailsConf 2007, about the idea of getting Gemstone into Ruby. We had some talks, and they made some announcements that they want to build in the Gemstone C++ library, which is this large object logic persistence library that they have been working on for 25 years. They wanted to hook that into Rubinius. We are not done yet; their engineers have been busy, and they wanted to wait until we were close to 1.0, which actually works best for us. What we have done is a little bit of legwork getting the object memory subsystem ready for them, putting a few hooks in place so that when they are ready to start they can. I think it is a really big thing, because it is going to mean that Rails applications running under Rubinius will have access to the Gemstone object persistence layer to store everything, which is a really attractive alternative to using a relational database.

19. Do you know anything about how they are going to implement this, how they are going to hook into Rubinius? Yes, we have done a little bit of legwork for them: we allocated some space, called the literal "tag range", in the Rubinius object memory subsystem, so that when they are ready to hook into it they can. I am not completely up on Gemstone's technology or vocabulary, but my understanding is that when an object is on disk, it is represented in the system by just an integer, where that integer indicates either an entry in a table or the address where the object is really located on disk, something to that effect: a serialized form, or a pointer to that object. We have allocated space for them to put those in. I actually asked them how they hook in with their Java product, and they said they got a license from Sun to write their own. So they have their own JVM, because they had to hook in at a really low layer of Java to get the object persistence layer in. With Rubinius, because of the way that the objects work, for the most part it requires one hook in the VM, which is the ability to read an instance variable. Whenever you read an instance variable you may or may not have to fetch that thing off disk. We basically provide them with hooks in those places to pull things in off disk.

20. Talking about the heap, there is an interesting feature in Ruby called ObjectSpace that allows Ruby developers to iterate over all the objects on the heap. What's the current status of that in Rubinius? In Rubinius the iteration part, each_object, which is the main part of ObjectSpace, is not implemented. I want to have some kind of implementation for it by 1.0. I am really interested in how to implement it in an efficient way. I know that JRuby implements it in a kind of brute force way, because that's the only way you can do it in Java, and I am exploring different ways of doing it efficiently. The reason it has become difficult is that each_object is really tailored to the memory model of 1.8. And that is that once objects are allocated in memory they are stuck at that address. So it is very easy to walk all the objects from one side to the other, because they are always going to live in the same place. Rubinius uses more modern techniques for object memory, mainly the ability to compact and move objects from one place in memory to another. In fact, the only way that the young generation in the object memory space works is by copying things a lot.



Consequently, if you are iterating over all the objects and halfway through you need to copy everything from one section of memory to another to compact it, it seems like it's nondeterministic to figure out where you were in the first place and continue on in a linear way. And so I need to do more research. If it turns out that the only way to do it is the same way that Java does it, which is basically that the VM keeps a giant hash of weak references to all the objects that are in the system and then just walks over the weak references, like JRuby does, then we will implement it that way. But I am hoping that we can figure out some more efficient way to do it. One option that I have considered, although it is not efficient at all, is basically a mode that the garbage collector could be put in where it no longer collects. So you turn off the garbage collector while you are iterating over the objects so that things don't move. And then once you are done, you go ahead and give things back. That has some nasty consequences, because all objects then get allocated in the old object space, and you basically have to continually allocate more and more memory because you can't move anything around. That's an option, but that's more of a last resort.
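The weak-reference fallback mentioned above can be sketched in plain Ruby with the standard `weakref` library. The `ObjectRegistry` class is hypothetical; in a real VM the table would be maintained at allocation time rather than through an explicit `track` call.

```ruby
require "weakref"

# Hypothetical registry standing in for a VM-level table of weak references.
class ObjectRegistry
  def initialize
    @refs = []
  end

  # Record a weak reference; this does not keep obj alive.
  def track(obj)
    @refs << WeakRef.new(obj)
    obj
  end

  # Yield every tracked object that has not been collected yet.
  def each_object
    return enum_for(:each_object) unless block_given?

    @refs.each do |ref|
      begin
        obj = ref.__getobj__
      rescue WeakRef::RefError
        next   # the object was collected after it was tracked
      end
      yield obj
    end
  end
end

registry = ObjectRegistry.new
greeting = registry.track("hello")
numbers  = registry.track([1, 2, 3])
registry.each_object { |obj| p obj }
```

Because the registry is looked up by reference rather than by address, it keeps working when a collector moves objects, at the cost of one extra table entry per object.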

21. You said effectively remove the new generation? Yeah, if you stop the garbage collector, the object memory subsystem will let the young generation fill to capacity and then start allocating objects in the old generation. All new objects will then go to the old generation until you are done iterating, until each_object has returned, and then you could actually say "That took too long, go and do a full garbage collection sweep over all objects and try to get everything back into order". It plays havoc with the garbage collector, but I am confident that we will figure out a fairly decent way of implementing it.

22. Talking about the garbage collector, you added an interesting feature called the "fork-friendly garbage collector". What do you mean by that? 1.8 has a limitation in that the garbage collector stores garbage-collector-specific information about each object inside the object itself: the mark bits. That makes sense from a data locality point of view. Unfortunately it is very bad for the virtual memory subsystems of operating systems. When you fork, pretty much all virtual memory subsystems in OSs these days don't copy the memory from the parent to the child; they let the child continue to use the pages, the sections of memory, that the parent was holding on to. In a normal C program, when you fork, 90% of the time you continue to use the pages of the parent, and it could be years that you continue to use them.



With Ruby 1.8 the big problem is that once you have forked and the garbage collector runs again, it is going to walk over all the objects again and basically dirty, or set new pieces of information on, every page. So the minute the garbage collector runs in a child, the operating system has to double the amount of memory it is using. The way that we approach it in Rubinius is this: the young generation, by the way that it is architected, has to be copied; because it is a copying, compacting collector you are always going to have to duplicate those pages. But the young generation is only about 3 MB, so that's very easy. In the old generation, on the other hand, the objects are allocated in one section of the heap, and the garbage collector's information about those objects is located in a completely different section of memory. So when the garbage collector decides "I need to check whether or not these objects are still good", it uses this mark table to catalog each object. Without having to write into each object, it can keep track of what it needs to be doing. And that allows the OS to continue sharing these pages between the parents and their children.
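The difference between mark bits inside each object and a separate mark table can be shown with a toy mark phase. This is an illustrative sketch, not Rubinius code: the `Slot` struct, the `mark` function, and the tiny heap are made up, and the slots are frozen to make it visible that marking never writes into them.

```ruby
# Toy heap: slots hold frozen records, so marking cannot write into them.
Slot = Struct.new(:name, :refs)   # refs: indexes of the slots this one points to

# Mark everything reachable from roots, recording marks in a side table.
def mark(heap, roots)
  marks = Array.new(heap.size, false)   # lives apart from the heap itself
  pending = roots.dup
  until pending.empty?
    index = pending.pop
    next if marks[index]
    marks[index] = true
    pending.concat(heap[index].refs)
  end
  marks
end

heap = [
  Slot.new("main",   [1, 2]),
  Slot.new("string", []),
  Slot.new("array",  [1]),
  Slot.new("orphan", [0])
].each(&:freeze).freeze

p mark(heap, [0])   # => [true, true, true, false]
```

Only the side table is written during marking, so in a forked child only the pages holding that table would need to be copied; the pages holding the objects stay shared.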



That's a big win because in Rubinius, and in MRI as well, the implementation of all the methods is stored in the heap. The code for what to actually perform in those cases is stored in the heap. When you fork from a parent to a child, there is a 90-some % chance that you are going to be able to share 90% of your data between parents and children, just because they have the same implementations for methods and for classes and so on. Those will continue to be shared between them. You can fork off 100 different children, and whereas in 1.8 that causes you incredible memory pain, in Rubinius it doesn't really cause you any, because those pages will continue to be shared.

23. This is a problem that exists in Java and other systems where you can't really share application level code. You mentioned the spaghetti stack. That's used for implementing continuations. Partially, yes. In a normal implementation, when you go to execute some functions, the data about how that execution is proceeding is stored in a stack. The parent that calls the child is stored higher than the child, and it continues down. With a spaghetti stack, instead, the information about the execution is stored as an object, and that object has a field called sender that points to whoever called this method. It is called a spaghetti stack because it is a chain: the contexts are connected from object to sender, to object to sender, all the way back to the top, to some context that has no sender, which means you have reached the top. And the great thing is that those all virtually live in the heap. There are some optimizations under the hood that allow them to be created and destroyed very quickly in the case where they don't actually need to live in the heap, but for the most part they can always be put in the normal Ruby heap and be normal Ruby objects. So all a continuation has to do is store the most current method context of that whole chain at that point in time. Later on, when it wants to continue running, it just has to make that method context object the current one, and then it continues to run. So you can duplicate method context objects to copy continuations, and you can do all that kind of stuff to them.
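A sender-linked chain of context objects is easy to model. The sketch below is only a model of the data structure: the `MethodContext` struct here is a hypothetical stand-in with three fields, not the real Rubinius class, and "resuming" is reduced to reading the saved instruction pointer.

```ruby
# Hypothetical stand-in for a method context: a heap object with a sender link.
MethodContext = Struct.new(:method_name, :ip, :sender) do
  # Walk the sender chain back to the top-level context (sender == nil).
  def backtrace
    chain = []
    context = self
    while context
      chain << "#{context.method_name}@#{context.ip}"
      context = context.sender
    end
    chain
  end
end

top  = MethodContext.new("main", 4, nil)
mid  = MethodContext.new("process", 9, top)
leaf = MethodContext.new("parse", 2, mid)

saved = leaf.dup   # a "continuation" is just a saved copy of the current context
leaf.ip = 7        # execution moves on in the original context

p leaf.backtrace    # => ["parse@7", "process@9", "main@4"]
p saved.backtrace   # => ["parse@2", "process@9", "main@4"]
```

The saved copy shares its senders with the live chain, so capturing it costs one small object rather than a copy of a whole stack.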

24. How would you compare your continuation performance or implementation to other Ruby versions? It is certainly better than other Ruby implementations, just by the fact that the architecture allows continuations to be implemented in a much simpler way. Rubinius is totally stackless, which means that as you are executing Ruby methods you never advance the C stack under the hood. Although people have found creative ways of saving the C stack, it is a very problematic thing, saving it. Instead, Rubinius just doesn't use any C stack to run things. You can always dump the entire Rubinius stack, which is completely dissociated from the C stack. From that perspective the continuations are very cheap, lightweight, and simple.

25. Moving away a bit from technical information here: you've recently moved to the version management system git, which was written by Linus Torvalds for the Linux kernel. What are your experiences with it? We have liked it a lot. There are places where it is not as user friendly as Subversion, but it is all tool based, and getting an incredible amount of information about the code and about how the development process is working out of git is very simple. That has proved to be incredibly nice. I personally have become fairly addicted to using local branches: because every time you do a "git clone" you have all the history, you can do things like create local branches and work on things. I usually have 8 or so branches at a time, for small fixes, for experiments, for things I am working on. Sometimes I get tired of working on some experiment, and I just leave it in a branch, checked in and committed, but only locally, and continue to work on some other bug fixes or things that are more pressing. I can always go back to that code, which is stored on disk on my laptop, and work on it then. I love that feature. One thing that has proved to be invaluable here at the conference, where the upstream bandwidth to the Internet is fairly limited, is that all I had to do was set up git to serve out the git repository located on my laptop, and people can clone it locally. Or, earlier, I just put my git repo on a flash drive and gave it to somebody else. They have all the history then; it is very simple to share. It is actually really nice for these conference and hack-fest scenarios, where the programmers are all in one place, to have the code in source control in the same place too.

27. Are there any issues with running git on Windows? It seems like that is being worked on. I know there are some issues. People can run it under Cygwin, which, as Windows people tell me, is not a real solution, although there are people working on running it under MinGW, which is a GCC for native Windows, and I think they are making good strides with that. There was a worry when we initially moved to git that we would be alienating some people, that we were raising the bar that they would have to jump over in order to contribute. So I have set up a read-only Subversion copy of our git repository. We use Tailor for that: whenever someone commits to git, those changes are automatically propagated over to this read-only Subversion archive. If people want to check it out, see it, compile it, maybe write a couple of patches and send them in to us, they can do that with the Subversion environment that they are accustomed to. And as they get more into the project, we insist that they move to git. But that gives us that new-user-friendly smell that they like.

29. What was the reason for git? Just flipping a coin? A little bit. I don't want to say "Because it is in Python and I am a Ruby guy". I had heard great things about Mercurial. At the time, and maybe this is not true anymore, my understanding was that Mercurial didn't have local branches, and that was something that I had become really accustomed to, originally with SVK, and it was a feature I was determined to have. And so we decided to go down the git road.

30. To finish up, let's say people have watched this interview and are itching to contribute to Rubinius: what's your management style? Are you the big benevolent dictator like Matz? I try not to be. There are certain things, being founder of the project, where I can put my foot down on occasion, but I actually try to delegate as much work as possible. Not just because I am lazy, although I am a programmer so I am inherently lazy, but because that lets other people own parts of the code, and I feel like code ownership equates to the desire to make it better. I have Brian Ford, who is the guy who works on his piece, the spec admiral, the spec general. I actually don't make any decisions with relation to specs; I decided early on that I would ask him "Should we do it this way?" and he says yes or no, and his opinion is the opinion. We have other people who have come on and proven to be invaluable in certain aspects, and I have basically made them the gurus, the owners of those sections of code. I think that has worked out really well. I am a new player in software project management, so I am doing what seems like the thing to do, and so far it has worked pretty well. We haven't had to cut off any committers, and none have left because they got pissed off with the project. We certainly have had committers who have shown up, done some work, and then disappeared, but that's normal for an open source project.