The Road to 2.2

From: Andy Wingo Subject: The Road to 2.2 Date: Fri, 17 May 2013 23:49:07 +0200 User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.2 (gnu/linux)

Friends! Schemers! Gentle Guilefolk! The time has come to begin in earnest on the road to Guile 2.2! I know all of you are wondering, "did that paragraph really need four exclamation marks?" Well the answer is yes, yes it did, and the rest of this mail lays out the reasons. So, what does 2.2 mean, why is it appropriate now? 2.2 means many things on many levels, but on its most basic level, it will result in a new series of Guile deployments. This series will be installed in parallel with Guile 2.0 [0]. It shouldn't disrupt people that develop or distribute C extensions to Guile or programs written in Guile. True, the binary named "guile" on your system might be 2.0 or 2.2, depending on what you have installed, though one can get around that with passing --program-suffix to ./configure. In total, the disruptive impact of a new series is low. [0] http://www.gnu.org/software/guile/manual/html_node/Parallel-Installations.html On the other hand, creating a new development series does have a cost in splitting our development effort away from the stable branch, and so we have to justify that. This is the second part of the question, "why now?" The answer is that we have important improvements to make to the Guile implementation that can't be made in a strictly binary-compatible way. We think these things are so exciting as to be worth the inconvenience of a new stable series. We might still make a few more 2.0 maintenance releases, but when I describe what we have on tap for 2.2 you will see why we're excited to move to the next phase. RTL VM ====== Guile 2.0 added a compiler and a virtual machine to what was previously a pure-interpreter implementation of Scheme. I know we all are pretty stoked about that, but some time has passed and it's time to look at what we have and see how we can do better. The Guile 2.0 VM is a stack machine. That means that its instructions usually take their values from the stack, and produce values (if appropriate) by pushing values onto the stack. The problem with stack machines is that they penalize named values. If I realize that a computation is happening twice and I factor it out to a variable, that means in practice that I allocate a stack frame slot to the value. So far so good. However, to use the value, I have to emit an instruction to fetch the value for use by some other instruction; and to store it, I likewise have to have another instruction to do that. For example, in a stack machine, compiling "z = x + y" might mean: (local-ref 0) ; x (local-ref 1) ; y (add) (local-set! 2) ; z The cost of interpreting a bytecode is largely dispatch cost, which is linear in the number of instructions executed. So factoring out a computation to a local variable can sometimes actually cause code to run slower, depending on the relative cost of the computation versus accessing its operands. Pushing and popping values can also be expensive as we are updating a stack pointer all the time, possibly checking for overflow as well. The solution to this problem is to use a "register machine". I use scare quotes because in fact this is a virtual machine, so unlike a CPU, the number of "registers" is unlimited, and in fact they are just stack slots accessed by index. So in a register machine, "z = x + y" might be: (add 2 0 1) ; register 2 = register 0 + register 1; Of course the code for a register VM has to be bigger: instead of _byte_ code that doesn't need to encode its operands because they are expected on the stack, register VM code has to encode the "register names", so it ends up being better to implement it as _word_ code. But this cost is much less significant than the dispatch cost. Anyway, that's the back-story. For Guile 2.2, I have implemented a new register-based virtual machine. For no good reason I called it the "RTL" VM, for "register transfer language". RTL is a name for a particular kind of intermediate representation in compilers, but perhaps it isn't the most appropriate denomination for what we're doing. In any case the name seems to have stuck. Perhaps we need a marketing guru here to find some other compelling name :) Given that we're changing the bytecode... ========================================= So RTL is a new bytecode, a new VM, and will ultimately need a new compiler. I'll talk a bit more about the compiler in a minute, but I want to mention another aspect of RTL. Besides being faster than the Guile 2.0 VM, RTL opens up other interesting possibilities. One of the most exciting is the ability to statically allocate all kinds of data. In Guile 2.0, when a .go file loads from disk, it allocates all of the constants that it needs on the heap. But that's sub-optimal in many ways. For example, a pair of immediates like '(1 . 2) doesn't actually need to be allocated at all -- it can just be part of the loaded image. Likewise a pair with non-immediates like ("foo" . "bar") can also be allocated in the image, but it needs fixing up at runtime so that its fields point to the "foo" and "bar" values. In this particular case the "foo" and "bar" would be values also in the image. What I mean to say is that we can, at compile time, allocate and lay out the memory for the constants that a program will need, compute the minimal set of initializations that those constants will need, and make the right links from the compiled code to the constants. This is exactly like what the C compiler, linker, and dynamic loader do -- so we take pointers from them and do exactly the same thing. The RTL branch writes out compiled data to ELF files. When a .go file is loaded, it runs a thunk to do the run-time initializations, then returns the "main" thunk for the .go (which the caller can run or not). Note that we use ELF for all platforms. Since we don't rely on the platform's linker or dynamic loader, there's no sense in using the format of each individual platform. That said, the files that Guile produces are readable with readelf, and it should be possible to operate on them with standard ELF tools. Guile's linker and loader are careful to separate read-only data like bytecode and writable data like those pairs that need runtime initialization. In this way we can maximize sharing between processes. Also, Guile puts a bunch of data that's not needed in normal execution at the end of the file, in the read-only section -- like docstrings, for example. If a program never asks for procedure-documentation of a procedure, those docstrings will never be paged in by the kernel; but if a program does do a lot of documentation grovelling, they are readily accessible. Separating the different metadata into different sections also makes it possible to strip metadata. For example when deploying to systems with not much disk space, you might strip docstrings or line number information. You can use standard ELF tools to do this, or do it using Guile's ELF tools. Finally, using a well-structured format like ELF brings in the possibility of linking together a number of modules in one file, and possibly even an entire application. We don't have this working yet, and it's not in the immediate plans, but it is a great possibility to keep in mind. What about the compiler? ======================== So we have a new VM, assembler, disassembler, and all the things! It's pretty close to finished: it's just missing the part that lets a debugger know about line numbers and local variables. But how do we actually produce RTL assembly from Scheme? Well, there we don't have a finished story. Noah Lavine made a branch to compile tree-il, our high-level intermediate language, to a "continuation-passing-style" (CPS) form. This is a useful transformation for many reasons, but among them it is convenient for giving a "name" to all values -- a property that a register VM prizes. His CPS compiler isn't done yet, though when I last looked at it, it was looking great. So the path goes Scheme -> Tree-IL (something we already have), then -> CPS (something Noah did), then -> RTL Assembly (again, Noah), then -> bytecode (the piece I did). It doesn't yet work with all of Guile but it's getting there. How to get there? ================= However, I think we've done all we can in branches. I think we should bless this RTL experiment as the way to do Guile 2.2. (Thoughts or objections welcome.) To that end, I think we need to start merging wip-rtl into master. What I propose is that, if we agree, I start merging in trivial stuff. When I get to something interesting that's not just a refactoring of existing things, I'll submit that patch to the list. We have a few days to look at it, we go through some review, and then we figure out how to move forward -- usually merging some version of that patch. Then we repeat. Once the RTL branch is all merged in, we can start doing the same with Noah's wip-rtl-cps branch. Then eventually some glorious day comes and we start using the CPS/RTL toolchain, everything is working great and fast, and we start deleting the old code. What do you all think? I love it when a plan comes together! Cheers, Andy -- http://wingolog.org/

reply via email to

