Hello, internets! Did you know that today marks two years of Guile 2? Word!

Guile 2 was a major upgrade to Guile's performance and expressiveness as a language, and I can say as a user that it is so nice to be able to rely on such a capable foundation for hacking the hack.

To celebrate, we organized a little birthday hack-feast -- a communal potluck of programs that Guilers brought together to share with each other. Like last year, we'll collect them all and publish a summary article on Planet GNU later on this week. This is an article about the hack that I worked on.

figl

Figl is a Guile binding to OpenGL and related libraries, written using Guile's dynamic FFI.

Figl is a collaboration between myself and Guile super-hacker Daniel Hartwig. It's pretty complete, with a generated binding for all of OpenGL 2.1 based on the specification files. It doesn't have a good web site yet -- it might change name, and probably we'll move to savannah -- but it does have great documentation in texinfo format, built with the source.

But that's not my potluck program. My program is a little demo particle simulation built on Figl.

Click to download video

Source available here. It's a fairly modern implementation using vertex buffer objects, doing client-side geometry updates.

To run, assuming you have some version of Guile 2.0 installed:

git clone git://gitorious.org/guile-figl/guile-figl.git cd guile-figl autoreconf -vif ./configure make # 5000 particles ./env guile examples/particle-system/vbo.scm 5000

You'll need OpenGL development packages installed -- for the .so links, not the header files.

performance

If you look closely on the video, you'll see that the program itself is rendering at 60 fps, using 260% CPU. (The CPU utilization drops as soon as the video starts, because of the screen capture program.) This is because the updates to the particle positions and to the vertices are done using as many processors as you have -- 4, in my case (2 real and 2 threads each). I've gotten it up to 310% or so, which is pretty good utilization considering that the draw itself is single-threaded and there are other things that need to use the CPU like the X server and the compositor.

Each particle has a position and a velocity, and is attracted to the center using a gravity-like force. If you consider that updating a particle's position should probably take about 100 cycles (for a square root, some divisions, and a few other floating point operations), then my 2.4GHz processors should allow for a particle simulation with 1.6 million particles, updated at 60fps. In that context, getting to 5K particles at 60 fps is not so impressive -- it's sub-optimal by a factor of 320.

This slowness is oddly exciting to me. It's been a while since I've felt Guile to be slow, and this is a great test case. The two major things to work on next in Guile are its newer register-based virtual machine, which should speed things up by a factor of 2 or so in this benchmark; and an ahead-of-time native compiler, which should add another factor of 5 perhaps. Optimizations to an ahead-of-time native compiler could add another factor of 2, and a JIT would probably make up the rest of the difference.

One thing I was really wondering about when making this experiment was how the garbage collector would affect the feeling of the animation. Back when I worked with the excellent folks at Oblong, I realized that graphics could be held to a much higher standard of smoothness than I was used to. Guile allocates floating-point numbers on the heap and uses the Boehm-Demers-Weiser conservative collector, and I was a bit skeptical that things could be smooth because of the pause times.

I was pleasantly surprised that the result was very smooth, without noticeable GC hiccups. However, GC is noticeable on an overloaded system. Since the simulation advances time by 1/60th of a second at each frame, regardless of the actual frame rate, differing frame processing times under load can show a sort of "sticky-zipper" phenomenon.

If I ever got to optimal update performance, I'd probably need to use a better graphics card than my Intel i965 embedded controller. And of course vertex shaders are really the way to go if I cared about the effect and not the compiler -- but I'm more interested in compilers than graphics, so I'm OK with the state of things.

Finally I would note that it has been pure pleasure to program with general, high-level macros, knowing that the optimizer would transform the code into really efficient low-level code. Of course I had to check the optimizations, with the ,optimize, command at the REPL. I even had to add a couple of transformations to the optimizer at one point, but I was able to do that without restarting the program, which was quite excellent.