How I feel about Scheme’s performance.

I came across a post written earlier today, “How fast is Scheme? Well….”, which states:

I don’t know much about Scheme […] but it seems that the Scheme compilers produce quite sluggish code, at least looking through the grainy, distorted lens that is the Computer Language Benchmarks Game.

That seems to make enough sense to me. For one-off heavily numerical tests, Scheme pretty much sucks. Especially considering that to compile the code it often has to go through C.

He goes on to say:

SBCL comes out very much on top; in general the Gambit programs take two or three times as long to do their job (although I haven’t looked at memory usage). But as far as Scheme compilers go, Gambit seems to be improving things.

With the right declarations, SBCL has been known to outperform straight C as well. Its compiler is really quite something. And (IIRC) it doesn’t have the “handicap” of using C as an intermediate language.

Personally, however, I tend to consider C-generating Scheme compilers more practical for portability and/or easier FFI than for performance. Chicken’s a good example of braindead-easy FFI. And I feel where Gambit really shines is when you need many, many threads. It’ll handle hundreds of thousands of threads with ease. Termite is a good example of exploiting its threading capabilities. RScheme does real-time GC. SCSH has replaced every use I’ve had for shell/perl/etc scripts. There’s a decent chance that any random Linux box already has Guile installed. SISC and Kawa leverage Java’s JIT machinery and provide trivially easy Java FFI. Bigloo can compile to Java bytecode as well. MzScheme runs its own JIT compiler by default where it can, and that is preferred to its C output. And there’s a JIT option for all the C-generating Schemes as well: compile with LLVM instead of GCC. I think it’d be especially interesting to compare MzScheme’s JIT to MzScheme’s C + LLVM. And if you’re going the Scheme->C route for performance reasons, I’d think Stalin to be the obvious Scheme to use, whether with GCC or LLVM depending on its expected run time (assuming there’s even noticeable startup overhead with LLVM; its embedded VM is really quite minimal). Many of them have really nice OO systems built in as well.

But none of these tests run long enough to let any of the JIT options really shine. Basically, I say throw any of the JIT options into the mix, and make the tests long enough to really let them do their magic.
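To make "long enough" a bit more concrete, here is a rough sketch of the kind of harness I have in mind, written in Python only because it is short. Everything in it (time_runs, summarize, toy_workload, the repeat counts) is a made-up name or number for illustration, not something taken from the Benchmarks Game. Note that stock CPython has no JIT, so this only shows the shape of the measurement: time every run, then look at the first few runs separately from the rest.

```python
import statistics
import time


def time_runs(workload, repeats=100, warmup=10):
    """Call `workload` `repeats` times and time each call.

    Returns (warmup_samples, steady_samples): the first `warmup` timings
    and the remaining ones, in seconds.
    """
    if not 0 <= warmup < repeats:
        raise ValueError("warmup must satisfy 0 <= warmup < repeats")
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        workload()
        samples.append(time.perf_counter() - start)
    return samples[:warmup], samples[warmup:]


def summarize(samples):
    """Return run count, median and total time for a list of timings."""
    if not samples:
        return {"runs": 0, "median": None, "total": 0.0}
    return {
        "runs": len(samples),
        "median": statistics.median(samples),
        "total": sum(samples),
    }


def toy_workload():
    # Hypothetical stand-in for a numeric benchmark kernel.
    total = 0.0
    for i in range(1, 20000):
        total += 1.0 / (i * i)
    return total


if __name__ == "__main__":
    warm, steady = time_runs(toy_workload, repeats=50, warmup=5)
    print("warm-up:", summarize(warm))
    print("steady: ", summarize(steady))
```

On a runtime with a JIT, the interesting number would be how far the steady median drops below the warm-up median; a single short run only ever measures the warm-up side.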

I very strongly suspect the same holds for Java. Having run a Freenet node for quite some time, I’ve seen it get noticeably snappier once the JVM has had a chance to watch it run for a while, especially with recent versions of Java. I’d also argue that’s exactly (other than its relatively primitive GC) what makes Java so incredibly horrid for client-side stuff such as applets: you’re loading a JIT compiler and GC you’ll hardly get a chance to even use before you’ve closed the window. But its JIT has been improving by leaps and bounds lately, as mentioned on Good Math, Bad Math:

About a year later, testing a new JIT for Java, the Java time was down to 0.7 seconds to run the code, plus about 1 second for the JVM to start up. (The startup times for C, C++, and Ocaml weren’t really measurable – they were smaller than the margin of error for the measurements.)

That is down from the previous measurement of 1 minute and 20 seconds for Java. As I said before, SISC, Kawa, and Bigloo will happily use the Java VM. Straight C scored 0.8 seconds. OCaml kicked all their asses even before compilation, but that’s not the point here. If you really need every last bit of performance you can get, though, OCaml seems to be worth looking into.

So yeah, I’m *very* interested in what the performance possibilities for Scheme really are, if nothing else out of sheer curiosity. Maybe I’ll wind up running a few benchmarks of my own at some point, repeating the tests, say, a hundred or a thousand times each… but in the end, even if this does increase their performance relative to SBCL… you don’t need to use an obscure implementation or do JIT tricks with SBCL in the first place. A lot of people haven’t heard of Stalin or LLVM. A lot of people don’t want to load one language (Java) to run another (Scheme). Although again, I’d question whether MzScheme’s performance is really so bad in the long run.

And I’d question whether it was really worth it in the first place. Fluxus and Impromptu are two obvious examples which come to mind, both heavy graphics/audio livecoding systems, which solve many problems the same way you would in, say, Python: offload much of the heavy work onto libraries. There’s a PDF floating on the net somewhere about MzScheme controlling an array of telescopes, and of course there’s the US Navy’s Metcast project. There’s SchemeDoc and the Scheme Elucidator, plus LAML, which they’re both based on and which you can feed any XML DTD to get a Scheme representation of that XML language. SCWM and Orion window managers. MetaModeler for dealing with many/large databases. For web stuff there’s TeX2page, SISCweb, BRL, WiLiKi, the Hop framework, HtmlPrag, SXML, it goes on. There are a lot of uses which don’t demand every last bit of performance from the Scheme implementation, and I’m just not really doing anything that does.

And if I were writing something and came across an annoying bottleneck? I’d likely take the NetBSD approach. Instead of trying to tweak things to run faster (LLVM, Java, implementation-specific declarations, etc…), I’d see if I couldn’t find a fundamentally more efficient algorithm first. Which reminds me, I still want those books.
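A toy illustration of what I mean by finding a better algorithm first, again in Python for brevity. The two functions below are my own hypothetical example, not anything from NetBSD or the benchmarks: both compute the same Fibonacci number, but the first makes an exponential number of calls while the second does a single linear pass. No compiler flag or JIT closes that kind of gap.

```python
def fib_naive(n):
    """Textbook doubly recursive definition: exponential number of calls."""
    if n < 2:
        return n
    return fib_naive(n - 1) + fib_naive(n - 2)


def fib_linear(n):
    """Same result with one loop: n additions, constant extra space."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


if __name__ == "__main__":
    # Both agree on the answer; only the amount of work differs.
    print(fib_naive(25), fib_linear(25))
```

Past n of around 35 the naive version becomes painfully slow on any implementation, while the linear one stays effectively instant.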

[update] On the Reddit thread where apparently most of the views for this article are coming from (who’d have thought so many people would be interested in some Scheme noob’s opinions of language performance? Well over 500 hits already, scant hours later), there’s a link to a fascinating email thread discussing the floating point speed of Gambit-C. I have Brad Lucier’s paper printing to read later as I type this. There’s also a reply in the thread by Brad himself, my favorite part of which is the end summary, which is generally similar to the “NetBSD approach” mentioned above. Is there some sort of established term for this idea?

Anyway, I’ll certainly be rethinking my opinions of Gambit-C as “that threading/Termite implementation”.

[update 2] It’s morning, and this post is getting close to 1500 views now… searching Google for r5rs performance, this is the 3rd result! I still don’t see why this is drawing so much attention, but as long as it is, have you seen Scheme Now!?

Scheme Now!, also known as Snow, is a repository of Scheme packages that are portable to several popular implementations of Scheme. Snow is a general framework for developing and distributing portable Scheme packages. Snow comes with a set of core packages that provide portable APIs for practical programming features such as networking, cryptography, data compression, file system access, etc. Snow packages can export procedures, macros and records.

[update 3] 1844 views as I write this on July 1st. It’s also now the first result on Google for the aforementioned search. Wow.