Over the past week, I have again been exploring low-level functional systems in which to write a runtime system (RTS), notably a garbage collector, for high-level languages. Progress has been slim, but I’ll report on my findings.

I have a hunch that linear lambda calculus is unsuitable, but I haven’t yet convinced myself of that, and it’s the closest fit I’ve found so far. So I have been trying to implement a minimal linear lambda calculus (LLC) runtime in C. Apparently Haskell has fried my brain and I can no longer write in C without segfaulting up the wazoo. Or maybe that’s what writing in C is like and I had just forgotten :-).
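To make concrete the sort of thing I mean: here is a sketch of a minimal term representation where linearity pays off. All the names (`mk`, `subst`) are mine, not the actual runtime, and I’m ignoring de Bruijn index shifting. The point is that every bound variable occurs exactly once, so substitution plugs the argument in without copying and can free the consumed node on the spot.

```c
#include <assert.h>
#include <stdlib.h>

/* Minimal linear lambda calculus terms with de Bruijn indices.
   Linearity: every bound variable occurs exactly once. */
typedef enum { VAR, LAM, APP } Tag;

typedef struct Term {
    Tag tag;
    int var;              /* VAR: de Bruijn index */
    struct Term *t1, *t2; /* LAM: body in t1; APP: function t1, argument t2 */
} Term;

static Term *mk(Tag tag, int var, Term *t1, Term *t2) {
    Term *t = malloc(sizeof *t);
    t->tag = tag; t->var = var; t->t1 = t1; t->t2 = t2;
    return t;
}

/* Substitute s for index `depth` in t.  Because the variable is linear,
   s is plugged in exactly once -- never copied -- and the consumed VAR
   node is freed on the spot.  (Index shifting omitted in this sketch.) */
static Term *subst(Term *t, Term *s, int depth) {
    switch (t->tag) {
    case VAR:
        if (t->var == depth) { free(t); return s; }
        return t;
    case LAM:
        t->t1 = subst(t->t1, s, depth + 1);
        return t;
    default: /* APP: the variable lives in exactly one branch */
        t->t1 = subst(t->t1, s, depth);
        t->t2 = subst(t->t2, s, depth);
        return t;
    }
}
```

No reference counts, no mark bits: the type discipline itself tells you when a node is dead.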

Said runtime is more complex than I’d like it to be: a fairly involved pattern-matching graph reduction machine. It needs a dedicated memory pool to allocate from and deallocate to, holding the graph currently under consideration (but when things fall off the graph, they are freed immediately, which is its charm). I was hoping that it would run in constant memory, but I don’t think that’s possible, so this pool is the best I could hope for (in some sense, the memory pool is the “stack”). I would like to find a more direct translation, say into a (parallel) sequence of instructions, but that seems to conflict with the graph reduction property I like most: that code can be “compiled” on the fly.

I wonder if there is an incarnation which can generate short bursts of assembly and run them, thus taking the best of both worlds.

Still, I’m not sure it’s possible to implement an RTS in linear lambda calculus, so I may just be barking up the wrong tree.

Jeremy Siek pointed out that copying collectors are simple and composable; i.e. all they do is copy a root set into contiguous memory and throw away the rest. A copier for a composition is a composition of copiers. This is very promising, and is what I’m shooting for.
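To illustrate why this is so appealing, here is a toy copier over cons cells: copy the cells reachable from a root into to-space, and whatever is left behind in from-space is garbage. I use plain recursion rather than Cheney’s scan-pointer queue just to keep the sketch short, and I assume acyclic data; the names are mine. Note how the copier for a pair is literally the composition of copiers for its halves.

```c
#include <assert.h>
#include <stddef.h>

/* Toy copying collector over cons cells: copy everything reachable
   from a root into to_space, leaving the garbage in from_space. */
#define HEAP_CELLS 256

typedef struct Cell {
    struct Cell *car, *cdr;   /* NULL means nil */
    struct Cell *forward;     /* forwarding pointer, set once copied */
} Cell;

static Cell from_space[HEAP_CELLS], to_space[HEAP_CELLS];
static size_t to_top;

static Cell *copy_cell(Cell *c) {
    if (c == NULL) return NULL;
    if (c->forward) return c->forward;   /* already moved: keep sharing */
    Cell *dst = &to_space[to_top++];
    c->forward = dst;                    /* install forwarding pointer */
    dst->forward = NULL;
    dst->car = copy_cell(c->car);        /* a copier for a pair is a     */
    dst->cdr = copy_cell(c->cdr);        /* composition of copiers       */
    return dst;
}

static Cell *collect(Cell *root) {
    to_top = 0;
    for (size_t i = 0; i < HEAP_CELLS; i++) from_space[i].forward = NULL;
    return copy_cell(root);
}
```

Everything the collector needs to know is local to each cell’s copier; there is no global marking phase to entangle the pieces.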

He also pointed me at the Categorical Abstract Machine (sorry I couldn’t find any public links), which I guess is fairly popular, and definitely a clever little thing. But alas, it appears to need garbage collection, so I’m not focusing on that level yet.

The Boulder functional programmers group is doing a two-session focus on CUDA; in particular, next meeting we will be talking about high-level, functional specification of parallel algorithms for CUDA (high-level at least in comparison to C for CUDA). I’m hoping that what I find for the RTS language will also be applicable to this task.

Until next time.