Sun, 18 Jul 2010

LLVM Backend : Milestone #1.

About 3 weeks ago I started work on the LLVM backend for DDC and I have now reached the first milestone.

Over the weekend I attended AusHac2010. During Friday and Saturday I modified DDC so that I could compile a Main module via the existing C backend and another module via the LLVM backend, producing an executable that ran but gave an incorrect answer.

Today, I managed to get a very simple function actually working correctly. The function is trivial:

    identInt :: Int -> Int
    identInt a = a

and the generated LLVM code looks like this:

    define external ccc %struct.Obj* @Test_identInt(%struct.Obj* %_va)
    {
    entry:
        ; _ENTER (1)
        %local.slotPtr = load %struct.Obj*** @_ddcSlotPtr
        %enter.1 = getelementptr inbounds %struct.Obj** %local.slotPtr, i64 1
        store %struct.Obj** %enter.1, %struct.Obj*** @_ddcSlotPtr
        %enter.2 = load %struct.Obj*** @_ddcSlotMax
        %enter.3 = icmp ult %struct.Obj** %enter.1, %enter.2
        br i1 %enter.3, label %enter.good, label %enter.panic
    enter.panic:
        call ccc void ()* @_panicOutOfSlots( ) noreturn
        br label %enter.good
    enter.good:
        ; ----- Slot initialization -----
        %init.target.0 = getelementptr %struct.Obj** %local.slotPtr, i64 0
        store %struct.Obj* null, %struct.Obj** %init.target.0
        ; ---------------------------------------------------------------
        %u.2 = getelementptr inbounds %struct.Obj** %local.slotPtr, i64 0
        store %struct.Obj* %_va, %struct.Obj** %u.2
        ;
        br label %_Test_identInt_start
    _Test_identInt_start:
        ; alt default
        br label %_dEF1_a0
    _dEF1_a0:
        ;
        br label %_dEF0_match_end
    _dEF0_match_end:
        %u.3 = getelementptr inbounds %struct.Obj** %local.slotPtr, i64 0
        %_vxSS0 = load %struct.Obj** %u.3
        ; ---------------------------------------------------------------
        ; _LEAVE
        store %struct.Obj** %local.slotPtr, %struct.Obj*** @_ddcSlotPtr
        ; ---------------------------------------------------------------
        ret %struct.Obj* %_vxSS0
    }

That looks like a lot of code but there are a couple of points to remember:

This includes code for DDC's garbage collector.

DDC itself is still missing a huge number of optimisations that can be added after the compiler actually works.
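To make the garbage collector part easier to see, here is a rough hand-written C rendering of what the _ENTER, slot initialization and _LEAVE sections of the LLVM code do. It is only a sketch and not DDC's runtime. The layout of Obj, the size of the slot stack and the slot_demo helper are all made up for illustration, and the leading underscores on the runtime names are dropped because C reserves such names at file scope. My reading is that the slots exist so the collector can find live objects, but the listing itself only shows the mechanics.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical object layout: the post never shows what %struct.Obj holds. */
typedef struct Obj { int tag; } Obj;

/* Stand-ins for @_ddcSlotPtr and @_ddcSlotMax. The stack size is an assumption. */
enum { SLOT_COUNT = 16 };
static Obj  *slotStack[SLOT_COUNT];
static Obj **ddcSlotPtr = slotStack;
static Obj **ddcSlotMax = slotStack + SLOT_COUNT;

/* Stand-in for @_panicOutOfSlots, which the IR marks noreturn. */
static void panicOutOfSlots(void) { abort(); }

/* C counterpart of @Test_identInt, following the IR step by step. */
Obj *Test_identInt(Obj *va)
{
    /* _ENTER (1): reserve one slot, then check the new top against the limit. */
    Obj **localSlotPtr = ddcSlotPtr;
    Obj **newTop = localSlotPtr + 1;
    ddcSlotPtr = newTop;
    if (!(newTop < ddcSlotMax))
        panicOutOfSlots();

    /* Slot initialization: clear the reserved slot. */
    localSlotPtr[0] = NULL;

    /* Store the argument in the slot, as the IR does with %_va. */
    localSlotPtr[0] = va;

    /* Function body: the result is simply read back from the slot. */
    Obj *result = localSlotPtr[0];

    /* _LEAVE: release the slot by restoring the saved pointer. */
    ddcSlotPtr = localSlotPtr;
    return result;
}

/* Illustrative check: the argument comes back unchanged and the slot
   pointer is restored to where it was before the call. */
int slot_demo(void)
{
    Obj   obj = { 42 };
    Obj **before = ddcSlotPtr;
    Obj  *out = Test_identInt(&obj);
    assert(out == &obj);
    assert(ddcSlotPtr == before);
    return 1;
}
```

Seen this way, almost all of the generated code is slot bookkeeping. The function body proper is the single read of the slot just before _LEAVE.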

I have found David Terei's LLVM AST code that I pulled from the GHC sources very easy to use. Choosing this code was definitely not a mistake and I have been corresponding with David, which has resulted in a few updates to this code, including a commit with my name on it.

LLVM is also conceptually very, very sound and easy to work with. For instance, names in LLVM code are allowed to contain the dot character, so it's easy to avoid clashes between C function/variable names and names created during LLVM code generation: just make the generated names contain a dot.
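As a small illustration of the dot trick, here is a sketch in C of a fresh-name generator that produces names shaped like the u.2 and enter.1 in the listing above. This is not DDC's code (DDC is written in Haskell), and fresh_name, can_clash_with_c_name and name_demo are hypothetical helpers invented for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Counter shared by all generated names (hypothetical, for illustration). */
static unsigned nextId = 0;

/* Writes "<prefix>.<n>" into buf and bumps the counter, so every call
   yields a different name. Returns what snprintf returns. */
int fresh_name(char *buf, size_t size, const char *prefix)
{
    return snprintf(buf, size, "%s.%u", prefix, nextId++);
}

/* A C identifier can never contain '.', so only dot-free names could
   collide with a name coming from C code. */
int can_clash_with_c_name(const char *name)
{
    return strchr(name, '.') == NULL;
}

/* Illustrative check of the two helpers above. */
int name_demo(void)
{
    char a[32], b[32];
    fresh_name(a, sizeof a, "u");
    fresh_name(b, sizeof b, "u");
    assert(strcmp(a, "u.0") == 0);
    assert(strcmp(b, "u.1") == 0);
    assert(!can_clash_with_c_name(a));
    assert(can_clash_with_c_name("Test_identInt"));
    return 1;
}
```

The point is simply that a generated name containing a dot lives in a part of the namespace that C code cannot reach.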

Finally, I love the fact that LLVM is a typed assembly language. There were dozens of times over the weekend when I generated LLVM code that the LLVM compiler rejected because it wouldn't type check. Just like when programming in Haskell, once the code type checked, it actually worked correctly.

Anyway, this is a good first step. Lots more work to be done.

Posted at: 22:18 | Category: CodeHacking/DDC