id Software released the source code for the Doom engine two days before Christmas 1997. Technically they released Linux Doom 1.10, a little different from the original code for the 1993 shareware release of Doom’s first episode.

Check out this awesome ten-year retrospective of the source on DoomWorld.

I’ve been curious about Doom lately. First I found the Visual Doom AI competition and decided to win it in 2018. Then I found a trove of historical deathmatches and made a video about them. After that, I got sucked into the friendly community with its flourishing mod scene. I tried making a level, inspired by Eevee’s excellent guide. And for the last couple of days I’ve been learning about how the underlying engine actually works.

As part of that, I’m trying to find a good reference implementation. I want a codebase that hasn’t changed the original behaviour of the game too much, but which has been cleaned up and refactored for clarity. John Carmack is a big believer in not “gold plating” your code, i.e. putting effort into making it beautiful past the point where it has any utility to your users. He goes into that in a bit of detail in this amazing tech talk:

Trust me: watch the whole bloody thing, it’s so good

The first thing I reach for when trying to understand big codebases is static analysis. And the simplest static analysis tool is counting lines of code. I use CLOC. Before I move on to more complicated code quality metrics (or before I dive into each and every codebase), let me show you my initial findings.

All C or C++, except for Mocha Doom, which is pure Java. The counts exclude comments and blank lines.
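If you're wondering what "counting lines, excluding comments" means in practice, here's a minimal sketch in Python. To be clear, this is not how CLOC works internally; it's a simplified stand-in I'm assuming for illustration. The names `count_sloc`, `count_tree`, and `skip_dirs` are hypothetical, and the comment stripping is naive: it doesn't understand string literals, so a `//` inside a string would be treated as a comment.

```python
import os

# File extensions treated as C-like source (an assumption for this sketch).
C_LIKE_EXTENSIONS = {".c", ".h", ".cpp", ".hpp", ".cc", ".java"}


def count_sloc(source: str) -> int:
    """Count lines that still contain code after removing // and /* */ comments.

    Simplification: string literals are not parsed, so comment markers
    inside strings are (wrongly) treated as comments.
    """
    sloc = 0
    in_block = False  # True while inside a /* ... */ comment
    for line in source.splitlines():
        code = []
        i = 0
        while i < len(line):
            if in_block:
                end = line.find("*/", i)
                if end == -1:
                    i = len(line)  # comment continues on the next line
                else:
                    in_block = False
                    i = end + 2
            elif line.startswith("//", i):
                break  # rest of the line is a comment
            elif line.startswith("/*", i):
                in_block = True
                i += 2
            else:
                code.append(line[i])
                i += 1
        if "".join(code).strip():
            sloc += 1
    return sloc


def count_tree(root: str, skip_dirs=()) -> int:
    """Sum count_sloc over all C-like files under root, skipping named directories."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk does not descend into skipped directories.
        dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        for name in filenames:
            if os.path.splitext(name)[1].lower() in C_LIKE_EXTENSIONS:
                path = os.path.join(dirpath, name)
                # latin-1 never fails to decode, which suits old source trees.
                with open(path, encoding="latin-1") as f:
                    total += count_sloc(f.read())
    return total
```

The `skip_dirs` argument is there because what you leave out matters as much as what you count: something like `count_tree("some-port/src", skip_dirs={"zlib", "curl"})` (directory names here are just examples) keeps bundled libraries from inflating the total.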

I don’t want to spend time analysing these just yet. It’s hard to get much meaning out of raw code line counts, but here are some preliminary notes:

Doom in Java

Mocha Doom is the only source port that has actually re-implemented the Doom engine in another language (Java). All of the versions you hear about in the browser or on Android or the iPhone (check out this top-notch code review of the iOS version btw) rely heavily on cross-compiling the C code.

Sceptical about its special status? The creator of Mocha made this extremely thorough FAQ to convince you.

Chocolate Doom is a lot bigger than I expected

I’m a big fan of Chocolate Doom. But for a port whose chief goal is to “accurately reproduce the experience of Doom as it was played in the 1990s”, there sure is a lot of code. Can’t wait to dive in and find out why.

UPDATE: Linguica on the Doomworld forums just pointed out that I was including Chocolate Heretic, Chocolate Hexen, and Chocolate Strife in my count. It’s actually 56,207 SLOC, not 192,372. Impressively small!

What about other id games?

A lot of the codebases included entire big libraries wholesale (e.g. curl, zlib), so I cut those out. Otherwise we’d be seeing some pretty huge numbers. I’d’ve done the same with Quake 2 and 3, but I’m going away camping in an hour and need to pack. If anyone feels like running the numbers I’ll happily update the post :)

ZDoom has done a great job of recording their history

Yep.

Next steps

I’d like to try running some more sophisticated static analysis tools over a few of the potential reference implementations. I’m also very curious about this 3DO port from 1995, and how it differs from Linux Doom.