
Porting from DOS to Windows

Step by step through Death Rally's journey to the new millennium

This article first appeared in the April 2010 issue of Game Developer Magazine, Inner Product, pages 38-41. With a better layout. And pictures. So, the 'last May' below refers to May 2009.

Max Payne / Alan Wake creator Remedy's top-down combat racing game Death Rally was released for DOS computers in 1996, and although it does run under the open source DOSBox emulator, it doesn't run very well. I felt that Death Rally was still a good game and wanted to get it into a playable form again.

So last May I got an idea, and thought, "what the heck, let's go for it." I sent an email to Remedy Entertainment, volunteering to make Death Rally open source. I didn't expect a reply; at the most, I expected a polite "no." Much to my surprise, I got a "maybe."

After a couple weeks of legal checking, we agreed that while an open source release would not necessarily be possible, we could probably work something out. And so it came to be that in July, I downloaded the source package for evaluation.

The first task would be to take a cursory glance at the material and see if the project was actually possible. I expect some of you to wonder whether there was any funny code. Sure there was. Take a peek at any large project you did as a teenager over a decade ago and see if there's any funny code in there. I couldn't find anything truly "daily wtf"-worthy, though, and what I did find wasn't anything a few days of refactoring wouldn't fix.

Instead of refactoring, I took an archeologist's approach - I made minimal changes and marked my transgressions clearly in the source code.

Starting Blocks

The source software platform was DOS, Watcom C, and some DOS/4GW-style DOS extender. The extender basically meant you could use more than 640k of memory, and would not need any weird code for data larger than 64k.

The game displayed in VESA 640x480 and MCGA 320x200 graphics modes, all with 8-bit palettes; there was no true color anywhere. There were also some per-frame palette change tricks that emulators have trouble with.

The source code was mostly pure C with a couple dozen inline assembly functions. There were a few missing subsystems, specifically audio and networking, which would have to be replaced completely anyway, as well as one file for which the source code was lost and only a compiled object was available.

Getting It To Compile

First order of the day: get the game to compile. I started a Visual Studio project, imported all source files, and checked what the compiler would say.

The Visual Studio and Watcom compilers disagree on several points, which is hardly surprising as the Watcom version used was about a decade older than the Visual Studio I used.

One of the obvious things is that Watcom considers chars to be unsigned, while MSVC sees them as signed by default. There's a compile option in MSVC for this, but in order to avoid confusion further down the line, I opted to do some search-and-replace operations to designate all chars unsigned (except for those that were explicitly set to be otherwise).
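To illustrate why the default signedness matters, here is a minimal sketch. The function names are hypothetical and not from the Death Rally source. The same byte gives a different integer depending on whether it is read as unsigned or signed, which matters as soon as the value is used as a table index or in a comparison.

```c
/* Hypothetical helpers, for illustration only. */

/* Read one byte as an unsigned char: the result is always 0..255,
   which is what code written for Watcom's default would expect. */
int read_as_unsigned(const void *p)
{
    return *(const unsigned char *)p;
}

/* Read the same byte as a signed char: bit patterns 0x80 and above
   come out negative on two's complement machines. */
int read_as_signed(const void *p)
{
    return *(const signed char *)p;
}
```

A pixel value of 200 stays 200 through the first function, but comes out negative through the second, so something like `table[pixel]` would index before the start of the table.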

MSVC is also much pickier about types, so I got lots and lots of warnings, and even errors in some cases. Most of these were relatively easy to fix - some typecasts here, a prototype added there, sprinkle some parentheses around. One rather tricky bit was where Watcom and MSVC disagreed slightly on requesting the address of an array, so I had to manually patch things up in a few hundred places.

After fixing a truckload of errors and warnings, and stubbing all assembly functions as well as other missing symbols, I ended up with about 90 functions that needed rewriting.

No More Hardware Access

In DOS, there's not much of an operating system in your way. You could, and in many cases had to, access hardware features directly. For instance, graphical video memory was mapped to the real-mode segment 0xa000. This segment was usually (if not always) mapped to the direct address 0xa0000 in DOS extenders.

Higher-resolution VESA modes could be accessed most commonly in banks through the above segment. If you wanted to access more of the memory, you used some interface to switch memory banks, and then accessed the same segment again. Thus, the applications set a graphics mode (and possibly segment) and accessed the video memory directly.

I solved this by allocating a frame buffer big enough for 640x480 and creating a global variable g0xa0000, replacing all direct addresses with said pointer. The pointer would be updated to the beginning of the frame buffer on mode init, and to different offsets based on the bank switch calls.
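A minimal sketch of that idea follows. Only the name g0xa0000 comes from the actual port. The function names, the 64k bank size, and the rounding of the buffer up to whole banks are my assumptions for illustration.

```c
#include <stddef.h>
#include <stdlib.h>

#define SCREEN_W   640
#define SCREEN_H   480
#define BANK_SIZE  0x10000u  /* assumption: 64k bank window and granularity */
#define BANK_COUNT ((SCREEN_W * SCREEN_H + BANK_SIZE - 1) / BANK_SIZE)

static unsigned char *framebuffer = NULL; /* the whole 8-bit screen */
unsigned char *g0xa0000 = NULL;           /* stands in for address 0xa0000 */

/* Called on mode init: allocate the buffer (rounded up to whole banks so
   the last window never runs off the end) and point at its start.
   Returns 0 on success, -1 if the allocation fails. */
int video_init(void)
{
    if (framebuffer == NULL)
        framebuffer = calloc((size_t)BANK_COUNT * BANK_SIZE, 1);
    if (framebuffer == NULL)
        return -1;
    g0xa0000 = framebuffer;
    return 0;
}

/* Replacement for the bank switch call: move the window pointer.
   Out-of-range bank numbers are ignored. */
void video_set_bank(unsigned bank)
{
    if (bank < BANK_COUNT)
        g0xa0000 = framebuffer + (size_t)bank * BANK_SIZE;
}

/* Read a pixel by linear offset, as the display code would. */
unsigned char video_peek(size_t offset)
{
    return framebuffer[offset];
}
```

With this in place, old code that wrote to the 0xa0000 window after a bank switch ends up writing to the matching part of the linear buffer without knowing the difference.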

Other video features were accessible through hardware I/O ports. The most important were the vertical retrace check and palette access. These I replaced with completely separate functions.
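As a rough sketch of what a palette replacement might look like: the function names are hypothetical, and I'm assuming the game wrote standard 6-bit VGA DAC values (0..63) that need expanding to 8 bits for a true color display.

```c
#include <stddef.h>
#include <stdint.h>

static uint32_t palette[256]; /* each entry stored as 0x00RRGGBB */

/* Expand a 6-bit VGA component to 8 bits by copying the top two bits
   into the bottom, so 63 maps to 255 and 0 maps to 0. */
static uint32_t expand6(int v)
{
    v &= 63;
    return (uint32_t)((v << 2) | (v >> 4));
}

/* Hypothetical replacement for writing one color to the VGA DAC ports. */
void set_palette_entry(int index, int r6, int g6, int b6)
{
    palette[index & 255] =
        (expand6(r6) << 16) | (expand6(g6) << 8) | expand6(b6);
}

uint32_t get_palette_entry(int index)
{
    return palette[index & 255];
}

/* Convert 8-bit indexed pixels to true color through the palette. */
void blit_indexed(uint32_t *dst, const unsigned char *src, size_t count)
{
    size_t i;
    for (i = 0; i < count; i++)
        dst[i] = palette[src[i]];
}
```

Because the conversion goes through the table on every blit, any palette change made by the game shows up on the next displayed frame.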

Data I/O

That one object file with no source code happened to house decompression functions for the game data. I disassembled it and wound up with about five hundred lines of assembly, which, from a cursory glance, did not look like the output of a compiler. Not completely inspired to reverse engineer the code at this point, I took a shortcut to more interesting things by using the object file to make a DOS application (using OpenWatcom) which decompressed all the game's data files, and wrote a simple hack to access the decompressed files instead. This was clearly not a final solution, but it allowed me to progress.

There were some small problems with this approach. Audio files were in a differently encrypted format, and some of the game's small animations were handled differently in the decompressor with parts of the data compiled into the executable instead of the data file.

I made a note that while the cutscenes were also compressed, the source code for the decompressor was in C. So if both compression algorithms were written by the same person, the algorithms might also be similar.

A few days (and a dozen rewritten inline assembly functions) later, I came to the realization that I had to get that decompression function to work. Strange bugs had started to crop up, most likely caused by bad or completely missing data not produced by my temporary hack.

Trying to find another easy way out, I compared the characteristics of the code with known compression algorithms, discarding most of them due to the requirement of overly large lookup tables or code complexity. The source code to Info-ZIP is invaluable for these kinds of things, as it implements most common compression algorithms, not only the ones found in modern ZIP formats. In the end, it was clear this was a proprietary algorithm, so I really did have to dive in.

I spent a couple days poring over the code and trying to re-implement what it does in C. Once I understood what the assembly code was doing, I took another glance at the section that decompresses the cutscenes and realized it's almost the same - except for some additional encryption. I made a variant of that code and the data problems went away.

With that bit done, I took a look at the audio files which had an additional layer of encryption. At this point, my Remedy contact, Markus Mäki, commented, "Who on Earth has been encrypting all these things and why?" Luckily, the source code to decrypt the audio files was found.

Application Framework

The way applications work in DOS is somewhat different from what most people are used to these days. Control was entirely in the application's hands. There wasn't any multiprocessing to worry about, and you could pretty much depend on the characteristics of the de-facto VGA standard. If there were problems, users were expected to manually play around with system configuration text files.

Since the whole game was vertical retrace-synced (at VGA 70Hz), it made sense to place the OS message pump and graphics output into the vertical retrace check function. This worked beautifully, except for places where the game did not bother to wait for retrace (such as simply showing something on-screen and waiting for a key in a busy loop). No retrace check, no message pump, no keys pressed. Adding the retrace checks to the loops naturally fixed the issue.

// Copy image to screen
memcpy((char*)0xA0000, myImage, 64000);
// Wait for key press
getch();

Copy data directly to video memory and busy wait for key - perfectly legal in the DOS era.

The game also utilized a timer interrupt that ran in sync with the video refresh rate. I did not bother trying to make a separate thread to make it run exactly at 70Hz, and simply called the interrupt routine at approximately 70Hz in my message pump code. One positive side effect of this approach was that the per-frame palette-change tricks worked automatically.
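The retrace check that doubles as message pump and timer can be sketched roughly like this. All names are hypothetical, the pump is a stub, and the caller supplies the current time in milliseconds so the sketch stays independent of any particular OS or library.

```c
typedef void (*timer_handler_fn)(void);

static timer_handler_fn timer_handler = 0;
static unsigned long long ticks_delivered = 0;
unsigned long pump_calls = 0;
unsigned long game_ticks = 0;

/* Stand-in for the game's original timer interrupt routine. */
void example_timer_handler(void)
{
    game_ticks++;
}

void install_timer_handler(timer_handler_fn fn)
{
    timer_handler = fn;
}

/* Stub: the real thing would process OS messages and show the frame. */
static void pump_messages_and_present(void)
{
    pump_calls++;
}

/* Replacement for the vertical retrace check. now_ms is the time since
   startup in milliseconds. The handler is called once for every 1/70 s
   that has passed since the previous call, so it runs at 70Hz on
   average rather than exactly on time. */
void vertical_retrace_check(unsigned long now_ms)
{
    unsigned long long ticks_due = (unsigned long long)now_ms * 70 / 1000;

    pump_messages_and_present();
    while (ticks_delivered < ticks_due) {
        if (timer_handler)
            timer_handler();
        ticks_delivered++;
    }
}
```

Since the handler runs on the same thread as the game code, there are no locking concerns, at the cost of the ticks arriving in small bursts when the game is slow to ask about the retrace.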

I also wrote some placeholder keyboard handling code, which, much to my surprise, worked directly. Apparently, the SDL scan codes match whatever DOS had, or came close enough.

Connecting The Dots

Instead of converting one inline assembly format to another, I rewrote all the functions in C. I think the result was actually not slower, as compiler optimization technology has improved a lot and the original assembly was written with original Pentiums (or worse) in mind.

Most of the assembly functions were little things, like rectangle copy or bit mask matching, and did not take too much effort to write. First the menus, then the in-game graphics started to come into view. This part was pure joy - not so different from eating pistachio nuts: each bite takes a little effort, but has a huge payoff. I always wanted just one more, making it very difficult to call it a day.
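To give an idea of the scale of these functions, a rectangle copy in plain C could look something like the sketch below. This is not the original routine. The name and the parameter layout are my own choices for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Copy a width x height block of 8-bit pixels. The pitch values are the
   number of bytes from the start of one row to the start of the next,
   so the source and destination surfaces can have different widths. */
void rect_copy(unsigned char *dst, size_t dst_pitch,
               const unsigned char *src, size_t src_pitch,
               size_t width, size_t height)
{
    size_t y;
    for (y = 0; y < height; y++) {
        memcpy(dst, src, width);
        dst += dst_pitch;
        src += src_pitch;
    }
}
```

A modern compiler and C library will typically turn the inner memcpy into something at least as good as hand-written assembly from the Pentium era.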

One final piece of assembly was the polygon filler. In this area, I opted not to faithfully reproduce the original code, but wrote a software rasterizer from scratch, so if you see the polygon filler glitch, that's probably my fault.

It had been about three weeks and the game was playable. Still no sound and tons of small things to do, but playable.

Audio

Next up was audio. The game used Scream Tracker S3M modules for music and Fast Tracker 2 XM modules for sound effects. Why both were not in the more advanced XM format, I do not know. Maybe XM for sound effects was a later addition, or maybe the composer preferred the S3M format. Music was relatively easy to handle, except for a small glitch where the replacement audio system was optimizing things a bit too much.

I called Jonne about the S3M vs XM thing, and he just preferred Scream Tracker. You can stop sending me email about this now. Please.

The game used a trick that was common in those days, where you place several looping songs into one module and instruct the replay routine to switch between the songs by jumping to a certain order number. The songs in question had some empty orders between songs, and these got optimized out, messing up the sub-song order numbers. Luckily, the order list was easy to read from the S3M directly, so I could make simple translation tables from optimized to original and back. The sound effects, however, took a lot more effort.
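The order translation described above can be sketched like this. I'm assuming here that the empty orders are stored as the S3M marker value 254 and that the replacement audio system simply drops them. The function name and table layout are hypothetical.

```c
#include <stddef.h>

#define ORDER_MARKER 254 /* assumption: S3M marker order, skipped in playback */

/* Build translation tables between the original order list and the one
   left after marker orders have been removed.
     orig_to_opt[i] = position of original order i in the optimized list,
                      or -1 if that order was removed
     opt_to_orig[j] = original position of optimized order j
   Both tables must have room for count entries.
   Returns the number of orders in the optimized list. */
size_t build_order_tables(const unsigned char *orders, size_t count,
                          int *orig_to_opt, int *opt_to_orig)
{
    size_t i, kept = 0;
    for (i = 0; i < count; i++) {
        if (orders[i] == ORDER_MARKER) {
            orig_to_opt[i] = -1;
        } else {
            orig_to_opt[i] = (int)kept;
            opt_to_orig[kept] = (int)i;
            kept++;
        }
    }
    return kept;
}
```

When the game asks to jump to original order 4, the wrapper would jump to orig_to_opt[4] instead, and any position reported back by the player goes through opt_to_orig.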

After several false starts, including writing a complete XM module loader, I took the open source XM player minifmod and made some severe modifications to it in order to use it as the sound effects library. The original sound library had a notion of "pitch" that wasn't in Hertz, but in something related to the notes. I had to invent an algorithm that approximated the conversion from the original "pitch" to a relative note. While the result is probably not exactly the same, it kind of feels right.

Graphics

I started off with the idea that since the original game used two graphics modes, I might do the same. Unfortunately, the 320x200 mode did not work out, so I opted to only use one graphics mode - 640x480 with a simple scaler for the 320x200 mode. The bad side of this decision was that the aspect ratio for the 320x200 mode was wrong. I could have spent a long time making some kind of weaving algorithm to turn 640x400 into 640x480, but opted to just add black bars at the top and bottom instead.

This got the graphics going quickly. In testing, however, we found that some non-4:3 aspect ratio LCD screens stretched the image, and for dual-screen systems, switching to 640x480 moved all the other windows in an irritating manner.
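For reference, the simple scaler with black bars could be sketched as below: every 320x200 pixel is doubled in both directions to give 640x400, and the remaining 80 lines are split into two 40-line bars. The function name is hypothetical, and the palette index used for black is passed in because I don't want to assume which index the game used.

```c
#include <stddef.h>
#include <string.h>

#define SRC_W 320
#define SRC_H 200
#define DST_W 640
#define DST_H 480
#define BAR_H ((DST_H - SRC_H * 2) / 2) /* 40 lines top and bottom */

/* Scale an 8-bit 320x200 image into a 640x480 buffer by pixel doubling,
   filling the top and bottom bars with the given palette index. */
void scale_320x200_to_640x480(unsigned char *dst, const unsigned char *src,
                              unsigned char black)
{
    int x, y;

    memset(dst, black, (size_t)DST_W * BAR_H);
    memset(dst + (size_t)DST_W * (DST_H - BAR_H), black,
           (size_t)DST_W * BAR_H);

    for (y = 0; y < SRC_H; y++) {
        unsigned char *row0 = dst + (size_t)DST_W * (BAR_H + y * 2);
        const unsigned char *s = src + (size_t)SRC_W * y;

        for (x = 0; x < SRC_W; x++) {
            row0[x * 2] = s[x];
            row0[x * 2 + 1] = s[x];
        }
        /* The second output line is a copy of the first. */
        memcpy(row0 + DST_W, row0, DST_W);
    }
}
```

The result is slightly squashed compared to the original, since 320x200 was displayed at 4:3 on a VGA monitor, which is the aspect ratio problem mentioned above.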

Near the end, I figured it would be best to use OpenGL to scale the frame buffer to screen at desktop resolution. This would solve several issues, including the 320x200 screen mode aspect ratio. In high enough target resolution, the 320x200 looks pretty authentic when using point-sample texture lookup. I was originally wary of this approach because of possible performance concerns on low-end 3D hardware, but testing on some low-end Intel chipset mini-laptops cleared these issues.
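One piece of this is working out where on the desktop the scaled image should go. A minimal sketch, assuming the goal is the largest centered 4:3 rectangle that fits the screen (the type and function names are hypothetical):

```c
typedef struct {
    int x, y, w, h;
} rect_t;

/* Compute the largest centered 4:3 rectangle inside a screen of the
   given size. Wide screens get bars on the sides, tall ones get bars
   at the top and bottom. */
rect_t fit_4_3(int screen_w, int screen_h)
{
    rect_t r;

    if (screen_w * 3 > screen_h * 4) {
        /* Screen is wider than 4:3: use the full height. */
        r.h = screen_h;
        r.w = screen_h * 4 / 3;
    } else {
        /* Screen is 4:3 or taller: use the full width. */
        r.w = screen_w;
        r.h = screen_w * 3 / 4;
    }
    r.x = (screen_w - r.w) / 2;
    r.y = (screen_h - r.h) / 2;
    return r;
}
```

On a 1920x1080 desktop this gives a 1440x1080 image with 240-pixel bars on both sides. Both the 640x480 and the 320x200 frame buffers can be stretched into the same rectangle, which is how the 320x200 mode gets its aspect ratio back.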

Implementing the OpenGL blitter was easy, but it uncovered a nasty issue. While I was running in software, the display update was not synced to the display refresh rate. With OpenGL, it was. The game's internal clock was fixed to the 70Hz VGA refresh rate, which I was faking. The way I implemented the vertical retrace meant that whenever the application even asked about the retrace - not only when it was waiting for retrace - we'd do the message pump and the display would update (up to 70Hz, anyway).

The in-game graphics were requesting information about the retrace about 50 times per frame in the worst case. Different aspects of the game world were asking which frame we were on for animation timing purposes. While doing the message pump, we had no idea whether we would have a new screen to show or not. As a result, we spent a few milliseconds here, a few milliseconds there, and suddenly had to wait for display refresh, wasting a dozen milliseconds and so on, until the game was crawling at about 1Hz.

In order to solve this, I made two changes. First, whenever the code asked about the retrace, I'd increment the "current frame" value by one and return that, instead of jumping to the correct real-world value as I was doing originally. The exception was when we had already caught up with the real world, in which case the current value was returned. This solved the slowdown issue everywhere except in-game.

The second change was a hack for the in-game case: a flag which disables the display update. I set this flag for all parts of the game loop except when actually waiting for vertical retrace.
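Put together, the frame counter and the flag could be sketched roughly as follows. The names are hypothetical, the display update is a stub that only counts calls, and the real-world frame number is passed in by the caller.

```c
static unsigned long reported_frame = 0;
static int display_update_disabled = 0;
static unsigned long presents = 0;

/* Set to nonzero around the parts of the game loop that only ask about
   the retrace for timing purposes. */
void set_display_update_disabled(int disabled)
{
    display_update_disabled = disabled;
}

/* Stub: the real thing would pump messages and show the frame. */
static void maybe_present(void)
{
    if (!display_update_disabled)
        presents++;
}

unsigned long present_count(void)
{
    return presents;
}

/* Answer a "which frame are we on" query. real_frame is the frame
   number according to the real-world 70Hz clock. The reported value
   moves forward by at most one per query and never passes real_frame. */
unsigned long query_frame(unsigned long real_frame)
{
    maybe_present();
    if (reported_frame < real_frame)
        reported_frame++;
    return reported_frame;
}
```

The important property is that dozens of timing queries within one frame no longer each risk a blocking display update. Only the actual wait for vertical retrace does.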

Rally Crossed

And so the port was done. One major part which was unfortunately left out was the multiplayer networking, as it would have required a non-trivial rewrite. Apart from cosmetic changes, the game is the same as its original DOS counterpart. You now exit to OS instead of DOS, and I also added a few delay loops where loading times have become insignificant, plus other little touches like that.

Pitfalls

By now, you may have noticed that all the issues I faced during the port have to do with technologies that have changed, and in most cases, improved with time.

Apart from the speed, memory protection, and compiler issues mentioned earlier, there's one more thing that has also changed with time: quality requirements.

Back then, PCs were not even as standard as they are now. There was no process memory protection, and in general, it was okay for programs to do all sorts of funny things. Sometimes they crashed, and this was considered acceptable within reason. After all, some PCs were more stable than others. Still, these bugs haven't gone away, and you may have to add in some additional bug-hunting time for your modern port.

Luckily for me, the Death Rally codebase was pretty stable. Still, I spent some time hunting bugs that occurred rarely, and in some cases, never appeared on my development system. A few crash cases were due to my own misunderstanding of some of what was happening in the source; some were actual bugs in the original code, but they were all more or less simple to fix or work around.

Death Rally was released as freeware for Windows and can be downloaded from www.death-rally.com. I hope you enjoy it as much as I have!

Project at a Glance

Original specs:

DOS (w/extender)

Watcom C

MCGA 320x200x8

VESA 640x480x8

60-plus MHz CPU

8MB RAM

New specs:

Win32

MSVC7.1

16 or 24 bit color in various resolutions

1-plus GHz CPU(s)

1-plus GB RAM