In my previous blog post I talked about reverse engineering the virtual machine used to implement objects in The Lost Vikings. The Lost Vikings was released in 1993 by Silicon & Synapse, now better known as Blizzard.

At the time I empirically tested some opcodes by manually patching the original data chunks with a hex editor. This works, but it is fairly tedious, and doesn’t scale well to experimenting with larger programs, or programs that contain loops or jumps. In the first blog post I suggested that creating a simple language and compiler would be useful for further reverse engineering the virtual machine. So, I did.

Building a Compiler

To assist in reverse engineering the virtual machine, and also to allow for easy creation of new programs, I decided to build a compiler. To keep things simple I opted for a single-pass, recursive descent parser.

The compiler design is very much based on the PL/0 type compilers often taught in university compiler courses (see PL/0). I won’t go into all the details of writing a basic compiler as there are tons of resources available which already cover this. Basically the compiler takes a source program, performs lexical analysis to create a stream of tokens, and then parses the program, generating code as it goes.
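As a rough illustration of the lexical analysis step, here is a minimal tokenizer sketch in Python. The token names and the exact set of patterns are assumptions made for this example; they are not taken from the actual compiler.

```python
import re

# Token patterns for a tiny C-like language. The token names and the exact
# patterns are assumptions made for this sketch.
TOKEN_SPEC = [
    ("NUMBER", r"0[xX][0-9a-fA-F]+|\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"==|!=|[-+&=(){};,.]"),
    ("SKIP", r"\s+|//[^\n]*"),
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def tokenize(source):
    """Yield (kind, text) pairs, skipping whitespace and // comments."""
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(
                f"unexpected character {source[pos]!r} at offset {pos}"
            )
        if match.lastgroup != "SKIP":
            yield match.lastgroup, match.group()
        pos = match.end()


tokens = list(tokenize("this.timer = this.timer - 1; // count down"))
for kind, text in tokens:
    print(kind, text)
```

The parser then only has to look at one token at a time, which is what makes a single-pass design workable.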

Lost Vikings C

I created a very simple C-like language for implementing object programs. One advantage of using a C-like language is being able to reuse the existing C preprocessor to support defining constants and macros. The language does not allow for defining variables; you are restricted to the object fields and globals defined by the virtual machine. Built-in functions provide support for generating opcodes such as the yield and spawn object operations. The language makes a few minor deviations from C just to keep the lexer and parser simple.

Sticking with the gun turret object from the first level that I modified in the first blog post, a turret which shoots arrows can be implemented in Lost Vikings C as follows:

    #include "vikings.h"

    #define timer      field_30
    #define DELAY_TIME 40

    function main
    {
        call set_gfx_prog(0x37bc);
        this.timer = 0;

        while (this.timer != 0xffff) {
            call update_obj();
            call yield();

            if (this.timer != 0) {
                this.timer = this.timer - 1;
            } else {
                // Calculate the offset of the projectile based on the
                // turret's direction.
                if (this.flags & OBJ_FLAG_FLIP_HORIZ) {
                    g_tmp_a = this.x - 12;
                } else {
                    g_tmp_a = this.x + 12;
                }
                g_tmp_b = this.y - 12;

                // Update the turret animation
                call set_gfx_prog(0x37c8);

                // Spawn an arrow projectile
                call spawn_obj(g_tmp_a, g_tmp_b, 0, 0, 7);

                // Delay for next shot
                this.timer = DELAY_TIME;
            }
        }
    }

Each object needs a main loop, which should never terminate. The main loop must call the yield() function to allow the virtual machine processing loop to exit. Failing to do this will cause the game engine to hang. The update_obj() call does basically what it says.

The use of many of the object fields is left up to the specific object. In this case I’ve used field_30 as a timer to control how quickly the turret shoots. When the timer hits zero, the turret fires, and then rearms the timer.

The only thing left to do now is generate code for this program.

Code Generation

The Lost Vikings virtual machine is not an ideal target for a simple compiler. PL/0 type compilers typically generate code for an abstract stack machine, so a simple statement like:

this.x = this.y + 1;

Would generate code like:

    push this.y    ; push this.y onto the stack
    push 1         ; push 1 onto the stack
    add            ; pop the top two values, add them and push the result
    pop this.x     ; pop the result into this.x
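To make the single-pass idea concrete, here is a small Python sketch of a recursive descent routine that emits stack-machine code while it parses, without building an intermediate tree. The grammar it accepts, the function names and the `sub` mnemonic are assumptions for illustration only.

```python
def compile_assignment(tokens):
    """Compile `target = term (('+' | '-') term)* ;` into stack-machine ops.

    Code is emitted as the tokens are consumed, with no intermediate tree.
    """
    code = []
    pos = 0

    def next_token():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def term():
        # A term is a field name or an integer literal; either way, push it.
        code.append(f"push {next_token()}")

    def expression():
        term()
        while tokens[pos] in ("+", "-"):
            op = next_token()
            term()
            # "sub" is a hypothetical mnemonic, by analogy with "add".
            code.append("add" if op == "+" else "sub")

    target = next_token()
    if next_token() != "=":
        raise SyntaxError("expected '='")
    expression()
    if next_token() != ";":
        raise SyntaxError("expected ';'")
    code.append(f"pop {target}")
    return code


print("\n".join(compile_assignment(["this.x", "=", "this.y", "+", "1", ";"])))
```

Run on the statement above, this prints the same four stack operations as the listing.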

The Lost Vikings virtual machine, however, does not have a stack. It has a single temporary register, object fields and globals. The above program could be translated as follows for the Lost Vikings virtual machine:

    52 20      ; var = this.y
    56 1e      ; this.x = var
    51 0001    ; var = 0x0001
    59 1e      ; this.x += var

This is much more difficult to generate code for when compiling from a general purpose language. It would probably require generating an intermediate representation in the first pass, and then re-ordering instructions, etc. in a second pass to generate the final code.

Abstract Machine

The solution I came up with was to build an abstract stack machine, which is easy to compile for, on top of the Lost Vikings virtual machine. This is made possible by the opcodes for loading and storing globals. These opcodes take an offset in the game’s data segment (DS) as their operand. This allows the opcodes to be used to load or store any arbitrary address in DS.

The compiler generates code by placing a fake stack at a high location (0xf004) in the data segment. Two locations at the base of the fake stack are reserved for special registers: 0xf000 is a zero register, and 0xf002 is a flag register used for comparisons. The above statement can now be compiled as:

    52 20      ; var = this.y
    57 f004    ; ds[f004] = var
    51 0001    ; var = 0x0001
    57 f006    ; ds[f006] = var
    53 f006    ; var = ds[f006]
    5a f004    ; ds[f004] += var
    53 f004    ; var = ds[f004]
    56 1e      ; this.x = var
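The mapping from stack operations to real opcodes can be sketched in a few lines of Python. This is an illustrative sketch rather than the compiler's actual code: the function name, the textual form of the stack ops and the `FIELDS` table are assumptions, while the opcode values, the 0xf004 stack base and the field offsets (0x1e for x, 0x20 for y) come from the listings in this post.

```python
STACK_BASE = 0xF004  # first fake-stack slot; 0xf000 and 0xf002 are reserved
FIELDS = {"this.x": 0x1E, "this.y": 0x20}


def lower(stack_ops):
    """Translate stack-machine ops into listing lines for the real VM."""
    lines = []
    sp = STACK_BASE  # address of the next free 16-bit stack slot

    for op in stack_ops:
        name, _, arg = op.partition(" ")
        if name == "push":
            if arg in FIELDS:
                lines.append(f"52 {FIELDS[arg]:02x}    ; var = {arg}")
            else:
                value = int(arg, 0)
                lines.append(f"51 {value:04x}  ; var = 0x{value:04x}")
            lines.append(f"57 {sp:04x}  ; ds[{sp:04x}] = var")
            sp += 2
        elif name == "add":
            # Pop the top slot into var, then add it to the slot below.
            sp -= 2
            lines.append(f"53 {sp:04x}  ; var = ds[{sp:04x}]")
            lines.append(f"5a {sp - 2:04x}  ; ds[{sp - 2:04x}] += var")
        elif name == "pop":
            sp -= 2
            lines.append(f"53 {sp:04x}  ; var = ds[{sp:04x}]")
            lines.append(f"56 {FIELDS[arg]:02x}    ; {arg} = var")
        else:
            raise ValueError(f"unknown stack op: {op}")
    return lines


lines = lower(["push this.y", "push 1", "add", "pop this.x"])
print("\n".join(lines))
```

For `this.x = this.y + 1` this produces the same instruction bytes as the listing above. Because the stack pointer is tracked at compile time, no runtime stack pointer register is needed.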

While not particularly high performance, this code is easy to generate. My goal is to build a compiler for testing out small programs to determine what opcodes, or combinations of opcodes, do. I don’t care too much about efficiency, either in time or code size, for the moment.

A second problem in crafting a language for the original virtual machine is that many of the opcodes conditionally call a function. For example opcode 0x1a checks if the current object has collided with a viking, and if so emits a call instruction. It is more desirable to let the programmer decide whether to use a call or a jump (e.g. an if statement).

I implemented this by emitting code for a generic helper function which sets a flag register and returns. When generating code for an opcode which makes a conditional call, the flag register is first cleared. If the opcode makes the call then the flag register will be set. The flag can then be tested with a conditional jump. So, the following code:

    if (call collided_with_viking(0x01)) {
        this.x = 1;
    }

Generates the following code:

    [c009] 51 0000       ; var = 0x0000
    [c00c] 57 f002       ; ds[f002] = var, clear the flag register
    [c00f] 1a 01 c0a5    ; if (collided_with_viking(0x01)) call c0a5
    [c013] 53 f002       ; var = ds[f002]
    [c016] 74 f000 c026  ; if (var == ds[f000]) goto c026
    [c01b] 51 0001       ; var = 0x0001
    [c01e] 57 f004       ; ds[f004] = var
    [c021] 53 f004       ; var = ds[f004]
    [c024] 56 1e         ; this.x = var
    [c026] ...           ; next instruction after the if block

    ; Set flag helper function
    [c0a5] 51 0001       ; var = 0x0001
    [c0a8] 57 f002       ; ds[f002] = var, set the flag register
    [c0ab] 06            ; return

The zero register is used here to compare the flag value against. The compiler emits instructions at the beginning of each program to clear the zero register.
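The jump target in a sequence like this is only known once the body of the if block has been laid out, so it has to be patched in afterwards. The Python sketch below shows one way to do that. The function name and the tuple representation are assumptions for illustration, and the instruction sizes are inferred from the addresses in the listing above rather than taken from the compiler source.

```python
# Instruction sizes in bytes, inferred from the addresses in the listing.
SIZES = {0x06: 1, 0x56: 2, 0x51: 3, 0x53: 3, 0x57: 3, 0x1A: 4, 0x74: 5}
ZERO_REG, FLAG_REG = 0xF000, 0xF002


def compile_if_collided(start, arg, body, helper_addr):
    """Lay out `if (call collided_with_viking(arg)) { body }` from `start`.

    `body` is a list of (opcode, operands...) tuples. Returns a list of
    (address, instruction) pairs with the skip target patched in.
    """
    program = [
        (0x51, 0x0000),            # var = 0
        (0x57, FLAG_REG),          # clear the flag register
        (0x1A, arg, helper_addr),  # the helper sets the flag if called
        (0x53, FLAG_REG),          # var = flag
        [0x74, ZERO_REG, None],    # if (var == zero) goto end, patched below
        *body,
    ]

    placed, addr = [], start
    for ins in program:
        placed.append((addr, ins))
        addr += SIZES[ins[0]]

    # Backpatch: the jump skips to the first address after the body.
    program[4][2] = addr
    return [(a, tuple(ins)) for a, ins in placed]


body = [(0x51, 0x0001), (0x57, 0xF004), (0x53, 0xF004), (0x56, 0x1E)]
layout = compile_if_collided(0xC009, 0x01, body, helper_addr=0xC0A5)
for address, ins in layout:
    print(f"[{address:04x}]", " ".join(f"{x:02x}" for x in ins))
```

With the same start address and body as the listing, the patched jump target comes out as 0xc026.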

Patching Programs

The virtual machine programs for a single world in the game are all packed into a single chunk in the data file. The compiler takes the simple approach of appending the generated code to the end of the chunk and then modifying the object’s program header to point to the patched in program.

This all works in theory, but there is one slight problem. The game stores the virtual machine programs in the extra segment (ES) at a location which is allocated using the DOS memory allocation API. The memory allocation function looks like this:

The call to allocate memory for the virtual machine programs looks like this:

So 0xc000 bytes (48K) is allocated for the programs. The problem is that the chunk for the space world programs is already 48972 bytes in size, leaving only 180 bytes at the end to patch a new program in. Not much when dealing with a compiler which generates extremely verbose code.

I discovered this problem when attempting to compile a larger program. The game started behaving erratically, either hanging or the gun turret object would randomly vanish. These sorts of issues can be difficult to chase down because it could be a bug in my compiler or generated code, a misunderstanding of how some opcode is meant to work, or a bug or limitation in the original game.

The quick solution is to patch the game binary to extend the size of the program allocation. Raising the allocation size to 0xd000 bytes (52K) works fine, and should be plenty of extra space for experimenting with simple programs. I’ve included instructions for this with the compiler.
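The arithmetic behind these numbers is easy to check, and a compiler could use the same calculation to refuse to patch in a program that will not fit. The Python sketch below is only an illustration; the function names are hypothetical and this check is not necessarily something the actual tools perform.

```python
ALLOCATION = 0xC000   # bytes the game allocates for VM programs
SPACE_CHUNK = 48972   # size of the space world program chunk in bytes


def free_bytes(allocation, chunk_size):
    """Bytes left after the original chunk for patched-in programs."""
    return allocation - chunk_size


def fits(code_size, allocation=ALLOCATION, chunk_size=SPACE_CHUNK):
    """True if `code_size` bytes of new code fit after the chunk."""
    return code_size <= free_bytes(allocation, chunk_size)


print(free_bytes(ALLOCATION, SPACE_CHUNK))  # 180
print(free_bytes(0xD000, SPACE_CHUNK))      # 4276
print(fits(200))                            # False
```

Since the simplest instructions generated by the fake stack approach are three bytes each, 180 bytes is only around 60 instructions.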

Empirical Reversing

The goal in building a compiler was to make it easier to reverse the virtual machine. Testing a new opcode involves adding an entry to a dictionary of functions in the code generator class and providing a function for emitting instructions for the opcode. For example, one of the first opcodes I experimented with was 0x41, which appears in a few disassembled programs like the button/question box object.

Its dictionary entry in the compiler initially looked like:

    "vm_func_41" : (False, 4, emit_builtin_vm_func_41),

The tuple specifies whether the function returns a value, the number of arguments it takes and the code emitter function. Functions are marked as returning a value if the opcode conditionally makes a call or jump.

I knew from looking at the opcode in IDA that opcode 0x41 was unconditionally executed and took four arguments. Its arguments use the variable type encoding that I discussed in the previous blog post. Its emit function looks like this:

    def emit_builtin_vm_func_41(self, reg_list):
        operands = self.pack_args(reg_list, 4)
        self.emit(0x41, *operands)

The pack_args helper function handles the packing of variable type arguments.
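To show how the dictionary entry and the emitter fit together, here is a self-contained Python sketch of the lookup. The class layout and the `compile_call` method are assumptions for illustration. The argument packing is deliberately left out, because the variable type encoding is not described in this post, so the sketch records the raw arguments instead.

```python
class CodeGenerator:
    def __init__(self):
        self.code = []
        # name -> (returns a value, number of arguments, emitter function)
        self.builtins = {
            "vm_func_41": (False, 4, self.emit_builtin_vm_func_41),
        }

    def emit(self, opcode, *operands):
        # Record the instruction; a real emitter would write encoded bytes.
        self.code.append((opcode, *operands))

    def emit_builtin_vm_func_41(self, args):
        # Simplification: the real emitter packs its arguments first.
        self.emit(0x41, *args)

    def compile_call(self, name, args):
        """Look up a builtin, check its arity and emit its opcode."""
        if name not in self.builtins:
            raise NameError(f"unknown builtin: {name}")
        returns_value, num_args, emitter = self.builtins[name]
        if len(args) != num_args:
            raise TypeError(
                f"{name} takes {num_args} arguments, got {len(args)}"
            )
        emitter(args)
        return returns_value


gen = CodeGenerator()
gen.compile_call("vm_func_41", [1, 1, 1, 1])
print(gen.code)
```

Adding a newly discovered opcode then means adding one dictionary entry and one short emitter method, which keeps the experiment loop quick.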

It can now be called in a quick test program. For testing I use the collided_with_viking function (opcode 0x1a) so that the opcode I am testing is only triggered when a viking touches the turret. Note that the collided_with_viking function takes a single byte operand which I don’t know the use of, but the value 0x01 seems to work. I also use field_32 of the object as a one-time trigger so that the opcode I am testing is only run the first time a viking touches the object.

My test program looks like this:

    function main
    {
        call set_gfx_prog(0x37bc);
        this.field_32 = 0;

        while (this.field_32 != 0xffff) {
            call update_obj();
            call yield();

            if (call collided_with_viking(0x01)) {
                if (this.field_32 == 0) {
                    call vm_func_41(1, 1, 1, 1);
                    this.field_32 = 1;
                }
            }
        }
    }

This results in the following graphical corruption in game:

It appears the opcode is used to display a dialog box, but it is unclear why the graphical corruption is occurring. Looking at some of the disassembled programs from the original game shows that opcode 0x41 is usually followed by the unknown opcodes 0xcb and 0x42, neither of which takes any arguments. Adding builtins for those opcodes to the compiler and re-running results in the game showing the dialog box, waiting for a keypress, and then removing the dialog box. So, opcode 0x41 is show dialog, opcode 0xcb is wait for key, and opcode 0x42 is clear dialog.

Further experimentation with opcode 0x41 shows that its first argument is the index of the string for the dialog. The strings are, somewhat unfortunately for modding, stored in the game binary. The second argument is still unknown, and the third and fourth arguments control the x and y offset, relative to the object, at which the dialog is displayed.

A good example of why this type of experimental reversing can be useful is to look at the opcode 0xcb in IDA:

Although this function is small, and what it does is clear (it modifies some globals), it isn’t obvious what effect that has on the game engine. Looking at the cross references for each of the globals in IDA isn’t particularly helpful either. Each of the globals is used in a number of places, none of which make their use immediately obvious. I could have spent a long time in IDA attempting to figure out that setting those globals tells the game engine to wait for a keypress on the next game loop. Empirical testing allowed me to figure it out very quickly. The downside is that I still don’t actually know the exact purpose of each of the globals in the function.

Interesting Fields

One of the interesting things I discovered by experimenting with the compiler is how objects can reference each other. I already knew from my initial reversing that there was some support for this, but the compiler allowed me to quickly work out more of the details.

Field 0x3c in each object specifies a current target object. Some opcodes, such as the collided_with_viking (0x1a) opcode, will automatically set this field, but it can also be set manually. The vikings are always objects 0 (Baleog), 1 (Erik) and 2 (Olaf). Once a target object is set it can be manipulated via the target fields. Each opcode has variants for modifying the current object field, or the target field. For example opcode 0x59 adds the temporary register to a field in the current object, whereas opcode 0x5b adds the temporary register to a field in the target object.

Other objects can manipulate fields in the viking objects to control their behaviour. For example field 0x32 for the vikings is the amount of pending damage. An object can damage a viking by adding to that field. The vikings themselves are partially implemented by programs in the virtual machine. The next time the viking’s program runs it will check the pending damage field and apply damage accordingly.

Other fields of interest are 0x12 and 0x14, which are the x and y velocity for all objects. Prior to writing the compiler I had thought that objects were moved either by adding or subtracting from their x and y offset fields (0x1e and 0x20 respectively) or possibly using a function type opcode. Instead the game uses a pair of velocity fields. Some objects, such as the vikings, will automatically modify their own velocity. So, for example, if an object sets a viking’s velocity to propel it upwards, then on subsequent game loops the viking will adjust its own velocity to cause it to fall back down again.
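As a hedged illustration of how an object might damage a viking, the Python sketch below builds the instruction sequence using only opcodes that appear in this post: 0x51 (load immediate), 0x56 (store to a field of the current object) and 0x5b (add to a field of the target object). Two things here are assumptions: that field 0x3c holds the target's object index, and that opcode 0x5b takes a single field-offset operand in the same way as opcode 0x59. The function name is hypothetical.

```python
TARGET_FIELD = 0x3C  # current target object (assumed to be an object index)
DAMAGE_FIELD = 0x32  # pending damage field on the viking objects
BALEOG, ERIK, OLAF = 0, 1, 2


def damage_viking(viking_index, amount):
    """Return (opcode, operand) tuples that add `amount` pending damage."""
    return [
        (0x51, viking_index),  # var = viking object index
        (0x56, TARGET_FIELD),  # this.field_3c = var, select the target
        (0x51, amount),        # var = amount of damage
        (0x5B, DAMAGE_FIELD),  # target.field_32 += var (operand form assumed)
    ]


for opcode, operand in damage_viking(ERIK, 1):
    print(f"{opcode:02x} {operand:04x}" if opcode == 0x51
          else f"{opcode:02x} {operand:02x}")
```

The damage is only queued by this sequence. It takes effect the next time the viking's own program runs and checks its pending damage field.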

It seems that there weren’t enough fields in the viking objects to store everything, so each viking’s four inventory slots are stored as globals. An object can test if an inventory slot is empty by testing if the corresponding global is zero, and give an item to a viking by directly assigning to an inventory slot global.

Advanced Game Hacking

To experiment with the virtual machine, and to demonstrate how flexible it is, I wrote a slightly more complex program for the gun turret in the first level. The turret checks which viking has touched it and reacts differently. If Erik touches the turret it gives him items. If Olaf touches it he gets bounced into the air. When Baleog touches it the turret flips around. It looks like this in action:

It’s pretty impressive to see this level of flexibility in a game developed in the early nineties. The source code for this program is included with the compiler in the examples directory.

Future Work

There are still a large number of opcodes, fields and globals whose function I do not yet understand. The compiler has lots of missing features; for example, it currently only supports the == and != comparison operators, and lacks most of the bitwise operators. It’s also a bit cumbersome to use, since it requires separately unpacking the original program chunk from the game data file, patching a new program in, and repacking the data file.

The tools are all open source/public domain, and available on GitHub at: https://github.com/RyanMallon/TheLostVikingsTools. You can also follow me on Twitter @ryiron.