"Another World" Code Review

I spent two weeks reading and reverse engineering further the source code of Another World ("Out Of This World" in North America). I based my work on Gregory Montoir's "binary to C++" initial reverse engineering from the DOS executable.



I was amazed to discover an elegant system based on a virtual machine interpreting bytecode in realtime and generating fullscreen vectorial cinematic in order to produce one of the best game of all time.



All this shipping on a 1.44MB floppy disk and running within 600KB of RAM: Not bad for 1991 ! As usual I cleaned up my notes, it may save a few hours to someone.

But...What source code ?!

The source code of "Another World' was never officially released nor leaked. Some people were so passionate about this groundbreaking game that they reverse engineered the DOS executable.



This was possible partly because the binary was small (20KB). Why so small ? Because ANOTHER.EXE was not the game itself but just a virtual machine:

Hosting bytecode.

Providing system calls.

The bytecode performs all the game logic with its own opcodes but uses syscalls for "heavy" stuff like drawing, playing music, sound and managing assets.



To implement only the virtual machine for the target OS reduced the effort and the game was broadly ported to more than a dozen platforms:



1991 Amiga, Atari ST

1992 Apple IIGS, DOS, SNES, Mega Drive

1993 3DO

2004 GameBoy Advanced

2005 Windows XP, Symbia OS, Windows Mobile

2011 iOS

Every time only the virtual machine had to be compiled to the target OS: The bytecode remained the same !

Architecture

The executable is 20KB. It can be summarized as:







We can see four modules:

Virtual Machine: Maestro of the entire system.

Resource Manager: Loads resources from the floppy disk when the vm request them.

Sound/Music mixer: Makes noises upon request from the vm.

Renderer: Reads and Renders vertices upon request from the vm. Read vertices from the memory segments.

Trivia : The palette memory segment actually contains several palettes, used for nice fading effects.

Upon startup, the executable sets the virtual machine's thread 0 program counter with 0x00 and start interpreting. Everything is commanded by the bytecode after that.

Rendition explained

In the previous drawing we see three framebuffers. Two because Another World implements Double Buffering" in software....and a third one as a clever optimization:



The third framebuffer is used to compose the background of a scene only once and then reuse it frame after frame with a simple memcpy :







Your browser does not support the video tag.

Trivia :

Your browser does not support the video tag.

The double buffering with rendition occuring alternatively on the two front/back buffers.

The background buffer generated once and stored in the upper left buffer. It is then copied at the beginning of each frame.

If the background changes (as an example when the car stops) the background buffer is updated to save even more space and time.

In this video the legendary first level screen of Another World has been slowed down so we can actually see things being drawn. Everything is drawn with polygons and pixigons. Overdraw is very substantial but since this is only generated once it is not so bad.This famous background is made of 981 polygons.In order to visualize the big picture I have slowed down and rendered the three framebuffer + what is seen on screen:We can see very clearly:

If you want to analyze more : Full video.



Another World Virtual Machine

Eric Chahi's webpage explains a lot about how the machine is structured.



In the code on github you can see how every opcode have been implemented. All of them are pretty easy to understand except for the renditions ones. The trick is that the polygon segment source where the vertices should be read is embedded with the opcode id.



Finally a few screenshots from the vm bytecode editor (Called "script editor" by Eric Chahi):









You can see how the label have been lost: setvec 21 nag1 sets the thread 21 instruction counter at "nag1" label offset. In the bytecode we can only see a hardcoded offset.

Opcode cases



In the following drawings we can see the virtual machine calling an opcode that is actually a system call to the resource manager in order to load the four memory segments. This happens typically at the beginning of a game part (The entire game is made of 10 game parts):







In the next drawing the opcode is also a systemcall to the renderer asking to draw and fetch vertices. The rendition opcode are a bit more complex because they contains where to read the vertices from. To set the target framebuffer is an independent opcode altogether:





Note: Whether the render should read vertices from the cinematic polygon segment or the animation segment is encoded with the opcodeId.

Resource Management

Resources are identified by an unique integer id. Upon startup the resource manager opens MEMLIST.BIN and get records as follow:



typedef struct memEntry_s { int bankId; int offset; int size; int unpackedSize; } memEntry_t;

When the vm requests a resourceId, the resource manager:

Locates it by opening the bank file (via bankId).

Skips offset and read size bytes in RAM.

and read bytes in RAM. If size != unpackedSize , the resource has to be unpacked.



A few stats about the compression:





Total # resources: 146 Compressed : 120 Uncompressed : 28 Note: 82% of resources are compressed. Total size (uncompressed) : 1820901 bytes. Total size (compressed) : 1236519 bytes. Note: Overall compression gain is : 32%. Total RT_SOUND unpacked size: 699868 (38% of total unpacked size) packedSize 585052 (47% of floppy space) gain:(16%) Total RT_MUSIC unpacked size: 33344 ( 2% of total unpacked size) packedSize 3540 ( 0% of floppy space) gain:(89%) Total RT_POLY_ANIM unpacked size: 384000 (21% of total unpacked size) packedSize 106676 ( 9% of floppy space) gain:(72%) Total RT_PALETTE unpacked size: 18432 ( 1% of total unpacked size) packedSize 11032 ( 1% of floppy space) gain:(40%) Total RT_BYTECODE unpacked size: 203546 (11% of total unpacked size) packedSize 135948 (11% of floppy space) gain:(33%) Total RT_POLY_CINEMATIC unpacked size: 365960 (20% of total unpacked size) packedSize 291008 (24% of floppy space) gain:(20%) Note: Damn you sound compression rate! Total bank files: 148 Total RT_SOUND files: 103 Total RT_MUSIC files: 3 Total RT_POLY_ANIM files: 12 Total RT_PALETTE files: 9 Total RT_BYTECODE files: 9 Total RT_POLY_CINEMATIC files: 9

I did not spent time reverse engineering the compression algorithm...the fact that sound doesn't compress very well leads me to believe it is entropy sensitive...so maybe a variation of huffman ?





Out of 146 resources: 120 are compressed:

Vector rendition plus compression on top of it was a HUGE win (up to 62% gain !!).

Compression on sound is very inefficient: gain is poor and the total size accounts for 47% of the space on the floppy disc.

Trivia : The introduction (resource 0x1C), 3 minutes long weights only 57,510 bytes once compressed.





Memory Management

Like all games from the 90s no memory is allocated during gameplay. Upon startup the game engine grabs 600KB of memory ( anybody remember DOS 640KB conventional memory here ?). Those 600KB are used as a stack allocator:





Free memory: The memory manager has the capability to unallocate one step back OR free the entire memory. In practice the entire memory is freed at the end of each 10 game parts.



Trivia : Originally the entire 600KB was storing bytecode and vertices. But after two years of generating the backgrounds with polygons/pixigons the game was still far from being done. In order to speed up the development speed Eric Chahi decided to integrate a hack in his beautiful architecture (at a performance cost): The resource manager can load background bitmap from the floppy disk to the background buffer ( void copyToBackgroundBuffer(const uint8 *src); ). Hence 32KB (320x200/2) are reserved at the end of the conventional memory.



Trivia : This hack was exploited for the release of Another World for Windows XP in 2005. All background were hand drawn and loaded directly from hard-drive without using the renderer and its pixigons:







Purist corner

If you are a purist and really want to play the original way, Another World works like a charm in DosBOX:







Or you can run the windows XP version. I recommend to get the Collector's edition since it feature a lot of additional informations, among them the techical notes from Eric Chahi:





One more thing

I worked on the code a lot, making it simpler to understand. You can see an example of how much clearer it is now.



Before:



void Logic::runScripts() { for ( int i = 0; i < 0x40; ++i) { if (_scriptPaused[0][i] == 0) { uint16 n = _scriptSlotsPos[0][i]; if (n != 0xFFFF) { _scriptPtr.pc = _res->_segCode + n; _stackPtr = 0; _scriptHalted = false; debug(DBG_LOGIC, "Logic::runScripts() i=0x%02X n=0x%02X *p=0x%02X", i, n, *_scriptPtr.pc); executeScript(); _scriptSlotsPos[0][i] = _scriptPtr.pc - _res->_segCode; debug(DBG_LOGIC, "Logic::runScripts() i=0x%02X pos=0x%X", i, _scriptSlotsPos[0][i]); if (_stub->_pi.quit) { break; } } } } }

After:



void VirtualMachine::hostFrame() { // Run the Virtual Machine for every active threads (one vm frame). // Inactive threads are marked with a thread instruction pointer set to 0xFFFF (VM_INACTIVE_THREAD). // A thread must feature a break opcode so the interpreter can move to the next thread. for ( int threadId = 0; threadId < VM_NUM_THREADS; threadId++) { if (!vmIsChannelActive[CURR_STATE][threadId]) continue; uint16 pcOffset = threadsData[PC_OFFSET][threadId]; if (pcOffset != VM_INACTIVE_THREAD) { // Set the script pointer to the right location. // script pc is used in executeThread in order // to get the next opcode. _scriptPtr.pc = res->segBytecode + pcOffset; _stackPtr = 0; gotoNextThread = false; debug(DBG_VM, "VirtualMachine::hostFrame() i=0x%02X n=0x%02X *p=0x%02X", threadId, n, *_scriptPtr.pc); executeThread(); //Since .pc is going to be modified by this next loop iteration, we need to save it. threadsData[PC_OFFSET][threadId] = _scriptPtr.pc - res->segBytecode; debug(DBG_VM, "VirtualMachine::hostFrame() i=0x%02X pos=0x%X", threadId, threadsData[0][threadId]); if (sys->input.quit) { break; } } } }



I used:

MACROS to avoid cryptic hardcoded values.

Variables renaming.

A LOT of comments.



Here is the "human readable" source code :) ! Happy hacking.

Edit (video presentation)

Jeff Somers submitted a link to a fantastic video from GDC Vault in which Eric Chahi talks about the Genesis of Another World. Thanks a lot Jeff :) !

@