
I’ve been writing about the early development of Warcraft, but a recent blog post I read prompted me to start scribbling furiously, and the result is this three-part, twenty-plus page article about the development of StarCraft, along with my thoughts about writing more reliable game code. I’ll be posting the latter parts over the next several days.

This post: Why StarCraft crashed frequently during development

Part 2: How we could have fixed the most common causes

Part 3: Explaining the implementation details of the fix

The beginnings of StarCraft

During the development of StarCraft, a two-and-a-half-year slog with over a year of crunch time prior to launch, the game was as buggy as a termite nest. While its predecessors (Warcraft I and II) were far more reliable games than their industry peers, StarCraft crashed frequently enough that play-testing was difficult right up until release, and the game continued to require ongoing patching efforts post-launch.

Why? There were sooooo many reasons.

Orcs in space

StarCraft was originally envisioned as a game with modest goals that could fit into a one-year development cycle so that it could be released for Christmas, 1996.

The project leadership consisted of the same folks who had started Shattered Nations, a turn-based strategy game along the lines of X-COM that Blizzard announced in May 1995 but canceled some months later.

The team members were regrouped to build something that could reach market quickly so Blizzard wouldn’t have a long gap between game launches.

Q4 1994 – Warcraft

Q4 1995 – Warcraft II

Q4 1996 – planned ship date for StarCraft

Q2 1998 – actual ship date for StarCraft

The decision to rush the game’s development seems ludicrous in retrospect, but Allen Adham, the company’s president, was under pressure to grow revenue. While Blizzard’s early games had been far more successful than expected, that just raised expectations for future growth.

Given a short timeframe and limited staff, the StarCraft team’s goal was to implement a modest game — something that could best be described as “Orcs in space”. A picture from around the time of the E3 game show in Q2 1996 shows the path the game team originally chose:

But a higher-priority project overshadowed StarCraft and stole its developers one by one. Diablo, a role-playing game being developed by Condor Studios in Redwood City, California, was in need of additional help. Condor, a company formed by Dave Brevik along with Max Schaefer and his brother Erich Schaefer, was given a budget of only $1.2 million, ridiculously small even in those days.

The Condor team had no hope of making the game they aspired to build, but they did such ground-breaking work in developing something fun that it made sense for Blizzard to acquire Condor, rename it Blizzard North, and start pouring in the money and staff the game really deserved.

Initially Collin Murray, a programmer on StarCraft, and I flew to Redwood City to help, while other developers at Blizzard “HQ” in Irvine, California, worked on network “providers” for battle.net, modem and LAN games as well as the user-interface screens (known as “glue screens” at Blizzard) that performed character creation, game joining, and other meta-game functions.

As Diablo grew in scope, eventually everyone at Blizzard HQ (artists, programmers, designers, sound engineers, testers) worked on the game, until StarCraft had no one left working on the project. Even the project lead was co-opted to finish the game installer that I had half-written but was too busy to complete.

After the launch of Diablo at the end of 1996, StarCraft development was restarted and everyone got a chance to see where the game was headed. It wasn’t pretty. The game was dated, and not even remotely impressive, particularly compared to projects like Dominion Storm, which looked great in demos at E3 six months before.

The massive success of Diablo reset expectations about what Blizzard should strive for: StarCraft became the game that defined Blizzard’s strategy of not releasing games until they were ready. But a lot of pain had to occur along the way to prove out this strategy.

Something to prove

With everyone looking critically at StarCraft, it was clear that the project needed to be vastly more ambitious than our previous ground-breaking efforts in defining the future of the real-time strategy (RTS) genre with the first two Warcraft games.

At the time of the StarCraft reboot, according to Johnny Wilson, then Editor in Chief of Computer Gaming World, the largest-distribution gaming magazine of that time, there were over eighty (80!!) RTS games in development. With so many competitors on our heels, including Westwood Studios, the company that originated the modern RTS play-style, we needed to make something that kicked ass.

And we were no longer an underdog; with the successes of Warcraft and Diablo continuing to fill the news we sure wouldn’t be getting any slack from players or the gaming press. In the gaming world you’re only ever as good as your last game. We needed to go far beyond what we’d done previously, and that required taking risks.

New faces

Warcraft II had only six core programmers and two support programmers; that was too few for the larger scope of StarCraft, so the dev team grew to include a cadre of new and untested game programmers who needed to learn how to write game code without much mentoring.

Our programming leadership was weak: we hadn’t yet learned how essential it is to provide guidance to less experienced developers early in the project so they learn much-needed lessons before the game launches, so it was very much a sink-or-swim proposition for new Padawans. A big part of the problem was just how thin we were on the ground — every programmer was coding like mad to meet goals, with no time for reviews, code-audits, or training.

And not only were there inexperienced junior members on the team, the leader of the StarCraft programming effort had never architected a shipping game engine. Bob Fitch had been programming games for several years with great results but his previous efforts were game ports, where he worked within an existing engine, and feature programming for Warcraft I and II, which didn’t require large-scale engine design. And while he had experience as the tech lead for Shattered Nations, that project was canceled, therefore no validation of its architectural decisions was possible.

The team was incredibly invested in the project, and put in unheard-of effort to complete it while sacrificing personal health and family life. I’ve never been on a project where every member worked so fiercely. But several key coding decisions, which I’ll detail presently, would haunt the programming team for the remainder of the project.

Some things have changed

After spending months working to launch Diablo, and further months of cleanup effort and patching afterwards, I returned to help with the reboot of StarCraft. I wasn’t looking forward to diving into another bug-fest, but that’s exactly what happened.

I thought it would be easy to jump back into the project because I knew the Warcraft code so well — I’d literally worked on every component. I was instead terrified to discover that many components of the engine had been thrown away and partially rewritten.

The game’s unit classes were in the process of being rewritten from scratch, and the unit dispatcher had been thrown out. The dispatcher is the mechanism I created to ensure that each game unit gets time to plan what it wants to do. Each unit periodically asks: “what should I do now that I finished my current behavior?”, “should I re-evaluate the path to get where I’m going?”, “is there a better unit to attack instead of the one that I’m targeting now?”, “did the user give me a new command?”, “I’m dead, how do I clean up after myself?”, and so forth.
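For readers who haven’t seen one, here is a rough sketch of what a dispatcher of this kind can look like. It is not the Warcraft code: the names (Unit, Dispatcher, think) and the fixed eight-tick re-planning interval are assumptions made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical unit: every name here is illustrative, not from the Warcraft source.
struct Unit {
    bool     alive = true;
    uint32_t nextThinkTick = 0;   // game tick at which the unit should plan again
    int      thinkCount = 0;      // how many times the unit has planned so far

    // Decide what to do next; returns how many ticks to wait before re-planning.
    uint32_t think() {
        ++thinkCount;
        return 8;  // assumed fixed re-evaluation interval
    }
};

// Gives each live unit a chance to plan once its timer has expired.
struct Dispatcher {
    std::vector<Unit*> units;

    void update(uint32_t tick) {
        for (Unit* u : units) {
            assert(u != nullptr);
            if (!u->alive || tick < u->nextThinkTick)
                continue;
            u->nextThinkTick = tick + u->think();
        }
    }
};

// Runs the dispatcher for `ticks` game ticks and reports how often `u` planned.
int simulate(Unit& u, uint32_t ticks) {
    Dispatcher d;
    d.units.push_back(&u);
    for (uint32_t t = 0; t < ticks; ++t)
        d.update(t);
    return u.thinkCount;
}
```

The point of centralizing this is that every unit is guaranteed its turn to re-plan, and there is exactly one place to look when a unit stops responding.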

There are good reasons code needs to be rewritten, but excising old code comes with risks as well. Joel Spolsky said it most eloquently in Things You Should Never Do, Part I:

It’s important to remember that when you start from scratch there is absolutely no reason to believe that you are going to do a better job than you did the first time. First of all, you probably don’t even have the same programming team that worked on version one, so you don’t actually have “more experience”. You’re just going to make most of the old mistakes again, and introduce some new problems that weren’t in the original version.

The Warcraft engine had taken months of programming effort to get right, and while it needed rework for new gameplay features, a fresh programming team was now going to spend a great deal of time relearning lessons about how and why the engine was architected the way it was in the first place.

Game engine architecture

I wrote the original Warcraft engine for Microsoft DOS in C using the Watcom Compiler. With the switch to releasing on Microsoft Windows, Bob chose to use the Visual Studio compiler and re-architected the game engine in C++. Both were reasonable choices, except that at that point few developers on the team had experience with the language and, more importantly, with its many pitfalls.

Though C++ has strengths, it is easy to misuse. As Bjarne Stroustrup, the language’s creator, so famously said: “C makes it easy to shoot yourself in the foot; C++ makes it harder, but when you do it blows your whole leg off.”

History tells us that programmers feel compelled to try every feature of their new language during the first project, and so it was with class inheritance in StarCraft. Experienced programmers will shudder when seeing the inheritance chain that was designed for the game’s units:

CUnit < CDoodad < CFlingy < CThingy

CThingy objects were sprites that could appear anywhere on the game map, but didn’t move or have behaviors, while CFlingys were used for creating particles; when an explosion occurred several of them would spin off in random directions. CDoodad — after 14 years I think this is the class name — was an uninstantiated class that nevertheless had important behaviors required for proper functioning of derived classes. And CUnit was layered on top of that. The behavior of units was scattered all throughout these various modules, and it required an understanding of each class to be able to accomplish anything.

And beyond the horror of the class hierarchy, the CUnit class itself was an unholy mess defined across multiple header files:

class CUnit ... {
    #include "header_1.h"
    #include "header_2.h"
    #include "header_3.h"
    #include "header_4.h"
};

Each of those headers was several hundred lines, leading to an overall class definition that could at best be called amusing.

It wasn’t until many years later that the mantra “favor composition over inheritance” gained credence among programmer-kind, but those who worked on StarCraft learned the hard way much earlier.
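To make the contrast concrete, here is a minimal sketch of the compositional alternative, assuming C++17. The component names (Sprite, Motion, Behavior, GameObject) are hypothetical and have nothing to do with the actual StarCraft source; the idea is only that an object owns the parts it needs rather than inheriting every ancestor’s behavior.

```cpp
#include <cassert>
#include <optional>

// Hypothetical components; none of these names come from the StarCraft source.
struct Sprite   { int x = 0, y = 0; };      // something drawn on the map
struct Motion   { int dx = 0, dy = 0; };    // velocity, for things that move
struct Behavior { int hitPoints = 0; };     // state, for things that act

// A game object owns only the parts it needs instead of inheriting all of them.
struct GameObject {
    Sprite                  sprite;
    std::optional<Motion>   motion;     // absent for static scenery
    std::optional<Behavior> behavior;   // absent for scenery and particles

    // Advance one tick: only objects that have a Motion component move.
    void step() {
        if (motion) {
            sprite.x += motion->dx;
            sprite.y += motion->dy;
        }
    }
};

GameObject makeScenery(int x, int y) {
    GameObject o;
    o.sprite = {x, y};
    return o;
}

GameObject makeParticle(int x, int y, int dx, int dy) {
    GameObject o;
    o.sprite = {x, y};
    o.motion = Motion{dx, dy};
    return o;
}

GameObject makeUnit(int x, int y, int hitPoints) {
    GameObject o;
    o.sprite = {x, y};
    o.motion = Motion{0, 0};
    o.behavior = Behavior{hitPoints};
    return o;
}
```

With this shape, understanding how a particle moves means reading Motion, not four layers of base classes.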

We’re only two months from launch

Given the game’s troubled early history, the development team was pressured to finish up after the reboot, and so schedules were bandied about that showed the game could be launched in two months.

Given the number of game units and behaviors that needed to be added, the changes necessary to switch from top-down to isometric artwork, a completely new map editor, and the addition of Internet play over battle.net, it was inconceivable that the game actually could ship in that time, even assuming that the art team, designers, sound engineers, game-balancers and testers could finish their end of the bargain. But the programming team continually worked towards shipping in only two months for the next fourteen months!

The entire team worked long hours, with Bob working stretches of 40 hours, 42 hours, even 48 hours programming. As I recall no one else attempted these sorts of masochistic endeavors, though everyone was putting in massive, ridiculous hours.

My experiences developing Warcraft, with frequent all-nighters coding, and later Diablo, where I coded fourteen-plus hour days seven days a week for weeks at a time, taught me that there wasn’t any point in all-nighters. Any code submissions [ha! what an appropriate word] written after a certain point in the evening would only be regretted and rewritten in the clear light of following days.

Working these long hours made people groggy, and that’s bad when trying to accomplish knowledge-based tasks requiring an excess of creativity, so there should have been no surprises about the number of mistakes, misfeatures and outright bugs.

Incidentally, these sorts of crazy hours weren’t mandated — it was just the kind of stuff we did because we wanted to make great games. In retrospect it was foolish — we could have done better work with more reasonable efforts.

One of my proudest accomplishments was to ship four Guild Wars campaigns in a two-year window without leading the development team down that dark path.

The most common cause of StarCraft game crashes

While I implemented some important features in StarCraft, including fog-of-war, line-of-sight, flying unit pathing-repulsion, voice-chat, AI reinforcement points, and others, my primary job gravitated to fixing bugs.

Wait: voice-chat! In 1998?!? Yeah: I had it all working in December 1997. I used a 3rd-party voice-to-phoneme compressor, and wrote the code to send the phonemes across the network, decompress them, and then play them back on the other seven players’ computers.

But every single sound-card in our offices required a driver upgrade to make it work, if the sound card was even capable of full-duplex sound (simultaneous recording and playback of sounds), so I regretfully made the recommendation to scrap the idea. The tech-support burden would have been so high that we would have spent more money on game support than we would have made selling the game.

So anyway, I fixed lots of bugs. Some of my own, sure, but mostly the elusive bugs written by other tired programmers. One of the best compliments I’ve received came just a few months ago, when Brian Fitzgerald, one of the two best programmers I’ve had occasion to work with, mentioned a code-review of StarCraft; they were blown away by how many changes and fixes I had made over the entire code-base. At least I got some credit for the effort, if only well after the fact!

Given all the issues working against the team, you might think it was hard to identify a single large source of bugs, but based on my experiences the biggest problems in StarCraft related to the use of doubly-linked lists.

Linked lists were used extensively in the engine to track units with shared behavior. With twice the number of units of its predecessor (StarCraft had a maximum of 1600, up from 800 in Warcraft II), it became essential to optimize the search for units of specific types by keeping them linked together in lists.

Recalling from distant memory, there were lists for each player’s units and buildings, lists for each player’s “power-generating” buildings, a list for each Carrier’s fighter drones, and many many others.

All of these lists were doubly-linked to make it possible to add and remove elements in constant time, O(1), without having to traverse the list looking for the element to remove, which would be O(N).

Unfortunately, each list was “hand-maintained” — there were no shared functions to link and unlink elements from these lists; programmers just manually inlined the link and unlink behavior anywhere it was required. And hand-rolled code is far more error-prone than simply using a routine that’s already been debugged.
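As an illustration of what “shared functions” could have meant here, the sketch below puts the link and unlink logic in one place so it only has to be debugged once. This is not Storm’s API and not the StarCraft code; the Link and List types are hypothetical, and the circular-sentinel layout is just one common way to get O(1) unlinking.

```cpp
#include <cassert>

// Hypothetical intrusive link; this is not Storm's API, only an illustration.
struct Link {
    Link* prev = this;   // an unlinked node points at itself
    Link* next = this;

    Link() = default;
    Link(const Link&) = delete;             // copying would corrupt the list
    Link& operator=(const Link&) = delete;

    bool isLinked() const { return next != this; }

    // Remove this node from whatever list it is in: O(1), no traversal.
    // Safe to call on a node that is already unlinked.
    void unlink() {
        prev->next = next;
        next->prev = prev;
        prev = next = this;
    }

    // Insert this node immediately before `pos`, unlinking it first if needed.
    void insertBefore(Link& pos) {
        unlink();
        prev = pos.prev;
        next = &pos;
        pos.prev->next = this;
        pos.prev = this;
    }
};

// A list is a sentinel node; an empty list's sentinel points at itself.
struct List {
    Link head;

    void pushBack(Link& node) { node.insertBefore(head); }
    bool empty() const { return !head.isLinked(); }

    int size() const {
        int n = 0;
        for (const Link* p = head.next; p != &head; p = p->next)
            ++n;
        return n;
    }
};
```

Because unlink() never needs to know which list the node is in, a whole class of “unlinked from the wrong list” mistakes has nowhere to hide.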

Some of the link fields were shared among several lists, so it was necessary to know exactly which list an object was linked into in order to safely unlink. And some link fields were even stored in C unions with other data types to keep memory utilization to a minimum.

So the game would blow up all the time. All the time.

But why did you do it that way?

Tragically, there was no need for these linked-list problems to exist. Mike O’Brien, who, along with Jeff Strain, cofounded ArenaNet with me, wrote a library called Storm.DLL, which shipped with Diablo. Among its many features, Storm contained an excellent implementation of doubly-linked lists using templates in C++.

During the initial development of StarCraft, that library was used. But early in the development the team ripped out the code and hand-rolled the linked-lists, specifically to make writing save-game files easier.

Let me talk about save games for a second to make this all clearer.

Save games

Many games that I played before developing Warcraft had crappy save-game functionality. Gamers who played any game created by Origin will remember how looooooong it took to write save-game files. I mean sure, they were written by slow microprocessors onto hard-drives that — by today’s standards — are as different as tricycles and race cars. But there was no reason for them to suck, and I was determined that Warcraft wouldn’t have those problems.

So Warcraft did some tricks to enable it to write large memory blocks to disk in one chunk instead of meandering through memory writing a bit here and there. The entire unit array (600 units times a few hundred bytes per unit) could be written to disk in one chunk. And all non-pointer-based global variables could similarly be written in one chunk, as could each of the game-terrain and fog-of-war maps.

But oddly enough, this ability to write the units to disk in one chunk wasn’t essential to the speed of writing save game files, though it did drastically simplify the code. But it worked primarily because Warcraft units didn’t contain “pointer” data.
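The one-chunk trick can be sketched roughly as follows. The UnitRecord fields are invented for illustration (only the 600-unit count comes from the text above), and I’ve used a generic stream so the example doesn’t depend on any particular file API. The key assumption is that the record holds no pointers, so other units are referred to by array index.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <type_traits>

// Hypothetical pointer-free unit record; the field names are illustrative only.
struct UnitRecord {
    uint16_t type;
    uint16_t hitPoints;
    int16_t  x, y;
    uint16_t targetIndex;   // refers to another unit by array index, not by pointer
};
static_assert(std::is_trivially_copyable<UnitRecord>::value,
              "block writes require plain data");

constexpr std::size_t kMaxUnits = 600;

// Write the whole array with a single call.
bool saveUnits(std::ostream& out, const UnitRecord (&units)[kMaxUnits]) {
    out.write(reinterpret_cast<const char*>(units), sizeof(units));
    return out.good();
}

// Read the whole array back with a single call.
bool loadUnits(std::istream& in, UnitRecord (&units)[kMaxUnits]) {
    in.read(reinterpret_cast<char*>(units), sizeof(units));
    return in.gcount() == static_cast<std::streamsize>(sizeof(units));
}

// Round-trips the array through an in-memory stream standing in for a file.
bool roundTrip(const UnitRecord (&src)[kMaxUnits], UnitRecord (&dst)[kMaxUnits]) {
    std::stringstream file(std::ios::in | std::ios::out | std::ios::binary);
    return saveUnits(file, src) && loadUnits(file, dst);
}
```

A raw dump like this ties the file to one compiler’s struct layout and one byte order, which is a trade-off worth knowing about before copying the idea.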

StarCraft units, which as mentioned previously contained scads of pointers in the fields for linked lists, were an entirely different beast. It was necessary to fix up all the link pointers (taking special care of unioned pointer fields) so that all 1600 units could be written at once. And then un-fix-up the link pointers to keep playing. Yuck.
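For the curious, the dance looks something like the sketch below: turn every pointer into an array index, write the block, then turn the indices back into pointers. The Unit layout and function names are hypothetical, not the real CUnit, and the real code also had to cope with link fields shared between lists, which this sketch ignores.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct Unit;

// A link slot that holds a pointer during play and an index during saving.
// Only the member written most recently is ever read.
union LinkField {
    Unit*          ptr;     // valid while the game is running
    std::uintptr_t index;   // valid only while the array is being written
};

// Hypothetical unit with list links; this is not the actual CUnit layout.
struct Unit {
    LinkField prev;
    LinkField next;
    int       hitPoints;
};

constexpr std::uintptr_t kNullIndex = ~std::uintptr_t(0);   // stands in for nullptr

// Pointer -> index, relative to the start of the unit array.
inline void toIndex(const Unit* base, LinkField& f) {
    Unit* p = f.ptr;
    f.index = p ? static_cast<std::uintptr_t>(p - base) : kNullIndex;
}

// Index -> pointer, relative to the start of the unit array.
inline void toPointer(Unit* base, LinkField& f) {
    std::uintptr_t i = f.index;
    f.ptr = (i == kNullIndex) ? nullptr : base + i;
}

// Fix up: make every link position-independent so the array can be saved whole.
void fixupForSave(Unit* units, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        toIndex(units, units[i].prev);
        toIndex(units, units[i].next);
    }
}

// Undo the fix-up: restore real pointers so the game can keep running.
void unfixupAfterSave(Unit* units, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        toPointer(units, units[i].prev);
        toPointer(units, units[i].next);
    }
}
```

Every link field added to the unit has to be remembered in both loops, and forgetting one produces a save file that loads fine and crashes later.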

Change back!

So after fixing many, many linked list bugs, I argued vehemently that we should switch back to using Storm’s linked lists, even if that made the save-game code more complicated. When I say “argued vehemently”, I should mention that was more or less the only way we knew how to argue at Blizzard — with our youthful brashness and arrogant hubris, there was no argument that wasn’t vehement unless it was what was for lunch that day, which no one much wanted to decide.

I didn’t win that argument. Since we were only “two months” from shipping, changes that would have improved the engine were regularly passed over in favor of band-aiding existing but sub-optimal solutions. That led to many months of suffering, so much that it has affected my approach to coding (for the better) ever since, which is what I’ll discuss in part two of this article.

More Band-Aids: path-finding in StarCraft

I wanted to mention one more example of patching over bugs instead of fixing the underlying problem: when StarCraft switched from top-down artwork to isometric artwork, the background tile-graphics rendering engine, which dated back to code I had written in 1993/4, was left unchanged.

Rendering isometric-looking tiles using a square tile engine isn’t hard, though there are difficulties in getting things like map-editors to work properly because laying down one map tile on another requires many “edge fixups” since the map editor is trying to place diagonally-shaped images drawn in square tiles.

While rendering isn’t so bad, isometric path-finding on square tiles was very difficult. Instead of large (32×32 pixel) diagonal tiles that were either passable or impassable, the map had to be broken into tiny 8×8 pixel tiles — multiplying the amount of path-searching by a factor of 16 as well as creating difficulties for larger units that couldn’t squeeze down a narrow path.
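The factor of 16 falls straight out of the tile sizes: a 32-pixel tile is four 8-pixel cells wide and four tall. The small sketch below just spells out that arithmetic; the function names are mine, and the map dimensions are whatever the caller supplies rather than any real map size.

```cpp
#include <cassert>

// Tile sizes in pixels, as given in the text.
constexpr int kArtTilePixels  = 32;
constexpr int kPathTilePixels = 8;

// Number of path-finding cells that cover one art tile.
constexpr int pathCellsPerArtTile() {
    return (kArtTilePixels / kPathTilePixels) * (kArtTilePixels / kPathTilePixels);
}

// Total path-finding cells for a map measured in art tiles.
// The map dimensions are caller-supplied; no particular map size is implied.
constexpr long pathCellCount(int mapWidthTiles, int mapHeightTiles) {
    return static_cast<long>(mapWidthTiles) * mapHeightTiles * pathCellsPerArtTile();
}

static_assert(pathCellsPerArtTile() == 16, "4 x 4 cells per 32-pixel tile");
```

For a hypothetical 128×128-tile map, that is 16,384 cells to search at the coarse resolution versus 262,144 at the fine one.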

Had Brian Fitzgerald not been a stellar programmer, the path-finding problem would have prevented the game from launching indefinitely. As it was, pathing was one of the problems that was only finalized at the end of the project. I plan to write more about path-finding in StarCraft because there are lots of interesting technical and design bits.

End of part 1

So you’ve heard me whine a bit about how difficult it was to make StarCraft, largely through poor choices made at every level of the company about the game’s direction, technology and design.

We were fortunate to be a foolhardy but valiant crew, and our perspicacity carried the day. In the end we buckled down and stopped adding features long enough to ship the game, and players never saw the horror show underneath. Perhaps that’s another benefit of compiled languages over scripted ones like JavaScript — end users never see the train wreck!

In part two of this article I’m going to get even more technical and talk about why most programmers get linked lists wrong, then propose an alternative solution that was used successfully for Diablo, battle.net and Guild Wars.

And even if you don’t use linked-lists, the same solutions carry over to more complex data structures like hash tables, B-trees and priority queues. Moreover, I believe the underlying ideas generalize well to all programming. But let’s not get ahead of ourselves; that’s another article.

Thanks for reading this far, and sorry I haven’t yet discovered how to write concisely.