Deadlocks, livelocks, and race conditions, oh my!

The biggest challenge facing game companies right now is writing multithreaded code that fully supports the multiple-core architectures of the latest PCs and the next-generation game consoles. Last year, I wrote an article that discussed this issue and suggested that, for a while at least, the lowest common denominator of development might prevail. As evidence for my argument, I quoted one of the most vocal detractors of the new programming model: Gabe Newell, founder and president of Valve Software, Inc.

Well, things can change quite a bit in a year! On the first day of November, a group of technology journalists were invited to a special press event held at Valve's headquarters in Bellevue, Washington. There, we were treated to an unveiling of the company's new programming strategy, which has been completely realigned around supporting multiple CPU cores. Valve plans to do more than merely support them: it wants to wring the absolute maximum out of the extra power, delivering not just higher frame rates but a more immersive gaming experience.

Programming applications and games that support multiple CPUs (either multicore CPUs on the same die or discrete chips) requires learning the art of multithreading. The programmer divides an application into separate threads, which the operating system automatically allocates among available CPUs. With multiple threads executing at the same time, potential problems crop up that are not normally an issue in single-threaded programming.



Imagine each of these weapons is a CPU core.

Now imagine trying to wield them all at the same time.

A deadlock occurs when one thread is waiting on a second thread to finish before it can proceed, while the second thread is waiting on the first in exactly the same way; neither can ever advance. In a livelock, both threads continue to execute but make no progress toward their goals, because each keeps undoing or reacting to the other's work. In a race condition, memory that the first thread is working on gets modified by the second at an unpredictable moment, so the result depends on which thread happens to run first.

All these problems can be dealt with by an experienced programmer, but they are difficult to debug and can often be frustrating for someone who isn't used to multithreading. Valve's challenge was to create a set of frameworks that would allow junior and specialist programmers (what Gabe Newell calls "leaf coders") not only to be productive but to make maximum use of multiple cores. Relying on existing tools that came with the operating system and off-the-shelf multithreading frameworks simply wasn't good enough. Valve would have to create its own toolset from scratch.

Valve's hybrid threading model



Valve programmer Steve Bond demonstrates the results of

Valve's hybrid threading.

The programmers at Valve considered three different models to solve their problem. The first was called "coarse threading" and was the easiest to implement. Many companies are already using coarse threading to improve their games for multiple-core systems. The idea is to put whole subsystems on separate cores; for example, graphics rendering on one, AI on another, sound on a third, and so on. The problem with this approach is that some subsystems demand far less CPU time than others. Giving sound, for example, a whole core to itself would often leave up to 80 percent of that core sitting unused.

The second approach was fine-grained threading, which separates tasks into many discrete elements and then distributes them among as many cores as are available. For example, a loop that updates the position of 1,000 objects based on their velocity can be divided among, say, four cores, with each core handling 250 objects apiece. The drawback with this approach is that not all tasks divide neatly into discrete components that can operate independently. Also, if some entries in the list take longer to update than others, it becomes harder to balance the work evenly across multiple cores. Finally, memory bandwidth quickly becomes a limitation with this method. For certain specialized tasks, such as compiling, fine-grained threading works really well. Valve has already implemented a system whereby every computer in its offices automatically acts as a compiler node. When the programmers were getting ready to demonstrate their results on the conference room computer with the big screen, they had to quickly deactivate this feature first!

The approach that Valve finally chose was a combination of the coarse and fine-grained, with some extra enhancements thrown in. Some systems were split on multiple cores using coarse threading. Other tasks, such as VVIS (the calculations of what objects are visible to the player from their point of view) were split up using fine-grained threading. Lastly, whenever part of a core is idle, work that can be precalculated without lagging or adversely affecting the game experience (such as AI calculations or pathfinding) was queued up to be delivered to the game engine later.

Valve's approach was the most difficult of all possible methods for utilizing multiple cores, but if they could pull it off, it would deliver the maximum possible benefits on systems like Intel's new quad-core Kentsfield chips.

To deliver this hybrid threading platform, Valve made use of expert programmers like Tom Leonard, who was writing multithreaded code as early as 1991 when he worked on C++ development tools for companies like Zortech and Symantec. Tom walked us through the thought process behind Valve's new threading model.