HAVE computers stopped getting faster? If you looked only at the clock speeds of microprocessor chips, you might well think so. A modern PC typically has a processor running at 3.0GHz (3 billion clock ticks per second), little changed from a PC made three or four years ago. Clock speeds, which used to double every couple of years, have stopped increasing because as chips are clocked at higher speeds they become difficult to cool and much less energy-efficient. Instead, extra oomph has been added in recent years by packaging multiple processing engines, or “cores”, inside a single chip. Modern PCs and laptops typically have dual-core processors (such as the Intel Core i3) and some have quad-core or even six-core chips.

You might expect a six-core machine to be six times faster than a machine with a single-core microprocessor. Yet for most tasks it is not. That is because nearly all software is still designed to run on a single-core chip; in other words, it is designed to do only one thing at a time. A few pieces of specialist software can take advantage of multiple cores: image-processing software, for example, may divide up a difficult task and farm it out to multiple cores to get it done faster, combining the results when each core has finished its work. And the computer's operating system may be able to assign different tasks to different cores, to ensure that, for example, video playback in a web browser does not slow down while a hard disk is scanned for viruses. But your spellchecker will not run six times faster on a six-core machine unless it has been specially written to share out the work between the available cores, so that they can tackle the job in parallel.
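To make the spellchecker example concrete, here is a minimal sketch in Python of what "sharing out the work" might look like: the word list is cut into chunks, each chunk is checked by a separate worker, and the results are combined at the end. The tiny dictionary and the function names (`misspelled`, `split`, `parallel_spellcheck`) are illustrative assumptions, not taken from any real spellchecker.

```python
import os
# ProcessPoolExecutor uses separate processes, so chunks can run on different
# cores. ThreadPoolExecutor has the same interface and can be swapped in.
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

# Hypothetical toy dictionary; a real spellchecker would load a full word list.
KNOWN_WORDS = frozenset(
    {"the", "quick", "brown", "fox", "jumps", "over", "lazy", "dog"}
)


def misspelled(words):
    """Sequential check of one chunk: return the words not in the dictionary."""
    return [w for w in words if w.lower() not in KNOWN_WORDS]


def split(words, parts):
    """Divide the word list into at most `parts` contiguous chunks."""
    size = max(1, -(-len(words) // parts))  # ceiling division
    return [words[i:i + size] for i in range(0, len(words), size)]


def parallel_spellcheck(words, workers=None, executor_cls=ProcessPoolExecutor):
    """Farm the chunks out to a pool of workers, then combine the results."""
    workers = workers or os.cpu_count() or 1
    chunks = split(words, workers)
    with executor_cls(max_workers=workers) as pool:
        results = pool.map(misspelled, chunks)  # one task per chunk, in order
        return [word for chunk_result in results for word in chunk_result]
```

Calling `parallel_spellcheck(text.split())` from a script's main section would use one process per available core by default. For a job this small the cost of starting the workers outweighs any gain; the approach only pays off when each chunk involves substantial work.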

“We're not going to have faster processors,” says Katherine Yelick, a computer scientist at the Lawrence Berkeley National Laboratory in California. Instead, making software run faster in the future will mean using parallel-programming techniques. This will be a huge shift. At present, mainstream programs written for PCs (such as word-processor software), and specialist programs written for supercomputers with thousands of processors (such as climate-modelling or protein-folding software), are written using entirely different tools, languages and techniques. After all, software written for one sort of machine is not expected to work on the other.

But the distinction between the two is slowly breaking down. Intel, the world's biggest chipmaker, has demonstrated a 48-core processor, and chips with hundreds of cores seem likely within a few years. What was once an obscure academic problem—finding ways to make it easy to write software that can take full advantage of the power of parallel processing—is rapidly becoming a problem for the whole industry. Unless it is solved, notes David Smith of Gartner, a market-research firm, there will be a growing divide between computers' theoretical and actual performance.

Not the road to riches

Surely this problem will be solved by some bright young entrepreneur who will devise a new parallel-programming language and make a fortune in the process? Alas, designing languages does not seem to provide a path to fame and riches. Even the inventors of successful languages are mostly unknown within the industry, let alone outside it. Can you name the inventors of COBOL, C, Java or Python? (The answers are Grace Murray Hopper, Dennis Ritchie, James Gosling and Guido van Rossum.) “There are thousands of programming languages, and only a handful are used by more than their inventors,” notes David Patterson, a computer scientist at the University of California at Berkeley.

Parallel-programming languages in particular tend to languish in academic obscurity. There are dozens of them—by one count, more than a hundred. None is popular. The reasons for this neglect are simple and longstanding, says Craig Mundie, chief research and strategy officer at Microsoft. He spent most of the 1980s and early 1990s at Alliant, a supercomputing company he co-founded that planned to convert ordinary (or “sequential”) software into parallel software automatically. But there was little demand, because most existing programs do not lend themselves to running in parallel. Making efficient parallel software means starting from scratch.

Dr Patterson likens parallel programming to having ten reporters each write a paragraph of a news story. They might get the story written ten times faster than any one of them could on his own, but will it make much sense? Throwing 100 or 1,000 reporters at the same problem does not help—instead, the task becomes even more difficult, because each reporter must co-ordinate his actions with the others. In practice, that may involve a lot of waiting around for others to complete their subtasks. And what happens if two writers both end up waiting for each other? In the world of parallel programming, the resulting stoppage is known as “deadlock”.
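The two-writers scenario can be sketched in a few lines of Python. In this illustration, which is an assumption-laden toy rather than a description of any real system, two writers each need two shared resources, here called `notes` and `draft`. If each grabs a different one first, both end up waiting for the other. A timeout stands in for "waiting for ever" so that the example terminates, and a barrier is used only to force the unlucky timing. One common remedy, also shown, is to have every writer acquire the locks in the same order.

```python
import threading


def run_writers(consistent_order, timeout=0.2):
    """Return {writer: True if it obtained both locks, False if it gave up}."""
    notes, draft = threading.Lock(), threading.Lock()
    in_step = threading.Barrier(2)  # used only to force the unlucky timing
    finished = {}

    def writer(name, first, second):
        with first:
            if not consistent_order:
                in_step.wait()  # both writers now hold their first lock
            # A real deadlock would block here indefinitely; the timeout
            # lets this demonstration give up and report the failure.
            got_second = second.acquire(timeout=timeout)
            if got_second:
                second.release()
            if not consistent_order:
                in_step.wait()  # keep holding `first` until both have tried
        finished[name] = got_second

    # Writer A always takes notes, then draft. Writer B either follows the
    # same order (safe) or takes them in the opposite order (deadlock-prone).
    order_for_b = (notes, draft) if consistent_order else (draft, notes)
    threads = [
        threading.Thread(target=writer, args=("A", notes, draft)),
        threading.Thread(target=writer, args=("B", *order_for_b)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return finished
```

With `run_writers(consistent_order=False)` neither writer obtains its second lock; with `run_writers(consistent_order=True)` one writer simply waits its turn and both finish.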

Another obstacle to parallel programming is cultural. “Our conscious minds tend to think in terms of serial steps,” says Steve Scott, chief technology officer at Cray, a storied maker of supercomputers. (A Cray machine was, until November 2010, the world's fastest general-purpose supercomputer.) Undergraduate courses tend to focus on sequential programming—not surprisingly, since the industry is still dominated by sequential code except in a few specialist niches, and most programmers spend their time maintaining or extending old code, rather than writing entirely new code. At SC10, a computing conference held in New Orleans in November 2010, experts discussed the need to change curricula and update textbooks to reflect the growing demand for parallel-programming skills in general-purpose computing, and not just in scientific computing. This will take years.

A further difficulty is the lack of tools for working with parallel code, such as compilers, which translate human-readable code into something a microprocessor can run, and debuggers, which find mistakes. Debugging code on a parallel machine with hundreds or thousands of cores creates unique problems, and may be the biggest single challenge facing parallel programming, says Charles Holland of the Defence Advanced Research Projects Agency (DARPA), the research-funding agency of America's Department of Defence. In all, it is not hard to see why there has been so little progress in parallel programming, even though multicore chips have been widespread for five years.

But efforts are being made to get things moving. DARPA, renowned for its role in catalysing the development of the internet, has been taking a top-down approach. In 2001 it challenged America's computer-makers to build a new generation of high-performance supercomputers that are both easier for programmers to use and far more powerful than existing machines. The DARPA challenge included the development of new parallel languages and programming tools, in addition to hardware. The hope is that as these new machines are adopted (the first, IBM's Mira, will go into operation at the Argonne National Laboratory in 2012), there will be a “trickle down” effect as their parallel-programming tools become widely used.

As part of the project IBM has developed a parallel-programming language called X10. Cray, the other finalist in the DARPA scheme, is developing a parallel-programming language called Chapel, which is designed to allow code to run on everything from a multicore desktop machine to a huge supercomputer. Both X10 and Chapel are entirely new languages, though they are intended to be approachable by programmers who are familiar with other languages.

Parallelise this

Intel and Microsoft, meanwhile, are taking a bottom-up approach. Intel in particular has a direct commercial interest in promoting parallel programming, because if software is unable to make full use of the computing horsepower of its chips, customers will be less inclined to upgrade their hardware. Microsoft, for its part, wants to ensure that it maintains its position as a leading provider of programming tools. The two companies have invested a total of $16m in two new parallel-computing research centres at the University of California at Berkeley (led by Dr Patterson) and the University of Illinois at Urbana-Champaign, with the specific aim of producing tools to program multicore systems.

Rather than devising entirely new parallel languages, Intel and Microsoft have focused on extending existing languages, such as C++ and Fortran, by adding support for parallel coding. This lets programmers use tools and languages that they already know well. The two firms' marketing muscle should help promote adoption, says Marc Snir, the head of the parallel-computing research centre at the University of Illinois and a veteran of the parallel-programming field. Intel, for example, has recently been promoting new parallel-programming tools that help programmers take advantage of its latest family of multicore processors, known as Sandy Bridge, each of which has between two and eight cores.

Meanwhile, a group of obscure programming languages used in academia seems to be making slow but steady progress, crunching large amounts of data in industrial applications and behind the scenes at large websites. Two examples are Erlang and Haskell, both of which are “functional programming” languages.

Such languages use a highly mathematical programming style (built on the evaluation of functions) that is very different from that of traditional, “imperative” languages (built on a series of commands). This puts many programmers off. But functional languages turn out to be very well suited to parallel programming. Erlang was originally developed by Ericsson for use in telecoms equipment, and the language has since been adopted elsewhere: it powers Facebook's chat feature, for example. Another novel language is Scala, which aims to combine the best of both functional and traditional languages. It is used to run the Twitter, LinkedIn and Foursquare websites, among others.

The problem is still far from solved. But serious efforts are finally being made to make parallel programming easier and more approachable. Will the tools to take advantage of multicore chips come from a trickle-down of high-end scientific computing techniques, the extension of existing programming languages or the spread of previously obscure languages, driven by the needs of web developers? More than one of these paths may prove successful. And, appropriately enough, the search for new parallel-programming techniques is itself a parallel process.