Doug Burger called it Project Catapult.

Burger works inside Microsoft Research–the group where the tech giant explores blue-sky ideas–and in November 2012, he pitched a radical new concept to Qi Lu, the man who oversees Microsoft's Bing web search engine. He wanted to completely change the machines that make Bing run, arming them with a new kind of computer processor.


Like Google and every other web giant, Microsoft runs its web services atop thousands of computer servers packed into warehouse-sized data centers, and most of these machines are equipped with ordinary processors from Intel, the world's largest chip maker. But when he sat down with Lu, Burger said he wanted millions of dollars to build rack after rack of computer servers that used what are called field-programmable gate arrays, or FPGAs, processors that Microsoft could modify specifically for use with its own software. He said that these chips–built by a company called Altera–could not only speed up Bing searches, but also change the way Microsoft runs all sorts of other online services.

Despite the cost and the riskiness of the proposition, Lu liked the idea. In a first for Microsoft, he approved a 1,600-server pilot system to test out Burger's ideas, and now he has given the green light to actually move these FPGAs into Microsoft's live data centers. This is set to happen early next year. That means that a few months from now, when you do a Bing search, there's a decent chance that it will be carried out by one of Burger's servers.

The move is part of a larger effort to fix an increasingly worrisome problem for big web companies like Microsoft, Google, and Facebook. After decades of regular performance boosts, chips are no longer improving at the rate they once did. As their web services continue to grow, these companies are looking for new ways of improving the speed and efficiency of their already massive operations. Facebook is exploring the use of low-power ARM processors. According to reports, Google is too. And now Microsoft is about to roll out FPGAs. "There are large challenges in scaling the performance of software now," says Burger. "The question is: 'What's next?' We took a bet on programmable hardware."

FPGAs, like the Altera chips that Microsoft used in its pilot project, have been around for years. A decade ago, they were widely used by chip designers as a low-cost way to prototype their new products. But lately, they've crept into networking gear, complex computer rigs that run the bitcoin digital currency, and even some specialized systems used by Wall Street firms to do data analysis. They give hardware makers more freedom to customize their gear.

Using FPGAs, Microsoft engineers are building a kind of super-search machine network they call Catapult. It comprises 1,632 servers, each one with an Intel Xeon processor and a daughter card that contains the Altera FPGA chip, linked to the Catapult network. The system takes search queries coming from Bing and offloads a lot of the work to the FPGAs, which are custom-programmed for the heavy computational work needed to figure out which webpage results should be displayed in which order. Because Microsoft's search algorithms require such a mammoth amount of processing, Catapult can bundle the FPGAs into mini-networks of eight chips.


The FPGAs are 40 times faster than a CPU at processing Bing's custom algorithms, Burger says. That doesn't mean Bing will be 40 times faster–some of the work is still done by those Xeon CPUs–but Microsoft believes the overall system will be twice as fast as Bing's existing system. Ultimately, this means Microsoft can operate a much greener data center. "Right off the bat we can chop the number of servers that we use in half," Burger says.
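The gap between a 40-fold chip speedup and a twofold system speedup can be illustrated with Amdahl's law. The sketch below is only a back-of-the-envelope model, not Microsoft's own analysis: it assumes the work splits cleanly into a portion the FPGA accelerates and a portion that stays on the Xeon CPU, and the function names are made up for illustration. The article does not say what share of the work is offloaded; the code simply works out what share would be consistent with the two figures Burger cites.

```python
def overall_speedup(accelerated_fraction: float, accelerator_speedup: float) -> float:
    """Amdahl's law: speedup of the whole job when only part of it is accelerated."""
    remaining_cpu_time = 1.0 - accelerated_fraction
    accelerated_time = accelerated_fraction / accelerator_speedup
    return 1.0 / (remaining_cpu_time + accelerated_time)


def fraction_needed(target_overall: float, accelerator_speedup: float) -> float:
    """Invert Amdahl's law: share of the original work that must be offloaded."""
    return (1.0 - 1.0 / target_overall) / (1.0 - 1.0 / accelerator_speedup)


# Figures quoted in the article; the simple two-part split is an assumption.
FPGA_SPEEDUP = 40.0   # FPGA vs. CPU on the offloaded algorithms
TARGET_SPEEDUP = 2.0  # expected speedup of the overall system

share = fraction_needed(TARGET_SPEEDUP, FPGA_SPEEDUP)
print(f"Share of work offloaded under this model: {share:.1%}")
print(f"Resulting overall speedup: {overall_speedup(share, FPGA_SPEEDUP):.2f}x")
```

Under these assumptions, offloading a little over half of the original work would be enough to double overall throughput, which is why the work left on the CPUs dominates the end result.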

What's more, Microsoft can update the chips in much the same way it updates Bing's system software, and Burger and his team can modify the logic on their processors to address bugs and changes in the Bing search algorithm. They do this by building a binary file that represents the updated chip logic and distributing it through Microsoft's standard server management software, called Autopilot. It's not uncommon to have several chip updates per week, Burger says.

Of course, there have been challenges. There was a lab flood and a fire at one of their Taiwanese parts suppliers, and as it stands, Microsoft's server monitoring tools don't always know what to make of chips that suddenly drop offline and restart with reconfigured logic. But Microsoft is confident that the new FPGAs can be used across the company's online empire. "If all we were doing was improving Bing, I probably wouldn't get clearance from my boss to spend this kind of money on a project like this," says Peter Lee, the head of Microsoft Research. "The Catapult architecture is really much more general-purpose, and the kinds of workloads that Doug is envisioning that can be dramatically accelerated by this are much more wide-ranging."

It's also the kind of work that's likely to be emulated at other big web companies that have the resources to hire hardware developers, says James Larus, dean of the School of Computer and Communications Sciences at the École Polytechnique Fédérale de Lausanne. He previously worked at Microsoft on Project Catapult. "The benefits of hardware specialization are far too large for the right application for these companies to pass up the opportunity," he says.

According to Burger, developing a whole new chip architecture for one of the world's largest data center operators is the kind of thing that Microsoft Research does pretty well. "Let's jump way out, think of something a little crazy, and then push on it and see how well that works," he says. Come 2015, you can get the answer to that question simply by searching Bing.