A team of researchers at Microsoft has been investigating the use of field programmable gate arrays (FPGAs)—chips that can be rewired dynamically using software—to accelerate the Bing search engine. The system, called Catapult, was so successful that Microsoft plans to put it into production next year.

With their mix of hardware acceleration and software configurability, FPGAs have the potential to substantially improve the scaling and power efficiency of computing systems. Microsoft set the goal of doubling the performance of a particular Bing workload, with the constraint of adding no more than 30 percent to the total cost of ownership and no more than 10 percent to the power budget. GPUs have found considerable use in some high-performance computing tasks, but this tight power constraint precluded their use by the Bing team.

The FPGAs in the Catapult system were used in conjunction with a cluster of 1,632 dual Xeon servers. Each server had its own FPGA equipped with 8GB RAM, with the FPGAs connected directly to each other with 10Gb cables. The FPGAs weren't used for the entire Bing workload; instead, they handled a ranking task that collates the pages matching a particular search, sorts them to put the best matches first, and generates captions. Most stages of the ranking service were implemented in an FPGA version and deployed to the hardware.
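To make the shape of that ranking task concrete, here is a minimal Python sketch of the three stages the article describes: collate matching pages, sort them so the best matches come first, and generate captions. All function names, the document structure, and the scoring are illustrative assumptions, not Bing's actual pipeline (which runs these stages on FPGAs, not in Python).

```python
# Hypothetical sketch of a collate -> rank -> caption pipeline.
# Everything here (field names, scoring, snippet logic) is illustrative.

def collate(index, query):
    """Gather candidate documents whose text matches the query."""
    return [doc for doc in index if query in doc["text"]]

def rank(candidates):
    """Sort candidates so the best matches (highest score) come first."""
    return sorted(candidates, key=lambda d: d["score"], reverse=True)

def caption(doc, query, width=40):
    """Produce a short snippet of text around the first query hit."""
    i = doc["text"].find(query)
    return doc["text"][max(0, i - width // 2): i + width]

def search(index, query):
    """Run all three stages and return (title, caption) pairs."""
    return [(d["title"], caption(d, query))
            for d in rank(collate(index, query))]
```

In the real system, each stage of a pipeline like this becomes a candidate for hardware offload; Catapult's gain comes from running the compute-heavy stages on the FPGA fabric rather than on the host CPUs.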

The result? For a 10 percent increase in total system power and a less than 30 percent increase in total system cost, Microsoft achieved a 95 percent improvement in search throughput. The company has started deploying the technology to one of its data centers, and it will be going live in 2015.
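A quick back-of-the-envelope calculation shows why those numbers clear Microsoft's stated constraints. Using only the figures reported above (95 percent more throughput, 10 percent more power, at most 30 percent more cost):

```python
# Efficiency arithmetic from the article's reported figures.
throughput_gain = 1.95  # 95% improvement in search throughput
power_increase = 1.10   # 10% increase in total system power
cost_increase = 1.30    # upper bound: "less than 30 percent" more cost

perf_per_watt = throughput_gain / power_increase     # ~1.77x
perf_per_dollar = throughput_gain / cost_increase    # ~1.5x, worst case

print(f"throughput per watt:   {perf_per_watt:.2f}x")
print(f"throughput per dollar: {perf_per_dollar:.2f}x (worst case)")
```

In other words, throughput per watt improves by roughly 77 percent, and throughput per dollar by at least 50 percent, which is why the system nearly met the doubling goal while staying inside the power and cost budgets.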