Gary Lauterbach says that Google racked its servers like hot bread at a bakery.

The year was 2001. Lauterbach was the chief microprocessor architect at Sun Microsystems, and two of his old Sun pals, Eric Schmidt and Wayne Rosing, had just joined Google. One afternoon, Lauterbach and another Sun bigwig, Jim Mitchell, walked to Google's Palo Alto, California, office to see the server room. Even then, Google used a very different kind of server. According to Lauterbach, dirt-cheap motherboards were slotted into what looked like bread racks. These "bread rack servers" – as Lauterbach and others still call them – had no cases. They just sat on the racks, exposed to the open air.

Sun was a company that sold massive, monolithic servers built around the brawny UltraSPARC processors Lauterbach helped design. After seeing Google's machines, Mitchell turned to Lauterbach and said: "Those servers are so cheap and use so little power, we have no hope of building a product to help them." It was a moment of defeat. But for Lauterbach, it was also a moment of inspiration. Sun couldn't build server hardware for an outfit like Google. But maybe he could.

A half decade later, Lauterbach met a man with a similar story. His name was Andrew Feldman, and he had worked for Force10 Networks, an outfit that sold Google hundreds of millions of dollars in networking equipment. By the time Feldman saw the inside of the search giant, Google had outgrown its bread-rack servers – it had built a new breed of server, not to mention a worldwide network of data centers – but he learned much the same lesson as Lauterbach. Part of Google's secret was a knack for minimizing the power used by its servers. "They were buying mind-boggling amounts of networking equipment, but [networking] wasn't their primary problem," he remembers. "Their primary problem was power."

Inspired by Google, Gary Lauterbach and Andrew Feldman joined forces to start a company called SeaMicro. Rather than building servers with the world's fastest chips, the Sunnyvale, California, startup would build them with hundreds of processors originally designed for cellphones and other mobile devices. It was the bread-rack server taken to extremes. The aim was to save both power and space inside the data centers underpinning the web's most popular services.

"Everyone was having the same problem [as Google]," says Andrew Feldman, who serves as SeaMicro CEO, while Lauterbach handles the CTO duties. "Suddenly, power and space had become these enormous issues in the data center, whereas just a few years before, no one ever mentioned them." When you reach a certain size, power and space eat up enormous amounts of money.

Revenge of the Wimps

As it turns out, SeaMicro was in the vanguard of a movement to reinvent the internet data center with processors that are considerably slower than traditional server chips. Carnegie Mellon professor Dave Andersen calls them "wimpy nodes" or "wimpy cores." But if you put them together, they aren't that wimpy. Gary Lauterbach prefers the term "scale-down" computing.

The trick is to break software applications into tiny pieces that can then be spread across these low-power chips. SeaMicro machines use Intel's mobile chip, the Atom, while other outfits – including HP and an Austin, Texas company called Calxeda – are building servers packed with hundreds of ARM chips not unlike the one at the heart of the Apple iPhone.
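
In rough terms – and this is only a minimal Python sketch of the general idea, not SeaMicro's or anyone else's actual software – the pattern looks like this: chop one big job into small, independent pieces and spread them across many modest cores, rather than handing the whole thing to one brawny one.

    # A minimal sketch of "wimpy node" parallelism. The function and
    # worker count here are hypothetical stand-ins.
    from multiprocessing import Pool

    def handle_piece(item):
        # Hypothetical unit of work: in a real service this might be
        # serving one web request or crunching one record.
        return item * item

    if __name__ == "__main__":
        work = range(100_000)
        # Eight workers here; on a SeaMicro-style box the same pieces
        # would be spread across hundreds of Atom or ARM cores.
        with Pool(processes=8) as pool:
            results = pool.map(handle_piece, work)
        print(sum(results))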

Two years after SeaMicro released its first machines, some outfits are already using them to run their operations, including Mozilla, the maker of the open source Firefox browser, and eHarmony, the online dating site. And Facebook – one of those giants of the internet – has said that it's seriously considering a move to wimpy nodes.

But there are hurdles. Some applications must be rewritten to run effectively on these machines, and Facebook has said that it can't make the switch until these systems can accommodate more memory. Intel's Atom chip wasn't designed to address the massive amounts of memory used by modern internet services.

In fact, Google – the company that inspired SeaMicro – has published a research paper that pours some cold water on the idea of running applications across hundreds of very low-power chips. Feldman and Lauterbach acknowledge the irony, but they're confident the hurdles facing wimpy cores will be overcome. The Intel Atom will improve, and Feldman says that every internet player is a potential customer – except for Google. Google, he says, is a special case.

The Search Engine That Builds Switches

Sometime in 2007 or 2008, Google quit buying all that networking hardware from Force10. The search giant won't discuss the change, but according to many people who have worked with the company, it's now building its own network routers and switches. This is all part of its effort to create what it calls "warehouse-scale computers." At Google, data centers aren't data centers. They're computers the size of warehouses.

The idea is that the entire data center – from software to servers to, yes, networking hardware – must be designed to work as a whole. "These new large data centers are quite different from traditional hosting facilities of earlier times," Google engineers Urs Hölzle and Luiz Barroso wrote in a book they called The Datacenter as a Computer. "Large portions of the hardware and software resources in these facilities must work in concert to efficiently deliver good levels of internet service performance, something that can only be achieved by a holistic approach to their design and deployment. In other words, we must treat the data center itself as one massive warehouse-scale computer."

Part of this involves spreading software applications evenly across a large array of machines. As Barroso recently told Wired, Google was in some ways at the forefront of the "wimpy node" movement, using commodity hardware in its data centers rather than super-expensive servers. Compared to Sun's servers, those bread-rack servers were very wimpy indeed. But Barroso and Hölzle believe there's a limit to how thin you can spread your application – to how "wimpy" your nodes can be. Last year, Hölzle published a paper – edited by Barroso – called "Brawny cores still beat wimpy cores, most of the time."

According to Barroso, the problem is that as you break your application into smaller and smaller pieces – spreading it across more and more servers – the job of dividing it up gets harder and harder. There comes a point where it's not worth the effort. Eventually, you run into Amdahl's Law, a late-1960s proclamation from IBM engineer Gene Amdahl that says your performance will improve only so much if you parallelize only part of your system.
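
To put rough numbers on it – this is a generic illustration of Amdahl's Law, not a figure from Hölzle's paper – suppose 90 percent of an application can be parallelized and the rest cannot:

    # Amdahl's Law: if only a fraction p of a program can be
    # parallelized, the speedup on n cores is 1 / ((1 - p) + p / n).
    def amdahl_speedup(p, n):
        return 1.0 / ((1.0 - p) + p / n)

    # Hypothetical application that is 90 percent parallelizable:
    for n in (2, 16, 128, 768):
        print(n, "cores:", round(amdahl_speedup(0.90, n), 1), "x")
    # The speedup climbs from 1.8x to only about 9.9x. Even 768 wimpy
    # cores can't push past 10x, because the serial 10 percent dominates.

That stubborn serial slice is exactly where a single brawny core earns its keep.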

Gary Lauterbach and Andrew Feldman agree that some applications run better on SeaMicro's servers than others. "There are certain situations where you want a Toyota Tercel, and others where you want a Ford F-150," Feldman says. "The car analogy is a good one. You have to think about what you're transporting and how many trips you're taking."

But for the most part, he continues, large internet services are suited to wimpy cores. He thinks that Google's unwillingness to embrace the idea is down to its unique infrastructure. Google has so finely tuned each piece of hardware to work in tandem with every other part that there's no way of accommodating wimpy nodes – unless it starts from scratch. In short, he says, Google's networking hardware is designed for brawny cores, not wimpy ones.

"Google tackled the power problem by building their own servers and later building their own switches, trying to tune the whole facility to do the work," he says. "But not everyone can do that. And there are other ways to solve the problem. One option is to build servers in a different way."

One Server, 768 Cores

SeaMicro's latest server includes 384 Intel Atom chips, and each chip has two "cores," which are essentially processors unto themselves. This means the machine can handle 768 tasks at once, and if you're running software suited to this massively parallel setup, you can indeed save power and space.

Today, Mozilla is using SeaMicro's older 512-core machines to handle downloads of its Firefox browser, and though it declined to discuss the setup for this story, the open source outfit is on the record saying that its SeaMicro cluster draws about a fifth of the power and occupies about a quarter of the space of its previous cluster.

According to Dave Andersen – the Carnegie Mellon computer science professor who coined the "wimpy node" name – servers such as SeaMicro's are well suited to the sort of simple web-serving Mozilla is doing. But with his Fast Array of Wimpy Nodes project, he has also shown that large numbers of low-power chips are suited to running databases and other large internet applications.

In recent years, Google's infrastructure has inspired a host of new hardware and software platforms, from Facebook's Open Compute servers to the Hadoop open source number-crunching platform to NoSQL databases such as MongoDB and Cassandra. SeaMicro is just another example.

Some of the projects closely follow the Google way, while others, such as SeaMicro, take the original idea in new directions. With both, the thing to remember is that Google is not a typical business. "The choices they make are idiosyncratic, and they're not necessarily transferable to others," Feldman says. Once these ideas leave Google, they must be judged – for better or for worse – on how they serve everyone else. Not how they serve Google.

Of course Google doesn't want SeaMicro's servers. It builds its own machines. And its software is tuned to those machines. But few other internet companies work that way. Though Google's "bread rack" servers inspired SeaMicro, that was many years ago. Gary Lauterbach and Andrew Feldman have moved on. And so has Google.