MENLO PARK, CA—Building 17 of Facebook's headquarters sits on what was once a Sun Microsystems campus known fondly as "Sun Quentin." It now houses a team of Facebook engineers in the company's electrical lab. Every day, they push forward the company's vision of how data center hardware should be built. These engineers constantly bench-test designs for their server hardware, which is built in-house—essentially putting an end to server hardware as we know it.

Ars recently visited Facebook's campus to get a tour of the server lab from Senior Manager of Hardware Engineering Matt Corddry, leader of Facebook's server hardware design team. What's happening at Facebook's lab isn't just affecting the company's data centers; it's also part of Facebook's contribution to the Open Compute Project (OCP), an effort that hopes to bring open-source design to data center server and storage hardware, infrastructure, and management interfaces across the world.

Facebook, Amazon, and Google are all very picky about their server hardware, and these tech giants mostly build it themselves from commodity components. Frank Frankovsky, VP of hardware design and supply chain operations at Facebook, was instrumental in launching the Open Compute Project because he saw the waste in big cloud players reinventing things they could share. Frankovsky felt that bringing the open-source approach Facebook has followed for software to the hardware side could save the company and others millions—both in direct hardware costs and in maintenance and power costs.

Just as the Raspberry Pi system-on-a-board and the Arduino open-source microcontroller have captured the imagination of small-scale hardware hackers, OCP is aimed at making DIY easier, more effective, and more flexible at a macro scale. What Facebook and Open Compute are doing to data center hardware may not ultimately kill the hardware industry, but it will certainly turn it on its head. Yes, the open-sourced, commoditized motherboards and other subsystems used by Facebook were originally designed specifically for the "hyper scale" world of data centers like those of Facebook, Rackspace, and other cloud computing providers. But these designs could easily find their way into other do-it-yourself hardware environments or into "vanity free" systems sold to small and large enterprises, much as Linux has.

And open-source commodity hardware could make an impact beyond its original audience quickly because it can be freely adopted by hardware makers, driving down the price of new systems. That's not necessarily good news for Hewlett-Packard, Cisco, and other big players in corporate IT. "Vanity free," open-source designed systems will likely drive innovation fast while disrupting the whole model those companies have been built upon.

“Open” as in “open-minded”

To be clear, Open Compute doesn't go open-source all the way down to the CPU. Even the Raspberry Pi isn't based on open-source hardware because there's no open-designed silicon that is capable enough (and manufacturers aren't willing to produce one in volume for economic reasons). The OCP hardware designs are "open" at a higher level. This way anyone can use standards-based components to create the motherboards, the chassis, the rack-mountings, the racks, and the other components that make up servers.

"We focus on the simplest design possible," Corddry told me. "It's focused really tightly on really high scalability and driving out complexity and any glamor or vanity in the design." That makes it easy to maintain, cheap to buy and build, and simple to adapt to new problems as they emerge. It also makes it easy to build things on top of the designs that will help Facebook and others who buy into the OCP philosophy. The ideas that came out of the OCP Hardware Hackathon at the Facebook campus on June 18 are a prime example.

So far, Facebook and Rackspace are the main adopters of OCP hardware. But that could soon change as the dynamics of open-source hardware start to kick in. As Intel, AMD, and others start to turn out more components built to the OCP specification and contribute more intellectual property to the initiative, some involved with the effort believe it will snowball.

"I think our industry realized probably about a decade ago, when Linux took over for Unix, that open was actually a pretty positive thing for the suppliers as well as the consumers in large-scale computing," Corddry said. "Linux didn't kill the data center industry or the OS industry. I think we're looking at the same pattern and seeing that openness and hardware doesn't mean the death of hardware. It probably means the rebirth of hardware, where we see a greater pace of innovation because we're not always reinventing the wheel."

Deconstructing the server

At the center of Facebook's data center design philosophy is "disaggregation"—the breaking up of what has traditionally constituted a "server" into purpose-specific chunks of hardware interconnected largely by network hardware. It's ironic, in a way, that this is happening on the old Sun campus. In its heyday, Sun advertised with the slogan "The network is the computer." Now, the computer is the network both conceptually and physically.

The design principles behind Facebook's hardware come from direct hands-on experience. Corddry said that all his engineers spend time working as technicians at Facebook data centers "so everyone walks a mile in those shoes and understands what it is to work on this gear at scale."

The approach taken by Facebook and by the Open Compute Project is post-modernist deconstruction for the data center—the disaggregation of the components that usually make up a server into functional components with as little complexity and as much efficiency to them as possible. There are few "servers" per se in Facebook's data center architecture. Instead, there are racks filled with "sleds" of functionality. "That's going to be a pattern you see from us over the next couple of years," Corddry said while showing off a few sled designs in the hardware lab. "A lot of our hardware designs will be focused on one class of problem."

The approach is already being rolled out in Facebook's newest data centers, where racks are filled with systems built from general-purpose compute sleds (motherboards populated with CPUs, memory, and PCI cards for specific tasks), storage sleds (high-density disk arrays), and "memory sleds" (systems with large quantities of RAM and low-power processors designed for handling large in-memory indexes and databases).

"We're not putting these into the network sites yet," Corddry said, referring to the colocation sites Facebook uses to connect to major Internet peering sites. "But we're putting them in all our data center facilities, including Ashburn, Virginia—where it's not a net new build, it's more of a classic data center environment. In fact, we've designed a variant of the original Open racks to go into that kind of facility that has the standard dual power so they can play nice with the colo environment."

The one thing all of these disaggregated modules of hardware have in common is that they're fully self-contained and can be yanked out, repaired, or replaced with minimum effort. Corddry pointed to one of the compute sleds on the lab's workbench. "If there's something to repair on this guy, all you have to do is grab the handle and pull. There are no screws, no need for a screwdriver, and just one cable in front."

Corddry demonstrated this with the "Windmill" compute sled, based on a second-generation open-source motherboard design. "It's a two-socket Intel motherboard in a tube," said Corddry. "We refer to it as the 'sushi boat' form factor." The compute sled has the barest bones of what you’d expect on a server motherboard: two processors, 16 DIMM slots for memory, and a few PCI slots. Facebook uses PCI 10-gigabit Ethernet cards instead of putting Ethernet directly on the motherboard. Corddry said this is largely so the company can get them from multiple suppliers.

There's no power supply on a compute sled; all the power is pulled from the rack for the sake of efficiency. "There's a 12 volt power connection in back," Corddry explained. "We send 12 volt regulated to the board, so we get rid of all the complexity of having power conversion and supply in the system. The principles of efficiency tell you to only convert the power the minimum number of times required; you’re losing two to five percent of the power every time you step it down. So we bring unregulated 480 volt 3-phase straight into our rack and have a power shelf that converts it to 12 volt that goes straight to the motherboard. We convert it once from when it comes in from the utility to when it hits the motherboard. A lot of data centers will convert power three or four times: from 480 to 208, into a UPS, back out of the UPS, into a power distribution unit, and into a server power supply."
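Corddry's arithmetic can be sketched in a few lines of Python. This is an illustration only, not Facebook's own model: it assumes every conversion stage loses the same fixed fraction of power (the two to five percent he cites), and the function name `delivered_fraction` and the stage counts are hypothetical choices made for the example.

```python
def delivered_fraction(stages: int, loss_per_stage: float) -> float:
    """Fraction of input power left after `stages` conversions.

    Assumption (not from the article): each stage loses the same
    fixed fraction, so the losses compound multiplicatively.
    """
    if stages < 0 or not 0.0 <= loss_per_stage < 1.0:
        raise ValueError("stages must be >= 0 and loss in [0, 1)")
    return (1.0 - loss_per_stage) ** stages


if __name__ == "__main__":
    # Loss range quoted by Corddry: two to five percent per conversion.
    for loss in (0.02, 0.05):
        # 1 stage = the single rack-level conversion described above;
        # 3 and 4 stages = the "three or four times" of other data centers.
        for stages in (1, 3, 4):
            frac = delivered_fraction(stages, loss)
            print(f"loss/stage={loss:.0%} stages={stages} delivered={frac:.1%}")
```

Under these assumptions, four conversions at five percent loss each deliver roughly 81 percent of the incoming power, against 95 percent for a single conversion.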

The efficiency continues within the design of the cooling fans in each compute sled. "These guys are incredibly efficient," Corddry said. "It has nice big fan blades that turn slowly. It only takes three or four [watts] to move air through this guy to keep it cool compared to a traditional 1U server, which can be 80 to 100 watts."

The one set of servers that breaks Facebook's pattern of simplification and disaggregation (slightly) is the company's database servers. "Database servers are usually the hardest thing to do with DIY hardware," Corddry said. "But we wanted to get rid of these expensive, hard-to-service OEM servers from our database tier. It was an interesting kind of quick development project, not a hack-a-thon exactly. But we said, 'Hey, we're still buying this OEM hardware; it's more expensive and harder to service—can we do something better?' Within a matter of months, engineers jumped in and said, 'Let's see if we can get rid of this last piece of OEM equipment in our inventory.'"

Facebook's hardware engineering team came up with a solution using Windmill motherboards, adding a power supply kit "to allow us to run off our high availability dual feed power," Corddry said. "Normally, this design would actually have a single power supply in the back and two motherboards."

While the majority of Facebook's servers run off a single power feed—"the power goes out, the generator kicks in, and we're OK," said Corddry—the database servers need extra power protection to prevent corruption caused by an outage. So the servers that support Facebook's User Database and other big databases need to have redundant power.

Listing image by Sean Gallagher