Last night a few hundred assorted New York geeks packed into Google’s new Chelsea offices to hear Google engineer Luiz Barroso discuss “Watts, Faults, and Other Fascinating Dirty Words Computer Architects Can No Longer Afford to Ignore.” Capsule version: While processing power has been increasing steadily, MIPS per watt may actually have been going down for the last fifteen years or so. At the very least, it has not been growing nearly as fast as total processing power per server. The good news is that since we haven’t really bothered to optimize our computers for energy efficiency, there’s still a lot of low-hanging fruit left to pick in this space.



MIPS vs. Watts

Given the size of Google’s data centers and the sheer cost of building one ($10-22 per watt in construction costs and $0.80 per watt-year in operating costs), Google is quite concerned with getting maximally efficient use out of its systems. So should we all be if, as Barroso predicts, the energy cost per server soon outstrips the initial purchase price. (He joked that this might lead power companies to adopt a cell phone business model: sign up for two years of electricity and they’ll give you the servers for free.)
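To get a feel for those per-watt figures, here is a minimal sketch that applies them to a single server. The 200 W draw and the three-year lifetime are my assumptions for illustration, not numbers from the talk, and the function name is made up.

```python
def power_costs(watts, years, build_cost_per_watt, operating_cost_per_watt_year=0.80):
    """Return (construction_cost, operating_cost) in dollars for one server.

    build_cost_per_watt: data center construction cost ($10-22 per watt in the talk).
    operating_cost_per_watt_year: $0.80 per watt-year in the talk.
    """
    construction = watts * build_cost_per_watt
    operating = watts * operating_cost_per_watt_year * years
    return construction, operating


if __name__ == "__main__":
    # Assumed for illustration: a 200 W server kept in service for 3 years.
    for build_cost in (10, 22):
        construction, operating = power_costs(200, 3, build_cost)
        print(f"build at ${build_cost}/W: construction ${construction:,.0f}, "
              f"operating ${operating:,.0f}")
```

Under those assumptions, the data center capacity a server occupies costs a good deal more than the electricity it burns, which is one more reason to keep every provisioned watt busy.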

There are numerous problems here. The first and one of the easiest to fix is power supplies. A typical computer power supply is only 50%-70% efficient. By paying a little more up-front for your power supplies you can easily get one that’s 90% efficient. Most of us don’t think about the energy efficiency of a power supply when shopping for PCs, but we should. It’s like buying compact fluorescent light bulbs: a little more expense up front but you more than make the difference up over time.
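The arithmetic behind that claim is simple enough to sketch. The efficiency figures come from the talk. The 200 W load and the function names are assumptions of mine.

```python
def wall_power(load_watts, efficiency):
    """Watts drawn from the outlet to deliver load_watts to the components."""
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return load_watts / efficiency


def wasted_power(load_watts, efficiency):
    """Watts lost (mostly as heat) inside the power supply."""
    return wall_power(load_watts, efficiency) - load_watts


if __name__ == "__main__":
    load = 200  # assumed load in watts, for illustration only
    for eff in (0.50, 0.70, 0.90):
        print(f"{eff:.0%} efficient: {wall_power(load, eff):.0f} W from the wall, "
              f"{wasted_power(load, eff):.0f} W wasted")
```

For the same load, going from a 50% efficient supply to a 90% efficient one cuts the power drawn from the wall by nearly half.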

After that, the problems get harder to fix and require some work from chip vendors and others. The first problem is that all the speculative execution done in modern chips costs power. Chips guess where a program is likely to go and precalculate several different paths, all of which cost energy. However, they then throw away the results of all but one of those paths. That’s energy wasted. The more predictions a chip makes and the deeper it looks down the pipeline the more energy it wastes calculating results it will never use.

Problems also arise because power drain isn’t proportional to load. A server that draws 100W when pegged at 100% CPU may still draw 50W when idling. Better system design may improve this. Until then, though, it’s important to try to peg all your servers. It’s much more efficient power-wise to have 50 servers running at 100% load than 100 servers running at 50% load, let alone 100 servers running at an average 10% load.
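Here is a rough sketch of that comparison. The 50W idle and 100W peak figures are from the talk. The straight-line interpolation between them is my assumption, since real servers need not behave that neatly, and the function names are made up.

```python
IDLE_WATTS = 50.0   # draw at 0% load (from the talk)
PEAK_WATTS = 100.0  # draw at 100% load (from the talk)


def server_power(utilization):
    """Watts drawn by one server, assuming power rises linearly with load."""
    return IDLE_WATTS + (PEAK_WATTS - IDLE_WATTS) * utilization


def fleet_power(n_servers, utilization):
    """Total watts for n_servers all running at the same utilization."""
    return n_servers * server_power(utilization)


if __name__ == "__main__":
    # The first two fleets do the same total work: 50 "server-loads".
    for n, u in ((50, 1.0), (100, 0.5), (100, 0.1)):
        work = n * u
        total = fleet_power(n, u)
        print(f"{n} servers at {u:.0%}: {total:.0f} W total, "
              f"{total / work:.0f} W per fully loaded server's worth of work")
```

Under the linear assumption, the half-loaded fleet burns 7,500W to do what the fully loaded one does on 5,000W, and the 10% fleet spends more than that on a fifth of the work.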

System vendors should be able to improve on this. Currently the power ratio between peak usage and idling is about 2:1. By contrast, human bodies can manage a ratio of 20:1 to 30:1 between peak power usage and idling.

Of course, in reality, not all servers can run at 100% all the time, and you do need to have some room to spare for peak times. Because the servers rarely all peak at once, Barroso noted that if you’re careful you can oversubscribe a data center. I don’t think he gave exact numbers, but it looked like about 20% was safe for Google’s usage patterns.

Faults and Disks

Barroso also spent some time discussing disk failures. He’s a co-author of a much-publicized paper noting that hard disks fail a lot more often than their manufacturers say they do. After losing a couple of days to a failed LaCie disk recently, I was more interested in this subject than I used to be.

Google hasn’t found any solutions yet, but they have ruled out several possibilities. For one thing, they find no correlation between temperature and disk failure. Cooling a drive does not help it.

For another, S.M.A.R.T. diagnostics appear to be a waste of time. Only 46% of disks that fail show any S.M.A.R.T. errors at all; the other 54% fail catastrophically with no prior errors. Furthermore, of the disks that do report errors, many continue for months with no actual failures. Barroso did not know how many of the disks that did not fail also reported errors. That is, there are four groups of disks:

Fail, reported errors

Fail, did not report errors

Did not fail, reported errors

Did not fail, reported no errors

He only had numbers for the first three.
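The four groups form a two-by-two table, and it takes more than the failed disks to judge S.M.A.R.T. as a predictor. The sketch below uses the 46%/54% split from the talk, scaled to 100 failed disks. The count of healthy disks that reported errors is purely hypothetical, since no such figure was given, and the function names are mine.

```python
def recall(failed_with_errors, failed_without_errors):
    """Fraction of failed disks that reported errors beforehand."""
    return failed_with_errors / (failed_with_errors + failed_without_errors)


def precision(failed_with_errors, healthy_with_errors):
    """Fraction of error-reporting disks that actually went on to fail."""
    return failed_with_errors / (failed_with_errors + healthy_with_errors)


if __name__ == "__main__":
    failed_with_errors = 46     # from the talk, per 100 failed disks
    failed_without_errors = 54  # from the talk, per 100 failed disks
    healthy_with_errors = 150   # HYPOTHETICAL: not a number from the talk

    print(f"recall:    {recall(failed_with_errors, failed_without_errors):.2f}")
    print(f"precision: {precision(failed_with_errors, healthy_with_errors):.2f}")
```

Recall needs only the two groups of failed disks. Precision, which tells you whether an error is worth acting on, also needs the number of error-reporting disks that kept running.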

The Future

Google is working with a variety of vendors on these problems. I happen to know they’re not the only ones either. Sun is very concerned with server power issues, and I suspect others are as well. The Environmental Protection Agency is now looking into server power requirements as servers become a larger and larger portion of the nation’s power budget.

I don’t know when we’re likely to see results, but I hope it’s sooner rather than later. My electricity bill is already too high, and it’s only likely to go up if I get one of the new octo-core Macs I’ve been drooling over. Gentlemen, start your multimeters.