Fifteen months ago, a CPU developer named Calxeda made waves when it announced a joint effort with HP to develop dense ARM servers that would challenge x86 supremacy in the server market. The company promised that it could leverage the low power consumption of ARM products to build clusters of Cortex-A9 SoCs inside a rackmounted chassis.

There have always been questions about whether Calxeda’s approach would actually scale in real-world server workloads. Calxeda’s system design stacks EnergyCards in rows atop a large motherboard. Each EnergyCard contains four SoCs, four DIMMs, and 16 SATA ports. The SoCs are all quad-core Cortex-A9s with a larger-than-average L2 cache (4MB rather than 1MB). That works out to 16 Cortex-A9 cores per EnergyCard. Maximum memory is 4GB per SoC due to the Cortex-A9’s 32-bit addressing limitations.

Anandtech’s Johan De Gelas (a name old-timers will recognize from Aces Hardware) has benchmarked and written the first review of a Calxeda-based system, the Boston Viridis. This system contains six EnergyCards, totaling 24 CPUs (96 Cortex-A9 cores) clocked at 1.4GHz. Anandtech ran the system through a range of synthetic and real-world application tests and compared its single- and quad-threaded performance to both Atom and Xeon-based solutions.
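The configuration math above is worth making explicit. A minimal sketch, using only the figures quoted in the article (the constant names are mine):

```python
# Back-of-the-envelope totals for the reviewed Boston Viridis chassis.
# All per-unit figures come from the article; nothing here is measured.
ENERGY_CARDS = 6        # EnergyCards installed in the reviewed system
SOCS_PER_CARD = 4       # quad-core Cortex-A9 SoCs per EnergyCard
CORES_PER_SOC = 4       # Cortex-A9 cores per SoC
MAX_RAM_PER_SOC_GB = 4  # 32-bit Cortex-A9 addressing ceiling

total_socs = ENERGY_CARDS * SOCS_PER_CARD        # 24 SoCs ("CPUs")
total_cores = total_socs * CORES_PER_SOC         # 96 Cortex-A9 cores
max_ram_gb = total_socs * MAX_RAM_PER_SOC_GB     # 96GB system-wide ceiling

print(total_socs, total_cores, max_ram_gb)  # 24 96 96
```

The 96GB figure is an upper bound implied by the per-SoC limit, not a spec Anandtech tested; it illustrates how the 32-bit addressing cap compounds across the chassis.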

The results should make Intel sit up and take notice. The ECX-1000 processor at the heart of the Viridis lags even Atom in some metrics, such as memory bandwidth (and Atom itself is far slower than Xeon here, to put that in perspective). Its per-thread performance in integer workloads, however, is quite competitive with Intel’s in-order architecture. While it never matches the Xeon-based products in single-threaded performance per clock, the synthetic tests show the ECX-1000 to be a far stronger part than skeptics expected.

The real-world tests are stunning. Calxeda’s array of “wimpy” cores beats Xeon processors in the web server tests, in both raw performance and performance per watt. De Gelas writes that “the Calxeda’s ECX-1000 server node is revolutionary technology.” After seeing the performance figures, I agree. There’s a place for ARM products in the datacenter. This also makes AMD’s long-term bet on an ARM server solution look like a good idea.

The current caveats

There are still a number of real-world limitations on Calxeda’s ARM products. They’re constrained by maximum RAM (4GB per SoC), the Cortex-A9’s bandwidth and architectural limitations, and the fact that software support is still in its very early stages. If you wanted to buy the most flexible solution available today, you’d buy a Xeon or an Opteron, hands down. The Boston Viridis server Anandtech reviewed runs about $20,000, while comparable x86 hardware is less than half that price. Power consumption matters — but $12,000 per box pays for an awful lot of wattage.
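To see how much wattage $12,000 actually buys, here is a rough break-even sketch. The price gap is the article’s figure; the electricity rate and service life are my assumptions, chosen only for illustration:

```python
# How much continuous power savings would justify the ~$12,000 premium?
price_gap_usd = 12_000    # Viridis (~$20k) minus x86 (<$10k), per the article
rate_usd_per_kwh = 0.10   # ASSUMED electricity rate, not from the article
service_years = 3         # ASSUMED server service life

kwh_bought = price_gap_usd / rate_usd_per_kwh      # 120,000 kWh
hours_in_service = service_years * 365 * 24        # 26,280 hours
breakeven_savings_kw = kwh_bought / hours_in_service

print(round(breakeven_savings_kw, 2))  # 4.57
```

Under these assumptions the ARM box would need to save roughly 4.6kW continuously over three years to pay back the premium on electricity alone, which is more than a whole server typically draws. Cooling overhead narrows the gap somewhat, but the point stands: the price difference dwarfs plausible power savings today.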

Then there are the external factors. Calxeda’s roadmap shows Cortex-A15 and future 64-bit Cortex-A57 CPUs in the pipe, but Intel has its own 22nm Atom refresh coming later this year. Atom is badly in need of a new architecture; the 22nm design could flip the performance advantage back to Intel’s camp. Software and OS compatibility also favor x86, and by a wide margin. It’s also true that the upcoming ARM processors will inevitably draw more power than the Cortex-A9 — whether you use ARM or x86, there’s no getting around the fact that higher single-thread performance costs more energy, as does adding RAM.

ARM server shipments will be fractional for the next few years, but this is the biggest potential challenge to x86’s server monopoly in well over a decade. Success is scarcely assured, but the technology has promise.
