A little less than 2 years ago, we investigated the first Arm server SoC that had a chance to compete with midrange Xeon E5s: the Cavium ThunderX. The SoC showed promise, however the low single-threaded performance and some power management issues relegated the 48-core SoC to more niche markets such as CDN and Web caching. In the end, Cavium's first server SoC was not a real threat to Intel's Xeon.

But Cavium did not give up, and rightfully so: the server market is more attractive than ever. Intel's datacenter group is good for about 20 Billion USD (!) in revenue per year. And even better, profit margins are in 50% range. When you want to profits and cash flow, the server market far outpaces any other hardware market. So following the launch of the ThunderX, Cavium promised to bring out a second iteration: better power management, better single thread performance and even more cores (54).

The trick, of course, is actually getting to a point where you can take on the well-oiled machine that is Intel. Arm, Calxeda, Broadcom, AppliedMicro and many others have made many bold promises over the past 5 years that have never materialized, so there is a great deal of skepticism – and rightfully so – towards new Arm Server SoCs.

However, the new creation of underdog Cavium deserves the benefit of the doubt. Much has changed – much more than the name alone lets on – as Cavium has bought the "Vulcan" design from Avago. Vulcan is a rather ambitious CPU design which was originally designed by the Arm server SoC team of Broadcom, and as a result has a much different heritage than the original ThunderX. At the same time however, based on its experience from the ThunderX, Cavium was able to take what they've learned thus far and have introduced some microarchitectural improvements to the Vulcan design to improve its performance and power.

As a result, ThunderX2 is a much more "brainiac" core than the previous generation. While the ThunderX core had a very short pipeline and could hardly sustain 2 instructions per clock, the Vulcan core was designed to fetch 8 and execute up to 4 instructions per clock. It gets better: 4 simultaneous threads can be active (SMT4), ensuring that the wide back-end is busy most of the time. 32 of those cores at clockspeeds up to 2.5 GHz find a home in the new ThunderX2 SoC.

With up to 128 threads running and no less than eight DDR4 controllers, this CPU should be able to perform well in all server situations. In other words, while the ThunderX (1) was relegated to niche roles, the ThunderX2 is the first Arm server CPU that has a chance to break the server market open.