Why ARM Servers, And Why Now?

It is a pity that smartphones and tablets did not come along earlier and did not need 64-bit processing and memory addressing sooner than they did. Had these consumer devices (which are now generally thought of as indispensable for business as well) required such rich circuitry earlier, then the chip manufacturers that make up the ARM collective might have put some server-class chips into the field a lot sooner and given datacenters some real alternatives to the X86 architecture by now.

But that did not happen, for a variety of chicken-and-egg reasons. This was one of the topics of conversation at the ARM Tech Day dedicated to servers that ARM Holdings hosted in Austin, Texas, last week, which EnterpriseTech attended. The ARM server folks were on hand to drill down a bit into the technology that the holding company creates on behalf of its licensees, and a number of vendors demonstrated their hard and soft wares to show that the ARM server ecosystem is coming along.

It is helpful to remember that from the birth of the IBM PC in 1981 it took fifteen years for a true server processor, the Pentium Pro, to come to market, and it took another decade before the X86 architecture was the unquestioned leader in volumes and revenues in the systems space. These things take time, mainly because designing a processor is one of the most complex endeavors on Earth and building a software ecosystem for server workloads on any new processor requires the coordinated efforts of untold thousands of programmers. No one wants to be too early, as was the case with ARM server chip upstart Calxeda, which had a clever design integrating serving and switching functions on the same die but which spent all of its venture capital before getting to 64 bits, having run ahead of the software ecosystem. But no one wants to be too late, either, especially when giving Intel years to bring its chip engineers and manufacturing prowess to bear can wipe out any significant lead.

You rarely get second chances coming up against Intel.

Even without 64-bit ARM chips shipping in volume, the market is nonetheless getting what it needs from ARM Holdings and the half dozen companies who are working on ARM server chip designs: Pressure on Intel to make a wider variety of chips, often with customizations for large-scale buyers and with a plethora of performance bands, prices, and thermal characteristics. This is, of course, precisely the role that Intel’s own Pentium Pro and then Xeon processors played against RISC and CISC architectures two decades ago. Just knowing there is an option gives companies leverage. Just ask Google and Amazon Web Services, both of which are rumored to be designing their own custom ARM server chips. Google is heading up the OpenPower Foundation started by IBM to promote and open up its Power8 processors for much the same reason.

Leverage is interesting, but what the ARM collective wants is to disrupt the hegemony of X86 in the datacenter. Over two decades, Intel has brilliantly changed the price/performance of serving and is now doing the same for storage and switching, but it is useful to remember a few things as we contemplate ARM doing the same thing to X86.

First, the ARM hardware and software ecosystem is huge, spanning from the tiniest sensors to servers, and it is just as reasonable to predict that ARM will take over the datacenter as it is to predict that Intel will take over smartphones, tablets, and other devices. With an increasing number of hyperscale Web application providers owning their own stacks and public clouds offering up services as well as raw infrastructure for running Windows, Linux, or Unix instances, a very large portion of the server installed base – and importantly, a portion that keeps growing every year as the world gets cloudier – will be able to make the jump when several 64-bit ARM server chips are in the field. Even Microsoft has joined the Open Compute Project, donating its own minimalist server design from its Azure cloud, and is also participating with ARM Holdings on developing hardware and operating system standards for ARM server chips. Microsoft has made no commitment to put Windows Server on ARM chips, and it may end up using such a port internally long before it makes one available as a commercial product. But with Windows accounting for nearly half of worldwide server revenues and an even larger share of shipments, having both Windows and Linux available on ARM would anoint these RISC processors as the alternative to X86 in the datacenter.

The questions that many people are asking are: Why ARM servers and why now? Jeff Underhill, director of server programs at ARM Holdings, gave the answer to both questions.

“People are doing things at a scale that we have not seen before,” explained Underhill. “And they are optimizing for total cost of ownership.”

This makes perfect sense, given that the hyperscale datacenter operators are by and large providing software or raw compute as a service and their IT infrastructure and their people are the only capital expenses they have. In many cases, the services are free and companies derive their revenues from advertising or other data services they provide. (In some sense, access to Web surfers is in fact the product, and the IT infrastructure is just there to gather them up and engage them.) So every penny not spent on IT infrastructure, space, power, or cooling is one that drops to the bottom line. In other industries, IT spending might be on the order of 2 to 5 percent of revenues, we estimate, but it is many times larger at the hyperscale companies and public cloud builders. So the amount of money we are talking about here is huge. Facebook has said that by designing its own datacenters and systems it has saved over $1.2 billion.

Moreover, the style of applications in use by hyperscale companies is amenable to a shift to the ARM architecture. Many modern data storage and analytics applications are not CPU bound – meaning that very fast single-thread or per-core performance is not as important as having lots of memory or I/O bandwidth. These new workloads, ranging from Memcached caching software to distributed Web servers to relational database clusters to Hadoop and other data analytics to various NoSQL data stores, are well suited to ARM processors, says Underhill. The fact that much of this software is open source means it can be ported from X86 chips to ARM processors; you don’t have to wait for a software vendor to do the work.
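As a rough illustration of what such a self-service port involves, a build script can pick up the target architecture at configure time and choose compiler flags accordingly. This is only a sketch – the flag choices below are illustrative, not taken from any particular package:

```shell
# Detect the machine architecture so a source build can pick
# architecture-specific compiler flags. On a 64-bit ARM server
# `uname -m` reports "aarch64"; on most Intel/AMD servers, "x86_64".
arch=$(uname -m)

case "$arch" in
    aarch64)
        # Illustrative ARMv8 tuning flags for GCC
        CFLAGS="-O2 -march=armv8-a"
        ;;
    x86_64)
        CFLAGS="-O2 -march=x86-64"
        ;;
    *)
        # Fall back to generic optimization on anything else
        CFLAGS="-O2"
        ;;
esac

echo "building for $arch with CFLAGS=$CFLAGS"
```

The point is less the specific flags than the fact that with source in hand, the recompile-and-tune loop belongs to the operator rather than to an ISV's porting schedule.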

The other thing that hyperscale companies need is speed, and they need it on a few fronts. First, they want to be able to iterate system designs faster, and they want multiple sources for parts so as not to be beholden to one vendor and its product roadmaps. The ARM ecosystem has two companies – AMD and Applied Micro – sampling 64-bit chips now, and others are working to get processors in the field before too long. (Google and Amazon may be building their own as well.) They also want speed in the form of efficiency in the processor itself and broad compatibility with the current software ecosystem, which is largely dominated by X86 iron in the datacenter.

“ARMv8 is a clean 64-bit instruction set architecture, aligned to the programming style on the Web and in the cloud,” says Underhill. All of the key software components needed for hyperscale operators (and many enterprises) are in the stack:

As Canonical announced two weeks ago, Ubuntu Server 14.04 LTS supports the X-Gene 1 from Applied Micro and the Thunder from Cavium Networks. Red Hat has demonstrated its Fedora development Linux on the X-Gene 1 and on AMD’s “Seattle” Opteron A1150 processors. Jon Masters, chief ARM architect at Red Hat, said at the ARM Tech Day that 98.6 percent of the packages in RHEL are “ARM clean.” He added that this is a full-on 64-bit implementation of the stack, that Red Hat would not support 32-bit code, and that the support for 64KB memory pages made it impossible to do a 32-bit port. That said, you can always load a virtual machine with a version of Linux that does support 32-bit code if you need to, said Masters. Red Hat will not comment on when or how ARM support will be available with the commercial-grade Enterprise Linux; a lot depends on the availability of hardware that supports various standards (UEFI and ACPI, for instance), demand from customers, and the readiness of key programs such as Java. SUSE Linux has added support for ARM chips in the openSUSE development version and is very likely shooting to have ARM support in Enterprise Server 12, due to go into beta perhaps later this year for production next year.
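The page-size point Masters raised is easy to check on any running Linux system. A minimal sketch using only standard interfaces – nothing ARM-specific is assumed here:

```shell
# Query the kernel's base page size. A 64-bit ARM kernel built with
# 64KB pages reports 65536 here, while most X86 kernels report 4096.
page_size=$(getconf PAGE_SIZE)
echo "kernel page size: $page_size bytes"

# Sanity check: a valid page size is always a power of two.
# (Bit trick: n & (n - 1) is zero exactly when n is a power of two.)
if [ $(( page_size & (page_size - 1) )) -eq 0 ]; then
    echo "page size is a power of two"
fi
```

A 64KB base page is one reason a distribution might commit to 64-bit-only support: 32-bit userland code commonly assumes the traditional 4KB granularity.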

Red Hat and Canonical are doing a lot of the work to get the KVM hypervisor running properly on 64-bit ARM, and Citrix Systems is also working to get the Xen hypervisor running there. Xen is the preferred virtualizer on Linux-based public clouds (Amazon Web Services, Rackspace Hosting, and IBM SoftLayer all use a variant of Xen; it is not clear what Google uses, and Microsoft clearly uses Hyper-V). Stefano Stabellini, senior principal software engineer at Citrix, explained that Xen on ARM was a “lean and simple architecture” that “removed all of the cruft accumulated over the years” in the X86 implementation of Xen. The ARM variant of Xen has no emulation and does not make use of QEMU, and it provides only one type of guest, which combines the two options available on X86 machinery: the full virtualization of a Hardware Virtual Machine and the partial virtualization available through Para-Virtualization.

Some coders started part-time hacking on the ARM version of Xen at the end of 2011, and development accelerated after Citrix joined the Linaro Linux-on-ARM effort. Last June, the ARM64 variant of Xen was added to the Linux 3.11 kernel, and support for this was added to Xen 4.3 last July. Xen 4.4 came out this March and includes, among other features, support for 64-bit guests on ARMv8 iron, memory ballooning, CPU pooling, vCPU pinning, and scheduler improvements. The OMAP5 chip from Texas Instruments and the X-Gene 1 from Applied Micro can run Xen 4.4, as can a number of development boards with various 32-bit processors. With Xen 4.5, due in the fourth quarter, the hypervisor will support UEFI and ACPI as well as live migration of virtual machines, and it will have a bunch of new drivers for core components of ARM processors.

With a few suppliers of processors shipping products late this year and early next and operating system stacks getting close to polished for ARM, 2015 looks like the year when ARM will actually start putting pressure on X86 servers and, to a lesser extent, on other architectures such as Power and Sparc. Underhill says that ARM Holdings figures that ARM chips could account for somewhere between 5 percent and 10 percent of server shipments in 2017; there are those who think this number is a low-ball figure.

The expectation is that the hyperscale datacenter operators are going to jump in first, and at the AMD announcements this week, where the X86 and ARM chip maker fleshed out its “ambidextrous computing” strategy, Paul Santeler, general manager of Hewlett-Packard’s Hyperscale Business Unit, explained why.

“The dense server market is different from the enterprise server space, which has multiple operating systems across multiple architectures,” said Santeler. “The scale-out architectures are very homogeneous in how they are built. They have multiple application tiers, they get that structured, and then they scale it out when they need more capacity or to add more users. That allows them to leverage that core software expertise, and they adapt new technologies pretty quickly, too. If you look at all of the new technologies, they are actually pushing quicker through scale-out datacenters – flash memory, scale-out databases, open source – because they need these capabilities and they see an immediate return on investment. I think scale-out datacenters will be the on-ramp for ARM64.”

A lot will depend, of course, on how well ARM chips perform, how widely available they are, and at what price. All of these are still largely variables at this point, which means everyone is still guessing. But having seen a client device take over the datacenter once before, we all know for certain that it can happen. The question is: Can it happen again?