At an event in San Jose on Wednesday, Qualcomm and its partners officially announced that the Centriq 2400 server processor, based on the Arm architecture, is shipping to commercial clients. The launch is notable because Centriq is the highest-profile and most partner-lauded Arm-based server CPU and platform to be released after years of buildup and excitement around several similar products. The Centriq is built specifically for enterprise cloud workloads, with an emphasis on high core count and high throughput, and will compete against Intel’s Xeon Scalable and AMD’s new EPYC platforms.

Paul Jacobs shows Qualcomm Centriq to press and analysts

Built on the same 10nm process technology from Samsung that gave rise to the Snapdragon 835, the Centriq 2400 becomes the first server processor on that node. While Qualcomm and Samsung tout that as a significant selling point, on its own it doesn’t hold much value. Where it does come into play, and affects the product’s positioning, is in the power efficiency that results. Qualcomm claims that the Centriq 2400 will “offer exceptional performance-per-watt and performance-per dollar” compared to competing server options.

The raw specifications and capabilities of the Centriq 2400 are impressive.

| | Centriq 2460 | Centriq 2452 | Centriq 2434 |
|---|---|---|---|
| Architecture | ARMv8 (64-bit), Core: Falkor | ARMv8 (64-bit), Core: Falkor | ARMv8 (64-bit), Core: Falkor |
| Process Tech | 10nm (Samsung) | 10nm (Samsung) | 10nm (Samsung) |
| Socket | ? | ? | ? |
| Cores/Threads | 48/48 | 46/46 | 40/40 |
| Base Clock | 2.2 GHz | 2.2 GHz | 2.3 GHz |
| Max Clock | 2.6 GHz | 2.6 GHz | 2.5 GHz |
| Memory Tech | DDR4 | DDR4 | DDR4 |
| Memory Speeds | 2667 MHz, 128 GB/s | 2667 MHz, 128 GB/s | 2667 MHz, 128 GB/s |
| Cache | 24MB L2, split; 60MB L3 | 23MB L2, split; 57.5MB L3 | 20MB L2, split; 50MB L3 |
| PCIe | 32 lanes PCIe 3.0 | 32 lanes PCIe 3.0 | 32 lanes PCIe 3.0 |
| Graphics | N/A | N/A | N/A |
| TDP | 120W | 120W | 120W |
| MSRP | $1995 | $1383 | $888 |

Built from 18 billion transistors on a die area of just 398 mm², the SoC holds 48 high-performance 64-bit cores running at frequencies as high as 2.6 GHz. (Interestingly, this appears to be about the same peak clock rate as the Snapdragon processor cores we have seen in consumer products.) The cores are interconnected by a bi-directional ring bus that is reminiscent of the design Intel used on its Core processor family until Skylake-SP was brought to market. The bus supports 250 GB/s of aggregate bandwidth, and Qualcomm claims that this will alleviate any concern over congestion bottlenecks, even with the CPU cores under full load.

The caching system provides 512KB of L2 cache for every pair of CPU cores, essentially organizing them into dual-core blocks. 60MB of L3 cache provides core-to-core communication, and the cache is physically distributed around the die for faster access on average. A 6-channel DDR4 memory system, with unknown peak frequency, supports a total of 768GB of capacity.
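As a rough cross-check, the 128 GB/s listed in the specification table is consistent with six DDR4 channels at the listed 2667 speed. The sketch below assumes the standard 64-bit (8-byte) DDR4 channel width and that the 2667 figure is a transfer rate in MT/s; the function name is purely illustrative, and the result is a theoretical peak, not a measured number.

```python
def peak_ddr_bandwidth_gbs(channels: int,
                           mega_transfers_per_s: float,
                           bytes_per_transfer: int = 8) -> float:
    """Theoretical peak memory bandwidth in decimal GB/s.

    Assumes every channel moves `bytes_per_transfer` bytes per transfer
    (8 bytes for a standard 64-bit DDR4 channel) and ignores real-world
    efficiency losses.
    """
    bytes_per_second = channels * mega_transfers_per_s * 1e6 * bytes_per_transfer
    return bytes_per_second / 1e9


# Six channels at the 2667 speed listed in the specification table.
centriq_bw = peak_ddr_bandwidth_gbs(channels=6, mega_transfers_per_s=2667)
print(f"{centriq_bw:.1f} GB/s")  # prints "128.0 GB/s"
```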

Connectivity is supplied with 32 lanes of PCIe 3.0 and up to 6 PCIe devices.

As you should expect, the Centriq 2400 supports the ARM TrustZone secure operating environment and hypervisors for virtualized environments. With this many cores on a single chip, virtualization seems likely to be one of the key use cases for the server CPU.

Perhaps most impressive are the power requirements of the Centriq 2400. It can offer this level of performance and connectivity within a TDP of just 120 watts.

With a price of $1995 for the Centriq 2460, Qualcomm claims that it can offer “4X better performance per dollar and up to 45% better performance per watt versus Intel’s highest performance Skylake processor, the Intel Xeon Platinum 8180.” That’s no small claim. The 8180 is a 28-core/56-thread CPU with a peak frequency of 3.8 GHz, a TDP of 205 watts, and a cost of $10,000 (not a typo).
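The prices and TDPs quoted here allow a simple spec-sheet comparison. The sketch below uses only the figures in this article, including the rounded $10,000 price for the 8180. It says nothing about measured performance, which is what Qualcomm’s 4X and 45% claims depend on, and TDP is a design limit rather than measured power draw. The `Chip` class and its field names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chip:
    name: str
    cores: int
    tdp_watts: float
    price_usd: float  # price as quoted in the article

    @property
    def dollars_per_core(self) -> float:
        return self.price_usd / self.cores

    @property
    def watts_per_core(self) -> float:
        return self.tdp_watts / self.cores


centriq = Chip("Centriq 2460", cores=48, tdp_watts=120, price_usd=1995)
xeon = Chip("Xeon Platinum 8180", cores=28, tdp_watts=205, price_usd=10_000)

for chip in (centriq, xeon):
    print(f"{chip.name}: ${chip.dollars_per_core:.2f}/core, "
          f"{chip.watts_per_core:.2f} W/core")

# How much more the Xeon costs and how much higher its TDP is.
price_ratio = xeon.price_usd / centriq.price_usd
tdp_ratio = xeon.tdp_watts / centriq.tdp_watts
print(f"Price ratio: {price_ratio:.2f}x, TDP ratio: {tdp_ratio:.2f}x")
```

On these numbers the 8180 costs about five times as much as the Centriq 2460, so a 4X performance-per-dollar advantage would not require the Centriq to match the Xeon’s absolute throughput.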

Qualcomm presented performance metrics from industry-standard SPECint measurements, covering raw single-thread results as well as performance per dollar and performance per watt. I will have more on the performance story of Centriq later this week.

More important than simply showing hardware, Qualcomm had several partners on hand at the press event, as well as statements of support from important vendors like Alibaba, HPE, Google, Microsoft, and Samsung. Present to showcase applications running on the Arm-based server platform was an impressive list of key cloud service providers and ecosystem partners: Alibaba, LinkedIn, Cloudflare, American Megatrends Inc., Arm, Cadence Design Systems, Canonical, Chelsio Communications, Excelero, Hewlett Packard Enterprise, Illumina, MariaDB, Mellanox, Microsoft Azure, MongoDB, Netronome, Packet, Red Hat, ScyllaDB, 6WIND, Samsung, Solarflare, Smartcore, SUSE, Uber, and Xilinx.

The Centriq 2400 series of SoCs isn’t a perfect fit for all general-purpose workloads, and that is something we have understood from the outset of this venture by Arm and its partners to bring the architecture to the enterprise market. Qualcomm states that its parts are designed for “highly threaded cloud native applications that are developed as micro-services and deployed for scale-out.” The result is a set of workloads that covers a lot of ground:

Web front end with HipHop Virtual Machine

NoSQL databases including MongoDB, Varnish, ScyllaDB

Cloud orchestration and automation including Kubernetes, Docker, metal-as-a-service

Data analytics including Apache Spark

Deep learning inference

Network function virtualization

Video and image processing acceleration

Multi-core electronic design automation

High throughput compute bioinformatics

Neural class networks

OpenStack Platform

Scaleout Server SAN with NVMe

Server-based network offload

I will be diving deeper into the architecture, system designs, and partner announcements later this week, as I think the Qualcomm Centriq 2400 family will have a significant impact on the future of the enterprise server market.