Intel has taken the wraps off Knights Landing, its next-gen, up-to-72-core Xeon Phi supercomputing chip. The main change is that Knights Landing will be a standalone processor, rather than a slot-in coprocessor that must be paired with standard Xeon CPU. Furthermore, Knights Landing will have up to 16GB of DRAM 3D stacked on-package, providing up to 500GB/sec of memory bandwidth (along with up to 384GB of DDR4-2400 mainboard memory). Knights Landing will debut in 2015 on Intel’s 14nm process, and with a promise of 3 teraflops (double precision) per socket it will almost certainly be used to build some monster 100+ petaflop x86 supercomputers, and beyond to exascale.

The current version of Xeon Phi (Knights Corner) is a PCIe expansion board with an up-to-61-core Intel MIC (Many Integrated Core) chip. These cores are based on the original P54C Pentium core — just like its stillborn Larrabee predecessor — but with a lot of modern additions, such as 64-bit support and 512-bit vector registers. (Read more details about the current Xeon Phi.) Knights Landing is a major revision of Knights Corner, making sweeping changes across almost the entirety of the platform. Gone are the P54C cores, replaced with up to 72 out-of-order Silvermont (Atom) cores. These new cores will implement AVX-512 (AVX 3.1 instructions). Perhaps most importantly, though, Knights Landing will be a standalone CPU, with an integrated six-channel DDR4-2400 memory controller, up to 16GB of on-package 3D stacked RAM, and 36 PCIe 3.0 lanes.

All of these changes equate to theoretical performance of 6 teraflops of single precision math, or 3 teraflops of double precision math. By comparison, Haswell maxes out at around 500 gigaflops of double precision math. Power-wise, Knights Landing should manage between 14 and 16 gigaflops per watt. While it’s a nascent comparison, the most efficient supercomputers currently max out at around 4 gigaflops per watt. With 16GB of on-package RAM with bandwidth of 500GB/sec, there should be significant latency gains, too. Later in 2015, there will also be a special version called Knights Landing-F that integrates a 100Gbps Cray HPC interconnect on 32 of those PCIe 3.0 lanes, allowing supercomputer makers to connect up Knights Landing chips via standard QSFP optical links.

Tianhe-2 ), but adoption is generally lower (just 13 of the top 500). By becoming an actual CPU, rather than an add-in card that must be controlled by a “normal” CPU (Haswell, Opteron, etc.), it will be possible to build supercomputers entirely out of Xeon Phi — a huge change that will both reduce the complexity and cost of building supercomputers, and, thanks to the unified architecture, it’ll be a lot easier to write software that takes full advantage of the hardware.

At 3 teraflops per socket, assuming four sockets per 1U server, we’re looking at a full 500 teraflops (half a petaflop) in a single 42U rack. If the 100 petaflops barrier hasn’t been broken by 2015, it will almost certainly be a Knights Landing-based supercomputer that does it first — and it should be a serious competitor for the race to exascale (1000+ petaflops) computing.

Now read: The history of supercomputers