Until now, Intel had been optimizing its existing general-purpose Xeon CPUs (central processing units) for DL (deep learning) training and inference. This approach was not efficient enough, so the company developed dedicated processors to provide both flexibility and efficiency across different types of DL models.

Intel’s NNP-T (code-named Spring Crest) is a scalable 16 nm processor featuring 24 tensor cores dedicated to AI workloads; NVIDIA’s Volta and Turing GPUs and Google’s custom tensor processing unit likewise rely on tensor cores for AI. The NNP-T carries 32 GB of HBM2 (high-bandwidth memory) and delivers up to 119 TOPS (tera operations per second). The processor supports a PCIe 4.0 (peripheral component interconnect express) x16 host connection and is expected to consume 150–250 watts. Nervana Systems connects all of these elements in a single package using TSMC’s advanced chip-on-wafer-on-substrate (CoWoS) packaging.
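To put these figures in perspective, here is a minimal back-of-envelope sketch in Python. The 119 TOPS, 150–250 W, and PCIe 4.0 x16 numbers come from the specifications above; the per-lane PCIe 4.0 rate (16 GT/s with 128b/130b encoding) is the standard specification value, not something stated in the text, and the results are rough peak-rate estimates only.

```python
# Rough efficiency and host-bandwidth estimates from the quoted NNP-T specs.

PEAK_TOPS = 119            # peak tera operations per second (from the text)
POWER_RANGE_W = (150, 250) # expected power envelope in watts (from the text)

# Peak compute efficiency across the stated power envelope.
eff_high = PEAK_TOPS / POWER_RANGE_W[0]  # ~0.79 TOPS/W at 150 W
eff_low = PEAK_TOPS / POWER_RANGE_W[1]   # ~0.48 TOPS/W at 250 W

# PCIe 4.0 x16 host link: 16 GT/s per lane, 128b/130b encoding,
# giving roughly 31.5 GB/s of usable bandwidth in each direction.
pcie4_x16_gbytes_per_s = 16 * (16 * 128 / 130) / 8

print(f"Peak efficiency: {eff_low:.2f}-{eff_high:.2f} TOPS/W")
print(f"Host link: ~{pcie4_x16_gbytes_per_s:.1f} GB/s per direction")
```

The point of the comparison is that the host link moves data orders of magnitude slower than the on-package HBM2 and compute can consume it, which is why keeping the 32 GB of model state resident on the accelerator matters for training throughput.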