Intel enlisted Facebook, one of the most enthusiastic users of deep learning and artificial intelligence, to help with the chip design. "We are thrilled to have Facebook in close collaboration sharing their technical insights as we bring this new generation of AI hardware to market," said Intel CEO Brian Krzanich (below) during his keynote speech at WSJDLive. Beyond social media, Intel is targeting healthcare, automotive and weather, among other applications.

Intel CEO Brian Krzanich at WSJDLive (AOL/Nicole Lee)

Unlike its PC chips, the Nervana NNP is an application-specific integrated circuit (ASIC) that's specially made for both training and executing deep learning algorithms. "The speed and computational efficiency of deep learning can be greatly advanced by ASICs that are customized for ... this workload," writes Intel's VP of AI, Naveen Rao.

The chips are designed to do matrix multiplication and convolutions, among the most common calculations done by deep learning programs. Intel has eliminated the generalized cache normally seen on CPUs, instead using special software to manage on-chip memory for a given algorithm. "This enables the chip to achieve new levels of compute density and performance for deep learning," says Rao.
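To make those two operations concrete, here is a minimal pure-Python sketch of a matrix multiplication and a one-dimensional convolution, plus a version that lowers the convolution to a matrix multiplication. It is only an illustration of the math, not Intel's implementation; the function names and the sample data are invented for this example.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    inner = len(b)
    assert all(len(row) == inner for row in a), "shape mismatch"
    cols = len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def conv1d(signal, kernel):
    """'Valid' 1D convolution as used in deep learning (no kernel flip)."""
    width = len(kernel)
    return [sum(signal[i + k] * kernel[k] for k in range(width))
            for i in range(len(signal) - width + 1)]


def conv1d_as_matmul(signal, kernel):
    """Same result, computed by unrolling sliding windows into matrix rows."""
    width = len(kernel)
    windows = [signal[i:i + width] for i in range(len(signal) - width + 1)]
    column = [[w] for w in kernel]  # kernel as a (width x 1) matrix
    return [row[0] for row in matmul(windows, column)]


signal = [1, 2, 3, 4, 5]
kernel = [1, 0, -1]
print(conv1d(signal, kernel))            # [-2, -2, -2]
print(conv1d_as_matmul(signal, kernel))  # [-2, -2, -2]
```

Because a convolution can be rewritten as a matrix multiplication this way, hardware that is fast at one tends to help with the other.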

The chip is also designed with high-speed interconnects both on and off the chip, allowing for "massive bi-directional data transfer." That means if you link a bunch of the chips together, they can act as one huge virtual chip, allowing for ever-larger deep-learning models.
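One common way to spread a model over several processors is to give each one a slice of a weight matrix and stitch the partial results back together. The sketch below shows that idea in plain Python. Intel has not described how the NNP divides work, so the column-wise split, the `chips` parameter and the function names are assumptions made purely for illustration.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(len(b))) for j in range(cols)]
            for row in a]


def split_columns(matrix, parts):
    """Split a matrix into `parts` column blocks of near-equal width."""
    cols = len(matrix[0])
    bounds = [cols * p // parts for p in range(parts + 1)]
    return [[row[lo:hi] for row in matrix]
            for lo, hi in zip(bounds, bounds[1:])]


def sharded_matmul(a, b, chips):
    """Compute a @ b with each hypothetical chip handling one column block."""
    # Each partial product could run on a separate device in parallel.
    partials = [matmul(a, shard) for shard in split_columns(b, chips)]
    # Concatenate the partial rows to rebuild the full result.
    return [sum((p[i] for p in partials), []) for i in range(len(a))]


a = [[1, 2], [3, 4]]
b = [[1, 2, 3, 4], [5, 6, 7, 8]]
print(sharded_matmul(a, b, chips=2) == matmul(a, b))  # True
```

In a real system the concatenation step is where the interconnect matters: the faster partial results move between chips, the less time each chip spends waiting.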

Oddly, the Nervana NNP uses a lower-precision form of integer math called Flexpoint. "Neural networks are very tolerant to data 'noise' and this noise can even help them converge on solutions," Rao adds. At the same time, using lower-precision numbers allowed designers to increase so-called parallelism, reducing latency and increasing bandwidth.
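The article does not spell out how Flexpoint works, but the general idea of trading precision for cheap integer arithmetic can be sketched with a shared-exponent scheme: every number in a block is stored as a small integer, and one power-of-two scale applies to the whole block. The code below is a generic illustration of that technique, not Intel's actual format; the 8-bit mantissa width and the function names are assumptions.

```python
import math


def quantize_shared_exponent(values, mantissa_bits=8):
    """Encode floats as signed integers that share one power-of-two scale."""
    largest = max(abs(v) for v in values)
    limit = 2 ** (mantissa_bits - 1) - 1  # e.g. 127 for 8 bits
    if largest == 0:
        return [0] * len(values), 0
    # Smallest exponent such that largest / 2**exponent still fits in `limit`.
    exponent = math.ceil(math.log2(largest / limit))
    scale = 2.0 ** exponent
    # Clamp to guard against floating-point rounding at the boundary.
    mantissas = [max(-limit, min(limit, round(v / scale))) for v in values]
    return mantissas, exponent


def dequantize(mantissas, exponent):
    """Recover approximate floats from integers and the shared exponent."""
    scale = 2.0 ** exponent
    return [m * scale for m in mantissas]


mantissas, exponent = quantize_shared_exponent([0.5, -1.25, 3.0])
print(mantissas, exponent)              # [16, -40, 96] -5
print(dequantize(mantissas, exponent))  # [0.5, -1.25, 3.0]
```

Values that do not land exactly on the grid pick up a small rounding error of at most half the scale, which is the kind of "noise" Rao says neural networks tolerate.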

The goal of this new architecture is to develop a processor that is flexible enough to handle deep learning workloads and scalable enough to handle high-intensity computation requirements by making core hardware components as efficient as possible.

NVIDIA has famously pushed Intel to the side of the road in AI thanks to a sort of lucky accident. As it happens, the GPUs it uses in graphics cards and supercomputers are the best option for training AI algorithms -- though not executing them -- so companies like Google and Facebook have been using them that way. Meanwhile, Intel's arch-rival Qualcomm has been working on chips that are exceptionally good at executing AI programs.

Intel is no doubt hoping to change that formula with the Nervana NNP chips, which are efficient at both AI training and execution. The company says it has "multiple generations" of the chips in the pipeline, and obviously has the manufacturing and sales infrastructure needed to pump them out in volume and get them into clients' hands. Intel is also working on a so-called neuromorphic chip called Loihi that mimics the human brain, and of course has the Myriad X chip designed specifically for machine vision.

While Intel is hoping to at least catch up to NVIDIA, the latter isn't exactly standing still. It recently released the V100 chip specifically for AI apps, and hired Clément Farabet as VP of AI infrastructure, likely with the aim of making chips that are just as good at running deep learning programs as they are at training them. At the same time, Google has built its own "Tensor Processing Unit" (TPU) that it strictly uses in its own data centers, and IBM has a neuromorphic chip dubbed "TrueNorth." In other words, if you think we've reached peak AI, you haven't seen anything yet.