For years, the semiconductor world seemed to have settled into a quiet balance: Intel vanquished virtually all of the RISC processors in the server world, save IBM’s POWER line. Elsewhere AMD had self-destructed, making it pretty much an Intel world. Then Nvidia mowed down all of its many competitors in the 1990s. Suddenly only ATI, now a part of AMD, remained, and it boasted just half of Nvidia’s prior market share.

On the newer mobile front, it looked to be a similar near-monopolistic story: ARM ruled the world. Intel tried mightily with the Atom processor, but the company met repeated rejection before finally giving up in 2015.

Then just like that, everything changed. AMD resurfaced as a viable x86 competitor; the advent of field-programmable gate arrays (FPGAs) for specialized tasks like Big Data created a new niche. But really, the colossal shift in the chip world came with the advent of artificial intelligence (AI) and machine learning (ML). With these emerging technologies, a flood of new processors has arrived—and they are coming from unlikely sources.

That macro-view doesn’t even begin to account for the startups. The New York Times puts the number of AI-dedicated startup chip companies—not software companies, silicon companies—at 45 and growing, but even that count may be incomplete. It’s tricky to get a full picture since some are in China, funded by the government and flying under the radar.

Why the sudden explosion in hardware after years of chip maker stasis? After all, there is general consensus that Nvidia’s GPUs are excellent for AI and are widely used already. Why do we need more chips now, and so many different ones at that?

The answer is a bit complex, just like AI itself.


Follow the money (and usage and efficiency)

While x86 currently remains a dominant chip architecture for computing, it’s too general purpose for a highly specialized task like AI, says Addison Snell, CEO of Intersect360 Research, which covers HPC and AI issues.

“It was built to be a general server platform. As such it has to be pretty good at everything,” he says. “With other chips, [companies are] building something that specializes in one app without having to worry about the rest of the infrastructure. So leave the OS and infrastructure overhead to the x86 host and farm things out to various co-processors and accelerators.”

The actual task of processing AI is very different from standard CPU or GPU computing, hence the perceived need for specialized chips. An x86 CPU can do AI, but it takes 12 steps to do a task that requires only three; a GPU, in some cases, can also be overkill.

Generally, scientific computation is done in a deterministic fashion. You want to know that two plus three equals five and to calculate it to all of its decimal places—x86 and GPU chips do that just fine. But the nature of AI is to say that 2.5 + 3.5 has been observed to be six almost all of the time, without actually running the calculation. What matters with artificial intelligence today is the pattern found in the data, not the deterministic calculation.
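To make that distinction concrete, here is a minimal sketch (in Python with NumPy; the example is illustrative and not drawn from any particular chip or framework) of the pattern-based approach: rather than executing additions, a tiny model is fit to sums it has already observed and then predicts a new one.

```python
import numpy as np

# "Past experience": pairs of numbers and the sums that were observed for them.
observed_pairs = np.array([[1.0, 2.0], [2.0, 2.0], [4.0, 1.0], [3.0, 3.0], [0.5, 2.5]])
observed_sums = np.array([3.0, 4.0, 5.0, 6.0, 3.0])

# Fit a linear model (least squares) to the observations instead of hard-coding "+".
weights, *_ = np.linalg.lstsq(observed_pairs, observed_sums, rcond=None)

# Predict a sum the model never explicitly calculated: 2.5 + 3.5.
prediction = np.array([2.5, 3.5]) @ weights
print(f"learned weights: {weights.round(3)}, predicted sum: {prediction:.2f}")  # ~6.00
```

The answer comes from the pattern extracted from prior data, not from an addition instruction, and that fit-then-predict loop is the kind of workload the new chips are being designed around.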

In simpler terms, what defines AI and machine learning is that they draw upon and improve from past experience. The famous AlphaGo simulates tons of Go matches to improve. Another example you use every day is Facebook’s facial recognition AI, trained for years so it can accurately tag your photos (it should come as no surprise that Facebook has also made three major facial recognition acquisitions in recent years: Face.com [2012], Masquerade [2016], and Faciometrics [2016]).

Once a lesson is learned with AI, it does not necessarily have to be relearned. That is the hallmark of machine learning, a subset of the broader field of AI. At its core, ML is the practice of using algorithms to parse data, learn from it, and then make a determination or prediction based on that data. It’s a mechanism for pattern recognition—machine learning software remembers that two plus three equals five so the overall AI system can use that information, for instance. You can split hairs over whether that recognition counts as AI or not.

AI for self-driving cars, for another example, doesn’t use deterministic physics to determine the paths of the objects around the vehicle. It merely uses previous experience to say: this other car is here, traveling this way, and every other time I observed such a vehicle, it traveled this way. Therefore, the system expects a certain type of action.

The result of this predictive problem solving is that AI calculations can be done in single precision. So while CPUs and GPUs can both handle them very well, they are in fact overkill for the task. A single-precision chip can do the work in a much smaller, lower-power footprint.
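As a rough illustration of that footprint argument (a sketch of memory traffic only, using NumPy; it says nothing about any particular chip), the same multiply-accumulate work can be run at double, single, or half precision, with the operand data shrinking accordingly:

```python
import numpy as np

rng = np.random.default_rng(0)
activations = rng.standard_normal((1024, 1024))
layer_weights = rng.standard_normal((1024, 1024))

# Same matrix multiply at three precisions; only the storage per value changes.
for dtype in (np.float64, np.float32, np.float16):
    a = activations.astype(dtype)
    w = layer_weights.astype(dtype)
    out = a @ w
    megabytes = (a.nbytes + w.nbytes + out.nbytes) / 1e6
    print(f"{np.dtype(dtype).name}: {megabytes:.1f} MB of operands, "
          f"sample output {float(out[0, 0]):.3f}")
```

The low-precision results drift slightly, but for pattern matching that drift is usually tolerable, which is why inference-oriented silicon leans on 16-bit (and increasingly 8-bit) arithmetic.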

Make no mistake, power and scope are a big deal when it comes to chips—perhaps especially for AI, since one size does not fit all in this area. Within AI is machine learning, and within that is deep learning, and all those can be deployed for different tasks through different setups. “Not every AI chip is equal,” says Gary Brown, director of marketing at Movidius, an Intel company. Movidius made a custom chip just for deep learning processes because the steps involved are highly restricted on a CPU. “Each chip can handle different intelligence at different times. Our chip is visual intelligence, where algorithms are using camera input to derive meaning from what’s being seen. That’s our focus.”

Brown says there is a need to differentiate at the network edge as well as in the data center—companies in this space are simply finding they need different chips in these different locations.

“Chips on the edge won’t compete with chips for the data center,” he says. “Data center chips like Xeon have to have high performance capabilities for that kind of AI, which is different for AI in smartphones. There you have to get down below one watt. So the question is, ‘Where is [the native processor] not good enough so you need an accessory chip?’”

After all, power is an issue if you want AI on your smartphone or augmented reality headset. Nvidia’s Volta processors are beasts at AI processing but draw up to 300 watts. You aren’t going to shoehorn one of those into a smartphone.

Sean Stetson, director of technology advancement at Seegrid, a maker of self-driving industrial vehicles like forklifts, also feels AI and ML have been ill served by general-purpose processors thus far. “In order to make any algorithm work, whether it’s machine learning or image processing or graphics processing, they all have very specific workflows,” he says. “If you do not have a compute core set up specific to those patterns, you do a lot of wasteful data loads and transfers. It’s when you are moving data around that you are most inefficient; that’s where you incur a lot of signaling and transient power. The efficiency of a processor is measured in energy used per instruction.”
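Stetson’s metric, energy used per instruction, is easy to sketch on the back of an envelope. The power and throughput figures below are illustrative assumptions, not measurements from this article:

```python
# Back-of-the-envelope energy-per-operation comparison (assumed, illustrative figures).
def picojoules_per_op(watts: float, ops_per_second: float) -> float:
    """Energy per operation in picojoules: power divided by throughput."""
    return watts / ops_per_second * 1e12

# Assumption: a ~300 W data-center GPU sustaining ~15e12 single-precision ops/s,
# versus a ~1 W edge accelerator sustaining ~1e12 ops/s.
print(f"data-center GPU: {picojoules_per_op(300, 15e12):.0f} pJ/op")   # 20 pJ/op
print(f"edge accelerator: {picojoules_per_op(1, 1e12):.0f} pJ/op")     # 1 pJ/op
```

Under those assumed numbers, the small specialized part spends far less energy per operation, which is exactly the advantage the edge-chip makers are chasing.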

A desire for more specialization and increased energy efficiency isn’t the whole reason these newer AI chips exist, of course. Brad McCredie, an IBM fellow and vice president of IBM Power systems development, adds one more obvious incentive for everyone seemingly jumping on the bandwagon: the prize is so big. “The IT industry is seeing growth for the first time in decades, and we’re seeing an inflection in exponential growth,” he says. “That whole inflection is new money expected to come to IT industry, and it’s all around AI. That is what has caused the flood of VC into that space. People see a gold rush; there’s no doubt.”