Artificial Intelligence (or, if you prefer, Machine Learning) is today's hot buzzword. Unlike many buzzwords that have come before it, though, this stuff isn't vaporware dreams: it's real, it's here already, and it's changing your life whether you realize it or not.

A quick overview of AI/ML

Before we go too much further, let's talk quickly about that term "Artificial Intelligence." Yes, it's warranted; no, it doesn't mean KITT from Knight Rider, or Samantha, the all-too-human unseen digital assistant voiced by Scarlett Johansson in 2013's Her. Aside from being fictional, KITT and Samantha are examples of strong artificial intelligence, also known as Artificial General Intelligence (AGI). On the other hand, artificial intelligence, without the "strong" or "general" qualifiers, is an established academic term dating back to the 1955 proposal for the Dartmouth Summer Research Project on Artificial Intelligence (DSRPAI), written by Professors John McCarthy and Marvin Minsky along with Nathaniel Rochester and Claude Shannon.

All "artificial intelligence" really means is a system that emulates problem-solving skills normally seen in humans or animals. Traditionally, there are two branches of AI: symbolic and connectionist. Symbolic AI involves traditional rules-based programming, in which a programmer tells the computer, very explicitly, what to expect and how to deal with it. The "expert systems" of the 1980s and 1990s were examples of symbolic (attempts at) AI; while they were occasionally useful, it's generally considered impossible to scale this approach up to anything like real-world complexity.
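
To make the rules-based idea concrete, here is a minimal sketch of a symbolic decision system. It is not taken from any real expert system; the rule set, the observation fields (obstacle_distance_m, light, speed_kph, speed_limit_kph), and the action names are all hypothetical, chosen only for illustration.

```python
# Toy symbolic "AI": every rule below was written by hand by a programmer.
# All field names and thresholds are hypothetical, for illustration only.
RULES = [
    (lambda obs: obs["obstacle_distance_m"] < 5, "brake"),
    (lambda obs: obs["light"] == "red", "brake"),
    (lambda obs: obs["speed_kph"] < obs["speed_limit_kph"], "accelerate"),
]


def decide(obs):
    """Return the action of the first rule whose condition matches."""
    for condition, action in RULES:
        if condition(obs):
            return action
    # Nothing matched: the system has no idea what to do beyond a default.
    return "hold"


clear_road = {
    "obstacle_distance_m": 100,
    "light": "green",
    "speed_kph": 40,
    "speed_limit_kph": 50,
}
print(decide(clear_road))  # accelerate
```

The scaling problem shows up quickly: any situation the programmer did not anticipate (a flashing yellow light, say) simply falls through to the default, and covering the real world would take an unmanageable number of hand-written rules.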

Artificial Intelligence in the commonly used modern sense almost always refers to connectionist AI. Connectionist AI, unlike symbolic AI, isn't directly programmed by a human; it's also sometimes referred to as machine learning. Artificial neural networks are the most common type of connectionist AI. My colleague Tim Lee just got done writing about neural networks last week, and you can get caught up right here.

If you wanted to build a system that could drive a car, instead of programming it directly you might attach a sufficiently advanced neural network to its sensors and controls, and then let it "watch" a human driving for tens of thousands of hours. The neural network begins to attach weights to events and patterns in the data flow from its sensors, and those weights allow it to predict acceptable actions in response to various conditions. Eventually, you might give the network conditional control of the car and allow it to accelerate, brake, and steer on its own, but still with a human available to take over. The partially trained neural network can continue learning whenever the human takes the controls away from it: "Whoops, shouldn't have done that," and the neural network adjusts its weights again.
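
The weight-adjusting loop described above can be sketched with a single artificial neuron (a perceptron), which is vastly simpler than anything that could really drive a car. The two input features, the four demonstration samples, and the learning rate are all made-up assumptions; the point is only that the weights change when the prediction disagrees with what the human did.

```python
def predict(weights, bias, features):
    """Return 1 (brake) if the weighted sum of inputs is positive, else 0."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if score > 0 else 0


def train(examples, epochs=20, learning_rate=0.1):
    """Nudge the weights whenever the neuron disagrees with the human."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, human_action in examples:
            error = human_action - predict(weights, bias, features)
            if error:  # "Whoops, shouldn't have done that."
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, features)]
                bias += learning_rate * error
    return weights, bias


# Hypothetical demonstrations: (obstacle closeness 0..1, speed 0..1),
# followed by what the human driver did (1 = braked, 0 = did not brake).
DEMONSTRATIONS = [
    ((0.9, 0.8), 1),
    ((0.8, 0.3), 1),
    ((0.1, 0.9), 0),
    ((0.2, 0.2), 0),
]

weights, bias = train(DEMONSTRATIONS)
print(predict(weights, bias, (0.95, 0.5)))  # close obstacle the neuron never saw
```

Nobody wrote a rule saying "brake when the obstacle is close." The neuron arrived at a large weight on closeness purely from being corrected.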

Sounds very simple, doesn't it? In practice, not so much. There are many different types of neural networks (simple, convolutional, generative adversarial, and more), and none of them is very bright on its own; the brightest is roughly similar in scale to a worm's brain. Most complex, really interesting tasks will require networks of neural networks that preprocess data to find areas of interest, pass those areas of interest on to other neural networks trained to more accurately classify them, and so forth.
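
The "networks of networks" arrangement is essentially a pipeline. The sketch below shows only the plumbing: each stage is a trivial hand-written stand-in where a real system would call a trained network, and the tiny brightness grid, the thresholds, and the labels are all hypothetical.

```python
def find_regions(image, threshold=0.5):
    """Stage 1 stand-in: return (row, col) of every cell above the threshold.

    A real system would use a trained network to propose areas of interest.
    """
    return [(r, c)
            for r, row in enumerate(image)
            for c, value in enumerate(row)
            if value > threshold]


def classify_region(image, region):
    """Stage 2 stand-in: label one region found by stage 1.

    A real system would hand the region to a second, specialized network.
    """
    r, c = region
    return "headlight" if image[r][c] > 0.9 else "reflection"


def pipeline(image):
    """Chain the stages: find areas of interest, then classify each one."""
    return {region: classify_region(image, region)
            for region in find_regions(image)}


# Hypothetical 2x3 grid of brightness values between 0 and 1.
IMAGE = [
    [0.1, 0.2, 0.95],
    [0.0, 0.7, 0.1],
]
print(pipeline(IMAGE))
```

The second stage only ever looks at what the first stage passes along, which is what lets each network stay small and specialized.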

One last piece of the puzzle is that, when dealing with neural networks, there are two major modes of operation: inference and training. Training is just what it sounds like—you give the neural network a large batch of data that represents a problem space, and let it chew through it, identifying things of interest and possibly learning to match them to labels you've provided along with the data. Inference, on the other hand, is using an already-trained neural network to give you answers in a problem space that it understands.
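
The split between the two modes can be sketched in a few lines. To keep the example short, it uses a nearest-centroid classifier as a stand-in for a neural network; that swap, the two-dimensional data points, and the "cat" and "dog" labels are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def train(labeled_points):
    """Training: chew through labeled data and produce a model.

    The "model" here is just the average position (centroid) of each label.
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for (x, y), label in labeled_points:
        entry = sums[label]
        entry[0] += x
        entry[1] += y
        entry[2] += 1
    return {label: (sx / n, sy / n) for label, (sx, sy, n) in sums.items()}


def infer(model, point):
    """Inference: answer a question with the trained model, changing nothing."""
    return min(model, key=lambda label: math.dist(model[label], point))


# Hypothetical labeled training data.
TRAINING_DATA = [
    ((1, 1), "cat"), ((2, 1), "cat"), ((1, 2), "cat"),
    ((8, 8), "dog"), ((9, 8), "dog"), ((8, 9), "dog"),
]

model = train(TRAINING_DATA)   # done once, needs the whole data set
print(infer(model, (2, 2)))    # cat
print(infer(model, (7, 9)))    # dog
```

Training needs the full labeled data set and does all the heavy lifting; inference needs only the finished model and a single new input.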

Both inference and training workloads can operate several orders of magnitude more rapidly on GPUs than on general-purpose CPUs, but that doesn't necessarily mean you want to do absolutely everything on a GPU. It's generally easier and faster to run small jobs directly on CPUs than to incur the initial overhead of loading models and data into a GPU and its onboard VRAM, so you'll very frequently see inference workloads run on standard CPUs.
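
A back-of-the-envelope model shows why small jobs favor the CPU. The sketch below assumes a fixed GPU setup cost plus a per-item cost on each device; the specific timings are invented for illustration and are not measurements of any real hardware.

```python
import math


def cpu_time(n_items, per_item_cpu_s):
    """Total CPU time: no setup cost, but each item is slow."""
    return n_items * per_item_cpu_s


def gpu_time(n_items, per_item_gpu_s, setup_s):
    """Total GPU time: pay the load-into-VRAM overhead once, then go fast."""
    return setup_s + n_items * per_item_gpu_s


def break_even_items(per_item_cpu_s, per_item_gpu_s, setup_s):
    """Job size at which the GPU's speed has paid back its setup cost."""
    if per_item_cpu_s <= per_item_gpu_s:
        return math.inf  # the GPU never catches up
    return setup_s / (per_item_cpu_s - per_item_gpu_s)


# Hypothetical timings: the GPU is 100x faster per item but costs 2 s to set up.
CPU_S, GPU_S, SETUP_S = 0.01, 0.0001, 2.0

for n in (10, 10_000):
    print(n, cpu_time(n, CPU_S), gpu_time(n, GPU_S, SETUP_S))
print(break_even_items(CPU_S, GPU_S, SETUP_S))
```

With these made-up numbers, a 10-item job finishes far sooner on the CPU, the GPU only pulls ahead somewhere past 200 items, and at 10,000 items it wins by a wide margin.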