Why deep neural networks don’t actually think

Neural networks continue to solve ever more challenging problems, but they will never think as long as we use a “layered” approach

Overview

It’s 2019 now and we have machines that can

But the one thing that machines fail to do is… think. Let’s talk about why.

Figure 1: Magazine cover glamorizing Artificial Intelligence

Popular Science and Fiction

Magazines continue to draw a link between human and artificial intelligence, which blurs our common understanding of what machines are actually capable of. Headlines continue to allude to the “fact” that machines “think” or are somehow imbued with the ability to act in a sentient manner.

We continue to conjure up images of “Rosie” from The Jetsons or even HAL from “2001: A Space Odyssey.” It’s a lot more fun to imagine robots that can think than to accept the cold hard truth that they (currently) can’t. It may seem like semantics, but let’s delve into how the basic mechanisms of these artificially “intelligent” systems work.

Biological Inspiration

Figure 2: Artist rendition of neurons in the brain

You’ll sometimes hear how Artificial Neural Networks are “biologically inspired.” We have neurons in our brains (roughly 86 billion by modern estimates, often rounded up to 100 billion!) and they connect to each other in many interesting ways. In fact, there are an estimated 100 trillion or more neural connections (also called synapses) in the average human brain.

To put that into perspective, that’s hundreds of times the number of stars in our galaxy, and as much as 1,000 times if you use the lower estimates of the star count.

These neurons are exceedingly simple in that all they do is receive a small electrical impulse from one or more neurons and pass along an electrical impulse to one or more other neurons via the synapses. That’s it. That’s how all of your thoughts, emotions, memories, actions, decisions and regrets happen: tiny electrical impulses through neurons that decide whether to send a signal to other neurons, and how strong a signal to send.

This is a REAL neural network. Your brain has actual neurons that connect (in nonlinear ways) to each other in all sorts of fantastic arrangements. A single neuron can be connected to just one other neuron or to thousands, so the number of possible arrangements is enormous.

Artificial Neural Networks (ANNs)

Almost all of the incredible recent advancements in machine learning are due to the “Artificial Neural Network,” which is an attempt, albeit an extremely simple one, to model how the brain works. All of these fantastic feats are achieved with the help of a technology that was first conceived of in the 1940s. That’s right: all of our modern advances in machine learning and artificial intelligence rest on a roughly 80-year-old idea. It was further refined in the 1970s with the advent of “backpropagation,” which was popularized in the 1980s. This magical innovation is what gives machines the apparent ability to “think.”

Some 8th Grade Math

Before we understand how ANNs work, let’s revisit some basic algebra. At its core, algebra is solving equations with unknowns (called “variables”). A “system of equations” is when you have multiple equations and multiple variables and you need to solve for those variables. Something like:

Figure 3: A basic system of equations

If you have two variables, in this case x and y, you need at least 2 equations to figure out what those variables are. I won’t force you to go through the math, but in this case x is 3 and y is -1/9. If you had only one of those equations, say 8x + 9y = 23, well, there are a lot of numbers that can satisfy that. If x = 0, then y would have to be 23/9, or about 2.556. If x = 1, then y would be 15/9, or about 1.667. And it goes on infinitely from there. If we had only one equation and 2 variables, there would be literally an infinite number of possible solutions.
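To make the arithmetic concrete, here is a minimal Python sketch. The second equation from Figure 3 is not reproduced in the text, so `x - 9y = 4` below is a hypothetical stand-in chosen only because it leads to the same answer (x = 3, y = -1/9). The helper names `solve_2x2` and `y_for` are made up for this illustration.

```python
from fractions import Fraction


def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 using Cramer's rule."""
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("the two equations do not pin down a single answer")
    x = Fraction(c1 * b2 - c2 * b1, det)
    y = Fraction(a1 * c2 - a2 * c1, det)
    return x, y


def y_for(x):
    """With only 8x + 9y = 23, any x you pick has a matching y."""
    return Fraction(23 - 8 * x, 9)


# 8x + 9y = 23 comes from the text; x - 9y = 4 is an assumed second equation.
x, y = solve_2x2(8, 9, 23, 1, -9, 4)
print(x, y)  # 3 -1/9

# One equation alone: every choice of x works, so there is no single answer.
print([str(y_for(x0)) for x0 in (0, 1, 2)])
```

Exact fractions are used instead of floating point so that -1/9 prints as -1/9 rather than a rounded decimal.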

It’s only by adding the other equation that we pin the answer down. But let’s expand it a bit: instead of 2 variables and 2 equations, say we have 1,000 variables and 100 equations. With far fewer equations than variables, we couldn’t pin down a single exact answer, but we could probably find values that come pretty close to satisfying every equation.

That’s all an ANN is doing. It’s trying to come up with numbers for the (potentially) tens of thousands of variables and since it can’t come up with numbers that DEFINITELY work, it’s finding numbers that give the fewest errors.
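The idea of “numbers that give the fewest errors” can be sketched in a few lines of Python. This is only a toy under stated assumptions: it uses plain gradient descent on the sum of squared errors, the step size and step count are arbitrary values that happen to work for this tiny system, and the second and third equations are hypothetical (the third is deliberately chosen to conflict with the others so that no exact answer exists). Real networks are far larger and not linear like this.

```python
def total_error(equations, values):
    """Sum of squared gaps between each equation's left side and its target."""
    return sum(
        (sum(c * v for c, v in zip(coeffs, values)) - target) ** 2
        for coeffs, target in equations
    )


def fit(equations, n_vars, steps=2000, lr=0.002):
    """Repeatedly nudge every variable in the direction that lowers the error."""
    values = [0.0] * n_vars
    for _ in range(steps):
        grads = [0.0] * n_vars
        for coeffs, target in equations:
            residual = sum(c * v for c, v in zip(coeffs, values)) - target
            for j, c in enumerate(coeffs):
                grads[j] += 2 * residual * c  # slope of the squared error
        values = [v - lr * g for v, g in zip(values, grads)]
    return values


equations = [
    ([8, 9], 23),  # 8x + 9y = 23, from the text
    ([1, -9], 4),  # x - 9y = 4, hypothetical
    ([1, 1], 3),   # x + y = 3, hypothetical and in conflict with the others
]

values = fit(equations, n_vars=2)
print("start error:", total_error(equations, [0.0, 0.0]))
print("fitted values:", values)
print("final error:", total_error(equations, values))
```

No choice of x and y satisfies all three equations at once, so the error never reaches zero. The loop simply settles on the values where the total error is as small as it can get.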

We’re about to get into the architecture of ANNs, and the math can get pretty intense, but for our purposes, the “simple system of equations” model will suffice.

Neural Network Model