
One of the biggest misconceptions around is the idea that Deep Learning (DL) or Artificial Neural Networks (ANNs) mimic biological neurons. At best, an ANN mimics a cartoonish version of a 1957 model of a neuron. Anyone claiming Deep Learning is biologically inspired is doing so for marketing purposes or has never bothered to read the biological literature. Neurons in Deep Learning are essentially mathematical functions that compute the similarity of their inputs against internal weights. The closer the match, the more likely an action is performed (i.e., the signal is passed on rather than set to zero). There are exceptions to this model (see: Autoregressive networks); however, it is general enough to include the perceptron, convolutional networks, and RNNs.
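To make the "similarity function" idea concrete, here is a minimal sketch of a DL-style neuron. It assumes a dot product as the similarity measure and a ReLU as the thresholding step; the function name and the example numbers are purely illustrative and not taken from any particular library.

```python
def artificial_neuron(inputs, weights, bias=0.0):
    """Dot-product similarity of inputs against weights, followed by a ReLU.

    Assumption: ReLU stands in for "pass the signal on or set it to zero".
    """
    similarity = sum(x * w for x, w in zip(inputs, weights)) + bias
    return max(0.0, similarity)

weights = [0.5, -0.25, 1.0]

# Input pointing in the same direction as the weights: strong match.
aligned = artificial_neuron([1.0, -0.5, 2.0], weights)

# Input pointing in the opposite direction: no match, output clamped to zero.
opposed = artificial_neuron([-1.0, 0.5, -2.0], weights)

print(aligned, opposed)  # 2.625 0.0
```

The whole "neuron" is a stateless function of its inputs and a fixed set of weights, which is the point of the cartoon comparison.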

Biological neurons are very different from DL constructs. They don’t maintain continuous signals but rather exhibit spiking or event-driven behavior. So, when you hear about “neuromorphic” hardware, it is inspired by “integrate and spike” neurons. These kinds of systems, at best, get a lot of press (see: IBM TrueNorth), but have never been shown to be effective. However, some research has shown progress (see: https://arxiv.org/abs/1802.02627v1). If you ask me, if you genuinely want to build biologically-inspired cognition, then you should at the very least explore systems that are not continuous like DL. Biological systems, by nature, will use the least amount of energy to survive. DL systems, in stark contrast, are power hungry. That’s because DL is a brute-force method to achieve cognition. We know it works, we just don’t know how to scale it down.
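For contrast with the continuous DL neuron, here is a rough sketch of the “integrate and spike” idea, written as a textbook-style leaky integrate-and-fire loop. The threshold, leak factor, and input values are assumptions chosen for illustration; this is not how TrueNorth or any specific neuromorphic chip is implemented.

```python
def simulate_lif(input_current, threshold=1.0, leak=0.9, reset=0.0):
    """Leaky integrate-and-fire: return one 0/1 spike event per time step."""
    potential = reset
    spikes = []
    for current in input_current:
        # Integrate: the old potential decays a little, then new input is added.
        potential = leak * potential + current
        if potential >= threshold:
            spikes.append(1)      # Fire a discrete event...
            potential = reset     # ...and start over.
        else:
            spikes.append(0)      # Stay silent: nothing is transmitted.
    return spikes

spikes = simulate_lif([0.5, 0.5, 0.5, 0.0, 0.0, 0.5, 0.5, 0.5])
print(spikes)  # [0, 0, 1, 0, 0, 0, 0, 1]
```

The output is a sparse train of events rather than a continuous value at every step, which is where the energy argument comes from: most of the time the unit sends nothing.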

Jeff Hawkins of Numenta has long argued that a more biologically-inspired approach is needed. So, in his research on building cognitive machinery, he has architected systems that try to more closely mirror the structure of the neocortex. Numenta’s model of a neuron is considerably more elaborate than the Deep Learning model of a neuron, as you can see in this graphic:

The team at Numenta is betting on this approach in the hopes of creating something that is more capable than Deep Learning. It hasn’t been proved to be anywhere near successful. They’ve been doing this long enough that the odds of them succeeding are diminishing over time. By contrast, Deep Learning (despite its model of a cartoon neuron) is unexpectedly effective in performing all kinds of mind-boggling feats of cognition. Deep Learning is doing something that is extraordinarily correct; we just don’t know exactly what that is!

Unfortunately, we have to throw a new monkey wrench into all this research. Newer experiments on the nature of neurons have revealed that biological neurons are even more complex than we had imagined them to be:

(1) A single neuron’s spike waveform typically varies as a function of the stimulation location. (2) Spatial summation is absent for extracellular stimulations from different directions. (3) Spatial summation and subtraction are not achieved when combining intra- and extra- cellular stimulations, as well as for nonlocal time interference, where the precise timings of the stimulations are irrelevant.

In short, there is a lot more going on inside a single neuron than the simple idea of integrate-and-fire. Neurons may not be pure functions dependent on a single parameter (i.e., weight); instead, they may be stateful machines. Alternatively, perhaps the weight is not even single-valued and instead requires complex values or maybe higher dimensions. This is all behavior that research has yet to explore, and thus we have little understanding of it to date.
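The difference between a pure function and a stateful machine can be sketched in a few lines. The example below is a toy under stated assumptions: the "fatigue" rule (the threshold rises after each firing) is invented for illustration and is not a claim about what real neurons do.

```python
def pure_neuron(weight, x, threshold=1.0):
    """Pure function: the same input always produces the same output."""
    return weight * x >= threshold

class StatefulNeuron:
    """Toy stateful unit: its response depends on its own firing history."""

    def __init__(self, weight, threshold=1.0, fatigue=0.5):
        self.weight = weight
        self.threshold = threshold
        self.fatigue = fatigue  # Hypothetical adaptation rule, for illustration.

    def step(self, x):
        fired = self.weight * x >= self.threshold
        if fired:
            # Internal state changes, so the next identical input may not fire.
            self.threshold += self.fatigue
        return fired

neuron = StatefulNeuron(weight=1.0)
stateful_outputs = [neuron.step(1.2) for _ in range(3)]
pure_outputs = [pure_neuron(1.0, 1.2) for _ in range(3)]

print(pure_outputs)      # [True, True, True]
print(stateful_outputs)  # [True, False, False]
```

Identical inputs produce different outputs once the unit carries state, so a single scalar weight no longer describes its behavior.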

Vince Daria, a researcher at the Australian National University, has been exploring the computational complexity of single neurons using biological neurons from the cortical column of rodent brains. He leverages advanced optical techniques (neurophotonics) to stimulate and record the activity within single neurons. Using light, he can stimulate multiple inputs along the dendritic tree. Through analysis of the behavior, he’s uncovering a much richer complexity than expected. Daria has discovered that there is additional computational complexity within the dendrite.

If you think this throws a monkey wrench into our understanding, there’s an even newer discovery that reveals even greater complexity:

Many of the extracellular vesicles released by neurons contain a gene called Arc, which helps neurons to build connections with one another. Mice engineered to lack Arc have problems forming long-term memories, and several human neurological disorders are linked to this gene.

What this research reveals is that there is a mechanism for neurons to communicate with each other by sending packages of RNA code. To clarify, these are packages of instructions and not packages of data. There is a profound difference between sending codes and sending data. This implies that behavior from one neuron can change the behavior of another neuron; not through observation, but rather through injection of behavior.

This code exchange mechanism hints at the validity of my earlier conjecture: “Are biological brains made of only discrete logic?”

Experimental evidence reveals a new reality. Even at the smallest unit of our cognition, there is a kind of conversational cognition that is going on between individual neurons that modifies each other’s behavior. Thus, not only are neurons machines with state, but they are also machines with an instruction set and a way to send code to each other. I’m sorry, but this is just another level of complexity.

There are two obvious ramifications of these experimental discoveries. The first is that our estimates of the computational capabilities of the human brain are likely to be at least an order of magnitude off. The second is that research will begin in earnest to explore DL architectures with more complex internal node (or neuron) structures.

If we were to make the rough argument that a single neuron performs a single operation, the total capacity of the human brain is estimated at 38 peta operations per second. If we were then to assume that a DL operation is equal to a floating point operation, then a 38 petaflops system would be equivalent in capability. The top-ranked supercomputer, Sunway TaihuLight from China, is estimated at 125 petaflops. However, let’s say the new results reveal 10x more computation; then the number should be 380 petaflops, and we perhaps have breathing room until 2019. What is obvious, however, is that biological brains actually perform much more cognition with less computation.
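The back-of-the-envelope arithmetic is easy to check. The sketch below uses only the figures quoted above (38 peta operations per second, 125 petaflops, and a 10x factor); the variable names are illustrative, and the one-operation-per-neuron and one-operation-per-flop equivalences are the same rough assumptions made in the text.

```python
BRAIN_PETAOPS = 38           # Rough estimate: one operation per neuron event.
TAIHULIGHT_PETAFLOPS = 125   # Figure quoted for Sunway TaihuLight.
COMPLEXITY_FACTOR = 10       # Assumed: neurons do 10x more than we thought.

# Revised brain estimate if each neuron does 10x more computation.
revised_brain_petaflops = BRAIN_PETAOPS * COMPLEXITY_FACTOR

# How many "brains" the supercomputer covers under each estimate.
headroom_before = TAIHULIGHT_PETAFLOPS / BRAIN_PETAOPS
headroom_after = TAIHULIGHT_PETAFLOPS / revised_brain_petaflops

print(revised_brain_petaflops)    # 380
print(round(headroom_before, 2))  # 3.29
print(round(headroom_after, 2))   # 0.33
```

Under the original estimate the machine exceeds the brain by a factor of about three; under the revised estimate it reaches only about a third of it.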

The second consequence is that it’s now time to get back to the drawing board and begin to explore more complex kinds of neurons. The more complex kinds we’ve seen to date are the ones derived from the LSTM. Here is the result of a brute-force architectural search for LSTM-like neurons:

Google Neural Architecture Search

It’s not clear why this more complex LSTM is more effective. Only the architectural search algorithm knows, but it can’t explain itself.

There is a newly released paper that explores more complex hand-engineered LSTMs:

that reveals measurable improvements over standard LSTMs:

In summary, a research plan that explores more complex kinds of neurons may bear promising fruit. This is not unlike the research that explores the use of complex values in neural networks. In these complex-valued networks, performance improvements are noticed only in RNNs. This suggests that these internal neuron complexities may be necessary for capabilities beyond simple perception. I suspect that these complexities are necessary for the advanced cognition that seems to evade current Deep Learning systems. These capabilities include robustness to adversarial features, learning to forget, learning what to ignore, learning abstraction, and recognizing contextual switching.
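As a hint of what a complex-valued weight adds, here is a minimal sketch using Python's built-in complex numbers. It shows only the weighted sum; the function name is hypothetical, and real complex-valued networks add activations and training rules that are not shown here.

```python
import cmath
import math

def complex_weighted_sum(inputs, weights):
    """Weighted sum with complex weights: each weight scales AND rotates."""
    z = sum(x * w for x, w in zip(inputs, weights))
    return abs(z), cmath.phase(z)  # (magnitude, phase in radians)

# A real-valued weight of 1 leaves the input untouched.
mag_real, phase_real = complex_weighted_sum([1 + 0j], [1 + 0j])

# A weight of 1j keeps the magnitude but rotates the phase by 90 degrees.
mag_rot, phase_rot = complex_weighted_sum([1 + 0j], [1j])

print(mag_real, phase_real)
print(mag_rot, phase_rot)
```

A single complex weight carries two degrees of freedom (gain and phase shift) where a real weight carries one, which is one simple way a node can become internally richer without adding more nodes.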

I predict in the near future that we shall see more aggressive research in this area. After all, nature is already unequivocally telling us that neurons are individually more complex and therefore our neuron models may also need to be more complex. Perhaps we need something as complicated as a Grassmann Algebra to make progress. ;-)
