$\begingroup$

I am an Statistics student at University of Warwick (incoming Stanford University) and I have an interest in explaining machine learning concepts in a non-mathematical/non-technical way.

Biological Neural Network

Our brain has a large network of interlinked neurons, which act as a highway for information to be transmitted from point A to point B. To send different kinds of information from A to B, the brain activates a different sets of neurons, and so essentially uses a different route to get from A to B. This is how a typical neuron might look like.

At each neuron, its dendrites receive incoming signals sent by other neurons. If the neuron receives a high enough level of signals within a certain period of time, the neuron sends an electrical pulse into the terminals. These outgoing signals are then received by other neurons.

Artificial Neural Network

The ANN model is modelled after the biological neural network (and hence its namesake). Similarly, in the ANN model, we have an input node (in this example we give it a handwritten image of the number 6), and an output node, which is the digit that the program recognized.

A simple Artificial Neural Network map, showing two scenarios with two different inputs but with the same output. Activated neurons along the path are shown in red.

The main characteristics of an ANN is as such:

Step 1. When the input node is given an image, it activates a unique set of neurons in the first layer, starting a chain reaction that would pave a unique path to the output node. In Scenario 1, neurons A, B, and D are activated in layer 1.

Step 2. The activated neurons send signals to every connected neuron in the next layer. This directly affects which neurons are activated in the next layer. In Scenario 1, neuron A sends a signal to E and G, neuron B sends a signal to E, and neuron D sends a signal to F and G.

Step 3. In the next layer, each neuron is governed by a rule on what combinations of received signals would activate the neuron (rules are trained when we give the ANN program training data, i.e. images of handwritten digits and the correct answer). In Scenario 1, neuron E is activated by the signals from A and B. However, for neuron F and G, their neurons’ rules tell them that they have not received the right signals to be activated, and hence they remains grey.

Step 4. Steps 2-3 are repeated for all the remaining layers (it is possible for the model to have more than 2 layers), until we are left with the output node.

Step 5. The output node deduces the correct digit based on signals received from neurons in the layer directly preceding it (layer 2). Each combination of activated neurons in layer 2 leads to one solution, though each solution can be represented by different combinations of activated neurons. In Scenarios 1 & 2, two images given to the input. Because the images are different, the network activates a different set of neurons to get from the input to the output. However, the output is still able to recognise that both images are “6”.

Read the full post here, which includes some examples of handwritten digits that were used: https://annalyzin.wordpress.com/2016/03/13/how-do-computers-recognise-handwriting-using-artificial-neural-networks/