In this guide, we will be looking at how to run an artificial neural network on an Arduino. This is the perfect project to learn about machine learning and the basics of artificial intelligence. The neural network in this example is a feed-forward backpropagation network, one of the most commonly used types; the network concept is described briefly in the background section.

Although we’ve used an Uno in this example, the network can be run on a Nano, Mega or Leonardo as well.

This project assumes you know the basics of Arduino programming, otherwise read our article on getting started with Arduino.

Background on Artificial Neural Networks

An artificial neural network is a mathematical computing model designed to mimic the way the brain reacts to sensory inputs. The brain is made up of billions of neurons connected to each other in huge networks. Each neuron is capable of being stimulated, much like a switch being turned on or off, and the state of a neuron turns surrounding neurons on or off as well, depending on its level of activation and the strength of the connections between them. A neuron with a strong connection will pass on a greater level of stimulation than one with a weaker connection. Very simplistically, a neuron's own level of stimulation is related to the sum of the stimulation it receives from all of the other neurons connected to it, and this is precisely how an artificial neural network works.

Let’s start off by understanding what exactly a backpropagation network is. The network itself is not a new concept; in fact, these networks have been around since the 1980s, and while they are based on some fairly complicated mathematics, you do not need to understand the mathematics in order to understand how the network functions.

So what is an artificial neural network? In short, it is a segment of code which learns how to respond to inputs based on example sets of inputs and outputs. Neural networks are very powerful tools and are rapidly finding their place in facial recognition, autonomous vehicles, stock market and sports prediction, and even in websites suggesting products which you may be interested in. Their most powerful application lies in pattern recognition, where the exact input into the network is not known; there may be too little or too much information, and it is up to the network to decide how it is processed. A good example is handwriting recognition: we are able to recognise letters and numbers, but the exact shape of the characters varies from person to person, so the input into the neural network is never precisely known. It is up to the network to identify the input and relate it to the relevant output.

In an artificial or software based neural network, a mathematical model of all of the neurons and their connections is created. An input is then fed into the network and the neurons systematically add up their inputs and produce an output into the next level of neurons until an output is reached.

One of the key principles in an artificial neural network is that the network needs to be trained. When the network is set up, random weights are applied to each of the connections. These weights provide a starting point for the network but will almost invariably provide “rubbish” outputs. A set of sample data is input into the network and the results are compared to the expected results. The weights are then adjusted and the input/output cycle repeated. This training cycle is repeated until the output data from the network matches the expected output data within a certain level of accuracy. This typically takes a few tens of thousands of training cycles depending on the complexity of the data and network.

In this example, we’re going to be building a three-layer feed-forward network, the three layers being the input layer, the hidden layer and the output layer, as shown below.

The sketch in this article features a set of training inputs and outputs which map the seven segments of an LED numerical display to the corresponding binary number. The network runs through the training data repeatedly and makes adjustments to the weightings until a specified level of accuracy is achieved; at this stage the network is said to have been trained.

The input and output training data (each row maps the seven segment states of a displayed digit to that digit in binary):

Digit   Input (seven segments)   Target (binary)
0       1 1 1 1 1 1 0            0 0 0 0
1       0 1 1 0 0 0 0            0 0 0 1
2       1 1 0 1 1 0 1            0 0 1 0
3       1 1 1 1 0 0 1            0 0 1 1
4       0 1 1 0 0 1 1            0 1 0 0
5       1 0 1 1 0 1 1            0 1 0 1
6       0 0 1 1 1 1 1            0 1 1 0
7       1 1 1 0 0 0 0            0 1 1 1
8       1 1 1 1 1 1 1            1 0 0 0
9       1 1 1 0 0 1 1            1 0 0 1

You’ll need to run the Serial monitor on your Arduino IDE in order to see the progressive training and the final results. The program will send through a set of training data every one thousand cycles so that you can see how the network is “learning” and getting closer to the correct answers.

You can also create your own training data to train the network on; have a look at the last sections of this guide for instructions on creating your own training data.

What You Need For Your Arduino Based Artificial Neural Network

An Arduino (Uno Used In This Guide) – Buy Here

How To Build Your Arduino Based Artificial Neural Network

In this example, we are simply training a network with a predefined set of training data until a solution is achieved. This is the easiest and most basic way to get an artificial neural network running on your Arduino and it requires no connections to the input or output pins. You’ll simply need to plug your Arduino into your computer using the USB cable and you’re ready to upload the neural network code.

Understanding The Code

As stated before, the mathematics behind a neural network can be quite complex if you don’t have a strong mathematical background, but fortunately you don’t need to understand it in order to use the code and modify it for your own training data. You should be able to follow the majority of the code with a simple understanding of arrays and loops.

Simplistically, the program establishes a system of arrays which store the network weights and the data being fed through the network. The data is then sequentially fed forward through the network and then the errors are back propagated through the network and the weightings adjusted.

Here’s a summary of the code operation:

Set up the arrays and assign random weights.

Start a loop which runs through each item of training data.

Randomise the order in which the training data is run through on each iteration, to help avoid the network getting stuck in local minima.

Feed the data through the network calculating the activation of the hidden layer’s nodes, output layer’s nodes and the errors.

Back propagate the errors to the hidden layer.

Update the associated weights.

Compare the error to the threshold and decide whether to run another cycle or if the training is complete.

Send a sample of the training data to the Serial monitor every thousand cycles.

Uploading The Code

The best way to learn and understand how the code works is to run it and see on the Serial monitor how the solution to the training data is developed.

Here is the code:

//Author: Ralph Heymsfeld
//28/06/2018

#include <math.h>

/******************************************************************
 * Network Configuration - customized per network
 ******************************************************************/

const int PatternCount = 10;
const int InputNodes = 7;
const int HiddenNodes = 8;
const int OutputNodes = 4;
const float LearningRate = 0.3;
const float Momentum = 0.9;
const float InitialWeightMax = 0.5;
const float Success = 0.0004;

const byte Input[PatternCount][InputNodes] = {
  { 1, 1, 1, 1, 1, 1, 0 },  // 0
  { 0, 1, 1, 0, 0, 0, 0 },  // 1
  { 1, 1, 0, 1, 1, 0, 1 },  // 2
  { 1, 1, 1, 1, 0, 0, 1 },  // 3
  { 0, 1, 1, 0, 0, 1, 1 },  // 4
  { 1, 0, 1, 1, 0, 1, 1 },  // 5
  { 0, 0, 1, 1, 1, 1, 1 },  // 6
  { 1, 1, 1, 0, 0, 0, 0 },  // 7
  { 1, 1, 1, 1, 1, 1, 1 },  // 8
  { 1, 1, 1, 0, 0, 1, 1 }   // 9
};

const byte Target[PatternCount][OutputNodes] = {
  { 0, 0, 0, 0 },
  { 0, 0, 0, 1 },
  { 0, 0, 1, 0 },
  { 0, 0, 1, 1 },
  { 0, 1, 0, 0 },
  { 0, 1, 0, 1 },
  { 0, 1, 1, 0 },
  { 0, 1, 1, 1 },
  { 1, 0, 0, 0 },
  { 1, 0, 0, 1 }
};

/******************************************************************
 * End Network Configuration
 ******************************************************************/

int i, j, p, q, r;
int ReportEvery1000;
int RandomizedIndex[PatternCount];
long TrainingCycle;
float Rando;
float Error;
float Accum;

float Hidden[HiddenNodes];
float Output[OutputNodes];
float HiddenWeights[InputNodes+1][HiddenNodes];
float OutputWeights[HiddenNodes+1][OutputNodes];
float HiddenDelta[HiddenNodes];
float OutputDelta[OutputNodes];
float ChangeHiddenWeights[InputNodes+1][HiddenNodes];
float ChangeOutputWeights[HiddenNodes+1][OutputNodes];

void setup(){
  Serial.begin(9600);
  randomSeed(analogRead(3));
  ReportEvery1000 = 1;
  for( p = 0 ; p < PatternCount ; p++ ) {
    RandomizedIndex[p] = p ;
  }
}

void loop (){

/******************************************************************
 * Initialize HiddenWeights and ChangeHiddenWeights
 ******************************************************************/

  for( i = 0 ; i < HiddenNodes ; i++ ) {
    for( j = 0 ; j <= InputNodes ; j++ ) {
      ChangeHiddenWeights[j][i] = 0.0 ;
      Rando = float(random(100))/100;
      HiddenWeights[j][i] = 2.0 * ( Rando - 0.5 ) * InitialWeightMax ;
    }
  }

/******************************************************************
 * Initialize OutputWeights and ChangeOutputWeights
 ******************************************************************/

  for( i = 0 ; i < OutputNodes ; i ++ ) {
    for( j = 0 ; j <= HiddenNodes ; j++ ) {
      ChangeOutputWeights[j][i] = 0.0 ;
      Rando = float(random(100))/100;
      OutputWeights[j][i] = 2.0 * ( Rando - 0.5 ) * InitialWeightMax ;
    }
  }
  Serial.println("Initial/Untrained Outputs: ");
  toTerminal();

/******************************************************************
 * Begin training
 ******************************************************************/

  for( TrainingCycle = 1 ; TrainingCycle < 2147483647 ; TrainingCycle++) {

/******************************************************************
 * Randomize order of training patterns
 ******************************************************************/

    for( p = 0 ; p < PatternCount ; p++) {
      q = random(PatternCount);
      r = RandomizedIndex[p] ;
      RandomizedIndex[p] = RandomizedIndex[q] ;
      RandomizedIndex[q] = r ;
    }
    Error = 0.0 ;

/******************************************************************
 * Cycle through each training pattern in the randomized order
 ******************************************************************/

    for( q = 0 ; q < PatternCount ; q++ ) {
      p = RandomizedIndex[q];

/******************************************************************
 * Compute hidden layer activations
 ******************************************************************/

      for( i = 0 ; i < HiddenNodes ; i++ ) {
        Accum = HiddenWeights[InputNodes][i] ;
        for( j = 0 ; j < InputNodes ; j++ ) {
          Accum += Input[p][j] * HiddenWeights[j][i] ;
        }
        Hidden[i] = 1.0/(1.0 + exp(-Accum)) ;
      }

/******************************************************************
 * Compute output layer activations and calculate errors
 ******************************************************************/

      for( i = 0 ; i < OutputNodes ; i++ ) {
        Accum = OutputWeights[HiddenNodes][i] ;
        for( j = 0 ; j < HiddenNodes ; j++ ) {
          Accum += Hidden[j] * OutputWeights[j][i] ;
        }
        Output[i] = 1.0/(1.0 + exp(-Accum)) ;
        OutputDelta[i] = (Target[p][i] - Output[i]) * Output[i] * (1.0 - Output[i]) ;
        Error += 0.5 * (Target[p][i] - Output[i]) * (Target[p][i] - Output[i]) ;
      }

/******************************************************************
 * Backpropagate errors to hidden layer
 ******************************************************************/

      for( i = 0 ; i < HiddenNodes ; i++ ) {
        Accum = 0.0 ;
        for( j = 0 ; j < OutputNodes ; j++ ) {
          Accum += OutputWeights[i][j] * OutputDelta[j] ;
        }
        HiddenDelta[i] = Accum * Hidden[i] * (1.0 - Hidden[i]) ;
      }

/******************************************************************
 * Update Inner-->Hidden Weights
 ******************************************************************/

      for( i = 0 ; i < HiddenNodes ; i++ ) {
        ChangeHiddenWeights[InputNodes][i] = LearningRate * HiddenDelta[i] + Momentum * ChangeHiddenWeights[InputNodes][i] ;
        HiddenWeights[InputNodes][i] += ChangeHiddenWeights[InputNodes][i] ;
        for( j = 0 ; j < InputNodes ; j++ ) {
          ChangeHiddenWeights[j][i] = LearningRate * Input[p][j] * HiddenDelta[i] + Momentum * ChangeHiddenWeights[j][i];
          HiddenWeights[j][i] += ChangeHiddenWeights[j][i] ;
        }
      }

/******************************************************************
 * Update Hidden-->Output Weights
 ******************************************************************/

      for( i = 0 ; i < OutputNodes ; i ++ ) {
        ChangeOutputWeights[HiddenNodes][i] = LearningRate * OutputDelta[i] + Momentum * ChangeOutputWeights[HiddenNodes][i] ;
        OutputWeights[HiddenNodes][i] += ChangeOutputWeights[HiddenNodes][i] ;
        for( j = 0 ; j < HiddenNodes ; j++ ) {
          ChangeOutputWeights[j][i] = LearningRate * Hidden[j] * OutputDelta[i] + Momentum * ChangeOutputWeights[j][i] ;
          OutputWeights[j][i] += ChangeOutputWeights[j][i] ;
        }
      }
    }

/******************************************************************
 * Every 1000 cycles send data to terminal for display
 ******************************************************************/

    ReportEvery1000 = ReportEvery1000 - 1;
    if (ReportEvery1000 == 0)
    {
      Serial.println();
      Serial.println();
      Serial.print ("TrainingCycle: ");
      Serial.print (TrainingCycle);
      Serial.print (" Error = ");
      Serial.println (Error, 5);

      toTerminal();

      if (TrainingCycle==1)
      {
        ReportEvery1000 = 999;
      }
      else
      {
        ReportEvery1000 = 1000;
      }
    }

/******************************************************************
 * If error rate is less than pre-determined threshold then end
 ******************************************************************/

    if( Error < Success ) break ;
  }

  Serial.println ();
  Serial.println();
  Serial.print ("TrainingCycle: ");
  Serial.print (TrainingCycle);
  Serial.print (" Error = ");
  Serial.println (Error, 5);

  toTerminal();

  Serial.println ();
  Serial.println ();
  Serial.println ("Training Set Solved! ");
  Serial.println ("--------");
  Serial.println ();
  Serial.println ();
  ReportEvery1000 = 1;
}

void toTerminal()
{
  for( p = 0 ; p < PatternCount ; p++ ) {
    Serial.println();
    Serial.print (" Training Pattern: ");
    Serial.println (p);
    Serial.print (" Input ");
    for( i = 0 ; i < InputNodes ; i++ ) {
      Serial.print (Input[p][i], DEC);
      Serial.print (" ");
    }
    Serial.print (" Target ");
    for( i = 0 ; i < OutputNodes ; i++ ) {
      Serial.print (Target[p][i], DEC);
      Serial.print (" ");
    }

/******************************************************************
 * Compute hidden layer activations
 ******************************************************************/

    for( i = 0 ; i < HiddenNodes ; i++ ) {
      Accum = HiddenWeights[InputNodes][i] ;
      for( j = 0 ; j < InputNodes ; j++ ) {
        Accum += Input[p][j] * HiddenWeights[j][i] ;
      }
      Hidden[i] = 1.0/(1.0 + exp(-Accum)) ;
    }

/******************************************************************
 * Compute output layer activations
 ******************************************************************/

    for( i = 0 ; i < OutputNodes ; i++ ) {
      Accum = OutputWeights[HiddenNodes][i] ;
      for( j = 0 ; j < HiddenNodes ; j++ ) {
        Accum += Hidden[j] * OutputWeights[j][i] ;
      }
      Output[i] = 1.0/(1.0 + exp(-Accum)) ;
    }
    Serial.print (" Output ");
    for( i = 0 ; i < OutputNodes ; i++ ) {
      Serial.print (Output[i], 5);
      Serial.print (" ");
    }
  }
}

The code can also be downloaded through this link: ArtificialNeuralNetwork

Creating Your Own Training Data

Once you have the basic network running, you may want to try and input your own training data. To do this, you’ll need to modify the training data in the table as well as these items in the input parameters:

PatternCount – The number of items/rows of training data in your table.

InputNodes – The number of neurons associated with the input data.

OutputNodes – The number of neurons associated with the output data.

In addition to the above parameters which have to be changed for the new training data, the following items can also be changed and experimented with to get different training results:

HiddenNodes – The number of neurons associated with the hidden layer.

LearningRate – The proportion of the error which is back propagated.

Momentum – The proportion of the previous iteration which affects the current iteration.

InitialWeightMax – The maximum starting value for the randomly assigned weights.

Success – The threshold at which the program recognises that it has been sufficiently trained.

You may need to adjust some or all of these values in order to optimise the training process for your new training data. It is possible that a solution is never reached and the training process gets stuck oscillating above the threshold indefinitely; in that case, you’ll need to adjust these values until a solution can be reached. Reading up further on how artificial neural networks work will help you understand what each of these parameters does.

It is worth noting that the training data and network configuration provided in this example are about as large as you can run on an Arduino Uno without exceeding its 2 KB of SRAM. If you’d like to experiment with a larger network, you’ll need to use an Arduino board with a larger SRAM allocation, such as the Mega. Unfortunately, neither the IDE nor the Arduino gives a warning if the allocation is exceeded; you’ll just keep getting strange results and the network will be unable to be trained.

Obstacle Avoiding Robot Running A Neural Network

Tim Kälin has used this code as the basis for an obstacle-avoiding robot, which uses two ultrasonic modules connected to an ESP32 running the neural network to control its movements. The neural network has two input nodes from the ultrasonic modules and five output nodes: turn left, turn right, light left, light right and go straight.

According to Tim it takes a minute or two to train the neural network on the data set when powered up. It also has some on-board LED strips which are used to display the inputs, hidden layer activation and selected outputs so that you get a visual representation of the neural network’s functioning.

It’s really impressive to watch.

The code is quite a bit more complicated than in this example due to the complexity of the project but it’s broken down quite well. You can download the code through this link – Neural Network Robot

What To Try Next

Once you’ve become familiar with artificial neural networks and you’ve tried experimenting with different training data, I’m sure you’d like to make use of the Arduino in a more practical way. Here are some ideas to take this project further:

Add an LCD or TFT display to your Arduino and send the training or output data to the display instead of to your Serial monitor.

Develop a network which responds to inputs to the Arduino. For example, you could use physical switches or photoresistors on the Arduino inputs to activate the input nodes and drive a “learnt” output.

Use the network to drive outputs on your Arduino. Add a motor or servo to your Arduino which uses the neural network to respond to inputs. For example, you could make a servo-driven shade screen which covers the Arduino when light falls onto a photoresistor.

Have you built your own artificial neural network? Tell us how it went and what you’ve used it for in the comments section below.

Hi, my name is Michael and I started this blog in 2016 to share my DIY journey with you. I love fixing, renovating and building – I’m always looking for new projects and exciting DIY ideas. If you do too, grab a cup of coffee and settle in, I’m happy to have you here.