Introduction

In late 1859, Charles Darwin published what is considered to be the founding work of modern evolutionary biology. In his book, On the Origin of Species, he proposed that life forms evolve over time due to natural selection, a phenomenon in which slight changes in individuals result in an advantageous edge, hence more success of their offspring in later generations. This led to the diversity witnessed in life forms.

Nearly two centuries later, his ideas have grown with advances in various fields of biology such as genetics and can now be adapted into others. In computer science, one of the fields is Evolutionary Computing. We will visit one of the branches of evolutionary computing in this article, Genetic Algorithms also referred to as Evolutionary Algorithms. These algorithms can be seen as a form of heuristic search.

Genetic algorithms try to model Darwinian ideas of strife for survival in living things.

This is however not a biological paper so let’s focus on computational problems, for example mathematical optimization. Genetic algorithms can be used in a problem in which a solution can be approximately correct. This means the answer has a degree of error in it. In most optimization problems, we know how the answer could look but don’t have the answer specifically so we start off with a relatively bad answer and slowly work our way to a relatively good answer. A good example is the stochastic gradient descent algorithm.

If using a genetic algorithm to solve an optimization problem:

An individual can be considered a possible solution.

A population will be a collection of possible solutions.

A fitness measure is a way of measuring how good a solution is.

A generation is a level of advancement towards the desired fitness.

A variation operation is a change to an individual to alter its output.

Terminology

Let’s define some terms we might use. I’ll try to give an example of the same in a mathematical optimization scenario.

Chromosome: a collection of linked genes that carries genetic information. This can be a section of a mathematical function with several parameters.

a collection of linked genes that carries genetic information. This can be a section of a mathematical function with several parameters. Crossover: production of offspring by combining parents. This can be forming a new mathematical function by combining sets of parameters from other functions. It is one of the main variation operations mentioned above.

production of offspring by combining parents. This can be forming a new mathematical function by combining sets of parameters from other functions. It is one of the main variation operations mentioned above. Gene: a carrier of genetic information regarding a specific trait, in other words, a unit of heredity. This can be a parameter in a mathematical function that affects a specific variable.

a carrier of genetic information regarding a specific trait, in other words, a unit of heredity. This can be a parameter in a mathematical function that affects a specific variable. Genome: the entire combination of genetic information for an individual. This can be a mathematical function in its entirety.

the entire combination of genetic information for an individual. This can be a mathematical function in its entirety. Fitness: a measure of adaptation of an individual. This can be how well a solution to an optimization problem performs.

a measure of adaptation of an individual. This can be how well a solution to an optimization problem performs. Locus: a specific location on a chromosome, for example, the location of a gene. This could be a location of a parameter in a mathematical function.

a specific location on a chromosome, for example, the location of a gene. This could be a location of a parameter in a mathematical function. Mutation: a change in a gene sequence. This can be a change to a parameter in a mathematical function leading to a change in the results it produces. It is also a major variation operation.

a change in a gene sequence. This can be a change to a parameter in a mathematical function leading to a change in the results it produces. It is also a major variation operation. Population: a set of all individuals being considered in a study. This can be a set of many solutions to an optimization problem of which the best is to be selected.

Algorithm Structure

Let us use a general framework accounting for most of existing genetic algorithms.

We will need some elements:

Stopping criterion: this is a way to stop the algorithm, given a solution has been found, a limit of allowed generations is reached or any other desired criteria. It is also known as ‘termination condition’.

this is a way to stop the algorithm, given a solution has been found, a limit of allowed generations is reached or any other desired criteria. It is also known as ‘termination condition’. Selection: a method of choosing the individuals to be used in creating the next generation.

a method of choosing the individuals to be used in creating the next generation. Variation: this is applying changes to selected individuals to alter their behavior. This is done by either mutation or crossover.

The basic structure of a genetic algorithm:

begin

count = 0

initialize population

evaluate population

while not termination condition do

begin

count = count + 1

select individuals for reproduction

apply variation operators

evaluate offspring

end

end

A genetic algorithm approach to solving linear regression

As an example, we will solve linear regression using what we’ve learnt so far.

Linear regression can be analytically solved by matrix calculus. However, it is a problem in which we can be approximately correct, hence a good example for demonstrating how genetic algorithms work. Hardly will this ever be done in practice though.

In any regression problem, we have some elements:

Inputs: these are the variables used in predicting outputs.

these are the variables used in predicting outputs. Outputs: these are the variables we wish to predict.

these are the variables we wish to predict. Parameters: values that define relationships between inputs and outputs. We try to predict these with as much accuracy as possible.

A problem with 3 inputs (represented by ‘x’) and an output (represented by ‘y’) with parameters ‘a’, ‘b’ and ‘c’ would look like :

y = a(x1) + b(x2) + c(x3)

Let’s write some python code and solve a small regression problem.

Some function arguments and variables may seem unavailable but will be provided before the functions are run.

To work this out we’ll need some data. We won’t use real-world data, we will simulate some by using randomly generated numbers.

First, we import necessary modules and needed functions:

tqdm is a small library for viewing progress of a loop. numpy is mainly used for matrix manipulation. These are the only non-standard libraries used.

We then write a function to generate data. We will generate data with a clear pattern. This ensures we have an idea of the desired result. This is for demonstration purposes only, real data is needed in practice. The coeff variable is the known model that we generate the data with. It can be varied however one pleases. In this scenario, we simulate a problem with four input variables within the range of 0 and 1. We generate a dataset with 1000 data-points. The coeff is what we expect to get or a close approximation of. x are our inputs, y our outputs that have been generated using the coefficients in coeff , by multiplying each datapoint by coeff .

We generate a data with 1000 data-points. You can however change that to your liking. You can also generate one for testing whether your model works outside the data it has ‘seen’.

Then, we write a function to demonstrate how linear regression would be analytically solved. The solution returned should equal the coefficients used to generate the data.

SST: the total error in a model, it is the sum of all deviations squared.

SSR: a measure of the explained variation in SST.

COD: stands for ‘coefficient of determination’ which is basically a measure of how good a model is. It represents the portion of variance in the dependent variable (output) that is explained by the variance in the independent variables(inputs). The values range between 0–100, any other value means the model is really bad. In the acceptable range, 0 is a relatively bad model and 100 is the best possible model.

error: the average error, is an average of all deviations from expected values. This is to show that the coeff values can be analytically obtained.

On the fourth line of this function, I used the Moose-Penrose pseudo-inverse(pinv) to avoid errors in the event of a singular matrix.

One of the first requirements of a genetic algorithm is a termination condition. So we write a function to check whether we currently meet any of the termination conditions. I decided to use COD and check for the goodness of a model to terminate if it is good enough. If practicing genetic algorithms, you can get the best possible values and estimate your termination condition using them since the best possible COD changes with a problem. You can also use error instead.

To create an initial individual, we will use random assigning of variables. The individual size is just the number of input variables.

We then use this function to create an initial population:

To get the fitness of an individual, we make predictions with it and return its error, COD, and coefficients.

We then use this function to evaluate a population and return the best individuals. I chose to use least errors for simplicity in selecting the best individuals.

Variation operations need to be applied to evolve a population. We will use crossover and mutation.

In crossover, two parents provide the genes for a child, each giving half. Unlike in real scenarios, these chromosomes are not made up of actually linked genes. They are just a collection of genes from each parent, taken from different loci.

In mutation, a selected number of genes in a selected number of individuals is randomly changed.

With all these, we need to be able to create a new generation of individuals from a small group of individuals to advance a population forward. Its made to be of equal size with the original population but can be randomized.

Finally, we now implement the logic that actually runs the functions. The values assigned to the variables can be altered to experiment with different scenarios:

This will print out the advancement, so you can trace the program as it gets better and finally outputs the best individual.

The entire script can be found here.

An example of the program’s output.

As you can see, in just 6 generations, we move from a really bad model with a COD of -109% to a very good one of 98%. With more generations, a higher degree of accuracy is possible.

This is but one example of how genetic algorithms can be used. Basically, if an optimization problem is stated well, a genetic algorithm can be designed to solve it. Some applications are: automated design, data center configuration, signal processing, data compression and much more.