Evolutionary algorithms are a powerful approach to many optimisation problems. They help to find an acceptable solution when exact methods are so expensive that using them is practically impossible. One of the most popular kinds of evolutionary algorithm is the genetic algorithm. You can read more about genetic algorithms in the “Introduction to Genetic Algorithms — Including Example Code” article.

Sometimes, if a problem is complex and individuals are heavy, it may not be possible to implement an efficient genetic algorithm: computations take too much time, and the data does not fit in memory. To overcome the memory issue, one may store some individuals on disk and load them into memory when they are needed. This does not solve the performance issue and in fact adds another one: reading (and possibly deserializing) an individual from disk may make the entire system even slower. Moreover, all individuals an algorithm has ever seen were created using the same genetic operators and thus may have too much in common, so the algorithm may keep circling around a local optimum. Is there a way to solve this? Possibly, yes: we can use a parallel or a distributed genetic algorithm.

This article describes two variants of genetic algorithms, both intended to improve algorithm performance: parallel and distributed genetic algorithms.

Parallel genetic algorithm

A parallel genetic algorithm is an algorithm that uses multiple genetic algorithms to solve a single task [1]. All of them work on the same task; once they have finished, the best individual of each algorithm is selected, and the best of those becomes the solution to the problem. This is one of the most popular approaches to parallel genetic algorithms, though there are others. It is often called the ‘island model’ because the populations are isolated from each other, just as real-life populations of creatures may be isolated on different islands. Image 1 illustrates this.

Image 1. Parallel genetic algorithm

These genetic algorithms do not depend on each other, so they can run in parallel and take advantage of a multicore CPU. Each algorithm has its own set of individuals, which may differ from the individuals of another algorithm because they have a different mutation and crossover history.
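The idea of independent islands can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the toy task (maximise the number of 1-bits in a list), the names `run_island` and `run_parallel_ga`, and all parameter values are hypothetical and do not come from [1].

```python
import random
from concurrent.futures import ThreadPoolExecutor

GENOME_LENGTH = 20  # hypothetical toy task: maximise the number of 1-bits


def fitness(individual):
    return sum(individual)


def run_island(seed, population_size=30, generations=40, mutation_rate=0.05):
    """Run one independent genetic algorithm and return its best individual."""
    rng = random.Random(seed)  # every island gets its own random history
    population = [[rng.randint(0, 1) for _ in range(GENOME_LENGTH)]
                  for _ in range(population_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        survivors = population[:population_size // 2]
        children = []
        while len(survivors) + len(children) < population_size:
            mother, father = rng.sample(survivors, 2)
            cut = rng.randrange(1, GENOME_LENGTH)
            child = mother[:cut] + father[cut:]
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        population = survivors + children
    return max(population, key=fitness)


def run_parallel_ga(seeds):
    """Run one island per seed, then pick the best of the islands' best individuals."""
    # Threads keep the sketch simple. For CPU-bound work in CPython,
    # ProcessPoolExecutor has the same interface and uses several cores.
    with ThreadPoolExecutor(max_workers=len(seeds)) as pool:
        island_bests = list(pool.map(run_island, seeds))
    return max(island_bests, key=fitness), island_bests


best, island_bests = run_parallel_ga(seeds=[1, 2, 3, 4])
```

Each island only needs its own seed here; in a real system the islands would also differ in operators and parameters, as discussed further on.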

Paper [2] describes a parallel genetic algorithm that uses two independent algorithms to improve its performance. The two algorithms differ in the way individuals are selected for mutation and crossover. Moreover, the creatures with the highest fitness are allowed to ‘migrate’ from one algorithm to the other. This may be sufficient sometimes, but when a task is very hard to solve or an individual is a complex entity, we may need even more diversity among individuals.
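A migration step of this kind might look as follows. This is a hedged sketch, not the procedure from [2]: the function name `migrate`, the choice to copy (rather than move) migrants, and the rule that migrants replace the weakest individuals of the target are all assumptions made for illustration.

```python
def fitness(individual):
    # Hypothetical fitness: the number of 1-bits in the individual.
    return sum(individual)


def migrate(source, target, count=2):
    """Return a new target population in which the `count` fittest individuals
    of `source` have replaced the `count` weakest individuals of `target`."""
    if count > len(target):
        raise ValueError("cannot replace more individuals than the target holds")
    migrants = sorted(source, key=fitness, reverse=True)[:count]
    kept = sorted(target, key=fitness, reverse=True)[:len(target) - count]
    # Migrants are copied, so the source population keeps its best individuals.
    return kept + [list(m) for m in migrants]


island_a = [[1, 1, 1, 0], [1, 1, 0, 0], [0, 0, 0, 0]]
island_b = [[0, 0, 0, 1], [0, 0, 0, 0], [1, 0, 0, 0]]
island_b = migrate(island_a, island_b, count=1)
# island_b is now [[0, 0, 0, 1], [1, 0, 0, 0], [1, 1, 1, 0]]
```

The population size of the target stays constant, which keeps the memory footprint of each algorithm predictable.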

To achieve this, we may use as many algorithms as is practical (say, twice the number of CPU cores we have) and vary almost every property of an algorithm. As a result, each algorithm has its own set of individuals, created using methods that differ from those used by the other algorithms. This is also an ‘island model’, only the ‘islands’ differ from each other more. The only restriction is that all algorithms use the same fitness function to assess individuals, so that individuals that belong to different algorithms can be compared. These genetic algorithms may therefore differ in the following aspects:

how new individuals are created;

how individuals mutate and how they are selected for mutation;

how individuals crossover and how they are selected for crossover;

how many individuals take part in a single crossover event and how many individuals are created as a result;

how many individuals survive after each algorithm iteration and how these creatures are selected;

how many individuals each generation contains;

how many generations an algorithm walks through.

By varying these characteristics it is possible to make many different genetic algorithms that solve the same task but have entirely different individuals.
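One way to express this variety in code is a per-algorithm configuration object. The sketch below is an assumption-laden illustration: `IslandConfig`, `random_config`, the operator names and the value ranges are all hypothetical. Only the shared fitness function reflects a requirement stated above.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class IslandConfig:
    """Hypothetical bundle of the properties that may differ between algorithms."""
    population_size: int         # individuals in each generation
    generations: int             # generations the algorithm walks through
    survivors: int               # individuals kept after each iteration
    parents_per_crossover: int   # individuals taking part in one crossover event
    children_per_crossover: int  # individuals created by one crossover event
    mutation_rate: float
    selection: str               # name of the selection strategy
    crossover: str               # name of the crossover operator


def random_config(rng):
    """Draw one configuration; the ranges are arbitrary illustration values."""
    population_size = rng.randrange(20, 201, 20)
    return IslandConfig(
        population_size=population_size,
        generations=rng.randrange(50, 501, 50),
        survivors=rng.randint(2, population_size // 2),
        parents_per_crossover=rng.randint(2, 4),
        children_per_crossover=rng.randint(1, 2),
        mutation_rate=rng.uniform(0.001, 0.1),
        selection=rng.choice(("tournament", "roulette", "rank")),
        crossover=rng.choice(("one_point", "two_point", "uniform")),
    )


def fitness(individual):
    """The one thing all algorithms share, so that their results stay comparable."""
    return sum(individual)


rng = random.Random(42)
configs = [random_config(rng) for _ in range(8)]  # e.g. twice the cores of a 4-core CPU
```

Keeping the configuration immutable makes it safe to hand the same object to a worker thread or to serialise it for another machine.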

It is important to note that these independent genetic algorithms have the same structure as any conventional genetic algorithm, so each of them can be extracted from a parallel genetic algorithm and used on its own.

Crossover between algorithms

So, what do we have? We have several genetic algorithms that run independently and have their own sets of individuals. We can select individuals that belong to different algorithms and cross them over. As a result, features that appear in individuals of one algorithm become available to another. This makes it possible to create individuals that none of the algorithms could create alone, which brings even more diversity and spreads useful features.

Let’s illustrate this with an example. Say we have three independent genetic algorithms and we want to cross them over in pairs. We take the first algorithm and randomly select another algorithm as the second element of the pair. We create as many pairs as we have algorithms: in each pair the first element is chosen sequentially and the second one is random. Then we perform crossover on the populations of the two algorithms, picking individuals in such a way that individuals from different algorithms are crossed with each other. We use the crossover mechanism of the first algorithm of the pair, and this algorithm receives all the individuals created by the crossover; the second algorithm of the pair is simply a donor that provides its individuals. Therefore, each algorithm receives new individuals when it is the first element of a pair. If a crossover operator requires more than two individuals, the additional ones may be taken from either algorithm of the pair; it is only recommended that neither algorithm dominates.
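The pairing scheme described above can be sketched as follows. The names `make_pairs`, `cross_islands` and `one_point` are hypothetical, individuals are assumed to be plain lists, and drawing parents from the populations as they were before the step is an assumption rather than something the description prescribes.

```python
import random


def make_pairs(island_count, rng):
    """Pair every algorithm (receiver, taken sequentially) with a random other one (donor)."""
    pairs = []
    for receiver in range(island_count):
        donor = rng.choice([i for i in range(island_count) if i != receiver])
        pairs.append((receiver, donor))
    return pairs


def one_point(mother, father, rng):
    """Example crossover operator: head of one parent, tail of the other."""
    cut = rng.randrange(1, len(mother))
    return mother[:cut] + father[cut:]


def cross_islands(populations, crossovers, offspring_per_pair, rng):
    """Return new populations in which each receiver gained offspring bred with its donor."""
    result = [list(population) for population in populations]
    for receiver, donor in make_pairs(len(populations), rng):
        crossover = crossovers[receiver]  # the receiver's own operator is used
        for _ in range(offspring_per_pair):
            own = rng.choice(populations[receiver])
            foreign = rng.choice(populations[donor])  # the donor only provides parents
            result[receiver].append(crossover(own, foreign, rng))
    return result


rng = random.Random(3)
populations = [
    [[0, 0, 0, 0, 0, 0] for _ in range(4)],
    [[1, 1, 1, 1, 1, 1] for _ in range(4)],
    [[0, 1, 0, 1, 0, 1] for _ in range(4)],
]
crossed = cross_islands(populations, [one_point] * 3, offspring_per_pair=2, rng=rng)
```

Every offspring has one parent from each algorithm of the pair, so neither algorithm dominates; an operator that needs more than two parents would have to balance its extra picks in a similar way.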

It is also possible to use a special crossover technique that is applied only to crossover between algorithms.

It is recommended to perform crossover between algorithms somewhere in the middle of the process (or even multiple times), not at the very beginning or at the very end. At the very beginning each algorithm has fresh individuals that have not yet been affected by mutation or crossover, so the features specific to a particular algorithm are not yet clearly expressed. At the very end there are no more generations in which the new individuals produced by the crossover could evolve, so they never compete with other individuals to determine which one is better.
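A simple way to follow this advice is to compute the crossover generations up front. The helper below is only a sketch: the name `crossover_generations` and the 25% margin at both ends are assumptions, since the recommendation itself only says "somewhere in the middle".

```python
def crossover_generations(total_generations, events=1, margin=0.25):
    """Return the generation numbers at which crossover between algorithms runs,
    spread evenly while staying `margin` away from the first and last generation."""
    first = int(total_generations * margin)
    last = int(total_generations * (1 - margin))
    if events == 1:
        return [(first + last) // 2]
    step = (last - first) / (events - 1)
    return [round(first + i * step) for i in range(events)]


single = crossover_generations(100)              # [50]
several = crossover_generations(100, events=3)   # [25, 50, 75]
```

The main loop of each algorithm can then check whether the current generation number is in this list before requesting a crossover with another algorithm.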

Conclusion

A parallel genetic algorithm may take a little more time than a non-parallel one because it uses several computation threads, which in turn cause the operating system to perform context switching more frequently. Nevertheless, a parallel genetic algorithm tends to produce better results and fitter individuals than a non-parallel one.

Even though going parallel may significantly improve the result, in some cases even this may be insufficient. In this case we can use …

Distributed genetic algorithm

A distributed genetic algorithm is actually a parallel genetic algorithm whose independent algorithms run on separate machines. Moreover, each of these algorithms may in turn be a parallel genetic algorithm! A distributed genetic algorithm also implements the ‘island model’, and each ‘island’ is even more isolated from the others. If each machine runs a parallel genetic algorithm, we may call this an ‘archipelago model’, because we have groups of islands. It does not actually matter what a single genetic algorithm is, because a distributed genetic algorithm is about having multiple machines run independent genetic algorithms in order to solve the same task. Image 2 illustrates this.

Image 2. Distributed genetic algorithm with parallel components

A distributed genetic algorithm may also help when we have to create many individuals in order to cover the entire domain, but cannot store all of them in the memory of a single machine.

When discussing the parallel genetic algorithm we introduced the term ‘crossover between algorithms’. A distributed genetic algorithm enables us to perform crossover between separate machines!

In a distributed genetic algorithm we have a kind of ‘master mind’ that controls the overall progress and coordinates the machines. It also controls crossover between machines by selecting how machines are paired for crossover. In general, the process is the same as in a parallel genetic algorithm, except that individuals are moved over the network from one machine to another. To avoid transferring individuals twice, it is recommended to send the donor individuals to the machine that will receive the new individuals created by the crossover operation. The ‘master mind’ also selects the best individual among those of the secondary machines it is connected to, which are the machines that actually run the computation. As a result, the ‘master mind’ is the entry point of a distributed genetic algorithm: it communicates with whoever asks for a solution.
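The coordination logic can be sketched without any real networking. In the illustration below everything is an assumption: `Worker` is an in-process stand-in for a secondary machine, `Master` plays the ‘master mind’, and `pickle` merely imitates the serialisation that a network transfer would require.

```python
import pickle
import random


def fitness(individual):
    return sum(individual)  # hypothetical shared fitness function


class Worker:
    """Stand-in for a secondary machine; a real one would sit behind a network connection."""

    def __init__(self, population):
        self.population = population

    def export_individuals(self, count, rng):
        # Donor individuals are serialised once and travel straight to the receiver.
        return pickle.dumps(rng.sample(self.population, count))

    def receive_and_cross(self, payload, rng):
        # The receiver breeds locally and keeps the offspring, so nothing is sent back.
        for foreign in pickle.loads(payload):
            own = rng.choice(self.population)
            cut = rng.randrange(1, len(own))
            self.population.append(own[:cut] + foreign[cut:])

    def best(self):
        return max(self.population, key=fitness)


class Master:
    """Pairs machines for crossover and answers requests for a solution."""

    def __init__(self, workers, seed=0):
        self.workers = workers
        self.rng = random.Random(seed)

    def cross_machines(self, donors_per_pair=2):
        for receiver_index, receiver in enumerate(self.workers):
            others = [i for i in range(len(self.workers)) if i != receiver_index]
            donor = self.workers[self.rng.choice(others)]
            payload = donor.export_individuals(donors_per_pair, self.rng)
            receiver.receive_and_cross(payload, self.rng)

    def solution(self):
        return max((worker.best() for worker in self.workers), key=fitness)


rng = random.Random(7)
workers = [Worker([[rng.randint(0, 1) for _ in range(6)] for _ in range(4)])
           for _ in range(3)]
master = Master(workers, seed=7)
master.cross_machines(donors_per_pair=2)
answer = master.solution()
```

Because the donors are shipped to the receiver, each individual crosses the (simulated) network only once, and only the best individual of each machine has to travel back to the master.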

Implementation

This article is mostly dedicated to theory, but when it comes to implementation, it is possible to have a ‘distributed genetic algorithm framework’ that takes care of controlling secondary machines and transferring individuals over the network. In general, such a framework can take care of everything except the actions that need to know the internal structure of an individual. Therefore, a user needs to implement only a few operations:

individual creation and disposal;

individual crossover;

individual mutation;

fitness function that assesses individuals.

Everything else is universal and can be implemented within the framework. Such a framework can run multiple distributed genetic algorithms at a time without actually knowing what problem is being solved.
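The split between user code and framework code could be expressed as an abstract interface. The sketch below is hypothetical: `Problem`, `evolve` and `OneMax` are illustration names, `evolve` is a single-machine stand-in for the framework's generic part, and no existing framework is being described.

```python
import random
from abc import ABC, abstractmethod


class Problem(ABC):
    """Hypothetical contract: the only code that knows what an individual looks like."""

    @abstractmethod
    def create(self, rng): ...

    @abstractmethod
    def crossover(self, parents, rng): ...

    @abstractmethod
    def mutate(self, individual, rng): ...

    @abstractmethod
    def fitness(self, individual): ...

    def dispose(self, individual):
        """Release resources held by an individual; a no-op by default."""


def evolve(problem, population_size, generations, seed=0):
    """Generic part: it never inspects an individual, it only calls the user's operations."""
    rng = random.Random(seed)
    population = [problem.create(rng) for _ in range(population_size)]
    for _ in range(generations):
        population.sort(key=problem.fitness, reverse=True)
        half = population_size // 2
        survivors, discarded = population[:half], population[half:]
        for individual in discarded:
            problem.dispose(individual)
        children = []
        while len(survivors) + len(children) < population_size:
            parents = rng.sample(survivors, 2)
            children.append(problem.mutate(problem.crossover(parents, rng), rng))
        population = survivors + children
    return max(population, key=problem.fitness)


class OneMax(Problem):
    """Toy user problem: maximise the number of 1-bits."""
    LENGTH = 16

    def create(self, rng):
        return [rng.randint(0, 1) for _ in range(self.LENGTH)]

    def crossover(self, parents, rng):
        cut = rng.randrange(1, self.LENGTH)
        return parents[0][:cut] + parents[1][cut:]

    def mutate(self, individual, rng):
        flipped = rng.randrange(self.LENGTH)
        return [1 - g if i == flipped else g for i, g in enumerate(individual)]

    def fitness(self, individual):
        return sum(individual)


best = evolve(OneMax(), population_size=20, generations=30)
```

Since `evolve` touches individuals only through the four operations and `dispose`, the same generic code could drive any other problem that implements the interface.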

Conclusion

Using parallel and distributed genetic algorithms, one can increase the performance of a system that uses evolutionary algorithms. Still, we should keep in mind that evolutionary algorithms guarantee neither that a solution will ever be found nor that there are no better solutions than the one that was found.

One of the main issues we have to deal with when using genetic algorithms is premature convergence to a subset of individuals that dominate the others. Parallel and distributed genetic algorithms try to address it by introducing differences between algorithms, so that they end up with different sets of individuals.

With parallel and distributed genetic algorithms individuals are more diverse, so it is possible to create fewer individuals than with a non-parallel genetic algorithm while keeping solution quality at the same level.

References