Evolvable mathematical models

The evolvable mathematical models (EMMs) used to represent the agents are defined by a system of equations of the form

where vt is the state vector of the agent at time t, vt + Δt = vt + Δvt+Δt at the next timestep, ϑt is the motor output governing the direction of the robot’s movement relative to a given reference direction and ωt in/out are the robot’s input (from the nearest neighbour) and output communication signals. Every state vector includes the coordinates xt, yt of the robot. These equations are encoded in a set of directed tree graphs which serves as the agent’s genome (see Supplementary Fig. S1). Terminal nodes of the equation trees take on the values of one of the variables (variable leaves) or a numerical constant (constant leaves) while nonterminal (branch) nodes perform one of the four basic arithmetic operations (addition, subtraction, multiplication, division). We use the term “evolvable mathematical model” to refer to the genomic representation of agents by equation trees as evolved via genetic programming.

NoiseWorld

When two robots come into close proximity to each other (within a prespecified “reproduction distance” ρ, here ρ = 0.139), an offspring is born by sexual reproduction using genetic programming. During reproduction, offspring genomes are subject to a variety of genetic operators. For each equation that the two parents have in common (the equations have unique identification tags based on when they first appeared via mutation in the simulation), either the equation from parent 1 or parent 2 will go to the offspring. Which equation is inherited is decided randomly for each equation in common. An offspring must receive at least one equation from each parent, thus sexual reproduction is enforced at the equation level. If an offspring receives an equation that contains a variable modified by another equation that is not common to both parents, the offspring will inherit that equation as well (see Supplementary Fig. S2 for several examples).

A mutation will occur in an equation tree with a probability of p m ; here p m = 0.025β/n, where n is the number of trees in the genome and β is independently calculated on each island every 10,000 timesteps as 500/b with b being the numbers of births on the island in the previous 10,000 timesteps. The parameter β saturates at 100 but has no minimum; it is used in an effort to keep the number of mutations per unit time constant.

A tree mutation is a point or subtree mutation with equal probability. A point mutation takes the form of a perturbation of a constant leaf (if any exist in the tree) or the mutation of another node with equal probability. A perturbation of a constant is drawn from the Guassian distribution N(μ, σ); here, μ = 0 and σ = 0.5. A mutation to a branch-node reassigns it to another arithmetic operation and a mutation to a variable leaf changes it to another variable or a new constant, k; here, k ∈[−5, 5]. A subtree mutation replaces a randomly selected node with a randomly generated subtree (generated via the ramped half-and-half method, see below). There is a 5% chance that the randomly generated subtree will replace the entire original tree, with the original tree then being spliced onto a randomly selected node on this new subtree. A genetic splice operation occurs with a probability p m whereby a randomly selected node is replaced with a randomly selected subtree from a parent genome.

Initial conditions of the state, v0, are also subject to mutation with probability p m ; in these mutations, the initial state values are either augmented by a perturbation taken from N(0, 0.25) or completely replaced with a random value drawn from the interval [−1, 1] with equal probability.

Finally, for each tree in an offspring genome, there is a probability 0.5p m that a new state equation will be added to the offspring’s genome, with a reference to the corresponding new state variable inserted into a randomly selected location on the tree. The equation tree for the new variable is initialized in the same manner as for the primordial population (see below).

There is a total of 100 islands in NoiseWorld, each with a large two-dimensional expanse (−20 < x < 20, −20 < y < 20). If a robot reaches the edge of an island (which is only possible with very long lifespans as robots are initialized far from the edges of their island and can only move an average of 0.0005 units per timestep, see below), it effectively falls off the island, a death that is enforced at the next reproduction event. The subpopulations on each island are isolated except for the occasional migration occurring at birth (see below). Topologically, NoiseWorld is toroidal where every island is surrounded by eight neighbouring islands, four sharing a “border” and four sharing a “corner” on a two-dimensional manifold. Each island is seeded with 50 agents and each agent is initialized with a random genome using the “ramped half-and-half” method27 to generate trees with a maximum depth of 1 or 2. Ramped half-and-half is a combination of two methods, the “full” method and the “grow” method. In the “full” method, nonterminal nodes are randomly generated until the maximum depth is reached. At the maximum depth, only terminal nodes are created. In the “grow” method, as in the “full” method, only terminal nodes are created at the maximum depth. The difference is that before the maximum depth is reached, randomly generated nodes can be either terminal or nonterminal nodes, allowing for a wider range of potential tree shapes. The ramped half-and-half method chooses to create a random subtree using either the “full” or “grow” method with equal probability.

Each agent is supplied with two immutable equations, Δxt+Δt = aΔt cosϑt and Δyt+Δt = aΔt sinϑt in Δvt+Δt, which govern its movement; a is drawn from N(1, 0.025) and Δt = 0.0005 in dimensionless time units. Each robot knows its latitude and longitude, x and y, but has no direct information about any of its fellow robots. The angle ϑt is measured relative to either “east” (+x direction) or “north” (+y direction).

Offspring begin life in a randomly selected location on its parents’ island (within a circle, here of radius 1.13, centred on the origin) although there is a small probability (p b = 0.001) that it will appear on a bordering island (diagonal migrations are not permitted). A minimum distance (here the reproduction distance ρ) to the offspring’s nearest neighbour is enforced. Parents are also moved to new random locations on their island (in the same manner as described above for their offspring) and reinitialized. Migration allows the spread of genes among islands. Otherwise, robots are restricted to remain on their native islands. To maintain a constant population, when a birth occurs, another robot randomly dies.

A newly created offspring genome has a 10% chance of being selected to undergo equation reduction. In such an event, the following operations are applied recursively across all of the agent’s equation trees:

The subtraction, addition, multiplication or division of two constants is reduced to a single constant by performing the encoded operation.

The sum of two identical subtrees is reduced to 2× a single version of the subtree.

The subtraction of two identical subtrees is reduced to 0.

The multiplication of a subtree by 0 is reduced to 0.

The division of 0 by a nonzero subtree is reduced to 0.

An agent’s genome is limited to a maximum of 200 nodes across all of its equation trees. An offspring born with more than 200 nodes dies immediately.

If one or more of an agent’s output variables exceed the minimum or maximum representable floating-point number, the agent will have that output set to a random floating-point number and will be selected to die when the next birth occurs.

It is worthwhile noting that further investigation and observation of the robots’ behaviour show the evolutionary process to be developing a simple control mechanism. Taking again the evolved agent of era 937 as our example (see Supplementary Equations S6–S7), we see that the mutual dynamics of two identical agents (1 and 2) possesses the fixed point x 1 = x 2 , y 1 = y 2 . Moreover, this point behaves in a stable fashion. From a control-theoretic viewpoint, then, the evolution produces a stable controller in which the objective is bring two agents to consensus in position and where ϑt is the control variable and ω out/in serves as the measurement variable.

Computational experiments

All simulation experiments were run for 48 wall clock hours on a dedicated Linux server with an Intel Xeon E5540 at 2.53 GHz. Each island is implemented as a separate process so that the algorithm can take full advantage of the parallel architecture of the Intel Xeon CPU (8 cores/16 threads). A master/slave parallel implementation is used, where a “master” process handles the synchronization of “slave” processes (i.e., the islands). Islands are synchronized and migrants exchanged every 10,000 timesteps. Islands introduce incoming migrants into their subpopulations at a rate of η =10,000 migrants per timestep (in a randomized order). Migration events are treated as new births on the receiving island, thus engendering a random death on the island at the following timestep. While migration isn’t necessary for symbolic communication to emerge, it has the effect of improving the probability of a run achieving symbolic communication, as well as reducing the accumulation of neutral mutations in agent genomes, thus significantly increasing the number of eras that can be simulated in 48 wall clock hours (Supplementary Table S2).

To test how an island snapshot performs with and without communication enabled (i.e., ω in = 0), as well as to collect data for the correlation calculations (Fig. 2d and Supplementary Fig. S3), a control test simulation was performed. The duration of a control test run is one era (100,000 timesteps) and the genomes used are taken from a snapshot of an island population. The robots are initially placed randomly in the test world and initialized. If during the control test run two robots meet one another (within the distance ρ, see above), the event is counted as a reproduction event but no offspring genome is created. Instead, the two parent robots are moved to new random locations and reinitialized. This prevents any evolution during the control runs. The effects of births and deaths were simulated by moving a robot to new random position and reinitializing it with a probability of 0.001 per robot per timestep.

For correlation calculations, the communication outputs and position information of the top reproducing agent in the snapshot are recorded throughout the test simulation, yielding 100,000 sets of input/output values per test. The Pearson product-moment correlation coefficient r (“correlation” in Fig. 2d and Supplementary Fig. S3) was calculated as

where n is the number of samples (n = 100,000), v i is a sample of the input variable in question (i.e., x or y position), is the mean of the input samples, ω i is a sample of the communication output, and is the mean of the communication output samples. If the communication output is a constant then r is undefined and these points are omitted from Fig. 2d and Supplementary Fig. S3.

Source code

The source code for the computer simulation experiments described in this work is freely available at https://github.com/pgrouchy/NoiseWorld.

Embodied robotic agents

The hardware experiments in this work were performed on e-puck robots. Two agents (EMMs) were run in a synchronized fashion on a laptop, with motor speed adjustments being sent to two e-puck robots via Bluetooth.

Robot orientation ϑ and positions x and y were determined using an overhead webcam (640 × 480 resolution), colour detection software and coloured markers affixed to the top of the robots (see Fig. 4). Binary images indicating the locations of the red or blue markers were created from webcam images by using RGB colour masks. Blob detection was then performed on these binary images. The pixel values of the centroids of the two largest blobs were used to determine robot position, with the larger of the two red blobs always indicating one robot, and the larger of the two blue blobs indicating the other robot. Pixel values of blob centroids were scaled by 424 to yield position value magnitudes comparable to those that agents would typically see in simulation. A robot’s orientation was determined by taking the arctangent (using the atan2 function) of the difference between the pixel values of the centroids of its two blobs/markers.

The protocol for the robotic experiments was as follows (Supplementary Fig. S9):

1 Set all agent variables to their initial values. 2 Get agent x and y positions from overhead tracking system. 3 Evaluate both sets of agent equations for 10 timesteps, with an agent’s ω in being set to the other agent’s ω out from the previous timestep. The x and y input values are not changed during these 10 timesteps, although new noise values are used at each step; ϑ is treated as an internal variable and is thus updated at each step. 4 Calculate the cumulative expected motion of each agent over the past 10 timesteps. This yields a new expected position. Each robot is turned to face its expected position (±π/16) and then set to drive forward. If a robot is already within ±π/16 of this expected orientation, it is not turned. If the robots are in motion and at least one agent needs to turn, both robots are stopped. Otherwise they are left to continue forward in their current direction. 5 Loop back to step 2.

For the hardware experiments presented in this paper, the following EMM was used for both agents:

This is the top reproducing agent from a previously unreported run.

Auditory interpretations of communication outputs