Black dots represent the highest-performing network from each of the 100 experiments from both the PA and P&CC treatments. Both the sparsity (p = 1.08 × 10^−16) and the modularity (p = 1.19 × 10^−5) of networks correlate significantly with their performance. Performance was measured in 80 randomly generated environments (Methods). Significance was calculated by a t-test of the hypothesis that the correlation is zero. Notice that many of the lowest-performing networks are close to the maximum of 150 connections.

After observing that a connection cost significantly improves performance and modularity, we analyzed whether this increased performance can be explained by the increased modularity, or whether it may better correlate with network sparsity, since P&CC networks also have fewer connections (P&CC median number of connections is 35.5 [95% CI: 31.0, 40.0] vs. PA 82.0 [74.0, 97.1], p = 7.97 × 10^−19). Both sparsity and modularity are correlated with the performance of networks (Fig. 7). Sparsity also correlates with modularity (p = 5.15 × 10^−40 as calculated by a t-test of the hypothesis that the correlation is zero), as previously shown [23, 66]. Our interpretation of the data is that the pressure for both functionality and sparsity causes modularity, which in turn helps evolve learners that are more resistant to catastrophic forgetting. However, it cannot be ruled out that sparsity itself mitigates catastrophic forgetting [1], or that the general learning abilities of the network have been improved due to the separation into a skill module and a learning module. Either way, the data support our hypothesis that a connection cost promotes the evolution of sparsity, modularity, and increased performance on learning tasks.
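The significance test named above (a t-test of the hypothesis that the correlation is zero) can be sketched as follows. This is a minimal illustration, not the authors' analysis code: it assumes a Pearson correlation, which the text does not specify, and the function names `pearson_r` and `correlation_t_statistic` are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two samples of equal length, n >= 3")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


def correlation_t_statistic(r, n):
    """Test statistic for H0: correlation = 0.

    t = r * sqrt((n - 2) / (1 - r^2)), compared against a Student t
    distribution with n - 2 degrees of freedom.
    """
    if abs(r) >= 1.0:
        # A perfect correlation gives an unbounded statistic.
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / (1.0 - r * r))
```

The two-sided p-value then follows from the tail of a Student t distribution with n − 2 degrees of freedom, for example via `scipy.stats.t.sf` if SciPy is available.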

To quantify whether learning is separated into its own module, we adopted a technique from [23], which splits a network into the most modular decomposition according to the modularity Q score [65]. We then measured the frequency with which the reinforcement inputs (reward/punishment signals) were placed into a different module from the remaining food-item inputs. This measure reveals that P&CC networks have a separate module for learning in 31% of evolutionary trials, whereas only 4% of the PA trials do, which is a significant difference (p = 2.71 × 10^−7), in agreement with our hypothesis (Fig. 1, bottom). Analyses also reveal that the networks from both treatments that have a separate module for learning perform significantly better than networks without this decomposition (median performance of modular networks in 80 randomly generated environments (Methods): 0.87 [95% CI: 0.83, 0.88] vs. non-modular networks: 0.80 [0.71, 0.84], p = 0.02). Even though only 31% of the P&CC networks are deemed modular in this particular way, the remaining P&CC networks are still significantly more modular on average than PA networks (median Q scores are 0.25 [0.23, 0.28] and 0.2 [0.19, 0.22] respectively, p = 4.37 × 10^−6), suggesting additional ways in which modularity improves the performance of P&CC networks.
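The separate-module measurement can be illustrated with a short sketch. It assumes an unweighted, directed network given as a list of (source, target) edges and a module assignment that has already been found; it does not search for the most modular decomposition. The Q formula used here is the common directed Newman-style definition, which may differ in detail from the approximation used in the paper, and all names (`directed_modularity`, `learning_is_separate`, the node labels) are hypothetical.

```python
from collections import defaultdict


def directed_modularity(edges, module_of):
    """Q = (1/m) * sum_ij [A_ij - k_out_i * k_in_j / m] * delta(c_i, c_j).

    edges: list of (src, dst) pairs, one per connection.
    module_of: dict mapping every node to its module label.
    """
    m = len(edges)
    if m == 0:
        return 0.0
    out_by_module = defaultdict(int)  # summed out-degree per module
    in_by_module = defaultdict(int)   # summed in-degree per module
    within = 0                        # edges with both ends in one module
    for src, dst in edges:
        out_by_module[module_of[src]] += 1
        in_by_module[module_of[dst]] += 1
        if module_of[src] == module_of[dst]:
            within += 1
    # Expected number of within-module edges under the degree-preserving null model.
    expected = sum(out_by_module[c] * in_by_module[c] for c in out_by_module) / m
    return (within - expected) / m


def learning_is_separate(module_of, reinforcement_inputs, food_inputs):
    """True if no reinforcement input shares a module with any food-item input."""
    reinforcement_modules = {module_of[n] for n in reinforcement_inputs}
    food_modules = {module_of[n] for n in food_inputs}
    return reinforcement_modules.isdisjoint(food_modules)


# Toy network: food inputs feed a hidden neuron; reward and punishment
# inputs feed a neuromodulatory neuron that acts on the output.
edges = [("food1", "hidden"), ("food2", "hidden"), ("hidden", "out"),
         ("reward", "mod"), ("punish", "mod"), ("mod", "out")]
module_of = {"food1": 0, "food2": 0, "hidden": 0, "out": 0,
             "reward": 1, "punish": 1, "mod": 1}
q_score = directed_modularity(edges, module_of)
separate = learning_is_separate(module_of, ["reward", "punish"], ["food1", "food2"])
```

Counting the fraction of evolved networks for which the second function returns true gives a per-treatment frequency of the kind reported above.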

Dark blue nodes are inputs that encode which type of food has been encountered. Light blue nodes indicate internal, non-modulatory neurons. Red nodes are reward or punishment inputs that indicate if a nutritious or poisonous item has been eaten. Orange neurons are neuromodulatory neurons that regulate learning. P&CC networks tend to separate the reward/punishment inputs and neuromodulatory neurons into a separate module that applies learning to downstream neurons that determine which actions to take. For each treatment, the highest-performing network from each of the nine highest-performing evolution experiments is shown (all are shown in the Supporting Information). In each panel, the left number reports performance and the right number reports modularity. We follow the convention from [23] of placing nodes in the way that minimizes the total connection length.

The presence of a connection cost also significantly increases network modularity (Fig. 4), confirming the finding of Clune et al. [23] in this different context of networks with within-life learning. Networks evolved in the P&CC treatment tend to create a separate reinforcement learning module that contains the reward and punishment inputs and most or all neuromodulatory neurons (Fig. 6). One of our hypotheses (Fig. 1, bottom) suggested that such a separation could improve the efficiency of learning, by regulating learning (via neuromodulatory neurons) in response to whether the network performed a correct or incorrect action, and applying that learning to downstream neurons that determine which action should be taken in response to input stimuli.

Plotted is median performance per day (± 95% bootstrapped confidence intervals of the median) measured across 100 organisms (the highest-performing organism from each experiment per treatment) tested in 80 new environments (lifetimes) with random associations (Methods). P&CC networks significantly outperform PA networks on every day (asterisks). Eating no items or all items produces a score of 0.5; eating all and only nutritious food items achieves the maximum score of 1.0.

Modularity is measured via a widely used approximation of the standard Q modularity score [23, 57, 65, 67] (Methods). For each treatment, the median from 100 independent evolution experiments is shown ± 95% bootstrapped confidence intervals of the median (Methods). Asterisks below each plot indicate statistically significant differences at p < 0.01 according to the Mann-Whitney U test, which is the default statistical test throughout this paper unless otherwise specified.
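A bootstrapped confidence interval of the median, as used in these plots, can be sketched as below. This is a simple percentile bootstrap under assumed settings: the resample count, the seed, and the function name `bootstrap_median_ci` are illustrative choices, and the exact procedure described in Methods may differ.

```python
import random
import statistics


def bootstrap_median_ci(values, n_resamples=5000, confidence=0.95, seed=0):
    """Return (median, lower, upper) using a percentile bootstrap.

    Each resample draws len(values) items with replacement and records
    its median; the interval bounds are percentiles of those medians.
    """
    if not values:
        raise ValueError("values must not be empty")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    medians = sorted(
        statistics.median(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    alpha = (1.0 - confidence) / 2.0
    lower = medians[int(alpha * n_resamples)]
    upper = medians[min(n_resamples - 1, int((1.0 - alpha) * n_resamples))]
    return statistics.median(values), lower, upper
```

For the between-treatment comparison itself, `scipy.stats.mannwhitneyu` provides the Mann-Whitney U test if SciPy is available.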

The addition of a cost for connections (the P&CC treatment) leads to a rapid, sustained, and statistically significant fitness advantage versus not having a connection cost (the PA treatment) (Fig. 4). In addition to overall performance across generations, we looked at the day-to-day performance of final, evolved individuals (Fig. 5). P&CC networks learn associations faster in their first summer and winter, and maintain higher performance over multiple years (pairs of seasons).

Modular P&CC Networks Learn More and Forget Less

We next investigated whether the improved performance of P&CC individuals is because they forget less. Measuring the percent of information a network retains can be misleading, because networks that never learn anything are reported as never forgetting anything. In many PA experiments, networks did not learn in one or both seasons, which looks like perfect retention, but for the wrong reason: they do not forget anything because they never knew anything to begin with. To prevent such pathological, non-learning networks from clouding this analysis, we compared only the 50 highest-performing experiments from each treatment, instead of all 100 experiments. For both treatments, we then measured retention and forgetting in the highest-performing network from each of these 50 experiments.

To illuminate how old associations are forgotten and new ones are formed, we performed an experiment from studies of association forgetting in humans [11]: already evolved individuals learned one task and then began training on a new task, during which we measured how their performance on the original task degraded. Specifically, we allowed individuals to learn for 50 winter days—to allow even poor learners time to learn the winter associations—before exposing them to 20 summer days, during which we measured how rapidly they forgot winter associations and learned summer associations (Methods). Notice that individuals were evolved in seasons lasting only 5 days, but we measure learning and forgetting for 20 days in this analysis to study the longer-term consequences of the evolved learning architectures. Thus, the key result relevant to catastrophic forgetting is what occurs during the first five days. We included the remaining 15 days to show that the differences in performance persist if the seasons are extended.

P&CC networks retain higher performance on the original task when learning a new task (Fig. 8, left). They also learn the new task better (Fig. 8, center). The combined effect significantly improves performance (Fig. 8, right), meaning P&CC networks are significantly better at learning associations in a new season while retaining associations from a previous one.

Fig 8. Comparing the retention and forgetting of networks from the two treatments. P&CC networks, which are more modular, are better at retaining associations learned on a previous task (winter associations) while learning a new task (summer associations), better at learning new (summer) associations, and significantly better when measuring performance on both the associations for the original task (winter) and the new task (summer). Note that networks were evolved with five days per season, so the results during those first five days are the most informative regarding the evolutionary mitigation of catastrophic forgetting: we show additional days to reveal longer-term consequences of the evolved architectures. Solid lines show median performance and shaded areas indicate 95% bootstrapped confidence intervals of the median. The retention scores (left panel) are normalized relative to the original performance before training on the new task (an unnormalized version is provided as Supp. S6 Fig). During all performance measurements, learning was disabled to prevent such measurements from changing an individual’s known associations (Methods). https://doi.org/10.1371/journal.pcbi.1004128.g008

To further understand whether the increased performance of the P&CC individuals is because they learn more, retain more, or both, we counted the number of retained and learned associations for individuals in 80 randomly generated environments (lifetimes). If we regard performance in each season as a skill, this experiment measures whether the individuals can retain a previously learned skill (perfect summer performance) after learning a new skill (perfect winter performance). We tested the knowledge of the individuals in the following way: at the end of each season, we counted the number of sets of associations (summer or winter) that individuals knew perfectly, which required knowing the correct response for each food item in that season. We formulated four metrics that quantify how well individuals knew and retained associations.

The first metric (“Perfect”) measures the number of seasons an individual knew both sets of associations (summer and winter). Doing well on this metric indicates reduced catastrophic forgetting because it requires retaining an old skill even after a new one is learned. P&CC individuals learned significantly more Perfect associations (Fig. 9, Perfect).

Fig 9. P&CC networks significantly outperform PA networks in both learning and retention. P&CC individuals learn significantly more associations, whether counting only when the associations for both seasons are known (“Perfect” knowledge) or separately counting knowledge of either season’s association (total “Known”). P&CC networks also forget fewer associations, defined as associations known in one season and then forgotten in the next, which is significant when looking at the percent of known associations forgotten (“% Forgotten”). P&CC networks also retain significantly more associations, meaning they did not forget one season’s association when learning the next season’s association. See text for more information about the “Perfect”, “Known”, “Forgotten,” and “Retained” metrics. During all performance measurements, learning was disabled to prevent such measurements from changing an individual’s known associations (Methods). Bars show median performance, whiskers show the 95% bootstrapped confidence interval of the median. Two asterisks indicate p < 0.01, three asterisks indicate p < 0.001. https://doi.org/10.1371/journal.pcbi.1004128.g009

The second metric (“Known”) is the sum of the number of seasons that summer associations were known and the number of seasons that winter associations were known. In other words, it counts knowing either season in a year and doubly counts knowing both. P&CC individuals learned significantly more of these Known associations (Fig. 9, Known).

The third metric counts the number of seasons in which an association was “Forgotten”, meaning an association was completely known in one season, but was not known in the following season. There is no significant difference between treatments on this metric when measured in absolute numbers (Fig. 9, Forgotten). However, measured as a percentage of Known items, P&CC individuals forgot significantly fewer associations (Fig. 9, % Forgotten). The modular P&CC networks thus learned more and forgot less, leading to a significantly lower percentage of forgotten associations.

The final metric counts the number of seasons in which an association was “Retained”, meaning an association was completely known in one season and the following season. P&CC individuals retained significantly more than PA individuals, both in absolute numbers (Fig. 9, Retained) and as a percentage of the total number of known items (Fig. 9, % Retained).

In each season, an agent can know two associations (summer and winter), leading to a maximum score of 6 × 80 × 2 = 960 for the Known metric (6 seasons per lifetime (Fig. 2), 80 random environments). The agent can retain or forget two associations each season except the first, making the maximum score for the Retained and Forgotten metrics 5 × 80 × 2 = 800. However, the agent can only score one Perfect association (meaning both summer and winter are known) each season, leading to a maximum score of 6 × 80 = 480 for that metric.
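The four metrics and their maxima can be made concrete with a small sketch. It assumes each lifetime is recorded as a list of per-season pairs `(knows_summer, knows_winter)` taken at the end of each season; this data layout, the function name `association_metrics`, and the use of the Known count as the denominator for both percentages are illustrative assumptions rather than the authors' implementation.

```python
def association_metrics(lifetimes):
    """Count Perfect, Known, Forgotten and Retained associations.

    lifetimes: list of lifetimes; each lifetime is a list of
    (knows_summer, knows_winter) booleans, one pair per season.
    """
    perfect = known = forgotten = retained = 0
    for seasons in lifetimes:
        for i, (summer, winter) in enumerate(seasons):
            known += int(summer) + int(winter)    # both known counts twice
            perfect += int(summer and winter)     # at most one per season
            if i == 0:
                continue  # nothing can be retained or forgotten in the first season
            for before, now in zip(seasons[i - 1], (summer, winter)):
                if before and now:
                    retained += 1
                elif before and not now:
                    forgotten += 1
    return {
        "perfect": perfect,
        "known": known,
        "forgotten": forgotten,
        "retained": retained,
        "pct_forgotten": 100.0 * forgotten / known if known else 0.0,
        "pct_retained": 100.0 * retained / known if known else 0.0,
    }


# An ideal agent that knows both associations in all 6 seasons of 80 lifetimes
# reaches the maxima stated above.
ideal = association_metrics([[(True, True)] * 6 for _ in range(80)])
```

For the ideal agent this yields 960 Known, 480 Perfect, 800 Retained and 0 Forgotten, matching the maxima derived in the text.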

In summary, this analysis reveals that a connection cost caused evolution to find individuals that are better at gaining new knowledge without forgetting old knowledge. In other words, adding a connection cost mitigated catastrophic forgetting. That, in turn, enabled an increase in the total number of associations P&CC individuals learned in their lifetimes.