Phenotypic plasticity, learning, and evolution

February 4, 2014 by Artem Kaznatcheev

Learning and evolution are eerily similar, yet different.

This tension fuels my interest in understanding how they interact. In the context of social learning, we can think of learning and evolution as different dynamics. For individual learning, however, it is harder to find a difference. On the one hand, this has led learning experts like Valiant (2009) to suggest that evolution is a subset of machine learning. On the other hand, due to its behaviorist roots, a lot of evolutionary thought simply ignored learning or did not treat it explicitly. To find interesting interactions between the two concepts we have to turn to ideas from before the modern synthesis, namely the Simpson-Baldwin effect (Baldwin, 1896, 1902; Simpson, 1953):



Organisms adapt to the environment individually. Genetic factors produce hereditary characteristics similar to the ones made available by individual adaptation. These hereditary traits are favoured by natural selection and spread in the population.

In the setting of a static fitness landscape, we can view the Baldwin effect as proceeding in two phases: (1) learning guides evolution toward a fitness peak, and (2) learning is phased out at the fitness peak. In October, I concentrated on the latter, because most biologists have a hard time reconciling it with observed phenotypic plasticity. In particular, all the explanations I have seen for the prevalence of learning attribute it to model-extrinsic environmental variability (Sznajder et al., 2012). This is definitely biologically reasonable in many cases, but theoretically boring because the external environmental changes are injecting complexity into the model. In the slightly more interesting case of model-intrinsic environmental variability due to frequency-dependent selection, Smead & Zollman (2009; Smead, 2012) showed that learning is not sustainable for most pairwise interactions. Of course, all these analyses are inherently limited because they assume evolutionary equilibrium, which is in general not reachable even in static fitness landscapes (Kaznatcheev, 2013). In such a setting, it is possible for evolution to remain in the first phase effectively always (i.e. for an exponentially long time). Using this, we can explain observed learning as a consequence of non-equilibrium evolution.

However, the first phase of the Baldwin effect is equally interesting, and actually less intuitive to the hardened Darwinian, as @hsfrey pointed out on r/evolution: “I see no necessary connection between 1 & 2. What is the point of this?” If none of the learning is passed on to the offspring, why would it even matter to evolution?

When most people think of Mendel and his peas, they echo Mayr’s (1982) influential treatise on the Growth of Biological Thought by considering the friar’s most important contribution to be the “inference that each character is represented in the fertilized egg by two, but only two, factors, one derived from the father and the other from the mother” (p. 721; emphasis mine). This is consistent with the orthodoxy of the genetic code: DNA stores all the information, which is then transcribed and transformed by downstream processes into an organism that is only important to the genes in that it is a function for computing fitness. Unfortunately, this interpretation misses an important and subtle feature of Mendel’s thinking:

[T]he distinguishing traits of two plants can, after all, be caused only by differences in the composition and grouping of the elements existing in dynamical interaction in their primordial cells. (Mendel, 1866; translated by Stern & Sherwood, 1968)

It is this ‘dynamical interaction’ that has become the battle cry of many developmental biologists and the related field of evo-devo (Newman, 2002). In this view, the characters or phenotypes of organisms depend as much on the genes as they do on the physical rules governing interactions, the emergent properties of dynamic systems, and environmental feedback on these. By taking the dynamic nature of the organism seriously, it is possible to follow work like Turing’s exploration of morphogenesis or the modern study of epigenetics. The parts of these dynamical interactions affected by chance or the environment are called phenotypic plasticity or, more vaguely, learning.

Although the learned characters are not inherited by offspring, it doesn’t mean that they have no effect on evolution. Inheritance is not the only driver of evolution: phenotypic plasticity affects the fitness of the organism, and that fitness signal is a second important feature of evolution. Hinton & Nowlan (1987) gave an example of how to use this observation as a method for learning to “speed up” evolution. Their work only exploits a difference in timescales between learning and evolution that allows the learning strain to sample more potential phenotypes than the non-learning strain. They consider a completely flat fitness landscape with a single isolated peak. Both the learning and non-learning strains undergo a random walk around this landscape. However, the learning strain is able to “see” much further because each agent samples many more phenotypes during its lifetime. Once a learning organism is within “sight” of the fitness peak, the landscape stops being flat and becomes a gentle slope toward the peak: an agent that is one step closer to the peak can find it slightly faster through learning and thus spend more time at that peak, producing a higher fitness. Evolution can follow this slope to the peak instead of blindly stumbling around. The non-learning strain, however, can only “see” point mutations and thus has no slope to follow until it is right next to the peak.
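The “gentle slope” can be made concrete with a minimal sketch in the spirit of this setup. It assumes a genome of binary loci in which a learner's fixed loci are all correct and its remaining plastic loci are guessed uniformly at random on each learning trial, with lifetime fitness growing linearly in the number of trials left once the peak is found. The function names, the 1000 trials, and the base and peak fitness values are illustrative assumptions for this sketch, not parameters taken from the text above.

```python
def expected_learner_fitness(n_plastic, trials=1000, base=1.0, peak=20.0):
    """Expected lifetime fitness of a learner with n_plastic guessed loci.

    Assumes every fixed locus is already correct, so a trial succeeds
    exactly when all plastic loci are guessed correctly at once.
    """
    p = 0.5 ** n_plastic        # chance that one random guess hits the peak
    expected_remaining = 0.0    # expected number of trials spent at the peak
    still_searching = 1.0       # probability that no earlier trial succeeded
    for t in range(trials):
        # First success on trial t leaves (trials - t) trials at the peak.
        expected_remaining += still_searching * p * (trials - t)
        still_searching *= 1.0 - p
    return base + (peak - base) * expected_remaining / trials


def non_learner_fitness(n_wrong, base=1.0, peak=20.0):
    """A non-learner sits on the flat plain unless every locus is correct."""
    return peak if n_wrong == 0 else base


# Compare a learner with k plastic loci to a non-learner with k wrong loci.
for k in range(10, -1, -2):
    print(f"{k:2d} loci from the peak: "
          f"learner {expected_learner_fitness(k):6.2f}, "
          f"non-learner {non_learner_fitness(k):6.2f}")
```

Under these assumptions, each plastic locus that evolution replaces with the correct fixed allele doubles the per-trial chance of finding the peak. The learner's expected fitness therefore rises step by step, while the non-learner's fitness stays at the baseline until every locus is correct.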

Of course, we can also consider the effects of learning on non-flat landscapes. Suppose we are in a complicated fitness landscape with two agents, one starting at phenotype x and the other at phenotype y, and there is a nearby high fitness peak at phenotype z. The relationship in fitness between x and y can be arbitrary, but suppose f(x) < f(y) while x is “closer” to z than y is. If there is no learning then y will replace x, taking us further away from the fitness peak. However, if there is learning then x will find the fitness peak z faster through learning than y will, and so will spend more of its life expressing z and thus achieve a higher fitness than y. Another way to say this is that learning allowed us to smooth the fitness valley between y and z.
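A small numerical sketch shows the reversal. It assumes, for illustration only, that a learner takes one time step per unit of distance to reach z, earns its innate fitness while searching and f(z) afterwards, and that lifetime fitness is the time average of the two. The function name and all numeric values are hypothetical.

```python
def lifetime_fitness(innate, peak, distance, lifespan, learns):
    """Time-averaged fitness: innate payoff while searching, peak payoff after."""
    if not learns or distance >= lifespan:
        return innate  # never reaches the peak within its lifetime
    return (distance * innate + (lifespan - distance) * peak) / lifespan


F_X, F_Y, F_Z = 1.0, 1.2, 3.0   # hypothetical values with f(x) < f(y) < f(z)
D_XZ, D_YZ = 2, 6               # hypothetical distances: x is closer to z
LIFESPAN = 10                   # hypothetical number of time steps per life

for learns in (False, True):
    w_x = lifetime_fitness(F_X, F_Z, D_XZ, LIFESPAN, learns)
    w_y = lifetime_fitness(F_Y, F_Z, D_YZ, LIFESPAN, learns)
    favoured = "x" if w_x > w_y else "y"
    print(f"learning={learns}: W(x)={w_x:.2f}, W(y)={w_y:.2f}, favoured: {favoured}")
```

With these numbers, selection favours y when there is no learning and x when there is, because x spends eight of its ten time steps at the peak while y spends only four.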

Learning isn’t always helpful. In fact, Steven A. Frank (2011) writes: “It was clear that learning could slow evolutionary rate. Different genotypes could, through learning, end with the same phenotype. Reducing the phenotypic distinction between different genotypes would generally slow evolutionary rate”. In other words, learning could shield genotypes from selection, removing selective pressures and thus slowing evolution down (Kirby et al., 2007). An example of this would be two genotypes x and y with f(x) < f(y) but both close and equidistant to a very high fitness peak z with f(z) >> f(y). Since x and y are equidistant to z, they will find it at about the same time in their learning. Since the peak is close, this will be relatively early and the high fitness f(z) will obscure the original fitnesses f(x) and f(y). This will greatly reduce the strength of selection for y over x and slow down the population’s transition from x to y.
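To get a feel for how strong this shielding can be, here is a worked calculation under one simple assumption that is not part of the argument above: a learner at distance d from z spends d of its T time steps at its innate fitness and the remaining T - d at f(z), and its lifetime fitness W is the time average. The symbols d, T, W, and s (the selective advantage of y over x) are introduced only for this illustration.

```latex
W(g) = \frac{d\,f(g) + (T-d)\,f(z)}{T}, \qquad g \in \{x, y\}

s_{\text{no learning}} = \frac{f(y)}{f(x)} - 1 = \frac{f(y) - f(x)}{f(x)}

s_{\text{learning}} = \frac{W(y)}{W(x)} - 1
                    = \frac{d\,\bigl(f(y) - f(x)\bigr)}{d\,f(x) + (T-d)\,f(z)}

\frac{s_{\text{learning}}}{s_{\text{no learning}}}
  = \frac{d\,f(x)}{d\,f(x) + (T-d)\,f(z)}
```

For example, with f(x) = 1, f(y) = 1.2, f(z) = 10, d = 1, and T = 10, the ratio is 1/91, so the selective advantage of y over x shrinks from 0.2 to roughly 0.002. The larger f(z) is and the earlier it is found, the closer that ratio gets to zero.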

At times it can be even more drastic, and instead of just slowing down evolution, we can ‘reverse’ it or lock in an unfit genotype. The easiest way to see this is to consider phenotypes w, x, y, and z with fitness f(w) << f(x) < f(y) << f(z) and arrange them as z – w – x – y, i.e. z is closer to w than to x and closer to x than to y. Since x is closer to z than y is, it can find the high fitness phenotype faster and spend more of its life at that peak, resulting in a higher overall fitness than the learning y. However, the fitness of w could be so much lower than that of x (w might even be non-viable) that even though w is closer to z, finding z slightly faster than x does is not sufficient to overcome the initial fitness penalty. This will result in a learning organism locked into the non-optimal genotype x, which it could escape if there was no learning present (by moving to y and then maybe continuing a slower climb forward to an eventual vertex of higher fitness than even z).
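The lock-in can be checked numerically with a small sketch. It assumes, purely for illustration, that lifetime fitness is a time-weighted geometric mean of innate and peak fitness, so that a near-lethal innate phenotype drags down the whole lifetime (one way to read “non-viable”). The function name, fitness values, distances to z, and lifespan are all hypothetical.

```python
def lifetime_fitness(innate, peak, distance, lifespan):
    """Time-weighted geometric mean of innate fitness and peak fitness."""
    share_searching = min(distance, lifespan) / lifespan
    return innate ** share_searching * peak ** (1.0 - share_searching)


# Hypothetical values with f(w) << f(x) < f(y) << f(z), arranged z - w - x - y.
FITNESS = {"w": 0.01, "x": 1.0, "y": 1.2}
F_Z = 10.0
DISTANCE_TO_Z = {"w": 1, "x": 2, "y": 3}
LIFESPAN = 10

with_learning = {
    g: lifetime_fitness(FITNESS[g], F_Z, DISTANCE_TO_Z[g], LIFESPAN)
    for g in FITNESS
}

for g in ("w", "x", "y"):
    print(f"{g}: innate {FITNESS[g]:.2f}, with learning {with_learning[g]:.2f}")

print("fittest without learning:", max(FITNESS, key=FITNESS.get))
print("fittest with learning:   ", max(with_learning, key=with_learning.get))
```

With these numbers, y is the fittest of the three without learning, but with learning x beats both of its neighbours w and y, so a population at x has no single step that improves fitness.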

Of course, for any of these mechanisms to work, we have to satisfy these varied conditions on the fitness landscape and learning has to see further (and in a different way) than evolution does. The biological question becomes: are there any models that satisfy all these examples and achieve this sort of fitness valley smoothing?

Biologists seem to believe that at least some of these exist. Ancel (2000; see also Frank, 2011) observed:

Learning accelerates evolution only under certain conditions.

Although learning might speed evolution, that isn’t necessarily why it evolved.

Learning can eliminate otherwise uncross-able fitness valleys.

Unsurprisingly, the Baldwin effect is very difficult to study empirically, in part due to the difficulty of finding model organisms that both display significant phenotypic plasticity or learning and have generations that are fast enough to run experiments. Thankfully, the fruit fly has proved to be useful for looking at the Baldwin effect (phase one) in controlled experiments (Mery & Kawecki, 2004), and the tiger snake in its isolated island environments has allowed us to look at genetic assimilation (phase two) in the wild (Aubret & Shine, 2009). However, as is often the case in the evolutionary biology of fitness landscapes, more connections are still needed between theory and practice.

References

Ancel, L.W. (2000). Undermining the Baldwin expediting effect: Does phenotypic plasticity accelerate evolution? Theoretical Population Biology, 58(4): 307-319.

Aubret, F., & Shine, R. (2009). Genetic assimilation and the postcolonization erosion of phenotypic plasticity in island tiger snakes. Current Biology, 19(22): 1932-1936.

Baldwin, J.M. (1896). A new factor in evolution. American Naturalist, 30: 441-451, 536-553.

Baldwin, J.M. (1902). Development and evolution. Macmillan, New York.

Frank, S.A. (2011). Natural selection II: Developmental variability and evolutionary rate. Journal of Evolutionary Biology, 24(11): 2310-2320. PMID: 21939464.

Hinton, G. E., & Nowlan, S. J. (1987). How learning can guide evolution. Complex Systems, 1(3): 495-502.

Kaznatcheev, A. (2013). Complexity of evolutionary equilibria in static fitness landscapes. ArXiv: 1308.5094v1.

Kirby, S., Dowman, M., & Griffiths, T. L. (2007). Innateness and culture in the evolution of language. Proceedings of the National Academy of Sciences, 104(12): 5241-5245.

Mayr, E. (1982). The growth of biological thought: diversity, evolution and inheritance. Harvard University Press.

Mery, F., & Kawecki, T. J. (2004). The effect of learning on experimental evolution of resource preference in Drosophila melanogaster. Evolution, 58(4): 757-767.

Newman, S.A. (2002). Putting genes in their place. Journal of Biosciences, 27: 97-104.

Simpson, G.G. (1953). The Baldwin effect. Evolution, 7(2): 110-117.

Stern, C., & Sherwood, E.R. (1968). The origin of genetics: a Mendel source book. San Francisco: W.H. Freeman.

Sznajder, B., Sabelis, M. W., & Egas, M. (2012). How adaptive learning affects evolution: reviewing theory on the Baldwin effect. Evolutionary Biology, 39(3): 301-310.

Valiant, L.G. (2009). Evolvability. Journal of the ACM, 56(1): 3.