In the early 1970s the eminent evolutionary geneticist Richard C. Lewontin wrote that population genetics “was like a complex and exquisite machine, designed to process a raw material that no one had succeeded in mining.” By this, Lewontin meant that in the 1930s, when R. A. Fisher, Sewall Wright and J. B. S. Haldane established the theoretical foundations of the field, the techniques available to discover the variation in populations, and so test their suppositions, were rather thin (naturally, this resulted in many controversies; see The Origins of Theoretical Population Genetics). Geneticists were using classical methods, utilizing salient phenotypes which were proxies for underlying genetic markers, and tracing patterns of co-inheritance between novel mutants and traits with known locations in the genetic map. Researchers were not even clear at that point as to the underlying biochemical structure of the particle of Mendelian inheritance, what we term DNA. That arrived onto the scene in the 1960s. But in the early 1970s, when the above was written, we’re not talking about DNA sequencing. Rather, this is the allozyme era, which Lewontin helped usher in with a paper in 1966. He expresses the excitement of the times later in the passage:

“Quite suddenly the situation has changed. The mother-lode has been tapped and facts in profusion have been poured into the hoppers of this theory machine. And from the other end has issued–nothing. It is not that the machine does not work, for a great clashing of gears is clearly audible, if not deafening, but it somehow cannot transform into a finished product the great volume of raw material that has been provided.”

Despite the pessimism expressed above, the emergence of molecular evolution stimulated the debates around neutral theory. Over a generation ago evolutionary geneticists were grappling with the swell of data which was confronting theoretical frameworks constructed in the early 20th century. Today we live in the “post-genomic” era, and now think in terms of whole genomes. The details may differ, but many of Lewontin’s observations in the 1970s still hold true, as novel results meet the paradigms of old. Last month in PNAS Brian Charlesworth published a paper which brought this to mind, Causes of natural variation in fitness: Evidence from studies of Drosophila populations. You may know Charlesworth as the coauthor of Elements of Evolutionary Genetics, an encyclopedia of a text which I highly recommend to all. In the paper, which is both a review for those of us not steeped in Drosophila genetics and a distillation of derivations to be found in the supplements, Charlesworth notes that the typical selection coefficients inferred for deleterious alleles from population genomics contradict those inferred from quantitative genetics. Population genomics is a new field, and involves sequencing many markers (often whole genomes) to good accuracy across a reasonable number of individuals. Quantitative genetics is a more classical framework utilizing statistical methods which interpret variation in traits within laboratory populations.

The fruit fly has a storied role in Mendelian genetics. To a great extent the study of the fruit fly is the early history of Mendelian genetics (see Lords of the Fly: Drosophila Genetics and the Experimental Life). Therefore it is natural that a large body of research exists in this area, and one can’t accept novel results obtained through new methods such as genomics at face value without some degree of skepticism. Charlesworth notes that the mutations of extremely small fitness effect discovered via genomic methods are biased toward single nucleotide variants (SNVs), that is, point mutations. In contrast, it seems likely that the larger-effect mutations implied by quantitative genetic studies, which are rather rare and so missed at population genomic sample sizes, are due to transposable elements (TEs) interspersing themselves across the genome, and presumably disrupting function. In line with older theoretical models, most of the variation in fitness is due to a small number of mutations. Presumably as genomic methods get better (e.g., longer reads to catch repeat elements, and larger sample sizes) they will converge upon the older established quantitative genetic methods. Two other interesting results emerge from this paper. One is that much of the variation is due to balancing selection. For theoretical reasons balancing selection cannot be pervasive across the genome (too much fitness variation would result in huge death rates per generation), but of the variation within the population, much is maintained by balancing selection according to Charlesworth. Another is that population genomic methods seem to be better at capturing the distribution of fitness effects in humans, because of our smaller effective population size. You can read the paper for the technical reason why, but the key here is to remember that one has to be careful about extrapolating from model organisms. The models are imperfect, and we need to take care never to outrun our ability to generalize.
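
To give a rough sense of why balancing selection can't be pervasive, here is a minimal sketch of the classic segregation load argument. It assumes the textbook heterozygote-advantage model (only one form of balancing selection, and not necessarily the one Charlesworth analyzes), with genotype fitnesses 1 - s, 1, and 1 - t, and loci that act independently and multiplicatively. The function names and the values of s, t and the locus counts are my own illustrative assumptions, not figures from the paper.

```python
def segregation_load(s: float, t: float) -> float:
    """Per-locus load under heterozygote advantage.

    Assumed genotype fitnesses: 1 - s (AA), 1 (Aa), 1 - t (aa).
    At equilibrium the frequency of A is t / (s + t), so mean fitness
    is 1 - s*t / (s + t); the load is the shortfall from 1.
    """
    return s * t / (s + t)


def mean_fitness(n_loci: int, s: float, t: float) -> float:
    """Mean fitness relative to the best possible genotype, assuming
    n_loci independent loci with multiplicative fitness effects."""
    return (1.0 - segregation_load(s, t)) ** n_loci


if __name__ == "__main__":
    # Illustrative values only: a 1% disadvantage for each homozygote.
    for n_loci in (10, 100, 1000, 10000):
        print(n_loci, mean_fitness(n_loci, s=0.01, t=0.01))
```

Under these assumptions each locus costs only half a percent in mean fitness, but by a thousand such loci the population's mean fitness has dropped below 1% of that of the ideal multiply heterozygous genotype. That is the sense in which too much fitness variation implies implausible death rates per generation.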

As genomics becomes pervasive in population genetics this sort of analysis will be more common. Rather than “genome-of-the-week” papers we’ll move to actually trying to grapple with what the sequence data is telling us specifically about the lineage in question, and what we can generalize from the results about evolution writ large. Some organisms have a long history of scientific study, so population genomics will supplement and complement the existing body of work. In other cases, though, organisms do not have such a rich literature and scientific culture, and the pitfalls that are highlighted here might alert us to the deficiencies in genomic methods.

Citation: Charlesworth, Brian. “Causes of natural variation in fitness: Evidence from studies of Drosophila populations.” Proceedings of the National Academy of Sciences (2015): 201423275.