In 2012, at a scientific conference, I met a Swedish population geneticist named Ola Hössjer. He and I sat down in the lobby of the hotel where we were staying to discuss what kind of population genetics model could test whether humanity might have come from a single first pair of humans. Our motivation was the repeated challenge from other population geneticists, who claimed that we humans had to come from a population of thousands, not just two. He and I both knew the assumptions that went into the models those population geneticists had constructed, and we wondered whether different starting assumptions would yield different results.

Population genetics is a field that uses math to model how genes and mutations are distributed in populations and how that distribution changes over time. It can be used to model our ancestry and to reconstruct our genetic history, either for the population as a whole or for smaller subgroupings such as European, Asian, African, etc., or even tribes within a larger grouping. The standard population genetics models that reconstruct ancestral history work backward by a process of coalescence to a starting point where everything is identical: everything starts out the same, with one set of chromosomes, and diverges from there through the accumulation of mutations and the processes of recombination, genetic drift, and natural selection.
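To make the idea of coalescence a little more concrete, here is a minimal sketch of the textbook (Kingman) coalescent, which is one common way of working backward in time. It is only an illustration, not the model discussed in this article. The function name `simulate_tmrca`, the constant population size, and the sample sizes are all assumptions chosen for the example.

```python
import random


def simulate_tmrca(n_samples, pop_size, rng):
    """Return the time, in generations, for n_samples lineages to merge into one.

    Assumes a constant diploid population of pop_size individuals
    (2 * pop_size gene copies) and the standard Kingman approximation.
    """
    lineages = n_samples
    elapsed = 0.0
    while lineages > 1:
        # Number of lineage pairs that could share a parent copy.
        pairs = lineages * (lineages - 1) / 2
        # Each pair coalesces at rate 1 / (2 * pop_size) per generation.
        rate = pairs / (2 * pop_size)
        elapsed += rng.expovariate(rate)
        lineages -= 1  # two lineages merge into a single ancestor
    return elapsed


rng = random.Random(1)
times = [simulate_tmrca(10, 10_000, rng) for _ in range(2000)]
mean_tmrca = sum(times) / len(times)

# Theoretical expectation under the same assumptions: 4N(1 - 1/n).
expected = 4 * 10_000 * (1 - 1 / 10)
print(f"simulated mean: {mean_tmrca:.0f}, expected: {expected:.0f}")
```

Under these assumptions every sampled lineage traces back to one ancestral copy, which is the "everything starts out the same" starting point described above.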

Back in that hotel lobby, Ola and I quickly came up with a list of variables that any model would need to account for, all of them unknown aspects of the history of our origin. We also talked about the computational problems of any forward-looking model, one that goes from two individuals at the start to something like the present population. Keeping track of all the variables and tracing the possible genetic changes quickly becomes too computationally intensive to go very far. I personally thought such a model was intractable and beyond anyone’s ability to build. Was I wrong!

A little over a year ago Ola presented a model to our now co-author Colin Reeves and me that took all those variables we had discussed in Copenhagen into account. It is the most comprehensive population genetics model I have seen anywhere, and it’s a brilliant piece of work. Ola found a way to solve the problem of the explosive nature of forward-directed models I mentioned above. He uses the same coalescent technique as the standard models to reconstruct an ancestral tree from a few thousand individuals in the present time, going backward to a starting point of two. His model then reverses the process by going forward in time, using the tree as a framework to keep track of genetic changes.

The model is general enough that it can be of use to any geneticist to test the effects on genetic diversity of processes such as migration, age structure, mating behavior, and other aspects of population dynamics and demography.

The key assumption that distinguishes our model from the standard ones is that the first pair started out with heterogeneous chromosomes: four distinct sets, two sets for each individual. The standard population genetics models work backward assuming everything starts from a single point. We are proposing that things started out different, not the same, with diversity present from the beginning in the genomes of the starting first pair.
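As a rough illustration of how the backward-then-forward idea and the starting assumption fit together, here is a toy sketch. It is not the published model: it follows a single stretch of chromosome, ignores recombination, mating, and branch lengths, and applies a flat per-step mutation chance. The names `build_genealogy` and `drop_haplotypes`, the four example haplotypes, and every parameter value are hypothetical.

```python
import random


def build_genealogy(n_samples, n_founder_copies, rng):
    """Work backward: merge random pairs of lineages until only
    n_founder_copies lineages remain. Returns (parent, roots)."""
    active = list(range(n_samples))
    parent = {}
    next_id = n_samples
    while len(active) > n_founder_copies:
        a, b = rng.sample(active, 2)
        active.remove(a)
        active.remove(b)
        parent[a] = next_id
        parent[b] = next_id
        active.append(next_id)
        next_id += 1
    return parent, active


def drop_haplotypes(parent, roots, founder_haplotypes, n_samples, mut_prob, rng):
    """Work forward: give each root one founder haplotype, then pass it
    down the tree, flipping each site with probability mut_prob per step."""
    hap = {root: founder_haplotypes[i] for i, root in enumerate(roots)}
    # Ancestors always have larger ids than their descendants, so visiting
    # ids in descending order handles every parent before its children.
    for node in sorted(parent, reverse=True):
        inherited = hap[parent[node]]
        hap[node] = tuple(
            site ^ 1 if rng.random() < mut_prob else site for site in inherited
        )
    return [hap[i] for i in range(n_samples)]


rng = random.Random(0)
# Two diploid founders carry four chromosome copies at this locus.
parent, roots = build_genealogy(50, 4, rng)

distinct_founders = [(0, 0, 0, 0), (0, 0, 1, 1), (0, 1, 0, 1), (1, 1, 1, 0)]
identical_founders = [(0, 0, 0, 0)] * 4

# With mutation switched off, any diversity in the sample must come
# from the founders themselves.
diverse = drop_haplotypes(parent, roots, distinct_founders, 50, 0.0, rng)
uniform = drop_haplotypes(parent, roots, identical_founders, 50, 0.0, rng)
print(len(set(diverse)), len(set(uniform)))
```

In this toy setup the tree is built once, going backward, and then reused going forward, so the only lineages tracked are those ancestral to the sample. Raising `mut_prob` above zero adds new variation on top of whatever the founders carried.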

We still need to code this model. That work is in progress, led by Colin Reeves, and we hope others will join in, as it is a massive project that will require time and resources. When it’s completed, we will be able to test the hypothesis that modern genetic diversity can be recreated starting from an original pair with original genetic diversity. Should we be able to demonstrate this, there will be two competing models for human origins: one that says we came from a population of thousands, and ours, which says we came from a population of two. We will see which best fits the available data and yields the most insight.

The model has now been published in the journal BIO-Complexity, in two parts, the first being a general introduction to population genetics and the rationale for the model, and the second being the model itself. My hope is that this model will be the catalyst for much research and discussion, on both sides.
