A health worker collecting blood samples in Uganda for the African Genome Variation Project. Blood samples are used to extract genetic material for sequencing and genotyping. Credit: Gina Murphy

All the information necessary to describe a living organism is contained within its genome, which is typically made of DNA. Given the wide spectrum of shapes, behaviours and chemistries living things come in, it’s only natural to assume that these great differences are mirrored by similarly great differences in the genome. Yet this is far from the case. Comparative genomics shows just how similar genomes are, both between and within species. Mice and men, for example, differ by about 2.5% genome-wise. Two average people, on the other hand, share 99.9% of their genetic material. Given the size of the genome, however, these small percentages still translate into a very large number of differences in the DNA sequence, which is why genetic identification is possible at all. Identifying these differences may shed light on the course of different lineages through time and even pinpoint regions of the genome that are important for medical research.
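To get a feel for the numbers, here is a rough back-of-the-envelope sketch. It assumes a round figure of about 3 billion base pairs for the human genome (an approximation that does not come from the study itself) and simply converts a percentage difference into a count of differing bases.

```python
# Assumed round figure for the size of the human genome, in base pairs.
GENOME_SIZE_BP = 3_000_000_000

def differing_bases(percent_difference, genome_size=GENOME_SIZE_BP):
    """Approximate number of differing bases for a given percentage difference."""
    return round(genome_size * percent_difference / 100)

# Two people sharing 99.9% of their genetic material differ by 0.1%.
between_two_people = differing_bases(0.1)
print(f"{between_two_people:,} differing bases")
```

Under that assumption, a 0.1% difference still amounts to roughly three million positions in the sequence.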

So far, several projects have taken up the task of enriching the available knowledge on human diversity. Still, many populations remain under-represented. These include many African populations, which are particularly important given the immense genetic diversity of the continent and because understanding genetic variation in African populations may provide insight into human evolution. In a paper published in Nature last week, a team of scientists describes an attempt to quantify genetic diversity in sub-Saharan Africa (SSA) through the African Genome Variation Project and discusses the implications for positive selection, genetic admixture and the association of genes with a number of ailments, such as malaria.

The study genotyped 2,185 individuals from 16 distinct African populations, examining about 2.2 million variants, and sequenced 320 individuals from 7 ethno-linguistic groups. The difference between sequencing and genotyping lies in the depth of the genetic analysis. Sequencing someone’s genome can be a laborious and costly procedure. The Human Genome Project, which produced the first complete sequence of human genetic material, took more than a decade to complete. Today, thanks to the knowledge acquired from that and other such projects, the regions that vary between humans are known to some extent. Genotyping takes advantage of this by examining only those regions, which speeds up the process and makes it more cost-effective.

Of course, this procedure has its limitations. The number of people who have been sequenced so far is small in comparison to the global population, so genotyping, which only checks variants that are already known, may miss a lot of variation. However, it is still effective.
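The trade-off between the two approaches can be illustrated with a toy sketch. The sequences and the list of known variant sites below are made up for illustration, and real genotyping relies on laboratory arrays rather than string comparison. The point is only that sequencing inspects every position, while genotyping inspects a pre-chosen list and so cannot see anything outside it.

```python
# Hypothetical toy data: a reference sequence and one individual's sequence.
reference = "ACGTACGTACGTACGT"
sample    = "ACGTACCTACGTACGA"  # differs from the reference at positions 6 and 15

# Positions already known to vary between people (assumed for this example).
known_variant_sites = [2, 6, 10]

def sequence_differences(ref, seq):
    """'Sequencing': compare every position and report all differences."""
    return [i for i, (r, s) in enumerate(zip(ref, seq)) if r != s]

def genotype_differences(ref, seq, sites):
    """'Genotyping': check only the known variant sites."""
    return [i for i in sites if ref[i] != seq[i]]

found_by_sequencing = sequence_differences(reference, sample)
found_by_genotyping = genotype_differences(reference, sample, known_variant_sites)
missed = [i for i in found_by_sequencing if i not in found_by_genotyping]

print("sequencing checked", len(reference), "positions, found", found_by_sequencing)
print("genotyping checked", len(known_variant_sites), "positions, found", found_by_genotyping)
print("variants missed by genotyping:", missed)
```

Here genotyping does far less work but misses the difference at position 15, because that site was never on its list.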

The scientists identified almost 30 million single nucleotide polymorphisms (SNPs), single-base differences in the genome that are widespread within a population. A significant proportion of these were unshared and novel, particularly among Ethiopian populations. In addition, the findings suggested a great degree of common ancestry between Ethiopian populations and isolated hunter-gatherer communities, as well as ancestral connections between hunter-gatherers and Eurasian communities. Besides confirming known loci (specific positions in the genome) under possible selection in SSA, such as those associated with skin pigmentation or Lassa fever, the study found several novel loci that differ between Europe and Africa, including genes associated with malaria susceptibility and hypertension. However, little differentiation was identified across African countries, suggesting a great genetic influence from Eurasian lineages. Additional findings included genes associated with type 1 diabetes, the response to trypanosomes (the protozoans responsible for sleeping sickness) and trachoma, also known as granular conjunctivitis.

The results also indicated the additional steps needed to capture the full genetic diversity of the region and to identify potential associations of genes with particular conditions like malaria. Firstly, larger reference panels need to be created to improve the accuracy of imputation, the statistical prediction of genotypes that were not directly observed. Secondly, to improve the association of particular loci with various conditions, a broad sequencing approach is necessary, aimed at capturing haplotype diversity (haplotypes are groups of alleles that are usually inherited together). To achieve this, the authors conclude that a pan-African array (a chip with attached DNA sequences used in sequencing and genotyping) needs to be developed. Such a step, followed by a large-scale sequencing effort across Africa, may shed light on ancient admixture and patterns of historical population movement, and may create a valuable resource for future medical genomic studies.
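Why a larger reference panel helps imputation can be shown with a deliberately simplified sketch. It fills in the missing positions of a sample by copying them from the best-matching haplotype in a panel. This nearest-match approach is a stand-in chosen for illustration, not the statistical method used in the study or in real imputation tools, and all haplotypes below are invented.

```python
def impute(sample, panel):
    """Fill '?' positions by copying from the panel haplotype that best
    matches the sample at its observed positions (toy nearest-match rule)."""
    def matches(hap):
        return sum(1 for s, h in zip(sample, hap) if s != "?" and s == h)
    best = max(panel, key=matches)
    return "".join(h if s == "?" else s for s, h in zip(sample, best))

# Hypothetical data: the individual's true haplotype and what was observed.
true_haplotype = "GACCGATT"
observed       = "GAC?G?TT"  # two positions were not genotyped

# A small panel lacking this individual's haplotype, and a larger one with it.
small_panel = ["AACCGGTT", "AATCGGTA"]
large_panel = small_panel + ["GACCGATT"]

from_small = impute(observed, small_panel)
from_large = impute(observed, large_panel)

print("small panel:", from_small, "correct:", from_small == true_haplotype)
print("large panel:", from_large, "correct:", from_large == true_haplotype)
```

With the small panel, one of the two missing positions is filled in wrongly because no similar haplotype is available to copy from. Once the panel includes a haplotype typical of the individual's population, both are recovered.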

What other populations may provide interesting insights into human evolution?