When last we saw the Vi 33.16 X chromosome, I was wresting out its secrets by looking for SNP haplotypes shared by this Neandertal with the European and African samples from the HapMap (“Neandertal segments of X chromosomes”). Neandertal haplotypes in the CEU (Utah, European ancestry) sample that are not also found in the African samples are candidate loci for Neandertal ancestry outside Africa.

In my earlier post, I pointed out some drawbacks and weaknesses of this simple approach. SNPs have less power than sequence data, and we will miss relevant short haplotypes. Some Neandertal-derived alleles are probably present at low frequencies in Africa. Because we exclude haplotypes that also occur in Africa, even rarely, we will miss these cases. What we will find is a filtered set of Neandertal candidate loci, where we don’t control the filter.
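As a minimal sketch of the filter described above, assume the haplotypes have already been called and reduced to simple labels. The labels, the counts, and the function name `candidate_haplotypes` are all hypothetical, made up for illustration; this is not the code behind the original analysis.

```python
from typing import Dict, List, Set


def candidate_haplotypes(
    neandertal: Set[str],
    ceu_counts: Dict[str, int],
    african_counts: Dict[str, int],
) -> List[str]:
    """Haplotypes carried by the Neandertal, seen in CEU, never seen in the African samples."""
    return sorted(
        hap
        for hap in neandertal
        if ceu_counts.get(hap, 0) > 0 and african_counts.get(hap, 0) == 0
    )


# Toy data, invented for illustration only.
neandertal = {"hapA", "hapB", "hapC", "hapD"}
ceu_counts = {"hapA": 2, "hapB": 35, "hapC": 1, "hapE": 7}
african_counts = {"hapB": 40, "hapC": 1}

candidates = candidate_haplotypes(neandertal, ceu_counts, african_counts)
print(candidates)  # ['hapA']
```

Notice that `hapC` is thrown out because of a single African copy. That is exactly the kind of case the filter cannot see.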

Finding these haplotypes lets us look at their frequencies within the European sample. As I pointed out, most of the Neandertal haplotypes in the CEU sample are rare, occurring in only one or two copies. A handful are quite common, with up to 30-40 copies in the sample. A good-sized set occurs in 5-10 copies.

We know from Green and colleagues’ comparisons that at least three people outside of Africa have the same fraction of Neandertal ancestry – one from France, one from China, and one from Papua New Guinea. But there’s no reason to think they have inherited the same segments from Neandertals. The overall proportion of Neandertal ancestry is very slight, less than five percent. If five percent of loci were 100 percent Neandertal, then everyone would have the same Neandertal loci. But that’s not the way they are distributed. Different individuals certainly have different Neandertal genes.

A rare allele in one sample is quite likely not to appear in geographically distant samples. So for many of the Neandertal haplotypes in the CEU sample, we shouldn’t expect to see them in China. And, as you can tell from the figure below, that is in fact the case.

What you’re looking at is a 3-D histogram of Neandertal candidate haplotypes in China and Europe. The number of copies in the CEU HapMap sample is on the X axis, and the number of copies in the CHB HapMap sample is on the Z axis, going back into the picture. Running along the X axis from the origin, at the leftmost corner, is the set of haplotypes present in CEU but absent in CHB. As you can see, the most frequent outcome is one copy in either one sample or the other. This being a histogram, those are both lumped into the highest bar at the origin.

Here’s a detail of the area near the origin, turned upward so we’re looking at almost an X-Z plot.

As we go down the X axis, you see there are many haplotypes with 3 or 4 copies in CEU and none in CHB. In fact, very few haplotypes have 3 copies in CEU and any copies at all in CHB, far fewer than have 3 copies in CEU and none in CHB. The ones that have 10 or so copies in both samples are, well, scarce.
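The tabulation behind a histogram like this is simple to sketch. The example below assumes we already have per-haplotype copy counts for each sample; the haplotype labels, the counts, and the helper names `joint_histogram` and `split_by_chb` are hypothetical and are not taken from the actual data.

```python
from collections import Counter
from typing import Dict, Tuple


def joint_histogram(ceu_counts: Dict[str, int], chb_counts: Dict[str, int]) -> Counter:
    """Count how many haplotypes fall at each (CEU copies, CHB copies) pair."""
    haplotypes = set(ceu_counts) | set(chb_counts)
    return Counter(
        (ceu_counts.get(h, 0), chb_counts.get(h, 0)) for h in haplotypes
    )


def split_by_chb(hist: Counter, ceu_copies: int) -> Tuple[int, int]:
    """For one CEU copy number, return (haplotypes absent from CHB, haplotypes present in CHB)."""
    absent = hist[(ceu_copies, 0)]
    present = sum(n for (x, z), n in hist.items() if x == ceu_copies and z > 0)
    return absent, present


# Toy counts, invented for illustration only.
ceu = {"h1": 3, "h2": 3, "h3": 3, "h4": 1, "h5": 10, "h6": 3}
chb = {"h3": 2, "h5": 9, "h7": 1}

hist = joint_histogram(ceu, chb)
print(split_by_chb(hist, 3))  # (3, 1)
```

In the toy data, three haplotypes have 3 copies in CEU and none in CHB, against one that has 3 copies in CEU and some in CHB. That comparison, row by row along the X axis, is what the plot shows.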

This is very striking. China and Europe by and large have different Neandertal-derived haplotypes. Haplotypes from Neandertals that are common in Europe (say, with more than two or three copies) are mostly rare in China. And vice versa: haplotypes that are common in CHB are rare in CEU.

Why should this be? Green and colleagues (2010) hypothesized an early population mixture of Africans and Neandertals in West Asia, before that population dispersed throughout the rest of Eurasia. This hypothesis was meant to explain why China and Europe have the same proportion of Neandertal genes.

I think that is also consistent with the fact that China and Europe have different Neandertal genes. If the population mixture was followed by substantial genetic drift as the West Asian population dispersed in different geographic directions, drift would randomly increase the frequency of some haplotypes in one direction, others in the other direction. Europe and China would end up with the same proportion of Neandertal ancestry, but it would be distributed very differently among loci.
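To make the drift argument concrete, here is a toy simulation. It is a sketch under strong assumptions: a plain Wright-Fisher model with independent loci, no selection, no migration, and made-up values for the starting frequency, population size, and number of generations. None of these numbers come from the data, and the function name `drift` is mine.

```python
import random
from typing import List


def drift(freqs: List[float], n_chrom: int, generations: int, rng: random.Random) -> List[float]:
    """Wright-Fisher drift: each generation, every locus is resampled from n_chrom chromosomes."""
    current = list(freqs)
    for _gen in range(generations):
        current = [
            sum(rng.random() < p for _draw in range(n_chrom)) / n_chrom
            for p in current
        ]
    return current


rng = random.Random(1)
n_loci = 500
start = [0.03] * n_loci  # assumed shared Neandertal frequency right after the mixture

# Two populations drift independently from the same starting point.
west = drift(start, n_chrom=100, generations=20, rng=rng)
east = drift(start, n_chrom=100, generations=20, rng=rng)

mean_west = sum(west) / n_loci
mean_east = sum(east) / n_loci

# Loci where the Neandertal haplotype has become "common" (an arbitrary 10 percent cutoff).
common_west = {i for i, p in enumerate(west) if p >= 0.10}
common_east = {i for i, p in enumerate(east) if p >= 0.10}

print(f"mean frequency: west {mean_west:.3f}, east {mean_east:.3f}")
print(f"common in west: {len(common_west)}, common in east: {len(common_east)}, "
      f"common in both: {len(common_west & common_east)}")
```

Under these assumptions the two average frequencies should stay close to the starting value, while the set of loci that happens to drift upward is chosen independently in each population. That is the pattern suggested above: the same proportion overall, distributed differently among loci.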

Next, we’ll examine whether this pattern is the same for the rest of the chromosomes. Or maybe something even more interesting…