I’ve talked about rs17822931 in ABCC11 several times. The reasons are manifold. First, on many traits of interest it exhibits variation across populations in a simple Mendelian (recessive expression) manner. Second, there are suggestive variations in its geographic distribution. Third, the traits are kind of interesting without being biomedical. In other words, it’s a cool illustration of pleiotropy and human genetic variation that isn’t going to depress you. If you check out the SNPedia page you’ll note that it is associated with variation in earwax type (wet vs. dry), body odor, and colostrum secretion. This is not the full list, and I’m moderately confident that biologists haven’t hit on all the major phenotypes that this affects variation in.

Until recently I’ve really only been interested in the population genetics of the trait. But talking with a few friends who are molecular biologists, I realized I should follow up and dig deeper, and what I found was very interesting. Specifically, as it relates to body odor, which, like it or not, is a phenotype of significance in the modern world. The trait happens to segregate within my family. My son has the TT genotype, which is possible because both of his parents are heterozygotes. That means he will exhibit less body odor as an adult. How much less?

In the Journal of Dermatological Science I found “Functional characterisation of a SNP in the ABCC11 allele—Effects on axillary skin metabolism, odour generation and associated behaviours.” Obviously this is not a journal I read often, but some of the tables are fascinating. The subjects were a few hundred Filipinos. This is a population where the allele of interest segregates at intermediate frequencies. So there are many individuals with dry earwax as well as wet earwax, and all the associated traits.

Here are some tables I extracted*:

Mean malodour scores

Genotype    5 hours    24 hours
TT          2.59       2.6
CT          3.26       3.4
CC          3.21       3.5

Deodorant use (proportion within each genotype)

Genotype          TT      CT      CC
Uses deodorant    0.5     0.86    0.97
Does not use      0.5     0.14    0.03

I have no idea how subjective malodour scales work, but the moral is pretty straightforward. Those with the TT genotype saturate at a much lower point. This manifests in daily behavior. There is a fair amount of Japanese data showing that people who go to the doctor for body odor issues are much more likely to have wet earwax. The data from the Philippines illustrate that many individuals with the derived genotype, TT, must be conscious enough of their lack of body odor to forgo deodorant purchases, even though I assume deodorant use is normative in the American-influenced culture of the Philippines.

But most interesting to me are the chemical differences of the sweat of the different genotypes. They note that there were differences in Nα-3-methyl-3-hydroxy-hexanoylglutamine (HMHA-Gln), Nα-3-methyl-2-hexenoyl-glutamine (3M2H-Gln), and 3-methyl-3-sulfanyl-hexanol-cysteine-glycine between the genotypes. I don’t know much about these chemicals, except that they are “malodour conjugate precursors”. Not surprisingly, there’s some difference in the microbial flora of the individuals as a function of genotype.

There have been attempts to understand the selection processes which may have shaped the regional distribution of this trait, but I’m not entirely convinced by what I’ve seen. Especially when the authors presume that earwax phenotype is in some way causal (or at least that it can give insight into causality, if that makes sense), when it may just be a developmental side effect. A consideration is that some models assume a recessive expression of the trait, which is true for body odor and earwax. But we don’t know that, if selection occurred, it was on these traits. Because of pleiotropy, traits due to variation at a given gene may exhibit different levels of dominance, from full dominance, to additivity, to recessive expression. The target of selection may exhibit a different dominance coefficient than many of the side effect phenotypes (to give you a concrete example, the locus responsible for blue vs. non-blue eye color in Europeans exhibits some recessivity, but it is also responsible for variation in skin color, where it is additive).

A 2009 paper using the HGDP data set found evidence of selection on ABCC11 using XP-EHH but not iHS. In other words, extended haplotype differences across populations, but not within them, which often implies sweeps that have gone nearly to fixation in some populations rather than ongoing ones within them.

To get a better sense of the distribution of the allele I decided to query the SNP in the 1000 Genomes Browser. I invite you to look at the data yourself. The sample sizes start to get pretty large in some of these populations. It is interesting that in West African populations the ancestral variant is nearly fixed, or totally so. The cases where it is not so can pretty easily be hypothesized as due to recent (last 10,000 years) Eurasian admixture. In Europe the frequency of the derived variant is low, on the order of ~10%, but in the Finnish sample it peaks at ~25%. This aligns with patterns in the HGDP data set. African populations tend to be fixed for the ancestral variant, C, while European populations have a low frequency of the derived variant, T, with a cline toward the northeast from the southwest (i.e., it peaks in the Russians, with the lowest fraction in Sardinians). But Middle Eastern samples in the HGDP data set have European proportions of T as well, though the Mozabites in North Africa do not.

The South Asian samples in the HGDP have higher levels of the derived variant than Europeans, intermediate between that group and East Asians. But the 1000 Genomes data thicken the plot (and with large sample sizes!). The Bangladeshis are at an even higher fraction than the Pakistani populations. The genotype counts are like so: 12 CC, 54 CT, TT. When I saw this I assumed it was East Asian admixture, on the order of 10-20%, which might account for the enrichment of T in relation to Pakistani groups. But that is not correct. Here are the counts for Indian Telugus: 20 CC, 49 CT, and 33 TT. And Sri Lankan Tamils: 23 CC, 49 CT, and 30 TT.

Many hypotheses about the derived variant involve adaptations to cold climates in Northeast Asia. This may still be the case in Northeast Asia, but what you see here is a NW to SE cline of ancestral to derived variant of ABCC11 in South Asia. The Punjabis and Gujaratis have higher fractions of the ancestral variant, as you’d expect from the HGDP data.** (The fraction in the Bangladeshi sample might be elevated by East Asian admixture.)
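
For readers who want to turn the genotype counts above into allele frequencies, here is a minimal Python sketch. The counts are the Telugu and Tamil ones quoted in this post; the function names are my own, and the Hardy-Weinberg comparison is an assumption layered on top (random mating, no substructure), not something taken from the 1000 Genomes Browser.

```python
# Genotype counts for rs17822931 as quoted in the post (1000 Genomes samples).
COUNTS = {
    "Indian Telugu": {"CC": 20, "CT": 49, "TT": 33},
    "Sri Lankan Tamil": {"CC": 23, "CT": 49, "TT": 30},
}


def allele_freq_t(counts):
    """Frequency of the derived T allele from diploid genotype counts."""
    n = counts["CC"] + counts["CT"] + counts["TT"]
    # Each TT individual carries two T alleles, each CT individual one.
    return (2 * counts["TT"] + counts["CT"]) / (2 * n)


def hardy_weinberg_expected(counts):
    """Expected genotype counts if the sample were in Hardy-Weinberg
    proportions (an assumption made here for illustration)."""
    n = counts["CC"] + counts["CT"] + counts["TT"]
    p = allele_freq_t(counts)
    q = 1.0 - p
    return {"CC": n * q * q, "CT": 2 * n * p * q, "TT": n * p * p}


for population, counts in COUNTS.items():
    p = allele_freq_t(counts)
    expected = hardy_weinberg_expected(counts)
    print(f"{population}: T frequency = {p:.3f}")
    for genotype in ("CC", "CT", "TT"):
        print(f"  {genotype}: observed {counts[genotype]}, "
              f"expected {expected[genotype]:.1f}")
```

The T frequency works out to about 0.56 in the Telugus (115 of 204 chromosomes) and about 0.53 in the Tamils (109 of 204), so the derived allele is the majority allele in both South Indian samples.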

The results from East Asian samples in the 1000 Genomes are also illuminating. With sample sizes of around 200 each, the Dai minority (related to the Tai people culturally as their antecedents) has a frequency of 56% for T, the Han from Beijing have 97%, the Han from South China are at 86%, the Japanese 88%, and the Vietnamese from the southern region of the country 64%. First, my intuition is that this seems a strange pattern for an allele which was selected on a recessive trait. Rather, it looks more like selection on a dominant trait, where the frequency remains below 100% because of recessive expression of the unfavored state. Second, the fraction of the ancestral state in the Dai seems rather high. This particular population is sampled from the Mekong region of southern China, as far south as you can go in the nation. This sort of cline correlated with latitude goes a long way toward explaining why the thesis has often emerged that this variation is somehow related to climate (there is something of a north-south cline in Japan as well).
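
To make the intuition about dominance concrete, here is a toy deterministic model, sketched in Python. It is the standard one-locus selection recursion with a dominance coefficient h. The selection coefficient, starting frequency, and number of generations are arbitrary values I picked for illustration, the function names are mine, and an infinite population with no drift or migration is assumed. None of this is estimated from the ABCC11 data.

```python
def next_freq(p, s, h):
    """One generation of selection on the T allele.

    Assumed relative fitnesses: CC = 1, CT = 1 + h*s, TT = 1 + s.
    h = 0 means the advantage of T is fully recessive;
    h = 1 means it is fully dominant.
    """
    q = 1.0 - p
    w_tt, w_ct, w_cc = 1.0 + s, 1.0 + h * s, 1.0
    mean_fitness = p * p * w_tt + 2 * p * q * w_ct + q * q * w_cc
    # T alleles come from all TT offspring and half of the CT offspring.
    return (p * p * w_tt + p * q * w_ct) / mean_fitness


def trajectory(p0, s, h, generations):
    """Allele frequency of T at generation 0, 1, ..., generations."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_freq(freqs[-1], s, h))
    return freqs


# Illustrative values only (assumptions, not estimates).
S = 0.05
P0 = 0.5
GENERATIONS = 1000

recessive = trajectory(P0, S, 0.0, GENERATIONS)
dominant = trajectory(P0, S, 1.0, GENERATIONS)

for gen in (0, 100, 250, 500, 1000):
    print(f"gen {gen:4d}: recessive advantage p = {recessive[gen]:.4f}, "
          f"dominant advantage p = {dominant[gen]:.4f}")
```

In this toy model the allele with a recessive advantage goes essentially to fixation once it is common, while the allele with a dominant advantage is still short of fixation after the same number of generations, because the remaining C alleles sit in heterozygotes where selection cannot see them. Real populations are finite and exchange migrants, so this is only a way to see the shape of the argument.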

Where does this leave us? I honestly don’t think we can make a general conclusion about the nature of selection around this variation. To me it looks as if it was functionally constrained in Africa. Some African populations have the derived variant, but those that do can be explained via recent Eurasian admixture pretty easily (e.g., the LWK sample are Kenyan Bantus who have mixed with Nilotic peoples, who do have Eurasian ancestry; the same goes for the samples from Gambia or Senegal in relation to the Eurasian-admixed Fula). But once you leave Africa it looks as if the constraint was removed, and lots of populations have low frequencies of the derived nonsynonymous mutation. The 2006 paper which focused on the SNP of interest had Oceanian samples, and the derived variant fraction is too high to simply be a matter of Austronesian admixture. Could it be some form of balancing selection outside of Africa? Who knows. It might be neutral in some areas, under positive selection in others, balanced in a few locations, and under constraint in Africa.

But despite the evolutionary enigma of this locus, the phenotypic correlations keep building up. It’s a classical genetics illustration because of its Mendelian character. In terms of morphology I should emphasize that the body odor related information probably relates to the apocrine glands, which are localized in the armpits and genitals, and also are precursors to mammary secretion glands. Someone who understands these sorts of pathways and how they influence development could probably say much more. I’m sure at some point we’ll be able to answer the big evolutionary questions about this locus, and how it relates to human biological variation, but that will probably necessitate a better catalog of its phenotypic consequences.

Addendum: If you have a 23andMe account, here is the link that will show you your genotype (and anyone else on your account): https://www.23andme.com/you/explorer/snp/?snp_name=Rs17822931 (be logged in ahead of time).

* I flipped the strand, so converted A to T and G to C.

** To be fair, there was some evidence from Tamils in earlier studies, but two South Indian populations in the 1000 Genomes with large sample sizes nail it.