UPDATE: take a look at the lead author's response on his site; kudos to him for engaging in this constructive discussion.

Epigenetics studies of interesting questions sometimes get great traction, whether they have to do with transmitting your memory of surviving the Holocaust or with your sexual orientation.

At the current ASHG meeting an abstract was presented (see the end of this posting) in which it was claimed that testing epigenetic markers can allow you to predict sexual orientation.

There are many reasons why this study is uninterpretable, some of which are described in general terms here. What is not described in the abstract is that the cells used were from saliva samples, which include a variable mixture of buccal epithelium with a majority of leukocytes. Microbial DNA in such samples can also cross-hybridise to human probes on a microarray, so the screening approach used could be criticised for several potential technical flaws. Also not described is the marginal, uncorrected significance for this underpowered study. These issues only came to light when the presentation happened.
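To see why uncorrected significance is a warning sign in a microarray screen, here is a minimal Python sketch. It assumes a hypothetical screen of 10,000 sites with no true signal at all and compares the number of "hits" at an uncorrected p ≤ 0.05 with the number surviving Benjamini-Hochberg false discovery rate control. The site count, seed and function names are illustrative assumptions, not details from the study.

```python
import random


def benjamini_hochberg(pvalues, alpha=0.05):
    """Return a list of booleans, True where a test survives BH FDR control."""
    m = len(pvalues)
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Step-up rule: find the largest rank k with p_(k) <= (k / m) * alpha.
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            largest_k = rank
    # Reject the k tests with the smallest p-values.
    rejected = [False] * m
    for idx in order[:largest_k]:
        rejected[idx] = True
    return rejected


def simulate_null_screen(n_sites=10000, alpha=0.05, seed=0):
    """Simulate a screen with no real signal (hypothetical numbers).

    Under the null hypothesis p-values are uniform on [0, 1], so every
    'hit' counted here is a false positive.
    """
    rng = random.Random(seed)
    pvalues = [rng.random() for _ in range(n_sites)]
    uncorrected_hits = sum(p <= alpha for p in pvalues)
    corrected_hits = sum(benjamini_hochberg(pvalues, alpha))
    return uncorrected_hits, corrected_hits


if __name__ == "__main__":
    uncorrected, corrected = simulate_null_screen()
    print("uncorrected hits:", uncorrected)
    print("hits after BH correction:", corrected)
```

With no real signal, roughly 5% of sites pass the uncorrected threshold by chance alone, while few if any should survive the correction.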

However, there were some warning flags in the abstract. Why use a new algorithm to identify these predictive markers? Did current approaches not yield any results? Where is the mention of performing some sort of locus-specific, orthogonal, more quantitative assay to verify and validate the DNA methylation changes predicted by the microarray studies?

Then they fall into the rabbit hole. Up to the mention of “…9 regions” they had adhered to a description of biomarker discovery, but like everyone else in the history of epigenetics studies they could not resist trying to interpret the findings mechanistically. So they go there: they talk about the genes implicated. Now you need to invoke all of the issues to do with mechanism, including understanding why these loci are of relevance to the phenotype in the cells of saliva. Ewan Birney also made the point in a Tweet yesterday that cross-sectional studies like this are subject to the influence of reverse causation, so study design issues also need to be taken into account.

Some poor young lad gets up on stage at #ASHG15 having worked hard to generate this story and is now being eviscerated by people like me. It’s not personal about him or his colleagues, but we can no longer allow poor epigenetics studies to be given credibility if this field is to survive. By ‘poor,’ I mean uninterpretable. We should only present biomarker studies when they are shown to perform robustly as biomarkers. We should only present mechanistic studies when we have excluded the many biological and technical sources of variability that can mislead us. If we have an intriguing preliminary observation, we present it as such and do not claim that we have generated “…strong support to the hypothesis that epigenetics is involved in sexual orientation.”

Who is culpable in the current situation? The first news report publicising this study came from @NatureNews, but more concerning is the abstract review process that permitted this study to get accepted for presentation, compounded by the press release issued by the American Society of Human Genetics. Both organisations should know better – they need to be substantially more rigorous and not blindly accept that the numbers generated from DNA methylation studies are inherently meaningful.

As a field, we need something like a consensus checklist to guide scientists and reviewers in epigenetics studies. That would be an unusually prescriptive approach, but it appears necessary to counterbalance the current over-interpretation of epigenomics studies of human phenotypes. We need to learn the historical lessons of GWAS, which self-corrected to account for population effects and moved towards adequately powered studies. The epigenome-wide association study is at a critical juncture: it is time for epigenetics researchers to be much more self-critical and rigorous.
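As a rough illustration of what "adequately powered" means for a discordant twin design, the sketch below uses a normal approximation to estimate how many pairs are needed to detect a within-pair methylation difference at a Bonferroni-corrected threshold. The effect sizes and the numbers of sites tested are hypothetical inputs chosen for illustration, not values from the study, and the function name is my own.

```python
from math import ceil
from statistics import NormalDist


def pairs_needed(effect_size, n_tests, alpha=0.05, power=0.8):
    """Approximate number of discordant pairs needed for a paired comparison.

    effect_size: mean within-pair difference divided by the standard
                 deviation of the within-pair differences (hypothetical).
    n_tests:     number of sites tested, used for a Bonferroni correction.
    Uses the normal approximation n = ((z_alpha + z_power) / d) ** 2,
    which slightly underestimates the n required by an exact t-test.
    """
    standard_normal = NormalDist()
    per_test_alpha = alpha / n_tests
    z_alpha = standard_normal.inv_cdf(1 - per_test_alpha / 2)  # two-sided
    z_power = standard_normal.inv_cdf(power)
    return ceil(((z_alpha + z_power) / effect_size) ** 2)


if __name__ == "__main__":
    # Hypothetical scenarios: a single pre-specified site versus
    # array-scale screens, for two assumed effect sizes.
    for n_tests in (1, 10_000, 400_000):
        for effect_size in (0.5, 1.0):
            n = pairs_needed(effect_size, n_tests)
            print(f"sites={n_tests:>7}  effect={effect_size}  pairs needed={n}")
```

The point of the exercise is the trend rather than any single number: the more sites are screened, the more pairs are needed to detect the same effect.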





A novel predictive model of sexual orientation using epigenetic markers.

Authors: T. C. Ngun [1]; W. Guo [2]; N. M. Ghahramani [3]; K. Purkayastha [1]; D. Conn [4]; F. J. Sanchez [5]; S. Bocklandt [1]; M. Zhang [2,6]; C. M. Ramirez [4]; M. Pellegrini [7]; E. Vilain [1]

1) Department of Human Genetics, David Geffen School of Medicine at University of California Los Angeles (UCLA), Los Angeles, CA, USA; 2) Bioinformatics Division and Center for Synthetic & Systems Biology, TNLIST, Tsinghua University, Beijing 100084, China; 3) Department of Pathology and Laboratory Medicine, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA; 4) Fielding School of Public Health, UCLA, Los Angeles, CA, USA; 5) Department of Counseling Psychology, The University of Wisconsin-Madison, WI, USA; 6) Department of Molecular and Cell Biology, Center for Systems Biology, The University of Texas at Dallas, Richardson, TX 75080, USA; 7) Department of Molecular, Cellular, and Developmental Biology, UCLA, Los Angeles, CA, USA.

Sexual orientation is one of the most pronounced sex differences in the animal kingdom. Although upwards of 95% of the general population is heterosexual, a small but significant proportion of individuals (3-5%) is homosexual. Male sexual orientation has been linked to several genomic loci, with Xq28 and 8p12 being the most replicated. As with other complex traits, environmental factors may also play an important role. Firstly, monozygotic twins show substantial levels of discordance for this trait. Secondly, each male pregnancy a woman has increases the chance that her next son will be homosexual by 33% (the fraternal birth order effect). Thirdly, early life androgen exposure in women is associated with increased rates of non-heterosexual identity. Taken together, the evidence suggests a role for non-genetic and, possibly, epigenetic influences on sexual orientation. Our aim in this study was to create a predictive model for sexual orientation using epigenetic markers. We created our model based on genome-wide DNA methylation patterns in 37 monozygotic male twin pairs that were discordant for sexual orientation. 10 monozygotic twin pairs concordant for homosexuality were included as a control population. Genomic sites where methylation occurred were consolidated into short regions based on proximity and correlation of their methylation patterns to increase the signal to noise ratio. We then applied the FuzzyForest algorithm to our dataset. Briefly, regions were clustered into modules using Weighted Gene Coexpression Network Analysis and recursive feature elimination was performed with the random forest algorithm (RF) to identify regions most relevant to sexual orientation. The highest prediction accuracy was achieved using information from just 9 regions. Some of these regions were associated with the regulatory domains of two genes, CIITA and KIF1A. 
The former is a transcriptional regulator that is sometimes referred to as the master control factor of class II major histocompatibility complex genes. The latter is a neuron-specific transport protein that is important for movement of synaptic vesicle precursors along axons. Our results demonstrate that studies of the epigenome can yield new insights into the biological underpinnings of sexual orientation and provide strong support to the hypothesis that epigenetics is involved in sexual orientation. To our knowledge, this is the first example of a biomarker-based predictive model for sexual orientation.

