Subjects

All participants in the study provided written informed consent in accordance with the St Thomas’ Hospital Local Ethics Committee. Altogether, 100 volunteer, female MZ and dizygotic twins from the TwinsUK cohort49 were included in the study, including a discovery set of 50 MZ twins (25 MZ pairs) and a second independent set of 50 unrelated individuals. QST for heat pain tolerance was performed according to standard protocols (Supplementary Methods and included measures of the HPST. The discovery 25 MZ twin pairs (median age 62 years) were selected as the most discordant MZ twin pairs for HPST scores in the TwinsUK cohort at the time of the study (Supplementary Fig. S1). Discordance was defined as a difference of at least 2 °C in HPST scores, and one twin in each pair fell in the upper tail of the HPST distribution (low pain sensitivity). The second sample of 50 unrelated individuals (median age 64 years) was unselected and HPST scores were representative of the TwinsUK cohort HPST distribution (Supplementary Fig. S1). Whole blood was collected after QST and DNA was extracted using standard protocols. WBC subtype counts were obtained in 48 individuals from discovery sample using FACS of peripheral blood50. WBC subtype cell counts were calculated for four cell types: neutrophils, eosinophils, monocytes and lymphocytes.

MeDIP sequencing

MeDIP-seq was performed and analysed separately in the MZ twin and unrelated samples. In the discovery MZ twin sample, 5 μg genomic DNA from each sample was sonicated on a Diagenode Bioruptor to produce a median fragment length of 180–230, and verified using a 2100 Bioanalyzer with DNA1000 chips (Agilent, Santa Clara, USA). Samples were prepared for next-generation sequencing by blunt ending, dA-tailing and then ligating paired-end adapters according to manufacturer’s reagents and protocols (Illumina) with purification (QIAGEN) between steps. Samples were quantified and size checked using a 2100 Bioanalyzer and DNA1000 chips (Agilent). One microgram of each adapter-ligated sample was spiked with ~5 × 10−7 pmoles methylated/unmethylated control DNA (see ref. 51 for details) and subjected to MeDIP (ref. 52) with 150 ng anti-5-methylcytidine using automation according to manufacturer’s reagents and protocols (Diagenode); 100 ng was reserved as input for MeDIP quality control (QC), which was conducted using quantitative PCR (qPCR) with primers designed to amplify methylated/unmethylated control DNA. MeDIP DNA was then purified (Zymo Research) and amplified by 18 cycles adapter-mediated PCR. PCR products were purified (QIAGEN) and run out on separate 2% low-melting point agarose gels. PCR fragments were excised and purified (QIAGEN). The resulting libraries (300–350 bp) were QC using an Agilent 2100 Bioanalyzer and DNA1000 chips and qPCR (see ref. 53). The samples were sequenced on Illumina GA2, using two lanes per sample, to produce on average 50 million paired-end reads of length 50 bp.

In the unrelated samples, 1.5 μg of original genomic DNA was fragmented to DNA smear between 200 and 500 bp using a Bioruptor NGS (Diagenode) sonication system. End repair, <A> base addition and adaptor ligation steps were performed using Illumina’s Paired-End DNA Sample Prep kit following the manufacturer’s instructions. Next, the methylated DNA fractions were immunoprecipitated using the Magnetic Methylated DNA Immunoprecipitation Kit (Diagenode, catalogue number: mc-magme-048) with small adjustments. Briefly, 1.5 μl of each two control templates (methylated and unmethylated DNA controls) with different sequences were mixed with adapter-ligated DNA and heat denatured (95 °C, 3 min) in a final reaction volume of 90 μl within the provided reagents of 24 μl MagBuffer A and 6 μl MagBuffer B. Aliquot (7.5 μl) of the denatured genomic DNA was saved as input and another 75 μl was immunoprecipitated with 5 μl of prepared anti-5mC antibody and 20 μl of prepared Magbeads. The DNA–antibody magbeads mixture was incubated overnight at 4 °C on a rotating wheel. The DNA–antibody magbeads mixture was then washed twice using each of 150 μl ice-cold MagWash Buffer-1 and washed once with 150 μl of ice-cold MagWash Buffer-2. For each buffer, the washing reaction was incubated for 4 min at 4 °C on a rotating wheel. The input DNA and immunoprecipitated products were then in parallel treated with protease K and purified with ZYMO DNA Clean & Concentrator-5 (ZYMO). The efficiency and sensitivity of immunoprecipitation reaction were detected by qPCR using the enriched DNA and the unbound input DNA. MeDIP-seq library was constructed by PCR amplification with the enriched DNA as template. PCR reaction was performed in 50 μl of reaction volume consisting of 20 μl MeDIP-enriched DNA, 5 μl of 2.5 mM dNTP, 2 μl primers, 5 μl of 10 × pfx amplification buffer, 2 μl of 50 mM MgSO 4 , 15.2 μl sterilized water and 0.8 μl of Platinum pfx DNA polymerase (Invitrogen). The programme of amplification was 94 °C 2 min, 10 cycles of 94 °C 15 s, 62 °C 30 s and 72 °C 30 s, then prolong with 10 min at 72 °C. The products could be kept at 12 °C. The PCR products were purified using Agencourt Ampure Beads (Beckman Coulter). After analysing by the Bioanalyzer analysis system (Agilent) and quantified by the real time PCR, the prepared library was sequenced using Illumina HiSeq2000 with SE50 read length.

MeDIP-seq DNA methylation quantification

In the discovery MZ twin sample, we obtained on average 50 million paired-end reads of length 50 bp per individual. Alignment to the human genome (hg18) was performed using Maq54 (v0.7.1). QC of the aligned data took into account base quality scores (no exclusions), read mapping scores (threshold maq q>10), removal of duplicate paired-end (PE) reads, proper PE pairing criteria and insert-size distribution checks (see Supplementary Fig. S2 and Supplementary Methods). On average, 60% of reads passed alignment QC per individual. We used the post-QC PE reads to reconstruct MeDIP fragments per individual. We quantified coverage of MeDIP fragments per base pair across the genome and binned coverage into overlapping 1-kb bins (overlap of 500 bp). For each bin, we calculated a normalized methylation score, the relative methylation quantification (RMQ) score, which was the sum of the MeDIP-fragment coverage of each base pair in that bin, divided by the overall number of autosomal post-QC reads per individual. For pain-DMR analysis, we only considered bins with RMQ variance >0, where at least 10% of the sample had RMQ scores >0. There were 5,260,672 autosomal 1-kb bins used in the discovery pain-DMR analyses.

In the discovery sample, we also performed additional quantifications of MeDIP-seq data, taking into account variation in the size distribution of MeDIP fragments across individuals and variation in CpG density across genomic regions (Supplementary Methods and Supplementary Figs S6 and S7). For fragment analyses, we matched the distribution of MeDIP fragments across the 50 individuals to match a standardized distribution (Supplementary Methods), and in the analyses controlling for CpG density, we used MEDIPS29 to calculate absolute methylation scores (AMS). We used AMS to examine DNA methylation patterns across the genome (Fig. 1), and we used AMS and fragment-corrected scores to assess the reproducibility of pain DMRs to different MeDIP-seq quantifications (Supplementary Figs S7 and S8).

In the unrelated samples, we obtained on average 25 million single-end reads of length 50 bp per individual. Sequence alignment to the human genome (hg18) was performed in Bwa55 (v0.5.9; Supplementary Fig. S3). Data were subject to QC checks for base composition (no exclusions), read mapping quality (threshold bwa q>10) and removal of duplicate reads. On average, 53% of reads passed alignment QC per individual. Post-QC single-end reads were then extended by 175 bp equilaterally to produce MeDIP fragments of size 225 bp. We quantified coverage of MeDIP fragments per base pair across the genome, and binned coverage per base pair into overlapping 1-kb bins (overlap of 500 bp). For each bin, we calculated the normalized methylation score (RMQ), and in the pain-DMR analyses we only considered bins with RMQ variance >0, where at least 10% of the sample had RMQ scores >0. There were 5,312,714 autosomal 1-kb bins used in the follow-up pain-DMR analyses.

DMR analyses

Pain-DMR analyses in the discovery MZ twin set were performed using linear mixed-effects models in R (lme4 package). We regressed the RMQ scores in each 1-kb bin on HPST as a fixed effect, family as a random effect, and with and without age as fixed-effect covariate. Before performing the pain-DMR regression, we normalized the RMQ scores to N(0,1), but results were also calculated using the raw RMQ scores (Supplementary Methods). In the discovery set, additional genome-wide pain-DMR analyses also took into account MeDIP-seq fragment variability and CpG density, and only considered specific genomic categories such as CpG-islands, CpG-island shores and promoters (Supplementary Methods). In the second sample of 50 unrelated individuals, we fit linear models regressing RMQ on HPST across the genome, with or without covariates such as age and the first three autosomal methylation principal components, as well as including RMQ normalization (to N(0,1)). The final pain-DMR results (Table 1) include normalized (to N(0,1)) RMQ scores regressed on HPST.

MA of the MZ twin and unrelated samples was performed in GWAMA56, using a fixed-effect MA on the HPST pain-DMR regression betas. We report MAP-DMRs at permutation-based MA significance threshold of FDR=5%. We investigated evidence for heterogeneity in the MA using Cochran’s Q-statistic and the I2-statistic57, and only considered results with no strong evidence for heterogeneity (Cochran’s Q, P>0.01 and I2<0.8) in the observed data and in the permutations. For completeness and to account for heterogeneity, we also performed a random-effects MA and found that the nine FDR 10% MAP-DMRs were within the top-ranked 0.002% of results (Supplementary Table S7). We also checked DMRs for potential mappability problems, but the MAP-DMRs did not overlap previously reported unannotated high-copy-number regions, which may result in alignment problems58.

To determine EWAS genome-wide significance thresholds in the discovery and replication samples, we performed 20 replicates of genome-wide permutations. In each replicate, we permuted the HPST scores (preserving family structure in the MZ twin sample) and performed the genome-wide EWAS analyses under the null, and used these to calculate FDR 5% thresholds in the MZ twin and unrelated EWAS. We then combined each pair of MZ twin and unrelated replicates by MA, to determine the MA permutation-based FDR 5% threshold. Genome-wide permutation-based FDR 5% thresholds were estimated for the MZ twin, unrelated and MA results. In each case, FDR was determined as the fraction of significant hits in the permuted data compared with the observed data at each P-value threshold. At each nominal P-value level, we pooled significant hits across the 20 replicates, divided by the number of replicates (20) and then divided by the number of observed hits in the real data at this nominal P-value. There was some variability in FDR estimates at the level of individual replicates, for example, random samples of 10 replicates resulted in MA FDR 5% thresholds ranging from P=2.5 × 10−9 to P=1.0 × 10−12, whereas the FDR 5% threshold based on 20 replicates was estimated at P=2.0 × 10−11.

For within-pair DNA methylation and HPST analyses in the discovery MZ twin set, we calculated the difference in RMQ scores at each 1-kb bin between co-twins in the 25 MZ twin pairs. We calculated Pearson’s and Spearman’s correlations of the RMQ and HPST differences within the MZ twin pair. We also fit a linear model, regressing the normalized RMQ difference on the HPST difference with age as covariate, and the results were similar.

To assess whether the MAP-DMRs identified in our study capture differential proportion of whole-blood cell (WBC) subtypes, we compared DNA methylation with WBC subtype proportions for neutrophils, eosinophils, monocytes and lymphocytes. Blood-count DMR analyses were performed at the nine FDR 5% MAP-DMRs in 48 individuals from the discovery set. We fit a linear mixed-effect model, regressing the normalized RMQ scores on WBC subtype proportion as a fixed effect and family as a random effect. Results are presented at a DMR Bonferroni-corrected P-value=0.05 (nominal P=0.0056).

Illumina 450k DNA methylation

Illumina 450k data were obtained for 46 discovery sample individuals. We removed two outliers and performed principal-component analysis, comparing the first two principal components with potential confounders, which identified methylation chip, position of the sample on the chip and post-bisulphite conversion DNA concentration as potential confounders. Therefore, in the pain DMR 450k EWAS analyses, we regressed DNA methylation on HPST levels and included chip, order of sample on the chip, bisulphate-converted DNA concentration and age as fixed effect covariates, and family as random effect. We excluded all probes that mapped to multiple locations within 2 bp mismatches, resulting in 468,113 probes, and in the 450k EWAS analyses we also excluded probes with missing data, which resulted in a 440,307 autosomal probes. We permuted the data ten times to assess significance of the genome-wide 450k EWAS results.

We used two approaches to compare MeDIP-seq and Illumina 450k DNA methylation levels. First, we considered CpG sites that overlapped the 1-kb MeDIP-seq bins (457,045 probes). Second, to validate MeDIP-seq FDR 5% DMR effects in the Illumina 450K results, we considered MeDIP-seq DMRs that were within 100 kb of a gene TSS. We compared the direction of methylation pain association of the MeDIP-seq pain DMR and the direction of methylation pain association of all the 450k CpG sites that were assigned to the same gene. For validation, both MeDIP-seq and 450k DMRs had to be in the same direction and the 450k nominal P surpass weak evidence for association. The DNA methylation data are available at http://www.twinsuk.ac.uk/data-access/medipseq-data/ and under GEO accession number GSE53130.

Bisulphite pyrosequencing

Bisulphite pyrosequencing DNA methylation in the TRPA1 DMR genomic region was performed by a commercial laboratory service (EpigenDx). Briefly, 200–500 ng of genomic DNA was used for bisulphite modification using the Zymo Research EZ DNA Methylation Kit (Zymo Research), according to manufacturer’s instructions. Converted genomic DNA was PCR amplified using two sets of TRPA1 DMR unbiased primers, followed by pyrosequencing using PSQ HS96 (Biotage). Bisulphite pyrosequencing was performed as previously described59. The Pyro Q-CpG methylation software (Qiagen) was used to determine the percentage of DNA methylation at each CpG site. Supplementary Table S8 provides the genomic location of the CpG sites measured in the TRPA1 CpG-island shore DMR. PCR reaction details and annealing temperatures for all bisulfite pyrosequencing reactions are also provided in Supplementary Table S8.

Genotypes

Genotypes were available for 48 individuals in the discovery sample and for 47 individuals in the follow-up sample. Genotypes were obtained on a combination of Illumina platforms (HumanHap300, HumanHap610Q, 1M-Duo and 1.2MDuo 1M custom arrays) and stringent QC checks were applied to these data as previously described35. HapMap genotypes were imputed using Impute (v2 (ref. 60)) with two reference panels, P0 (HapMap2, rel 22, combined CEU, YRI and ASN panels) and P1 (610K+, including the combined HumanHap610K and 1M array). Altogether, there were 2,028,354 (discovery) and 2,054,344 (follow-up) directly genotyped and imputed autosomal single-nucleotide polymorphisms (SNPs), which were further filtered for subsequent analyses.

Methylation QTLs

DNA meQTL analyses were performed in 47 individuals from the follow-up set. We selected 1,889,799 common HapMap SNPs, either directly genotyped or imputed (Impute info ≥0.8), with minor allele frequency ≥0.05. At each MAP-DMR, we tested for association between DNA methylation levels using RMQ scores in 1-kb bins, and genotypes at all SNPs within 100 kb of the bin (50 kb on either side). We normalized DNA methylation levels (to (N(0,1)) and regressed DNA methylation on genotype using an additive model. Potential MAP-DMRs with meQTLs in cis were considered if there was at least one SNP associated with DNA methylation levels at nominal significance, and we report nominal P-values of association.

Allele-specific methylation

ASM was estimated in 48 MZ discovery twins. We obtained heterozygous genetic variants using all available data from directly genotyped and imputed common variants, and exome-sequencing data available for 22 unrelated individuals (44 MZ twins)61. In the HapMap-imputed genotypes, we considered SNPs for ASM analysis if Impute info ≥0.8 and if the probability of a heterozygous genotype ≥0.5 in at least 50% of the sample. In the exome-sequencing data, we used direct genotype calls61 using a depth threshold of eight and selected variants that were heterozygous in at least 50% of the sample. There were 3,099 heterozygous genetic variants in the exome-sequencing data and 278,304 heterozygous SNPs in the HapMap data, giving an overall final set of 279,885 unique candidate heterozygous genetic variants. For each individual at each of the 279,885 genetic variants, we scored the number of MeDIP-seq reads spanning the reference and non-reference alleles using maq pileup with depth coverage ≥8 reads per variant. We used the frequency of the reference allele in the MeDIP-seq data as a measure of ASM, and scored a site within an individual as ASM only if the reference allele frequency was <0.25 or >0.75. Altogether, potential ASM calls were obtained at 17,261 variants. We excluded all ASM variants that fell in genomic regions reported to be problematic in alignment (Supplementary Methods).

Blood–brain DNA methylation comparison

MeDIP-seq patterns from blood and multiple brain tissues were obtained from two female donors37. The brain regions included the inferior frontal gyrus, middle frontal gyrus, left frontal gyrus, entorhinal cortex, superior temporal gyrus of the temporal cortex, visual cortex and the cerebellum. To assess tissue specificity of methylation effects at the 100 top-ranked MAP-DMRs, we first estimated the correlation between brain and blood methylation levels, by using the mean methylation levels across both individuals in all brain regions and in blood. We then considered regions to have tissue-shared methylation if the methylation difference between blood and one brain region was <10% of the overall blood methylation level, and under more stringent criteria, if the mean difference between blood and all brain region was <10% of the overall blood methylation level (see Supplementary Methods).

Gene expression