Evolution of H19 among mammals

The evolution of the H19 gene was first analyzed on a large phylogenetic scale to assess its evolutionary status. Based on the H19 transcript sequences and mitochondrial sequences of 18 mammals, we constructed a gene tree and a phylogenetic tree by using MEGA6. The gene tree (Fig. 1a) was largely consistent with the phylogenetic tree (Fig. 1b). Primates and rodents clustered into independent clades and were more closely related to each other than to Sus scrofa.

Fig. 1 Molecular phylogenetic analysis by the Maximum Likelihood method. The evolutionary histories of the H19 gene (a) and the mitochondrial sequences of 18 species (b) were inferred by using the Maximum Likelihood method based on the Kimura 2-parameter model [43]. The bootstrap consensus tree inferred from 1000 replicates [43, 44, 54, 55] is taken to represent the evolutionary history of the taxa analyzed. Branches corresponding to partitions reproduced in less than 50 % of bootstrap replicates are collapsed. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (1000 replicates) is shown next to the branches.

Conservation scores were calculated using PhastCons [22] to test whether the H19 gene was conserved during large-scale evolution and to identify the conserved and accelerated regions in the gene among the 18 species. The upstream region of the H19 gene was poorly conserved, whereas the gene body, specifically exon 1, was highly conserved (Fig. 2a, Additional file 1: Figures S1A and S2A). PhyloP was used to calculate the p-value of conservation or acceleration, and the results indicated that the H19 gene was conserved during evolution within a broad range (Fig. 2b, Additional file 1: Figures S1B and S2B). The evolution of H19 on a large phylogenetic scale indicates that the gene was highly conserved within a broad range, including the 3' region of H19.

Fig. 2 PhastCons scores and phyloP values of H19 among 18 species [22]. The phastCons score (a) and the phyloP value (b) of the H19 gene body among 18 species were calculated using sliding windows of 1000 bp with 300 bp intervals.

Screening for selective signal

We analyzed the published re-sequencing data [24] on H19 to detect a selective sweep of the H19 gene during domestication on a genomic scale. We identified eight alleles among the domestic pigs, wild boars, and Tibetan boars. Although all eight alleles were detected in the domestic pigs, a large portion of these alleles were homozygous in individual pigs (Additional file 1: Tables S2 and S3). This high degree of homozygosity suggests that the gene is associated with pig breeding [25]. We further calculated π and pairwise Fst for the linked upstream, gene body, and downstream genomic regions of the H19 gene in 30 Tibetan wild boars, 15 domesticated pigs, and 3 wild boars from Sichuan, using 1 kb sliding windows with 300 bp steps (Fig. 3 and Additional file 1: Figure S3). The π value within the H19 gene body was significantly higher in domestic pigs than in wild boars (Fig. 3a and c). Thus, the sequence conservation of H19 within wild boars contrasts sharply with the sequence diversity among the H19 alleles found within domestic pigs. Within the domesticated pig population, the upstream and downstream regions of the H19 gene showed low diversity, similar to that in wild populations (Fig. 3b). By contrast, the H19 gene body exhibited high diversity (Student’s t-test, P << 0.05, Fig. 3a). This pattern may be caused by relaxed selection. However, considering the extensive breed differentiation within domesticated pigs, we suspected that H19 might have undergone genetic differentiation during breed differentiation and pig domestication. We also found that the Fst of the H19 gene body between domesticated pigs and wild boars was higher than that of the H19 downstream region (Student’s t-test, P << 0.05, Fig. 3c).
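The sliding-window calculation of π and Fst described here can be sketched as follows. This is a minimal illustration only, assuming aligned, equal-length haplotype sequences as input. All function names are hypothetical, and the Fst estimator shown (a Hudson-style 1 − π_within/π_between) is an assumption, because the text does not state which estimator was used.

```python
from itertools import combinations, product


def mean_pairwise_diff(pairs):
    """Average number of differing sites over an iterable of sequence pairs."""
    diffs = [sum(x != y for x, y in zip(a, b)) for a, b in pairs]
    return sum(diffs) / len(diffs) if diffs else 0.0


def nucleotide_diversity(seqs):
    """Per-site pi: mean pairwise differences divided by sequence length."""
    return mean_pairwise_diff(combinations(seqs, 2)) / len(seqs[0])


def hudson_fst(pop1, pop2):
    """Hudson-style Fst = 1 - pi_within / pi_between (assumed estimator)."""
    length = len(pop1[0])
    pi_within = (nucleotide_diversity(pop1) + nucleotide_diversity(pop2)) / 2
    pi_between = mean_pairwise_diff(product(pop1, pop2)) / length
    if pi_between == 0:
        return 0.0  # no differences between populations in this window
    return 1 - pi_within / pi_between


def sliding_windows(length, window=1000, step=300):
    """Yield (start, end) coordinates of full windows along the region."""
    start = 0
    while start + window <= length:
        yield start, start + window
        start += step


def windowed_stats(pop1, pop2, window=1000, step=300):
    """Return (start, end, pi_pop1, pi_pop2, fst) for each window."""
    results = []
    for s, e in sliding_windows(len(pop1[0]), window, step):
        w1 = [seq[s:e] for seq in pop1]
        w2 = [seq[s:e] for seq in pop2]
        results.append((s, e, nucleotide_diversity(w1),
                        nucleotide_diversity(w2), hudson_fst(w1, w2)))
    return results
```

With the default arguments the windows are 1000 bp wide and advance by 300 bp, matching the settings reported in the text. Missing data and unphased genotypes are not handled in this sketch.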

Fig. 3 Patterns of nucleotide diversity and pairwise population differentiation at the H19 locus in domesticated pigs and wild boars. Sliding window analysis of nucleotide diversity and the genetic differentiation coefficient between groups at H19 for domesticated pigs (Dome) and wild boars (Wild). π (right) and Fst (left) were calculated for segments of 1000 bp with 300 bp intervals (c). The distributions of nucleotide diversity (π) and Fst within the H19 gene (a) and the linked upstream and downstream 6 kb regions (b) are shown in the boxplots; the thick black middle line represents the median. The θπ ratio was calculated as π.dome/π.wild. The red dashed line and the green dashed line represent the average θπ ratio and the average Fst of the whole genome of Sus scrofa, respectively.

To detect whether H19 is under selection, we obtained the whole-genome nucleotide diversity (π_dome_G = 0.0025, π_wild_G = 0.0030) of domesticated and wild pigs, as well as the population differentiation (Fst_d_w_G = 0.501) between domesticated and wild pigs, by using Tibetan re-sequencing data with GATK software [26]. Comparison of the sliding-window π and Fst values between domesticated and wild pigs (Fig. 3c and Additional file 1: Figure S3) showed that the domesticated pigs had significantly higher diversity than wild pigs in the H19 gene (Z-test, P = 0.0128). This result suggests that H19 in domestic pig breeds might have evolved more rapidly than in wild ones. The Fst between these two populations in H19 (Fst.H19 = 0.393) did not differ significantly from the average Fst of the whole genome (Fst.whole = 0.501) (Fig. 3a) [27, 28]. However, the sharply contrasting diversity pattern between domestic and wild pigs in the H19 gene body, a region highly conserved during large-scale evolution, suggests that this gene might have played a role in the breed differentiation of domestic pigs.

Nucleotide diversity

We obtained the full 2608 bp sequence of the H19 gene, including five exons and four introns, from the published re-sequencing data of Tibetan wild boars [24] to identify polymorphic sites in the gene. The 3’ region of the H19 gene contained abundant polymorphisms. To examine the role of the H19 gene in breed differentiation, partial sequences of the H19 gene from 1660 bp to 2460 bp (with the transcription start site at 1 bp) were obtained from 234 pigs comprising Chinese native pigs and three foreign breeds. The SNP detection results for the H19 gene in different pig breeds are summarized in Table 1, Additional file 1: Tables S2, S3 and Figure S4. Eight SNPs were found in the region; of these, seven were observed only in domesticated pigs and one was present in both domesticated and wild pigs. However, clear differentiation of the genotype at this shared SNP site was observed. The proportion of the T allele was higher than or equal to that of the C allele in wild pigs, but nearly all T alleles were observed in the Tibetan wild boars. This result indicates that this site underwent differentiation in the wild pigs. Among the Chinese breeds, a number of populations were homozygous for the C allele, and other populations carried the T allele at a certain frequency. At the population level, the SNPs were present in the heterozygous state, indicating that these sites have not been completely fixed. Interestingly, the SNP at 1818 bp exhibited a high mutation proportion in the CC and SC populations but was nearly completely homozygous in the other populations.

Table 1 SNP detection of H19

Estimates of nucleotide diversity (π) were consistently higher in pigs from CC (π = 0.00127) and SC (π = 0.00203) than in pigs from other locations, especially Tibetan wild boars (π = 0.00071) (Additional file 1: Table S4). The same pattern was observed for θ and θπ (Additional file 1: Table S4). These results suggest that genetic diversity increased during Chinese pig differentiation, although no significant selection signal was detected with Tajima’s D test [29], Fu and Li’s H and D test [30], or Fay and Wu’s H test [31] (Additional file 1: Table S4). The result of the HKA test was also not significant (P_Dome = 0.8796). We constructed a neighbor-joining tree by using the net distance between populations and found that CC and SC clustered together (Fig. 4). A considerable degree of admixture may have weakened the selection signal.
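For readers unfamiliar with the neutrality statistics mentioned here, the following is a minimal sketch of the standard Tajima's D statistic, which compares the mean number of pairwise differences with the number of segregating sites scaled by the harmonic number a1. It assumes aligned, equal-length sequences without gaps or missing data; the function name is hypothetical, and this is not necessarily the software used to produce Table S4.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for aligned, equal-length sequences.

    Returns None when the statistic is undefined (fewer than four
    sequences, no segregating sites, or zero variance).
    """
    n = len(seqs)
    # S: number of alignment columns carrying more than one allele.
    s = sum(1 for col in zip(*seqs) if len(set(col)) > 1)
    if n < 4 or s == 0:
        return None

    # k: mean number of pairwise differences (not divided by length).
    pairs = list(combinations(seqs, 2))
    k = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs) / len(pairs)

    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n ** 2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    variance = e1 * s + e2 * s * (s - 1)
    if variance <= 0:
        return None
    return (k - s / a1) / sqrt(variance)
```

An excess of rare variants gives a negative D, whereas an excess of intermediate-frequency variants gives a positive D; values near zero are consistent with neutrality, as reported in the text.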

Fig. 4 Neighbor-joining tree of H19 in different pig breeds. The neighbor-joining tree of H19 based on the net distance between breeds was constructed according to categories (a) and pig breeds (b) with MEGA 6.0.

Methylation analysis of the H19 gene between wild and domesticated pigs

We selected the methylated DNA immunoprecipitation (MeDIP) data of one tissue (longissimus dorsi muscle) from every pig sample and conducted a Pearson correlation test to assess whether the mapping results of the whole genome and of the H19 gene were consistent. The Pearson correlation coefficients ranged from 0.949 to 1, indicating that, for the selected representative data, the whole-genome mapping results were highly consistent with the H19 gene mapping results. To understand the methylation pattern of the H19 gene in wild and domesticated pigs, the methylation levels of the H19 gene body and the linked upstream and downstream genomic regions were analyzed using the MeDIP data of eight fat tissues and two muscle tissues from three pig breeds: the Tibetan pig (the plateau pig breed from the Tibetan wild boar lineage), Landrace (foreign pig breed), and the Rongchang pig (Chinese native pig breed) (Fig. 5 and Additional file 1: Figure S5). The methylation levels within the H19 gene body in these three breeds were lower than those in the 6 kb upstream region (Student’s t-test, P << 0.01) and the 6 kb downstream region (Student’s t-test, P << 0.01), and the methylation level was the highest in the 6 kb upstream region. Several tissues, including the greater omentum (GOM) and mesentery adipose (MAD), from the Rongchang pig showed significantly higher gene body methylation levels than the corresponding tissues in Landrace (Student’s t-test, P << 0.01) and Tibetan pigs (Student’s t-test, P << 0.01). Several tissues showed distinct levels of methylation.
In Landrace, the methylation levels at both the 6 kb upstream and 6 kb downstream regions in the abdominal subcutaneous adipose tissue (ASA), psoas major muscle (PMM), and retroperitoneal adipose tissue (RAD) were lower than those in other tissues, whereas the methylation levels in the upper lipid of back (ULB) were higher than those in other tissues (two-way ANOVA, P_tissue = 4.48 × 10^-6). In the Chinese native pig (Rongchang pig), the methylation levels were markedly higher in the GOM and MAD but lower in the ASA, longissimus dorsi muscle (LDM), and PMM in all three regions (two-way ANOVA, P_tissue = 0.0308). Interestingly, in Tibetan pigs the methylation levels at the upstream and downstream regions in the inner lipid of back (ILB) were clearly lower than those in the gene body and in other tissues (two-way ANOVA, P_genebody-upanddown = 2.02 × 10^-12, P_ILB-others = 1.36 × 10^-7). The methylation patterns in ASA and PMM differed between Chinese native pigs and Tibetan pigs (Student’s t-test, P = 0.067 to 1.962 × 10^-5) (Fig. 5).

Fig. 5 Methylation level of H19 in 10 tissues. The methylation levels of the upstream 6 kb region (a), the H19 gene body (b), and the downstream 6 kb region (c) in 10 tissues of Landrace (red), Rongchang (blue), and Tibetan (green) pigs according to the MeDIP data. The number of reads mapped to the reference sites represents the methylation level of those sites.

The methylation levels of the H19 gene and the linked 2 kb upstream and downstream regions, as well as the expression levels in the liver, were determined and compared on the basis of our unpublished single-base liver methylome and transcriptome data of a wild boar (Guizhou wild pig), a Chinese native pig (Enshi black pig), and a foreign breed (Landrace) (Fig. 6, Additional file 2: S7). The methylation levels of the gene were the lowest in the foreign pig (Landrace), followed by those in the Chinese wild pig (Guizhou wild boar). The methylation levels in the Chinese native pig (Enshi pig) were the highest at the 2 kb upstream region of the H19 gene (Fig. 6a–c and f). By contrast, the methylation levels in the gene body of Landrace were higher than those in the two other samples (pairwise t-tests with Bonferroni adjustment, **P < 0.001). The expression levels of the H19 gene were the highest in the liver of the foreign pig (Landrace), followed by those in the Chinese wild pig and the Chinese native pig (Enshi pig) (one-way ANOVA, P << 0.01) (Fig. 6g). These results are consistent with the theory that methylation in the promoter region decreases expression levels, whereas methylation in the gene body represses transcriptional noise [32].

Fig. 6 Methylation level and expression level of H19 in liver tissue. Methylation levels of the upstream 2 kb region (d), the H19 gene body (e), and the downstream 2 kb region (f) of ES (Enshi black pig, a), GZWB (Guizhou wild pig, b), and LB (Landrace, c), and the expression level of the H19 gene (g). The methylation level of each site was calculated as the number of methylated reads divided by the total number of reads mapped to that site. The expression level was estimated by the FPKM value, which was calculated by Cuffdiff software using the transcriptome data. The grey horizontal lines below a, b, and c mark the ASM regions, and the vertical bars in a, b, and c mark DMSs; each color indicates a DMS between this pig breed and the breed denoted by that color. (***P < 0.001; **P < 0.02; *P < 0.05)

We further explored the relationship between imprinting and methylation differences. In this step, we identified the DMSs among pig breeds and the ASM regions by using our unpublished single-base resolution methylome data of the liver from the three pig breeds. The MeDIP data were not at single-base resolution; thus, we were unable to use them to address tissue-specific methylated sites. We previously surveyed the methylation status of CpGs of the H19 gene in the three breeds. In the present study, we first identified the DMSs among the three pig breeds by using the chi-square test. We found a total of 6, 9, and 12 DMSs between the Enshi pig and the Guizhou wild pig, the Guizhou wild pig and Landrace, and the Enshi pig and Landrace, respectively, of which 3, 6, and 9 sites were in the upstream 2 kb region; 2, 2, and 3 sites were in the gene body region; and 1, 1, and 0 sites were in the downstream 2 kb region, respectively. This result indicates that DMSs were mainly located in the upstream 2 kb region (Additional file 1: Table S5). We also identified regions with ASM in each breed. We identified 8 ASM regions, all of which were located in the upstream 2 kb region of the H19 gene. Interestingly, all of these ASM regions were in the Enshi pig. In the 8 ASM regions, no SNP was found in the two other pig breeds; hence, we were unable to distinguish the allelic methylation status in these two breeds. We then used the Enshi pig as a model to explore the DMSs and ASM of H19. Most of the upstream DMSs between the Enshi pig and the other pigs were located in the ASM regions; specifically, 2 of the 3 upstream DMSs between the Enshi pig and the Guizhou wild pig and 7 of the 9 upstream DMSs between the Enshi pig and Landrace were situated in the ASM regions. Our preliminary results suggest that imprinting may be associated with methylation differences among breeds.
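The per-site comparison underlying DMS detection can be illustrated as follows. This is a minimal sketch, assuming that each site in each breed is summarized by a count of methylated reads and a count of total reads, and that a Pearson chi-square test without continuity correction is applied to the resulting 2 × 2 table. The function names, the absence of a continuity or multiple-testing correction, and any significance threshold are assumptions; the text states only that a chi-square test was used.

```python
from math import erfc, sqrt


def methylation_level(methylated_reads, total_reads):
    """Methylation level of a site: methylated reads / total reads."""
    if total_reads == 0:
        return None  # site not covered
    return methylated_reads / total_reads


def chi_square_2x2(meth_a, total_a, meth_b, total_b):
    """Pearson chi-square test (1 df, no continuity correction).

    Compares methylated vs. unmethylated read counts at one site
    between two samples. Returns (chi2, p_value), or None if any
    row or column of the 2 x 2 table sums to zero.
    """
    a, b = meth_a, total_a - meth_a  # sample A: methylated, unmethylated
    c, d = meth_b, total_b - meth_b  # sample B: methylated, unmethylated
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return None
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Survival function of the chi-square distribution with 1 df.
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value
```

For example, a site with 30 of 40 methylated reads in one breed and 10 of 40 in another gives a chi-square statistic of 20 and would be called differentially methylated under any conventional threshold, whereas identical proportions give a statistic of 0.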