Mitochondrial phylogeography and haplotypes

Our fourfold sampling (Fig. 1 and Supplementary Fig. S2) confirmed and refined previous findings26, in particular that the blue lineage (Natrix natrix helvetica) meets with the yellow lineage (currently identified with N. n. natrix) in a secondary contact zone in the Rhine region, and that another, more easterly and much wider, contact zone of the yellow and red lineages runs across Central Europe to the Balkans. The latter two lineages are distributed within the ranges of what is currently identified with the grass snake subspecies N. n. natrix and N. n. persa, but these lineages conflict with morphology-based taxonomy so that neither the distribution ranges of the lineages and subspecies nor morphology and genetics are congruent26. Rather, each subspecies corresponds to several distinct mitochondrial lineages and some of these lineages are shared between the two subspecies. Thus, there is obviously a pronounced conflict between morphological variation and genetics. Whereas the geographical distribution of haplotypes of the blue and yellow lineages abuts, with virtually no sympatric occurrences (Supplementary Fig. S2), haplotypes of the yellow and red lineages widely overlap in Central Europe (Fig. 1), in a region corresponding to central, eastern and southern Germany, southern Poland, Austria, the Czech Republic, and Slovakia. However, further southeastwards, only haplotypes of the red lineage occur and in the southern Balkans, only haplotypes of the yellow lineage are recorded (Fig. 1).

While the vast majority of our 1,588 samples represents native grass snakes, there are some obvious cases of translocated individuals, like a few isolated records of the red lineage within the range of N. n. helvetica (Great Britain, Rhine region). One population in the Neander valley close to Düsseldorf, Germany, seems to consist exclusively of such allochthonous grass snakes and has been suspected to be introduced for a long time46, 47. In addition to these non-native grass snakes, eight individuals of other mitochondrial lineages from the Mediterranean (Italy) were found in southern Great Britain and Hesse, Germany (not shown in Fig. 1). We cannot completely exclude that also our new record of a third mitochondrial lineage on the island of Gotland (green lineage; Fig. 1) refers to an introduced grass snake. However, when it is considered that grass snakes colonized Gotland via transoceanic dispersal29 and that now all three lineages occurring all around the Baltic Sea have been recorded from Gotland, their natural occurrence seems possible there.

In parsimony network analyses, each lineage corresponded to a highly distinct haplotype cluster. For ND4 (Fig. 2, top), there are 12 haplotypes in the blue lineage, 43 haplotypes in the yellow lineage, and 33 haplotypes in the red lineage. Haplotypes of the yellow and red lineages differed by a minimum of 40 mutational steps. Haplotypes of the blue lineage (N. n. helvetica) were separated from the yellow haplotype cluster by at least 40 mutational steps and from the red cluster by a minimum of 54 steps. With only 12 haplotypes differing by maximally two mutation steps, there was distinctly less variation in the blue lineage compared to the two eastern lineages. In the yellow lineage, 43 haplotypes with maximally 15 mutations occurred; in the red lineage, 33 haplotypes differing by maximally 10 mutations.

Figure 2 Parsimony networks of mtDNA sequences. Symbol sizes reflect haplotype frequencies. Small black circles are missing node haplotypes; each line connecting two haplotypes corresponds to one mutation step, if not otherwise indicated by numbers. Haplotype colours correspond to lineages, i.e. Natrix natrix helvetica (h) in blue; eastern lineages in yellow (y) and in red (r). Full size image

For cyt b (Fig. 2, bottom) a similar pattern emerged. The blue lineage differed by a minimum of 73 mutations from the yellow lineage and by 64 mutations from the red lineage. The yellow and red haplotype clusters were connected by a minimum of 53 mutational steps. Compared to ND4, there was distinctly more variation observed, with 29 haplotypes in the blue lineage and 66 haplotypes in the red lineage. The yellow lineage comprised 42 haplotypes. Haplotypes of the blue lineage differed by a maximum of five mutations, haplotypes of the yellow lineage by a maximum of nine mutations and haplotypes of the red cluster by a maximum of 16 mutations. European Nucleotide Archive (ENA) accession numbers for haplotypes are listed in Supplementary Table S3.

For ND4, the blue lineage differed on average from the yellow lineage by an uncorrected p distance of 5.03% and from the red lineage, by 5.88%; the divergence between the yellow and red lineages was 5.16%. For cyt b, mean p distances between the blue, yellow and red lineages were 6.90% and 6.39%, respectively; the yellow and red lineages differed by 5.48%.

Genotyping and admixture

The 13 studied microsatellite loci were highly polymorphic, with allele numbers ranging from 12 to 39 per locus (Supplementary Table S4) and a total allele number of 285. For the complete data set including Natrix natrix helvetica and all five eastern lineages, the ΔK method suggested two as the optimal number of clusters (Supplementary Fig. S3a). One of these clusters represented helvetica, and the other contained all eastern lineages (Fig. 3a). The assignment of the eastern lineages to only one cluster matched with the close phylogenetic relationship of their mtDNA lineages26. According to an AMOVA using microsatellite data, 59.84% of the molecular variance occurred within and 40.16% between the two clusters, corresponding to an F ST value of 0.40. The high distinctiveness of both clusters was also supported by a large number of private alleles, especially for the eastern cluster (Supplementary Table S5).

Figure 3 Genotypic structuring of grass snakes. On the left, the mitochondrial lineage of each sample is shown above the structure diagrams, with haplotypes of Natrix natrix helvetica indicated in blue and haplotypes of the eastern lineages in colours corresponding to Fig. 1 (yellow, red, lilac, grey, green; white = missing data). In (a), orange and dark blue corresponds to non-native snakes (Italian lineages). Samples in structure diagrams are arranged within each country from west to east (a) or from north to south (b,c). In structure diagrams, an individual sample is represented by a vertical bar reflecting its inferred ancestry. In (a), the blue cluster corresponds to N. n. helvetica and the light green cluster to all other lineages. The isolated red/light green block (first row) represents the allochthonous population from the Neander valley, Germany. In (b), samples with genetic impact of helvetica are excluded. The pink cluster corresponds to samples from the yellow and red lineages. Brown percentages indicate genetic impact of adjacent lineages (lilac, grey, green). In (c) only samples from the yellow and red lineages and their hybrids, without genetic signatures of other lineages, were processed. Country abbreviations: Ba – Balkans (Albania, Bosnia and Herzegovina, Montenegro, Serbia, Kosovo, Former Yugoslav Republic of Macedonia, Romania, Bulgaria, and Greece), CH – Switzerland, CRO – Croatia, CZ – Czech Republic, FI – Finland, H – Hungary, N – Norway, NL – Netherlands, PL – Poland, S – Sweden. Maps were created using arcgis 10.2 (http://www.esri.com/arcgis) and adobe illustrator CS6 (http://www.adobe.com/products/ illustrator.html). Full size image

In general, the helvetica cluster matched well with the mitochondrial clade (Fig. 3). Genotypic introgression occurs largely unidirectional from helvetica into the eastern cluster. Most hybrids between helvetica and the eastern cluster originate from northern Switzerland, where the sampling is very comprehensive and dense, with approximately 200 samples from the contact zone. However, this dense sampling also revealed that the contact zone is narrow, with hybrid signatures occurring only in a maximally 50-km-wide strip (Fig. 3a, right). The allochthonous population of grass snakes of the red lineage in the Neander valley is clearly assigned to the eastern cluster, in agreement with mitochondrial haplotypes (Fig. 3a, left), and without admixture with neighbouring helvetica populations.

The second structure run (Fig. 3b) included the eastern lineages without impact of helvetica. Based on the hybridlab results (Supplementary Table S6), only samples with an eastern cluster membership of at least 95% were considered in that analysis, for which K = 2 was again the best solution (Supplementary Fig. S3b). One cluster (pink) embraced the yellow and red lineages and another one (brown) the grey, lilac and green lineages. Many southern samples of the yellow and red lineages, especially from Slovenia, Croatia and the southern Balkans, showed a high degree of admixture with the brown cluster. This admixture area largely corresponds to those regions where only haplotypes of the yellow and red lineages are present.

Samples with admixed ancestry with the brown cluster >5% were excluded from the third structure analysis (Fig. 3c), for which the optimal number of clusters was again K = 2 (Supplementary Fig. S3c). Now, one cluster corresponded to the yellow and the other to the red lineage. According to the hybridlab results (Supplementary Table S6), grass snakes with at least 80% cluster membership were treated as ‘pure yellow’ and with at least 83% as ‘pure red’. Introgression was common in both directions, indicating massive gene flow across hundreds of kilometres (Fig. 3c, right), including regions where exclusively or predominantly mitochondrial haplotypes of one lineage are present. For the yellow and red clusters, 89.18% of the molecular variance occurred within, and only 10.82% between the two clusters, equalling an F ST value of 0.11 (Supplementary Table S5).

The PCAs using microsatellite data (Fig. 4) are in line with the structure analyses in that the blue lineage (N. n. helvetica) was highly distinct from the yellow and red lineages, with some mismatches reflecting mitochondrial introgression mainly from helvetica into the eastern group. In contrast, the eastern lineages showed weak differentiation and massive overlap. The PCAs corroborate furthermore that our definition of admixed individuals is appropriate because hybrids were intermediate also in the PCA, independently from population affiliation, HWE or linkage equilibrium.

Figure 4 PCA axes 1–2 for microsatellite data. Samples are coloured according to mitochondrial lineages (top) or structure clusters (bottom). Admixed individuals were identified according to hybridlab results. PCAs for the yellow and red lineages correspond to the samples from Fig. 3c. Non-native samples were excluded. The oval outlines represent 95% confidential intervals. For helvetica and the eastern lineages (left) the x axis explains 16.6% and the y axis 4.5% of variation. For the eastern lineages (right) the x axis explains 3.8% and the y axis 2.9% of variation. Analyses along axes 1–3 produced nearly identical results (see Supplementary Fig. S4). Full size image

Cline analyses

The cline analyses revealed completely different patterns in the two contact zones (Fig. 5). For the western contact zone (contact zone I), the cline is concordant and very steep for both marker systems. For microsatellites, the cline centre was estimated to be located 639.9 km (95% confidence interval: 635.1–644.2 km) north-east from the starting point in southern France with a cline width of 39.4 km (24.4–59.6 km). For mitochondrial data, the cline centre was revealed almost at the same point, at 637.2 km (632.9–641.2 km), with a similar cline width of 37.5 km (27.3–53.6 km). A much smoother cline was found for the contact zone of the yellow and red lineages (contact zone II). The cline centre for microsatellites was located 518.7 km (454.5–590.4 km) distant from the reference site in northern Germany with a considerable cline width of 677.1 km (404.5–1,009.6 km). The width for the mtDNA cline was with 358.3 km (261.5–489.7 km) approximately half as wide as the microsatellite cline. The location of the centre was nearly identical at 503.8 km (474.6–537.0 km) from the reference site.