DNA breakage and adaptation
Adaptation to new environments often occurs in similar ways across different colonization events. Stickleback fish are a classic example: repeated colonizations of freshwater have resulted in the loss of pelvic hindfins. Previous work has shown that a pelvic enhancer of the Pitx1 gene is involved. Xie et al. now show that this enhancer lies within a region of the genome that is prone to double-stranded DNA breakage owing to a high thymine-guanine content. This breakage-prone region could lead to enhanced mutation rates that facilitate repeated adaptations to new environments. Science, this issue p. 81

Abstract
Evolution generates a remarkable breadth of living forms, but many traits evolve repeatedly, by mechanisms that are still poorly understood. A classic example of repeated evolution is the loss of pelvic hindfins in stickleback fish (Gasterosteus aculeatus). Repeated pelvic loss maps to recurrent deletions of a pelvic enhancer of the Pitx1 gene. Here, we identify molecular features contributing to these recurrent deletions. Pitx1 enhancer sequences form alternative DNA structures in vitro and increase double-strand breaks and deletions in vivo. Enhancer mutability depends on DNA replication direction and is caused by TG-dinucleotide repeats. Modeling shows that elevated mutation rates can influence evolution under demographic conditions relevant for sticklebacks and humans. DNA fragility may thus help explain why the same loci are often used repeatedly during parallel adaptive evolution.

Many phenotypic traits evolve repeatedly in organisms adapting to similar environments, and studying these cases can reveal ecological and genetic factors shaping parallel evolution (1, 2). For example, loss of pelvic appendages has evolved repeatedly in mammals, amphibians, reptiles, and fishes. Marine stickleback fish (Gasterosteus aculeatus) develop a robust pelvic apparatus, whereas many freshwater populations have lost pelvic structures (3). Pelvic reduction is associated with particular ecological conditions, is likely adaptive, and maps to recurrent and independent deletions of a pelvic enhancer (Pel) upstream of the homeodomain transcription factor gene (Pitx1) that show repeatable molecular signatures of positive selection (4–7). This unusual spectrum of regulatory deletions contrasts with the accumulation of single-nucleotide changes in other studies (6, 8, 9), hinting that special DNA features may shape adaptive variation at the Pitx1 locus (6).

Pel enhancer sequences show high predicted helical twist flexibility (6), a DNA feature associated with delayed replication and fragile site instability (10). To examine whether Pel forms alternative DNA structures in vitro, we used two-dimensional (2D) electrophoresis to analyze distributions of plasmid topoisomers (11) (Fig. 1A). A control stickleback genomic region showed smooth curves characteristic of B-DNA (Fig. 1B). In contrast, Pel sequences from marine populations showed mobility shifts characteristic of alternative DNA structure formation (Fig. 1B). Structural transitions started at a negative superhelical density of –σ = 0.043 and changed apparent linking numbers by 10 to 16 helical turns, similar to shifts produced by Z-DNA (left-handed DNA, starting –σ = 0.046) of ~105 to 170 base pairs (bp) (12, 13). Pel sequences from pelvic-reduced populations did not show unusual electrophoretic transitions (Fig. 1B), suggesting that natural Pel mutations remove sequences forming alternative DNA structures.

Fig. 1 Marine but not freshwater Pel alleles form alternative structures in vitro. (A) 2D electrophoresis of circular DNA topoisomers. A distribution of plasmid topoisomers is separated on an agarose gel; each topological class forms one spot. Canonical B-DNA forms a smooth distribution. Alternative structures cause mobility shifts. Distribution shifts at the linking number that induces alternative structure. Dagger symbol, mobility shift. (B) Pel from marine and freshwater pelvic-reduced populations. Control, Atp1a1.

To test the effect of Pel sequences on chromosome stability in vivo, we measured the rate of DNA double-strand breaks in yeast artificial chromosomes (Fig. 2A). Constructs without added test regions broke at background rates of 3.37 breaks per 10⁶ divisions (Fig. 2B), consistent with previous reports (14). Chromosomes containing marine Pel broke ~25 to 50 times more frequently (Fig. 2B), a rate even higher than that of previously analyzed human fragile sites (14). Pel from freshwater pelvic-reduced populations [but not freshwater pelvic-complete populations (fig. S1)] broke at rates similar to that of the control (Fig. 2B), suggesting that natural Pel mutations remove breakage-prone regions.
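As a rough illustration of how a per-division rate can be obtained from a fluctuation assay, the sketch below uses the p0 method. This is an assumption made for illustration: the text here does not state which estimator was used, and the culture counts and cell numbers are hypothetical, not data from this study. Only the background rate (3.37 per 10⁶ divisions) and the ~25- to 50-fold range come from the text.

```python
import math

def p0_rate(cultures_without_events, total_cultures, cells_per_culture):
    """Estimate events per cell division with the p0 method.

    p0 is the fraction of parallel cultures in which no event occurred.
    m = -ln(p0) is the expected number of events per culture; dividing by
    the final cell number approximates a per-division rate.
    """
    if not 0 < cultures_without_events <= total_cultures:
        raise ValueError("need at least one culture without events")
    p0 = cultures_without_events / total_cultures
    m = -math.log(p0)
    return m / cells_per_culture

def fold_change(rate, background):
    """Ratio of a measured rate to the background rate."""
    return rate / background

# Background rate quoted in the text (breaks per division).
BACKGROUND = 3.37e-6

# Rates implied by the ~25- to 50-fold increase reported for marine Pel.
implied_low = BACKGROUND * 25
implied_high = BACKGROUND * 50

# Hypothetical assay (assumed numbers): 10 cultures, 4 without breakage,
# 10,000 cells per culture at plating.
example_rate = p0_rate(cultures_without_events=4, total_cultures=10,
                       cells_per_culture=1e4)
print(f"implied range: {implied_low:.2e} to {implied_high:.2e} per division")
print(f"example fold change: {fold_change(example_rate, BACKGROUND):.1f}")
```

The p0 method is only usable when some cultures show no events; other estimators (for example, median-based or maximum-likelihood methods) are commonly preferred when every culture contains events.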

Fig. 2 Marine but not freshwater Pel alleles break at high rates in yeast, in an orientation-dependent fashion. (A) Test DNA is inserted in a yeast artificial chromosome between two selectable markers (LEU2 and URA3) and downstream of a telomere seed site. Breakage results in loss of URA3. (B) Box-and-whisker plot of Pel breakage rates. Whisker ends indicate maximum and minimum of six fluctuation assays (10 cultures each). RC, reverse complement. *P < 0.01 (table S5). See table S6 for population names. (C) Reversing replication direction through the test region, but not URA3 transcription direction, reverses orientation of fragility. *P < 0.01 (table S5). ori, DNA replication origin.

Reverse complements of marine Pel broke ~10 to 20 times less frequently than identical sequences in the forward orientation (Fig. 2B). RNA transcription can influence fragile site breakage (15), but reversing transcription orientation of the nearby URA3 marker did not significantly affect Pel fragility (Fig. 2C). In contrast, adding a replication origin on the opposite side of Pel did switch fragility, making the forward sequence stable and the reverse complement fragile (Fig. 2C). Thus, Pel fragility is markedly dependent on DNA replication direction.

Pel contains abundant runs of alternating pyrimidine-purine repeats (Fig. 3A and data S1), which can adopt alternative structures, such as Z-DNA, previously associated with deletions in bacteria, mice, and humans (16, 17). Three stretches of ~15, ~20, and ~50 TG-dinucleotide repeats in marine Pel total ~170 bp (consistent with linking number changes seen in the topoisomer assays above). TG-repeats alone induced mobility shifts in topoisomer assays (Fig. 3B) (18) and elevated chromosome breakage in yeast, with longer repeats stimulating more breaks (Fig. 3C). In contrast, both long and short versions of the reverse complement sequence (CA-repeats) were stable (Fig. 3C), recapitulating the orientation dependence of Pel fragility.
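The consistency noted above between repeat length and the topoisomer shifts can be checked with simple arithmetic. The sketch below assumes 2 bp per TG dinucleotide and a textbook value of 10.5 bp per helical turn of relaxed B-DNA (an assumption, not a value stated in the text). It is a bookkeeping check only and does not model the B-to-Z transition itself.

```python
BP_PER_DINUCLEOTIDE = 2
BP_PER_HELICAL_TURN = 10.5  # assumed value for relaxed B-DNA

# Approximate TG-dinucleotide repeat counts for the three stretches in marine Pel.
repeat_units = [15, 20, 50]

total_bp = sum(repeat_units) * BP_PER_DINUCLEOTIDE
turns_spanned = total_bp / BP_PER_HELICAL_TURN

print(total_bp)                 # 170
print(round(turns_spanned, 1))  # 16.2
```

Under the same assumption, a shift of 10 helical turns corresponds to 105 bp, which matches the lower end of the ~105 to 170 bp range quoted for Z-DNA.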

Fig. 3 TG-dinucleotide repeats recapitulate structure formation, high breakage rate, orientation dependence, and deletion spectrum. (A) To-scale maps of Pel in different freshwater pelvic-reduced populations (table S6). Green, Pel sequence driving pelvis expression (6). Tan, TG-repeats. White boxes, DNA deletions in indicated populations. Blue, DNA remaining. Letters indicate microhomologies at deletion junctions. (B) 2D gel for (TG)₃₀. Dagger symbol, mobility shift. (C) Yeast artificial chromosome (YAC) breakage rates for TG- or CA-repeats of varying lengths. *P < 0.01 (table S5). (D) Reporter shuttle plasmid schematic. (E) Mammalian mutation frequencies. Error bars indicate SEM of four or five independent experiments. *P < 0.05 (Student’s t test). Dagger symbol, deletions dominate mutation spectrum (fig. S2A). (F) To-scale map of (TG)₄₁-induced deletions in mammalian cells.

We also tested the effect of TG- and CA-repeats in mammalian COS-7 cells (Fig. 3D) (19). Dinucleotide repeats elevated mutation frequencies, with TG-repeats being more mutagenic than CA-repeats of comparable length, and longer repeats being more mutagenic than shorter repeats (Fig. 3E), in accordance with results from yeast assays. Mutations stimulated by the most mutagenic sequence, (TG)₄₁, were predominantly >100-bp deletions that removed part or all of the repeat and adjacent reporter gene (Fig. 3F and fig. S2A). Approximately 70% of deletion junctions contained microhomologies and insertions (Fig. 3F and fig. S2, A and B), consistent with error-prone microhomology-mediated end-joining repair and similar to junctions seen in stickleback pelvic-reduction alleles (6) (Fig. 3A). Ligation-mediated polymerase chain reaction suggested that breaks initiated near the dinucleotide repeats (fig. S2C). Taken together, our results indicate that TG-repeats form alternative DNA structures in vitro and can recapitulate the high mutation rates, orientation dependence, and propensity to stimulate breaks and deletions of the full Pel region.
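To make the notion of junction microhomology concrete, the sketch below counts the bases shared by both breakpoints of a deletion, that is, how far the deleted interval can slide left or right along the reference without changing the resulting sequence. The function name, the coordinate convention, and the example sequences are hypothetical and are not taken from the study's analysis.

```python
def junction_microhomology(ref, del_start, del_end):
    """Return the microhomology length (bp) at a deletion junction.

    ref[del_start:del_end] is the deleted segment (0-based, end-exclusive).
    Shifting the deletion one base to the right gives the same product
    exactly when ref[del_start] == ref[del_end]; the same logic applies to
    shifts to the left. The total number of allowed shifts is the length of
    sequence shared by the two breakpoints.
    """
    if not 0 <= del_start < del_end <= len(ref):
        raise ValueError("deletion must lie within the reference")

    # Count allowed rightward shifts.
    right = 0
    while (del_end + right < len(ref)
           and ref[del_start + right] == ref[del_end + right]):
        right += 1

    # Count allowed leftward shifts.
    left = 0
    while (del_start - left - 1 >= 0
           and ref[del_start - left - 1] == ref[del_end - left - 1]):
        left += 1

    return left + right

# Hypothetical example: deleting "CGTTTTT" leaves a junction where "CGT"
# could have come from either side of the break.
reference = "AAACGTTTTTCGTGGG"
print(junction_microhomology(reference, 3, 10))  # 3
```

A result of 0 indicates a blunt junction. Inserted bases at a junction, which are also reported above, would need to be handled separately.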

To determine the orientation of Pel sequences relative to DNA replication in sticklebacks (Fig. 4A and fig. S3), we sequenced S- and G1-phase cells from developing embryos and calculated S/G1 read-depth ratios to determine replication timing (20). Pel is located in a timing transition region (Fig. 4B and fig. S4), consistent with unidirectional replication. The replication direction through Pel matches the fragile orientation (Fig. 4C), suggesting that Pel would form a TG-repeat–associated fragile site in vivo. Experimental CRISPR targeting confirmed that initiation of breaks in Pel was sufficient to trigger local DNA deletions and macroscopic loss of pelvic structures in genetic crosses (fig. S5).
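The read-depth approach to replication timing can be sketched as follows: regions that replicate early are over-represented in S-phase reads relative to a non-replicating reference fraction, so a higher ratio indicates earlier replication. The window counts, pseudocount, function names, and direction labels below are illustrative assumptions, not the authors' pipeline.

```python
import math

def timing_profile(s_counts, ref_counts, pseudocount=1.0):
    """Return the log2 ratio of S-phase to reference read depth per window.

    Counts are scaled by library size so that differences in sequencing
    depth between the two samples do not shift the ratio.
    """
    if len(s_counts) != len(ref_counts):
        raise ValueError("both samples must use the same windows")
    s_total = sum(s_counts)
    ref_total = sum(ref_counts)
    profile = []
    for s, r in zip(s_counts, ref_counts):
        s_norm = (s + pseudocount) / s_total
        r_norm = (r + pseudocount) / ref_total
        profile.append(math.log2(s_norm / r_norm))
    return profile

def fork_direction(profile):
    """Infer net fork direction across a timing transition region.

    Forks are assumed to travel from the earlier-replicating side
    (higher ratio) toward the later-replicating side (lower ratio).
    """
    change = profile[-1] - profile[0]
    if change < 0:
        return "left-to-right"
    if change > 0:
        return "right-to-left"
    return "undetermined"

# Hypothetical read counts in four adjacent windows.
s_phase = [400, 300, 200, 100]
reference = [250, 250, 250, 250]
profile = timing_profile(s_phase, reference)
print(fork_direction(profile))  # left-to-right
```

Real profiles are noisy, so in practice the ratio would typically be smoothed over many windows before a direction is called.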

Fig. 4 Pel is located in the breakage-prone orientation in sticklebacks, generating a fragile site likely to contribute to parallel evolution in natural populations. (A) Workflow for profiling genome-wide replication timing. FACS, fluorescence-activated cell sorting. (B) Stickleback chromosome VII replication timing. Red line indicates the Pel locus, which is subtelomeric. Hash marks indicate reference genome assembly gap. (C) Diagram of stable and fragile replication orientations. Purple, newly synthesized leading strand; pink, newly synthesized lagging strand. (D) Probability of at least one de novo mutation arising at a particular locus in 10,000 generations and eventually becoming fixed, as a function of typical stickleback population sizes (N) and mutation rates (μ, gray bars) for single-nucleotide polymorphisms (SNPs), copy number variants (CNVs), and fragile sites. De novo point mutations are unlikely to occur and become fixed in small vertebrate populations, even when conferring a selective advantage (s = 0.01, modeled here). In contrast, mutations occurring at fragile sites are likely to arise and contribute to repeated evolution when conferring a selective advantage. For additional parameters, including neutrality (s = 0), see figs. S6 and S7.

Could elevated mutation rates contribute to reuse of Pel deletions in parallel evolution? Population genetic modeling indicates that new mutations occurring at the low rates of typical single-nucleotide changes (~10⁻⁹ mutations per site per generation) would rarely arise at a particular locus in postglacial stickleback populations, whereas mutations occurring at elevated rates (~10⁻⁵ mutations per site per generation, for fragile sites) would arise often. When new mutations do occur, their subsequent fate is controlled by drift and selection (21). Neutral or small-effect point mutations will usually be lost or rise to fixation slowly, whereas deletions may cause larger phenotypic effects and can sweep if environmental conditions favor pelvic reduction (Fig. 4D and figs. S6 and S7). The combined effects on both the “arrival of the fittest” and the “survival of the fittest” may explain why recurrent Pel deletions are the predominant mechanism for evolving stickleback pelvic reduction. For other traits, ancient standing variants provide an alternative way to overcome the demographic constraints of waiting for de novo mutations in small populations and can also lead to reuse of similar alleles in different populations (22, 23).
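The qualitative argument can be illustrated with a standard textbook calculation. This is a minimal sketch and not necessarily the model behind Fig. 4D: it assumes a diploid population whose effective size equals its census size N, Kimura's diffusion approximation for the fixation probability of a new mutation, and a Poisson approximation for successful mutations over time. The mutation rates (10⁻⁹ and 10⁻⁵), s = 0.01, and 10,000 generations come from the text and the Fig. 4 legend; the population sizes are assumed for illustration.

```python
import math

def fixation_probability(s, n):
    """Kimura's approximation for a new mutation at frequency 1/(2N).

    s is the selective advantage; s = 0 gives the neutral value 1/(2N).
    Assumes effective population size equals census size n.
    """
    if s == 0:
        return 1.0 / (2 * n)
    return (1 - math.exp(-2 * s)) / (1 - math.exp(-4 * n * s))

def prob_arise_and_fix(n, mu, s, generations):
    """Probability that at least one new mutation arises and later fixes.

    Each generation contributes 2*N*mu new mutations on average, each of
    which fixes with probability fixation_probability(s, n).
    """
    expected_successes = 2 * n * mu * fixation_probability(s, n) * generations
    return 1 - math.exp(-expected_successes)

GENERATIONS = 10_000
S = 0.01
RATES = {"SNP": 1e-9, "fragile site": 1e-5}

for n in (1e3, 1e4, 1e5):  # assumed population sizes
    for label, mu in RATES.items():
        p = prob_arise_and_fix(n, mu, S, GENERATIONS)
        print(f"N={n:.0e}  {label:12s}  P={p:.4f}")
```

With these assumed inputs and N = 10⁴, the probability is below 1% at the point-mutation rate and close to 1 at the fragile-site rate, in line with the contrast described above.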

The demographic parameters typical of sticklebacks apply to many vertebrates evolving with small population sizes or facing rapid environmental changes. For example, migration of modern humans out of Africa occurred with relatively small populations adapting to new environments in 3000 generations or fewer (24). Notably, nearly half of currently known mutations underlying adaptive traits in modern humans also appear to be produced by mechanisms with elevated mutation rates (table S1).

High mutation rates have been described at contingency loci in bacteria and other systems (25–30). Our study reveals an example of DNA fragility contributing to repeated morphological evolution in vertebrates. Our data also highlight several mechanisms that could alter local mutation rates, including expansion and contraction of TG-repeats, changes in sequence orientation, or changes in DNA replication. Natural variation in such parameters may affect the evolvability of different loci and the particular genetic paths likely to be taken when ecological conditions favor a given phenotype. The sequence features associated with DNA fragility in the Pel region are also found in thousands of other positions in stickleback and human genomes (fig. S8). Notably, TG-repeats are enriched in other loci that have undergone recurrent ecotypic deletions during marine-to-freshwater stickleback evolution (31) (table S2 and fig. S9) and are enriched near DNA breakage sites in humans (fig. S10). As causative changes are identified for a greater number of phenotypic traits, it will be interesting to see the extent to which DNA fragility has influenced the genes and mutations that underlie evolutionary change in nature.

Supplementary Materials
www.sciencemag.org/content/363/6422/81/suppl/DC1
Materials and Methods
Figs. S1 to S11
Tables S1 to S6
References (32–61)
Data S1

http://www.sciencemag.org/about/science-licenses-journal-article-reuse This is an article distributed under the terms of the Science Journals Default License.

Acknowledgments: We thank V. Tien, J. Le, M. Yau, M. Thakur, A. Muralidharan, M. Whitlock, B. Belotserkovskii, R. Driscoll, K. Cimprich, J. Wang, S. Quake, and A. Casper for experimental assistance or advice; R. Daugherty, J. Rollins, B. Lohman, R. Mollenhauer, M. Reyes, and F. von Hippel for help with fieldwork; C. Freudenreich for yeast strains; and Z. Weng and B. Carter for help with high-throughput sequencing and cell sorting. Funding: NIH grants 5P50HG2568 (D.M.K.), CA093729 (K.M.V.), and 2T32GM007790 (J.I.W.); NSF grant DEB0919184 (M.A.B.); NSF and Stanford CEHG Graduate Fellowships (K.T.X.); NIH Predoctoral Fellowship (A.C.T.); HHMI investigator (D.M.K.). Author contributions: K.T.X. and D.M.K. designed the study. K.T.X., G.W., A.C.T., and J.I.W. performed experiments. K.T.X., G.W., A.C.T., D.S., K.M.V., and D.M.K. analyzed data. T.E.R., A.D.C.M., D.S., and M.A.B. provided key populations and comments. K.T.X. and D.M.K. wrote the paper with input from all authors. Competing interests: None declared. Data and materials availability: Raw sequencing data and processed S/G1 read-depth ratio data have been deposited at GEO accession GSE121537.