Synthetic gene drive systems using site-specific endonucleases to spread traits into a population were first proposed more than a decade ago1. This proposal was initially inspired by the action of a class of natural selfish genetic elements, found in many single-cell organisms, named homing endonuclease genes (HEGs). HEG-encoded proteins can recognize and cleave a 15- to 30-bp DNA sequence. HEGs are located within the DNA recognition sequence, rendering it resistant to further cleavage. However, when the HEG comes into contact with a chromosome containing the uninterrupted recognition sequence, the double-strand break (DSB) induced by the cleavage is often repaired using the homologous chromosome as a template, effectively converting a heterozygote into a homozygote in a process known as 'homing'. Through this mechanism, the frequency of an HEG can rapidly increase in a population. Naturally occurring HEGs can in principle be adapted to function as a gene drive system in mosquitoes because they can be re-engineered to recognize mosquito genes2. An HEG expressed in the male mosquito germline that recognizes an artificially introduced recognition site shows high rates of super-Mendelian inheritance and rapidly invades a caged population2. The increased transmission rate provided by endonuclease-based gene drive systems could theoretically outweigh the fitness costs arising from the cleavage activity and disruption of the targeted sites. If this proviso is met, a drive construct can spread through a population until it reaches an equilibrium frequency, with a reduced mean fitness for the population1.

Any nuclease with a sufficiently long recognition sequence could hypothetically be redesigned to function as a gene drive system akin to an HEG, provided that it can be engineered to recognize and insert in a specific genomic locus. For example, we have previously shown that modular nucleases such as zinc finger nucleases or transcription activator–like effector nucleases (TALENs), for which the DNA-binding specificity of each module is well-characterized, can be combined to function as a synthetic selfish element in Drosophila, albeit with low replication fidelity owing to their repetitive nature3. More recently, the development of the CRISPR-Cas9 (clustered, regularly interspaced, short palindromic repeats (CRISPR) and CRISPR-associated protein (Cas)) system4,5,6 has radically simplified the process of engineering nucleases that can cleave specific genomic sequences. A guide RNA (gRNA) complementary to a DNA target site directs the activity of the Cas9 endonuclease to that sequence, providing a means to edit almost any chosen DNA sequence without the need to undertake complex protein engineering and selection procedures. In addition to applications in genome editing, the specificity and flexibility of the CRISPR-Cas9 system offer unprecedented opportunities to expedite the development of gene drive systems for the control of insect vectors of disease7. In proof-of-principle experiments for such a use, CRISPR-based constructs were used to demonstrate gene drive activity in a single generation at an eye color locus in Drosophila and, using a split-drive system, in yeast8,9.

Translation of this technology for the control of the insect vector of human malaria requires development of an endonuclease-based gene drive system that interferes with the ability of A. gambiae mosquitoes to transmit the disease. This could be achieved either by blocking parasite development or by reducing the reproductive capability of the insect vector. Modeling of vector populations indicates that the latter might be achieved through the use of an endonuclease designed to 'home' to and yield a recessive mutation in a gene that is essential for viability or female fertility, with the latter being more effective, provided that homing is temporally and spatially confined to the germline during, or before, the process of gamete formation1,10. This is essential to avoid somatic disruption of the wild-type (WT) allele and allow the normal development of heterozygous mosquitoes, critical for transmitting the endonuclease to subsequent generations. Along these lines we have developed a CRISPR-based gene drive system designed to home, in both sexes of the human malaria vector A. gambiae, to haplosufficient, somatically expressed female-fertility genes.

To identify putative female-fertility genes in A. gambiae, we used a combination of orthology and a sterility index based on a logistic regression model that correlated gene expression features with the likelihood of female sterile alleles in the model dipteran Drosophila melanogaster11,12. Three candidate genes with high ovary expression and tissue specificity were chosen from this analysis: AGAP005958 (ortholog of Drosophila yellow-g, a haplosufficient female-fertility gene expressed in somatic follicle cells13); AGAP007280 (ortholog of Drosophila nudel, a haplosufficient female-fertility gene expressed in somatic follicle cells involved in dorsoventral patterning of the embryo14); AGAP011377 (no apparent Drosophila ortholog but contains a probable chitin binding domain).

We used either CRISPR-Cas9 nuclease or TALENs to selectively disrupt the coding sequence of these candidate genes and analyzed reproductive phenotypes to validate the suitability of these genes as homing targets. The gene knockout strategy generated 'docking lines' through homologous recombination inserting a GFP transcription unit flanked by two attP sites suitable for subsequent insertion of active drive constructs (Fig. 1a). Though not strictly necessary for the purpose of inserting a gene drive element, the generation of the docking lines allows an unambiguous assessment of the phenotype caused by gene disruption, in the absence of ongoing Cas9 activity, and the tracking of mutant alleles by the presence of the fluorescent marker. In each case the insertion of the attP-GFP docking cassette was designed to produce a null phenotype. Both TALENs and Cas9 nuclease were effective in cutting the corresponding target sequences and in promoting the insertion of the docking construct at the cleavage site. At each of the three selected target loci, transformed GFP+ individuals were recovered at a relatively high frequency with rates at least comparable to those in our experience of transposon-mediated germline transformation (Supplementary Table 1), and they were confirmed in PCR experiments to carry the desired homologous recombination events (Supplementary Fig. 1). G1 individuals of the docking lines were fertile and were intercrossed to produce G2 progeny, expected to include individuals both heterozygous and homozygous for the insertion. Visual inspection of G2 progeny identified two classes of mosquitoes on the basis of GFP intensity, 'intermediate' and 'strong', which we attributed to the presence of one or two copies of the GFP gene in the heterozygous or homozygous state, respectively, and later confirmed by molecular analysis (Fig. 1b).
Fertility assays (egg laying and hatching) performed on individual mosquitoes showed that all homozygous female mosquitoes were sterile, whereas heterozygous females showed normal rates of egg laying and hatching (Fig. 1c). On the basis of these results we concluded that the selected genes should be regarded as haplosufficient female-sterility genes. Manifestation of the impaired fertility phenotype differed across the three genes targeted, consistent with their function at distinct stages of egg production and embryo development: homozygous females carrying two disrupted alleles of either AGAP005958 or AGAP011377 failed to lay eggs, whereas homozygous mutant females at AGAP007280 laid eggs that did not hatch (Supplementary Fig. 2).

Figure 1: Gene disruption by homology-directed repair (HDR) at three separate loci causes recessive female sterility. (a) A plasmid-based source of either a TALEN or Cas9 coupled with a gRNA induces a DSB at the target locus. A plasmid (hdrGFP) containing regions of homology immediately upstream and downstream of the cut site acts as a template for homology-directed repair. Internal to the homology regions a 3xP3::GFP cassette identifies hdrGFP integration events and two attP sites facilitate secondary modification of the locus through RMCE. (b) PCR was used to confirm the targeted loci in WT individuals as well as those homozygous and heterozygous for the hdrGFP allele. The primer pair used is indicated in a (blue arrows). (c) Counts of larval progeny from individual females homozygous or heterozygous for hdrGFP alleles mated to WT males. Heterozygous docking-line females for all three loci showed fertility at least equal to that of WT females. A minimum of 20 individuals were tested for each line. Vertical bars represent the mean and error bars the s.e.m.

After validating the female-fertility phenotype of the target genes, we inserted a gene drive construct (CRISPR homing allele, CRISPRh) (Fig. 2a) into the docking site by recombinase-mediated cassette exchange (RMCE)15. Each drive construct was designed to home, in both sexes, into the cognate WT locus and contained the following components: (i) the Cas9 nuclease gene under the control of the vasa2 promoter, shown in a previous report to be active in the germline of both sexes16; (ii) a gRNA sequence designed to direct the cleavage activity of the nuclease to the same sequence targeted in the gene-knockout experiments and under the promoter of the ubiquitously expressed, PolIII-transcribed U6 gene17; and (iii) a visual marker (3xP3::RFP). AttB sites flanking the CRISPRh construct were used to direct ΦC31 integrase-mediated recombination at the docking site. Aware of the potential for these mosquitoes to show gene drive activity, we housed our mosquitoes in a containment facility consistent with recent recommendations for safeguards in such experiments18. Successful cassette exchange events were visually identified among G1 progeny as GFP+ to RFP+ phenotype conversions, and confirmed using PCR (Supplementary Fig. 3). At all three female-fertility loci we recovered double-crossover events that resulted in cassette exchange and insertion of the CRISPRh allele, at transformation frequencies of 2–7% (Table 1). In the cassette exchange reaction we observed the insertion of both complete and incomplete CRISPRh alleles, the latter probably the result of intramolecular recombination between regions of homology in the gRNA construct and its endogenous target at the insertion site.

Figure 2: CRISPRh alleles inserted at female-fertility loci show highly efficient gene drive activity and can spread in a caged population. (a) RMCE was used to replace the GFP transcription unit in hdrGFP docking lines with a CRISPR homing construct (CRISPRh) consisting of a 3xP3::RFP marker, Cas9 under the transcriptional control of the vasa2 promoter and a gRNA under the control of the ubiquitous U6 PolIII promoter. The gRNA cleaves at the nondisrupted WT allele. Repair of the cleaved chromosome through HDR leads to copying of the CRISPRh allele and homing. (b) Confinement of homing to the germline should lead to super-Mendelian inheritance of a homing construct (indicated in red) that, when targeting a haplosufficient, somatic female-fertility gene, will reduce the number of fertile females. (c) High levels of homing at all three female-fertility loci were observed. Male or female CRISPRh/+ heterozygotes were mated to WT. Progeny from individual heterozygous females were scored for the presence of the RFP linked to the CRISPRh construct and the average transmission rate is indicated by a vertical bar (± s.e.m.). A minimum of 34 females were analyzed for each cross. The average homing rate is also shown. (d,e) Counts of eggs and hatching larvae for the individual crosses revealed a strong fertility effect in heterozygous CRISPRh/+ females (d) that was not seen in equivalent heterozygous males (e). (f) Dynamics calculated using recurrence equations in Deredec et al.10, using the observed homing rates in males and females and effects on female fertility. We assume no fitness effects in males and that the initial release consists of heterozygous males equal to 10% of the prerelease adult male population (i.e., 5% of the overall population). The model assumes discrete generations (one per month) and random mating; results are plotted starting from the first generation after release and do not account for evolution of either the CRISPR allele or the target sequence.
(g) Increase in frequency of CRISPRh allele in cage population experiments. An equal number of CRISPRh/+ and WT individuals were used to start a population, and the frequency of individuals containing a CRISPRh allele was recorded in each subsequent generation. Black line shows deterministic prediction based on observed parameter values (homing rates 98.4%, heterozygous female fitness of 9.3%, homozygous females completely sterile), assuming no fitness effects in males. Gray lines show results from 20 stochastic simulations assuming 300 males and 300 females are used to start the next generation, females mate randomly with a single male and 15% of females fail to mate, using random numbers drawn from the appropriate multinomial distributions. Red lines show results from two replicate cages.

Table 1: RMCE to insert CRISPRh alleles at their target locus

Each complete integration event generated a CRISPRh allele encoding a Cas9-gRNA endonuclease designed to target the corresponding integration site on a WT chromosome. Accordingly, the CRISPRh allele was resistant to nuclease cleavage as its target sequence had been interrupted by the insertion of the CRISPRh construct itself. In heterozygous mosquitoes the activation of the vasa2 promoter during gamete formation should induce the synthesis of the Cas9 nuclease that, in concert with the ubiquitously expressed gRNA, should cleave the target sequence in the fertility genes, thereby initiating homologous recombination repair events that lead to the homing of the CRISPRh construct into the WT allele (Fig. 2a). Visual screening was used to analyze the frequency of the RFP-linked CRISPRh allele in the progeny of heterozygous parents crossed to WT mosquitoes to detect signs of non-Mendelian inheritance, above the expected frequency of 50%, that would reveal gene drive activity (Fig. 2b).

In several of the CRISPRh/+ G1 individuals that we recovered at each locus we noticed super-Mendelian inheritance of the RFP-marked CRISPRh allele, with rates of 94.4–100% (Table 1) among the progeny. To further investigate the activity of these CRISPRh alleles, we looked at homing ability and sterility in the G2 generation and beyond, scoring the progeny of large numbers of single crosses to WT mosquitoes. Invariably, we saw high rates of transmission in every fertile cross we examined (Fig. 2c), representing average homing rates (defined as the proportion of non-CRISPR alleles converted to CRISPRh in the gametes) ranging from 87.3% to 99.3% across the three target genes. Importantly, though we observed more variability (69–98%) across generations over time, we observed no obvious decrease in homing performance (Table 2), suggesting that the majority of CRISPR homing events regenerate an intact allele. Furthermore, the transmission rate of the CRISPRh allele at AGAP007280 and AGAP011377 was high in both male and female CRISPRh/+ individuals, in agreement with the predicted activity of the vasa2 promoter in both sexes during early gametogenesis16. In those rare progeny that did not contain a CRISPR homing allele, we looked for evidence of repair by nonhomologous end joining (NHEJ), microhomology-mediated end joining (MMEJ)19 or other noncanonical homing events at the three target loci. In a total of 32 offspring derived from a minimum of 7 individuals, we found a total of 13 indel mutations (6 unique, including two examples of a 6-bp deletion that preserved reading frame and could represent a resistant allele), presumably arising from NHEJ or MMEJ repair, and two events from the same parent producing a 195-bp insertion at AGAP007280, most parsimoniously explained by an incomplete homing event that was resolved using homology between the gRNA sequence in the construct and its cognate target in the genome (Supplementary Fig. 4).
Consistent with rare incomplete homing events generating a nonfunctional homing allele, we recovered an identical event in a single individual that produced progeny with a normal Mendelian segregation of the transgenic phenotype.
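The relationship between the scored transmission rate and the homing rate defined above can be sketched in a few lines. This is a minimal illustration, assuming a heterozygote crossed to WT, every non-converted allele segregating in a Mendelian fashion, and RFP scoring that perfectly reports the CRISPRh allele. Under those assumptions a transmission fraction t corresponds to a homing rate e = 2t − 1. The function name and the example counts are hypothetical and are not taken from the study.

```python
def homing_rate(rfp_positive: int, total: int) -> float:
    """Estimate the homing rate from progeny of a CRISPRh/+ x WT cross.

    Assumption (not from the source): without homing, half of the gametes
    carry the drive allele; homing converts a fraction e of the remaining
    WT alleles, so transmission t = (1 + e) / 2 and therefore e = 2t - 1.
    """
    if total <= 0:
        raise ValueError("total progeny must be positive")
    if not 0 <= rfp_positive <= total:
        raise ValueError("rfp_positive must lie between 0 and total")
    transmission = rfp_positive / total
    return 2 * transmission - 1


# Hypothetical counts for illustration only.
print(homing_rate(75, 100))  # 75% transmission -> homing rate 0.5
print(homing_rate(50, 100))  # Mendelian 50% transmission -> homing rate 0.0
```

A negative estimate is possible with small samples and simply indicates transmission below the Mendelian expectation.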

Table 2: CRISPRh homing rates remain high across several generations

Though homing rates were high in the germline of both males and females, the fertility of females heterozygous for a homing construct was markedly reduced, with the number of larvae produced only 4.6% of WT (bootstrap 95% confidence limits 2.3–7.7%) for AGAP011377 and 9.3% (5.7–14.2%) of WT for AGAP007280. We did not recover a single larva from females heterozygous for a CRISPRh allele at AGAP005958 (Fig. 2d). In contrast, males heterozygous for CRISPRh alleles showed normal fertility (Fig. 2e). The fertility reduction observed for heterozygous CRISPRh females was at odds with the phenotype observed in heterozygous docking line females where the disruption of single alleles of AGAP011377, AGAP007280 and AGAP005958 apparently did not affect female fertility. This reduction in fertility is probably due to somatic expression of the Cas9 nuclease, as we have observed for a similar construct targeting GFP (Supplementary Fig. 5), and as others have observed in Drosophila9,20. The nos promoter has recently been found to be substantially more germline-specific in directing Cas9 activity in Drosophila20, and our system is flexible so it can accommodate alternative promoters.
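The bootstrap confidence limits quoted above for fertility relative to WT can be illustrated with a percentile bootstrap on per-female larval counts. The text does not specify the resampling scheme, so the sketch below is one plausible reading rather than the procedure actually used: females are resampled with replacement within each group, and the ratio of mean larval counts is recomputed each time. The function name, parameters and larval counts are all hypothetical.

```python
import random
from statistics import mean


def relative_fertility_ci(het_counts, wt_counts, n_boot=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap CI for mean(het) / mean(WT) larval output.

    Assumption: females are resampled independently within each group;
    the source does not state the exact bootstrap procedure.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    point = mean(het_counts) / mean(wt_counts)
    ratios = []
    for _ in range(n_boot):
        het = rng.choices(het_counts, k=len(het_counts))
        wt = rng.choices(wt_counts, k=len(wt_counts))
        wt_mean = mean(wt)
        if wt_mean == 0:
            continue  # skip degenerate resamples with no WT larvae
        ratios.append(mean(het) / wt_mean)
    ratios.sort()
    lower = ratios[int((alpha / 2) * len(ratios))]
    upper = ratios[min(len(ratios) - 1, int((1 - alpha / 2) * len(ratios)))]
    return point, lower, upper


# Hypothetical larval counts per female, for illustration only.
het_larvae = [0, 0, 0, 12, 0, 5, 0, 0, 30, 0]
wt_larvae = [95, 110, 88, 102, 120, 97, 105, 99, 113, 91]
point, lower, upper = relative_fertility_ci(het_larvae, wt_larvae)
print(f"relative fertility {point:.3f} (95% limits {lower:.3f}-{upper:.3f})")
```

Because most heterozygous females in such data produce no larvae, the resampled ratio is strongly skewed, which is one reason a bootstrap interval is a natural choice over a normal approximation.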

Our measures of homing rates and fertility effects can be used with the model of Deredec et al.10 to derive an initial prediction about whether the constructs would be expected to spread if released into a population. This analysis revealed that the fitness cost in terms of reduced reproductive capability imposed by the CRISPRh constructs at AGAP011377 and AGAP005958 outweighs the homing rate, and the constructs would be expected to disappear from a population over time. In many respects these constructs match the requirements of female-specific RIDL (release of insects with a dominant lethal) with enhanced transmission21, a potent form of the sterile insect technique, though conditional rescue of the sterility may be required for efficient production. However, the higher homing rates observed for CRISPRh at AGAP007280, combined with the milder fertility reduction observed in heterozygous females, indicate that this construct could spread through a population, at least initially, and impose a reproductive load on the population as it does so, fulfilling one of the major requirements for a functional gene drive measure for vector control (Fig. 2f). To investigate the ability of the CRISPRh allele to spread at the AGAP007280 locus, caged populations were initiated with CRISPRh/+ and WT individuals at equal frequency and monitored over several generations. Consistent with the modeling predictions we observed a progressive increase in the frequency of individuals positive for the CRISPRh allele from 50% to 75.1% over four generations (Fig. 2g). Such a reproductive load will impose a strong selection pressure for resistant alleles, some of which will be generated by the gene drive system itself through NHEJ or MMEJ repair of endonuclease-induced chromosome breaks, as we previously showed molecularly (Supplementary Fig. 4).
The longer term dynamics will depend on the efficiency of spreading on the one hand and the fitness cost of mutations arising at the cleavage site on the other hand10,22. Ultimately the effect of these mutations could be mitigated by designing nucleases that target conserved, functionally constrained regions in the target gene and that are tolerant of mutations1. This could be achieved using a CRISPR-Cas9 gene drive through the use of multiple gRNAs targeting sequence variants7.
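To make the kind of deterministic prediction discussed above more concrete, the following is a deliberately simplified one-locus sketch. It is not the recurrence system of Deredec et al.10, and it makes several assumptions of its own: discrete generations, random mating, equal homing in the germline of both sexes, no fitness cost in males, fully sterile homozygous females, and no resistant alleles. The function name and arguments are hypothetical; the parameter values in the usage lines are those quoted in the legend of Fig. 2g.

```python
def simulate_drive(homing, het_female_fitness, generations, init_het=0.5):
    """Track the frequency of drive carriers under a simplified model.

    Genotypes: WW (wild type), WD (heterozygote), DD (drive homozygote).
    Assumptions (this sketch only): males of all genotypes are equally fit,
    DD females are sterile, WD females have relative fertility
    het_female_fitness, and heterozygotes of both sexes transmit the drive
    allele to a fraction (1 + homing) / 2 of their gametes.
    """
    ww, wd, dd = 1.0 - init_het, init_het, 0.0
    carriers = [wd + dd]
    for _ in range(generations):
        # Drive-allele frequency among sperm (all males contribute equally).
        m = wd * (1 + homing) / 2 + dd
        # Total egg output, weighted by female fertility (DD contributes 0).
        female_output = ww + wd * het_female_fitness
        if female_output == 0:
            raise ValueError("no fertile females remain")
        # Drive-allele frequency among eggs.
        f = wd * het_female_fitness * (1 + homing) / 2 / female_output
        # Random union of gametes gives next-generation zygote frequencies.
        ww, wd, dd = (1 - m) * (1 - f), m * (1 - f) + (1 - m) * f, m * f
        carriers.append(wd + dd)
    return carriers


# Parameter values quoted for AGAP007280: homing 98.4%, heterozygous
# female fitness 9.3%, equal starting numbers of CRISPRh/+ and WT.
for generation, freq in enumerate(simulate_drive(0.984, 0.093, 4)):
    print(generation, round(freq, 3))
```

Because the sketch ignores resistant alleles, sex-specific homing rates and stochastic effects such as mating failure, its output should be read as qualitative and is not expected to reproduce the published curves.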

The high frequency with which gene knockouts were achieved at three separate loci and the ease with which these could be both tracked using a visual marker and secondarily modified to include genes of choice verify the CRISPR-Cas9 gene drive system as a robust gene editing tool that will be valuable for functional genetics in the malaria mosquito. The rates of super-Mendelian inheritance that we observed with CRISPR-based homing constructs at female-fertility loci establish a solid basis for the development of a gene drive system that has the potential to substantially reduce mosquito populations. Moreover, our gene drive element was able to carry substantial additional sequence in the form of the RFP marker unit, indicating that this technology tolerates additional cargo, making it suitable for population-modification strategies that aim to introduce transgenes conferring useful phenotypes such as parasite resistance into vector populations. Being able to use CRISPR-Cas9 in mosquitoes means that genome editing and nuclease engineering will no longer be technical bottlenecks in this major pest insect.

The success of gene drive technology for vector control will depend on the choice of suitable promoters to effectively drive homing during the process of gametogenesis, the phenotype of the disrupted genes, the robustness of the nuclease during homing and the ability of the target population to generate compensatory mutations.