Spotting off-targets from gene editing

Unintended genomic modifications limit the potential therapeutic use of gene-editing tools. Available methods to find off-targets generally do not work in vivo or detect single-nucleotide changes. Three papers in this issue report new methods for monitoring gene-editing tools in vivo (see the Perspective by Kempton and Qi). Wienert et al. followed the recruitment of a DNA repair protein to DNA breaks induced by CRISPR-Cas9, enabling unbiased detection of off-target editing in cellular and animal models. Zuo et al. identified off-targets without the interference of natural genetic heterogeneity by injecting base editors into one blastomere of a two-cell mouse embryo and leaving the other genetically identical blastomere unedited. Jin et al. performed whole-genome sequencing on individual, genome-edited rice plants to identify unintended mutations. Cytosine, but not adenine, base editors induced numerous single-nucleotide variants in both mouse and rice.

Science, this issue p. 286, p. 289, p. 292; see also p. 234

Abstract

CRISPR-Cas genome editing induces targeted DNA damage but can also affect off-target sites. Current off-target discovery methods work using purified DNA or specific cellular models but are incapable of direct detection in vivo. We developed DISCOVER-Seq (discovery of in situ Cas off-targets and verification by sequencing), a universally applicable approach for unbiased off-target identification that leverages the recruitment of DNA repair factors in cells and organisms. Tracking the precise recruitment of MRE11 uncovers the molecular nature of Cas activity in cells with single-base resolution. DISCOVER-Seq works with multiple guide RNA formats and types of Cas enzymes, allowing characterization of new editing tools. Off-targets can be identified in cell lines and patient-derived induced pluripotent stem cells and during adenoviral editing of mice, paving the way for in situ off-target discovery within individual patient genotypes during therapeutic genome editing.

CRISPR-Cas genome editing holds great promise for therapeutic applications, but accurate characterization of on- and off-target DNA double-strand breaks (DSBs) induced by a Cas nuclease remains difficult (1). Although existing methods identify off-targets in vitro (2–4) and in restricted cellular models (5, 6), they have limitations such as abundant false positives and an inability to operate during in vivo editing. Here, we describe DISCOVER-Seq (discovery of in situ Cas off-targets and verification by sequencing), a universal approach for unbiased off-target identification and molecular characterization of Cas activity that works in primary cells and in vivo.

Genome editing relies upon repair of a Cas-induced DSB, and we reasoned that monitoring this process could be used to track genome editing and to detect off-targets. Catalytically inactive Cas9 binds many more sequences than it cuts, and thus chromatin immunoprecipitation sequencing (ChIP-Seq) of Cas9 itself yields false positives (7, 8). However, capturing sequences bound by endogenous DNA repair machinery could specifically identify sites of Cas-induced damage.

We investigated a set of DNA repair proteins for their ability to identify Streptococcus pyogenes Cas9 target sites by ChIP-Seq (Fig. 1, A to C). We subsequently focused on the MRE11 subunit of the MRN complex, which is tightly distributed around the Cas9 cut site, is broadly expressed, and has a commercially available antibody that cross-reacts with murine and human proteins to potentially enable preclinical and clinical off-target detection without changing the workflow (fig. S1, A and B). MRE11 binding peaked before the appearance of insertions and deletions (indels) and was readily detected at a known guide RNA (gRNA) off-target (Fig. 1D and fig. S2) (9). Most MRE11 ChIP-Seq reads precisely ended at the predicted Cas9 cut, enabling identification of the nuclease site with single-base resolution (Fig. 2A and fig. S3, A to C). By examining multiple on- and off-targets, we found that Cas9 induces an asymmetric signature depending upon the strand bound by the gRNA (Fig. 2B and fig. S3B), which matches in vitro strand release after Cas9 activity (10). MRE11 binding also correctly identified Cas9’s tendency to create 1–base pair overhangs when paired with one protospacer that we tested (fig. S4, A and B) (11). MRE11 was robustly recruited to sites of Cas12a (Cpf1) editing, but with symmetric and overlapping reads consistent with Cas12a’s mechanism of action (12, 13) (Fig. 2C). MRE11 ChIP-Seq therefore captures the cellular activity of diverse genome-editing enzymes at a molecular level. We combined MRE11 ChIP-Seq with custom software, BLENDER (blunt end finder, https://github.com/staciawyman/blender v1.0.1, fig. S5), into the DISCOVER-Seq pipeline for in vivo identification of Cas9 off-targets (Fig. 1A).
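
The observation that most MRE11 ChIP-Seq reads end precisely at the cut suggests a simple way to localize a break: count how many read boundaries fall on each position and look for a sharp pile-up. The Python sketch below illustrates that idea on toy data only. It is not BLENDER's implementation; the function name, the half-open interval representation of reads, and the `min_reads` threshold are assumptions made for illustration.

```python
from collections import Counter

def find_blunt_end_peak(reads, min_reads=3):
    """Return (position, count) where read boundaries pile up most, or None.

    reads: iterable of (start, end) half-open intervals on one chromosome
    (hypothetical input format). With half-open coordinates, a read ending
    at a break and a read starting at the same break share one coordinate.
    min_reads: hypothetical minimum pile-up needed to report a peak.
    """
    boundaries = Counter()
    for start, end in reads:
        boundaries[start] += 1
        boundaries[end] += 1
    if not boundaries:
        return None
    # Highest count wins; ties are broken by the lowest coordinate.
    position, count = min(boundaries.items(), key=lambda kv: (-kv[1], kv[0]))
    if count < min_reads:
        return None
    return position, count

# Toy reads: three end at coordinate 150 and two start there.
toy_reads = [(100, 150), (120, 150), (90, 150), (150, 210), (150, 190), (30, 80)]
peak = find_blunt_end_peak(toy_reads)  # (150, 5)
```

A real pipeline would also have to compare against a control sample and check for a nearby protospacer-like sequence; this sketch covers only the pile-up step.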

Fig. 1 ChIP of DNA repair proteins at Cas-induced DSBs. (A) DISCOVER-Seq workflow. (B) DNA repair proteins assemble at Cas9-induced DSBs. ChIP-Seq tracks for DNA repair proteins in K562 cells edited with VEGFA (+) or nontargeting gRNA (–). (C) Magnified image from (B). (D) ChIP–quantitative polymerase chain reaction (PCR) dynamics of repair protein binding and indel formation (also see fig. S2).

Fig. 2 MRE11 ChIP molecularly characterizes Cas-induced DSBs. (A) Coverage and reads for MRE11 ChIP-Seq at the VEGFA on-target site (also see fig. S3, B and C). Most reads end in the Cas9 cut site. (B) Aggregate reads over multiple on- and off-target sites binned into gRNA binding the sense (n = 7 sites) or antisense (n = 7 sites) strand reveal asymmetric MRE11 recruitment consistent with in vitro models (schematic) (10). (C) MRE11 ChIP-Seq in K562 cells edited with AsCas12a-RNP visualizes multiple overhangs produced by Cas12a (12).

We tested DISCOVER-Seq’s performance at unbiased off-target detection using the well-characterized, promiscuous gRNAs “VEGFA_site2” (5) in human K562 cells and “Pcsk9-gP” (14) in murine B16-F10 cells. DISCOVER-Seq identified 57 off-targets for VEGFA_site2 and 45 off-target sites for Pcsk9-gP, which we individually validated by amplicon next-generation sequencing (amplicon-NGS) (Fig. 3A, fig. S6A, and tables S1 and S2). All off-targets identified by DISCOVER-Seq and tested had indel rates above background (>0.1%). We further tested DISCOVER-Seq using gRNAs with no off-targets (an RNF2-targeting guide and “Pcsk9-gM”) and an HBB-targeting guide with two off-targets (9). DISCOVER-Seq correctly characterized these gRNAs and identified no false positives (Fig. 3, B and C, and fig. S6B). We found that gRNAs with an extra 5′-guanine added to protospacers used with polymerase III promoters or in vitro transcription reduced off-targets, but nonuniformly (fig. S7, A to C). Regarding enzyme specificity, DISCOVER-Seq also correctly identified fewer off-targets from a high-fidelity (HiFi) Cas9 mutant (15) (fig. S8, A to D). DISCOVER scores were highly correlated with eventual indel frequencies across multiple contexts (Fig. 3D and fig. S9, A to C). Comparing DISCOVER-Seq with GUIDE-Seq in K562 cells using the VEGFA_site2 gRNA, we found substantial overlap and also sites uniquely found by each method (fig. S10A and table S3). DISCOVER-Seq correlated more closely with indel rates than GUIDE-Seq, and we empirically determined a DISCOVER-Seq sensitivity threshold of ~0.3% indels (fig. S10, B and C). Overall, DISCOVER-Seq performs well at unbiased and quantitative off-target identification with multiple Cas nucleases and gRNA architectures.
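
The relationship between DISCOVER scores and indel frequencies is summarized with a Spearman correlation (Fig. 3D), which compares the rank order of the two measurements, not their raw values. As a minimal, dependency-free sketch of that statistic, the Python below ranks both lists (averaging ranks for ties) and computes the Pearson correlation of the ranks. The score and indel values are invented toy numbers, not data from this study, and the function names are illustrative.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at sorted index i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(vx * vy)

# Toy values (hypothetical): per-site scores and indel percentages.
toy_scores = [12.0, 3.5, 7.1, 1.2, 5.0]
toy_indels = [45.0, 2.0, 10.5, 0.4, 8.0]
rho = spearman(toy_scores, toy_indels)  # identical rank order, so rho is 1.0
```

Because only ranks enter the calculation, a site with a very high indel rate does not dominate the statistic the way it would in a Pearson correlation of raw values.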

Fig. 3 Unbiased off-target discovery using DISCOVER-Seq. (A) Off-target sequences identified with DISCOVER-Seq and indel frequencies for VEGFA_site2 in K562 cells in one representative replicate. NC, noncoding; n.d., not determined because of difficulties in PCR amplification. (B and C) Sequences, DISCOVER scores, and indel frequencies for RNF2 and HBB gRNAs in K562 cells. (D) DISCOVER scores and indel frequencies are correlated (Spearman correlation) (also see figs. S9 and S10). On-target site is shown in red.

Induced pluripotent stem cells (iPSCs) and other primary cells are often not amenable to off-target discovery because of the toxicity of the reagents (fig. S11, A and B). We tested DISCOVER-Seq’s ability to identify patient-specific off-targets in iPSCs. We edited iPSCs derived from a Charcot-Marie-Tooth (CMT) patient with a dominant-negative, heterozygous mutation in HSPB1 using HiFi-Cas9 ribonucleoproteins (RNPs) targeting the wild-type (WT) or mutant (Mut) allele. This challenging application requires both patient- and allele-specific editing to knock out the mutated allele while sparing the WT (Fig. 4, A and B). DISCOVER-Seq of the WT guide identified the HSPB1 on-target and two off-targets, one of them in a duplicated region on chromosome 9, but the Mut guide solely identified the on-target site (Fig. 4C and table S4). Allelic-dropout analysis of amplicon-NGS showed that the WT guide edited ~30% of Mut alleles, whereas the Mut guide cross-edited only ~7% of WT alleles (about fourfold less) (Fig. 4D). Sequences from DISCOVER-Seq data recapitulated the allele specificity of the Mut guide (Fig. 4E), which illustrates DISCOVER-Seq’s ability to perform off-target identification and even to determine allelic specificity during preclinical testing in patient-derived cells.
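
Reads can be assigned to an allele because they contain an identifiable WT or Mut sequence (Fig. 4E). The Python sketch below shows one simple way to do such an assignment, by checking each read for a short allele-specific sequence, and then works through the fold difference in cross-editing from the approximate percentages quoted above. The sequences are made-up placeholders, not HSPB1 sequences, and the function name and matching rule are assumptions for illustration, not the analysis used in this study.

```python
def count_alleles(reads, wt_seq, mut_seq):
    """Count reads that contain only the WT or only the Mut marker sequence.

    Reads containing neither marker, or both, are counted as 'ambiguous'.
    """
    counts = {"WT": 0, "Mut": 0, "ambiguous": 0}
    for read in reads:
        has_wt = wt_seq in read
        has_mut = mut_seq in read
        if has_wt and not has_mut:
            counts["WT"] += 1
        elif has_mut and not has_wt:
            counts["Mut"] += 1
        else:
            counts["ambiguous"] += 1
    return counts

# Placeholder marker sequences differing at a single base (hypothetical).
WT_MARKER = "ACGTCA"
MUT_MARKER = "ACGACA"
toy_reads = ["TTACGTCAGG", "GGACGACATT", "CCACGACAGG", "TTTTTTTT"]
allele_counts = count_alleles(toy_reads, WT_MARKER, MUT_MARKER)

# Fold difference in cross-editing, using the approximate values in the text:
# WT guide edits ~30% of Mut alleles; Mut guide edits ~7% of WT alleles.
fold_difference = 0.30 / 0.07  # roughly 4.3, i.e. about fourfold
```

Exact substring matching ignores sequencing errors and reverse-complement reads, so a practical version would work from aligned reads and inspect the base at the variant position.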

Fig. 4 DISCOVER-Seq in patient-derived iPSCs and in vivo. (A) CMT patient-derived iPSCs were edited with HiFi-RNPs targeting the WT or Mut allele. (B) Patient-specific HSPB1 alleles. (C) DISCOVER scores and indel frequencies of off-targets for HSPB1 WT- and Mut-gRNAs. (D) Allele specificity of WT- and Mut-gRNAs by dropout amplicon-NGS indicates allele specificity for the Mut-gRNA but not the WT-gRNA. (E) DISCOVER-Seq reads contain identifiable WT or Mut sequences distinguishing allele specificity. (F) On-target (liver) and off-target (lung) tissues for DISCOVER-Seq (n = 2 mice each) and amplicon-NGS (n = 3 mice) were harvested at the indicated times after adenoviral injection. (G) DISCOVER-Seq distinguishes nuclease activity in target and nontarget tissues at the on-target locus. (H) DISCOVER scores and indel frequencies by amplicon-NGS or ICE (*) for on- and off-targets in mouse livers. Black diamonds indicate those not characterized by VIVO (14). NC, noncoding; n.d., not determined.

Identifying CRISPR-Cas off-targets during in vivo genome editing is a barrier to clinical translation. DISCOVER-Seq relies on endogenous DNA repair factors and thus is theoretically applicable during in vivo editing. We tested in vivo DISCOVER-Seq in a murine setting using the promiscuous Pcsk9-gP guide. This system was characterized by “VIVO” (14), which identifies putative off-targets in purified DNA and exhaustively validates potential hits using amplicon-NGS. We used adenoviral infection to deliver Pcsk9-gP and Cas9 or Cas9 alone (Fig. 4F) and found that indels were apparent after 4 days (fig. S12A); therefore, we performed DISCOVER-Seq on mice sacrificed at 24, 26, and 48 hours and amplicon-NGS after 4 days. The on-target locus DISCOVER score in on-target liver was greatest at 24 hours and declined over time, with no measurable signal in nontargeted lung (Fig. 4G and fig. S12B). Using a pooled-read approach accounting for subtle differences between animals, DISCOVER-Seq identified 36 loci, and we could individually amplify 27 of these for indel validation (Fig. 4H and table S5). All 27 tested sites validated with indel rates of 0.9 to 78.1%. Very-low-frequency off-targets identified by VIVO’s in vitro protocol were not detected by in vivo DISCOVER-Seq, but 17 of DISCOVER-Seq’s bona fide NGS-validated off-targets were not prioritized by VIVO because they were lost among VIVO’s nearly 3000 hits and false positives.
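
Comparing the sites reported by two off-target methods comes down to deciding which sites in one list have a counterpart in the other. The Python sketch below does this for lists of (chromosome, position) pairs, treating two sites as the same if they lie within a small window on the same chromosome. The coordinates are toy values, and the function name and the 10-bp default window are assumptions for illustration; the source does not describe how sites were matched between methods.

```python
def compare_site_lists(sites_a, sites_b, window=10):
    """Split sites_a into those shared with sites_b and those unique to sites_a.

    A site is (chromosome, position). Two sites match when they are on the
    same chromosome and at most `window` bp apart (hypothetical criterion).
    """
    shared, only_a = [], []
    for chrom, pos in sites_a:
        matched = any(
            other_chrom == chrom and abs(other_pos - pos) <= window
            for other_chrom, other_pos in sites_b
        )
        if matched:
            shared.append((chrom, pos))
        else:
            only_a.append((chrom, pos))
    return shared, only_a

# Toy site lists (hypothetical coordinates).
method_a_sites = [("chr1", 1000), ("chr2", 500), ("chr3", 42)]
method_b_sites = [("chr1", 1004), ("chr3", 4200)]
shared, only_a = compare_site_lists(method_a_sites, method_b_sites)
# shared -> [("chr1", 1000)]; only_a -> [("chr2", 500), ("chr3", 42)]
```

Calling the function a second time with the arguments swapped gives the sites unique to the other method. The nested scan is quadratic in the number of sites, which is fine for tens of sites but would call for sorting or an interval index with thousands of candidate hits.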

DISCOVER-Seq is a universal approach for unbiased detection of genome-editing off-targets that reveals the molecular action and dynamics of Cas-nucleases in cells and animals. DISCOVER-Seq measures DSBs, providing insight into events occurring before the appearance of genome-editing outcomes. This could allow identification of cut sites leading to difficult-to-detect outcomes such as translocations and large deletions (16). Although in vitro off-target assays are exquisitely sensitive, they are prone to extensive false positives. Additionally, the genomic DNA is stripped of context that affects Cas9 activity, such as chromatin state and modifications (17–19). By contrast, DISCOVER-Seq measures off-targets in situ in a single-step procedure (table S6). MRE11’s broad conservation suggests the applicability of DISCOVER-Seq in organisms beyond mice (20). DISCOVER-Seq’s performance in vivo and in patient-derived iPSCs suggests therapeutic possibilities such as identification in personal genotypes differentiated to target tissues and real-time discovery in patient biopsy samples during clinical in vivo editing.

Supplementary Materials
science.sciencemag.org/content/364/6437/286/suppl/DC1
Materials and Methods
Figs. S1 to S12
Tables S1 to S8
References (21–36)

http://www.sciencemag.org/about/science-licenses-journal-article-reuse This is an article distributed under the terms of the Science Journals Default License.