End-to-end guide design for CRISPR/Cas9 with machine learning

Azimuth and Elevation: On-target and off-target guide prediction

The CRISPR/Cas9 system provides state-of-the-art genome editing capabilities. However, several facets of this system are still being characterized and optimized. One in particular is the choice of guide RNA that directs Cas9 to its target DNA: for the protein-coding region of a single gene, hundreds of guides satisfy the constraints of the CRISPR/Cas9 PAM sequence. Only some of these guides target DNA efficiently enough to generate gene knockouts, and some may also have activity at unintended sites in the genome. We have developed two predictive modelling approaches, Azimuth [1] and Elevation [2], for these respective problems of on-target and off-target activity prediction.
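To make the size of the candidate pool concrete, the sketch below enumerates every site in a DNA string that satisfies a PAM constraint. It is an illustration only and is not part of Azimuth or Elevation: it assumes the common SpCas9 convention of a 20-nucleotide guide immediately followed by an NGG PAM, and the function names `reverse_complement` and `find_candidate_guides` are hypothetical.

```python
import re


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_candidate_guides(dna, guide_len=20):
    """List (strand, start, guide, pam) for every guide followed by an NGG PAM.

    Assumption: SpCas9-style sites, i.e. `guide_len` bases directly 5' of NGG.
    For the "-" strand, `start` is an offset into the reverse complement.
    """
    dna = dna.upper()
    # A lookahead with capture groups reports overlapping sites as well.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    hits = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for match in pattern.finditer(seq):
            hits.append((strand, match.start(), match.group(1), match.group(2)))
    return hits
```

Calling `find_candidate_guides` on a coding sequence returns every candidate on both strands. A list like this is only the starting point: satisfying the PAM constraint says nothing about how efficiently a guide cuts or where else it might act, which is what the on-target and off-target models are meant to predict.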

Publications

Please cite these papers if using our on-target or off-target model:

[1] John G. Doench*, Nicolo Fusi*, Meagan Sullender*, Mudra Hegde*, Emma W. Vaimberg*, Katherine F. Donovan, Ian Smith, Zuzana Tothova, Craig Wilen, Robert Orchard, Herbert W. Virgin, Jennifer Listgarten*, David E. Root. Optimized sgRNA design to maximize activity and minimize off-target effects for genetic screens with CRISPR-Cas9. Nature Biotechnology, Jan 2016, doi:10.1038/nbt.3437. (*equal contributions, corresponding author)

[2] Jennifer Listgarten*, Michael Weinstein*, Benjamin P. Kleinstiver, Alexander A. Sousa, J. Keith Joung, Jake Crawford, Kevin Gao, Luong Hoang, Melih Elibol, John G. Doench*, Nicolo Fusi*. Prediction of off-target activities for the end-to-end design of CRISPR guide RNAs. Nature Biomedical Engineering, Jan 2018, doi:10.1038/s41551-017-0178-6. (*equal contributions, corresponding author)

How to use our software

On-target prediction (Azimuth):

On-target predictions are available through our web service at https://crispr.ml (using the “Input Sequence” toggle), or through the GPP sgRNA Designer tool maintained by the Broad Institute of MIT and Harvard.

In addition, a Python implementation of our model (and other competing methods) is available from GitHub.

Note: We are no longer supporting our Azure ML prediction server, previously accessed through Excel or the web API, as it is running an outdated version of our code. See the release notes on GitHub for further information.

Off-target prediction (Elevation):

We have precomputed all on-target and off-target scores for the human exome (GRCh38), and made the results available through the web service at https://crispr.ml.

A Python implementation of the Elevation-score and Elevation-aggregate off-target models is available from GitHub. Software for running Elevation-search (the genome search for potential off-targets) is also available on GitHub here. See the documentation in the GitHub repositories for installation and usage instructions.

Press

Microsoft Blog post on Elevation: https://blogs.microsoft.com/ai/crispr-gene-editing/

Other articles on Elevation:

Gizmodo: https://gizmodo.com/microsoft-wants-to-use-artificial-intelligence-to-make-1821956922

Engadget: https://www.engadget.com/2018/01/10/microsoft-ai-crispr-accuracy/?sr_source=Twitter

Microsoft Blog post on Azimuth: http://blogs.microsoft.com/next/2016/01/18/molecular-biology-meets-computer-science-tools-in-new-system-for-crispr/

Broad Blog post on Azimuth: https://www.broadinstitute.org/blog/machine-learning-approach-improves-crispr-cas9-guide-pairing

Associated data

The combined FC and RES data, along with the predictions from our final, published on-target model, can be found here.

Primary contributors

Jennifer Listgarten (Microsoft Research, New England)

Michael Weinstein (UCLA; Zymo Research)

John Doench (Broad Institute of MIT and Harvard)

Nicolo Fusi (Microsoft Research, New England)

Licensing information

The Azimuth source code is licensed under the BSD 3-clause license, and the Elevation source code is licensed under the MIT license. Manuscripts and datasets downloaded from third-party sources (e.g. using the shell script included with the Elevation source code) are subject to the licensing terms of the journals they originated from.

Contact

For questions, please contact crispr@lists (dot) research (dot) microsoft (dot) com.