Premature termination codon (PTC)-bearing transcripts are often degraded by nonsense-mediated decay (NMD), resulting in loss-of-function (LoF) alleles. However, not all PTCs result in LoF mutations; some such transcripts escape NMD and are translated into truncated peptide products that cause disease through gain-of-function (GoF) effects. Since the location of the PTC is a major factor determining transcript fate, we hypothesized that depletion, in control databases, of protein-truncating variants (PTVs) within the gene region predicted to escape NMD could provide a rank for genic susceptibility to disease through GoF versus LoF. We developed an NMD escape intolerance score to rank genes based on the depletion of PTVs that would render transcripts able to escape NMD, using the Atherosclerosis Risk in Communities Study (ARIC) and the Exome Aggregation Consortium (ExAC) control databases; the score was further used to screen the Baylor-Center for Mendelian Genomics disease database. This analysis revealed 1,996 genes significantly depleted for PTVs that are predicted to escape from NMD, i.e., PTVesc; further studies provided evidence for a subset as candidate genes underlying Mendelian phenotypes. Importantly, these genes have characteristically low pLI scores, which can cause them to be overlooked as candidates for dominant diseases. Collectively, we demonstrate that this NMD escape intolerance score is an effective and efficient tool for gene discovery in Mendelian diseases due to production of truncated or altered proteins. More importantly, we provide a complementary analytical tool to aid identification of genes associated with dominant traits through a mechanism distinct from LoF.

Introduction

Translation-dependent nonsense-mediated decay (NMD) is an evolutionarily conserved mRNA surveillance mechanism that ensures dynamic regulation and high fidelity of gene expression in eukaryotic cells. It is a well-established “rule” that multi-exon transcripts that harbor termination codons out of their normal reading frame context, generally termed premature termination codons (PTCs), are likely to be subject to mRNA degradation by the NMD mRNA surveillance machinery and thus result in a predicted loss-of-function (LoF) variant or null allele. PTCs can be introduced into transcripts by various mechanisms including protein-truncating variants (PTVs; stopgain and indels), mRNA isoforms, and alternative translation.

In mammalian cells, NMD requires an exon junction complex (EJC) that is composed of a dynamic group of proteins that are positioned 20–24 nucleotides (nt) upstream of exon-exon boundaries by the splicing machinery in the nucleus. After an mRNA is exported from the nucleus, EJCs are removed during the pioneer round of translation by a translating ribosome. According to the EJC-dependent model for governing NMD, if a PTC is located more than 50–55 bp upstream of the last exon-exon junction, the transient interaction between the downstream EJC and the terminating ribosome is predicted to elicit NMD and degrade the mRNA harboring a PTC, i.e., NMD+ transcripts. On the other hand, a truncating variant that results in a PTC located within the last 50–55 bp of the penultimate exon, or the entire last exon, is predicted to escape from NMD (NMD− transcripts). The EJC-dependent model is well supported by a preponderance of experimental data that examine NMD efficiency, and the 50-bp rule alone accurately predicts NMD sensitivity in ∼85% of cancer-related mutations, although a number of exceptions have been reported.
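The positional rule described above can be illustrated with a minimal Python sketch. It is not the authors' implementation; the function name, the use of 0-based spliced-transcript coordinates, the choice to measure from the first nucleotide of the stop codon, and the treatment of single-exon transcripts as escaping are all assumptions made here for illustration. The cutoff is exposed as a parameter because the text gives a 50–55 bp range.

```python
from typing import List


def predicted_to_escape_nmd(exon_lengths: List[int], ptc_pos: int,
                            threshold: int = 50) -> bool:
    """Apply the 50-bp rule to one transcript (illustrative sketch).

    exon_lengths: exon lengths in transcript order (hypothetical input).
    ptc_pos: 0-based position of the first nucleotide of the PTC in the
             spliced transcript (an assumed convention).
    Returns True for a predicted NMD- (escaping) transcript,
    False for a predicted NMD+ (degraded) transcript.
    """
    total = sum(exon_lengths)
    if not 0 <= ptc_pos < total:
        raise ValueError("ptc_pos lies outside the transcript")
    if len(exon_lengths) < 2:
        # Assumption: no exon-exon junction means no EJC-dependent NMD.
        return True
    # Position of the first nucleotide of the last exon, i.e. the
    # last exon-exon junction in spliced coordinates.
    last_junction = sum(exon_lengths[:-1])
    # Positive when the PTC is upstream of the junction; zero or
    # negative when the PTC lies in the last exon.
    distance_upstream = last_junction - ptc_pos
    return distance_upstream <= threshold


# Three exons of 200, 300 and 150 nt: the last junction is at position 500.
exons = [200, 300, 150]
print(predicted_to_escape_nmd(exons, 520))  # PTC in the last exon
print(predicted_to_escape_nmd(exons, 460))  # 40 nt upstream of the junction
print(predicted_to_escape_nmd(exons, 100))  # far upstream
```

Passing `threshold=55` gives the more permissive end of the range quoted in the text.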

Approximately one-third of mRNAs containing pathogenic variants in genetic disorders and cancer are subject to frameshift or nonsense mutations that result in the generation of PTCs. Transcripts with PTCs located in the penultimate and last exon of genes can be NMD−, giving rise to stable mRNA translated into mutant proteins that can have a potent dominant-negative activity, thereby leading to human disease traits responsible for a broad spectrum of clinical phenotypes. Importantly, such PTVs that escape NMD (PTVesc) may be erroneously interpreted as LoF alleles when in fact they behave as gain-of-function (GoF) alleles; examples include DVL1 (MIM: 601365), causing Robinow syndrome autosomal-dominant 2 (DRS2 [MIM: 616331]); DVL3 (MIM: 601368), causing Robinow syndrome autosomal-dominant 3 (DRS3 [MIM: 616894]); and CRX (MIM: 602225), causing Leber congenital amaurosis 7 (LCA7 [MIM: 613829]). Moreover, some of those genes cause disease only when carrying NMD− variants, so current tools to predict variant pathogenicity that rely on LoF intolerance or haploinsufficiency scores will fail to inform probability of pathogenicity due to transcripts escaping from NMD and being translated into mutant proteins.

The systematic application of the 50-bp rule for NMD prediction of transcripts with a truncating variant requires the identification of the precise location of the predicted PTC as well as the relative position of the last EJC. Notably, in some transcripts there may be an unexpectedly large distance between the location of a given frameshifting insertion-deletion variant (indel) and the next predicted PTC, sometimes extending past the last EJC, which can result in NMD− transcripts. As a result, those transcripts may be translated into proteins with GoF properties.
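Locating the PTC that follows a frameshifting indel can be sketched as below. This is an illustrative example only, not the published tool: the helper names, the toy sequence, and the convention that the sequence starts at the start codon and uses 0-based positions are assumptions introduced here.

```python
from typing import Optional

STOP_CODONS = {"TAA", "TAG", "TGA"}


def apply_indel(seq: str, pos: int, ref: str, alt: str) -> str:
    """Replace `ref` with `alt` at 0-based position `pos` of `seq`."""
    if seq[pos:pos + len(ref)] != ref:
        raise ValueError("reference allele does not match the sequence")
    return seq[:pos] + alt + seq[pos + len(ref):]


def first_stop(seq: str) -> Optional[int]:
    """Return the 0-based position of the first in-frame stop codon.

    `seq` is assumed to begin at the start codon; None means no
    in-frame stop codon was found in the supplied sequence.
    """
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


# Toy coding sequence: ATG GCT GAC CTG ACC AAA TAA
wild_type = "ATGGCTGACCTGACCAAATAA"
# A 1-nt deletion at position 3 shifts the reading frame.
mutant = apply_indel(wild_type, 3, "G", "")

print(first_stop(wild_type))  # normal stop codon
print(first_stop(mutant))     # new stop codon, downstream of the deletion
```

In this toy case the new stop codon lies 6 nt downstream of the deletion; in real transcripts the gap can be much larger, which is why the PTC position, rather than the indel position, has to be compared with the last exon-exon junction. Note that the returned position is in mutant coordinates, which differ from reference coordinates by the net length change of the indel.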

Here, to investigate the potential role of escape from NMD for variant alleles implicated in human disease, we designed an efficient tool, NMDEscPredictor, to predict whether a given frameshifting indel will lead to NMD+ or to NMD− transcripts based on the relative location of the variant within the gene. Using this algorithm, we computationally classified PTVs, including frameshift insertion-deletions (indels) and stopgains, as NMD+ and NMD− in two control databases, the Atherosclerosis Risk in Communities Study (ARIC) and the Exome Aggregation Consortium (ExAC). We then developed an NMD escape intolerance score to rank each multi-exon canonical mRNA transcript in the genome based on the disequilibrium between expected and observed number of NMD− variants relative to the NMD+ variants in a given control database. Our analysis revealed a total of 1,996 genes significantly depleted (i.e., ranked in the top 5%) for NMD− variants in either control database, the large majority (98%) of which are likely to be tolerant to LoF. The resulting list includes genes for which C-terminal truncation does not lead to haploinsufficiency, for instance, DVL1 and REST (MIM: 600571), the latter leading to hereditary gingival fibromatosis (HGF [MIM: 617626]), that provide poignant examples.
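The idea of ranking genes by depletion of NMD− variants relative to NMD+ variants can be illustrated with a small sketch. The score itself is not defined in this section, so the code below is not the authors' method. It assumes, purely for illustration, that under the null PTVs fall uniformly along the coding sequence, so that the expected share of NMD− variants equals the share of coding length in the NMD-escape region, and it measures depletion with a one-sided binomial tail probability. All names, inputs, and gene labels are hypothetical.

```python
from math import comb
from typing import Dict, List, Tuple


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))


def depletion_p(obs_minus: int, obs_plus: int,
                len_minus: int, len_plus: int) -> float:
    """Probability of seeing this few NMD- variants or fewer.

    obs_minus / obs_plus: observed NMD- and NMD+ PTV counts.
    len_minus / len_plus: coding lengths (nt) of the NMD-escape and
    NMD-triggering regions. The uniform null is an assumption.
    """
    n = obs_minus + obs_plus
    p_escape = len_minus / (len_minus + len_plus)
    return binom_cdf(obs_minus, n, p_escape)


def rank_genes(genes: Dict[str, Tuple[int, int, int, int]]) -> List[str]:
    """Order genes from most to least depleted for NMD- variants."""
    return sorted(genes, key=lambda g: depletion_p(*genes[g]))


# Hypothetical counts: (obs_minus, obs_plus, len_minus, len_plus)
genes = {
    "GENE_A": (0, 20, 500, 500),   # no NMD- variants despite 20 PTVs
    "GENE_B": (10, 10, 500, 500),  # NMD- variants at the expected rate
}
ranking = rank_genes(genes)
print(ranking)
# Genes in the top 5% of such a ranking would be flagged as depleted.
top_n = max(1, round(0.05 * len(ranking)))
print(ranking[:top_n])
```

A real analysis would also need to account for differences in mutability, sequencing coverage, and allele frequency, none of which this sketch models.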

These findings support the hypotheses that mapping the location of PTV mutations within a gene is relevant for assessing variant pathogenicity and for inferring whether the disease mechanism is haploinsufficiency or potentially GoF. Moreover, we show that ranking genes based on the derived NMD escape intolerance score is an effective and efficient tool for gene discovery and may facilitate elucidating the underlying biology of Mendelian disease traits due to production of truncated or C-terminally altered proteins.