Because lncRNAs may encode conserved peptides, we examined the coding potential of NORAD using PhyloCSF, an algorithm that discriminates protein-coding from noncoding transcripts based on their evolutionary signatures. This analysis confirmed the low coding potential of NORAD, which received a maximum codon substitution frequency (CSF) score similar to those of other well-characterized lncRNAs (Figure 1E). NORAD also lacks the potential to encode any recognizable protein domains, based on a BLASTX analysis of all possible reading frames. Having established NORAD as a highly conserved, ubiquitously expressed, abundant lncRNA, we set out to investigate its functions in human cells.

NORAD is easily detectable as a discrete transcript of the expected size by northern blotting (Figure 1C). Absolute copy-number analysis in a panel of human cell lines with or without doxorubicin treatment revealed that NORAD is present at ∼300–1,400 copies per cell, similar in abundance to highly expressed mRNA transcripts such as ACTB (Figure 1D).

As in mouse, the human lncRNA is induced after DNA damage in a p53-dependent manner in the colon cancer cell line HCT116 (Figure 1B). We therefore named this lncRNA “noncoding RNA activated by DNA damage”, or NORAD. Despite its p53-dependent induction, we were unable to identify an obvious p53 binding site in the vicinity of the NORAD promoter, nor was one identified in a recent p53 ChIP-seq study performed in this cell line, suggesting that p53 regulates NORAD indirectly.

This study was initiated in an attempt to identify human lncRNAs that regulate the DNA damage response. To this end, we examined a set of previously identified mouse lncRNAs that are induced after doxorubicin treatment in a p53-dependent manner. Among these transcripts, we noted a poorly characterized 4.9 kilobase (kb) unspliced lncRNA, annotated as 2900097C17Rik, that exhibits a high degree of evolutionary conservation in mammals. A clear ortholog of this transcript, with 65% nucleotide identity to 2900097C17Rik, is expressed from the syntenic location in the human genome (Figure 1A). Annotated in RefSeq as LINC00657, this 5.3 kb lncRNA is broadly and abundantly expressed in human cell lines and tissues (Figures 1A and S1A). Like the mouse ortholog, the human transcript has features of an RNA polymerase II transcription unit, including an enrichment of H3K4me3-modified histones at the transcription start site (Figure 1A) and a canonical polyadenylation signal at the 3′ end, use of which was confirmed by 3′ rapid amplification of cDNA ends (RACE) (Figure S1B).

(D) qRT-PCR analysis of NORAD expression relative to 18S rRNA in targeted HCT116 clones of the indicated genotypes.

(C) Schematic showing 7 kb SphI restriction fragment created by correct NORAD targeting and its detection by Southern blot in HCT116 knockout clones.

(A) Illumina BodyMap 2.0 1×75 bp RNA-seq data were downloaded from The Galaxy Project (https://usegalaxy.org/library/index) and aligned to hg19 using TopHat2 (Trapnell et al., 2009); FPKM values were calculated using Cufflinks (Trapnell et al., 2010).

Characterization and Inactivation of NORAD in Human Cells, Related to Figures 1 and 2

(E) Maximum CSF scores of NORAD as well as other known coding and noncoding RNAs determined by analysis with PhyloCSF ().

(D) Absolute quantification of NORAD transcript copy number per cell, determined by qRT-PCR, in various human cell lines with or without treatment with 1 μM doxorubicin for 24 hr.

(B) qRT-PCR analysis of NORAD expression relative to 18S rRNA in p53+/+ and p53−/− HCT116 cells with or without treatment with 1 μM doxorubicin for 24 hr. For this and all subsequent qPCR figures, error bars represent SDs from three independent measurements.

(A) Schematic representation of NORAD (annotated in RefSeq as LINC00657) with associated UCSC Genome Browser tracks depicting mammalian conservation (PhastCons) as well as ENCODE RNA-seq and H3K4me3 ChIP-seq coverage in human cell lines ().

To confirm that this phenotype is not unique to HCT116 cells, we introduced the transcriptional stop cassette into the NORAD locus in BJ-5ta cells, a telomerase-immortalized, non-transformed diploid fibroblast cell line (Figures S3A and S3B). Although NORAD−/− BJ-5ta cells were grossly diploid by flow-cytometric analysis of DNA content (data not shown), they exhibited significantly elevated levels of aneuploidy, as determined by quantification of chromosomes 7 and 20 by FISH (Figure S3C).

(C) Cells of the indicated genotypes were assayed for aneuploidy using chromosome 7/20 FISH as in Figures 2E and 2F. 100 nuclei were scored per clone. p value calculated by chi-square test.

(B) qRT-PCR analysis of NORAD expression relative to 18S rRNA in targeted BJ-5ta clones of the indicated genotypes.

Even apparently diploid NORAD−/− clones displayed a range of chromosome numbers (Figure 2D), suggesting that this karyotypically stable cell line had adopted a chromosomal instability (CIN) phenotype, defined as the frequent loss or gain of whole chromosomes. Human cancer cells frequently exhibit CIN, which is believed to be an important driver of tumorigenesis, yet the role of lncRNAs in regulating this phenotype is poorly understood. Therefore, to assess more quantitatively whether loss of NORAD induces CIN, we employed an established fluorescent in situ hybridization (FISH) assay, in which marker chromosomes are labeled and scored in hundreds of interphase cells. Assaying chromosomes 7 and 20 with this approach verified that wild-type HCT116 cells exhibit a low frequency of chromosomal gain or loss (Figures 2E–2F). In contrast, up to 25% of NORAD−/− cells displayed gain or loss of one of these chromosomes, confirming the presence of a CIN phenotype. Importantly, since only two chromosomes were assayed in these experiments, these measurements likely represent a significant underestimate of the frequency of aneuploidy in NORAD−/− cells. In addition, live-cell imaging documented a high rate of mitotic errors, including anaphase bridges and mitotic slippage, in NORAD−/− clones (Figures 2G–2I). Finally, karyotyping of representative NORAD−/− clones revealed the presence of non-recurrent de novo structural chromosomal rearrangements (Figure S2D). These findings were documented in multiple NORAD−/− clones generated with distinct TALEN pairs, strongly suggesting that this CIN phenotype is not caused by an off-target effect of TALEN-mediated genome editing.
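The figure legends state that aneuploidy frequencies from the FISH assay were compared by chi-square test. A minimal sketch of such a comparison is given below, assuming a 2×2 table of aneuploid versus modal nuclei for two genotypes and a Pearson chi-square without continuity correction; the exact variant of the test used in the study is not specified, and the counts shown are hypothetical illustrations, not data from the study.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for a 2x2 table.

    Rows are genotypes, columns are (aneuploid, modal) nuclei counts:
        [[a, b],
         [c, d]]
    Returns (chi2 statistic, p value) for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must contain at least one count")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts (aneuploid, modal) out of 100 scored nuclei per genotype.
wild_type = (4, 96)
knockout = (25, 75)
chi2, p = chi_square_2x2(*wild_type, *knockout)
print(f"chi2 = {chi2:.2f}, p = {p:.2g}")
```

Because only chromosomes 7 and 20 are scored, a table like this undercounts aneuploid cells in both rows, which is consistent with the caveat above that the measured frequencies are likely underestimates.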

Despite the induction of NORAD after DNA damage, we observed no consistent defect in the p53-dependent G1 or G2 checkpoints in NORAD−/− cells (Figures S2A–S2C), indicating that NORAD is not required for these aspects of the DNA damage response. In the course of these experiments, however, we unexpectedly observed that 2/15 NORAD−/− clones appeared to have stably tetraploid DNA content (Figure 2C). These findings were confirmed by examining metaphase chromosome spreads. With rare exception, wild-type HCT116 cells had 45 chromosomes, consistent with the reported karyotype (Figure 2D). In contrast, tetraploid NORAD−/− cells had variable chromosome numbers, with DNA content approaching 4N. As described below, our subsequent experiments have demonstrated that the spontaneous generation of tetraploid HCT116 subclones is exceedingly rare, and we have never observed stable tetraploidization of these cells without NORAD inactivation.

(D) Parental HCT116 and representative NORAD −/− clones were karyotyped by Giemsa-trypsin-Wright staining of metaphase spreads. As reported, HCT116 cells harbored 3 major chromosomal rearrangements involving chromosomes 10, 16 and 18 (black arrowheads) (Abdel-Rahman et al., 2001; Bunz et al., 2002). Non-recurrent de novo rearrangements in NORAD −/− clones are indicated by red arrowheads.

(C) The fraction of cells in M phase in doxorubicin-treated cells, quantified from Panel A. Unlike p53 −/− cells, which lack an effective G2 checkpoint, NORAD −/− cells fail to enter M phase after DNA damage.

(B) The fraction of cells in G1 in doxorubicin-treated cells, quantified from Panel A. p53 −/− cells, which lack an effective G1 checkpoint, exit G1 after DNA damage while NORAD −/− cells accumulate in this cell cycle phase.

(A) The G1 checkpoint was assessed by treating cells with 1 μM doxorubicin for 24 hr and subsequently measuring DNA content by propidium iodide staining and flow cytometry. The G2 checkpoint was assessed by measuring the fraction of mitotic cells by phospho-histone 3 S10 (pH3) staining.

To elucidate potential functions of NORAD, we designed three pairs of transcription-activator-like effector nucleases (TALENs) that target within the first 300 nucleotides of the lncRNA to facilitate the homology-directed insertion of a transcriptional stop element and a puromycin-resistance cassette flanked by loxP sites (Figure 2A). Initially, this approach was used to inactivate NORAD in HCT116 cells, a stably diploid human cell line that has been extensively used to study the p53 pathway and the human DNA damage response. All three TALEN pairs produced correctly targeted subclones with high efficiency after puromycin selection, allowing the derivation of a large number of NORAD−/− lines (15/147 clones with homozygous insertions). Correct targeting was confirmed by Southern blotting (Figure S1C) and resulted in the expected loss of NORAD expression (Figures 2B and S1D).

(H and I) Quantification of the percentage of mitoses exhibiting the indicated mitotic errors in time-lapse-imaging experiments. Values represent the average of three independent experiments with 39–100 mitoses imaged per genotype per experiment. Error bars represent SDs. ∗ p < 0.05; ∗∗ p < 0.01, Student’s t test.

(F) NORAD −/− cells exhibit significantly elevated levels of aneuploidy. At least 100 interphase nuclei in each of three independent knockout clones were assayed for chromosome 7 and 20 using DNA FISH, and the frequency of cells exhibiting a non-modal chromosome number was scored. ∗∗ p < 0.005, chi-square test.

(E) Representative images of chromosome 7 and 20 FISH in NORAD +/+ and NORAD −/− HCT116 cells. White arrowheads highlight cells with chromosome loss or gain.

(D) Metaphase spreads of wild-type HCT116 cells and representative tetraploid and diploid NORAD −/− clones. The number in the lower right corner of each image shows the number of chromosomes present. Abnormal chromosome numbers indicated in red.

(C) Flow cytometry histograms showing DNA content, as measured by propidium iodide staining, in representative diploid and tetraploid NORAD −/− HCT116 clones.

(B) Northern blot analysis of NORAD in HCT116 clones of the indicated genotypes.

(A) NORAD was inactivated in human cell lines using custom TALEN pairs (represented as scissors) that cleave within the first 300 nucleotides of the gene, thereby stimulating the insertion of a puromycin resistance cassette (Puro R ) followed by tandem polyadenylation signals (STOP). Green triangles represent loxP sites.

Lastly, we used Cre recombinase to excise the transcriptional stop cassette in NORAD−/− cells, resulting in restored NORAD expression (Figure 3E). Rescued subclones exhibited significantly lower levels of aneuploidy than subclones derived from control-treated cells (Figure 3F). Based on these findings, we conclude that NORAD is essential for the maintenance of genomic stability in human cells.

Next, we depleted the NORAD transcript using multiple small interfering RNAs (siRNAs) and assessed chromosome content by FISH after 12 days of subsequent growth (Figures 3B and 3C). As observed following TALEN-mediated inactivation of NORAD, knockdown of this transcript resulted in significant aneuploidy. Subclones of control or NORAD knockdown cells were then produced, revealing infrequent but reproducible de novo generation of tetraploid lines derived specifically from cells transfected with NORAD-targeting siRNAs (Figure 3D).

We next performed a series of experiments to confirm that the CIN phenotype that we observed in NORAD−/− cells was specifically due to loss of this lncRNA rather than a consequence of genome manipulation with TALENs. First, we used a published TALEN pair to insert a puromycin resistance cassette at the AAVS1/PPP1R12C locus. Quantification of chromosomes 7 and 20 documented normal chromosome numbers in homozygously targeted HCT116 and BJ-5ta cells (Figures 3A and S3C). Live-cell imaging confirmed a low rate of mitotic errors (Figures 2G–2I). Furthermore, analysis of 70 subclones of HCT116 cells transfected with these TALENs revealed that all retained diploid DNA content (data not shown). Thus, neither CIN nor tetraploidy is a general property of cells that have undergone TALEN-mediated genome editing.

(F) Subclones generated from untreated or adenovirus-Cre infected NORAD−/− HCT116 cells were scored for aneuploidy as in Figures 2E and 2F. p value calculated by Student’s t test.

(E) qRT-PCR analysis of NORAD expression in NORAD +/+ and NORAD −/− HCT116 cells with or without adenovirus-Cre infection.

(D) Flow cytometry histograms showing DNA content, as measured by propidium iodide staining, in representative HCT116 subclones generated after transfection with the indicated siRNAs.

(C) Aneuploidy in siRNA-transfected HCT116 cells 12 days after siRNA transfection, assayed as in Figures 2E and 2F. At least 200 nuclei were scored per condition. p value calculated by chi-square test.

(B) qRT-PCR analysis of NORAD expression, relative to 18S rRNA, in HCT116 cells 48 hr after transfection with control (siNT) or NORAD-targeting siRNAs.

(A) Insertion of a puromycin resistance cassette at the AAVS1/PPP1R12C locus was performed using a published TALEN pair, and the frequency of aneuploidy in homozygously targeted HCT116 clones was assessed using DNA FISH as in Figures 2E and 2F. n.s., not significant (chi-square test).

It has been proposed that CIN can result from whole genome duplication events that produce a transient tetraploid state that subsequently resolves into an unstable pseudo-diploid state. Therefore, since we recovered both tetraploid and diploid NORAD−/− clones that each exhibited CIN, it was unclear whether loss of NORAD primarily causes tetraploidization, which then results in CIN, or whether NORAD directly regulates both ploidy and chromosomal stability. The fact that CIN can be rescued by NORAD reactivation in diploid knockout cells (Figures 3E and 3F) supports the latter possibility. If CIN were due to a prior, now resolved, tetraploid state, restoration of NORAD should no longer have the capacity to revert genomic instability in diploid cells. Furthermore, if the CIN phenotype of NORAD−/− cells were solely a secondary consequence of polyploidization, tetraploid knockout cells should revert to a diploid state at a measurable frequency. However, 32/32 analyzed subclones derived from tetraploid NORAD−/− cells retained tetraploid DNA content. In contrast, ∼10% of subclones of diploid NORAD−/− cells gained tetraploid DNA content (data not shown). These results support a primary role for NORAD in regulating both ploidy and chromosomal stability in diploid cells.

NORAD Is a Cytoplasmic Multivalent PUMILIO Binding Platform

Figure S4. Domain Structure and Subcellular Localization of NORAD, Related to Figure 4

(A) Dot plot of nucleotide identity generated by aligning the NORAD sequence to itself using BLAST (discontinuous megablast; http://blast.ncbi.nlm.nih.gov/), revealing multiple repetitive regions within the NORAD sequence.

(B) Schematic of the NORAD transcript, showing the locations of the repetitive regions, termed NORAD domains (ND1–5). The mammalian conservation plot was obtained from the UCSC Genome Browser (PhastCons track).

(C) NORAD fragments used for in vitro transcription and RNA pull-down experiments.

(D) qRT-PCR analysis of NORAD and cytoplasmic (ACTB) and nuclear (NEAT1) control transcripts in subcellular fractions of HCT116 cells.

(E) Representative NORAD single-molecule RNA FISH images of HCT116 cells of the indicated genotypes.

Alignment of the NORAD sequence to itself using the BLAST algorithm uncovered a repetitive ∼400 nucleotide domain that recurs five times in the transcript (Figures S4A and S4B). We termed this sequence the NORAD domain (ND1–ND5). Notably, a large fraction of the conserved sequence within NORAD is encompassed within these repetitive regions. Furthermore, subcellular fractionation (Figure S4D) and single-molecule RNA FISH (Figure S4E) demonstrated a nearly exclusively cytoplasmic localization of the NORAD RNA. Based on these results, we hypothesized that the NORAD domain represents a binding platform through which this lncRNA is able to assemble a multivalent cytoplasmic ribonucleoprotein (RNP) complex.
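The idea behind a self-alignment dot plot can be illustrated with a short sketch. The code below is not the BLAST discontinuous megablast procedure used in the study; it assumes a much simpler stand-in, exact k-mer matching, which cannot tolerate the mismatches that diverged repeat copies would carry. The function name, the value of `k`, and the toy sequence are all hypothetical.

```python
from collections import defaultdict

def self_dot_plot(seq, k=8):
    """Return sorted (i, j) pairs, i < j, where the k-mer starting at i recurs at j.

    Each pair is one off-diagonal dot in a self-alignment dot plot; runs of
    dots along a diagonal indicate a repeated region.
    """
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    hits = []
    for starts in positions.values():
        # Every pair of occurrences of the same k-mer is one dot.
        for x in range(len(starts)):
            for y in range(x + 1, len(starts)):
                hits.append((starts[x], starts[y]))
    return sorted(hits)

# Toy sequence: an 8-nt unit, a 2-nt spacer, then the same unit again.
toy = "ACGTTGCA" + "TT" + "ACGTTGCA"
print(self_dot_plot(toy, k=8))
```

In a real transcript, a repeated domain would appear as several parallel diagonals, one for each pair of copies.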

Since RNAs and proteins are known to reassociate after cell lysis (Mili and Steitz, 2004), we took advantage of a previously published photoactivatable ribonucleoside-enhanced crosslinking and immunoprecipitation (PAR-CLIP) dataset, generated with human FLAG-tagged PUM2 (Hafner et al., 2010), that detects specific PUM2:RNA binding interactions that occur in intact cells. In total, 7,523 PUM2 binding sites, occurring in ∼3,000 transcripts, were identified in this study. Five PUM2 binding sites within NORAD were reported, with a site in ND4 representing the eleventh most frequently crosslinked site transcriptome-wide (http://www.mirz.unibas.ch/restricted/clipdata/RESULTS/PUM2/PUM2.html). These results further document direct PUM2:NORAD interactions.

In the course of these studies, we noted the presence of a large number of sequences with homology to NORAD distributed throughout the human genome, with at least 43 genomic loci that exhibit 84%–98% identity to NORAD over at least a 500 bp span. Many of these homologous sequences have features of processed pseudogenes, including target site duplications and terminal poly(A) sequences (Kazazian, 2014). Analysis of Illumina BodyMap 2.0 RNA-seq data revealed little evidence of transcription of most of these loci (data not shown), with the notable exception of a nearly full-length NORAD-related sequence on chromosome 6, which is annotated in RefSeq as HCG11. However, HCG11 has an average fragments per kilobase of transcript per million mapped reads (FPKM) of 2.0 in BodyMap data compared to an average FPKM of 31.8 for NORAD. Accordingly, use of sequence-specific TaqMan assays demonstrated that HCG11 abundance is >200-fold lower than NORAD abundance in HCT116 cells (data not shown). Thus, at present, there is no evidence that any of these NORAD-related sequences are functional in human cells, although it remains possible that some may be transcribed at biologically relevant levels in specific tissues or cell types. Importantly, our Southern blot strategy (Figure S1C) confirmed that these NORAD-related sequences did not confound our genome editing approach, since all analyzed NORAD−/− clones had single-copy insertions of the lox-STOP-lox cassette at the desired site.
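The FPKM unit quoted above follows directly from its definition. The sketch below shows the basic normalization only; Cufflinks, which was used for the BodyMap analysis, estimates abundances with a statistical model rather than this simple ratio, so the function is an illustration of the unit and not of that software. The function name and the example counts are hypothetical.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    # Divide by length in kb (length / 1e3) and library size in millions (total / 1e6).
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)

# Hypothetical example: 1,000 fragments on a 2 kb transcript in a 10-million-fragment library.
print(fpkm(1000, 2000, 10_000_000))

# Ratio of the average BodyMap FPKM values reported in the text (NORAD 31.8, HCG11 2.0).
print(31.8 / 2.0)
```

The BodyMap averages differ by roughly 16-fold, whereas the TaqMan measurement in HCT116 cells indicates a difference of more than 200-fold; the two numbers come from different samples and assays and are not expected to agree.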

The presence of numerous NORAD pseudogenes likely confounded the mapping of sequencing reads in the prior PAR-CLIP study (Hafner et al., 2010), since reads that mapped to multiple genomic locations were excluded from further analyses. We therefore reanalyzed these data, first extracting all reads that map to NORAD prior to transcriptome-wide mapping. Remarkably, this revealed that NORAD was the most highly represented PUM2 CLIP target by a large margin (Figure 4B). To complement these data, we performed PAR-CLIP on endogenous PUM2 in NORAD+/+ and NORAD−/− HCT116 cells. Recovery of PUM2 was less efficient in this experiment than in the prior study, which used heterologous expression of epitope-tagged PUM2, resulting in less comprehensive transcriptome-wide PUM2 target identification (Table S2). Nevertheless, NORAD was again the most highly represented target of endogenous PUM2 and, as expected, was not detected in NORAD−/− cells (Figure 4C; Table S2), demonstrating that the NORAD pseudogenes do not confound our modified mapping approach.
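The effect of extracting NORAD reads before applying a multi-mapping filter can be sketched with a toy example. The code below assumes each read is already annotated with the set of loci it aligns to; it does not reproduce the actual alignment pipeline, and the read identifiers, the `pseudogene_chr1`, `GENE_X`, and `GENE_Y` labels, and both function names are hypothetical.

```python
from collections import Counter

def count_unique_only(alignments):
    """Count reads per locus, discarding reads that align to more than one locus."""
    counts = Counter()
    for loci in alignments.values():
        if len(loci) == 1:
            counts[next(iter(loci))] += 1
    return counts

def count_with_priority(alignments, priority_locus="NORAD"):
    """Assign any read that aligns to the priority locus to that locus first,
    then apply the unique-mapping filter to the remaining reads."""
    counts = Counter()
    for loci in alignments.values():
        if priority_locus in loci:
            counts[priority_locus] += 1
        elif len(loci) == 1:
            counts[next(iter(loci))] += 1
    return counts

# Toy alignments: read id -> set of loci the read aligns to (all hypothetical).
reads = {
    "r1": {"NORAD"},
    "r2": {"NORAD", "HCG11"},
    "r3": {"NORAD", "pseudogene_chr1"},
    "r4": {"GENE_X"},
    "r5": {"GENE_X", "GENE_Y"},
}
print(count_unique_only(reads))
print(count_with_priority(reads))
```

A priority rule of this kind could in principle credit pseudogene-derived reads to NORAD, which is why the absence of NORAD signal in knockout cells is an informative control for the modified mapping approach.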