Significance One of the most highly debated questions in the field of transcriptomics is the functionality of antisense transcripts. Are these transcripts merely transcriptional noise and a byproduct of the leakiness of transcriptional repression, or are they functional? Antisense RNAs are being ubiquitously reported, but their functionality remains elusive. Here we report a high-throughput approach to enrich antisense RNAs that are in a double-stranded form with their cognate sense RNAs and thus in a functional complex. This has led to the identification of more than 300 RNase III-dependent potentially functional antisense RNAs in Escherichia coli. These findings reveal a clear picture of the magnitude and degree of functionality of this mostly hidden class of transcripts.

Abstract Advances in high-throughput transcriptome analyses have revealed hundreds of antisense RNAs (asRNAs) for many bacteria, although few have been characterized, and the number of functional asRNAs remains unknown. We have developed a genome-wide high-throughput method to identify functional asRNAs in vivo. Most mechanisms of gene regulation via asRNAs require an RNA–RNA interaction with the target RNA, and we hypothesized that a functional asRNA would be found in double-stranded form (dsRNA), duplexed with its cognate RNA in a single cell. We developed a method of isolating dsRNAs from total RNA by immunoprecipitation with a dsRNA-specific antibody. Total RNA and immunoprecipitated dsRNA from Escherichia coli RNase III WT and mutant strains were deep-sequenced. A statistical model was applied to filter for biologically relevant dsRNA regions, which were subsequently categorized by location relative to annotated genes. A total of 316 potentially functional asRNAs were identified in the RNase III mutant strain; they are encoded primarily opposite the 5′ ends of transcripts, but are also found opposite ncRNAs, gene junctions, and the 3′ ends. A total of 21 sense/antisense RNA pairs identified in dsRNAs were confirmed by Northern blot analyses. Most of the RNA steady-state levels were higher or detectable only in the RNase III mutant strain. Taken together, our data indicate that a significant amount of dsRNA is formed in the cell, that RNase III degrades or processes these dsRNAs, and that dsRNA plays a major role in gene regulation in E. coli.

The advent and development of high-throughput sequencing technologies has uncovered the presence of widespread antisense transcription in many bacteria, with the number of annotated genes associated with antisense RNA (asRNA) differing greatly among bacterial species (1, 2). asRNAs are encoded on the DNA strand opposite an annotated gene and overlap a portion of a gene or the entire gene, or span multiple genes with perfect complementarity. asRNAs range in size from tens to thousands of nucleotides. Although numerous chromosomally encoded asRNAs have been identified, few have been confirmed by traditional methods or functionally characterized. Raghavan et al. (3) reported that few asRNAs are conserved between Escherichia coli and Salmonella enterica, and that the predicted promoter sequences of the asRNAs are not conserved between these species, suggesting that most asRNA transcripts are products of spurious transcription and are not biologically functional RNAs.

The majority of functionally characterized asRNAs are found on plasmids, phages, and transposons (4, 5). The mode of regulation by asRNAs can be classified according to molecular mechanism as transcription interference, transcription attenuation, alteration of transcript stability, and translation inhibition (1, 2, 6). With the exception of transcription interference, a physical RNA–RNA interaction between the sense RNAs and asRNAs is necessary for all of these mechanisms, requiring that both RNAs be expressed in the same cell at the same time. The lengths and complete complementarity of the sense RNAs and asRNAs can lead to long double-stranded RNAs (dsRNAs).

Ribonuclease III (RNase III) is a highly conserved endoribonuclease that specifically cleaves dsRNAs and regulates gene expression in E. coli and other bacteria (7–10). Lasa et al. (9) recently demonstrated that RNase III plays a central role in a type of antisense regulation specific for Gram-positive bacteria. Deep sequencing of both short and long RNA fractions in WT and RNase III mutant strains detected a genome-wide RNase III-dependent processing of overlapping transcripts into short, 22-nt RNAs. Three-quarters of sense RNAs from annotated genes appear to be processed via RNase III-dependent asRNA regulation in Staphylococcus aureus. Lasa et al. reported that several other Gram-positive bacteria show a similar pattern of RNase III-dependent short RNAs. However, S. enterica, the sole Gram-negative species tested in the study, did not exhibit the same pattern of short sense and antisense complementary RNAs as the Gram-positive bacteria, suggesting that the mechanism may not exist or may differ in Gram-negative bacteria. In agreement with the foregoing findings, deep sequencing of RNA coimmunoprecipitated with WT or cleavage mutants of RNase III in S. aureus was found to capture low-abundance asRNAs that cover 44% of annotated genes (11).

In the present study, we identified functional asRNAs using an in vivo approach in E. coli. We hypothesized that a subset of functional asRNAs would be in dsRNAs, because an RNA–RNA interaction is required for most mechanisms of regulation via known asRNAs. Thus, we developed a method to isolate dsRNAs from total RNA by immunoprecipitation with a monoclonal antibody specific for dsRNA. We expected that dsRNAs would be more abundant in an RNase III mutant strain, and thus we deep-sequenced cDNA libraries of the total RNA (input) and immunoprecipitated dsRNA (IP) from WT and RNase III mutant strains. We have identified and confirmed the expression of numerous asRNAs that are potentially functional and have developed a methodology that is broadly applicable for identifying functional asRNAs in both eukaryotic and prokaryotic organisms.

Discussion The reported estimates of asRNA transcripts vary greatly among bacteria, from 1% to 75% of annotated genes, and recent reports suggest that some, if not most, antisense transcripts are nonfunctional products of pervasive transcription (1–3, 9, 11). In contrast, we have identified a subset of asRNAs that are physiologically relevant and base pair with their complementary target RNAs, forming dsRNAs. We have demonstrated that the dsRNAs are processed by RNase III and thus are more abundant in an RNase III mutant strain. The RNase III dependence of the transcripts, together with the in vivo immunoprecipitation of dsRNAs, indicates that the IP-dsRNAs that we have identified are dsRNAs in vivo. Furthermore, our data indicate that in the absence of an active RNase III, the dsRNA regions of transcripts are stable and more abundant than the single-stranded regions of the same transcripts. We hypothesize that other RNases degrade and process the single-stranded regions of the transcripts, whereas the dsRNA regions remain in the mutant strain. In contrast, in the WT strain, the dsRNA regions are mostly undetectable or similar in abundance to the single-stranded regions of transcripts.

To identify a functional subset of asRNAs, we developed an immunoprecipitation method to isolate dsRNA in both WT and RNase III mutant strains and used stringent statistical modeling to identify IP-dsRNAs. Initial inspection of the data revealed two coverage patterns among IP-dsRNAs; thus, we used two models to assign scores to the potential dsRNA regions, resulting in two classes of IP-dsRNAs (Fig. 2). Class I IP-dsRNAs are formed by two transcripts that are differentially expressed in the input libraries. The sense strand is highly abundant, whereas the antisense strand is absent or hardly detectable. We expect that under certain environmental conditions, the asRNA may be up-regulated and repress expression of the sense RNA. On the other hand, class II IP-dsRNAs are formed by transcripts expressed equally on both strands in the input libraries, implying that they are coregulated.

Applying these two models to our data, we have identified many previously reported types of functional asRNAs, including asRNAs transcribed opposite noncoding sRNAs and asRNAs resulting from overlapping long 5′ UTRs of divergently transcribed genes. In addition, we have identified a large category of novel chromosomal asRNAs transcribed opposite the 5′/intergenic ends of nondivergently transcribed genes and gene junctions within operons. These cis-asRNAs appear to be ncRNAs that do not code for any annotated ORFs.

As proof of principle, we identified 9 of the 18 known and characterized dsRNAs in E. coli. Six of the known dsRNAs that we did not identify in our analyses had very low coverage in both the input and the IP libraries. The other three were not identified as IP-dsRNAs because their plus and minus strand coverages were too different. Although our data suggest that a dsRNA could have been pulled down in our experiment in these regions, given that reads were found in both strands, these regions did not stand sufficiently apart from the background to be identified as IP-dsRNAs.

Most of the known RNA-based mechanisms of gene regulation, including trans-acting sRNAs, influence regulation at the 5′ end of genes, likely because it is energetically favorable. In agreement with this, 50% of the IP-dsRNA regions are located in the 5′ UTR of genes, whereas only 0.5% of the IP-dsRNAs are located in the 3′ UTR, suggesting that the main mechanism of asRNA gene regulation via dsRNA intermediates occurs at the 5′ end of transcripts.

The genomic organization of genetic elements has long been thought to play a role in the coordinate regulation of genes, whether coexpressed or differentially expressed. Coordinate regulation of overlapping transcripts from divergently transcribed genes has been described in S. aureus and Listeria monocytogenes (11, 25). Lasa et al. (9) reported involvement of RNase III in the formation of short RNA fragments (∼22 nt) mapping to overlapping transcripts. They observed similar expression patterns in other Gram-positive bacteria but not in S. enterica, indicating a Gram-positive–specific mechanism, possibly owing to different collections of RNases, helicases, and other RNA-binding proteins. We have identified RNase III-dependent IP-dsRNAs localizing to overlapping 5′ UTRs of numerous divergent genes, suggesting a basis for their coregulation. Because our approach does not allow for reliable identification of dsRNA fragments shorter than 40 bp, we cannot exclude the possibility of RNase III-processed short RNA fragments in E. coli. The molecular mechanism involving asRNAs, formation of dsRNAs, and RNase III remains to be elucidated.

Recent transcriptome-wide studies of L. monocytogenes identified a dual functional group of long antisense transcripts (lasRNAs), termed excludons, which negatively regulate one ORF via an antisense mechanism while simultaneously contributing to the transcription of adjacent, divergently transcribed ORFs. Some of the divergent and full ORF class I differentially expressed asRNAs may be excludons. Experimental validation of specific examples is needed to determine whether the excludon paradigm of asRNA-mediated gene regulation occurs in E. coli.

asRNAs transcribed opposite noncoding sRNAs (anti-ncRNAs) also have been reported previously (11), but few examples have been functionally characterized or validated by traditional methods, such as Northern blot analysis or quantitative RT-PCR. We have identified 23 anti-ncRNAs and validated the sequencing data by Northern blot analysis for seven examples. All of the anti-ncRNAs that we tested were regulated by RNase III and were detected by Northern blot analyses only in the rnc105 mutant strain. In a recent study, anti-ncRNAs were identified through an RNase III coimmunoprecipitation (11); however, the authors did not detect many of the anti-ncRNAs by Northern blot analysis, and they suggested that these ncRNAs may be a result of pervasive transcription. In contrast to that finding, our data suggest that some anti-ncRNAs are biologically relevant and base pair with the ncRNAs. We hypothesize that the levels of anti-ncRNAs may increase and regulate the levels of the ncRNAs under certain stress or recovery conditions.

We also have identified eight dsRNA regions overlapping or neighboring phage and transposase genes on the chromosome. Most cis-asRNAs were first identified and characterized in plasmids, phages, and transposons and are responsible for the repression of these elements; thus, the identification of asRNA transcripts opposite transposase and phage genes was not surprising. However, most of the observed dsRNA regions associated with transposase genes are found downstream of the gene in the intergenic space. We confirmed two such examples by Northern blot analyses (SI Appendix, Fig. S7). tRNA, tmRNA, and several sRNAs, located adjacent to attachment and integration sites of phages or transposons, have been proposed to be acquired through horizontal gene transfer (26). The small IP-dsRNA regions observed downstream of the phage and transposase genes may be novel horizontally acquired sRNAs. In addition, the sRNAs identified in IP-dsRNAs may have been associated with transposases when they entered the genome.

Surprisingly, we found asRNAs encoded opposite five type II toxin-antitoxin (TA) systems in E. coli (SI Appendix and Dataset S1). The type II TA systems consist of a toxin and an antitoxin protein expressed from two tandem genes. The toxin and antitoxin form a stable protein complex that results in inhibition of the toxin. In contrast, type I TA systems regulate synthesis of the toxin by inhibiting its efficient translation via an asRNA. The type II TA systems have not been shown to include such an asRNA regulation mechanism, however. Northern blot analyses confirmed the presence of the asRNA for both mqsR and yoeB toxins (Fig. 5 and SI Appendix, Fig. S8). Our identification of asRNAs encoded opposite the type II TA systems suggests an additional level of regulation. The mqsR/mqsA and yefM/yoeB TA steady-state transcripts are more abundant in the rnc105 mutant strain, suggesting that RNase III plays a role in their regulation. However, many TA systems are activated by cell stress. The absence of an active RNase III may stress the cell, and thus the increased number of TA transcripts observed may be attributed to increased transcription, rather than to derepression via RNase III. Further mechanistic analyses are needed to understand the role of asRNAs in regulation of type II TA systems.

Finally, the amount of dsRNA identified in our study suggests that many sense/antisense pairs of RNAs in the cell base pair and form dsRNAs. The binding kinetics of two folded RNAs is worth considering. The ability of the two RNAs to form a dsRNA depends on several factors, including the individual secondary structures of both RNAs (accessibility of certain nucleotides), the relative amounts of the two RNAs in the cell, the presence of ribosomes on the mRNA occluding nucleotides, and the presence of proteins that may interact with both RNAs. The RNA chaperone Hfq is required for many trans-encoded sRNAs that regulate target mRNA through partially complementary base pairing. Recently, Ross et al. (27) demonstrated that Hfq also regulates the binding of a cis-asRNA RNA-OUT with RNA-IN of the Tn10/IS10 transposition system in E. coli. The authors suggested that Hfq may be involved in regulating other asRNA-dependent gene regulation systems. If Hfq does not play a role in facilitating antisense/sense RNA pairing, then another RNA chaperone likely does so.

The majority of functionally characterized regulatory RNAs are differentially expressed and regulated by environmental signals. We expect that applying our protocol to bacteria grown under different environmental conditions will identify more functional asRNAs. The IP-dsRNA sequences identified in this study could regulate gene expression by several known antisense mechanisms, including transcript stabilization or destabilization, ribosome binding site accessibility, and transcription attenuation. We hypothesize that RNase III plays a direct role in antisense gene regulation by processing or degrading the dsRNA region, thereby altering the stability, structure, or RBS availability of the mRNA. In addition, we postulate that the dsRNA region alone may alter the mRNA via the same mechanisms, but that RNase III degrades the complex only after the regulation has occurred. Regardless of the role of RNase III, our data suggest that asRNAs via dsRNA constitute a broad mechanism of gene regulation.
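
The distinction drawn in this Discussion between class I IP-dsRNAs (strands differentially expressed in the input libraries) and class II IP-dsRNAs (strands expressed about equally) can be illustrated with a minimal Python sketch. This is not the scoring model used in this study, which is described in SI Appendix; the function name `classify_input_pattern`, the 4-fold ratio cutoff, the pseudocount, and the minimum coverage are hypothetical choices made only for illustration.

```python
def classify_input_pattern(sense_cov, antisense_cov,
                           ratio_cutoff=4.0, min_cov=1.0, pseudocount=1.0):
    """Label a candidate region by its strand balance in the input library.

    sense_cov and antisense_cov are mean per-base coverages of the two
    strands. All thresholds are illustrative assumptions, not values
    taken from the study.
    """
    high = max(sense_cov, antisense_cov)
    low = min(sense_cov, antisense_cov)
    if high < min_cov:
        # Neither strand is expressed well enough to judge the pattern.
        return "low coverage"
    # The pseudocount avoids division by zero when one strand is absent.
    ratio = (high + pseudocount) / (low + pseudocount)
    if ratio >= ratio_cutoff:
        return "class I-like"   # one strand abundant, the other scarce
    return "class II-like"      # both strands at similar levels


if __name__ == "__main__":
    print(classify_input_pattern(200.0, 3.0))
    print(classify_input_pattern(50.0, 45.0))
```

A single ratio cutoff is the simplest way to separate the two coverage patterns; a statistical model such as the one referred to in the text would additionally account for background and library depth.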

Materials and Methods Bacterial Strains and RNA Isolation. E. coli strains SDF204 (W3110rnc+ TD1-17::Tn10) and SDF205 (W3110rnc105 TD1-17::Tn10) were grown in LB medium to log phase (OD600 ∼0.5). For Northern blot analyses, total RNA was isolated using a hot phenol protocol described by Jahn et al. (28), unless stated otherwise. Total RNA for immunodot blots and immunoprecipitations was isolated following the hot phenol protocol described above, with several modifications to minimize dsRNA artifacts, as described in detail in SI Appendix, Materials and Methods.

Immunoprecipitation Assays. Total RNA or in vitro transcribed RNAs were incubated with J2 monoclonal anti-dsRNA antibodies at different ratios at 4 °C overnight in 1× PBS and 0.1% Tween 20 with 2 units of RNasin (Promega). Dynabeads Protein A (Invitrogen) were used to immunoprecipitate the antibody–RNA complexes. The Dynabeads were prepared as suggested by the manufacturer. The antibody–RNA solutions were then added to the beads, gently mixed, and incubated for 10 min at room temperature while rotating. The tubes were then moved to a magnetic stand, after which the supernatant was removed. The beads were washed four times and then resuspended with 1× PBS and 0.1% Tween 20. The beads with the antibody–RNA complexes bound were then subjected to phenol/chloroform/isoamyl alcohol (25:24:1) extraction using a Phase Lock Gel tube. The aqueous phase was ethanol-precipitated and analyzed either by SDS/PAGE and SYBR Safe staining or with an Agilent 2100 Bioanalyzer on a picoRNA chip.

cDNA Library Preparation. Directional (strand-specific) RNA-seq cDNA libraries were constructed following a ligation-based protocol described in detail in SI Appendix, Materials and Methods.

Deep-Sequencing Analyses. Reads were mapped to the E. coli K12 genome using Bowtie 2 (29) in local mode with default parameters. Scores for the class I and class II models were determined as described in detail, along with the remainder of the analysis, in SI Appendix, Materials and Methods.
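
As an illustration of how strand-specific read coverage can point to candidate dsRNA regions, the following minimal Python sketch scans per-base coverage of the plus and minus strands and reports stretches covered on both. It is a simplified stand-in, not the scoring models used in this study: the function name `find_two_strand_regions` and the coverage threshold of 5 reads are hypothetical, and the default minimum length of 40 is chosen only to mirror the statement that dsRNA fragments shorter than 40 bp could not be reliably identified.

```python
def find_two_strand_regions(plus_cov, minus_cov, min_cov=5, min_len=40):
    """Return regions where both strands reach min_cov for at least min_len bases.

    plus_cov and minus_cov are equal-length sequences of per-base read
    coverage. Regions are returned as 0-based, half-open (start, end)
    tuples. min_cov is an illustrative assumption.
    """
    if len(plus_cov) != len(minus_cov):
        raise ValueError("coverage vectors must have the same length")

    regions = []
    start = None  # start of the current two-strand run, if any
    for pos, (plus, minus) in enumerate(zip(plus_cov, minus_cov)):
        covered = plus >= min_cov and minus >= min_cov
        if covered and start is None:
            start = pos
        elif not covered and start is not None:
            if pos - start >= min_len:
                regions.append((start, pos))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(plus_cov) - start >= min_len:
        regions.append((start, len(plus_cov)))
    return regions


if __name__ == "__main__":
    # Toy example: plus strand covered at positions 10-59, minus at 15-54.
    plus = [0] * 10 + [10] * 50 + [0] * 10
    minus = [0] * 15 + [8] * 40 + [0] * 15
    print(find_two_strand_regions(plus, minus))
```

In practice such candidate regions would still need to be compared against the input library and the background, as the two-strand overlap alone does not distinguish enriched dsRNA from coincidental coverage.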

Acknowledgments We thank Scott Samuels, Vitaly Sedlyarov, and members of the Schroeder laboratory for thoughtful and critical readings of the manuscript; Andreas Sommer, Arnt von Haeseler, Michael Wolfinger, Hakim Tafer, and members of the Schroeder laboratory for useful discussions; and Johanna Stranner for excellent technical assistance. This work was supported by the Austrian Science Fund (Grants FWF I538-B12, F4301, and F4308) and the University of Vienna.

Footnotes Author contributions: M.L., I.B., and N.T. designed research; M.L. and N.T. performed research; M.L., B.Z., I.B., and N.T. analyzed data; and M.L., B.Z., I.B., and R.S. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

Data deposition: Sequences have been deposited at the National Center for Biotechnology Information Sequence Read Archive (Study accession no. SRP028119: experiment accession nos. SRX326854, SRX326853, SRX326852, and SRX326842).

See Commentary on page 2868.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1315974111/-/DCSupplemental.