When the spike protein of SARS-CoV-2 binds to a receptor on the host cell, the virus enters the cell and sheds its envelope, releasing the genomic RNA into the cytoplasm. ORF1a and ORF1b, carried on the genomic RNA, are translated into the pp1a and pp1ab proteins, respectively. pp1a and pp1ab are cleaved by proteases to make a total of 16 nonstructural proteins. Some nonstructural proteins form a replication/transcription complex (RNA-dependent RNA polymerase, RdRp), which uses the (+) strand genomic RNA as a template. The (+) strand genomic RNA produced through replication becomes the genome of the new virus particle. Subgenomic RNAs produced through transcription are translated into the structural proteins (S: spike protein, E: envelope protein, M: membrane protein, and N: nucleocapsid protein) that form a viral particle. The spike, envelope and membrane proteins enter the endoplasmic reticulum, and the nucleocapsid protein combines with the (+) strand genomic RNA to become a nucleoprotein complex. They merge into the complete virus particle in the endoplasmic reticulum-Golgi apparatus compartment, and are released to the extracellular space through the Golgi apparatus and vesicles. Credit: IBS

Jean and Peter Medawar wrote in 1977 that a virus is "simply a piece of bad news wrapped up in proteins." The 'bad news' in the SARS-CoV-2 case is that the new coronavirus carries its mysterious genome in the form of a very long ribonucleic acid (RNA) molecule. Grappling with the COVID-19 pandemic, the world has seemed lost, with little sense of what this coronavirus (SARS-CoV-2) is composed of. Being an RNA virus, SARS-CoV-2 enters host cells, replicates its genomic RNA, and produces many smaller RNAs (called 'subgenomic RNAs'). These subgenomic RNAs are used to synthesize the various proteins (spikes, envelopes, etc.) required to start a new generation of SARS-CoV-2. Thus, the smaller RNAs make good targets for disrupting the new coronavirus's conquest of our immune system. Though recent studies reported the sequence of the RNA genome, they only predicted where its genes might be, leaving the picture incomplete.

Led by Professors Kim V. Narry and Chang Hyeshik, the research team of the Center for RNA Research within the Institute for Basic Science (IBS), South Korea, succeeded in dissecting the architecture of the SARS-CoV-2 RNA genome, in collaboration with the Korea National Institute of Health (KNIH) within the Korea Centers for Disease Control & Prevention (KCDC). The researchers experimentally confirmed the predicted subgenomic RNAs that are in turn translated into viral proteins. Furthermore, they analyzed the sequence information of each RNA and revealed exactly where the genes are located on the genomic RNA. "Not only detailing the structure of SARS-CoV-2, we also discovered numerous new RNAs and multiple unknown chemical modifications on the viral RNAs. Our work provides a high-resolution map of SARS-CoV-2. This map will help understand how the virus replicates and how it escapes the human defense system," explains Professor Kim V. Narry, the corresponding author of the study.

It was previously predicted that the virus produces 10 subgenomic RNAs. However, the research team confirmed that only nine of them actually exist, invalidating the remaining one. The researchers also found dozens of previously unknown subgenomic RNAs, which arise from RNA fusion and deletion events. "Though it requires further investigation, these molecular events may lead to the relatively rapid evolution of coronavirus. Moreover, we find multiple unknown chemical modifications on the viral RNAs. It is unclear yet what these modifications do, but a possibility is that they may assist the virus to avoid attack from the host," says Prof. Kim.

SARS-CoV-2 RNAs have been thought to consist of ORF1a, ORF1b, ORFS, ORFE, ORFM, ORFN, ORF3a, ORF6, ORF7a, ORF7b, ORF8, and ORF10. In this study, all RNAs except ORF10 were experimentally validated. The prediction that ORF10 exists appears to be wrong. Nine subgenomic RNAs (S, E, M, N, 3a, 6, 7a, 7b, 8) are indeed transcribed from the genomic RNA. Among them, the S, E, M, and N RNAs are translated into their respective proteins, which form the structure of the virus particle (S: spike protein, E: envelope protein, M: membrane protein, and N: nucleocapsid protein). Credit: IBS
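As a rough illustration of the comparison described above, the predicted and experimentally supported subgenomic RNAs can be treated as two sets and compared. The Python sketch below is not the team's analysis code; the variable and function names are invented for this example, and the two sets simply restate the names given in the caption.

```python
# Illustrative only: names restate the article's caption, not the study's data files.

# Subgenomic RNAs that earlier annotations predicted (10 in total).
PREDICTED_SUBGENOMIC = {"S", "E", "M", "N", "3a", "6", "7a", "7b", "8", "10"}

# Subgenomic RNAs the article reports as experimentally validated (9 in total).
VALIDATED_SUBGENOMIC = {"S", "E", "M", "N", "3a", "6", "7a", "7b", "8"}

# The four RNAs translated into proteins that build the virus particle.
STRUCTURAL_PROTEINS = {
    "S": "spike protein",
    "E": "envelope protein",
    "M": "membrane protein",
    "N": "nucleocapsid protein",
}


def unsupported_predictions(predicted, validated):
    """Return predicted RNAs with no experimental support, sorted by name."""
    return sorted(predicted - validated)


def remaining_validated(validated, structural):
    """Return validated RNAs that are not among the structural genes."""
    return sorted(validated - set(structural))


if __name__ == "__main__":
    print("Predicted but not validated:",
          unsupported_predictions(PREDICTED_SUBGENOMIC, VALIDATED_SUBGENOMIC))
    print("Validated, not structural:",
          remaining_validated(VALIDATED_SUBGENOMIC, STRUCTURAL_PROTEINS))
```

The set difference singles out ORF10 as the one prediction without experimental support, matching the caption.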

The research team suggests that modified RNAs may have new properties that differ from those of unmodified RNAs, even though they carry the same genetic information in terms of RNA base sequence. They believe that if they can work out these unknown characteristics, the findings may offer new clues for combating the new coronavirus. The newly discovered chemical modifications will also help in understanding the life cycle of the virus.

Behind the success of the study is the research team's pairing of two complementary sequencing techniques: DNA nanoball sequencing and nanopore direct RNA sequencing. Nanopore direct RNA sequencing allows direct analysis of the entire long viral RNA without fragmentation. Conventional RNA sequencing methods usually require a step-by-step process of cutting the RNA and converting it to DNA before it can be read. DNA nanoball sequencing, meanwhile, can read only short fragments, but has the advantage of analyzing a large number of sequences with high accuracy. The two techniques turned out to be highly complementary for analyzing the viral RNAs.
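One way to see why reading a whole RNA in a single pass matters is a toy calculation. The Python sketch below assumes a transcript with a single point of interest, such as the boundary where two RNA segments were joined, and asks what fraction of randomly placed reads would cover both sides of it. The function name, the uniform placement of reads, and all lengths are assumptions made for illustration; none of the numbers come from the study, and the model says nothing about read accuracy, which is where the article credits short reads.

```python
def junction_spanning_fraction(transcript_len, junction, read_len):
    """Fraction of possible read start positions whose read covers both
    bases flanking `junction`, i.e. indices junction - 1 and junction.

    Toy model: every start position is treated as equally likely.
    """
    if not 0 < junction < transcript_len:
        raise ValueError("junction must lie strictly inside the transcript")
    if read_len < 2:
        raise ValueError("a read needs at least 2 bases to span a junction")
    if read_len >= transcript_len:
        # A read covering the whole transcript always contains the junction.
        return 1.0

    n_starts = transcript_len - read_len + 1
    # Earliest start whose read still reaches index `junction`.
    first = max(0, junction - read_len + 1)
    # Latest start that still includes index `junction - 1`.
    last = min(junction - 1, transcript_len - read_len)
    spanning = max(0, last - first + 1)
    return spanning / n_starts


if __name__ == "__main__":
    # Hypothetical lengths chosen only to make the contrast visible.
    transcript_len, junction = 1000, 70
    for read_len in (100, 1000):
        frac = junction_spanning_fraction(transcript_len, junction, read_len)
        print(f"read length {read_len}: {frac:.3f} of reads span the junction")
```

In this simplified picture, a full-length read always contains the junction, while only a small share of short fragments happen to land across it. Short reads make up for this with sheer numbers and accuracy, which is the trade-off the paragraph above describes.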

Modification levels differ between RNA transcripts; the most frequent modification site is marked by a red arrowhead. Credit: IBS

"Now we have secured a high resolution gene map of the new coronavirus that guides us to find each bit of genes on all of the total SARS-CoV-2 RNAs (transcriptome) and all modification RNAs (epitranscriptome). It is time to explore the functions of the newly discovered genes and the mechanism underlying viral gene fusion. We also have to work on the RNA modifications to see if they play a role in virus replication and immune response. We firmly believe that our study will contribute to the development of diagnostics and therapeutics to combat the virus more effectively," notes Professor Kim V. Narry.


More information: Kim, D. et al. The architecture of SARS-CoV-2 transcriptome. Cell (2020). DOI: 10.1016/j.cell.2020.04.011