Phylogenetic analyses of 2019-nCoV/SARS-CoV-2

To date, seven pathogenic HCoVs (Fig. 2a, b) have been found:1,29 (i) 2019-nCoV/SARS-CoV-2, SARS-CoV, MERS-CoV, HCoV-OC43, and HCoV-HKU1 belong to the β genus, and (ii) HCoV-NL63 and HCoV-229E belong to the α genus. We performed phylogenetic analyses using whole-genome sequence data from 15 HCoVs to inspect the evolutionary relationship of 2019-nCoV/SARS-CoV-2 with other HCoVs. We found that the whole genomes of 2019-nCoV/SARS-CoV-2 had ~99.99% nucleotide sequence identity across three diagnosed patients (Supplementary Table S1). Among the six other known pathogenic HCoVs, 2019-nCoV/SARS-CoV-2 shares the highest nucleotide sequence identity (79.7%) with SARS-CoV, revealing a conserved evolutionary relationship between 2019-nCoV/SARS-CoV-2 and SARS-CoV (Fig. 2a).
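The identity and p-distance values reported here can be illustrated with a minimal sketch. It assumes two sequences that have already been aligned to the same length and skips gapped columns (pairwise deletion); the function names and the toy fragments are hypothetical and are not the pipeline used in this study.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped
    (pairwise deletion); this is an assumption of the sketch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity is the complement of the p-distance."""
    return 100.0 * (1.0 - p_distance(seq_a, seq_b))


# Toy aligned fragments, made up for illustration (not real genome data).
toy_a = "ATGGCTA-CG"
toy_b = "ATGACTATCG"
print(p_distance(toy_a, toy_b))
print(percent_identity(toy_a, toy_b))
```

In this toy pair, nine columns are compared and one differs, so the p-distance is 1/9.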

Fig. 2: Phylogenetic analysis of coronaviruses. a Phylogenetic tree of coronavirus (CoV). Phylogenetic algorithm analyzed evolutionary conservation among whole genomes of 15 coronaviruses. Red color highlights the recent emergent coronavirus, 2019-nCoV/SARS-CoV-2. Numbers on the branches indicate bootstrap support values. The scale shows the evolutionary distance computed using the p-distance method. b Schematic plot for HCoV genomes. The genus and host information of viruses was labeled on the left by different colors. Empty dark gray boxes represent accessory open reading frames (ORFs). c–e The 3D structures of SARS-CoV nsp12 (PDB ID: 6NUR) (c), spike (PDB ID: 6ACK) (d), and nucleocapsid (PDB ID: 2CJR) (e) shown were based on homology modeling. Genome information and phylogenetic analysis results are provided in Supplementary Tables S1 and S2.

HCoVs have five major protein regions for virus structure assembly and viral replication29, including replicase complex (ORF1ab), spike (S), envelope (E), membrane (M), and nucleocapsid (N) proteins (Fig. 2b). The ORF1ab gene encodes the non-structural proteins (nsp) of the viral RNA synthesis complex through proteolytic processing30. The nsp12 is a viral RNA-dependent RNA polymerase that, together with its co-factors nsp7 and nsp8, possesses high polymerase activity. The 3D structure of SARS-CoV nsp12 contains a larger N-terminal extension (which binds to nsp7 and nsp8) and a polymerase domain (Fig. 2c). The spike is a transmembrane glycoprotein that plays a pivotal role in mediating viral infection through binding the host receptor31,32. Figure 2d shows the 3D structure of the SARS-CoV spike protein bound to the host receptor angiotensin-converting enzyme 2 (ACE2) (PDB ID: 6ACK). A recent study showed that 2019-nCoV/SARS-CoV-2 is able to utilize ACE2 as an entry receptor in ACE2-expressing cells33, suggesting potential drug targets for therapeutic development. Furthermore, the cryo-EM structure of the spike and biophysical assays reveal that the 2019-nCoV/SARS-CoV-2 spike binds ACE2 with higher affinity than the SARS-CoV spike34. In addition, the nucleocapsid is an important subunit for packaging the viral genome through protein oligomerization35, and the structure of a single nucleocapsid is shown in Fig. 2e.

Protein sequence alignment analyses indicated that the 2019-nCoV/SARS-CoV-2 was most evolutionarily conserved with SARS-CoV (Supplementary Table S2). Specifically, the envelope and nucleocapsid proteins of 2019-nCoV/SARS-CoV-2 are two evolutionarily conserved regions, with sequence identities of 96% and 89.6%, respectively, compared to SARS-CoV (Supplementary Table S2). However, the spike protein exhibited the lowest sequence conservation (sequence identity of 77%) between 2019-nCoV/SARS-CoV-2 and SARS-CoV. Meanwhile, the spike protein of 2019-nCoV/SARS-CoV-2 only has 31.9% sequence identity compared to MERS-CoV.

HCoV–host interactome network

To depict the HCoV–host interactome network, we assembled the CoV-associated host proteins from four known HCoVs (SARS-CoV, MERS-CoV, HCoV-229E, and HCoV-NL63), one mouse MHV, and one avian IBV (N protein) (Supplementary Table S3). In total, we obtained 119 host proteins associated with CoVs with various experimental evidence. Specifically, these host proteins are either the direct targets of HCoV proteins or are involved in crucial pathways of HCoV infection. The HCoV–host interactome network is shown in Fig. 3a. We identified several hub proteins including JUN, XPO1, NPM1, and HNRNPA1, with the highest number of connections within the 119 proteins. KEGG pathway enrichment analysis revealed multiple significant biological pathways (adjusted P value < 0.05), including measles, RNA transport, NF-kappa B signaling, Epstein-Barr virus infection, and influenza (Fig. 3b). Gene ontology (GO) biological process enrichment analysis further confirmed multiple viral infection-related processes (adjusted P value < 0.001), including viral life cycle, modulation by virus of host morphology or physiology, viral process, positive regulation of viral life cycle, transport of virus, and virion attachment to host cell (Fig. 3c). We then mapped the known drug–target network (see Materials and methods) into the HCoV–host interactome to search for druggable, cellular targets. We found that 47 human proteins (39%, blue nodes in Fig. 3a) can be targeted by at least one approved drug or experimental drug under clinical trials. For example, GSK3B, DPP4, SMAD3, PARP1, and IKBKB are the most targetable proteins. The high druggability of HCoV–host interactome motivates us to develop a drug repurposing strategy by specifically targeting cellular proteins associated with HCoVs for potential treatment of 2019-nCoV/SARS-CoV-2.
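As an illustration of how a pathway enrichment with adjusted P values can be computed, the sketch below uses a one-sided hypergeometric test followed by Benjamini-Hochberg correction. Both choices are assumptions made for this example, since the test and the correction used here are described in Materials and methods rather than in this section. The background size, pathway sizes, and overlaps are hypothetical numbers; only the list size of 119 comes from the text.

```python
from math import comb


def hypergeom_sf(k, M, n, N):
    """P(X >= k): chance of at least k pathway genes in a list of N genes
    drawn from a background of M genes, n of which belong to the pathway."""
    total = comb(M, N)
    hits = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, min(n, N) + 1))
    return hits / total


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical numbers for illustration only (not the study's data).
background_size = 20000  # assumed number of annotated human genes
list_size = 119          # size of the HCoV-associated protein set
toy_pathways = {
    "pathway_1": (300, 12),  # (genes in pathway, overlap with the list)
    "pathway_2": (150, 2),
    "pathway_3": (500, 3),
}
names = list(toy_pathways)
raw_p = [
    hypergeom_sf(overlap, background_size, size, list_size)
    for size, overlap in (toy_pathways[name] for name in names)
]
adj_p = benjamini_hochberg(raw_p)
for name, p, q in zip(names, raw_p, adj_p):
    print(name, p, q, "significant" if q < 0.05 else "not significant")
```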

Fig. 3: Drug-target network analysis of the HCoV–host interactome. a A subnetwork highlighting the HCoV–host interactome. Nodes represent three types of HCoV-associated host proteins: targetable (proteins that can be targeted by approved drugs or drugs under clinical trials), non-targetable (proteins that do not have any known ligands), and neighbors (protein–protein interaction partners). Edge colors indicate five types of experimental evidence of the protein–protein interactions (see Materials and methods). 3D, three-dimensional structure. b, c KEGG human pathway (b) and gene ontology enrichment analyses (c) for the HCoV-associated proteins.

Network-based drug repurposing for HCoVs

The basis for the proposed network-based drug repurposing methodologies rests on the notion that the proteins that associate with and functionally govern viral infection are localized in the corresponding subnetwork (Fig. 1a) within the comprehensive human interactome network. For a drug with multiple targets to be effective against an HCoV, its target proteins should be within or in the immediate vicinity of the corresponding subnetwork in the human protein–protein interactome (Fig. 1), as we demonstrated in multiple diseases13,22,23,28 using this network-based strategy. We used a state-of-the-art network proximity measure to quantify the relationship between the HCoV-specific subnetwork (Fig. 3a) and drug targets in the human interactome. We constructed a drug–target network by assembling target information for more than 2000 FDA-approved or experimental drugs (see Materials and methods). To improve the quality and completeness of the human protein interactome network, we integrated PPIs with five types of experimental data: (1) binary PPIs from 3D protein structures; (2) binary PPIs from unbiased high-throughput yeast-two-hybrid assays; (3) experimentally identified kinase-substrate interactions; (4) signaling networks derived from experimental data; and (5) literature-derived PPIs with various experimental evidence (see Materials and methods). We used a Z-score (Z) measure and permutation test to reduce study bias in network proximity analyses (including bias toward hub nodes in the human interactome network introduced by literature-derived PPI data), as described in our recent studies13,28.
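The proximity Z-score and permutation test can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation used in this study: it uses the "closest" distance (mean shortest path from each drug target to its nearest HCoV-associated protein), it draws random node sets uniformly rather than matching node degree, and it assumes a connected graph. Degree-matched sampling is what would address the hub bias mentioned above. All names and the toy graph (nodes P1 to P8) are hypothetical.

```python
import random
from collections import deque
from statistics import mean, pstdev


def build_graph(edges):
    """Undirected adjacency map from a list of (u, v) pairs."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def bfs_distances(graph, source):
    """Shortest-path length from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def closest_distance(graph, drug_targets, disease_proteins):
    """Mean, over drug targets, of the distance to the nearest disease protein."""
    values = []
    for target in drug_targets:
        dist = bfs_distances(graph, target)
        values.append(min(dist[p] for p in disease_proteins if p in dist))
    return mean(values)


def proximity_z(graph, drug_targets, disease_proteins, n_perm=1000, seed=0):
    """Observed distance, Z-score, and empirical P against random node sets."""
    rng = random.Random(seed)
    nodes = sorted(graph)
    observed = closest_distance(graph, drug_targets, disease_proteins)
    null = []
    for _ in range(n_perm):
        rand_targets = rng.sample(nodes, len(drug_targets))
        rand_module = rng.sample(nodes, len(disease_proteins))
        null.append(closest_distance(graph, rand_targets, rand_module))
    mu, sigma = mean(null), pstdev(null)
    z = (observed - mu) / sigma if sigma > 0 else 0.0
    p = sum(1 for value in null if value <= observed) / n_perm
    return observed, z, p


# Toy connected network with hypothetical node names.
toy_graph = build_graph([
    ("P1", "P2"), ("P2", "P3"), ("P3", "P4"), ("P4", "P5"),
    ("P5", "P6"), ("P6", "P7"), ("P7", "P8"), ("P2", "P6"),
])
print(proximity_z(toy_graph, ["P3"], ["P2", "P3"], n_perm=200, seed=1))
```

A negative Z indicates that the drug targets sit closer to the module than randomly chosen node sets of the same size.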

In total, we computationally identified 135 drugs that were associated (Z < −1.5 and P < 0.05, permutation test) with the HCoV–host interactome (Fig. 4a, Supplementary Tables S4 and S5). To assess potential bias from pooling the cellular proteins of six CoVs, we further calculated the network proximities of all the drugs separately for four CoVs with a large number of known host proteins: SARS-CoV, MERS-CoV, IBV, and MHV. We found that the Z-scores were consistent between the pooled 119 HCoV-associated proteins and the four individual CoVs (Fig. 4b). The Pearson correlation coefficients of the proximities of all the drugs for the pooled HCoV are 0.926 vs. SARS-CoV (P < 0.001, t distribution), 0.503 vs. MERS-CoV (P < 0.001), 0.694 vs. IBV (P < 0.001), and 0.829 vs. MHV (P < 0.001). These network proximity analyses offer putative repurposable candidates for potential prevention and treatment of HCoVs.
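The selection rule (Z < −1.5 and P < 0.05) and the consistency check across viruses can be illustrated with a small sketch. The drug names and all numeric values below are hypothetical placeholders, and the helper names are introduced only for this example.

```python
from math import sqrt


def select_candidates(results, z_cutoff=-1.5, p_cutoff=0.05):
    """Keep drugs whose proximity passes both the Z and the P cut-off."""
    return [
        name for name, (z, p) in results.items()
        if z < z_cutoff and p < p_cutoff
    ]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long value lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical (Z, P) pairs for the pooled HCoV protein set.
pooled = {
    "drug_A": (-2.0, 0.01),
    "drug_B": (-1.0, 0.01),  # fails the Z cut-off
    "drug_C": (-3.0, 0.20),  # fails the P cut-off
    "drug_D": (-1.8, 0.03),
}
print(select_candidates(pooled))

# Hypothetical Z-scores of the same drugs against a single virus.
single_virus_z = [-1.7, -0.8, -2.5, -1.9]
pooled_z = [pooled[name][0] for name in pooled]
print(pearson_r(pooled_z, single_virus_z))
```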

Fig. 4: A discovered drug-HCoV network. a A subnetwork highlighting network-predicted drug-HCoV associations connecting 135 drugs and HCoVs. Of the 2938 drugs evaluated, 135 achieved significant proximities between drug targets and the HCoV-associated proteins in the human interactome network. Drugs are colored by the first level of their Anatomical Therapeutic Chemical (ATC) classification system code. b A heatmap highlighting network proximity values for SARS-CoV, MERS-CoV, IBV, and MHV, respectively. Color key denotes network proximity (Z-score) between drug targets and the HCoV-associated proteins in the human interactome network. P value was computed by permutation test.

Discovery of repurposable drugs for HCoV

To further validate the 135 repurposable drugs against HCoVs, we first performed gene set enrichment analysis (GSEA) using transcriptome data of MERS-CoV and SARS-CoV infected host cells (see Methods). These transcriptome data were used as gene signatures for HCoVs. Additionally, we downloaded the gene expression data of drug-treated human cell lines from the Connectivity Map (CMAP) database36 to obtain drug–gene signatures. We calculated a GSEA score (see Methods) for each drug and used this score as an indication of bioinformatics validation of the 135 drugs. Specifically, an enrichment score (ES) was calculated for each HCoV data set, and ES > 0 and P < 0.05 (permutation test) was used as cut-off for a significant association of gene signatures between a drug and a specific HCoV data set. The GSEA score, ranging from 0 to 3, is the number of data sets that met these criteria for a specific drug. Mesalazine (an approved drug for inflammatory bowel disease), sirolimus (an approved immunosuppressive drug), and equilin (an approved agonist of the estrogen receptor for menopausal symptoms) achieved the highest GSEA scores of 3, followed by paroxetine and melatonin with GSEA scores of 2. We next selected 16 high-confidence repurposable drugs (Fig. 5a and Table 1) against HCoVs using subject matter expertise based on a combination of factors: (i) strength of the network-predicted associations (a smaller network proximity score in Supplementary Table S4); (ii) validation by GSEA analyses; (iii) literature-reported antiviral evidence, and (iv) fewer clinically reported side effects. Specifically, we showcased several selected repurposable drugs with literature-reported antiviral evidence as below.
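The GSEA score described above is a simple count, which the following sketch makes explicit. The function name and the example values are hypothetical; only the criteria (ES > 0 and P < 0.05 per data set) come from the text.

```python
def gsea_score(per_dataset, es_cutoff=0.0, p_cutoff=0.05):
    """Number of HCoV data sets in which a drug meets both criteria.

    per_dataset is a list of (enrichment_score, p_value) pairs,
    one pair per HCoV transcriptome data set.
    """
    return sum(1 for es, p in per_dataset if es > es_cutoff and p < p_cutoff)


# Hypothetical (ES, P) values for one drug across three data sets.
toy_drug = [(0.40, 0.010), (0.20, 0.030), (-0.10, 0.001)]
print(gsea_score(toy_drug))  # the third data set fails ES > 0, so the score is 2
```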

Fig. 5: A discovered drug-protein-HCoV network for 16 candidate repurposable drugs. a Network-predicted evidence and gene set enrichment analysis (GSEA) scores for 16 potential repurposable drugs for HCoVs. The overall connectivity of the top drug candidates to the HCoV-associated proteins was examined. Most of these drugs indirectly target HCoV-associated proteins via the human protein–protein interaction networks. All the drug–target-HCoV-associated protein connections were examined, and those proteins with at least five connections are shown. The box heights for the proteins indicate the number of connections. GSEA scores for eight drugs were not available (NA) due to the lack of transcriptome profiles for the drugs. b–e Inferred mechanism-of-action networks for four selected drugs: b toremifene (first-generation nonsteroidal-selective estrogen receptor modulator), c irbesartan (an angiotensin receptor blocker), d mercaptopurine (an antimetabolite antineoplastic agent with immunosuppressant properties), and e melatonin (a biogenic amine for treating circadian rhythm sleep disorders).

Table 1 Top 16 network-predicted repurposable drugs with literature-derived antiviral evidence.

Selective estrogen receptor modulators

An overexpression of estrogen receptor has been shown to play a crucial role in inhibiting viral replication37. Selective estrogen receptor modulators (SERMs) have been reported to play a broader role in inhibiting viral replication through the non-classical pathways associated with estrogen receptor37. SERMs interfere at the post viral entry step and affect the triggering of fusion, as the SERMs’ antiviral activity still can be observed in the absence of detectable estrogen receptor expression18. Toremifene (Z = –3.23, Fig. 5a), the first generation of nonsteroidal SERM, exhibits potential effects in blocking various viral infections, including MERS-CoV, SARS-CoV, and Ebola virus in established cell lines17,38. Compared to the classical ESR1-related antiviral pathway, toremifene prevents fusion between the viral and endosomal membrane by interacting with and destabilizing the virus membrane glycoprotein, and eventually inhibiting viral replication39. As shown in Fig. 5b, toremifene potentially affects several key host proteins associated with HCoV, such as RPL19, HNRNPA1, NPM1, EIF3I, EIF3F, and EIF3E40,41. Equilin (Z = –2.52 and GSEA score = 3), an estrogenic steroid produced by horses, also has been proven to have moderate activity in inhibiting the entry of Zaire Ebola virus glycoprotein and human immunodeficiency virus (ZEBOV-GP/HIV)18. Altogether, network-predicted SERMs (such as toremifene and equilin) offer candidate repurposable drugs for 2019-nCoV/SARS-CoV-2.

Angiotensin receptor blockers

Angiotensin receptor blockers (ARBs) have been reported to associate with viral infection, including HCoVs42,43,44. Irbesartan (Z = –5.98), a typical ARB, was approved by the FDA for treatment of hypertension and diabetic nephropathy. Here, network proximity analysis shows a significant association between irbesartan’s targets and HCoV-associated host proteins in the human interactome. As shown in Fig. 5c, irbesartan targets SLC10A1, which encodes the sodium/bile acid cotransporter (NTCP) protein that has been identified as a functional preS1-specific receptor for the hepatitis B virus (HBV) and the hepatitis delta virus (HDV). Irbesartan can inhibit NTCP, thus inhibiting viral entry45,46. SLC10A1 interacts with C11orf74, a potential transcriptional repressor that interacts with nsp-10 of SARS-CoV47. There are several other ARBs (such as eletriptan, frovatriptan, and zolmitriptan) whose targets are potentially associated with HCoV-associated host proteins in the human interactome.

Immunosuppressant or antineoplastic agents

Previous studies have confirmed the mammalian target of rapamycin complex 1 (mTORC1) as a key factor in regulating the replication of various viruses, including Andes orthohantavirus and coronavirus48,49. Sirolimus (Z = –2.35 and GSEA score = 3), an inhibitor of mammalian target of rapamycin (mTOR), was reported to effectively block viral protein expression and virion release50. Indeed, a recent study pointed to its clinical application: sirolimus reduced MERS-CoV infection by over 60%51. Moreover, sirolimus usage in managing patients with severe H1N1 pneumonia and acute respiratory failure can significantly improve those patients’ prognosis50. Mercaptopurine (Z = –2.44 and GSEA score = 1), an antineoplastic agent with immunosuppressant properties, has been used to treat cancer since the 1950s, and its application has expanded to several auto-immune diseases, including rheumatoid arthritis, systemic lupus erythematosus, and Crohn’s disease52. Mercaptopurine has been reported as a selective inhibitor of both SARS-CoV and MERS-CoV by targeting papain-like protease, which plays key roles in viral maturation and antagonism to interferon stimulation53,54. Mechanistically, mercaptopurine potentially targets several host proteins in HCoVs, such as JUN, PABPC1, NPM1, and NCL40,55 (Fig. 5d).

Anti-inflammatory agents

Inflammatory pathways play essential roles in viral infections56,57. As a biogenic amine, melatonin (N-acetyl-5-methoxytryptamine) (Z = –1.72 and GSEA score = 2) plays a key role in various biological processes, and offers a potential strategy in the management of viral infections58,59. Viral infections are often associated with immune-inflammatory injury, in which the level of oxidative stress increases significantly and leaves negative effects on the function of multiple organs60. The antioxidant effect of melatonin makes it a putative candidate drug to relieve patients’ clinical symptoms in antiviral treatment, even though melatonin cannot eradicate or even curb the viral replication or transcription61,62. In addition, the application of melatonin may prolong patients’ survival time, which may provide a chance for patients’ immune systems to recover and eventually eradicate the virus. As shown in Fig. 5e, melatonin indirectly targets several HCoV cellular targets, including ACE2, BCL2L1, JUN, and IKBKB. Eplerenone (Z = –1.59), an aldosterone receptor antagonist, is reported to have a similar anti-inflammatory effect as melatonin. By inhibiting mast-cell-derived proteinases and suppressing fibrosis, eplerenone can improve survival of mice infected with encephalomyocarditis virus63.

In summary, our network proximity analyses offer multiple candidate repurposable drugs that target diverse cellular pathways for potential prevention and treatment of 2019-nCoV/SARS-CoV-2. However, further preclinical experiments64 and clinical trials are required to verify the clinical benefits of these network-predicted candidates before clinical use.

Network-based identification of potential drug combinations for 2019-nCoV/SARS-CoV-2

Drug combinations, offering increased therapeutic efficacy and reduced toxicity, play an important role in treating various viral infections65. However, our ability to identify and validate effective combinations is limited by a combinatorial explosion, driven by both the large number of drug pairs and dosage combinations. In our recent study, we proposed a novel network-based methodology to identify clinically efficacious drug combinations28. Relying on approved drug combinations for hypertension and cancer, we found that a drug combination was therapeutically effective only if it was captured by the “Complementary Exposure” pattern: the targets of the drugs both hit the disease module, but target separate neighborhoods (Fig. 6a). Here we sought to identify drug combinations that may provide a synergistic effect in potentially treating 2019-nCoV/SARS-CoV-2 with well-defined mechanism-of-action by network analysis. For the 16 potential repurposable drugs (Fig. 5a, Table 1), we showcased three network-predicted candidate drug combinations for 2019-nCoV/SARS-CoV-2. All predicted possible combinations can be found in Supplementary Table S6.
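The "Complementary Exposure" pattern can be sketched in code. The example assumes the commonly used network separation measure, s_AB = <d_AB> − (<d_AA> + <d_BB>)/2, where each term is a mean closest shortest-path distance; the exact definition and cut-offs used in this study are given in Materials and methods, so the formula, the thresholds (Z < 0 for both drugs, s_AB ≥ 0), the handling of single-target drugs, and the toy network below are all assumptions of this sketch.

```python
from collections import deque


def bfs_distances(graph, source):
    """Shortest-path length from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def closest_values(graph, from_nodes, to_nodes, skip_self=False):
    """For each node in from_nodes, distance to its nearest node in to_nodes."""
    values = []
    for u in from_nodes:
        dist = bfs_distances(graph, u)
        candidates = [
            dist[v] for v in to_nodes
            if v in dist and not (skip_self and v == u)
        ]
        if candidates:
            values.append(min(candidates))
    return values


def separation(graph, targets_a, targets_b):
    """s_AB = <d_AB> - (<d_AA> + <d_BB>) / 2 (assumed definition)."""
    def avg(values):
        # A single-target drug has no within-set distance; treated as 0 here.
        return sum(values) / len(values) if values else 0.0

    d_aa = closest_values(graph, targets_a, targets_a, skip_self=True)
    d_bb = closest_values(graph, targets_b, targets_b, skip_self=True)
    d_ab = (closest_values(graph, targets_a, targets_b)
            + closest_values(graph, targets_b, targets_a))
    return avg(d_ab) - (avg(d_aa) + avg(d_bb)) / 2


def is_complementary_exposure(z_ca, z_cb, s_ab):
    """Both drugs are proximal to the module and their targets are separated."""
    return z_ca < 0 and z_cb < 0 and s_ab >= 0


# Toy path network P1-P2-...-P6 with hypothetical node names.
path_graph = {
    "P1": {"P2"}, "P2": {"P1", "P3"}, "P3": {"P2", "P4"},
    "P4": {"P3", "P5"}, "P5": {"P4", "P6"}, "P6": {"P5"},
}
s_ab = separation(path_graph, ["P1", "P2"], ["P5", "P6"])
print(s_ab)
# The Z-scores below are hypothetical inputs, not computed values.
print(is_complementary_exposure(-2.0, -1.8, s_ab))
```

In the toy network the two target sets lie at opposite ends of the path, so the separation is positive; identical target sets would give a negative value.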

Fig. 6: Network-based rational design of drug combinations for 2019-nCoV/SARS-CoV-2. a The possible exposure mode of the HCoV-associated protein module to the pairwise drug combinations. An effective drug combination will be captured by the “Complementary Exposure” pattern: the targets of the drugs both hit the HCoV–host subnetwork, but target separate neighborhoods in the human interactome network. Z_CA and Z_CB denote the network proximity (Z-score) between targets (Drugs A and B) and a specific HCoV. S_AB denotes separation score (see Materials and methods) of targets between Drug A and Drug B. b–d Inferred mechanism-of-action networks for three selected pairwise drug combinations: b sirolimus (a potent immunosuppressant with both antifungal and antineoplastic properties) plus dactinomycin (an RNA synthesis inhibitor for treatment of various tumors), c toremifene (first-generation nonsteroidal-selective estrogen receptor modulator) plus emodin (an experimental drug for the treatment of polycystic kidney), and d melatonin (a biogenic amine for treating circadian rhythm sleep disorders) plus mercaptopurine (an antimetabolite antineoplastic agent with immunosuppressant properties).

Sirolimus plus Dactinomycin

Sirolimus, an inhibitor of mTOR with both antifungal and antineoplastic properties, has been shown to improve outcomes in patients with severe H1N1 pneumonia and acute respiratory failure50. mTOR signaling plays an essential role in MERS-CoV infection66. Dactinomycin, also known as actinomycin D, is an approved RNA synthesis inhibitor for treatment of various cancer types. An early study showed that dactinomycin (1 μg/ml) inhibited the growth of feline enteric CoV67. As shown in Fig. 6b, our network analysis shows that sirolimus and dactinomycin synergistically target the HCoV-associated host protein subnetwork following the “Complementary Exposure” pattern, offering potential combination regimens for treatment of HCoV. Specifically, sirolimus and dactinomycin may inhibit both mTOR signaling and the RNA synthesis pathway (including DNA topoisomerase 2-alpha (TOP2A) and DNA topoisomerase 2-beta (TOP2B)) in HCoV-infected cells (Fig. 6b).

Toremifene plus Emodin

Toremifene is among the approved first-generation nonsteroidal SERMs for the treatment of metastatic breast cancer68. SERMs (including toremifene) inhibited Ebola virus infection18 by interacting with and destabilizing the Ebola virus glycoprotein39. In vitro assays have demonstrated that toremifene inhibited growth of MERS-CoV17,69 and SARS-CoV38 (Table 1). Emodin, an anthraquinone derivative extracted from the roots of Rheum tanguticum, has been reported to have various antiviral effects. Specifically, emodin inhibited the SARS-CoV-associated 3a protein70, and blocked an interaction between the SARS-CoV spike protein and ACE2 (ref. 71). Altogether, network analyses and published experimental data suggest that combining toremifene and emodin offers a potential therapeutic approach for 2019-nCoV/SARS-CoV-2 (Fig. 6c).

Mercaptopurine plus Melatonin

As shown in Fig. 5a, targets of both mercaptopurine and melatonin showed strong network proximity with HCoV-associated host proteins in the human interactome network. Recent in vitro and in vivo studies identified mercaptopurine as a selective inhibitor of both SARS-CoV and MERS-CoV by targeting papain-like protease53,54. Melatonin has been reported to have potential benefits in viral infections via its anti-inflammatory and antioxidant effects58,59,60,61,62. Melatonin indirectly regulates the expression of ACE2, a key entry receptor involved in viral infection of HCoVs, including 2019-nCoV/SARS-CoV-2 (ref. 33). Specifically, melatonin was reported to inhibit calmodulin, and calmodulin interacts with ACE2 by inhibiting shedding of its ectodomain, a key infectious process of SARS-CoV72,73. JUN, also known as c-Jun, is a key host protein involved in infection by infectious bronchitis virus (IBV), an avian CoV74. As shown in Fig. 6d, mercaptopurine and melatonin may synergistically block c-Jun signaling by targeting multiple cellular targets. In summary, the combination of mercaptopurine and melatonin may offer a potential combination therapy for 2019-nCoV/SARS-CoV-2 by synergistically targeting papain-like protease, ACE2, c-Jun signaling, and anti-inflammatory pathways (Fig. 6d). However, further experimental observations on the effects of melatonin on ACE2 pathways in 2019-nCoV/SARS-CoV-2 are highly warranted.