No statistical methods were used to predetermine sample size. The experiments were not randomized and investigators were not blinded to allocation during experiments and outcome assessment.

Cloning, protein expression and purification of SARS-CoV-2 Mpro

The full-length gene that encodes SARS-CoV-2 Mpro (NC_045512) was optimized and synthesized for E. coli expression (Genewiz). The cloning strategy for producing authentic viral Mpro has previously been reported (ref. 10). The expression plasmid was transformed into E. coli BL21 (DE3) cells, which were then cultured in Luria broth medium containing 100 μg/ml ampicillin at 37 °C. When the cells had grown to an optical density at 600 nm of 0.6–0.8, 0.5 mM IPTG was added to the cell culture to induce expression at 16 °C. After 10 h, the cells were collected by centrifugation at 3,000g. The cell pellets were resuspended in lysis buffer (20 mM Tris-HCl pH 8.0, 300 mM NaCl), lysed by high-pressure homogenization, and then centrifuged at 25,000g for 40 min. The supernatant was loaded onto a Ni-NTA affinity column (Qiagen) and washed with the resuspension buffer containing 20 mM imidazole. The His-tagged Mpro was eluted with cleavage buffer (50 mM Tris-HCl pH 7.0, 150 mM NaCl) containing 300 mM imidazole. Human rhinovirus 3C protease was added to remove the C-terminal His tag. The Mpro was further purified by ion-exchange chromatography and size-exclusion chromatography. Coronavirus Mpro exists as a mixture of monomers and dimers in solution (ref. 33). The purified Mpro was stored in 50 mM Tris-HCl pH 7.3, 1 mM EDTA.

Crystallization, data collection and structure determination

SARS-CoV-2 Mpro was incubated with 10 mM N3 for 30 min, and the complex (5 mg/ml) was crystallized by the hanging-drop vapour-diffusion method at 20 °C. The best crystals were grown with well buffer containing 0.1 M MES pH 6.0, 2% polyethylene glycol (PEG) 6000, 3% DMSO and 1 mM DTT. The cryo-protectant solution contained 0.1 M MES pH 6.0 and 30% PEG 400.

X-ray data were collected on beamline BL17U1 at Shanghai Synchrotron Radiation Facility (SSRF) at 100 K and at a wavelength of 1.07180 Å using an Eiger X 16M image plate detector. Data integration and scaling were performed using the program Xia2 (ref. 34). The structure was determined by molecular replacement with the Phaser module (ref. 35) in CCP4 (ref. 36) using the SARS-CoV Mpro (RCSB Protein Data Bank code (PDB) 2H2Z) as a search template. The output model from molecular replacement was subsequently subjected to iterative cycles of manual model adjustment with Coot (ref. 37), and refinement was finished with Phenix (ref. 38). The inhibitor N3 was built according to the omit map. The phasing and refinement statistics are summarized in Extended Data Table 1. The Rwork and Rfree values are 0.202 and 0.235, respectively. In total, 97.3% of the residues are in the most favoured regions of the Ramachandran plot, and no residues are found in disallowed regions.

Enzymatic activity and inhibition assays

The enzyme activity assays have previously been described (ref. 10). In brief, the activity of SARS-CoV-2 Mpro was measured by a continuous kinetic assay, with the substrate Mca–AVLQ↓SGFR-K(Dnp)K (GL Biochem), using wavelengths of 320 nm and 405 nm for excitation and emission, respectively. The assay started by immediately mixing 0.2 μM SARS-CoV-2 Mpro with different concentrations of substrate (2.5–100 μM). Fluorescence intensity was monitored with an EnVision multimode plate reader (Perkin Elmer). Initial rates were obtained by fitting the linear portion of the curves to a straight line. The kinetic parameters Km and kcat were calculated from a double-reciprocal plot. As N3 is a mechanism-based irreversible inhibitor for SARS-CoV-2 Mpro, kobs/[I] was used as an approximation of the pseudo-second-order rate constant to evaluate the inhibition effect of the inhibitor N3 (ref. 12). In this case, the measurement was carried out with 0.2 μM of enzyme, 20 μM of substrate and inhibitor at 6 different concentrations (0–1 μM).
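The double-reciprocal (Lineweaver–Burk) calculation mentioned above can be sketched as follows. This is a minimal illustration, not the authors' analysis script: the function name `lineweaver_burk` is hypothetical, and the rates are synthetic, noise-free values generated from an assumed Km and Vmax rather than data from this study. Only the 0.2 μM enzyme concentration and the 2.5–100 μM substrate range are taken from the text.

```python
def lineweaver_burk(substrate_uM, initial_rates):
    """Fit 1/v = (Km/Vmax) * (1/[S]) + 1/Vmax by ordinary least squares.

    Returns (Km, Vmax) in the units of the inputs.
    """
    xs = [1.0 / s for s in substrate_uM]      # 1/[S]
    ys = [1.0 / v for v in initial_rates]     # 1/v
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                         # equals Km / Vmax
    intercept = mean_y - slope * mean_x       # equals 1 / Vmax
    vmax = 1.0 / intercept
    km = slope * vmax
    return km, vmax


# Hypothetical, noise-free rates generated from Michaelis-Menten kinetics
# (Km = 20 uM, Vmax = 0.5 uM/s). These are NOT values from the paper.
TRUE_KM, TRUE_VMAX = 20.0, 0.5
substrate = [2.5, 5.0, 10.0, 25.0, 50.0, 100.0]   # uM, within the assay range
rates = [TRUE_VMAX * s / (TRUE_KM + s) for s in substrate]

km, vmax = lineweaver_burk(substrate, rates)
kcat = vmax / 0.2                                  # 0.2 uM enzyme, as in the assay
print(f"Km = {km:.2f} uM, Vmax = {vmax:.3f} uM/s, kcat = {kcat:.2f} 1/s")
```

With real data, the reciprocal transform amplifies noise at low substrate concentrations, so the low-[S] points tend to dominate the fitted line.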

Virtual screening

The virtual screening was performed using our in-house database via a workflow application of Glide v.8.2 (ref. 22) in Maestro (Schrödinger 2019-1a). All compounds in the database were considered to be at pH 7.4 ± 0.2 to estimate their protonation state using the program Epik (ref. 39). Their three-dimensional (3D) conformations were generated by the LigPrep module of Maestro. The structure of SARS-CoV-2 Mpro (PDB 6LU7) was used to generate the receptor grid for docking simulations. The centre of the active site of the grid was determined according to the position of N3 in the structure. The flexibility of the receptor hydroxyl and thiol groups in the side chains of C145, S46 and Y54 was considered. First, we performed a relatively fast but coarse screening using the Glide standard precision model, and the top 20% of compounds were kept. Finally, the candidate molecules were picked by analysing the predicted binding modes and their scores.
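The "keep the top 20%" filtering step can be illustrated with a short sketch. It assumes that docking scores are stored per compound and that more negative scores are better, which is the usual convention for Glide-style scores. The function name `keep_top_percent`, the compound identifiers and the score values are all hypothetical.

```python
def keep_top_percent(scores, percent=20):
    """Return the best-scoring compounds, most negative score first.

    `scores` maps a compound identifier to its docking score; lower
    (more negative) is treated as better. At least one compound is kept.
    """
    ranked = sorted(scores, key=scores.get)          # ascending: best first
    n_keep = max(1, len(ranked) * percent // 100)    # integer arithmetic
    return ranked[:n_keep]


# Hypothetical docking scores for ten placeholder compounds.
docking_scores = {
    "cmpd_01": -5.1, "cmpd_02": -8.4, "cmpd_03": -6.7, "cmpd_04": -4.2,
    "cmpd_05": -7.9, "cmpd_06": -3.8, "cmpd_07": -6.0, "cmpd_08": -5.5,
    "cmpd_09": -4.9, "cmpd_10": -7.1,
}

shortlist = keep_top_percent(docking_scores, percent=20)
print(shortlist)
```

The shortlisted compounds would then go on to the manual inspection of binding modes described in the text.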

High-throughput drug screening and IC50 measurement

Potential inhibitors against SARS-CoV-2 Mpro were screened by an enzymatic inhibition assay. When the different compounds were added into the enzymatic reaction mixture, the change of initial rates was calculated to evaluate their inhibitory effect. Five drug libraries that together comprised about 10,000 compounds were used: the Approved Drug Library (Target Mol), Clinic Compound Library (Target Mol), FDA-approved Drug Library (Selleck), Natural Product Library (Selleck), and Anti-virus Drug Library (Shanghai Institute for Advanced Immunochemical Studies). The preliminary screening reaction mixture included 0.2 μM protein, 20 μM substrate and 50 μM compounds. The compounds of interest were defined as those with a percentage of inhibition over 60% compared with the reaction in the absence of inhibitor. IC50 values of 7 drug leads were measured using 0.2 μM protein, 20 μM substrate and 11 different inhibitor concentrations. To exclude inhibitors possibly acting as aggregators, a detergent-based control was performed by adding 0.001% or 0.01% freshly made up Triton X-100 to the reaction at the same time (ref. 25). All experimental data were analysed using GraphPad Prism. All experiments were performed in triplicate.
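The hit criterion above (more than 60% inhibition relative to the uninhibited reaction) amounts to a simple calculation on initial rates. The sketch below assumes that percentage inhibition is computed as 100 × (1 − v_compound / v_control); the function names and the rate values are hypothetical and do not come from the screen.

```python
def percent_inhibition(rate_with_compound, rate_control):
    """Percentage inhibition relative to the reaction without inhibitor."""
    return 100.0 * (1.0 - rate_with_compound / rate_control)


def select_hits(rates, rate_control, threshold=60.0):
    """Keep compounds whose inhibition is strictly above the threshold."""
    return {
        name: percent_inhibition(rate, rate_control)
        for name, rate in rates.items()
        if percent_inhibition(rate, rate_control) > threshold
    }


# Hypothetical initial rates (arbitrary fluorescence units per second).
control_rate = 1.0
compound_rates = {"cmpd_A": 0.30, "cmpd_B": 0.45, "cmpd_C": 0.05}

hits = select_hits(compound_rates, control_rate)
for name, inhibition in sorted(hits.items()):
    print(f"{name}: {inhibition:.1f}% inhibition")
```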

Molecular docking

To understand the binding interaction of these molecules with SARS-CoV-2 Mpro, two different molecular docking methods, Glide v.8.2 (ref. 22) and iFitDock (ref. 40), were used to predict their binding poses. Then, a 3D molecular similarity calculation method, SHAFTS (ref. 41), was used to enumerate the molecular alignment poses by matching the critical pharmacophore and volumetric overlay between the N3 molecule within the Mpro structure and the other drug candidates. The resulting optimal superposition of these molecules was used to assess the plausibility of the predicted binding poses from the two docking methods, and only the binding orientations that were consistent among the different methods were kept for constructing the initial complexes. Finally, these complexes were further optimized and re-scored using the MM-GBSA module (ref. 42) of Schrödinger, and the residues within 5 Å of the ligand were refined.

Antiviral and cytotoxicity assays for compounds from high-throughput screening

The in vitro antiviral efficacy of the drug candidates on Vero cells was determined by qRT–PCR. About 1 × 10⁴ Vero cells were seeded into a 96-well plate and incubated for 20–24 h at 37 °C. All the infection experiments were performed at biosafety level-3 (BSL-3). Cells were pretreated with the drug candidates (10 μM) for 1 h; SARS-CoV-2 (multiplicity of infection (MOI) of 0.01) was subsequently added to allow infection for 2 h. Then, the virus–drug mixture was removed and cells were further cultured with fresh drug-containing medium. At 72 h after infection, vRNA was extracted from the culture supernatant using QIAamp viral RNA mini kit (Qiagen) according to the manufacturer’s recommendation, and detected by qRT–PCR assay using the SARS-CoV-2-specific primers. Because shikonin showed cellular toxicity at the test concentration, its antiviral activity assay did not proceed further. vRNA copies per millilitre were determined using a synthetic RNA fragment to amplify the target region. The linearized plasmid containing the S gene of SARS-CoV-2 was subjected to in vitro transcription. The resulting RNA transcripts were purified and then quantified using spectrophotometry on Nanodrop 2000 (Thermo Fisher Scientific). The purified RNA was diluted tenfold serially using RNase-free water and was detected using qRT–PCR. Threshold cycle (Ct) values for the known concentrations of the RNA were plotted against the log of the number of genome-equivalent copies. The resultant standard curve was used to determine the number of genome equivalents of vRNA in the samples. The determination of the detection limit was based on the lowest level at which vRNA was detected and remained within the range of linearity of a standard curve (Ct value of 38). TaqMan primers for SARS-CoV-2 are 5′-TCCTGGTGATTCTTCTTCAGG-3′ and 5′-TCTGAGAGAGGGTCAAGTGC-3′ with SARS-CoV-2 probe 5′-FAM-AGCTGCAGCACCAGCTGTCCA-BHQ1-3′.
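The standard-curve quantification described above can be sketched as a linear fit of Ct against log10(copies), followed by inversion of that line for unknown samples. This is an illustration under stated assumptions: the function names are hypothetical, the dilution series and Ct values are invented and perfectly linear, and samples with Ct above 38 are simply reported as below the detection limit.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


def copies_from_ct(ct, slope, intercept, ct_limit=38.0):
    """Invert the standard curve; return None above the Ct detection limit."""
    if ct > ct_limit:
        return None
    return 10 ** ((ct - intercept) / slope)


# Hypothetical tenfold dilution series of the RNA standard (copies per reaction)
# with invented Ct values; NOT data from the paper.
standard_copies = [1e7, 1e6, 1e5, 1e4, 1e3, 1e2]
standard_ct = [16.8, 20.1, 23.4, 26.7, 30.0, 33.3]

slope, intercept = fit_line([math.log10(c) for c in standard_copies], standard_ct)
sample_copies = copies_from_ct(25.0, slope, intercept)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
print(f"sample at Ct 25.0: {sample_copies:.3g} copies")
```

Converting copies per reaction to copies per millilitre additionally requires the extraction and elution volumes, which are not given in the text.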
The cytotoxicity of the tested drugs on Vero cells was determined by MTS cell proliferation assays (Promega). Ten thousand cells were seeded into a 96-well plate and incubated for 20–24 h at 37 °C. After that, the medium was removed, and 100 μl of medium containing decreasing concentrations of antiviral compounds was added to the wells. After 4 days of incubation at 37 °C, MTS assays were performed according to the manufacturer’s protocols. All experiments were performed in triplicate. Vero cells were obtained from ATCC (American Type Culture Collection) with authentication service. All cell lines tested negative for mycoplasma contamination. No commonly misidentified cell lines were used.

Antiviral and cytotoxicity assays for cinanserin

For the antiviral assay, a clinical isolate of SARS-CoV-2 (ref. 3) was propagated in Vero E6 cells, and viral titre was determined as previously described (ref. 43). All of the infection experiments were performed at BSL-3. Preseeded Vero E6 cells (5 × 10⁴ cells per well) were pretreated with the different concentrations of cinanserin for 1 h and the virus was subsequently added (MOI of 0.05) to allow infection for 2 h. Then, the virus–drug mixture was removed and cells were further cultured with fresh drug-containing medium. At 24 h after infection, the cell supernatant was collected and vRNA in supernatant was subjected to qRT–PCR analysis. For cytotoxicity assays, Vero E6 cells were suspended in growth medium in 96-well plates. The next day, appropriate concentrations of cinanserin were added to the medium. After 24 h, the relative numbers of surviving cells were measured by CCK8 (Beyotime) assay in accordance with the manufacturer’s instructions. All experiments were performed in triplicate. Vero E6 cells were obtained from ATCC with authentication service. All cell lines tested negative for mycoplasma contamination. No commonly misidentified cell lines were used.

Plaque-reduction assays

One hundred thousand Vero E6 cells were seeded in a 24-well plate and treated with different doses of the inhibitors. All of the infection experiments were performed at BSL-3. Inhibitors at different dilutions were mixed with SARS-CoV-2 (100 plaque-forming units), and 200 μl of each mixture was inoculated onto monolayer Vero E6 cells for 1 h. After removing the supernatant, the plate was washed twice with DMEM medium, and cells were incubated with 0.9% agarose containing appropriate concentrations of inhibitors. The overlay was discarded at 4 days after infection, and cells were fixed for 30 min in 4% polyoxymethylene and stained with crystal violet working solution. The plaque-forming units were determined. All experiments were performed in four biological replicates.

Intact protein analysis

In brief, 2.5 μl of compounds (10 mM in DMSO) was added into 50 μl of SARS-CoV-2 Mpro (10 mg/ml). The mixtures were kept at room temperature for 30 min. Liquid chromatography–mass spectrometry analyses were performed in positive-ion mode with a quadrupole-time-of-flight mass spectrometer (Agilent 6550) coupled with a high-performance liquid chromatograph (HPLC, Agilent 1260) for detecting the molecular weight of intact proteins. The samples were eluted from a Phenomenex Jupiter C4 300Å LC column (2 × 150 mm, 5 μm) over a 15-min gradient from 5% to 100% acetonitrile containing 0.1% formic acid at a flow rate of 0.5 ml/min. The acquisition method in positive-ion mode with Dual Agilent Jet Stream electrospray voltage used a capillary temperature of 250 °C, a fragmentor of 175 V and a capillary voltage of 3,000 V. Mass deconvolution was performed using Agilent MassHunter Qualitative Analysis B.06.00 software with BioConfirm Workflow.
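For orientation, the mixing volumes above imply the following final concentrations and compound-to-protein molar ratio. The sketch assumes a molecular mass of about 33.8 kDa for the Mpro monomer, which is an approximate value not stated in this section. The function name `mixing_summary` is hypothetical.

```python
def mixing_summary(compound_ul, compound_mM, protein_ul, protein_mg_ml, protein_kda):
    """Final concentrations after mixing a compound stock with a protein stock."""
    total_ul = compound_ul + protein_ul
    compound_final_mM = compound_mM * compound_ul / total_ul
    # mg/ml divided by kDa gives mM (g/l divided by kg/mol = mmol/l).
    protein_final_mM = (protein_mg_ml / protein_kda) * protein_ul / total_ul
    return {
        "compound_mM": compound_final_mM,
        "protein_mM": protein_final_mM,
        "molar_ratio": compound_final_mM / protein_final_mM,
        "dmso_percent_v_v": 100.0 * compound_ul / total_ul,
    }


# Volumes and stock concentrations from the text; 33.8 kDa is an ASSUMED
# approximate monomer mass, not a value given in this section.
summary = mixing_summary(
    compound_ul=2.5, compound_mM=10.0,
    protein_ul=50.0, protein_mg_ml=10.0, protein_kda=33.8,
)
for key, value in summary.items():
    print(f"{key}: {value:.3f}")
```

Under this assumption the compound is present at somewhat less than a twofold molar excess over the protein monomer.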

Tandem mass spectrometry analysis

The samples were precipitated and redissolved in 8 M urea, and then digested for 16 h at 25 °C by chymotrypsin at an enzyme-to-substrate ratio of 1:50 (w/w). The digested peptides were desalted and loaded onto a homemade 30-cm-long pulled-tip analytical column (ReproSil-Pur C18 AQ 1.9-μm particle size, Dr Maisch, 75-μm inner diameter × 360-μm outer diameter) connected to an Easy-nLC1200 UHPLC (Thermo Fisher Scientific) for mass spectrometry analysis. The elution gradient and mobile phase constitution used for peptide separation were as follows: 0–1 min, 4–8% B; 1–96 min, 8–35% B; 96–104 min, 35–60% B; 105–120 min, 60–100% B (mobile phase A: 0.1% formic acid in water; mobile phase B: 0.1% formic acid in 80% acetonitrile) at a flow rate of 300 nl/min. Peptides eluted from the liquid chromatography column were directly electro-sprayed into the mass spectrometer with the application of a distal 1.8-kV spray voltage. Survey full-scan mass spectra (from m/z 300–1,800) were acquired in the Orbitrap analyser (Q Exactive, Thermo Fisher Scientific) with resolution r = 70,000 at m/z 400. The top 20 tandem mass spectrometry (MS/MS) events were sequentially generated and selected from the full mass spectrum at a 30% normalized collision energy. The dynamic exclusion time was set to 10 s. One acquisition cycle includes one full-scan mass spectrum followed by top 20 MS/MS events, sequentially generated on the first to the twentieth most intense ions selected from the full mass spectrum at a 28% normalized collision energy. The acquired MS/MS data were searched against the UniProtKB E. coli database (database released on 11 November 2016) and the SARS-CoV-2 nsp5 sequence using Proteome Discoverer 2.1. To accurately estimate peptide probabilities and false-discovery rates, we used a decoy database containing the reversed sequences of all the proteins appended to the target database. The false-discovery rate was set to 0.01. Mass tolerance for precursor ions was set to 20 ppm. Chymotrypsin was defined as the cleavage enzyme and the maximal number of missed cleavage sites was set to four. Protein N terminus acetylation, methionine oxidation and covalent binding of the compounds were set as variable modifications. The modified peptides were manually checked and labelled.
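The target-decoy strategy mentioned above can be illustrated with a small sketch: reversed copies of every protein are appended to the target database, and the false-discovery rate at a given score cut-off is estimated as the number of decoy matches divided by the number of target matches. This is a simplified illustration, not a description of how the search software computes its statistics internally. The function names, the `REV_` prefix, the placeholder sequences and the scores are all hypothetical.

```python
def build_target_decoy(proteins, decoy_prefix="REV_"):
    """Append a reversed-sequence decoy for every target protein."""
    database = dict(proteins)
    for name, sequence in proteins.items():
        database[decoy_prefix + name] = sequence[::-1]
    return database


def estimate_fdr(matches, score_cutoff):
    """Estimate FDR as decoy hits / target hits at or above the cut-off.

    `matches` is a list of (score, is_decoy) tuples, one per spectrum match.
    """
    targets = sum(1 for score, is_decoy in matches
                  if score >= score_cutoff and not is_decoy)
    decoys = sum(1 for score, is_decoy in matches
                 if score >= score_cutoff and is_decoy)
    return decoys / targets if targets else 0.0


# Placeholder sequences, NOT real protein sequences.
targets = {"protein_1": "ACDEFGHIK", "protein_2": "LMNPQRSTVWY"}
database = build_target_decoy(targets)

# Hypothetical spectrum matches: (score, matched a decoy entry?).
matches = [(95.0, False), (90.0, False), (88.0, False), (85.0, False),
           (60.0, True), (55.0, False), (40.0, True)]
fdr = estimate_fdr(matches, score_cutoff=50.0)
print(f"{len(database)} database entries, estimated FDR = {fdr:.2f}")
```

In practice the score cut-off is lowered until the estimated rate reaches the chosen limit, which is 0.01 in this study.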

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this paper.