The structure of NocTE

Previous experiments had demonstrated that the epimerization and thioester cleavage reactions take place on β-lactam-containing peptide substrates bound to Ser1779 and not while still linked to a PCP, as observed for E domains17. To confirm the bioinformatic assignment of the catalytic triad, guide mutagenesis experiments, and identify additional residues that could play catalytic roles, the structure of NocTE was solved by MAD phasing and subsequently refined against a high-resolution dataset from a native protein crystal. The overall structure of NocTE is similar to known TE domain structures of NRPS and polyketide synthase (PKS) enzymes (Fig. 2)20. A search for the closest structural homologs with the DALI server21 indicated that NRPS TE domains from surfactin (PDB 1JMK)22 and enterobactin (PDB 5JA2)23 biosynthesis, the PKS TE domain from aflatoxin biosynthesis (PDB 3ILS)24, and the type-II TE domain RifR (PDB 3FLA)25 are closest to the NocTE domain, with rms displacements of 2.3–3.0 Å. The typical α/β hydrolase fold consists of a central seven-stranded β-sheet (β2–β8) surrounded by two and three helices on either side, while two helices on top form the lid region (α4 and α5). Like other TE domains of NRPS and PKS origin, the first N-terminal β-strand of the hydrolase fold is not present in the NocTE domain. The second β-strand is the only anti-parallel strand of the remaining seven strands in the structure. The lid region of NocTE consists of two helices (α4 and α5) that are oriented at an angle of 48° relative to each other and are held together through hydrophobic interactions. The α4 helix of the lid is furthest away from the active site (Fig. 2) and makes minimal contacts with the core of the domain. In contrast, the second lid helix α5 makes extensive contacts with the core through interactions with the α2 and α3 helices and the loops that follow them. The observed positions of the NocTE lid helices are similar to those of the PksA TE (PDB 3ILS) and Srf TE (PDB 2VSQ)26, as shown in Fig. 2c.
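An inter-helix angle such as the 48° quoted for α4 and α5 can be obtained from the dot product of two helix-axis vectors. The following is a minimal sketch of that calculation only, assuming each axis is approximated by a vector between two points along the helix; the function names and the coordinates in the usage lines are hypothetical and are not taken from the NocTE model.

```python
import math

def axis(start, end):
    """Vector from one end of a helix axis to the other (hypothetical helper)."""
    return tuple(e - s for s, e in zip(start, end))

def angle_between(u, v):
    """Angle in degrees between two 3D vectors, from the dot product."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("axis vectors must be non-zero")
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cos_theta = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos_theta))

# Placeholder axis end points in Å (illustrative values, not real data).
helix_a = axis((0.0, 0.0, 0.0), (10.0, 0.0, 0.0))
helix_b = axis((2.0, 1.0, 0.0), (9.0, 8.0, 0.0))
print(round(angle_between(helix_a, helix_b), 1))
```

In practice the axis of each helix would be fitted to its Cα positions rather than taken from two points, and the sign convention (parallel versus antiparallel helices) needs to be stated alongside the angle.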

Fig. 2 Structure of NocTE. NocTE forms a traditional α/β hydrolase fold observed with NRPS thioesterase domains. a The lid helices α4 and α5 are highlighted in wheat. The catalytic triad residues Ser1779, His1901, and Asp1806 are shown in yellow. In the absence of ligands, the N-terminal portion of helix α4 is disordered. b Catalytic triad residues are highlighted in yellow, while one side-chain orientation of His1808 is shown in cyan. c Superposition of PksA TE (PDB 3ILS) and Srf TE (PDB 2VSQ) with NocTE, showing similar lid helix positions. The lid helices of PksA are colored cyan and the helices of SrfA-C are colored green. Loop smoothing is used in (c) for clarity.

The NocTE active site structure with its catalytic triad Ser1779, His1901, and Asp1806 is similar to canonical Type-I thioesterase domains (Fig. 2a). The nucleophilic Ser1779 belonging to the GXSXG motif is present on the loop following the β5 strand and His1901 is located on the loop after β8. The acidic residue Asp1806 is positioned according to its canonical site in α/β hydrolase folds at the loop following the β7 strand. The residues of the catalytic triad form a typical hydrogen bonding network. Interestingly, positioned two residues downstream from the catalytic aspartate (Asp1806) is His1808, which forms an additional hydrogen bond with the catalytic serine (Fig. 2b). In the unliganded structure, His1808 adopts two alternate side chain positions; the principal orientation hydrogen bonds to Ser1779.
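The GXSXG motif that carries the nucleophilic serine can be located in a protein sequence with a simple pattern search. The sketch below assumes a one-letter amino acid string and reports the residue number of the motif serine; the function name, the `first_residue_number` parameter, and the example sequence are hypothetical and are not the NocTE sequence.

```python
import re

# Lookahead so that overlapping motif occurrences are all reported.
GXSXG = re.compile(r"(?=(G.S.G))")

def find_gxsxg_serines(sequence, first_residue_number=1):
    """Return residue numbers of the serine in every GXSXG match.

    first_residue_number is the numbering of sequence[0]; a construct that
    starts mid-protein would pass its own offset here.
    """
    hits = []
    for match in GXSXG.finditer(sequence.upper()):
        # The serine is the third residue of the five-residue motif.
        hits.append(first_residue_number + match.start() + 2)
    return hits

# Made-up sequence for illustration only.
print(find_gxsxg_serines("AAGHSMGAA"))
```

A motif match alone does not establish catalytic function; it only flags candidate serines to check against structure or mutagenesis.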

Site-directed mutagenesis of catalytic residues

The crystal structure of NocTE confirmed the assignment of His1901 and Asp1806 in the catalytic triad with Ser1779, and additionally identified potential catalytic involvement by His1808. Given the location of His1808 in the active site and its lack of conservation in canonical thioesterase domains, we used site-directed mutagenesis to examine the roles of these residues in the unusual epimerization–hydrolysis steps mediated by NocTE.

Previous work had established that mutation of the catalytic serine to alanine completely abolished activity17. Site-specific mutants of the remaining residues His1901 and Asp1806 to alanine were individually constructed and assayed17 against the N-acetylcysteamine thioester (SNAC) of epi-nocardicin G (5, Fig. 3). While the His1901 mutant was completely devoid of detectable activity, substitution of the catalytic aspartate with alanine resulted in full epimerization and hydrolysis, but at a rate lower than that of the wild-type enzyme (Supplementary Fig. 1). Three different variants of NocTE were then prepared to investigate what role, if any, the “extra” histidine residue, His1808, might have in the dual-function TE. This histidine was replaced by alanine, glutamine, and asparagine, where the latter two were intended to mimic the corresponding ε- and δ-imidazole nitrogens of histidine. As depicted in Fig. 3, mutation had little effect on the epimerization/hydrolysis reaction, showing only some loss of stereochemical control in the appearance of epi-nocardicin G (6) in the product profile. Even complete removal of the His1808 side chain in the H1808A mutant resulted in a protein that was still largely active. Thus, it appears that His1808 plays no catalytic role. However, given the increased proportion of 6 in the assays with the mutant enzymes of decreased steric size, His1808 may help enforce substrate orientation in the active site of NocTE. Note that, although the SNAC thioesters of nocardicin G (3) and epi-nocardicin G (5/6) could be synthesized in stereochemically pure form in dry organic solvent, upon addition to aqueous assay buffer spontaneous C-terminal epimerization takes place with a half-life of ~21 min27. Notwithstanding this comparatively rapid rate, as evident in the HPLC trace of substrate 5 in Fig. 3, the combined NocTE-catalyzed epimerization and hydrolysis occurs ~1200 times faster17.
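To put the two rates in this paragraph on a common scale, a half-life can be converted to a first-order rate constant with k = ln 2 / t½. The sketch below is illustrative arithmetic only. It assumes that the spontaneous epimerization follows first-order kinetics and that "~1200 times faster" can be read as a ratio of first-order rate constants; neither assumption is stated in the text, and the function names are hypothetical.

```python
import math

def rate_constant_from_half_life(t_half):
    """First-order rate constant k = ln(2) / t_half (units: 1 / units of t_half)."""
    if t_half <= 0:
        raise ValueError("half-life must be positive")
    return math.log(2) / t_half

def half_life_from_rate_constant(k):
    """Inverse relation: t_half = ln(2) / k."""
    if k <= 0:
        raise ValueError("rate constant must be positive")
    return math.log(2) / k

# Spontaneous epimerization in buffer: t_half of ~21 min (from the text).
k_spontaneous = rate_constant_from_half_life(21.0)   # per minute

# ASSUMPTION: treat the ~1200-fold figure as a ratio of first-order constants.
k_enzymatic = 1200 * k_spontaneous                   # per minute
t_half_enzymatic_s = half_life_from_rate_constant(k_enzymatic) * 60.0

print(f"k_spontaneous = {k_spontaneous:.4f} per min")
print(f"equivalent enzymatic half-life = {t_half_enzymatic_s:.2f} s")
```

Under these assumptions the spontaneous background corresponds to roughly 0.033 per minute, so over an assay lasting a few minutes only a small fraction of substrate would epimerize non-enzymatically.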

Fig. 3 Mutational analysis of His1808. Wild-type or mutant NocTE was incubated with the SNAC thioester of epi-nocardicin G (5) and monitored by HPLC for its ability to catalyze epimerization and hydrolysis to 3 or hydrolysis to 6. Chromatograms for assays were monitored at 272 nm. Mutation of His1808 had little impact on catalytic activity, but the fraction of 6 in the product rises from ≤1% in wild type to ~15% in the H1808A mutant.

Inhibitor design and analysis

Due to the natural hydrolytic or macrocyclization activity of NRPS thioesterase domains, it has proved challenging to obtain structures of acyl-enzyme intermediates. To best explore the structural and mechanistic features of the unique epimerase/hydrolase chemistry performed by NocTE, we pursued the design of small-molecule inhibitors capable of producing an enzyme-inactivator adduct with maximal native structural fidelity28. Biochemical characterization of NocTE had demonstrated its marked selectivity towards β-lactam-bearing, tri- and pentapeptide N-acetylcysteamine (SNAC) thioester substrate analogues17. As a consequence, candidate inhibitors must retain key elements like the azetidinone ring, appropriate amino acid sequence, and stereochemistry to meaningfully duplicate interactions occurring within the NocTE active site.

We first pursued a diphenylphosphonate (DPP) warhead not only because of its well-documented success as a seryl-reactive group, but also because of the ease with which it could be incorporated into amino acid analogues29,30,31. A peptide analogue that employed other serine hydrolase inhibitor classes would either require more complicated syntheses (e.g., boronic acids32), introduce unwanted bulk into the active site (e.g., trifluoromethyl ketones33,34), or prove incompatible with the Hpg moiety, as indicated by preliminary experiments (e.g., aldehydes35). Although both the tripeptide nocardicin G and pentapeptide pro-nocardicin G DPP analogues 7 and 8 (Fig. 1), respectively, were successfully synthesized, incubation with NocTE under a variety of conditions did not result in any adducts detectable by mass spectrometry. Ultimately, the serendipitously discovered ability to convert the DPP group to a much more reactive fluorophosphonate (FP) warhead (4, Fig. 1) using a late-stage, two-step fluorinative hydrolysis and methylation protocol led to a potent and selective inactivator of NocTE that was stable in buffer for prolonged periods of time and conserved the desired structural properties of the enzyme28.

Structure of NocTE bound to tripeptide inhibitor

Initial attempts to introduce the covalent ligand into pregrown crystals by soaking proved unsuccessful. We therefore identified new crystal growth conditions for the covalently modified NocTE, which was prepared by preincubation with FP 4 (Fig. 4). The covalently modified NocTE crystallized in space group P2₁2₁2₁ with four protein chains in the asymmetric unit. The structure of NocTE bound to ligand showed excellent density (Supplementary Fig. 2) for the covalent modification of Ser1779 in all four chains. The density was unequivocal for the complete ligand in chains A and B. Electron density was of lower quality in chains C and D. A glycerol molecule from the cryoprotection solution is located between the phosphonate moiety and bulk solvent in three chains. No density was observed in any of the four chains for a methyl group attached to the phosphonate of the ligand, likely owing to hydrolysis, or well-precedented “aging”, during the extended period of crystal growth36.

Fig. 4 The structure of NocTE reacted with fluorophosphonate 4. a Cartoon representation of NocTE. The central sheet of the protein is highlighted in blue. The ligand is shown with yellow carbon atoms. A glycerol molecule that co-crystallized in all four chains is shown at the base of the pocket. Simulated annealing omit map electron density is shown for the peptide ligand. b Stereo representation of the active site. The captured ligand in the D-configuration is shown with yellow carbon atoms. Superimposed on the structure in cyan is the alternate L-stereoisomer of the C-terminal Hpg residue, which was manually docked into the structure. In this orientation, the phenyl moiety occupies a hydrophobic cavity formed by residues Val1783, Ala1853, Phe1780, and Leu1810 in the NocTE structure. Several waters occupy this pocket in all chains of the structure, with one water molecule potentially approximating the location of the L-Hpg hydroxyl. One oxygen atom from the phosphonate projects back towards the oxyanion hole formed by the amide nitrogens of Phe1780 and Gly1716. Hydrogen bonds are shown in gray, with those of the catalytic triad shown in blue. c Surface representation of NocTE highlights the cavity for binding the substrate L-Hpg (cyan) as well as the more open pocket for the D-Hpg (yellow) product.

Phosphonate-based inactivators have been successfully used to crystallize acyl-enzyme complexes and mimic tetrahedral intermediates of serine proteases36 as well as PKS TE domains29. The NocTE-complex structure with the “aged” phosphonate–Ser adduct (loss of methanol) closely mimics the tetrahedral covalent intermediate of the hydrolysis half-reaction (Fig. 4). Unlike the dynamic lid helices of TE domains from Srf TE22 and Vlm237, NocTE did not show any major rearrangements after ligand binding. Instead, ligand binding caused two minor perturbations in the lid α4 helix (Supplementary Fig. 3). First, in the unliganded structure, the only contact made by the α4 helix with the core domain is between the backbone carbonyl of Glu1829 and the Arg1903 side chain. This contact is lost upon ligand binding, as the α4 helix moves further away from the active site and the Arg1903 side chain shifts outwards. Second, ligand binding is accompanied by an ordering of the N-terminal turn of the helix and the preceding loop. This perturbation enables the formation of a partial hydrophobic groove for interaction with the N-terminal D-Hpg. Since residues from the two lid helices and the loop preceding the α4 helix make intricate hydrophobic interactions with the N-terminal Hpg, the lid in NocTE appears to be involved in determining specificity for the aryl residue at this locus in the substrate.

Substrate interacting residues

The inhibitor is bound in the substrate-binding pocket of the TE domain (Fig. 4). The N-terminal Hpg group sits in a hydrophobic groove formed by Pro1819, Val1823, Leu1847, Gly1850, and Ala1854. The planar β-lactam ring is positioned at the center of the active site cavity and makes hydrophobic interactions through Cβ with Leu1846 and Gly1716 (Supplementary Fig. 4). The inhibitor cradles the Leu1810 side chain, with both Hpg moieties making hydrophobic interactions. The C-terminal Hpg group points toward an exit channel and makes distal hydrophobic interactions with His1808 and catalytic His1901. The His1808 side chain has fully rotated away from the active site to accommodate positioning of the C-terminal Hpg group. Arg1826 is drawn toward the phosphonate group of the inhibitor, making a hydrogen bond with the second phospho-oxygen and additional hydrogen bonds with the side chain of the catalytic Asp1806. The guanidinium of Arg1826 also forms one side of the pocket in which the C-terminal L-Hpg resides.

The backbone amides of Gly1716 and Phe1780 lie within hydrogen-bonding distance (2.7 and 2.9 Å) of one phospho-oxygen of the ligand and form the oxyanion hole (Supplementary Fig. 5). Catalytic His1901 is positioned to hydrogen bond with the second phospho-oxygen, indicating that this oxygen may mimic the attacking water molecule in the post-epimerization hydrolysis reaction (see below). Coordination by the histidine residue from the catalytic triad would serve to activate the water for attack during the hydrolysis of the epimerized peptide.

The fluorophosphonate probe 4 was incubated with NocTE as an epimeric mixture at the Hpg C-terminus. The final ligand density clearly shows the presence of only D-Hpg in the structure. (We note that, because of the replacement of the carbonyl with the phosphonate, the formally defined R/S stereochemistry of the peptide and the FP probe at the C-terminal residue are inverted. We will refer to the observed orientation as the D-epimer, reflecting the configuration of the peptide and not the probe.) Thus, NocTE either selected the D-Hpg diastereomer from the reaction mixture or bound both stereoisomers and catalyzed the conversion of L-Hpg to D-Hpg at the enzyme active site. The resulting seryl-phosphonate adduct subsequently withstood hydrolysis, resulting in the observation of only the peptide harboring the D-epimer. NocTE possesses a large hydrophobic pocket in the active site capable of accommodating the L-Hpg C-terminal epimer. This pocket is formed by the side chains of Val1783, Leu1810, and Ala1853, and the main-chain Cα of Phe1780 (Fig. 4b). Alignment of TE domain sequences from 10 homologs that share 57–100% sequence identity shows that the catalytic triad is completely conserved. Additionally, there is strong homology in the other substrate-binding residues described in Fig. 4, with a few conservative changes of Val1783 to Thr, Pro1819 to Ala, and Arg1826 to His. Only Leu1847, which is near the N-terminal D-Hpg but points away from the active site, shows more significant variation, appearing as a glutamic acid in several homologs.
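Sequence identity values such as the 57–100% range above are computed from pairwise alignments. The sketch below shows one common convention, counting identical residues over columns in which neither sequence has a gap; other conventions divide by alignment length or by the shorter sequence, and the text does not say which was used. The function name and the example strings are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over ungapped columns of two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns that contain a gap in either sequence
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared

# Made-up aligned fragments for illustration only.
print(round(percent_identity("GHSMG-A", "GWSFGLA"), 1))
```

The same column-wise loop can be restricted to a list of positions, for example the catalytic triad or the substrate-binding residues, to check conservation at specific sites across homologs.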

We superimposed the four independent chains of NocTE bound to FP 4 on the unliganded model. The structures of the active site, including the water network, are remarkably conserved (Fig. 5). Three water molecules are conserved in the putative L-Hpg pocket. These waters form a network of hydrogen bonds. In the unliganded structure, a fourth water is present that is positioned near the β-lactam carbonyl oxygen. One of these waters approximates the p-OH group and interacts with the amide of Leu1810 and the carbonyl oxygen of His1808, suggesting these main chain atoms may coordinate the phenolic hydroxyl of the L-Hpg epimer. As noted above, His1808 adopts two conformations in the unliganded structure. In contrast, in all four chains of the NocTE bound to 4, this residue adopts a single conformation in which the side chain has cleared the active site, making room for the D-Hpg side chain. The positions of the water molecules in the L-Hpg pocket are nearly identical in the unliganded and the D-Hpg-bound states. These water molecules presumably must vacate this space when the L-Hpg peptide binds, prior to epimerization.

Fig. 5 Water network of the NocTE active site. Four chains of the inactivated NocTE (shown in different shades of blue) were superimposed on unliganded NocTE (yellow). Water molecules that fill the putative L-Hpg pocket are shown as spheres from the liganded (red) and unliganded (pink) structures. The side chain of His1808 adopts alternate conformations in the unliganded structure. The covalent adduct formed by reaction with inactivator 4 is shown in ball-and-stick from chain A only. Hydrogen bonds that form the network of interactions among the conserved waters, as well as for the catalytic triad and the oxyanion hole, are highlighted for chain A.

Comparison with other TE domain structures

We compared NocTE with the didomain PCP-TE structure of EntF38 to identify the pantetheine binding site (Fig. 6). The PCP docking site is fairly open and is followed by a tunnel through which the phosphopantetheine arm can deliver the substrate to the catalytic Ser1779. The NocTE substrate channel is centered at a deep crevice where the catalytic nucleophile is positioned and opens to the wide exit site. The shape and orientation of the binding pocket differ among TE domains, with some domains possessing an open channel, while the pocket in other TE domains is more closed (Supplementary Fig. 6). The NocTE active site is a fairly open channel that allows entry of solvent molecules, as revealed in the unliganded NocTE structure. The dynamics of the lid helices are further reflected by the fact that the pockets in some unliganded structures collapse and appear almost completely closed in the crystal structures.

Fig. 6 Ligand binding pockets of NocTE and other TE domains. a Superposition of the EntF PCP-TE di-domain (PDB 3TEJ) on NocTE. The core N-terminal α/β hydrolase fold is colored gray and deep blue, and the lid helices orange and wheat, respectively. The EntF PCP domain is colored cyan. The pantetheine atoms derived from the 3TEJ substrate analog are shown in magenta. The pantetheine terminates in a nitrogen derived from the amide analog used in this study. The NocTE inhibitor is in yellow, blue, and red elemental colors. b Superposition of EntF PCP-TE di-domain on NocTE showing the PCP-docking site and pantetheine tunnel in NocTE electrostatic surface representation. c Channel open at both ends in the Vlm2 TE domain (6ECE). d Closed channel with a cavity at the catalytic site in the PksA TE domain (3ILS). e Channel open at one end in the Pks13 TE domain (5V3X).

The loop joining the β6 strand and α4 lid helix, residues Arg1815–Glu1821, is missing in the NocTE unliganded structure. Upon ligand binding, this loop becomes ordered, although it adopts modestly different positions in the four chains. The NocTE complex showed the lid helix α5 interacting with the N-terminal D-Hpg of the ligand (Supplementary Fig. 4). Similar intricate interactions are observed with the corresponding lid helix in the M. tuberculosis Pks13 TE domain bound to a non-covalent benzofuran inhibitor and in the PikTE domain bound to the macrolactone product 10-deoxymethynolide and affinity labels (Supplementary Fig. 7). A common theme emerges in which lid helix α5 may be important for ligand interactions in other TE domains with similar lid helices.

We compared the position of the ligand in NocTE to the recently determined structure of the valinomycin TE, which utilized a 2,3-diaminopropionate non-canonical amino acid to capture a portion of the ligand bound at the position of the catalytic serine37. The VlmTE contains a much more extensive lid region of six α-helices, Lα1 through Lα6 (Fig. 7). In the VlmTE structure, helices Lα1 and Lα5 approximate the positions of NocTE helices α4 and α5, respectively, although Lα1 runs antiparallel to NocTE α4. Compared to the four ordered residues of the depsipeptide covalently bound to VlmTE observed in several chains, the nocardicin peptide adopts a distinct position, such that the N-terminal D-Hpg is sandwiched between the two lid helices. Other TE domains may therefore bind their peptides in a similar pocket, as the N-terminus of the peptide does not need to access the thioester bond for cyclization.