Temperate phages constitute a potentially beneficial genetic reservoir for bacterial innovation despite being selfish entities encoding an infection cycle inherently at odds with bacterial fitness. These phages integrate their genomes into the bacterial host during infection, donating new but deleterious genetic material: the phage genome encodes toxic genes, such as lysins, that kill the bacterium during the phage infection cycle. Remarkably, some bacteria have exploited the destructive properties of phage genes for their own benefit by co-opting them as toxins for functions related to bacterial warfare, virulence, and secretion. However, do toxic phage genes ever become raw material for functional innovation? Here, we report on a toxic phage gene whose product has lost its toxicity and has become a domain of a core cellular factor, SpmX, throughout the bacterial order Caulobacterales. Using a combination of phylogenetics, bioinformatics, structural biology, cell biology, and biochemistry, we have investigated the origin and function of SpmX and determined that its occurrence is the result of the detoxification of a phage peptidoglycan hydrolase gene. We show that the retained, attenuated activity of the phage-derived domain plays an important role in proper cell morphology and developmental regulation in representatives of this large bacterial clade. To our knowledge, this is the first observation of a phage gene domestication event in which a toxic phage gene has been co-opted for core cellular function at the root of a large bacterial clade.

To better characterize the SpmX muramidase domain and the constraints underlying its conservation, we performed an in-depth bioinformatics study of more than 60 available SpmX genes together with structural determination, biochemical analysis, and comparative cell biology between Caulobacter and Asticcacaulis. We show that spmX arose prior to the diversification of Caulobacterales, a large order of stalked bacteria. We establish that the SpmX muramidase domain is a close relative of GH24 autolysin/endolysins that have been laterally exchanged via prophages. We find that the SpmX muramidase domain exhibits attenuated ancestral phage activity, consistent with its remodeled active cleft. Finally, we demonstrate that this enzymatic activity is necessary for SpmX function in three representative species. We conclude that, close to the time of the genesis of the full-length spmX gene, the co-opted muramidase domain accumulated mutations that attenuated its hydrolytic activity on peptidoglycan and detoxified it for bacterial use. To our knowledge, this is the first case of phage gene domestication in which a toxic phage gene has been incorporated into a new core bacterial gene shared by a large bacterial order.

Perplexingly, SpmX contains an N-terminal phage muramidase domain generally toxic to bacteria. Phages use these enzymatic domains to cleave the bacterial cell wall and lyse bacteria to release infectious phage particles. As a part of SpmX, this domain is critical for SpmX’s role in both developmental regulation and stalk biogenesis: the muramidase domain is necessary for proper SpmX localization in both C. crescentus [] and the Asticcacaulis genus []. Various studies have shown that SpmX localizes with the polar scaffold PopZ in C. crescentus [] entirely through the muramidase domain []. The inability to measure enzymatic activity from purified C. crescentus SpmX muramidase domain has led to the conclusion that the domain lost its enzymatic activity and was repurposed for protein interactions and oligomeric assembly []. However, given the remarkable sequence similarity of the SpmX muramidase domain to functional phage lysozymes, including the canonical catalytic glutamate, total loss of enzymatic activity seems unlikely. Why would this domain be so highly conserved if its new function were merely for non-essential protein-protein interactions?

SpmX was first identified as a developmental regulator in the model organism Caulobacter crescentus []. Like most members of Caulobacterales, stalked C. crescentus cells divide asymmetrically to produce a stalked “mother” cell and a motile, flagellated “daughter” or “swarmer” cell. The Caulobacter developmental cycle depends on strict coordination of cell growth, chromosome replication and segregation, and division by various regulatory proteins that differ in localization and timing []. This network depends on regulatory phospho-signaling factors localized and regulated by polar scaffolds. SpmX is one protein scaffold that localizes at the stalked pole during the swarmer-to-stalked cell transition and recruits and potentially activates the histidine kinase DivJ []. Intriguingly, SpmX is required for stalk synthesis initiation and elongation in the closely related Asticcacaulis species A. excentricus and A. biprosthecum []. Therefore, this gene appears to have evolved multiple roles for defining cell morphology within this family of dimorphic, stalked bacteria.

Understanding how new genes arise is key to studying the forces that drive diversity and evolution. Although horizontal gene transfer (HGT) is widely regarded as an important mechanism for exchanging existing genes among bacteria, mobile genetic elements can transfer exogenous genetic material that gives rise to novel genes. These new genes provide the basis for evolving new traits and propelling evolutionary transitions []. Temperate bacteriophages mediate genetic transfer by integrating their genomes into bacterial hosts []. These integrated gene tracts, called prophages, remain dormant until induced by various signals to produce phage particles and proteins that lyse the cell. In many cases, prophages contain genes that benefit the host, promoting prophage retention in many bacterial lineages, even after mutations have inactivated the prophage []. Accumulation of host-specific beneficial mutations in prophages has been referred to as “domestication.” Many domesticated segments of inactivated prophages unexpectedly contain lytic and virion genes, which would intuitively be useless or even detrimental to the bacterial host []. Bacteria can use these genes as weapons against competing bacteria and eukaryotic hosts []. In contrast, we have identified an instance in which a toxic phage gene has not been repurposed as a weapon but has evolved into a domain in a new core bacterial gene, spmX. Here, we report that SpmX resulted from an ancient domestication event at the root of the alphaproteobacterial order Caulobacterales, in which co-option and detoxification of a toxic phage gene gave rise to a novel bacterial gene with roles in developmental regulation and morphogenesis.


To determine whether the loss of SpmX protein levels was particular to using P22 lysozyme, we verified the phenotype when SpmX lacked the muramidase domain entirely. Deletion of the muramidase domain from the spmX locus in all three species also resulted in strains with the ΔspmX phenotype that failed to produce detectable amounts of Δmur-SpmX-sfGFP by western blot ( Figure S5 E). These results suggest that the SpmX muramidase domain is necessary to produce and/or maintain WT levels of SpmX in all three species and that P22 lysozyme, despite high sequence similarity (51%) and structural homology (root-mean-square deviation [RMSD] 1.7 Å), is not sufficient to replace it. P22Lyso and SpmX-Mur-Cc are nonetheless fairly distantly related, so we tested the ability of other SpmX muramidase domains to replace that of C. crescentus. Previously, C. crescentus and Asticcacaulis muramidase domains were shown to be interchangeable [], so we extended the sequence distance to SpmX muramidases from the next closest relative Brevundimonas subvibrioides, which has D20R in the catalytic cleft, and the most distant relative Parvularcula bermudensis, which shares the D20L mutation. We found that the muramidase domain from B. subvibrioides supported the WT phenotype in C. crescentus ( Figure 6 Av), but that the SpmX muramidase domain from P. bermudensis did not. We were surprised to see no evidence of delocalization in the B. subvibrioides SpmX chimera because the L20R point mutant of SpmX in C. crescentus showed some delocalization ( Figure 6 Avi). This result suggests that the L20R mutation in the brevundimonads must coexist with other compensatory mutations. The SpmX muramidase domain from P. bermudensis, like P22 lysozyme, must be too distant from C. crescentus to support WT expression levels. 
In combination with data from the P22Lyso chimeras, these data indicate that a T4L GH fold alone is not sufficient for SpmX function and that the SpmX muramidase domain must contain other mutations necessary for stable protein levels in Caulobacterales. It could suggest that this domain has additional constraints on it unrelated to potential peptidoglycan interactions, such as binding interfaces specific to its function as a recruiting factor and protein scaffold.

Because SpmX localization depended on the ability of the muramidase domain to interact with peptidoglycan, we were interested in whether swapping alternative muramidase domains into SpmX would support WT function. We first made chimeras wherein P22 lysozyme replaced the domain with the hypothesis that (1) P22 lysozyme would be too active and therefore toxic to the cells and that (2) P22 lysozyme E11A might be able to support some level of SpmX localization. Although SpmX and the SpmX-E11A mutant exhibited the previously determined morphological and delocalization phenotypes ( Figures 6 Aiii and 6Aiv), chimeras with P22 lysozyme were surprisingly viable but phenocopied the parent ΔspmX strain and lacked fluorescence. We were unable to detect any GFP-fusion products in this chimera by western blot ( Figure 6 B) but confirmed by sequencing that P22Lyso-SpmX had been correctly inserted at the spmX locus, suggesting that the chimeras were likely expressed but quickly degraded in C. crescentus. Therefore, the phenotype of this chimera is due to the loss of SpmX and not the addition of the P22 lysozyme domain. Inactivating P22 lysozyme (E11A) did not change the outcome, suggesting that the toxicity of the phage muramidase was not driving SpmX degradation.

Overall, these data show that inactivating enzymatic activity or reducing the peptidoglycan-binding capability of the muramidase domain affects SpmX localization and function. Although it is not clear whether disrupting SpmX localization with the E11A mutation stems from eliminating SpmX’s hydrolytic activity or decreasing SpmX’s binding affinity for peptidoglycan, the similar phenotype from mutating a predicted peptidoglycan-interacting residue (N105R) underscores the importance of SpmX-peptidoglycan interactions. That the catalytic mutant has an intermediate morphological phenotype in C. crescentus and one Asticcacaulis species indicates that the muramidase domain may coordinate SpmX functions similarly in the two genera and that this function likely relies on its interactions with peptidoglycan.

Because the E11A intermediate phenotype suggested that the mutation might be disrupting SpmX localization by interfering with peptidoglycan interactions, we also mutated a position associated with peptidoglycan binding, but not catalysis, in T4 lysozyme. N/Q105 has been shown to coordinate peptidoglycan in the active cleft [], and the mutation Q105R abolished activity in T4 phage plaque assays []. The mutation N105R (N91R in SpmX numbering) in C. crescentus and A. biprosthecum resulted in similar delocalization and intermediate morphological phenotypes as E11A ( Figures 5 Av and 5Bv). We also investigated the effects of restoring the phage active site D20 (L28D in SpmX numbering) to the catalytic cleft. However, this had no evident effect on SpmX localization or cell morphology ( Figure 6 Ai), suggesting that the D20L substitution in SpmX, although ancestral, is not strictly necessary for SpmX function. This finding is in line with the observation that SpmX-Mur-Cc-L20D activity was not significantly different from WT in our in vitro RBB assays ( Figure S4 B). Therefore, the D20L substitution was likely a key first step in SpmX detoxification but no longer appears to be under fitness constraints.
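The two numbering schemes used above (T4L numbering and SpmX numbering) are related through a sequence alignment, so the offset between them need not be constant along the protein. As an illustration only, the following Python sketch translates a residue number from one scheme to the other through a gapped pairwise alignment. The function name `map_position` and the toy alignment strings are hypothetical; they do not reproduce the real T4L or SpmX sequences or the alignment used in this study.

```python
def map_position(ref_aln, tgt_aln, ref_pos):
    """Map a 1-based residue number in the reference sequence to the
    corresponding 1-based residue number in the target sequence.

    ref_aln and tgt_aln are the two rows of a pairwise alignment
    (equal length, '-' for gaps). Returns None if the reference
    residue is aligned to a gap in the target.
    """
    if len(ref_aln) != len(tgt_aln):
        raise ValueError("aligned sequences must have equal length")
    ref_i = 0  # residues seen so far in the reference
    tgt_i = 0  # residues seen so far in the target
    for r, t in zip(ref_aln, tgt_aln):
        if r != "-":
            ref_i += 1
        if t != "-":
            tgt_i += 1
        if r != "-" and ref_i == ref_pos:
            return tgt_i if t != "-" else None
    raise ValueError("ref_pos lies outside the reference sequence")


# Hypothetical toy alignment (not real lysozyme sequences).
reference = "--MKE-LD"   # plays the role of the T4L-numbered sequence
target    = "AAMKEGL-"   # plays the role of the SpmX-numbered sequence

print(map_position(reference, target, 3))  # 5: offset of +2 before the insertion
print(map_position(reference, target, 4))  # 7: offset of +3 after the insertion
print(map_position(reference, target, 5))  # None: aligned to a gap in the target
```

In this toy case the offset changes after an insertion in the target, which is the kind of situation in which two residues of the same protein can differ from the reference numbering by different amounts.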

(B) Western blot comparing the ΔspmX parent strain to SpmX mutants and chimeras inserted at the spmX locus. In all cases, the primary antibody is directed against the C-terminal GFP fusion.

(A) Phase and fluorescent images of strains in which the native spmX allele was replaced with the following gene fusions in the ΔspmX parent strain (ii): (i) spmX-L20D-sfGFP; (iii) WT spmX-sfGFP; (iv) spmX-E11A-sfGFP; (v) MurBs-Δmur-SpmX-sfGFP where MurBs is the muramidase domain from Brevundimonas subvibrioides SpmX; and (vi) spmX-L20R-sfGFP. All scale bars are 5 μm.


WT and mutant SpmX GFP fusions allowed us to monitor changes in SpmX cellular localization. As shown previously [], WT SpmX localized at the future position of the stalk (at the pole in C. crescentus, or at sub-polar or bilateral positions in Asticcacaulis) and was retained at this position during stalk elongation (Figures 5A–5Ciii). Both C. crescentus and A. biprosthecum spmX E11A mutants exhibited an increase in delocalized fluorescence throughout the cell body compared to WT (Figures 5A–5Biv). Quantification of the fluorescence data indicated that, although the overall mean cell fluorescence was the same as WT, the SpmX foci were significantly less intense in the mutants (Figures S5A and S5B). We also observed a 3× increase in the stalk fluorescence in A. biprosthecum expressing SpmX E11A compared to WT (Figure S5B). Although no difference in focal fluorescence intensity was observed in A. excentricus spmX E11A mutant cells, more cells had a second SpmX focus at stalk tips than WT cells (Figures 5Civ and S5C), indicating altered localization. Western blots of cells expressing WT SpmX-eGFP and SpmX mutants confirmed that the delocalized fluorescence was not due to clipping of the GFP tag but to delocalized SpmX protein (Figure S5E). Together, these data show that the E11A mutation disrupts SpmX localization in all three species and may underlie the morphological defects observed in C. crescentus and A. biprosthecum.

To determine the role of the preserved, albeit attenuated, activity of the muramidase domain in SpmX function, we inactivated it by mutating the conserved catalytic glutamate to alanine (E11A in T4L numbering; E19A in SpmX numbering) at the chromosomal locus in various species and observed the effects in vivo (Figure 5). We determined the effects of the E11A mutation on cellular morphology, as the C. crescentus, A. excentricus, and A. biprosthecum ΔspmX strains all have morphological phenotypes (Figures 5A–5Cii): in C. crescentus, ΔspmX cells have a characteristic elongated morphology resulting from failed division cycles and often grow stalks prematurely from daughter cells that fail to divide completely (Figure 5Aii) []. In Asticcacaulis, ΔspmX cells lack stalks without other apparent developmental phenotypes (Figures 5B–5Cii) []. If enzymatic activity were critical for overall SpmX function, eliminating catalytic activity with the E11A mutation would be expected to phenocopy ΔspmX. However, we observed intermediate phenotypes for this mutation. In C. crescentus, the E11A mutant population contained both WT-like cells and cells exhibiting the division defect but with less severity than in ΔspmX (Figure 5Aiv). In both Asticcacaulis species, the E11A mutants still grew stalks (Figures 5B–5Civ). Nevertheless, the A. biprosthecum E11A mutant exhibited a significant loss of bilateral stalks (3.5-fold reduction) and an increase in the frequency of cells with a single stalk (Figure S5D). These results suggest that eliminating catalytic activity does not fully inhibit SpmX function.

Phase and fluorescent images of (A) C. crescentus, (B) A. biprosthecum, and (C) A. excentricus. In the top panel, phase images with derived schematics emphasizing stalks and morphologies are shown for (i) WT and (ii) ΔspmX cells. In (Aii), C. crescentus cells exhibiting characteristic ΔspmX divisional defects are marked with asterisks and a cell growing stalks from both poles has its stalks marked with red arrowheads. Phase and fluorescent images of cells expressing (iii) SpmX-eGFP, (iv) SpmX-E11A-eGFP, or (v) SpmX-N105R-eGFP from the native chromosomal locus are shown in the lower panels. In (Aiv) and (Av), cells with divisional defects are marked with white asterisks. In (Biii) and (Biv), cells with one lateral or sub-polar stalk are marked with white arrowheads. In (Civ), cells with foci at the tips of stalks are marked with white arrowheads. All scale bars are 5 μm. See Figure S5 for quantification of fluorescence and morphology data.

Although purified P22Lyso-D20L and SpmX-Mur-Cc had similar activity curves in vitro, Lemo21(DE3) strains expressing SpmX-Mur-Cc never lysed (Figure 4C). This was despite periplasmic expression levels equivalent to those of P22Lyso-D20L (Figure S4D). Different growth conditions and media increased the amount of SpmX-Mur-Cc in the periplasm but did not affect cell viability (Figure S4C). Moreover, SpmX-Mur-Cc was active on sacculi isolated from Lemo21(DE3) (Figure S4E), eliminating the possibility that it could not cleave E. coli peptidoglycan. It is possible that SpmX-Mur-Cc cannot fold correctly in the E. coli periplasm or that its activity is further attenuated in the periplasmic environment. However, the periplasmic expression tests in E. coli confirm that the D20L mutation attenuates P22Lyso hydrolytic activity and thereby increases the amount of protein required to induce lysis. This tuning of enzymatic activity might have served as a critical detoxifying step in the co-option of the muramidase domain from phage. Because SpmX has retained the ancestral catalytic glutamate and its modified catalytic cleft is capable of hydrolytic activity, we conclude that this attenuated activity is under purifying selection in SpmX and must be important for SpmX function.

The enzymatic activity of the P22Lyso-D20L was puzzling in light of early work that reported that D20 mutations inhibited T4 lysozyme in phage plaque assays []. One possible explanation is that the D20L mutation reduces lysozyme activity to the point that it is not suitable for cell lysis at in vivo expression levels and that T4 lysozyme with the mutation was unable to complete infection and form plaques. To explore this possibility, we designed an experimental system to test the activity of P22Lyso and SpmX-Mur-Cc mutants in the E. coli periplasm using fusions to the N-terminal PelB leader sequence (pET22b). Lemo21(DE3) cells expressing P22Lyso lysed without induction ( Figure 4 B), indicating that marginal P22Lyso levels can drive cell lysis. In contrast, cells expressing P22Lyso-D20L lysed only after induction ( Figure 4 C), confirming that much higher enzyme concentrations were needed. Thus, the D20L mutation may represent a critical detoxification step that reduced the ability of the domain to lyse the cell and made it available for co-option.

Because the D20L mutation reduced P22Lyso’s activity close to that of SpmX muramidase, it was possible that this mutation was responsible for SpmX’s attenuated activity. However, restoring the ancestral D20 (SpmX-Mur-Cc-L20D) did not increase SpmX activity in vitro ( Figure S4 B). We suspect that the additional accumulation of mutations in SpmX muramidase, such as the drift observed at Y18 and T26 in the cleft, has made it impossible to restore ancestral phage lysozyme activity with a single mutation. Because the D20L mutation is ancestral in the SpmX phylogeny ( Figure 1 B) and capable of attenuating P22 lysozyme activity to SpmX-like levels, we infer that this mutation likely occurred first. The increased flexibility of the GH motif observed in the SpmX-Mur-Ae structure is therefore the consequence of many mutations that accumulated either neutrally after the D20L substitution attenuated the activity or selectively to shape the new function of the domain as part of SpmX.

Given SpmX’s reported inactivity [] and the structural data suggesting the catalytic cleft is capable of interacting with peptidoglycan, we hypothesized that the domain retains ancestral function in binding peptidoglycan. To test this, various constructs from C. crescentus, A. excentricus, and A. biprosthecum were purified and incubated with sacculi from all three species. Both muramidase and entire soluble domains, including the intermediate domain, bound sacculi from all three species ( Figure S3 ). Because the purified protein was capable of binding its putative substrate, we also tested its ability to hydrolyze peptidoglycan. We used Remazol Brilliant Blue (RBB) assays to compare the activity of SpmX muramidase from C. crescentus (SpmX-Mur-Cc) to P22 lysozyme (P22Lyso) and its D20L mutant (P22Lyso-D20L) ( Figure 4 A) and found that both SpmX-Mur-Cc and P22Lyso-D20L exhibit similarly attenuated hydrolytic activity in comparison to P22Lyso. Both reached maximal levels of RBB release near enzyme concentrations of 15 μM, and P22Lyso reached the same levels near 5 μM. Mutants in which the catalytic glutamate was replaced with alanine (SpmX-Mur-Cc-E11A and P22Lyso-E11A) did not exhibit activity ( Figure S4 A). These data indicate that the “inactivating” substitution D20L attenuates enzymatic activity, whereas mutating the catalytic glutamate abolishes it altogether.

(D) Phase and fluorescent overlays show live-dead staining of Lemo21(DE3) cells expressing P22Lyso-D20L and SpmX-Mur-Cc after 4 h of induction. The green, membrane-permeable dye SYTO 9 stains DNA in live cells, and the red, membrane-impermeable nucleic acid dye propidium iodide labels released nucleoids and DNA from lysed bacteria. The rounding of the E. coli in (i) is characteristic of spheroplast formation and lysis by hydrolytic activity on the cell wall. Scale bars are 5 μm.

(B and C) Growth curves of Lemo21(DE3) E. coli expressing P22 lysozyme (blue), P22 lysozyme D20L mutant (green), and C. crescentus SpmX muramidase (red). Proteins were expressed from pET22b with an N-terminal PelB signal sequence. In (B), strains were grown in 5 mM rhamnose without isopropyl β-D-1-thiogalactopyranoside (IPTG) for maximal repression of basal expression from the plasmids. In (C), strains were grown without rhamnose and induced with 400 μM IPTG at the indicated time. Error bars are ± standard deviation of replicate cultures (n = 3). Lines are drawn to help guide the eye toward basic trends. See Figures S4 C–S4E for enzymatic activity and periplasmic expression of SpmX-Mur and various mutants.

(A) Remazol brilliant blue assays on C. crescentus sacculi using purified P22 lysozyme, P22 lysozyme D20L mutant, and C. crescentus SpmX muramidase. Active enzymes release peptidoglycan monomers covalently bound to RBB into the supernatant that are detected by absorbance at 595 nm. Error bars are ±SD for each normalized absorbance (n = 3). Lines are drawn to help guide the eye toward basic trends. Data points are from various days and sacculi preparations but with internal normalization to hen egg white lysozyme (HEWL). See Figure S3 for peptidoglycan binding activity and Figures S4 A and S4B for SpmX mutant activity in RBB assays.
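The internal normalization mentioned in this legend can be illustrated with a short Python sketch. The exact procedure used in the study is not described here, so the sketch assumes one plausible scheme: each blank-corrected absorbance reading is divided by the blank-corrected HEWL reading from the same sacculi preparation, and replicates are then summarized as mean ± sample standard deviation. The function names and all numerical values below are hypothetical and are not data from this work.

```python
from statistics import mean, stdev


def normalize_to_reference(sample_abs, reference_abs, blank_abs=0.0):
    """Express a blank-corrected absorbance reading as a fraction of the
    blank-corrected reference reading (e.g., HEWL on the same preparation)."""
    denominator = reference_abs - blank_abs
    if denominator <= 0:
        raise ValueError("reference reading must exceed the blank")
    return (sample_abs - blank_abs) / denominator


def summarize(replicates):
    """Return (mean, sample standard deviation) of replicate values."""
    return mean(replicates), stdev(replicates)


# Hypothetical A595 readings for one enzyme concentration (n = 3).
# Each tuple is (sample, HEWL reference, blank) from one preparation.
readings = [(0.35, 1.10, 0.10), (0.40, 1.10, 0.10), (0.45, 1.10, 0.10)]

normalized = [normalize_to_reference(s, ref, blank) for s, ref, blank in readings]
m, sd = summarize(normalized)
print(f"normalized activity: {m:.2f} ± {sd:.2f}")  # normalized activity: 0.30 ± 0.05
```

Normalizing within each preparation before averaging is one way to make data points collected on different days comparable, which is the stated purpose of the HEWL reference.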

GH motif sequence alignments ( Figure 2 B) show that SpmX muramidase domains have lost a highly conserved tyrosine residue at position 18. Although T4L enzymatic activity is not sensitive to mutation at this position [], it is invariant across all the phage lysozyme classes we analyzed. Visualization of Y18 in the P22 lysozyme structure ( Figure 3 C) shows that it interacts with R14 at the base of the beta-hairpin, possibly a critical interaction for coordinating the beta-hairpin with the catalytic glutamate. In SpmX-Mur-Ae, Y18S still appears to make hydrogen-bonding contact with R14; however, most SpmX muramidase domains have non-polar residues at position 18 ( Figure 2 Biv), which may reduce coordination. It has been previously shown that the Y18 position is a hotspot for compensatory mutations that restore activity to inactive catalytic mutants [], and it is intriguing to imagine that mutations at this position in SpmX muramidase are associated with the ability of its remodeled, more flexible catalytic cleft to still bind and/or cleave peptidoglycan.

Obtaining the structure of the SpmX muramidase domain (residues 1–150) from Asticcacaulis excentricus (SpmX-Mur-Ae) ( Table S2 ) allowed us to directly visualize the effect of the D20L and T26X mutations on the catalytic cleft. Overall, SpmX-Mur-Ae exhibits the characteristic T4 lysozyme structure: the predicted catalytic glutamate occurs at the C-terminal end of the first α helix, within the catalytic cleft formed between the N- and C-terminal lobes ( Figure 3 A). P22 lysozyme (the model for molecular replacement) and the active conformation of the distantly related SAR endolysin protein R21 (PDB: 3HDE ) from bacteriophage P21 ( Figure S2 ) are overlaid in the structural alignment in Figure 3 A to emphasize the manner in which the SpmX muramidase domain deviates from these phage lysozymes: besides the extended beta-hairpin in the C-terminal lobe, the canonical GH beta-hairpin in the N-terminal lobe of SpmX-Mur-Ae splays away from the catalytic cleft relative to those of the phage lysozymes. This GH beta-hairpin region exhibited the most conformational differences among the three molecules of SpmX-Mur-Ae in the asymmetric unit. The overlay of the three SpmX-Mur-Ae chains in Figure 3 B illustrates how the orientation of the GH beta-hairpin is tilted by about 16° between chains A and B, suggesting a heightened flexibility in this region compared to other T4L-like lysozymes, which may reduce the ability of the enzyme to coordinate peptidoglycan hydrolysis in the catalytic cleft.

(C) Overlays of ribbon diagrams and surfaces of P22 lysozyme (PDB: 2ANX, left) and SpmX-Mur-Ae (PDB: 6H9D, right) illustrating the conformation of the critical residues E11 (red), D20 (dark blue), R14 (yellow), and Y18 (orange). T4L numbering is used for ease of comparison. These structures have been rotated 180° around the y axis from their representation in (A), (B), (D), and (E).

(B) Structural alignment of the three SpmX-Mur-Ae molecules, chains A (green), B (light blue), and C (dark blue), from the asymmetric unit. The surface of chain B is shown in partially transparent light blue. The double-headed arrow indicates the tilt of about 16° between the GH beta-hairpins of chains B and A.

(A) Structural alignment of P22 lysozyme (PDB: 2ANX; the model used for molecular replacement) in purple, R21 endolysin from P21 (PDB: 3HDE; a distantly related GH24 T4L lysozyme) in navy blue, and SpmX-Mur-Ae in gold (PDB: 6H9D). The catalytic glutamate is shown in red. Root-mean-square deviation (RMSD) 1.7 Å and 40% identity over 141 aligned Cα atoms and Dali Z score 21.5 between P22 lysozyme and SpmX-Mur-Ae are shown. See Table S2 for data collection and refinement statistics.
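For readers less familiar with the RMSD statistic quoted in this legend, the following Python sketch shows how it is computed over paired Cα coordinates. It assumes that the two structures have already been superposed and that the atom pairing is known; structural alignment programs perform both of those steps before reporting an RMSD. The function name `rmsd` and the toy coordinates are hypothetical and are not taken from any deposited structure.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (same units as the input, e.g., Å)
    between two equal-length lists of paired (x, y, z) coordinates.
    Assumes the structures are already superposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical toy coordinates: every atom is displaced by 1 Å along z.
structure_1 = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
structure_2 = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]

print(rmsd(structure_1, structure_2))  # 1.0
```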

We compared the SpmX GH motif to those of lysozymes from the T4L and endolysin and/or autolysin classes, which should share the same family-specific residues. Figure 2 B shows the amino acid conservation in the GH motif of T4L-like, autolysin/endolysin, closely related non-SpmX muramidase and SpmX muramidase protein sequences. Because the autolysin/endolysin class and the closest non-SpmX relatives are likely to be active phage enzymes, highly conserved residues shared by these groups with T4L delineate positions that are evolutionarily constrained for phage lysozyme activity and stability in this clade. For example, D10 is not conserved outside of T4L-like enzymes because the autolysin/endolysin class does not have a salt bridge between D10 and the C-terminal lobe []. On the other hand, all of the putative phage sequences ( Figures 2 Bi–2Biii) conserve the T4 lysozyme “catalytic triad”: the catalytic residue E11 and active site residues D20 and T26. Although the exact roles of D20 and T26 are not clear, they are critical for effective catalysis []. Position D20 is very sensitive to mutation, with only substitutions D20C or D20A retaining the hydrolytic activity of T4L or P22 phage lysozymes []; these substitutions are tellingly well represented among the putative phage sequences. Remarkably, SpmX muramidase domains demonstrate strong conservation of residues required for the GH motif but low conservation of residues associated with catalysis, with the exception of the main catalytic residue, E11 ( Figure 2 Biv). The majority of SpmX genes contain the mutation D20L/R, both of which reduced T4L activity to less than 3% of wild-type (WT) in previous studies [] and which are distinctly unrepresented in the other phage muramidases. Moreover, the T26 position no longer appears to be under selective constraint in SpmX. 
The conservation of the GH motif coupled with the apparent inactivation of the catalytic triad across all SpmX genes suggests that the catalytic cleft has been remodeled structurally and that the muramidase domain may therefore not retain the same level of activity or function as phage GH24v lysozymes.

To determine whether critical enzymatic residues in SpmX muramidase were conserved, we compared SpmX amino acid sequences to other GH24v lysozymes. By definition, lysozymes catalyze the hydrolysis of β1,4-linked glycosidic bonds in peptidoglycan and chitin []. This superfamily includes at least seven distinct groups (five are represented in Figure S2 ) that are unrelated by sequence similarity but share a common fold in which the catalytic Glu and the beta-hairpin motif in the N-terminal lobe pack against the C-terminal lobe to form the catalytic cleft ( Figure 2 A) []. This beta-hairpin, or GH motif, contains family-specific residues critical for enzyme activity in all lysozyme superfamily members [].

Refer to Figure S1 for alignments of SpmX muramidases, which are listed in Table S1 . See also Table S5 for non-SpmX GH24 gene IDs.

(B) HMM logos of GH lysozymes made using WebLogo 3 []. Logos were constructed from protein sequences of (i) T4 lysozyme-like genes (n = 94); (ii) representative autolysins and/or endolysins from the Conserved Domain Database, including P22 lysozyme (n = 20), but excluding SpmX genes; (iii) closest BLAST hits from non-SpmX muramidases (n = 60); and (iv) SpmX muramidases (n = 66), and organized in a cladogram to resemble the sequence cluster tree diagram in Figure S2 . Amino acids are color coded according to chemical properties, with uncharged polar residues in green, neutral residues in purple, basic residues in blue, acidic residues in red, and hydrophobic residues in black. The height of each letter is proportional to the relative frequency of a given identity, and the height of the stack indicates the sequence conservation at that position. T4L numbering is used for ease of comparison. Asterisks mark positions critical for enzymatic activity, and open circles mark positions associated with GH motif stability [].
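The relationship between letter height, stack height, and conservation described in this legend can be made concrete with a small Python sketch. It computes, for one alignment column, the information content in bits (stack height) and the height of each letter (relative frequency multiplied by the stack height). This is a simplified illustration that omits the small-sample correction applied by logo software, and the toy alignment is hypothetical rather than drawn from the sequences analyzed here.

```python
import math
from collections import Counter


def column_information(column, alphabet_size=20):
    """Return (stack height in bits, {residue: letter height}) for one
    alignment column given as a string. Gap characters ('-') are ignored.
    No small-sample correction is applied in this simplified sketch."""
    residues = [r for r in column if r != "-"]
    total = len(residues)
    if total == 0:
        return 0.0, {}
    freqs = {r: c / total for r, c in Counter(residues).items()}
    entropy = -sum(p * math.log2(p) for p in freqs.values())
    info = math.log2(alphabet_size) - entropy  # maximum is log2(20) for proteins
    heights = {r: p * info for r, p in freqs.items()}
    return info, heights


# Hypothetical toy alignment (not real lysozyme sequences).
toy_alignment = ["EGLRL", "EGYRD", "EGLKL", "EGFRL"]
n_columns = len(toy_alignment[0])
columns = ["".join(seq[i] for seq in toy_alignment) for i in range(n_columns)]

for position, column in enumerate(columns, start=1):
    info, heights = column_information(column)
    rounded = {residue: round(h, 2) for residue, h in heights.items()}
    print(position, column, round(info, 2), rounded)
```

An invariant column reaches the maximum stack height of log2(20) ≈ 4.32 bits, whereas a column with several different residues produces a shorter stack split among its letters.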

(A) P22 lysozyme (PDB: 2ANX ) as a model lysozyme colored with rainbow gradient from blue N terminus to red C terminus. The catalytic glutamate appears in fuchsia and the GH beta-hairpin in light blue.

The SpmX Muramidase Domain Retains the Canonical GH24 Motif but Contains Mutations in the Catalytic Cleft Known to Inactivate Phage Lysozymes

Unlike its close relatives that have been transferred horizontally through the bacterial domain via prophage, the SpmX muramidase domain coding region has been inherited vertically as part of the spmX gene in Caulobacterales. The SpmX gene tree mirrors the phylogeny of Caulobacterales from concatenated gene alignments (Figure 1B). None of the spmX genes appear in tracts of prophage genes. The genomic context of spmX appears to be well maintained in members of Caulobacterales, with the gene occurring between a putative Mg²⁺ transporter and a putative isovaleryl-coenzyme A (CoA) dehydrogenase in most species. Together, these findings suggest that the SpmX muramidase domain is derived from an autolysin/endolysin that is no longer within a prophage island but is instead under direct cellular control. It likely fused with the intermediate and TM domains in a common ancestor of Parvularculales and Caulobacterales. The vertical transmission of spmX and the strong sequence conservation of the muramidase domain suggest an important cellular function for the gene among Caulobacterales members.

Consistent with finding close SpmX muramidase relatives in prophages, NCBI’s Conserved Domain Database (CDD) tool [] clustered SpmX muramidase with glycoside hydrolase 24 (GH24) lysozymes in the autolysin/endolysin class. The sequence cluster diagram in Figure S2 illustrates the inferred, ancient evolutionary relationships between lysozyme families based on sequence and structural alignments. These relationships allow us to determine a root for the GH24v lysozymes, with SpmX emerging relatively recently within this ancient clade of phage lysozymes. Autolysin/endolysins are closely related to classical phage T4 lysozyme-like (T4L-like) peptidoglycan hydrolases, which cleave peptidoglycan and lyse cells during the lytic cycle. These lysozymes are distinct from lytic transglycosylases (Figure S2; GH24λ), which include known housekeeping bacterial hydrolases with roles in cell growth and division. Lytic transglycosylases are also assigned to the GH24 group but share no sequence similarity with T4L-like muramidases []. Thus, although core bacterial genomes encode peptidoglycan hydrolases, the SpmX muramidase domain is most closely related to peptidoglycan hydrolases encoded by prophages and phage genomes.

We first determined the prevalence of SpmX and its homologs in the bacterial domain. Simple pBLAST analysis revealed that SpmX, as defined by its three-part architecture with an N-terminal muramidase domain, a charged and proline-rich intermediate domain, and two C-terminal transmembrane (TM) segments (Figure 1A), is taxonomically constrained to Caulobacterales and one member of its sister taxon, Parvularculales. It is conserved as a single-copy gene in all sequenced members (Table S1). Across all 69 identified spmX orthologs, the muramidase domains exhibit high amino acid sequence conservation (Figure S1), the intermediate domains vary widely in both length and sequence, and the TM segments show moderate sequence conservation among genera (Figure 1A). Apart from these orthologs, BLAST searches using SpmX returned hits only for the muramidase domain. These hits came from Gram-negative bacterial genomes that span the entire bacterial domain and from viral genomes. Most of these bacterial genes are likely to be in prophage regions, as evidenced by their position in tracts of prophage genes. We did not detect sequences homologous to SpmX TMs in our search, although we occasionally detected homologous phage muramidase domains fused to other, non-homologous TM segments.

(A) Schematic of SpmX architecture, including the conserved muramidase domain (see Figure S1 for alignments), the variable intermediate domain, and two C-terminal transmembrane (TM) segments. Bar indicates amino acid sequence conservation among spmX alleles (see Table S1 for a list of spmX genes used in this study).

(B) Phylogenetic trees of representative species from Caulobacterales and other Alphaproteobacteria for concatenated housekeeping gene alignments (left) and for SpmX (right), with branch colors indicating the amino acid identity at position 20 of SpmX (D20L in yellow, D20R in red, and D20G in green). See Table S6 for genome IDs. The concatenated housekeeping tree is fully supported, with a posterior probability of 1.0 for all clades. Asterisks indicate clades in the SpmX tree with posterior probabilities >0.95. See Figure S2 for the relationship of the SpmX muramidase domain within the lysozyme superfamily.

Discussion

Bacteriophages shape bacterial evolution in various ways: they increase bacterial diversity by selectively preying on species []; drive horizontal gene transfer []; and serve as reservoirs of raw material for genetic innovation []. Phages are heralded as a major source of genetic material for novel gene emergence in bacteria [], but, as we discuss later in this section, very few examples of novel gene emergence from prophage exist in the literature. We have investigated the origin and function of a taxonomically restricted gene from Caulobacterales, spmX, and determined that its occurrence is the result of the fusion and domestication of a phage peptidoglycan hydrolase gene. Although SpmX functions as a scaffold in developmental regulation and morphology, its muramidase domain retains high sequence similarity to phage lysozymes, which are toxic to bacteria. The active cleft contains mutations that have attenuated the toxic activity of the domain, presumably making it available for genetic innovation and bacterial use. We show here that the domain remains enzymatically active on peptidoglycan and that eliminating this activity alters the function of the full-length protein in vivo. Thus, the SpmX gene represents a core gene innovation specific to the Caulobacterales order that originally arose from a prophage gene with antibacterial activity.

Previously, it was suggested that the SpmX muramidase domain functions only in protein-protein and self-oligomerizing interactions in SpmX’s role as a developmental regulator and scaffold in C. crescentus []. This conclusion was based on the lack of detectable activity from the purified domain and the inability of the catalytic E11R (E19R in SpmX numbering) mutant to self-oligomerize. It is highly likely that the E11R mutation greatly destabilizes the muramidase domain structure. We found that even the E11A mutant eluted in multiple fractions during purification, indicating decreased conformational stability. Moreover, the E11R protein product was no longer detectable in the cells expressing the gene []. Thus, the effect of the E11R mutation is similar to using a distantly related muramidase domain (like P22Lyso) or deleting portions of the muramidase domain entirely []. These data indicate that the muramidase domain plays an unanticipated role in maintaining stable SpmX protein levels across all tested species: without an appropriate muramidase domain, SpmX is misfolded, misprocessed, and/or quickly degraded.

Inactivating the SpmX muramidase domain resulted in developmental defects in C. crescentus and a significant decrease in bilateral stalks in A. biprosthecum. Curiously, inactivating the enzymatic domain did not yield a null phenotype or complete delocalization. It is possible that enough peptidoglycan interactions are maintained in the mutants for the domain to function as a peptidoglycan-binding domain. It is also possible that SpmX recruits proteins with redundant enzymatic activity that cannot be recruited in the ΔspmX mutant, as it is already known that SpmX interacts with targeting factors via its C-terminal domains in Asticcacaulis [] and possibly via its transmembrane segments with DivJ []. Finally, it is hard to distinguish whether there is a direct relationship between catalytic activity and peptidoglycan binding or whether cleaving peptidoglycan could indirectly localize SpmX. The multiple domains and pleiotropic effects of SpmX make it difficult to assess the effects of an individual domain on its in vivo function. However, our data support a model in which the muramidase domain of SpmX is still active, and this activity is used to localize SpmX. We conclude that the muramidase domain functions in localizing SpmX via its interactions with peptidoglycan rather than via self-oligomerization, as previously hypothesized. This proper localization is necessary for its roles in development and morphology.

SpmX emerges in the genomic record at the root of Caulobacterales with the attenuating D20L mutation (Figure 1B). The D20L mutation is therefore ancestral and potentially the initial step in the co-option of the domain. D20L conservation throughout most of Caulobacterales suggests evolutionary constraint on this position despite no observable phenotype from the SpmX-Mur-Cc L20D reversion mutation in vivo or in vitro. After the D20L substitution detoxified the muramidase domain, the domain likely accumulated both neutral and occasional adaptive mutations in the context of its new function. The active cleft contains several modifications, including the loss of selection on the third catalytic triad position, T26, and the invariant residue Y18. This pair is interesting in that Y18 was identified as a hotspot for spontaneous second site revertants of T26 mutants in T4 lysozyme []. It is possible that the changes we see at these two positions are compensatory mutations retaining attenuated activity, although there is no clear history of covariation. Accumulation of these types of mutations likely underlies the inability to restore phage lysozyme-like activity by reversing the D20L substitution. The ancestral D20L mutation has diverged in two groups: Oceanicaulis and Maricaulis (D20R/G) and Brevundimonas (D20R; Figures 1B and S1). Interestingly, D20R/G is covariant with residue N105S/D (Figure S1), a peptidoglycan-interacting residue in T4L []. The covariance of peptidoglycan-interacting residues in these diverging genera further underscores the importance of this domain in peptidoglycan interactions, rather than just protein-protein interactions.
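
Covariation between two alignment positions, such as that noted for positions 20 and 105, is often screened with the mutual information between the two columns. The sketch below is one simple way to compute it and is not the analysis performed in this study: the function name is hypothetical, the residue columns are invented for illustration, and no correction is made for shared ancestry, which can inflate the score when closely related sequences are counted as independent observations.

```python
import math
from collections import Counter


def mutual_information(col_x, col_y):
    """Mutual information (bits) between two equal-length alignment columns."""
    if len(col_x) != len(col_y):
        raise ValueError("columns must have the same number of sequences")
    n = len(col_x)
    count_x = Counter(col_x)
    count_y = Counter(col_y)
    count_xy = Counter(zip(col_x, col_y))
    mi = 0.0
    for (x, y), c in count_xy.items():
        p_joint = c / n
        # p_joint / (p_x * p_y), written with raw counts
        ratio = p_joint * n * n / (count_x[x] * count_y[y])
        mi += p_joint * math.log2(ratio)
    return mi


if __name__ == "__main__":
    # Invented columns: residues at two positions across eight sequences.
    pos_a = "LLLLRRGG"
    pos_b_linked = "NNNNSSDD"     # changes whenever pos_a changes
    pos_b_unlinked = "NSNSNSNS"   # varies independently of pos_a
    print(round(mutual_information(pos_a, pos_b_linked), 3))
    print(round(mutual_information(pos_a, pos_b_unlinked), 3))
```

Perfectly linked columns score as high as the entropy of either column, and independent columns score zero. With few independent substitution events, as in a small clade, a high score is suggestive rather than conclusive.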

spmX arose recently enough to see the hallmarks of novel gene emergence and adaptation in a constrained bacterial clade. The gene either arose from a fusion event in the bacterial genome, or the original phage gene contained the transmembrane segments. Detoxification of the muramidase appears concomitant with the origin of the full SpmX gene comprising three fused domains. Maintenance of the muramidase domain since the emergence of SpmX and its activity in current living Caulobacterales members suggest that its attenuated activity was selected for in the ancestral protein and is still involved in its modern functions. In contrast, SpmX’s downstream intermediate domain is highly variable throughout Caulobacterales (Figure 1A). This domain appears to experience comparatively minimal sequence constraint and has undergone multiple independent events of elaboration and reduction in this clade. This region of charged residues and prolines drives SpmX self-oligomerization in vitro [] and may also facilitate other protein interactions. For example, the intermediate domain appears to be responsible for targeting SpmX to sub-polar and bilateral positions in Asticcacaulis [].

In several reported cases, bacteria have domesticated phage genes for genetic manipulation and transfer, bacterial warfare, virulence, and secretion. However, these events are distinct from that which created the novel bacterial gene spmX. Phage genes for DNA replication and recombination have replaced bacterial functional homologs within bacterial genomes several times []; however, these genes retain their original function and carry out the same tasks. Gene transfer agents (GTAs) pose an interesting case where virion proteins from cryptic prophage package random DNA from the bacterial genome to presumably share with other bacteria []. Although a specific GTA has been stably maintained across several alphaproteobacterial orders, this domesticated island of phage genes still shuttles DNA around, as it once did in ancestral infectious cycles []. Phage tails have been weaponized many times, resulting in type VI secretion systems [], tailocins and phage tail-like bacteriocins [], phage tail-like systems with insecticidal properties [], and phage tail-like arrays []. All of these represent a “guns for hire” acquisition scheme in which phage genes are co-opted for their ancestral toxicity and function []. Many of these genes reside in genomic islands and confer environmental, niche-specific advantages that directly exploit their ancestral activity for the benefit of the host. Similarly, in two other known cases of phage lysozyme domestication in bacteria, muramidase domains have been fused to colicins [] or are predicted to be secreted with type III secretion systems [], presumably for use in bacterial warfare or infection. In one strange case, a phage lysozyme gene has been co-opted in bivalve genomes, which apparently still use the gene for its antibacterial properties [].

The domestication of the muramidase domain in SpmX is distinct from the above cases of “guns for hire” because the phage gene has been incorporated into a novel bacterial gene with new function in basic cellular processes in a large bacterial order. The SpmX muramidase domain, although active, no longer lyses bacterial cells; instead, it plays a role in localizing SpmX for its function in developmental regulation and morphogenesis. The co-option of phage genes for core cellular function is likely a common event in nature, but identifying such genes may require a careful search. Based on our findings, we suggest a future strategy for their detection: searching for phage gene homologs with long histories of vertical inheritance and signs of innovation in bacterial genomes.