tRNA-seq reveals tRNA modifications in bacterial cultures

To determine the feasibility of employing tRNA-seq for modification studies in simple model systems, we performed DM-tRNA-seq on bacterial cultures from four species: Escherichia coli, Bacillus subtilis, Staphylococcus aureus, and Barnesiella viscericola. Bacterial tRNAs share some modifications with eukaryotic tRNAs, but also possess unique modifications of their own21. Since almost all E. coli tRNAs and the majority of B. subtilis tRNAs have been mapped for modifications by 2D-TLC and LC/MS, these two cultures allowed us to benchmark tRNA-seq results against the known modifications. In contrast, S. aureus and B. viscericola were suitable to identify new modifications using tRNA-seq.

We focused our analysis on Watson–Crick face base modifications for which DM-tRNA-seq was particularly well suited20. We first determined the mutation and the stop fraction at each nucleotide position within tRNA sequences20. Here, we calculated the mutation fraction by only including sequencing reads that pass the modification site while excluding shorter reads that stopped before the modification site. Figure 1 and Supplementary Figure 1 display representative mutation and stop fractions for several tRNAs with modification sites annotated using standard tRNA nomenclature22. E. coli tRNAPro(CGG) has N1-methylguanosine (m1G)37 and 4-thiouridine (s4U8), but only m1G can be removed by the demethylase treatment. As expected, both mutations and stops showed a reduction to the background levels at the m1G37 position in the demethylase-treated sample (Fig. 1a, Supplementary Fig. 1a). We identified a second mutation peak at position U8 that was more than 10-folds higher compared to the background level. This peak did not change in the demethylase-treated sample, consistent with the known s4U modification. E. coli tRNAPhe(GAA) has 3-(3-amino-3-carboxypropyl)uridine (acp3U)47, 2-methylthio-N6-isopentenyladenosine (ms2i6A)37, and s4U8, all were detected by mutation, and none responded to demethylase treatment (Fig. 1b, Supplementary Fig. 1b). Ms2i6A37 showed ~99% stops, and ~45% of the remaining 1% read-through were mutations. B. subtilis has m1A22 modifications, which are absent in E. coli. This methylation can be almost fully removed by the demethylases as reflected by the reduction of mutation fraction upon demethylase treatment for B. subtilis tRNASer(UGA) and tRNAGlu(UUC) (Fig. 1c, d). For tRNASer(UGA), the s4U8 was readily identified through its mutations (Fig. 1c), whereas the N6-isopentenyladenosine (i6A37) modification only showed stops (Supplementary Fig. 1c). Modifications in tRNAGlu(UUC) have not been previously mapped; our result is consistent with the presence of m1A22 in this tRNA (Fig. 1d, Supplementary Fig. 1d). No tRNA modifications have been mapped in S. aureus and B. viscericola. Our results are consistent with the presence of demethylase-sensitive m1G37 and m1A22 and demethylase-insensitive s4U8 in S. aureus tRNALeu(UAG) (Fig. 1e, Supplementary Fig. 1e), and m1A22 and s4U8 in tRNASer(GCU) (Fig. 1f, Supplementary Fig. 1f). Finally, we identified mutation and stop signatures that are consistent with the expected m1G37 and inosine (I)34 in B. viscericola tRNAArg(ICG) (Fig. 1g, Supplementary Fig. 1g). Using mutation and/or stop signatures, our results are consistent with all known Watson–Crick face base modifications in these four bacterial species (Supplementary Fig. 2, Supplementary Table 1).

Fig. 1 tRNA modifications of bacterial cultures. Red and black lines show mutation fractions in representative tRNA sequences with (+DM) and without (-DM) demethylase treatment, respectively. a E. coli tRNAPro(CGG) shows m1G37 and s4U8. b E. coli tRNAPhe(GAA) shows acp3U47, ms2i6A37 and s4U8. The additional peak denoted by asterisk (*) may represent an unknown modification. c B. subtilis tRNASer(UGA) shows m1A22, and s4U8. d B. subtilis tRNAGlu(UUC) shows m1A22. e S. aureus tRNALeu(UAG) shows m1G37, m1A22 and s4U8. f S. aureus tRNASer(GCU) shows m1A22 and s4U8. g B. viscericola tRNAArg(ACG) shows m1G37 and I34 Full size image

We examined the complete mutation profiles that correspond to m1A, m1G, and s4U for tRNAs in the four bacterial cultures (Fig. 2, Supplementary Fig. 3). We only found mutation signatures that are consistent with m1A at position 22 located at the junction of the D stem and D loop in B. subtilis and S. aureus, but not in E. coli and B. viscericola (Fig. 2a). We found mutation signatures that are consistent with 13 and 12 tRNAs with m1A22 modification in B. subtilis and S. aureus, respectively; only four have been previously mapped in B. subtilis by 2D-TLC23,24. Among all tRNAs with A22 in these two bacteria, all with G/A13–A22 pairs were modified, whereas all with U13–A22 were not modified, consistent with this modification requiring a non-Watson–Crick base paired A22. M1A22 mutation fractions ranged from 0.81–0.94 in B. subtilis and 0.61–0.94 in S. aureus, suggesting that most tRNAs are modified at high fractions at A22 under our culture conditions. M1G is located at position 37 immediately 3’ to the anticodon nucleotides in bacteria. In all four bacteria, we found that mutation fractions ranged from 0.33–0.90 when the tRNA has G37 in its sequence (Supplementary Fig. 3). Unlike m1A22, m1G37 was accompanied by highly variable stop fractions in a tRNA-dependent manner (ref. 20, Supplementary Fig. 1). Since m1G37 prevents translational frameshifting25,26 and is thought to be present at high fractions in tRNA, mutation alone may not correlate well with modification fractions for m1G37.

Fig. 2 Mutation fractions of two tRNA sites. Heatmaps of mutation fractions for positions 22 and 8 (using standard, canonical tRNA nomenclature) are shown. tRNAs with different anticodons are grouped by their sequences at the respective position of modification (in parenthesis) and in alphabetical orders. Only E. coli and B. subtilis tRNA modifications have been mapped previously by 2D-TLC and LC/MS, but the mapping was not done for every tRNA species12. Every E. coli and B. subtilis tRNA species with mutation fraction at10-times above background is marked with a circle on the right with the following designations: Purples correspond to those known to be present and also identified by sequencing here; blacks correspond to those supposed to be absent but identified by sequencing; greens correspond to those not mapped previously but identified by sequencing; oranges correspond to those known to be present but were not found by sequencing. a m1A22; R corresponds to A or G. b s4U8 Full size image

We identified mutation signatures consistent with many s4U8 modifications in three of the four bacteria (Fig. 2b). We found 28 sites in the E. coli tRNA with mutation fractions of 0.03–0.26 (0.03 corresponded to >10× above background), 20 of which overlapped with the known s4U8 modifications by 2D-TLC and LC/MS. The other 8 sites could either be present only under our culture conditions or were missed in the previous mapping studies. We found 5 sites in B. subtilis tRNAs with mutation fractions of 0.03–0.25, two of which overlapped with the known s4U8 modifications. We found 21 sites with mutation fractions of 0.03–0.39 in S. aureus tRNAs, indicating that S. aureus tRNAs can be highly modified with s4U8. B. viscericola showed no mutation significantly above background for any tRNA, suggesting that s4U8 is absent in this bacterium. Overall, our results suggest that bacterial tRNA modifications can be highly variable at the transcriptome level among tRNA species even under mid-log growth conditions. These findings all together demonstrate the utility of tRNA-seq to study modifications in bacterial tRNA.

Community and anticodon profiles in gut microbiomes

Next, we evaluated the utility of tRNA-seq to infer tRNA abundance and modification profiles in complex gut microbial ecosystems using mice that were fed either a HF or LF diet. Since HF and LF diet affect microbial community structures in the gut27, this model provided an opportunity to also compare community structures inferred by tRNA-seq to those that were identified through 16S rRNA gene amplicons. We collected our samples from the mouse cecum, and processed them both for 16S rRNA gene analysis for taxonomic comparisons, and tRNA-seq with and without demethylase treatment. We developed an open-source software, tRNA-seq-tools (https://github.com/merenlab/tRNA-seq-tools) to de novo identify tRNA sequences with conserved sequence motifs and secondary structures from the raw sequencing data (Fig. 3a). We assigned taxonomy to resulting tRNA sequences, and we used only those sequences that were assigned to a unique anticodon for downstream analyses.

Fig. 3 Microbiome tRNA-seq workflow and taxonomy analysis. a Workflow of tRNA sequencing of gut microbiome samples fed with a high-fat (HF) or low-fat (LF) diet and de novo tRNA assignment. Conserved tRNA residues that were searched for in this work are shown in red. b Dendrograms compare relationships between HF and LF samples that were inferred based on community profiles of tRNA transcripts, or 16S rRNA gene amplicons. c Class-level taxonomy for averaged HF and LF samples based on tRNA-seq (top) and 16S rRNA gene amplicons (bottom). All bacterial classes at >1% level are shown in distinct colors, all other bacterial classes are grouped together and shown in purple. d tRNAGly taxonomy for anticodons GCC, UCC, and CCC. e tRNAGlu taxonomy for anticodons UUC and CUC. Among the other category for GCC/UCC/CCC and UUC, no class has an abundance of ≥1%; for CUC, other classes with an abundance of ≥1% include Alphaproteobacteria, Gemmatimonadetes, and Ignavibacteria. tRNAs decoding these two amino acids are the most abundant in our tRNA-seq results Full size image

DM-tRNA-seq generated all tRNA reads starting from the conserved 3’-CCA8. We designed tRNA-seq-tools to use the canonical tRNA sequence motifs (Supplementary Fig. 4) that start with the 7 nucleotides 3’ acceptor stem and 17 nucleotides TΨC stem-loop, a variable region of 4–5 for type I and 13–22 nucleotides for type II tRNAs (tRNASer/Leu/Tyr for bacteria), and finally a 17 nucleotides anticodon stem-loop. We included the conserved, canonical tRNA sequence of GTTC/C in the TΨC stem-loop and T 5’ and A/G 3’ to the anticodon nucleotides. We further restricted our assignment to anticodon stems containing only Watson–Crick or G-U wobble base pairs, but allowed for one mismatch among all conserved nucleotide sequences. The minimal length of our assigned tRNA reads corresponded to the 5’ end of the anticodon stem to the 3’-CCA which was 48–49 for type I and 57–66 nucleotides for type II tRNAs.

Our analysis on average resulted in ~2.2 million tRNA sequences per demethylase-treated sample containing unambiguous anticodon assignments (Supplementary Table 2). There were on average ~3.8 million other reads that were either too short to be assigned to a unique tRNA anticodon or did not unambiguously fit to the canonical tRNA signatures. Short sequences are often the result of persistent modifications demethylases fail to remove, or other tRNA structural features that interrupt the cDNA synthesis. An example of the former in our data was the ms2i6A37 modification in E. coli tRNAPhe which stopped the RT at ~99% (Supplementary Fig. 1b). Therefore, most tRNAPhe reads in E. coli did not reach the anticodon and would not have been taken into consideration here. For the tRNA sequences with anticodon assignments, grouping the biological replicates by individual anticodons showed good compatibility among the HF or LF groups (Supplementary Fig. 5). Our tRNA extraction method from the mouse cecum resulted in less than 1% contamination with mouse tRNA sequences, which was consistent with our microarray studies28. We randomly selected one type I and one type II tRNA and aligned matching quality-filtered short reads to visualize the coverage and homogeneity of individual nucleotide positions (Supplementary Fig. 6). Consistent with our assignment rules, all reads were beyond the 5’ stem of the anticodon stem-loop with gradual drop-offs toward the 5’ end of tRNA, which we had also observed with human tRNA reads8.

We performed de novo analysis of tRNA sequences and 16S rRNA gene amplicons using Minimum Entropy Decomposition29 to infer community structures, which revealed nearly identical relationships between samples (Fig. 3b). We also compared the average taxonomic distribution of HF/LF samples from tRNA-seq and 16S rRNA gene amplicons (Fig. 3c). While we used the SILVA database30 with >350,000 entries to assign taxonomy for 16S rRNA gene amplicons, we used 4235 gold-standard bacterial genomes obtained from the Ensembl database31 to assign taxonomy for tRNA sequences (Supplementary Table 3). For the six most abundant bacterial classes, the full tRNA taxonomy qualitatively matched the 16S rRNA gene-based taxonomy (Fig. 3c), but proportions of taxa differed between the two approaches. For instance, tRNA-seq showed higher fractions of Clostridia and Actinobacteria, and lower fractions of Bacteriodia and Erysipelotrichia classes. Multiple factors could result in differences between abundance estimates of the two approaches, one among them being the utilization of RNA transcripts in the tRNA-seq workflow in contrast to PCR amplification of genomic DNA in the 16S rRNA gene-based workflow (see Discussion).

We also analyzed the taxonomic make up of tRNA sequences at the anticodon-level (Fig. 3d, e, Supplementary Table 4). The two most abundant tRNA sequences in our dataset decoded amino acids of glycine and glutamic acid. Previous analyses based on available genomes show that for tRNAGly (Fig. 3d), UCC is present in only all, GCC in most, and CCC in only a small number of bacterial taxa 32. The taxonomy of tRNAGly(GCC) was similar to the taxonomic make up of all tRNA sequences. The tRNAGly(UCC) taxonomy has a larger representation of Clostridia, and the tRNAGly(CCC) taxonomy diverged from the all-tRNA taxonomic profiles such as it showed more Bacilli in the HF sample. Similarly, for tRNAGlu (Fig. 3e), UUC is present in all, while CUC is found only in some bacterial genomes. Our results showed that the tRNAGlu(UUC) taxonomy was more similar to the overall taxonomy of all tRNA sequences, whereas the tRNAGlu(CUC) taxonomy had a different profile with a larger representation of Actinobacteria in the LF sample.

To gain further insights into tRNA anticodon-based taxonomy, we performed additional analyses of the anticodons CUC and UUC of sequences that encode for glutamic acid (Supplementary Fig. 7). tRNAGlu(UUC) can read both GAA and GAG codons and is essential in all cells, whereas tRNAGlu(CUC) can only read GAG codon and is optional for life. We found that the taxonomic proportion of tRNAGlu(UUC) varies widely among the six major bacterial classes compared to the taxonomic profiles based on all tRNA sequences as well as 16S rRNA gene amplicons (Supplementary Fig. 7a). Zooming into the class Bacilli further, we found that 75% of all tRNAGlu(CUC) genes (78/104) are concentrated in the Lactobacillaceae family (Supplementary Fig. 7b), suggesting that this family can be uniquely assessed using tRNAGlu(CUC) reads. Lactobacillaceae could be resolved both through tRNA sequences and 16S rRNA gene amplicons which showed similar abundance patterns between HF and LF microbiome samples (Supplementary Fig. 7c).

Focusing further on tRNAGlu(CUC) and tRNAGlu(UUC) reads from the family of Lactobacillaceae, we found that the ratio of tRNAGlu(CUC) to tRNAGlu(UUC) was higher in the LF samples than the HF samples (Supplementary Fig. 7d). This could be due to either higher tRNAGlu(CUC) expression or a higher proportion of Lactobacillaceae organisms that contain tRNAGlu(CUC) in the LF samples. In the case of the former, the increased tRNAGlu(CUC) expression could potentially add an additional capacity for the Lactobacillaceae in LF samples to specifically decode the GAG codon in translation. Finally, we found that all tRNAGlu(CUC) reads in our samples are from the Lactobacillus genus in the Lactobacillaceae family. This genus is represented by 31 genomes in our tRNA database; 18 different tRNAGlu(CUC) gene sequences are present among the 31 genomes, which allowed us to investigate the species-level distribution of reads that resolved to tRNAGlu(CUC) (Supplementary Fig. 7e). Overall, these results indicate that anticodon-level taxonomic analyses of tRNA transcripts can provide additional insights into physiological differences between individual branches of bacteria.

Diet dependent differences in tRNA modification levels

A unique feature of tRNA-seq is the capability to assess in situ tRNA modifications. Compared to bacterial cultures, the analysis of modifications in tRNA transcripts in complex microbial communities presents additional challenges. For instance, even though both mutations and stops are useful for tRNA modification studies in pure cultures, in microbiome studies we can only rely on mutation information since stops yield short reads that often do not assign to specific tRNA seed sequences unambiguously. Mutations in the sequencing data can be identified through the alignment of individual reads to reference seed sequences. For bacterial cultures, the seed sequences are simply the annotated tRNA genes. In contrast, complex environments may require de novo-identified tRNA sequences to serve as seed sequences for modification analyses due to the lack of comprehensive reference genomes (Fig. 4a, Supplementary Fig. 8).

Fig. 4 Microbiome tRNA modification analysis. a Workflow for modification assignment using mutation signatures. b–e Representative positional plots showing m1A and s4U modifications for transcripts of tRNASer(GCU) (b), tRNASer(UGA) (c), tRNASer(GGA) (d), and tRNASer(CGA) (e), HF-fed mouse sample. The peak numbers correspond to those described in the text with peak 1 called for s4U8 and peak 2 for m1A22. Peak 3 is located around nucleotide 73–79 in the type II tRNASers, but is m1A59 in the standard tRNA nomenclature. Red and black lines show mutation fractions in tRNASer sequences with (+DM) and without (−DM) demethylase treatment, respectively Full size image

To study modification fractions we selected our seed sequences from demethylase-treated samples. Our clustering-based approach (see Materials and Methods) resulted in an average of ~10,700 seed sequences for each sample, which was approximately proportional to the total number of tRNA reads per sample (Supplementary Table 2).

We ran a multiple sequence alignment for the same anticodon while allowing flexible lengths in the variable loop and the α and β regions of the D loop to visualize our results. Figure 4 shows our workflow to study modification fractions, and the alignment results for the tRNASer transcripts in a HF sample. We found three major peaks for tRNASer(GCU) (Fig. 4b) without demethylase treatment. Peaks 2 and 3 disappeared upon demethylase treatment, therefore assigning them to m1A. Peak 1 remained upon demethylase treatment, thereby assigning it to s4U. We found the same 3 peaks for the other three tRNASers with the same responses to demethylase treatment (Fig. 4c–e). We found a demethylase-removable fourth peak for tRNASer(CGA) (Fig. 4e) which may represent m1A in tRNA species with shorter variable arms.

We then analyzed bacterial taxon-dependent modifications. Examples in Fig. 5a were tRNASer(UGA) in the same microbiome sample. For this tRNA from Lactobacillus, class Bacilli, we readily identified mutation signatures consistent with m1A22 and s4U8 (Fig. 5a, Supplementary Fig. 9a), both were also known in tRNASer(UGA) of B. subtilis, class Bacilli. For Bifidobacterium, class Actinobacteria, mutation signature consistent with an m1A in the T loop (m1A59 in standard tRNA nomenclature) can be readily identified, but this tRNA did not have s4U8 (Fig. 5a, Supplementary Fig. 9b). M1A59 modification has also been found in Streptomyces griseus12 in class Actinobacteria, although S. griseus is in the order Actinomycetales, whereas Bifidobacterium belongs to the order Bifidobacteriales. We also identified the more common m1A58 modifications, e.g. in tRNAPhe from Bifidobacterium (Supplementary Fig. 9c).

Fig. 5 Taxonomic differences of modification sites. a Examples of aligning tRNA sequencing reads to two seed sequences of tRNASer(UGA) from Lactobacillus, class Bacilli, and Bifidobacterium, class Actinobacteria without and with demethylase treatment. Modification sites identified (s4U and m1A) are highlighted between the white lines. b m1A22, m1A58/59, and s4U8 identified in the abundant bacterial classes from Fig. 3c in the context of their phylogenetic relationship. Large fonts indicate bacterial classes in which the majority of the modifications are found (m1A22 in Clostridia and Bacilli, m1A58/59 in Actinobacteria, and s4U8 in Clostridia and Bacilli). c Proximal location of the m1A22 (red), m158 (blue), and m159 (green) modifications in a tRNA three-dimensional structure Full size image

We identified a class-dependent modification pattern at three positions (Fig. 5b). M1A22 was predominantly present in the closely related Clostridia and Bacilli classes. M1A58 and m1A59 were predominantly present in Actinobacteria. S4U8 is widely spread in most of the bacterial classes, but absent in Bacteriodia and Deltaproteobacteria. The absence of m1A and s4U in Bacteriodia is unlikely due to the lack of sequencing depth as this result is consistent with our bacterial culture study that did not find s4U8 for B. viscericola from class Bacteriodia (Fig. 2b). We superimposed the three m1A modifications onto the three-dimensional tRNA structure (Fig. 5c). M1A22, m1A58, and m1A59 all introduce a positive charge in the tRNA, and are all located in the elbow region of the tRNA structure, suggesting that they may serve a common function.

To investigate whether modification fractions changed between HF and LF group, we analyzed the mutation fractions of m1A and s4U sites in all tRNAs (Fig. 6). The mutation fraction is not fully quantitative for absolute modification fraction due to tRNA sequence context-dependent differences. However, comparative analyses of modification levels at each site between two samples can be interpreted with high confidence in regard to relative changes in modification fractions, as each site has the same sequence context. For Lachnospiraceae, m1A22 was present in eight tRNA anticodon groups, but in seven of the eight, the median level was higher in the HF than the LF samples (Fig. 6a). For Bifidobacteriaceae, m1A58 and m1A59 were present in 12 tRNA anticodon groups; in all 12 groups, the median level was higher in the HF than the LF samples (Fig. 6b). The m1A level change under dietary conditions of the mouse gut microbiome may be related to translation activity. In a human cell study, m1A modified tRNA had an increased affinity to the translation elongation factor and increased the expression of a reporter gene33. On the other hand, m1A-hypomodified tRNAs may be better tuned for interaction with other proteins in cells to perform extra-translational functions34.

Fig. 6 Comparisons of mutation fractions of HF versus LF samples. Bacterial families with the highest numbers of modifications at the respective nucleotides are shown. Each pair shows HF and LF samples with distinct anticodons marked on top. The amino acid whose codons are read by the corresponding tRNA with designated anticodon can be found in Supplementary Table 4. Box-and-whisker plots show median as a line, upper and lower quartiles in the box, and outliers outside of the line. a m1A22 from Lachnospiraceae, class Clostridia. b m1A58 and m1A59 from Bifidobacteriaceae, class Actinobacteria. c s4U8 from Lachnospiraceae, class Clostridia Full size image

For Lachnospiraceae, s4U8 was present in 24 tRNA anticodon groups; no clear preference for HF versus LF samples could be discerned (Fig. 6c), indicating that the s4U levels did not respond to the dietary conditions of the mouse gut microbiome. S4U8 is known to serve as a sensor to elicit cellular responses to UV;35 gut microbiomes are not exposed to UV light and therefore may not adjust its s4U8 levels to dietary differences.

tRNA modification fractions and protein expression levels

While differences across dietary conditions suggest tRNA modification fractions could reflect environmental responses, these results by themselves do not communicate whether modification fractions are associated with protein synthesis. To investigate associations between the m1A modification fractions observed in our study and protein synthesis, we used the gut metaproteomics data generated by Figeys, Stintzi, Mack, and co-workers36 also from mice fed with HF and LF diet. In their study, the authors describe a total of 849 proteins with significantly different expression levels between the HF and LF samples collected over a course of 43 days. The day 29 and 43 experimental conditions most closely resemble the experimental condition of our tRNA-seq experiment. The metaproteomics data from both day 29 and 43 showed a marked difference between HF and LF conditions (e.g., Figure 7a). Our taxonomic analysis assigned 77% (655/849) of these proteins to the class Clostridia (Fig. 7b), and we focused our subsequent analysis specifically on this group (Fig. 7c). To examine a possible relationship between these differentially expressed proteins and tRNA m1A modifications, we compared the amino acid and codon compositions of these proteins that are highly expressed in HF samples (log >1) and depleted in HF samples (log < −1). In Clostridia, m1A modification is present in tRNAs that decode amino acids Cys, Glu, Gln, and Ser (Fig. 6a). We combined amino acid residues or codons based on protein expression levels in HF over LF, and subtracted the compositions of HF over-expressed proteins minus the HF under-expressed proteins (Fig. 7d, e). In both cases, Cys and Glu, but not Gln and Ser are over-represented among the highly over-expressed proteins, although this over-representation was not unique to Cys and Glu.

Fig. 7 Analysis of differential protein expression and tRNA modification. Proteomics data from HF- and LF-fed mice were from reference36. a Average differential expression of 849 proteins between HF- and LF-fed mouse gut microbiome from day 43 mice that most mimics the experimental condition of our tRNA-seq experiment. b BLASTp protein sequence analysis shows that most of these proteins are from class Clostridia. c Quantitative difference between the clostridia proteins from day 43 mice. Lines show the boundaries of the proteins used for downstream analysis that are highly enriched (log >1, 88 proteins) or depleted (log< −1, 105 proteins) in HF over LF samples. The difference in amino acid (d) or codon content (e) determined by subtracting the compositions of HF over-expressed proteins minus the HF under-expressed proteins. The amino acids or codons for which their decoding tRNAs were found to contain m1A modifications are in red: Cys, Glu, Gln, Ser. The difference in amino acid pair (f) and codon pair content (g) determined by subtracting the pair compositions of HF over-expressed proteins minus the HF under-expressed proteins. The first amino acid represents the N-terminal residue and the first codon represents the 5’ codon Full size image

Since ribosome decoding includes an adjacent pair of mRNA codons in the A and P site, we also combined amino acid and codon pairs in a similar fashion (Fig. 7f, g, Supplementary Fig. 10). Excitingly, the top five most over-represented amino acid pairs in both day 29 and 43 HF over-expressed proteins relative to the HF depleted proteins were E-E (Glu-Glu), K-E (Lys-Glu), L-E (Leu-Glu), E-K (Glu-Lys), and R-E (Arg-Glu). For codon pairs, the top five most over-represented were GAG-GAA (Glu-Glu), AAG-GAA (Lys-Glu), GCG-GAG (Ala-Glu), CTG-GAA (Leu-Glu), and CCG-GAG (Arg-Glu) which closely matched the over-represented amino acid pairs. Decoding the top five over-represented amino acid and codon pairs all involves the m1A-containing tRNAGlu (anticodon CUC, UUC) in HF samples (Fig. 6a). These proteomic results are therefore consistent with our hypothesis that tRNAs with higher m1A levels enhance the decoding of their respective amino acid/codon pairs. Overall, this analysis indicates that tRNA m1A modification level differences are consistent with m1A modified tRNAs facilitating increased expression of microbiome proteins with specific amino acid and codon contexts.