Genomic characteristics of ActE

Table 1 compares the genomic characteristics of ActE with well-known soil-isolated Streptomyces that produce antibiotics and with two model cellulolytic bacteria, Clostridium thermocellum and Cellvibrio japonicus3,22,23. Putative biomass-degrading protein-coding sequences from ActE were identified by BLAST analysis of the finished genome against the Carbohydrate Active enZyme (CAZy) database. Among the 6357 predicted protein-coding genes, 167 have one or more domains assigned to CAZy families, including 119 GHs, 29 CEs, 6 PLs and 85 CBMs. ActE contains 45 different types of GH families, 4 PL families, 7 CE families and 21 CBM families. The number of total CAZy domains and diversity of CAZy families are comparable to other highly cellulolytic organisms.

Table 1 Comparison of genomic composition

Nearly all publicly available Streptomyces genomes encode a relatively high percentage of genes for putative cellulolytic enzymes. Interestingly, ActE and the antibiotic-producing Streptomyces, S. griseus and S. coelicolor, shown in Table 1 have similar numbers and compositions of CAZy families, but substantially different genome sizes. However, these antibiotic-producing Streptomyces are not highly cellulolytic (Fig. 1). Relative to S. griseus and S. coelicolor, the ActE genome contains two unique CAZy families but does not possess 16 CAZy families present in these species. However, ActE contains more representatives in 13 CAZy families. Enrichment of certain CAZy families is observed in other highly cellulolytic organisms. For example, C. thermocellum contains 16 genes in the GH9 family alone. It is interesting to consider whether the reduction in total genome size and differences in CAZy composition between ActE and other closely related soil-dwelling Streptomyces might have arisen from evolutionary specialization of ActE, perhaps driven by association with the Sirex-fungal symbiosis.

Figure 1 Growth of ActE in minimal medium containing filter paper as the sole carbon source. (A) Growth of ActE, Streptomyces coelicolor and Streptomyces griseus in minimal medium for 7 days at 30°C and pH 6.9. The expanded image shows small colonies of S. coelicolor and S. griseus forming on the surface of the paper. (B) Growth of ActE and Trichoderma reesei Rut-C30 for 7 days at 30°C and pH 6.0.

Secreted proteins from ActE during growth on pure polysaccharides

ActE grew well in minimal medium containing cellulose as the sole carbon source. The growth rate was similar to that of Trichoderma reesei Rut-C30, a model cellulolytic microbe (Fig. 1) and both completely deconstructed filter paper strips in 5–7 days. By comparison, S. coelicolor and S. griseus grew only sparingly as small colonies on filter paper under the same conditions. This difference in growth capabilities among closely related Streptomyces prompted us to further examine ActE, which also grew with a filamentous morphology on polysaccharides including plant biomass (Supplementary Fig. S1).

Reactions of the ActE secretomes

The enzymatic activities of ActE secretomes were compared with a commercial secretome, Spezyme CP. This enzyme cocktail is prepared from T. reesei Rut-C30 and thus provides a useful, routinely available reference point for the capabilities of other cellulolytic organisms. HPLC analysis showed that the ActE cellulose secretome released cellobiose as the primary product during reaction with cellulose (Fig. 2A, 95% of products), which is distinct from the higher proportion of glucose produced by the T. reesei secretome. Similarly, the primary products from xylan and mannan were xylobiose and mannobiose, respectively. Upon accounting for total glucose equivalents released, the ActE secretome obtained from growth on pure cellulose had specific activity that was about half of that provided by Spezyme CP (Fig. 2A, inset). Interestingly, the ActE secretome obtained from growth on pure cellulose had higher specific activity for deconstruction of pure mannan than Spezyme CP (Fig. 2B). Additionally, the ActE secretome obtained from growth on pure xylan had higher specific activity for reaction with pure xylan than Spezyme CP. Cellulose, xylan and mannan are all abundant in pinewood, thus accounting for the necessity of each of the major catalytic activities detected.
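The glucose-equivalent accounting mentioned above can be illustrated with a minimal sketch. It assumes that product amounts are expressed in molar units and relies only on the fact that cellobiose and cellotriose contain two and three glucose units, respectively. The function name and the example amounts are hypothetical and are not measurements from this study.

```python
# Glucose units per molecule for the sugars resolved by HPLC in Fig. 2A.
GLUCOSE_UNITS = {"glucose": 1, "cellobiose": 2, "cellotriose": 3}


def glucose_equivalents(micromoles):
    """Sum released sugars as glucose equivalents (same molar unit as the input).

    `micromoles` maps a sugar name to the amount released. The values used
    below are illustrative placeholders, not data from the paper.
    """
    return sum(GLUCOSE_UNITS[sugar] * amount for sugar, amount in micromoles.items())


# Hypothetical product mix dominated by cellobiose.
example = {"glucose": 1.0, "cellobiose": 9.5, "cellotriose": 0.5}
total = glucose_equivalents(example)
```

Expressing all products on a common glucose basis is what allows a cellobiose-rich product profile to be compared with a glucose-rich one.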

Figure 2 Reactions of ActE secretomes and Spezyme CP. (A) HPLC of sugars released from cellulose (1, cellotriose; 2, cellobiose; 3, glucose) and quantification of glucose equivalent (inset). (B) Reducing sugars released from xylan and mannan by the secretomes of ActE grown on cellulose and xylan. (C) Total reducing sugar released from ionic liquid-switchgrass (IL-SG) or AFEX-switchgrass (AFEX-SG) in reactions of the ActE cellulose, AFEX-SG and IL-SG secretomes and Spezyme CP. Data represent the mean ± s.d. from three experiments; * indicates P<0.01 compared with Spezyme CP.

Anion exchange chromatography was performed to fractionate the ActE secretome obtained from cells grown on cellulose as the sole carbon source. We identified fractions that hydrolyzed pure polysaccharides by biochemical assays (Supplementary Fig. S2) and confirmed the identity of the protein or proteins contained in these fractions by mass spectrometry (Supplementary Table S1). Where multiple polypeptides were present, the identity of each was confirmed by mass spectrometry to correspond to the indicated gene locus. In several cases, these most likely arise from proteolysis of a single protein found in the secretome. Fractions containing the maximum cellulase activity were highly enriched in SACTE_0236 and SACTE_0237, reducing and non-reducing end cellobiohydrolases from the GH6 and GH48 families, respectively. SACTE_0265 and SACTE_2347 were identified as the major proteins present in fractions associated with xylan and mannan hydrolysis, respectively. A CBM33 polysaccharide monooxygenase (SACTE_3159) was also identified in the ion exchange profile. Moreover, beta-1,3 glucanase activity was identified in fractions that were enriched in SACTE_4755.

When ActE was grown on either ammonia fiber expansion-treated switchgrass (AFEX-SG)24 or ionic liquid-treated switchgrass (IL-SG), the secretomes had ~2-fold increase in specific activity relative to the cellulose secretome and were equivalent to Spezyme CP for reaction with both the AFEX- and IL-treated biomass (Fig. 2C)24. The ActE secretomes retained greater than 60% of maximal activity for the hydrolysis of AFEX- and IL-SG from 30 to 55°C and 35 to 47°C, respectively, which is comparable to recent reports on the temperature profile of secretomes from thermophilic biomass-degrading fungi25 (Supplementary Fig. S3A). The secretomes showed a pH optimum of ~7 for reaction with AFEX-SG and a pH optimum of ~8 for reaction with IL-SG. Moreover, these secretomes retained greater than 60% of maximal activity over the ranges of pH 4.5 to 8.0 and pH 7.0 to 8.0, respectively (Supplementary Fig. S3B). These optimal pH values are considerably higher than observed for Spezyme CP.

Secretome analysis on pure polysaccharides and plant biomass

To identify secreted proteins, supernatants from ActE cultures grown on glucose, cellobiose, cellulose, xylan, chitin, switchgrass, AFEX-SG and IL-SG were analyzed by LC-MS/MS (Fig. 3, Supplementary Table S2). The proteins were sorted into a descending rank according to spectral counts and sets whose spectral counts summed to 95% of the total protein in each secretome are shown. Fig. 3A summarizes the percentages of CAZy families in the detected proteins. The glucose secretome had a protein concentration of ~0.03 g/L of culture medium and among the 136 proteins identified only 3% had a CAZy annotation. Indeed, the majority (>90%) likely originated from cell lysis. In contrast, the polysaccharide secretomes had a protein concentration of ~0.3 g/L of culture medium, a ~10-fold increase from the glucose secretome. Pectate lyase (SACTE_1310), chondroitin/alginate lyase (SACTE_4638), an extracellular solute binding protein (SACTE_4343), bacterioferritin (SACTE_1546) and catalase (SACTE_4439) were observed in all polysaccharide secretomes. The first two proteins, SACTE_1310 and SACTE_4638, have signal peptides and are thus secreted as part of the response needed for growth on polysaccharides.
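The ranking and 95% cutoff described above can be sketched as follows. This is an illustration under stated assumptions rather than the pipeline used in the study: the function name `top_fraction` is hypothetical, the locus tags are taken from the text, and the spectral counts are invented for the example.

```python
def top_fraction(spectral_counts, fraction=0.95):
    """Return the top-ranked proteins whose counts reach `fraction` of the total.

    `spectral_counts` maps a locus tag to its spectral count. Proteins are
    ranked by descending count and added until the running sum reaches the
    cutoff.
    """
    total = sum(spectral_counts.values())
    ranked = sorted(spectral_counts.items(), key=lambda item: item[1], reverse=True)
    selected, running = [], 0
    for locus, count in ranked:
        if running >= fraction * total:
            break
        selected.append(locus)
        running += count
    return selected


# Invented counts for illustration only; not values from Supplementary Table S2.
counts = {
    "SACTE_0236": 500,
    "SACTE_0237": 300,
    "SACTE_3159": 160,
    "SACTE_0482": 30,
    "SACTE_2347": 10,
}
abundant = top_fraction(counts)
```

With these invented numbers the first three proteins already account for 96% of the counts, so the remaining two fall below the cutoff.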

Figure 3 Composition of ActE secretomes identified by LC-MS/MS. (A) CAZy genes account for 2.6% of the 6357 predicted protein-coding sequences in the ActE genome. (B) Identity of the most abundant proteins in the cellulose secretome, sorted according to decreasing spectral counts (accounting for 95% of total spectral counts); corresponding spectral counts from other secretomes are also shown.

Fig. 3 and Supplementary Table S2 demonstrate that 22 proteins accounted for 95% of the total spectral counts during growth on cellulose; two-thirds were from CAZy families. The five most abundant proteins, in order and representing ~85% of the total spectral counts, were reducing and non-reducing exoglucanases (SACTE_0236 and SACTE_0237), a CBM33 polysaccharide monooxygenase (SACTE_3159), an endoglucanase (SACTE_0482) and a β-mannosidase (SACTE_2347). The first four proteins constitute a non-redundant set of enzymes that likely provide the essential activities required for utilization of crystalline cellulose22. Among the 22 most abundant proteins, there were representatives from 9 different GH families, two CE families, two PL families and two additional CBM33 proteins. Collectively, these secreted proteins represent ~20% of the CAZy composition in the ActE genome.

There were substantial differences in the composition of the xylan and chitin secretomes as compared to the cellulose secretome (Fig. 3, Supplementary Table S2). In the xylan secretome, 92 proteins comprise 95% of the detected spectral counts. Twenty GHs from 18 different CAZy families were included, along with 1 CE4 and 2 PL family proteins. Thus, growth on xylan elicits secretion of representatives from half of the total CAZy families found in the ActE genome. The broad distribution of hemicellulolytic enzymes in the xylan secretome contrasts with the considerably less diverse composition of the chitin secretome, which consists of 7 representatives from GH18 (e.g., chitinase, endo-beta-N-acetylglucosaminidase), 2 from GH19 (e.g., chitinase, lysozyme) and 1 chitinolytic CBM33 (Supplementary Table S2). While chitinolytic CAZy families account for two-thirds of the proteins secreted during growth on chitin, they represent only ~6% of the diversity of CAZy families found in the genome. These results document the substantially different substrate-specific responses of ActE during growth on different polysaccharides.

The secretomes isolated from cells grown on switchgrass, AFEX-SG and IL-SG contained the highly abundant secreted proteins identified in the purified cellulose and xylan experiments and some additional proteins. These additional proteins likely reflect cellular response to the more complex composition of polysaccharides present in the biomass samples. The increased diversity of proteins present in the biomass secretome also increased the efficiency of reaction with plant biomass (Fig. 2C). In total, the biomass secretomes contained 31 different CAZy families that contributed to the total spectral counts (~70% of the CAZy families present in the ActE genome), thus representing coordinated and extensive use of CAZyme families present in the ActE genome for biomass utilization.

Gene expression analysis during growth on purified polysaccharides and plant biomass

Gene expression profiles were determined for ActE grown on purified polysaccharides and plant biomass by whole genome microarrays (Figs. 4 and 5, Supplementary Figs. S4 to S9). Genome-wide gene expression was analyzed as a functional annotation network composed of ActE genes (circles) connected to predicted functional groups (triangles; KEGG or CAZy). In Fig. 4, the network was annotated with genome-wide microarray expression data to indicate genes that were differentially expressed when ActE was grown on either AFEX-SG or glucose and further annotated to indicate normalized expression levels observed during growth on AFEX-SG. While many aspects of metabolism are modestly changed in response to these different carbon sources, the CAZy and ABC transporter categories were substantially enriched in differentially expressed genes (Fig. 4, green circles). Furthermore, pentose sugar metabolism, sulfur metabolism and some amino acid biosynthesis pathways (e.g. aromatic amino acids) were also highly induced during growth on AFEX-SG relative to other carbon sources (Supplementary Figs. S4 to S9). In contrast, ribosomal, secondary metabolite and DNA repair genes showed little change in expression across the conditions examined. Within the CAZy functional group, there was a large induction of genes that contained both a GH domain and a CBM2 domain. Among the 11 genes in the ActE genome that contain a CBM2 domain, 6 were induced greater than 4-fold during growth on AFEX-SG. Furthermore, 9 of the 11 CBM2 containing proteins were identified in the secreted proteome (Fig. 3).
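The fold-induction thresholds quoted above relate to log2-scaled microarray intensities in a simple way, sketched below. The sketch assumes that normalized intensities are already on a log2 scale, as the Fig. 4 legend indicates; the function names and the example intensities are hypothetical.

```python
def fold_change(log2_treatment, log2_reference):
    """Convert a difference of log2 intensities into a linear fold change."""
    return 2.0 ** (log2_treatment - log2_reference)


def induced(log2_treatment, log2_reference, min_fold=4.0):
    """True when induction exceeds `min_fold` (4-fold equals 2 log2 units)."""
    return fold_change(log2_treatment, log2_reference) > min_fold


# Hypothetical log2 intensities for one gene on AFEX-SG and on glucose.
example_fold = fold_change(13.5, 10.5)
example_induced = induced(13.5, 10.5)
```

A difference of three log2 units corresponds to an 8-fold change, so this hypothetical gene would pass the 4-fold criterion.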

Figure 4 Genome-wide changes in expression during growth of ActE on AFEX-treated switchgrass (AFEX-SG) versus glucose. Nodes are genes (circles) or KEGG/CAZy functional categories (yellow triangles); edges indicate that the gene belongs to the indicated functional group as defined by either KEGG or CAZy analysis. Gene node sizes reflect expression intensity determined by microarray from growth on AFEX-SG as a log2 ratio, where the genome-wide average transcriptional intensity was ~10.5 for both substrates. Node colors represent expression changes as the log2 ratio of AFEX-SG/glucose transcript intensities.

Figure 5 Expression of ActE CAZy genes on various carbon sources. (A) Hierarchical clustering of expression for 167 CAZy genes from the ActE genome during growth on the indicated substrates. (B) Identity of CAZy genes with distinct changes in expression observed in group 1 CAZy genes during growth in different carbon sources. Information for additional groups is provided in the Supplementary Information.

Given the large number of differentially expressed CAZy genes identified in the network analysis, we analyzed the expression of this group of genes in cultures grown on different carbon sources (Fig. 5). As with other cellulolytic organisms, there was strong correlation between the content of the secreted proteomes and the most highly expressed genes. Of the 167 ActE genes containing CAZy domains, 68 genes (Fig. 5, group 1) showed distinct increases in expression when grown on different polymeric substrates, 14 genes (group 2, see Supplementary Fig. S10) did not show any appreciable level of expression and 85 genes (group 3, see Supplementary Fig. S11) showed moderate changes in expression with the different substrates. A significant fraction of these genes contained translocation signals for either the Sec or twin-arginine translocation pathways and genes encoding structural polypeptides for these translocation pathways were also highly expressed. Besides correlation with secreted proteins, the transcriptomic studies also gave insight into co-regulated gene clusters that potentially encode functional units for utilization of different polysaccharides by ActE. In the following, the 130 genes with normalized expression intensities in the top 2% of all genes are described.

During growth on cellulose, four CAZy genes (SACTE_0236, SACTE_0237, SACTE_3159 and SACTE_0482) showed >15-fold increase in transcript abundance (Fig. 5) and the corresponding proteins were highly enriched in the secreted proteome. None of these four were obviously placed in a gene cluster and the two most highly expressed genes, SACTE_0236 and SACTE_0237, while adjacent on the chromosome, were transcribed in opposite directions. Nevertheless, these four most highly expressed genes and three others that showed >5-fold increase in transcript abundance (SACTE_3717, SACTE_6428, SACTE_2347, Table 2) were associated with a conserved 14 bp palindromic promoter sequence, TGGGAGCGCTCCCA (the CebR binding element). CebR proteins are LacI/GalR-like transcriptional regulators shown to provide transcriptional control of gene expression in response to the presence of cellobiose or other small oligosaccharides in S. griseus, S. reticuli and Thermobifida fusca26,27,28. Likewise, the genes (SACTE_2285 to SACTE_2289) encoding a CebR regulator (SACTE_2285), a GH1 protein (β-glucosidase), a two-protein cellobiose transporter system and an extracellular solute binding protein were associated with a CebR binding element and were also among the most highly expressed genes during growth on cellulose. These latter five genes have 75% or greater sequence identity with the cellobiose utilization operon identified in S. griseus and S. reticuli26,29. Only 15 of the genes up-regulated during growth on cellulose (12%) were annotated as hypothetical or domain of unknown function, a considerably smaller percentage than in the entire genome (27%).
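The palindromic nature of the 14 bp element can be checked directly: a DNA palindrome equals its own reverse complement. The sketch below also shows a simple exact-match scan of an upstream region. It is an illustration only; the helper names and the upstream sequence are hypothetical, and a real promoter analysis would typically tolerate mismatches rather than require exact matches.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
CEBR_ELEMENT = "TGGGAGCGCTCCCA"  # 14 bp sequence quoted in the text


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def is_palindromic(seq):
    """A DNA palindrome reads the same on both strands."""
    return seq == reverse_complement(seq)


def find_element(upstream, element=CEBR_ELEMENT):
    """Return 0-based start positions of exact matches to `element`."""
    positions = []
    start = upstream.find(element)
    while start != -1:
        positions.append(start)
        start = upstream.find(element, start + 1)
    return positions


# Hypothetical upstream region with the element embedded after six bases.
upstream = "ACGTAC" + CEBR_ELEMENT + "GGATCC"
hits = find_element(upstream)
```

Because the element is its own reverse complement, scanning one strand is sufficient to find sites on either strand.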

Table 2 Analysis of upstream DNA sequence elements in ActE genes upregulated during growth on cellulose

Several characteristics distinguished expression during growth on either xylan or chitin. First, unique sets of genes were induced, as there was only 14% and 10% overlap, respectively, when compared to cellulose. Second, ~33% of the top 2% of genes expressed during growth on either xylan or chitin were annotated as hypothetical or domain of unknown function, which greatly exceeds the unknown fraction in the cellulose secretome. During growth on xylan, two clusters of genes were up-regulated. One extended from SACTE_0357 to SACTE_0370, encoding proteins from the GH11, GH13, GH42, GH43, GH78, GH87 and CE4 families, a LacI-like transcriptional regulator, a secreted peptidase and two sets of inner membrane transporters and associated solute binding proteins. Alternatively, during growth on chitin, three CBM33 proteins were up-regulated (SACTE_0080, SACTE_2313, SACTE_6493) and two of these had an immediately adjacent gene encoding a GH18 (SACTE_6494) or GH19 (SACTE_0081) that was up-regulated.
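The overlap between sets of highly expressed genes can be expressed as a simple percentage, as in the sketch below. The text does not state how overlap was calculated, so this assumes it is the share of genes in one set that also occur in the other; the function name and the gene identifiers are placeholders.

```python
def percent_overlap(genes_a, genes_b):
    """Percentage of genes in `genes_a` that also appear in `genes_b`."""
    genes_a, genes_b = set(genes_a), set(genes_b)
    if not genes_a:
        return 0.0
    return 100.0 * len(genes_a & genes_b) / len(genes_a)


# Placeholder identifiers standing in for top-expressed gene sets.
xylan_top = {"g1", "g2", "g3", "g4"}
cellulose_top = {"g4", "g9"}
shared = percent_overlap(xylan_top, cellulose_top)
```

Other definitions, such as the Jaccard index, would give different values for the same two sets, so the definition should be stated alongside any reported overlap.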

When ActE was grown on biomass samples, 14 additional CAZy genes were uniquely up-regulated and the corresponding proteins were identified in the proteomic analysis of biomass secretomes (Figs. 3 and 4). A gene cluster extending from SACTE_5858 to SACTE_5864 was uniquely up-regulated during growth on biomass. Among these genes, SACTE_5860 and SACTE_5862 are annotated as a twin-arginine translocation pathway protein and an ABC transporter, respectively, while the rest are annotated either as hypothetical protein or as domain of unknown function.