Microbial diversity development from birth to juvenile

After quality filtering and assembly, 2,739,883 16 S rRNA gene sequences from 332 fecal samples and 3,604,489 ITS-1 sequences from 97 samples (Supplementary Table S1) were grouped into 1161 and 1006 OTUs, respectively. Across all 16 S rRNA samples, 98.3% of the total sequences were assigned into five phyla (total 10 phyla, Supplementary Figure S3). Firmicutes and Proteobacteria were the main phyla, which was consistent with other studies (e.g., [7, 9]). Bahl et al. [20] reported that freezing could result in decreasing the abundance of Bacteroidetes. Although Bacteroidetes had a low abundance with average 0.25% ( ± 1.6%) in our study, this was in line with previous studies on the giant panda gut microbiota (e.g. [4, 7,8,9]). Moreover, snap freezing of samples in our study should also preserve the integrity of the microbiota, as shown by Fouhy et al. [21]. However, we cannot exclude that the low abundance of Bacteroidetes may (at least partly) be owing to lysis during storage. The top 20 families included 94.4% of the total sequences and > 85% of the bacterial sequences affiliate to Enterobacteriaceae, Streptococcaceae, Lactobacillaceae, Clostridiaceae, Campylobacteraceae and Veillonellaceae for each group (Supplementary Figure S4). From birth (S1) to juvenile stages (S3/S4), the relative abundances of Firmicutes increased and Proteobacteria decreased significantly (P < 0.00001, T-tests; Supplementary Figure S3). The abundances of bacterial phyla were relatively constant in groups S3/4: this constancy among samples collected over > 12 months is an indication that the giant panda cub gut microbiome has reached a stable state, which suggests that the 1-year-old giant panda gut microbiome has many of the functional attributes of the adult microbiome.

For 16 S rRNA, pronounced shifts in abundant lineages around the second and 12th month seemed to follow dietary changes (Fig. 1a–c; Supplementary Figures S3–S5). During the first 2 months, bacterial communities were largely composed of Enterobacteriaceae, Streptococcaceae, Lactobacillaceae, and Campylobacteraceae (Supplementary Figure S4). Then, Enterobacteriaceae and Streptococcaceae decreased and Lactobacillaceae and Clostridiaceae increased between 3rd and 12th months (Supplementary Figure S4). In addition, Campylobacteraceae was relatively scarce in > 4 month old pandas and Veillonellaceae was relatively abundant between the 3rd and 12th months and scarce in other months (Supplementary Figure S4). After month 12, Clostridiaceae and Streptococcaceae reached highest abundances and Enterobacteriaceae and Lactobacillaceae lowest abundance (Supplementary Figure S4). The non-parametric Spearman rank correlation showed that relative abundance of Streptococcaceae and Clostridiaceae had a positive relationship with ages from birth to juvenile and Enterobacteriaceae was on the contrary (Supplementary Figure S4). At the genus level, eight taxa represented each > 1% of the total sequences across all samples. Among these genera, Escherichia/Shigella and Streptococcus were present in all samples and Clostridium was the only genus having a positive correlation with age (Supplementary Figure S4). Compared with the two other groups, S1 and S2 displayed significantly lower α-diversity (Fig. 1a and Supplementary Figures S5, P < 0.01) and Shannon diversity indices were similar to each other in every group (Supplementary Figure S5).

Fig. 1 16S rRNA gene-based comparison of gut microbiota structure of giant pandas from birth to juvenile. a Relationship between alpha diversity and increasing age. Principal coordinate analysis (PCoA) using unweighted b and weighted c UniFrac distances of 16 S rRNA data. Each point corresponds to a community colored by collection date and diet (Table S1) and the black outer ring indicates metagenomic survey samples. d Heatmap of the 51 OTU-level phylotypes identified as key variables for differentiation between S1 and S3/S4 gut microbiota structure of the giant pandas. The OTUs are arranged according to their Spearman correlation coefficient (SCC) index between the relative abundance of OTUs and age of giant pandas. Each column represents one sample Full size image

Microbial community membership and structure showed significantly higher intra-individual variations between groups than that within each group. Fecal samples collected early in the time series harbored microbial communities more similar to each other than to samples collected later, and vice versa (P < 0.01; PERMANOVA with Monte Carlo). The giant pandas had a long time to shift their diet from panda milk to bamboo and this time was denoted by S2. During this time, supplementary milk and bamboo leaves/stem and/or shoots were provided, but no bamboo could be found in giant panda’s stools. The α- and β-diversity of S2 was overlapping with that of S1 or S3/4 (Fig. 1b–c, Supplementary Figure S5). In addition, the diversity of S3 (bamboo leaves/stem as diet) were not significantly different from that of S4 (bamboo shoots as diet) (Fig. 1b–c).

Some species of Clostridia were reported to be distantly related to known cellulose degraders [7, 9]. In our samples, we detected 196 OTUs affiliating to the class Clostridia (Supplementary Figure S6). Random Forest analysis identified 51 OTUs whose relative abundance reliably discriminated S1 and S3/4 samples (Fig. 1d; Supplementary Table S2). A total of 20 and 31 of these OTUs had a negative or positive correlation between their relative abundance and age, respectively, and none of these OTUs was closely associated with putative cellulolytic lineages (Fig. 1d and Supplementary Figure S6). In addition, six OTUs clustered with Cellulosilyticum lentocellum and Cellulosilyticum ruminicola who both can degrade cellulosic materials [22] and two with Clostridium cellulosi, a cellulolytic thermophile [23] (Supplementary Figure S6). Xue et al. [9] also found some OTUs closely related to the above three species. Zhu et al. [7] found some OTUs closely related to Clostridium cluster I and XIVa, which contains organisms known to digest cellulose. In this study, 10 OTUs affiliated to Clostridium cluster I and six to Clostridium cluster XIVa (Supplementary Figure S6). In addition, two OTUs were assigned into the cluster including Fibrobacter succinogenes, Ruminococcus flavefaciens, and Ruminococcus albus (Supplementary Figure S6) who are the main species involved in fiber digestion in rumen ecosystem [24]. In addition, 61 OTUs belonged to Bacteroidetes including potential cellulolytic organisms. These aforementioned OTUs showed no significant abundance difference between juveniles and cubs, which suggests that these OTUs may be not important for cellulose digestion in giant pandas.

Across all ITS samples, Ascomycota (on average: 63.35% ± 23.76%) and Basidiomycota (on average: 14.47% ± 15.27%) were the main fungal phyla and < 1% of the sequences were assigned into Blastocladiomycota, Chytridiomycota, and Zygomycota (Supplementary Figure S7). In addition, on average 22.08% (SD = 21.57%) of sequences belonged to unclassified phyla (Supplementary Figure S7). The dominant fungal classes were Eurotiomycetes, Sordariomycetes, Saccharomycetes, Tremellomycetes, and Dothideomycetes, which represented > 72% of the fungal sequences (Supplementary Figure S7). Out of the 419 fungal genera found in giant panda feces, some fungi have potential cellulolytic activity, including Aspergillus, Bjerkandera, Ganoderma, Humicola, Penicillium, Postia, Trametes, and Trichoderma, which have not been found in the GI tract of ruminants [25, 26]. Among these fungal genera with potential cellulolytic activity, only Aspergillus had moderately high abundance (on average: 2.82%) and was found in most samples (91 out of 97 samples); however, with no abundance difference between juveniles and cubs. A total of 1006 fungal OTUs were found using the ITS data and the numbers of OTUs increased with age (Supplementary Figure S8) and 253 OTUs were shared among the four groups. No OTUs belonged to the cellulolytic fungus Perenniporia, which was found in giant pandas by Tun et al. [27]. We did not find that fungal composition below phylum level had a positive correlation with age or to be enriched in one of the four groups (Supplementary Figures S9–S10), which may suggest that fungi are less important than bacteria for the gut microbiome development of giant pandas.

Metagenome development from birth to juvenile

Based on the bacterial diversity and Unifrac distances, 57 samples (Fig. 1b–c; Supplementary Table S1) were selected for shotgun metagenome sequencing. In total, 299 gigabases (Gb) of paired-end read sequence data, with an average of 5.25 Gb (ranging from 1.5 to 9.7 Gb) for each sample (Supplementary Table S3), were obtained and a 552 Mbp Pan-metagenome including 461,872 contigs (Supplementary Tables S3–S4) was constructed. Then, Metagenome Linking Group (MLG), the group of metagenome sequences probably deriving from the same microbial genome, was assembled based on the Pan-metagenome. A total of 1406 MLGs ( > 10Kbp, Supplementary Table S5, Supplementary Figure S11) were obtained of which 811 MLGs were identified as Proteobacteria and 304 MLGs as Firmicutes. Out of the 1406 MLGs, 107 MLGs were > 0.5 Mbp with scaffold length totals in every MLG bin. The results of CheckM (version 1.0.7) using the 107 MLGs showed there were 58 MLGs with completeness% > = 50%. Out of the 58 MLGs, 30 MLGs showed completeness% >80% and contamination% < 25% and 9 MLGs received a very good evaluation result with completeness% > 97% and contamination% < 3% (Supplementary Table S6). We are thus confident that the MLGs obtained are of high quality. Twenty-six MLGs revealed significant abundance differences between different age stages (T-test, P < 0.01; Fig. 2, Supplementary Table S7). None of these 26 MLGs belonged to potential cellulolytic bacteria and most Firmicutes MLGs showed higher abundances at S3/S4, whereas most Proteobacteria MLGs were on the contrary (Fig. 2, Supplementary Table S7), which largely agrees with our results from 16 S data (Supplementary Figure S3).

Fig. 2 Heatmap for metagenome linking groups (MLGs) with significant differences in different groups Full size image

From the Pan-metagenome of giant pandas, we predicted 816,364 genes with MetaGeneMark and 681,167 unique genes, which occupied 84% of the contigs. The number of genes increased from birth to juvenile (S1 to S3/4) and more genes were found in samples with bamboo shoots as diet (S4) than stem/leaves (S3) (Supplementary Figure S12). In total, 50.59% of the genes were classified into function database groups, 7.40% to carbohydrate-active enzymes (CAZy) level_2, 38.66% to eggNOG clusters, 25.60% to KEGG orthology, and 19.30% genes assigned to KEGG pathways, respectively. No significant difference of relative abundance among S1, S2, S3, and S4 were found using the functional annotation with the KEGG, CAZy, and the eggNOG databases at broad categories (Supplementary Figure S12). The annotation results of individual genes from the three databases at higher level categories were used to search for abundance differences of gene function between different groups with principal component analysis (PCA). PCA showed that only CAZy was useful for differentiation of group S1 and S3/4 (Supplementary Figure S13) and carbohydrate metabolism related genes in CAZy showed samples from group S1 clustering closer with only few outliers and same as group S3/4, suggesting that an important function of microbiota is to adapt to the diet transition during the development of giant panda from birth to juvenile. This was consistent with [14] who suggested that profiles of CAZy families were strong predictors of diet in mammals.

The Random Forest analysis of all 681,167 unique genes from the 57 samples showed that 1000 genes were sufficient to explain gene composition and functions among different age stages (Fig. 3; Supplementary Figure S14; Supplementary Table S8 and S9). The annotation of the 1000 genes with the CAZy database showed that 467 of the genes were assigned into carbohydrate esterases and glycoside hydrolase genes families involved in polysaccharide degradation [28]. Most (277) of the 467 genes were enriched in groups S3/4 and may be involved in the degradation of polysaccharide of bamboos and/or bamboo shoots, which is consistent with the results of PCA, suggesting that an important function of microbiota is to adapt to the diet transition (Supplementary Figure S13; Supplementary Table S8). Among the 13 GHs including cellulase genes (EC 3.2.1.4 or 3.2.1.91), only two genes were enriched in group S3/4 and belonged to GH12; though with low amino acid identity and low average abundance (Supplementary Table S8; Fig. 3). However, the KEGG annotation of the 1000 genes showed that no cellulase genes enriched in group S3/4 (Supplementary Figure S15; Supplementary Table S8).

Fig. 3 Heatmap of the relative abundance of the 1000 genes obtained from Random Forest (Figure S14). SCC index between the relative gene abundance and age of giant pandas are shown for these genes (More details are shown in Supplementary Table S8) Full size image

Function enrichment analysis of 681,167 unique genes identified 22 ECs (Enzyme Commission) and 81 KOs (KEGG orthologous groups) whose proportional representation in fecal microbiomes differed significantly with age following KEGG database (Supplementary Table S10). The genes enriched in group S3/4 included some genes involved in phenylalanine metabolism, fatty acid biosynthesis, purine metabolism, glutathione metabolism, antibiotic resistance (e.g., transporters), and streptomycin biosynthesis. However, these did not include putative cellulase genes (EC 3.2.1.4 or 3.2.1.91) (Supplementary Table S10). Putative cyanate lyases (EC 4.2.1.104) were also enriched in S3/4 (Supplementary Table S10). Cyanate is toxic to all animals and widely distributed in plants, including bamboo shoots. The high abundance of cyanate lyases in group S3/4, together with the recently reported host rhodanese activity in giant pandas [29], might help giant pandas to prevent cyanide intoxication from bamboo. To our knowledge, this is the first report about cyanate lyases enriched in gut microbiota. In addition, no significant relative abundance difference of genes existed between group S3 and S4. The group S1 samples had an enrichment of carbohydrate-metabolizing genes involved in simple sugar (such as, lactose/galactose, and glucose) uptake and utilization, pyruvate and glycerolipid metabolism. These data likely reflected high quantities of breast milk in diets of giant pandas in group S1/2, which was consistent with the CAZy annotation, which differentiated S1 from the other categories (Supplementary Figure S13). Except these, some genes involved in secretion systems, antibiotic resistance (e.g., transporters), transcription factors, peptidases, and the biosynthesis of phenylalanine, tyrosine and tryptophan were also enriched in group S1 (Supplementary Table S10). These results were similar with the annotation of the 1000 genes of the Random Forest analysis (Supplementary Table S8).

The survey of gene families found three CAZyme classifications according to CAZy and 21 non-supervised orthologous groups (NOGs) according to eggNOG with significant abundance difference following age (Supplementary Table S10). CBM26, a gene family with starch-binding domains, was overrepresented in group S3/4, suggesting that giant pandas have higher capabilities to take-up and utilize starch following the development from birth to juvenile. By contrast, GH108 (EC 3.2.1.17) with putative antimicrobial activity and GT48 (EC 2.4.1.34) involving the cell wall glucan biosynthesis had higher abundance in 2-month-old giant pandas (group S1) than in 1-year-old pandas (group S3/4). Thirteen NOGs, which represented—among others—purine biosynthesis, deacetylase, transport systems, DNA ligase, transposase, heptaprenyl diphosphate synthase, were detected in significantly higher proportions in giant pandas > 1-year old compared with 2 months old pandas. In contrast, eight NOG protein families, involved in outer membrane porins, heat shock proteins, and transcriptional regulators, were depleted in group S3/4.

We also analyzed all the genes involved in starch and sucrose metabolism pathways including the genes degrading cellulose in giant pandas with the KEGG database. A total of 56 KOs were involved in the starch and sucrose metabolism pathway, 35 of the 56 KOs were detected in giant pandas. A total of 29 ECs had at least one KO with significant difference between one group and another group (P < 0.05, T-test) (Supplementary Table S11; Supplementary Figure S16). Three ECs (3.2.1.21; 2.7.1.2; 2.4.1.21) had a positive age-related change (SSC > 0.49) and were enriched in group S3/4. Nine KOs had higher abundance in group S1 than group S2 or S3 and belonged to the putative cellulase gene (EC 3.2.1.4). A putative exoglucanase (EC 3.2.1.91) was annotated only in few samples with very low relative abundance (Supplementary Figure S16). KEGG annotation of all genes in this study showed that no genes related to phenol oxidases (laccases) and peroxidases (lignin peroxidases and manganese peroxidases), the key enzymes in lignin degradation pathways [30], were present in the metagenome.

The carbohydrate-active enzymes genes among mammals

The samples of groups S3/4 were merged with metagenome data from other mammals [11, 14] to compare their putative CAZymes profiles. A total of 24,685 putative GH-encoding genes representing 108 different GH families were annotated. Among the 108 families, 36 were significantly more abundant in giant panda metagenomes than other mammals and 22 were less abundant (Supplementary Table S12; P-value < 0.05, T-test). These enriched families in giant panda metagenomes included α-amylase families (including GH13 and CBM48) that are able to bind and degrade starch [31], gene families with hemicellulose degrading activity including four families (GH1, 4, 8, and 31), and one gene family involved in both cellulose and hemicellulose degradation pathways (GH8, [32]. Other GHs involved in cellulose degradation pathways were significantly more abundant in herbivores than in carnivore and giant panda metagenomes (Supplementary Table S12; P-value < 0.05, Wilcoxon rank-sum test). Considering the total abundance of GH families related to cellulase genes suggested that carnivores had the lowest and herbivores the highest abundance and giant pandas had a slightly higher abundance than carnivores (Fig. 4). In addition, GH90 was only found in the giant panda (Supplementary Table S12), which is involved in endorhamnosidase enzymatic activity to hydrolyze the O-antigen [33].

Fig. 4 Comparisons of the total abundance of putative cellulase gene families of gut microbiomes of giant pandas and other mammals by Wilcoxon signed-rank test. The lines and squares inside boxes represent the median and mean, respectively. ** P < 0.01, *** P < 0.001 Full size image

UPGMA clustering of CAZymes abundance profiles (Supplementary Figure S17) showed that the giant panda microbiota grouped separately from herbivore and carnivore microbiota, but clustered most closely with the Ursidae family (Supplementary Figure S17–18), which reflected that the giant panda and Ursidae microbiomes might have similar polysaccharide utilization potential.

Amylase gene copy number

The amylase sequences showed that some heterozygosity sites exist in giant pandas (Supplementary Figure S19 and Supplementary Table S13), which showed that at least two amylase gene copies are present in giant pandas. This was verified by qPCR that proved the median number of pancreatic amylase gene copies was two (mean = 1.89) in Ursidae (including giant pandas, black bears, and brown bears) and one (mean = 1.12) in Felidae (including tigers, lions, and leopards) (Supplementary Table S14). The complete genome analysis also showed one copy of amylase in strict carnivores (Such as, cheetah, leopard, seal, and others) and more than one amylase copy in Ursidae or omnivores (Such as polar bear, giant panda, and dog) (Supplementary Table S15).