High abundance of propionibacteria in the follicular microbiota

To identify the disease- and health-associated microbial elements in the follicular microbiota, we performed ultra-deep metagenomic shotgun sequencing of the samples collected from 38 acne patients and 30 age-matched healthy individuals (Fig. S1). After removal of human DNA sequences and low-quality reads, we obtained an average of 1.08 gigabase pairs (Gbp) per sample (6.9 × 107 bp – 4.8 × 109 bp), sufficient to cover the microbial diversity of skin samples24. We mapped the cleaned sequencing reads to our reference genome set, which consists of 1,252 bacterial and 272 fungal genomes from the HMP reference genome database and several additional genomes of skin microorganisms: Propionibacterium avidum25,26, Propionibacterium granulosum26, Propionibacterium humerusii27, and P. acnes bacteriophage28,29. To cover different P. acnes strains, we also included the P. acnes pan-genome in the reference19. Consistent with other skin sites23, bacteria were the major organisms found in the follicle (Fig. 1A) with few fungal organisms detected at low relative abundances (Fig. 1B,C). Five main bacterial phyla were found in the samples, including Actinobacteria (95.6%), Firmicutes (2.3%), Proteobacteria (1.2%), Cyanobacteria (0.6%), and Bacteroidetes (0.2%). The dominance of Actinobacteria in the follicle is consistent with previous taxonomic analyses of sebaceous skin sites5,23,30.

Figure 1 Bacteria dominate the skin follicular microbiome. (A) A box-and-whiskers plot comparing the relative abundances of bacteria, fungi, and P. acnes phage in the follicular microbiota of acne patients (n = 38), age-matched healthy individuals (n = 30), healthy individuals over 55 (H55 + ; n = 4), and all healthy individuals combined (n = 34). (B) A few fungal organisms were found in the follicle. Sequencing reads pooled from all subjects mapped to six fungal species, with less than 1X coverage for any species. (C) The relative abundance of P. acnes phage in all the samples suggests an increased prevalence and abundance of P. acnes phage in healthy individuals and a trend of increased phage abundance with age. (D) The relative abundances of bacterial species in the follicle. Each column represents the relative abundances of the bacterial species found in each individual. P. acnes was the dominant skin bacterium in all but one individual. On average P. acnes accounted for 91% of the bacterial taxa identified. An increase in the average relative abundances of P. acnes and P. granulosum was observed in the healthy individuals, whereas an increase in the average relative abundances of minor taxa was observed in the acne group. Five major skin bacterial species (P. acnes, P. humerusii, P. avidum, P. granulosum, and S. epidermidis) are shown separately from the phyla that they belong to. Full size image

At the species level, P. acnes was the most prevalent and abundant. It was found in all 68 individuals with an average relative abundance of 91.0% (Fig. 1D; Table 1). The healthy individuals had a slightly higher relative abundance of P. acnes than the acne patients (93.8% vs. 88.5%). We also compared the differences in P. acnes strain populations between acne patients and healthy individuals. The relative abundances of the dominant P. acnes strains were assigned based on the read coverages of the 16S rRNA SNPs that distinguish the ten most common (top ten) P. acnes ribotypes (RTs)5 (Fig. S2). We found that acne patients had a higher diversity of P. acnes population (Shannon index = 1.07) with more RTs (2.55 RTs per individual) than healthy individuals (Shannon index = 0.79, P = 0.039; 1.97 RTs per individual, P = 0.007). Consistent with our previous study5, RT4, RT5, and RT8 were found to be more abundant and more prevalent in acne patients, with RT4 being statistically significantly different in relative abundance (P = 0.004), while RT2 and RT6 were more abundant in healthy individuals (Fig. S2). The strain association with acne or healthy skin is consistent with previous multi-locus sequence typing and 16S ribotype analysis of large collections of clinical P. acnes isolates5,31,32.

Table 1 Bacterial organisms identified in the follicular microbiome of the three groups. Full size table

Other cutaneous Propionibacterium species, including P. humerusii (2.7%), P. granulosum (1.6%), and P. avidum (0.4%), were also detected (Table 1). Among them, P. granulosum, a resident bacterium of sebaceous skin sites33, was significantly more abundant in the healthy individuals than in the acne patients (P = 0.002).

Other bacterial species frequently identified in the follicle microbiota include Staphylococcus epidermidis (0.9%), Staphylococcus capitis (0.4%), Escherichia coli (0.7%), and Clostridium sp. (0.5%). Minor bacterial taxa were found more prevalent and abundant in acne patients than in healthy individuals with varying presence and abundance (Supplementary Text). On average, acne patients had a higher diversity at the species level than healthy individuals, but not statistically significantly different (Shannon index = 0.63 and 0.44, respectively; P = 0.072). Reduced P. acnes and P. granulosum and increased minor taxa observed in acne patients suggest disruptions of the commensal skin flora in acne development.

The skin microbiome of older healthy individuals is similar to younger healthy individuals

Acne rarely occurs in individuals over the age of 5010,11. The skin microbiome profiles of older individuals with clear skin can be used as independent references for healthy, acne-free skin. However, the skin metagenome of older healthy individuals has not been characterized. We performed metagenomic shotgun sequencing of the samples collected from four healthy individuals of age 55–79 (Fig. S1). The bacterial compositions of the follicular microbiome of these individuals were similar to those of the younger healthy individuals (Fig. 1D). It was previously noted that the production of sebum, a nutrient source for lipophilic bacteria, including Propionibacterium species, decreases with age34. In our analysis we found that the relative abundances of both P. acnes and P. granulosum were higher in older healthy individuals than in acne patients (Table 1), similar to the observations seen in younger healthy individuals. When grouping all healthy individuals together, P. granulosum remained significantly more abundant in healthy skin than acne (P = 0.001). At the strain level, P. acnes populations in older healthy individuals were similar to those of the younger healthy cohort, consisting of mostly RT1, RT2, and RT3 strains, with little or no RT4, RT5, and RT8 strains detected (Fig. S2).

When comparing all 34 healthy individuals, including the four subjects over the age of 55, with the 38 acne patients, we found an increased prevalence and abundance of P. acnes phage in the healthy group (P = 0.05) (Fig. 1A). We also observed a trend between age and the relative abundance of P. acnes phage. Higher relative abundances of phage were found more often in individuals with increased age (Fig. 1C).

Metagenomic composition of the skin microbiome is different between acne patients and healthy individuals

To better understand the role of the follicular microbiota in acne pathogenesis at the molecular level, we investigated the differences in metagenomic composition of the microbiome between acne patients and healthy individuals. Among the sequencing reads mapped to P. acnes, we identified 2,707 operational gene units (OGUs) in the samples, among which 1,943 OGUs (72%) were present in every sample. Rarefaction analysis indicated that the sequencing depth was sufficient to quantitatively compare the P. acnes functional profiles between acne patients and healthy individuals (Fig. S3). Since the microbiome profiles were similar between younger and older healthy individuals at both the species and P. acnes strain level, we combined the healthy individuals from both age groups in the following metagenomic analyses.

To determine whether specific metagenomic elements are associated with acne or healthy skin, we compared the relative abundances of P. acnes OGUs between acne patients and healthy individuals (Fig. S4). As the skin microbiome varies greatly among individuals23,30,35,36, to reduce the potential bias due to specific individuals recruited in our study, we randomly selected 100 subsets of the data, each containing two-thirds of the individuals from each clinical group (26 acne patients and 22 healthy individuals), and determined the metagenomic elements that were differentially abundant between the two groups in each random set. To capture the differences at the species level between the two cohorts, in addition to P. acnes OGUs, we also considered the species found in the samples as additional metagenomic elements and included them in the analysis. To determine the most robust set of metagenomic elements that are associated with either acne or healthy skin, we identified 63 metagenomic elements, whose relative abundances were significantly different (P < 0.05) between the two groups in at least 50 of the 100 random subsets. These 63 elements consisted of 62 P. acnes OGUs (Table S1) and P. granulosum as species.

Among the 62 P. acnes OGUs, 25 were significantly more abundant in acne patients. Two of them are involved in thiopeptide bacteriocin precursor synthesis and transport (PAGK2104 and PAGK2105, as annotated in P. acnes HL096PA1 genome). Thiopeptide bacteriocins belong to a family of microcins, which are produced by Gram-positive bacteria and inhibit the growth of other Gram-positive species by blocking protein translation37. Additionally, 19 of the 25 acne-associated OGUs were from locus 2 (PAGK0160-PAGK0178), a genomic island previously identified mainly in RT4 and RT5 strains and highly associated with acne5,20. Locus 2 spans over 20 Kb and encodes 23 ORFs (Fig. S5). The differentially abundant locus 2 OGUs identified include a number of genetic elements involved in recombination, such as a single-strand binding like protein (Ssb), shown in other bacterial species to be involved in chromosomal transformation38, and resolvase-like protein, previously implicated in genetic integration. Other OGUs include a cluster of Streptolysin S-associated genes (sag) involved in the biosynthesis and transport of a bacterial toxin. Identified and characterized in Streptococcus pyogenes, sag genes are involved in synthesis, post-translational modification, and transport of a ribosomally synthesized bacteriocin, which has been linked to antimicrobial activity as well as invasive infections39,40. The presence of genes homologous to sagB, C, and D in locus 2 suggests involvement of this locus in bacteriocin biosynthesis and maturation. Additionally self-immunity and transport genes are found immediately downstream. Other OGUs in this locus have putative roles in cell viability, virulence, and immunity including ATP-binding cassette (ABC) transporter and ABC-binding protein for translocation of lipids, nutrients and/or toxins, CAAX amino protease, which is thought to be involved in self-immunity41, and partitioning machinery needed for cell replication and division. While the specific functions of genes in this locus in acne are unclear, the absence of locus 2 in healthy individuals and a high abundance in acne patients provides strong evidence for its association with acne.

In addition to locus 2, in our previous studies, two other genomic loci (locus 1 and locus 3) were found enriched in acne-associated RT4 and RT5 strains5,19,20. To examine whether these three loci were overrepresented in the metagenome of acne patients, we plotted the relative abundance (Fig. 2A), the fold change between acne patients and healthy individuals (Fig. 2B), and the prevalence ratio (Fig. 2C), of each OGU encoded in these three loci across all individuals. We found a higher average relative abundance and prevalence of all three loci in acne patients compared to healthy individuals (Fig. 2). Locus 2 was rarely found in the microbiome of the healthy individuals, and was significantly associated with acne (P = 0.002) (Fig. 2). Since most RT4 and RT5 strains harbor locus 2 and are resistant to antibiotics5,19, we examined the treatment history of each individual to determine whether the presence of locus 2 was due to acne treatment. While locus 2 was found in some treated acne patients (n = 8), it was absent in other treated patients (n = 8) and was present in many untreated patients (n = 10). The presence of locus 2 was not significantly different between treated and untreated patients (Fisher’s exact test, P = 1), suggesting that locus 2 is a characteristic of the disease rather than treatment (Fig. 2A). Additional analysis to further support the association of locus 2 with the disease is described in the supporting text (Fig. S6; Supplementary Text).

Figure 2 Differences in the relative abundances of P. acnes OGUs between acne patients and healthy individuals. (A) A heat map showing the relative abundances of the OGUs in P. acnes loci 1, 2, and 3 in acne patients and healthy individuals. Each column represents an OGU, ordered based on the genomic location of the OGUs. OGUs 101–200 in the pan-genome were plotted to show locus 1, flanking OGUs, and locus 2. The 74 OGUs from locus 3, which is a plasmid, are also shown. Each row represents an individual. Acne patients (n = 38) and healthy individuals (n = 34, including those with age over 55) were compared. Individuals within each group were clustered based on the average relative abundance of locus 2 OGUs. Ribotype composition and past and current acne treatments are indicated on the right. Multiple treatments are depicted by more than one color. (B) Fold changes in relative abundance of the OGUs in loci 1, 2, and 3 between acne patients and healthy individuals. Acne-associated OGUs had a fold change >1, while health-associated OGUs had a fold change <1. (C) Prevalence ratio of the OGUs in loci 1, 2, and 3 between acne patients and healthy individuals. The presence of an OGU in a sample is defined as an OGU with at least 1X coverage after normalization. Full size image

Of the 62 differentially abundant P. acnes OGUs, 37 OGUs were more abundant in healthy individuals. In contrast to the enrichment of virulence-related genes observed in the acne metagenome, genes involved in microbial metabolism and nutrient biosynthesis were significantly more abundant in the healthy metagenome (Fig. S4). Examples include glycosyl transferase (PAGK0136), D-alanine–D-alanine ligase (PAGK0821), and cobalamin-independent methionine synthase (PAGK1035), which are involved in polysaccharide, cell wall, and amino acid biosynthesis, respectively. An enrichment of these genes in the healthy metagenome is consistent with the metagenomic study by Mathieu et al.22, which suggested a functional role of the resident bacteria in healthy skin in the exploitation of compounds from the human skin including sugars, lipids, and iron22. Other significantly more abundant OGUs in the healthy metagenome include glycerol uptake facilitator protein (PAGK2214), which aids in carbohydrate metabolism processes by transporting glycerol across the cytoplasmic membrane. Recently it was shown that fermentation of glycerol by P. acnes contributes to skin and host health via the production of short chain fatty acids (SCFA)42. SCFA including acetic acid, lactic acid, and propionic acid were shown to significantly decrease the colonization of skin-associated pathogens such as methicillin-resistant Staphylococcus aureus42.

The balance between acne- and health-associated metagenomic elements shapes the host skin microbiota in acne and health

Our observations of differentially abundant species, strains, and metagenomic elements between the skin microbiome of acne patients and healthy individuals led us to hypothesize that the balance between acne- and health-associated metagenomic elements determines the virulence and health properties of the microbiota in skin disease and health. We define the balance as the ratio between the relative abundances of acne- and health-associated metagenomic elements. When the relative abundances of the 62 P. acnes OGUs and P. granulosum from all individuals were compared, we found that this ratio in acne patients was significantly higher than in healthy individuals (P = 2.9 × 10−5). Figure 3 illustrates the relative abundances of these 63 metagenomic elements across all 72 individuals. The metagenomic profiles of healthy individuals were readily distinguishable from acne patients harboring locus 2. Acne patients with locus 2 had significantly higher relative abundances of acne-associated OGUs (in 24 of 25 OGUs P = 0.00024–0.021, and in 1 of 25 OGUs P = 0.083) (Fig. 3A). However, these subjects only represented about half (47%) of the patients recruited in our study. When we examined the metagenomic profiles of the other acne patients without locus 2, we found that they were distinct from healthy individuals with decreased relative abundances of health-associated P. acnes OGUs (in 23 of 37 OGUs P = 0.0024–0.046, and in 14 of 37 OGUs P = 0.058–0.30). We were able to reveal distinct microbiome profiles between acne patients and healthy individuals that were not readily apparent at the taxonomic level, with differences in the relative abundances of the metagenomic elements that were associated with either acne or healthy skin (Fig. 3).

Figure 3 The relative abundances of acne- and health-associated metagenomic elements in acne and healthy individuals. (A) The relative abundances of 62 P. acnes OGUs, including 25 acne- and 37 health-associated OGUs, and three organisms associated with healthy skin, P. acnes, P. granulosum, and P. acnes phage, were plotted for each individual to illustrate the importance of a balance between these metagenomic elements in health and acne. Each column represents an individual, and each row represents an OGU or an organism. The top ten ribotype composition and acne severity score (acne patients only) of each individual are also shown. (B) The prediction score of each individual based on the relative abundances of the 45 metagenomic elements is shown, where red indicates acne and green indicates healthy skin. The classification of the clinical states had an overall accuracy of 85%. Full size image

In addition to P. acnes OGUs, we found that the relative abundances of P. acnes and P. granulosum were also important factors influencing the clinical state of the skin. As an example, an acne patient in our cohort (labeled with † on Fig. 3) displayed a profile of the 62 P. acnes OGUs that was more similar to the healthy individuals than all other acne patients (P = 5.29 × 10−5). However, the observed low relative abundances of both P. acnes (78% compared to the average of 91%) and P. granulosum (0.070% compared to the average of 1.6%) may contribute to the acne status of this patient. This suggests that a healthy follicular microbiota not only requires health-associated P. acnes strains and genetic elements, but also requires significant dominance of resident propionibacteria. Taken together, our findings suggest that the balance between skin metagenomic elements determines the virulence and health properties of the skin microbiota and is important in skin health and disease.

We further investigated whether the relative abundance profiles of the metagenomic elements are sufficiently robust to correctly classify the clinical state of skin samples. We performed a supervised class prediction analysis based on a modified weighted gene voting algorithm43,44 (Supplementary Text). We generated 1,000 permutations. In each permutation, samples were randomly assigned to either acne or healthy state, with 38 samples in the acne group and 34 samples in the healthy group. We identified differentially abundant metagenomic elements between the two groups among all but one sample, and predicted the clinical state of the withheld sample based on the abundance profiles of the metagenomic element, known as the leave-one-out cross-validation (LOOCV). With the use of the clinically defined acne-healthy grouping, the algorithm predicted the clinical states of 49 out of 72 samples (68%) with a prediction strength threshold of 0.25. Thirty-four of the 49 samples were assigned correctly (69% accuracy). The number of correctly assigned samples based on the clinical grouping was higher than most of the permutated groupings (P = 0.064). The clinical grouping also had a significantly higher prediction accuracy (69%) than all but one of the permutated groupings (P = 0.001) (Fig. 4A).

Figure 4 Class prediction accuracy using leave-one-out cross-validation and weighted gene-voting. (A) For the training sample set (n = 72), using the clinically defined acne and healthy individual grouping, the classifier correctly assigned the clinical states of 34 of the 49 assigned samples (69% accuracy) using a prediction strength threshold of 0.25. This result is statistically significant (P = 0.001), because only one of the 1,000 permutated groupings had a higher accuracy. However, that particular grouping (accuracy of 72%) had fewer samples assigned than the clinical grouping (n = 39 vs 49). This demonstrates that the differences in the relative abundances of the metagenomic elements between acne patients and healthy individuals can be used to predict the clinical states of the skin. When we used the refined set of 45 metagenomic elements, including 43 P. acnes OGUs, P. acnes locus 2, and P. granulosum, we further improved the prediction accuracy of the training sample set to 85%. (B) Consistent with the prediction accuracy on the training sample set, for the independent sample set (n = 10), the 45 metagenomic elements were able to assign 70% of the samples with 86% accuracy. Full size image

We further improved our classifier by using the 63 metagenomic elements, which are the most robust set based on the 100 random samplings as described earlier. Since the 19 OGUs from locus 2 had similar abundance profiles across all samples, to avoid overweighing locus 2 in the classification, we combined these 19 OGUs and used their average relative abundance in the classifier. The refined classifier therefore consisted of the abundance profiles of 45 metagenomic elements: P. acnes OGUs/loci, and P. granulosum. Based on this refined set, we were able to assign the clinical states of 31 of the 38 acne patients (82%, prediction strength threshold of 0.25). Twenty-four acne patients were correctly assigned (accuracy of 77%). Among the healthy individuals, we were able to assign 21 of the 34 subjects (62%) with 20 correctly assigned (accuracy of 95%). Overall, the classifier was able to assign 72% of the subjects with an accuracy of 85% (Fig. 3B; Fig. 4B).

Furthermore, to validate that these metagenomic elements can be used as markers for clinical classification, we collected an “independent sample set” from ten additional subjects, including 4 acne patients and 6 healthy individuals, one of which was over 50 years old. We used the refined 45 metagenomic elements to predict the clinical state of each independent sample (Fig. 4B). Our classifier was able to assign the clinical states of 7 samples (70%) with an accuracy of 86%, highly consistent with the classification results from the training sample set of 72 samples. Based on this, we conclude that the metagenomic elements identified in our study can be used to classify the clinical state of the skin with high accuracy.