Subjects and design

This study is based on samples collected from subjects participating in a randomized, double-blind placebo-controlled clinical trial (clinicaltrials.gov NCT00167700) assessing the effects of probiotic modulation of maternal microbial contact in late pregnancy reported in detail elsewhere30. Pregnant women scheduled to undergo elective caesarean section after 37 weeks of gestation were recruited to obtain placenta samples without risk of contamination taking place during vaginal delivery. Mothers with conditions, which might affect placental and foetal physiology (e.g. pre-eclampsia, intrauterine growth retardation, foetal anomalies, onset of labour, asphyxia) or contaminate the placenta (rupture of membranes, vaginal delivery, infection) were excluded from the study. Amniotic fluid, placenta, meconium, colostrum, infant faeces and maternal faeces samples were available from 15 mother-infant pairs and included in this study. One mother received antenatal antibiotic therapy with clindamycin for a bacterial infection of the skin whilst the remaining 14 mothers received no antibiotics prior to or during the caesarean section. None of the neonates were administered antibiotics. The detailed clinical characteristics of these mother-infant pairs are presented in Table 1. The study was approved by the Ethics committee of the Intermunicipal Hospital District of Southwest Finland and conducted in accordance with the Declaration of Helsinki as well as national legislation and institutional guidelines concerning clinical research. Informed consent was obtained from all subjects.

Sample collection

Amniotic fluid and placenta samples were obtained during sterile caesarean section and stored at −80 °C. Meconium and infant faeces samples were collected from diapers after they had been passed. Meconium, infant faecal samples and maternal faecal samples were collected fresh in sterile plastic recipients, refrigerated and processed without further delay. Colostrum samples were collected in the maternity hospital using milk produced within 24 hours after delivery. The mothers were given written instructions for standardised collection of samples in the mornings and the samples were frozen and stored at −20 °C for later analysis. Before sample collection, the breast was cleaned with an iodine swab to reduce contamination from skin bacteria, and breast milk was collected manually discarding the first drops with a sterile milk collection unit. Total DNA for DGGE and 16S rRNA gene sequencing analyses was isolated from maternal faeces, placenta, amniotic fluid, colostrum, meconium and infant faeces samples using the QIAamp DNA Stool Mini Kit (QIAgen, Hilden, Germany) following the manufacturer’s instructions.

Bacterial culture from placenta and amniotic fluid samples and identification of bacterial isolates

Placenta samples (approximately 10 mg) were kept under anaerobic conditions (AnaeroGen; Oxoid, Hampshire, United Kingdom), and analysed in less than 2 hours to avoid alterations in bacterial viability. Placenta tissue specimens were homogenized in 200 μl of a phosphate-buffered saline (PBS) solution (130 mM sodium chloride, 10 mM sodium phosphate, 0.05% cysteine, pH 7.2) by pipetting and thorough agitation in a vortex mixer (10 s). Each homogenised sample was randomly plated on different culture media (100 μl). The following media were used: Gifu anaerobic medium (GAM agar; Nissui Pharmaceutical, Japan); LB medium (MP Biomedicals, LLC, France). Plates were incubated under anaerobic conditions (Concept 400 anaerobic chamber, Ruskinn Technology, Leeds, United Kingdom) at 37 °C for 48–72 h.

All the viable and cultivable bacteria recovered from placenta and amniotic fluid samples were isolated and re-streaked onto the same agar media. For preliminary identification of the isolates, conventional microbiological methods were used, including analysis of colony and cellular morphology and Gram staining. All isolates were stored at −80 °C in the presence of glycerol (20%, vol/vol) until use for further characterisation.

Bacterial isolates were grown in the same isolation broth media and harvested at the late log growth phase. The bacterial suspensions were centrifuged for 5 min at 6,000 × g and the pellets were used for DNA extraction using the QIAamp DNA Stool Mini Kit (QIAgen, Hilden, Germany) following the manufacturer’s instructions. DNA samples were stored at −20 °C.

The bacterial DNA of each isolate was partially amplified with 16S rRNA gene target primers 968f (5′-AACGCGAAGAACCTTA-3′) and 1401r (5′-CGGTGTGTACAAGACCC-3′). Amplification reactions were carried out in a 50-μl volume containing 10 mM Tris-HCl (pH 8.3), 2.5 mM MgCl2, 1 μM each primer, 200 μM deoxynucleoside triphosphates, and 2.5 U of Taq polymerase (Ecotaq; Ecogen, Spain). The amplification program was 1 cycle at 94 °C for 5 min; 30 cycles at 94 °C for 1 min, 50 °C for 1 min, and 72 °C for 2 min; and finally, 1 cycle at 72 °C for 7 min. The amplification products were subjected to gel electrophoresis in 1% agarose gels, purified using Gel Band DNA purification kit (GE Healthcare, Buckinghamshire, United Kingdom), and sequenced in an ABI Prism-3130XL genetic analyser (Applied Biosystems, CA). Search analyses to determine the closest relatives of the partial 16S rRNA gene sequences retrieved were conducted in GenBank using the Basic Local Alignment Search Tool (BLAST) algorithm, and sequences with more than 97% similarity were considered to be of the same species.

Qualitative microbial analysis using denaturing gradient gel electrophoresis (DGGE)

The composition of the microbiota in the different sample types was analysed by PCR-DGGE. The following strains acquired from the DSMZ culture collection (Braunschweig, Germany) were used as a reference: Bacteroides fragilis DSM 2151T, L. gasseri DSM 20243T, Leuconostoc mesenteroides subsp. mesenteroides DSM 20343T, L. salivarius DSM 20555T, B. breve DSM 20213T, B. longum DSM 20129T, B. bifidum DSM 20456T and B. angulatum DSM 7096. The Lactobacillus strains were routinely grown in MRS broth (Oxoid Ltd., Basingstoke, Hampshire, England) over night at 37 °C. The Bifidobacterium strains were grown in GAM broth (Nissui, Tokyo, Japan) supplemented with 0.5% glucose in anaerobic conditions at 37 °C. Bacteroides fragilis was grown in EG broth (Nissui) supplemented with 5% sheep blood (Biotrading, Mijdrecht, Netherlands) in anaerobic conditions at 37 °C. DNA from the reference strains was extracted as previously described31.

PCR amplification of DNA was carried out as described previously32. Amplification was confirmed by gel electrophoresis in 1.0% agarose. DGGE analysis of each PCR product was conducted with a DCode System (Bio-Rad Laboratories, Hercules, CA, USA) at a denaturing gradient of 35–70% at a constant current of 28 mA for 16 h. The gels were stained for 30 min with SYBR green I nucleic acid gel stain (BioWhittaker Molecular Applications, Rockland, ME, USA).

Statistical analysis of PCR-DGGE profiles

The DGGE images were imported to Bionumerics software, version 6.6 (Applied Maths, St-Martens-Latem, Belgium) for normalisation and band detection and were numerically analysed. Band searching and matching using 1% band tolerance was performed as implemented in the Bionumerics version 6.6. The bands and band matching were manually corrected when seen necessary. Cluster analysis of the DGGE patterns was performed using the unweighted pair-group method using arithmetic averages (UPGMA) based on the Pearson correlation.

Matrices based on the presence/absence and intensities of bands were exported from Bionumerics and imported to Microsoft Excel for the calculation of richness and Shannon diversity indices. The number of bands present in the DGGE profile was used as a measure of richness for the samples. Shannon diversity index, H′, was calculated using the equation H′ = −Σ Pi ln (Pi). Pi is the importance probability of the bands in a gel lane and is calculated as Pi = ni/N where ni is the intensity of band and N is the sum of intensities of all bands in the densitometric profile. Analysis of variance (ANOVA) followed by a Tukey’s post-hoc analysis was used to compare the richness and diversity between the different sample types. The statistical analysis was performed with IBM SPSS Statistics for Windows (Version 21.0. Armonk, NY: IBM Corp.) Moreover, the matrix based on the presence/absence of the bands was imported to Multivariate Statistics Package MVSP version 3.22 (Kovach Computing Services, UK) for Principal Component Analysis (PCA). The PCA was performed using the software’s automated Kaiser’s rule with the other parameters set as default.

Microbial diversity and composition by 16S rRNA gene sequencing PCR

A barcoded primer set based on universal primers 27F and 533R was used to amplify 500 bps of the 16S rRNA genes covering the hypervariable regions V1 to V3 (V1–V3). The PCR was carried out using a high-fidelity KAPA-HiFi polymerase (Kappa Biosystems, US) with an annealing temperature of 52 °C and 30 cycles. Specificity and amplicon size were verified by gel electrophoresis and the amplicons were checked and measured using the Agilent High Sensitivity DNA assay in Agilent 2100 Expert. Purified PCR products were pooled in equimolar amounts, as described by 454 Roche protocols, and submitted for pyrosequencing using the Genome Sequencer FLX Titanium Series (454 Life Science, Branford, USA). All of the procedures followed the manufacturer’s instructions (454 Life Science). The raw sequences reported in this article have been deposited in the EMBL European Nucleotide Archive (accession number PRJEB12221).

From the resulting raw data set provided by pyrosequencing, low quality sequences were filtered out to remove sequences having a length shorter than 100 nucleotides and chimeric sequences were removed using UCHIME software33. A dereplicate request on the QIIME pipeline (v.1.8) commands with default parameters was used for identifying representative sequences for each operational taxonomic unit (OTU) generated from the complete linkage clustering with a 97% similarity and aligned to fully-sequenced microbial genomes (GreenGenes 13_8 database). Quality filtering and final reads number data of the 16S rRNA profiling analysis are presented in Supplementary Table 5. Phylogenetic identification of the organisms on the family and genus levels are presented as read numbers in Supplementary Table 1 and as relative abundances in Supplementary Table 2. To estimate diversity conservatively and reduce noise in patterns of beta diversity, singleton OTUs were removed prior to community analysis. Alpha diversity indices were determined from rarefied tables using the Shannon-Wiener index for diversity, the Chao1 index for richness and Observed Species (number of unique OTUs) and Phylogenetic Distance (PD_whole) were also determined. The Chao1 alpha diversity curves for the sample types as well as individual samples are provided in Supplementary Figure 1. Most of the samples reached a plateau suggesting sufficient coverage of the microbes present in the samples. Microbial Beta-diversity between samples was evaluated and computed from the previously constructed OTU table using UniFrac, a phylogenetic distance metric that measures community similarity based on the degree to which pairs of communities share branch length in a common phylogenetic tree34. Unweighted and weighted UniFrac distances and sample metadata comprised the data matrices used as inputs for principal coordinate analysis (PCoA). Unweighted UniFrac distances compare microbial communities based on presence/absence of members (community membership: presence/absence matrix), and weighted UniFrac also incorporates relative abundance information (community structure:presence/absence/abundance matrix). PCoA plots were used to assess the variation in the composition of microbial communities between samples and to visualise potential clustering of samples by metadata. Samples were also hierarchically clustered based on their inter-sample UniFrac distances using UPGMA (Unweighted Pair Group Method with Arithmetic Mean). Biplots were generated as part of the beta diversity analyses, using genus level OTU tables showing principle coordinate sample clustering alongside weighted taxonomic group data. All beta-diversity measures were performed on OTU tables rarefied to 500 sequences per sample for all samples to account for variations in sequencing depth. Data on assigned sequences at family level shared between samples were used to generate a Venn diagram.

After taxonomical assignment, relative frequencies of different taxonomic categories obtained were calculated using the Statistical Analysis of Metagenomic Profiles program (STAMP v.2.0.0). Statistical differences between experimental groups were estimated by ANOVA analysis with the Games-Howell post-hoc test and the multiple test correction of Benjamini-Hochberg as implemented in STAMP. MEGAN v.5 software was used for hierarchical tree constructions of the microbiota35. Statistical differences between groups of samples were tested using analysis of similarity (ANOSIM – available through QIIME) by permutation of group membership with 999 replicates. The test statistic R, which measures the strength of the results, ranges from −1 to 1: R = 1 signifies differences between groups, while R = 0 signifies that the groups are identical.

Multivariate statistical analysis and clustering were performed using METAGENassist software36. Data-filtering was performed by the interquantile range method followed by quantile normalisation within replicates after log transformation. Hierarchical clustering of OTUs were done using core microbiota defined as the OTUs that are present in at least 50% of the samples and Cluster analysis was performed using the Euclidean distance. Principal component analyses and identification of significant features were performed for all sample groups together.

LDA Effect Size (LEfSe) analysis

To identify taxa with differentiating abundance in the different environments, the LDA Effect Size (LEfSe: Linear Discriminant Analysis Effect Size) algorithm was used with the online interface Galaxy (http://huttenhower.sph.harvard.edu/lefse/)37. LEfSe couples robust tests for measuring statistical significance (Kruskal-Wallis test) with quantitative tests for biological consistency (Wilcoxon-rank sum test). The differentially abundant and biologically relevant features are ranked by effect size after undergoing linear discriminant analysis (LDA). An effect size threshold between 2–3 (on a log 10 scale) was used for all biomarkers discussed in this study.

Inferred metagenomics by PICRUSt

The functionality of the different metagenomes, grouped by sample type, was predicted using the software PICRUSt 1.0.0 (http://picrust.github.io)38. In short, this software allows the prediction of functional pathways from the 16S rRNA reads. First, a collection of closed-reference OTUs was obtained from the filtered reads by using QIIME v 1.8.0 and by querying the data against a reference collection (GreenGenes database, May 2013 version; http://greengenes.lbl.gov) and OTUs were assigned at 97% identity. The resulting OTU table was then used for microbial community metagenome prediction with PICRUSt on the online Galaxy interface (http://huttenhower.sph.harvard.edu/galaxy/). PICRUSt was used to derive relative Kyoto Encyclopedia of Genes and Genomes (KEGG) Pathway abundance. The data were analysed statistically by using STAMP v 2.0.0. Supervised analysis was done using LEfSe to elicit the microbial functional pathways that were differentially expressed in the different samples.