Study design and population

We conducted an analytical, cross-sectional, single-center study at the IBD clinics of Campinas State University (Unicamp), Brazil, enrolling CD patients and healthy controls. CD patients were included if their diagnosis had been confirmed by clinical, endoscopic and histological criteria. The control group comprised adults living in the same household, with no previous history of chronic disease. Individuals who had used antibiotics or probiotics during the previous 2 months were excluded. Disease activity in CD patients was assessed by the CDAI score31 and endoscopic findings. A CDAI score below 150 was considered inactive disease (clinical remission).
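The clinical cutoff stated above can be written as a small rule. This is only an illustrative sketch: the function name is hypothetical, and the endoscopic findings that also informed the activity assessment are not modelled.

```python
# Cutoff taken from the text: a CDAI score below 150 is treated as
# inactive disease (clinical remission).
CDAI_REMISSION_CUTOFF = 150


def cdai_status(cdai_score: float) -> str:
    """Label disease activity from the CDAI score alone (hypothetical helper)."""
    if cdai_score < 0:
        raise ValueError("CDAI score cannot be negative")
    return "inactive" if cdai_score < CDAI_REMISSION_CUTOFF else "active"
```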

Clinical data, disease classification, medications and comorbidities were collected on the day of the colonoscopy. Fecal samples were collected by the subjects at home using Sarstedt tubes (Sarstedt, Nümbrecht, Germany) filled with a preservative buffer and were brought to the IBD clinics within 24 hours after defecation. Stool samples were stored frozen at -80 °C until microbiota analysis.

Metagenome profile

Total DNA was extracted from the fecal samples with the Stool PSP Spin DNA kit (STRATEC Biomedical AG, Germany), an integrated system for collecting, transporting and storing stool samples and for subsequent DNA purification.

To profile microbiota composition, the hypervariable V3-V4 region of the bacterial 16S rRNA gene was amplified following the Illumina 16S Metagenomic Sequencing Library Preparation guide32, which uses the following primers: 338F (5′-TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCCTACGGGNGGCWGCAG-3′) and 785R (5′-GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGACTACHVGGGTATCTAATCC-3′). Libraries were sequenced as 2 × 300 bp paired-end reads with an insert size of ~550 bp.
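The primers quoted above contain IUPAC degenerate bases (N, W, H, V). As a minimal sketch, the snippet below separates the locus-specific portion of each primer from the adapter overhang and counts how many concrete sequences each degenerate primer represents. The split point is an assumption: it takes the shared tail `AGATGTGTATAAGAGACAG` as the end of the adapter overhang. All function names are hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Full primers as given in the text (written without internal spaces).
FWD = "TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCCTACGGGNGGCWGCAG"
REV = "GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGACTACHVGGGTATCTAATCC"

# Assumption: both adapter overhangs end with this shared tail.
ADAPTER_TAIL = "AGATGTGTATAAGAGACAG"


def locus_specific(primer: str) -> str:
    """Return the part of the primer that follows the adapter tail."""
    _, tail, rest = primer.partition(ADAPTER_TAIL)
    if not tail:
        raise ValueError("adapter tail not found in primer")
    return rest


def degeneracy(seq: str) -> int:
    """Number of concrete sequences encoded by a degenerate sequence."""
    n = 1
    for base in seq:
        n *= len(IUPAC[base])
    return n


def expand(seq: str) -> list:
    """List every concrete sequence encoded by a degenerate sequence."""
    return ["".join(p) for p in product(*(IUPAC[b] for b in seq))]


if __name__ == "__main__":
    for name, primer in (("forward", FWD), ("reverse", REV)):
        core = locus_specific(primer)
        print(name, core, len(core), degeneracy(core))
```

Under this assumption, the locus-specific portions are 17 and 21 nucleotides long for the forward and reverse primers.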

Bioinformatic and statistical data analysis

The fastq sequences were analysed using the DADA2 tool as described by Callahan et al.33, which recovers single-nucleotide-resolved Amplicon Sequence Variants (ASVs) from amplicon data. Default parameters were used unless stated otherwise. To improve the overall quality of the sequences, the reads were filtered and trimmed using the “filterAndTrim” function implemented in DADA2, as described in https://benjjneb.github.io/dada2/tutorial.html. Low-quality bases at the end of the reads were removed by setting the truncLen option to 280 and 220 for the forward and reverse fastq files, respectively. The sequences were also trimmed at the 5′ end, with the trimLeft option set to 17 and 21 for the forward and reverse reads, respectively. Taxonomic assignment was then performed using the naïve Bayesian classifier method implemented in DADA2, with the SILVA database as reference. Finally, the phylogenetic tree of the ASVs was obtained by creating a multiple sequence alignment with the AlignSeqs function of the DECIPHER34 R package and building the tree with the FastTree program35.
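To make the positional trimming concrete, the following sketch mimics only the truncLen and trimLeft behaviour on a single read. It is not DADA2 code: `trim_read` is a hypothetical helper, and quality-based filtering is left out. It follows the DADA2 documentation, under which reads shorter than truncLen are discarded and retained reads end up with length truncLen minus trimLeft.

```python
from typing import Optional, Tuple

# Settings reported in the text: (trimLeft, truncLen) per read direction.
FORWARD = (17, 280)
REVERSE = (21, 220)


def trim_read(seq: str, qual: str, trim_left: int,
              trunc_len: int) -> Optional[Tuple[str, str]]:
    """Apply positional trimming to one read (hypothetical helper).

    Returns None when the read is shorter than trunc_len, otherwise the
    sequence and quality strings cut to positions trim_left..trunc_len.
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    if len(seq) < trunc_len:
        return None  # read too short: discarded
    return seq[trim_left:trunc_len], qual[trim_left:trunc_len]


if __name__ == "__main__":
    read = "A" * 300   # placeholder 300-nt read
    qual = "I" * 300   # placeholder quality string
    fwd = trim_read(read, qual, *FORWARD)
    rev = trim_read(read, qual, *REVERSE)
    print(len(fwd[0]), len(rev[0]))
```

With the reported settings, a 300-nt read is reduced to 263 nt (forward) or 199 nt (reverse). The trimLeft values of 17 and 21 coincide with the lengths of the locus-specific portions of the two primers, so this step presumably removes the primer sequence from each read.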

Statistical analysis was performed in R (version 3.4.4) using the following packages: phyloseq (version 1.24.0) for the import, storage, analysis and graphical display of microbiome census data36, and vegan (version 2.4.2) for PERMANOVA analysis37. Data were pre-processed by filtering out features with fewer than 10 read counts and present in fewer than 2 samples36. Plots of Shannon diversity indices and bar plots were generated using the R package ggplot2. The longitudinal microbiome analysis was carried out with q2-longitudinal, a software plugin for the QIIME 2 microbiome analysis platform (https://qiime2.org)38.
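The pre-processing filter and the Shannon index can be illustrated with a short sketch. Two assumptions are made here. First, the filter is read literally, so a feature is dropped only when it has both fewer than 10 total reads and presence in fewer than 2 samples; a stricter reading would drop a feature failing either condition. Second, the Shannon index uses the natural logarithm. The function names and the toy table are hypothetical.

```python
import math
from typing import Dict, List


def drop_rare_features(table: Dict[str, List[int]],
                       min_reads: int = 10,
                       min_samples: int = 2) -> Dict[str, List[int]]:
    """Drop features that are both low-count and low-prevalence.

    `table` maps a feature (ASV) name to its read counts per sample.
    """
    kept = {}
    for feature, counts in table.items():
        total = sum(counts)
        prevalence = sum(1 for c in counts if c > 0)
        if total < min_reads and prevalence < min_samples:
            continue  # rare feature: filtered out
        kept[feature] = counts
    return kept


def shannon(counts: List[int]) -> float:
    """Shannon diversity H = -sum(p_i * ln p_i) over non-zero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


if __name__ == "__main__":
    toy = {"asv1": [50, 30, 0], "asv2": [3, 0, 0], "asv3": [4, 5, 0]}
    filtered = drop_rare_features(toy)
    # Per-sample diversity is computed down each column of the table.
    sample0 = [counts[0] for counts in filtered.values()]
    print(sorted(filtered), shannon(sample0))
```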

Staining techniques for neutral and acid mucins

The neutral and acid mucin content of the large bowel mucosa biopsies was determined individually using modified histochemical Periodic Acid-Schiff (PAS)39 and Alcian Blue (AB) techniques. The slides were read under an ordinary optical microscope at a final magnification of 200×. The histological parameters were analyzed qualitatively and quantitatively by a pathologist experienced in diseases of the digestive tract who was unaware of the origin of the material and the objectives of the study. The neutral mucins stained magenta, while the acid mucins stained blue.

Computer-assisted image processing

The selected images were captured with a video camera coupled to an optical microscope. These images were processed and analyzed using the NIS-Elements software (Nikon Corporation, Instruments Company, Japan), installed on a computer with good image processing capacity. By means of color histograms in the RGB (Red, Green, Blue) system, the software determined the color intensity as the number of pixels in each selected field and expressed the final data as a percentage of each analyzed field. The final value in the segments with and without intestinal transit was the mean of the values obtained from three different fields.
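The per-field percentage and the three-field mean can be sketched as follows. This does not reproduce the NIS-Elements algorithm, which the text does not specify. The pixel rule `is_blue_dominant` is a purely hypothetical stand-in for a colour threshold, and all names are illustrative.

```python
from statistics import mean
from typing import Callable, Iterable, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue), each 0-255


def is_blue_dominant(px: Pixel) -> bool:
    """Hypothetical rule: a pixel counts as stained if blue dominates."""
    r, g, b = px
    return b > r and b > g


def stained_percent(field: Sequence[Pixel],
                    is_stained: Callable[[Pixel], bool]) -> float:
    """Percentage of pixels in one field that satisfy the staining rule."""
    if not field:
        raise ValueError("field contains no pixels")
    hits = sum(1 for px in field if is_stained(px))
    return 100.0 * hits / len(field)


def specimen_value(fields: Iterable[Sequence[Pixel]],
                   is_stained: Callable[[Pixel], bool] = is_blue_dominant
                   ) -> float:
    """Mean of the per-field percentages (three fields in the text)."""
    return mean(stained_percent(f, is_stained) for f in fields)
```

In practice the staining rule would need to be calibrated separately for the magenta (PAS) and blue (AB) stains.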

Quantification of Saccharomyces cerevisiae by qPCR

For quantification of Saccharomyces cerevisiae, qPCR was performed using the primers and probe described by Mallant-Hent et al.40. Briefly, the reaction was performed in a total volume of 12 μl containing 1x Universal Master Mix (Applied Biosystems), 150 nmol of each primer (5′-GAA ATG CCA CCG TGA ATG C and 5′-CTT TGG TGG TGA TCC TCT ATG ATT G), 100 nmol of the probe (FAM-TGG CAC CAT GAA CCC TAG CGT CGT T-TAMRA), and 120 ng of DNA extracted from stool samples. The reaction was run on the QuantStudio 6 Flex Real-Time PCR System (Applied Biosystems - Life Technologies Corp., USA). For quantification, a standard curve was generated with Saccharomyces cerevisiae (the strain was kindly donated by the Laboratory of Enzymology and Molecular Biology of Microorganisms, State University of Campinas).
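Quantification from a standard curve usually rests on a linear fit of the threshold cycle (Ct) against the log10 of the known standard quantity. The sketch below illustrates that general approach rather than the exact procedure used in the study. The dilution series, its units and the function names are hypothetical.

```python
import math
from typing import Sequence, Tuple


def fit_standard_curve(quantities: Sequence[float],
                       cts: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_ct(ct: float, slope: float, intercept: float) -> float:
    """Invert the standard curve to estimate the quantity in a sample."""
    return 10 ** ((ct - intercept) / slope)


def amplification_efficiency(slope: float) -> float:
    """Efficiency implied by the slope (1.0 means perfect doubling)."""
    return 10 ** (-1.0 / slope) - 1.0


if __name__ == "__main__":
    # Hypothetical ten-fold dilution series of the standard strain.
    standards = [1e1, 1e2, 1e3, 1e4, 1e5]
    measured_ct = [36.5, 33.0, 29.5, 26.0, 22.5]
    slope, intercept = fit_standard_curve(standards, measured_ct)
    print(slope, intercept, quantity_from_ct(29.5, slope, intercept))
```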

Statistical analyses

Sample size

Based on pilot study data, the sample size was calculated with the relative contribution (percentage) of Proteobacteria as the main variable. Considering paired groups (pilot with 6 subjects in each group, effect size = 0.90) and assuming an α of 5% and a β of 5% (power 95%), 12 subjects were necessary in each group. The sample size was calculated with G*Power version 3.1.2 (program written, concept and design by Franz Faul, Universität Kiel, Germany; freely available Windows application software)41.

Data for the CD patient and control groups were expressed as medians and interquartile ranges (IQR, 25–75%) for continuous variables and as frequencies for categorical variables. For qualitative variables, Fisher's exact test and the chi-square test (χ2) were used. The Mann-Whitney U-test (non-parametric) was used to compare continuous variables between categories. The significance level adopted was 5% for all statistical tests (p-value < 0.05). Statistical analyses were performed using SPSS v.20.0 software (IBM Inc., Armonk, NY, USA).
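For readers unfamiliar with these summaries, the sketch below computes a median with its interquartile range and the Mann-Whitney U statistic using average ranks for ties. It is an illustration only. The quartile method ("inclusive") is an assumption and may differ from the one SPSS applies. The function names are hypothetical, and no p-value is computed.

```python
from statistics import median, quantiles
from typing import Sequence, Tuple


def median_iqr(values: Sequence[float]) -> Tuple[float, float, float]:
    """Return (median, 25th percentile, 75th percentile)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), q1, q3


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> float:
    """Mann-Whitney U statistic (the smaller of U1 and U2)."""
    # Pool both groups, remembering which group each value came from.
    pooled = sorted((v, g) for g, grp in enumerate((x, y)) for v in grp)
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values share the average of ranks i+1 .. j.
        midrank = (i + 1 + j) / 2
        rank_sum_x += midrank * sum(1 for k in range(i, j)
                                    if pooled[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    u1 = rank_sum_x - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    return min(u1, u2)
```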

Ethical statement

During routine visits, subjects who agreed to participate in the study signed an informed consent form. All methods were performed in accordance with the relevant guidelines and regulations. The study was approved by the Institutional Ethics Review Board at Unicamp, in Campinas, Brazil, under reference number 885.749/14.