Our results indicate that CD is associated more with a loss of beneficial microorganisms than with a gain of pathogenic ones. The beneficial microorganisms include those involved in butyrate production such as Faecalibacterium,22 Christensenellaceae, Methanobrevibacter and Oscillospira. Our findings confirm the results of many other studies reporting a lower relative abundance of Faecalibacterium in patients with CD and also show that this genus is not missing in patients with UC, thus making it a useful marker to discriminate patients with CD from patients with UC. Christensenellaceae, Methanobrevibacter and Oscillospira have been correlated with a low BMI (<25),23–25 and they may interact with the gut immune system to maintain homeostasis. Potentially pathogenic microorganisms, termed pathobionts, include Fusobacterium and Escherichia. The former is associated with infections26 and colorectal cancer27,28 and the latter with IBD.8,29

At baseline, six genera were enriched in patients with CD compared with 12 in HC (FDR<0.003), whereas only two genera were enriched in patients with UC compared with one in HC (FDR<0.03). This suggests that dysbiosis is also greater in CD than in UC at the taxonomic level, with a significant overall alteration in 18 genera versus 3, respectively (figure 2C). To uncover microbial signatures of recurrence, we used the Kruskal-Wallis test to compare the faecal samples of patients with UC and CD at the time of recurrence with those of patients who remained in remission after 1 year of follow-up. We did not find significant differences. Furthermore, to assess whether the baseline microbiome could predict recurrence in patients with CD and UC, we used the same test to compare the baseline faecal samples of those who later developed recurrence (n=13 for CD and n=18 for UC) with those of patients who remained in remission after 1 year of follow-up (n=21 for CD and n=15 for UC). The results did not reveal any biomarker predictive of recurrence for either CD or UC.
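The genus-level comparisons above report FDR-corrected p values. The text does not name the correction procedure, so the following is only a minimal sketch that assumes the Benjamini-Hochberg method; the genus names and raw p values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values, one per genus (illustration only).
raw_p = {"genus_a": 0.005, "genus_b": 0.01, "genus_c": 0.03, "genus_d": 0.04}
fdr = dict(zip(raw_p, benjamini_hochberg(list(raw_p.values()))))

# Keep the genera that pass an FDR threshold such as the 0.03 used for UC.
significant = [genus for genus, q in fdr.items() if q < 0.03]
print(fdr, significant)
```

A raw p value can look convincing and still fail after correction when many genera are tested at once, which is why both values are worth reporting.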

Figure 2 Dysbiosis in patients with IBD. (A) Microbiome clustering based on unweighted (left) and weighted (right) Principal Coordinate Analysis-UniFrac metrics. Significant differences were observed between all controls (All-HC, combining HC, healthy relatives HR(CD) and HR(UC)) and patients with CD (NPMANOVA test; p=0.001 for weighted and unweighted UniFrac indexes) and between all controls and patients with UC (NPMANOVA test, p=0.001 for unweighted and p=0.004 for weighted UniFrac). Microbial richness was calculated based on the Chao1 index (B, left) and microbial richness and evenness on the Shannon index (B, right). According to Student's t-test, the microbiome of patients with CD presented significantly lower richness and evenness than that of healthy controls (HC, HR(CD) and HR(UC)) and of patients with UC, but patients in remission and in recurrence (CD-RC and UC-RC) did not differ significantly. *p<0.05. (C) Taxonomic differences were detected between HC and UC and between HC and CD using the Kruskal-Wallis test (corrected p values; false discovery rate <0.01). CD, Crohn's disease; NPMANOVA, non-parametric multivariate analysis of variance.

We performed a multivariate analysis of variance on distance matrices (weighted and unweighted UniFrac) using the NPMANOVA test. The microbial communities of the two groups of controls (relatives (HR) and non-relatives (HC)) were not significantly different from each other (p=0.126 for weighted and unweighted UniFrac distances), except for one genus: Collinsella was more abundant in HR than in HC (Kruskal-Wallis test, 52×10⁻⁵ vs 1.7×10⁻⁵; FDR=1.6×10⁻⁵). Conversely, the microbiome of patients with CD and UC was significantly different from that of controls (relatives and non-relatives (All-HC)) (NPMANOVA test; p=0.001 for weighted and unweighted UniFrac distances for CD; p=0.001 for unweighted and p=0.004 for weighted UniFrac distances for UC) (figure 2A). The microbiomes of patients with CD and patients with UC also differed significantly from each other (NPMANOVA test, p=0.001 for weighted and unweighted UniFrac distances). Patients with CD, but not patients with UC, showed a lower microbial α diversity than the two groups of controls (p<0.05), as reflected by the Chao1 and Shannon indexes (figure 2B).
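For readers unfamiliar with the two α diversity indexes, the sketch below shows how they can be computed from a vector of per-taxon read counts. It is an illustration only, not the study's pipeline: it assumes the natural logarithm for the Shannon index (some tools use base 2) and the bias-corrected form of Chao1, and the count vectors are invented.

```python
import math


def shannon_index(counts):
    """Shannon index H = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def chao1_index(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1))."""
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)  # taxa seen exactly once
    doubletons = sum(1 for c in counts if c == 2)  # taxa seen exactly twice
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


# Invented count vectors: an even community and one dominated by a single taxon.
even_community = [25, 25, 25, 25]
skewed_community = [97, 1, 1, 1]

print(shannon_index(even_community))    # equals ln(4) for four equally abundant taxa
print(shannon_index(skewed_community))  # lower, because evenness is lower
print(chao1_index([5, 3, 2, 2, 1, 1, 1, 0]))
```

Chao1 responds only to richness, whereas the Shannon index also drops when a few taxa dominate, which is why the two are reported side by side.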

Figure 1 Microbiome stability. Unweighted UniFrac distances were calculated between different time periods for healthy relatives HR(CD) (relatives of patients with CD), HR(UC) (relatives of patients with UC), and patients with CD and UC (3M, 3 months; 6M, 6 months; 9M, 9 months; 12M, 12 months). CD-RC and UC-RC refer to samples collected at recurrence onset. At the 3-month interval, patients with CD and UC presented significant differences in their UniFrac indexes compared with their HR (Mann-Whitney U test, *p=0.01). We compared the UniFrac indexes obtained between samples collected at baseline and the rest of the time points using the mixed-design ANOVA model and found that the microbiome of patients with CD was significantly more unstable than that of patients with UC (mixed-ANOVA, p<0.001). CD, Crohn's disease.

Using the weighted UniFrac distance, a metric for comparing microbial community composition between samples, we evaluated the stability of the microbiome of patients with UC and CD over time, comparing samples at baseline with those at the following time points: 3, 6, 9 and 12 months. Over a 3-month interval, patients with CD, but not patients with UC, showed higher UniFrac distances than healthy relatives (HR) (Mann-Whitney test, p=0.01), thereby indicating a higher instability of the CD microbiome compared with controls (figure 1). Conversely, patients with UC presented a more stable microbiome than their relatives (Mann-Whitney test, p=0.015). Furthermore, over the 1-year follow-up, we compared the UniFrac distances obtained between samples collected at baseline and the rest of the time points using the mixed-design ANOVA model, a repeated measures analysis of variance. The results showed that the microbiome of patients with CD was significantly more unstable than that of patients with UC (mixed-ANOVA, p<0.001).

To characterise the microbial community of IBD, we enrolled 178 participants in a longitudinal study (discovery cohort): 40 HC unrelated to the patients, 34 patients with CD, 33 patients with UC, and 36 and 35 healthy relatives (HR) of the patients with CD and UC, respectively. HR were the patients' first-degree relatives; however, information on whether they were living in the same house as the patients at the time of sampling was not available. Unrelated HC provided a faecal sample at a single time point, whereas HR provided two samples within a 3-month interval. Patients with UC and CD in remission provided samples at 3-month intervals over a 1-year follow-up. Patients with IBD who developed recurrence provided a faecal sample at its onset. During the 1-year follow-up, 13 patients with CD (38%) and 18 patients with UC (54%) developed recurrence. A total of 415 samples were collected for microbiome analysis.

Previous studies have shown that smoking is associated with IBD.30 Therefore, we tested the link between smoking and disease severity (remission and recurrence) using the χ² test. We found no link between being a smoker or ex-smoker and disease severity. We then studied the association between the relative abundance of groups of bacteria and smoking habit using the Kruskal-Wallis test. In patients with CD, a genus belonging to Peptostreptococcaceae was present in a higher proportion in smokers (FDR=0.006), while Eggerthella lenta was found in a higher proportion in non-smokers (see online supplementary material 1). In patients with UC, we observed that smokers presented a greater abundance of Butyricimonas, Prevotella and Veillonellaceae (FDR<0.04), while non-smokers had a higher proportion of Clostridiaceae and Bifidobacterium adolescentis (FDR<0.03). We also examined the link between the relative abundance of groups of bacteria and disease localisation for CD and extension for UC (obtained by the Montreal classification).31 In patients with CD, the disease was localised mostly in the ileum (L1, 35%) and in the ileocolon (L3, 64.7%). The Mann-Whitney test revealed that Enterococcus faecalis and an unknown species belonging to Erysipelotrichaceae were more abundant in stool when the disease was localised in the ileum than in the ileocolon. In patients with UC, the distribution of disease behaviour at sampling was as follows: proctitis (E1, 27.3%), left-sided colitis (E2, 33.3%) and pancolitis (39.4%). Using the Kruskal-Wallis test, we correlated disease behaviour with microbial community composition and found that proctitis was associated with a greater relative abundance of an unknown Clostridiales, Clostridium, an unknown Peptostreptococcaceae and Mogibacteriaceae (FDR<0.05) in stool. Finally, we did not find any relation between medication use (table 1) and microbiome composition.

Microbial marker discovery

The effectiveness of FC as a measure of IBD activity was assessed on a subset of faecal samples (from the discovery cohort) provided by 122 participants (figure 3). For patients with CD and UC, FC was measured at baseline and either after 1 year in remission or at recurrence. During remission, FC was significantly higher in patients with CD and UC than in their HR, and it was significantly higher during recurrence than during remission (figure 3). However, FC concentration did not differ between patients with CD and UC, either during remission or at recurrence, making it unsuitable for discriminating between the two disorders.

Figure 3 Calprotectin: biomarker of inflammation. Calprotectin was measured in the stool of healthy relatives of patients with CD (HR(CD)) and UC (HR(UC)), and in the stool of patients with CD and UC at baseline (TP0), after 1 year in remission (RM) and at recurrence (RC). The Mann-Whitney test was used to compare differences between groups. CD, Crohn's disease.

The groups of microbes that presented the most significant differences between CD and UC and between CD and HC using the Kruskal-Wallis test (FDR<0.05) were selected to develop an algorithm with the potential to discriminate CD from non-CD (figure 4A). This algorithm retains samples that: “do not contain Faecalibacterium, or Peptostreptococcaceae;g, Anaerostipes and Christensenellaceae;g or contain Fusobacterium and Escherichia but not Collinsella and Methanobrevibacter”. Faecalibacterium, an unknown genus of Peptostreptococcaceae, Anaerostipes, Methanobrevibacter and an unknown genus of Christensenellaceae were abundant in HC and patients with UC and absent or almost absent in patients with CD, while Fusobacterium and Escherichia were abundant in patients with CD and almost absent in HC and patients with UC. Collinsella, which was found mostly in UC cases, allowed us to discriminate between UC and CD. With these eight genera, we implemented the algorithm to identify patients with CD.
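The quoted rule can be read in more than one way, so the sketch below encodes only one possible interpretation. Two assumptions are not in the source: the grouping of the "or" and "and" clauses, and the definition of "contains" as a relative abundance above a threshold (zero by default). The function names and example profiles are hypothetical.

```python
# Taxa that, under this reading, must ALL be absent to trigger the second clause.
PROTECTIVE_GROUP = ("Peptostreptococcaceae;g", "Anaerostipes", "Christensenellaceae;g")


def is_present(profile, taxon, threshold=0.0):
    """Assumed presence criterion: relative abundance above the threshold."""
    return profile.get(taxon, 0.0) > threshold


def flag_cd(profile, threshold=0.0):
    """Return True if the sample matches the assumed reading of the CD rule."""
    def has(taxon):
        return is_present(profile, taxon, threshold)

    lacks_faecalibacterium = not has("Faecalibacterium")
    lacks_protective_group = not any(has(t) for t in PROTECTIVE_GROUP)
    pathobiont_pattern = (
        has("Fusobacterium") and has("Escherichia")
        and not has("Collinsella") and not has("Methanobrevibacter")
    )
    return lacks_faecalibacterium or lacks_protective_group or pathobiont_pattern


# Hypothetical relative-abundance profiles (illustration only).
control_like = {"Faecalibacterium": 0.08, "Anaerostipes": 0.01,
                "Collinsella": 0.02, "Methanobrevibacter": 0.005}
cd_like = {"Fusobacterium": 0.03, "Escherichia": 0.10}
uc_like = {"Faecalibacterium": 0.05, "Anaerostipes": 0.01,
           "Fusobacterium": 0.01, "Escherichia": 0.02, "Collinsella": 0.03}

print(flag_cd(control_like), flag_cd(cd_like), flag_cd(uc_like))
```

Under this reading, Collinsella and Methanobrevibacter act only as a veto on the pathobiont clause, which matches the statement that Collinsella helps separate UC from CD.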

Figure 4 Microbial marker discovery and validation. Eight bacterial genera showed potential to discriminate between HC (unrelated HC) and patients with CD and UC in the discovery cohort: 34 HC, and 33 patients with UC and 34 patients with CD (A) and in the validation cohort of 2045 faecal samples from HC (n=1247), CD (n=339), UC (n=158), IBS (n=202) and anorexia (n=99) (B). Each blue bar represents the presence of each microbial group for each subject. Participants in each group are underlined with a specific colour code (blue=all HC; red=CD; yellow=UC; green=IBS and purple=anorexia). The plot was generated using an R script on the relative abundance of the eight bacterial genera. The gradient of colours for the bars corresponds to white=absent, clear blue=low abundance and dark=high abundance. (C) Unweighted UniFrac Principal Coordinate Analysis representation of the various groups of subjects. Significant differences were found between CD and HC, UC, IBS and anorexia (NPMANOVA test, p<0.001). CD, Crohn's disease; HC, unrelated healthy controls; NPMANOVA, non-parametric multivariate analysis of variance.

We first tested the performance of this algorithm on the rest of our sample set, collected 3 months after baseline from relatives of HC (167 samples), and 3, 6, 9 and 12 months after baseline from patients with IBD (135 samples for CD and 135 for UC). We obtained an average of 77.7% true positives for CD detection and an average of 7.3% and 12.8% false positives for HC and UC, respectively (table 2). Therefore, the diagnostic accuracy for distinguishing patients with CD from HC and from patients with UC was 85.1% and 82.4%, respectively. Among the 34 patients with CD, the median duration of the disease at sampling was 6.5 years. For four patients, the disease was diagnosed in the same year as the sampling, and the algorithm was able to detect three of them (75%).

Table 2 Detection of CD markers in HC, CD, UC, IBS and subjects with anorexia

We validated our method with several unpublished and published datasets. To evaluate the sensitivity of the markers, we analysed a cohort of 54 patients with CD recruited at the University Hospital Leuven (Belgian CD cohort). Microbial DNA extraction, 16S rRNA gene amplification and sequencing, and data analysis were performed in our laboratory in Spain. We generated about 5.2 million high-quality sequence reads for the 187 samples. We applied our algorithm to the whole cohort and identified 81.8% of the samples as CD (true positives), giving an overall sensitivity of 81.8% (table 2). Furthermore, to evaluate the predictive value for recurrence, we performed a Kruskal-Wallis analysis of the faecal samples collected before surgery, comparing patients on the basis of their Rutgeerts scores obtained 6 months after surgery. The results showed that patients who developed postoperative recurrence (Rutgeerts score of i3 or i4, n=28) harboured a higher relative abundance of Streptococcus (p=0.002; FDR=0.17) than those who remained in remission (Rutgeerts score of i0 or i1, n=26). This result suggests that the presence of Streptococcus in stool samples before surgery is a predictive marker of future recurrence.

To evaluate the specificity of the markers for detecting CD versus UC, we analysed a cohort of 41 patients with UC enrolled at the University Hospital Vall d'Hebron (Spanish UC cohort). The study was part of a European project (MetaHIT; http://www.metahit.eu) and included patients with UC in long-term remission. Clinical information is shown in table 1. We extracted and sequenced the faecal microbiome at baseline (ie, collected before any intervention), generating 1.5 million sequence reads, and tested our algorithm on this dataset. We obtained a specificity of 95.1% for the detection of CD versus UC (table 2). We also tested the specificity of our algorithm on several published non-IBD datasets, namely those from patients with IBS, subjects with anorexia and healthy subjects. IBS and CD may present common symptoms, including abdominal pain, cramps, constipation and diarrhoea, and a simple method that distinguishes CD from IBS could also help reduce unnecessary endoscopies. Therefore, we applied our algorithm to the faecal samples of 125 subjects previously diagnosed with IBS. The sequence data were obtained from a recently published study.32 Of the 125 patients with IBS, the algorithm identified seven as being CD, thus showing only 5.6% false positives and a specificity of 94.4% (table 2).

The algorithm was then tested against a set of 1016 faecal samples collected at King’s College (London) from a cohort of 977 healthy twin individuals23 and against 158 faecal samples obtained from HC and patients diagnosed with anorexia.33 The former study, comprising healthy female adult twin pairs from the UK, was originally designed to evaluate how host genetic variation shapes the gut microbiome. Our algorithm detected 75 out of 1016 samples (7.3% false positives) as being CD, thus showing a specificity of 92.7%. The second study was designed to address dysbiosis in patients with anorexia compared with HC and to evaluate the shift in the microbial community after weight gain in patients with anorexia.34 As shown in that study, anorexia is associated with an alteration of gut microbiome composition. To evaluate whether the markers are affected by changes in the gut community that result from a condition other than IBD, we tested the algorithm on this anorexia cohort. Our tool detected 9 false positives out of 158 samples, thus showing a specificity of 94.3%.
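Because none of the subjects in these non-IBD datasets has CD, every sample flagged as CD is a false positive, and specificity follows directly from the counts. The short sketch below (the function name is illustrative) reproduces two of the reported figures, for the IBS and anorexia datasets.

```python
def specificity_percent(false_positives, n_non_cd):
    """Specificity (%) when every flagged sample in a non-CD cohort is a false positive."""
    true_negatives = n_non_cd - false_positives
    return 100.0 * true_negatives / n_non_cd


# Counts as reported in the text.
ibs_specificity = specificity_percent(false_positives=7, n_non_cd=125)
anorexia_specificity = specificity_percent(false_positives=9, n_non_cd=158)

print(round(ibs_specificity, 1))       # IBS dataset
print(round(anorexia_specificity, 1))  # anorexia dataset
```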

Figure 4B illustrates the profile of the eight microbial markers in the whole dataset of 2045 faecal samples from the various conditions: HC, CD, UC, IBS and anorexia. The results confirmed that CD is characterised by a different abundance profile of the eight markers compared with the other groups, as also shown by the separate clustering in the unweighted UniFrac PCoA representation (figure 4C).

To test the accuracy of the method, we also applied it to a set of recently published data from a French cohort of IBD subjects,5 although those authors used a different method to analyse the microbial community: they targeted a different variable region of the 16S rRNA gene (V3–V5 instead of V4) and used a different sequencing platform (Ion Torrent instead of Illumina MiSeq). In that study, Sokol et al characterised the microbiome of 235 well-phenotyped patients with IBD and 38 HC. In spite of the technical differences, we re-ran the analysis using their raw sequence data and our sequence analysis protocol (see the Methods section). Using our quality control criteria, we recovered 8.5 million high-quality sequences for 232 patients with IBD (146 CD and 86 UC) and the 38 HC. Our method showed an accuracy of 64% for the prediction of CD versus UC (60% sensitivity and 68% specificity) and of 77% for the prediction of CD versus HC (60% sensitivity and 94.8% specificity). Moreover, we noticed that this dataset contains no sequences belonging to the genus Collinsella and only a very low abundance of Methanobrevibacter, two genera that, in our algorithm, allow differentiation between UC and CD.
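The text does not state how accuracy was derived from sensitivity and specificity. As a hedged observation, the reported values are consistent with the unweighted mean of the two (often called balanced accuracy) rather than with an accuracy weighted by group size. The sketch below shows both calculations using the figures reported above; the function names are illustrative.

```python
def balanced_accuracy(sensitivity, specificity):
    """Unweighted mean of sensitivity and specificity (both in %)."""
    return (sensitivity + specificity) / 2.0


def size_weighted_accuracy(sensitivity, specificity, n_cases, n_controls):
    """Overall accuracy (%) when each group is weighted by its sample size."""
    correct = sensitivity * n_cases + specificity * n_controls
    return correct / (n_cases + n_controls)


# CD versus UC: 146 CD and 86 UC; CD versus HC: 146 CD and 38 HC.
cd_vs_uc = balanced_accuracy(60.0, 68.0)
cd_vs_hc = balanced_accuracy(60.0, 94.8)
cd_vs_uc_weighted = size_weighted_accuracy(60.0, 68.0, 146, 86)
cd_vs_hc_weighted = size_weighted_accuracy(60.0, 94.8, 146, 38)

print(round(cd_vs_uc), round(cd_vs_hc))                    # matches the reported 64 and 77
print(round(cd_vs_uc_weighted), round(cd_vs_hc_weighted))  # size-weighted alternative
```

The gap between the two definitions is largest for CD versus HC, where the groups are most unequal in size, so the choice of definition matters when comparing these figures with other studies.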