DNA was isolated, and 16S rRNA gene sequencing was performed on stool aliquots and the residual buffer of paired OC-Auto® FIT sampling cartridges from 404 patients. Among these patients, 101 had CRC, 162 had adenomas, and 141 had no colonic lesions. We began by testing whether the bacterial community profiles from FIT cartridges recapitulated their stool counterparts. First, we compared the number of OTUs shared within FIT/stool pairs from the same patient to the number of OTUs shared between patients (Fig. 1a). FIT cartridges and stool from the same patient (red line) had significantly more bacterial populations in common than those taken from different patients (p < 0.001, two-sample Kolmogorov-Smirnov test), indicating that community membership was conserved within patients across stool and FIT cartridges. Second, we calculated the similarity in community structure between samples using the 1-thetaYC index [18]. This metric accounts for both the presence or absence of bacterial populations and their relative abundances. The bacterial community structures of stool and FIT samples from the same patient (red line) were significantly more similar to each other than to those of stool or FIT samples from other patients (Fig. 1b, p < 0.001). Finally, we used a Mantel test to determine whether the patient-to-patient thetaYC distances among stool samples were correlated with the patient-to-patient thetaYC distances among FIT cartridges. We found a significant correlation (Mantel test r = 0.525, p < 0.001), suggesting that the inter-patient variation in community structure among stool samples was conserved in samples from FIT cartridges.
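The two distance-based steps above can be illustrated with a minimal sketch. It assumes the standard Yue and Clayton definition, thetaYC = Σ a_i b_i / (Σ a_i² + Σ b_i² − Σ a_i b_i), where a_i and b_i are the relative abundances of OTU i in the two samples, and a permutation-based Mantel test using Pearson correlation. The function names, the choice of Pearson correlation, and the number of permutations are illustrative assumptions and are not taken from the study's pipeline.

```python
import random
from math import sqrt


def theta_yc_distance(counts_a, counts_b):
    """Return 1 - thetaYC for two OTU count vectors in the same OTU order."""
    total_a, total_b = sum(counts_a), sum(counts_b)
    if total_a == 0 or total_b == 0:
        raise ValueError("each sample needs at least one read")
    a = [c / total_a for c in counts_a]  # relative abundances, sample A
    b = [c / total_b for c in counts_b]  # relative abundances, sample B
    shared = sum(x * y for x, y in zip(a, b))
    denom = sum(x * x for x in a) + sum(y * y for y in b) - shared
    return 1.0 - shared / denom


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((p - mean_x) * (q - mean_y) for p, q in zip(x, y))
    var_x = sum((p - mean_x) ** 2 for p in x)
    var_y = sum((q - mean_y) ** 2 for q in y)
    return cov / sqrt(var_x * var_y)


def upper_triangle(matrix, order):
    """Flatten the upper triangle after reordering rows and columns."""
    n = len(order)
    return [matrix[order[i]][order[j]]
            for i in range(n) for j in range(i + 1, n)]


def mantel(dist_x, dist_y, permutations=999, seed=0):
    """One-sided Mantel test: returns (observed r, permutation p-value)."""
    identity = list(range(len(dist_x)))
    x = upper_triangle(dist_x, identity)
    r_obs = pearson(x, upper_triangle(dist_y, identity))
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # permute patient labels of one matrix only
        if pearson(x, upper_triangle(dist_y, order)) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)
```

In this sketch, `dist_x` would hold the patient-to-patient distances among stool samples and `dist_y` the corresponding distances among FIT cartridges, with patients in the same order in both matrices.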

Fig. 1 Bacterial community structure from FIT cartridge recapitulates stool. Density plots showing distribution of the number of shared OTUs (a) and community similarity (b) between groups of samples (*p < 0.001, two-sample Kolmogorov-Smirnov test).

Next, we observed a significant correlation between the abundance of each genus in the paired FIT cartridge and stool samples (Fig. 2a, Spearman’s rho 0.699, p < 0.001), suggesting that the abundance of bacterial genera was conserved. This correlation was especially strong when comparing the 100 most abundant genera from stool (Spearman’s rho 0.886, p < 0.001). Several bacterial species have been repeatedly associated with CRC, including Fusobacterium nucleatum, Porphyromonas asaccharolytica, Peptostreptococcus stomatis, and Parvimonas micra [9–11, 19]. As expected, the abundance of each of these species in stool was significantly correlated with its abundance in matched FIT cartridges (all p < 0.001, Spearman’s rho ≥0.352) (Fig. 2b). We did observe biases in the abundance of certain taxa. In particular, the genus Pantoea was detected in 399 of the 404 FIT cartridges with an average abundance of 2.4 % but was detected in only 1 stool sample. The genus Helicobacter was detected in 172 FIT cartridges but only 10 stool samples. Conversely, several genera of Actinobacteria were more abundant in stool samples than in FIT cartridges. Notwithstanding these few exceptions, the abundance of the vast majority of genera was well conserved between stool and FIT cartridges. Together, these findings suggested that the overall bacterial community structure and the abundance of specific taxa were similar in FIT cartridges and stool.
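The rank-based correlation used above can be sketched as follows. This is a minimal illustration, assuming Spearman's rho is computed as the Pearson correlation of average ranks (ties share the mean of their positions). The function names are hypothetical, and the sketch does not compute the p-values reported in the text.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho for two equal-length lists (neither may be constant)."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)
```

Here `x` and `y` would be, for example, the relative abundances of one genus or species across patients in stool and in the matched FIT cartridges.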

Fig. 2 Bacterial populations conserved between stool and FIT cartridge. a Scatter plot of the average relative abundance of each bacterial genus in stool and FIT cartridges colored by phylum. b Scatter plots of the relative abundances of the four species frequently associated with CRC. All correlations were greater than 0.35 (all p < 0.001).

We tested whether the bacterial relative abundances we observed from FIT cartridges could be used to differentiate healthy patients from those with carcinomas using random forest models, as we did previously using intact stool samples [11]. Using DNA from the FIT cartridge, the optimal model utilized 28 OTUs and had an AUC of 0.831 (Fig. 3a). This AUC was not significantly different from that of the model based on DNA isolated directly from stool, which used 32 OTUs and had an AUC of 0.853 (p = 0.41). Furthermore, the probabilities of individuals having lesions were correlated between the models generated using DNA isolated from the FIT cartridges and stool samples (Spearman’s rho 0.633, p < 0.001, Fig. 3b). We also generated random forest models for differentiating healthy patients from those with any type of lesion (i.e., adenoma or carcinoma). There was not a significant difference in AUC between the stool-based model with 41 OTUs (AUC = 0.700) and the FIT cartridge-based model with 41 OTUs (AUC = 0.686, p = 0.65, Fig. 3c). Again, the probabilities of individuals having lesions according to the two models were significantly correlated (Spearman’s rho 0.389, p < 0.001, Fig. 3d). These findings demonstrated that models based on bacterial DNA from FIT cartridges were as predictive as models based on DNA isolated directly from stool.
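For readers less familiar with the AUC values reported above, the sketch below shows how an AUC can be computed from a model's predicted probabilities. It uses the rank interpretation of the AUC: the probability that a randomly chosen patient with a lesion receives a higher score than a randomly chosen patient without one, with ties counted as half. It does not reproduce the random forest training or the statistical comparison of two AUCs, and the function name and example values are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC from binary labels (1 = lesion, 0 = healthy) and model scores."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("need at least one sample of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # lesion sample ranked above healthy sample
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical predicted probabilities for four patients.
example_labels = [0, 0, 1, 1]
example_scores = [0.1, 0.4, 0.35, 0.8]
example_auc = roc_auc(example_labels, example_scores)  # 3 of 4 pairs ordered correctly
```

Applying the same function to the probabilities from a stool-based model and from a FIT-based model would give the two AUCs being compared. Testing whether they differ requires a separate procedure that is not shown here.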