Twelve patients starting KD and 11 healthy parents not starting KD were enrolled. For inclusion criteria of the patients, demographics, and treatment details, see Methods and Table 1. Fecal samples were collected at two time points, before and 3 months after starting KD. At the second time point, the ketogenic ratio was 4:1 in 7 children, 3.5:1 in 2, and 3:1 in 3. The ketone levels of β-hydroxybutyric acid were 0.3 ± 0.2 (mean ± SD), range 0.1–0.8 before diet start and 4.1 ± 1.2 with a range of 1.4–5.6 mmol/l after 3 months. Blood glucose levels decreased from 4.9 ± 0.5 (mean ± SD) to 4.3 ± 0.4 mmol/l during the intervention.

Table 1 Patients included in the study (n = 12) and efficacy concerning seizures and behavior at 3 months on ketogenic diet Full size table

Five out of the 12 children were responders concerning seizure frequency with >50% seizure reduction. Three patients, though non-responders concerning seizure frequency, had shorter seizures and less postictal tiredness. Ten children were responders concerning cognition and motor function (see Methods). The two children who did not improve in these aspects were non-responders also concerning seizure frequency and tapered the diet after follow-up at 3 months.

Sequencing results and evaluation

Average sequencing depth was 2.24 ± 0.59 (mean ± SD) million reads for patients and 2.05 ± 0.75 (mean ± SD) million reads for healthy parents. Raw read length was 2 × 300 bp.

A mock community (ZymoResearch) was processed and sequenced identically to all samples. It was used to test marker-based (MetaPhlAn2 and metagenomic operational taxonomic unit (mOTU)), alignment-based (RTG), and kmer-based (kaiju) approaches for taxonomic profiling. Our mock community sequencing data was best recaptured using marker-based taxonomic profiling (Supplementary Figure 1), which was then used for further analysis. We demonstrated that our sample preparation workflow ensured efficient lysis and DNA extraction from different bacterial taxa including both Gram-positive and Gram-negative organisms. However, no fungi from the mock community were identified using any bioinformatics tool tested. A negative control was included in our sample preparation and sequencing procedure, from which 722 paired sequencing reads were obtained, with 428 reads passing quality filtering. None of the reads from the negative control could be identified as a clade-specific marker in MetaPhlAn2.

Raw sequence analysis output tables of taxonomic (MetaPhlAn2) and functional (SUPERFOCUS) profiles are provided as Supplementary Table 1 and Supplementary Table 2.

Gut microbial diversity

Alpha diversity

Alpha diversity indicates species diversity within a single sample. Observed mOTUs, Chao1, and Shannon diversity were lower in patients than in controls already before treatment, i.e., at time point one, and the difference seemed to increase further after treatment, i.e., at time point two (Fig. 1a). However, the difference in diversity between the two time points was not significant in either patients or controls. Thus introduction of KD had little impact on the patients’ total gut microbial diversity. A slight correlation between alpha diversity values and age of the patients at time point one was detected (Fig. 1b). This was only significant for Chao1 (p = 0.044, r2 = 0.345). Thus a higher alpha diversity in the parents may reflect a more mature gut microbiota.

Fig. 1 Microbial alpha diversity analysis. From left to right: Total number of observed metagenomic operational taxonomic units (mOTUs), total species richness Chao1, and Shannon’s diversity index of evenness a. Data are presented as follows: center line, median; box limits, upper and lower quartiles; whiskers, 1.5× interquartile range; points, outliers. Dunn’s test of multiple comparisons with Benjamini–Hochberg adjustment: *p < 0.05, **p < 0.01. Ctrl control, Pat patient. White, Controls' time point 1; Ivory, Controls' time point 2; Green, Patients' time point 1; Red, Patients' time point 2. Correlation of alpha diversity to age in patients b with R2 values as indicated Full size image

Beta diversity

Beta diversity measurements reflect compositional differences between samples. Principal component analysis (PCA) of the taxonomic composition of the gut microbiota revealed clustering of control samples and some clustering of patient samples but patient samples were in general more different from each other than control samples (Fig. 2a). A shift along the x axis was detected at time point one in the patients compared to controls, and a shift along the y axis was detected at time point two in the patients. This indicates that the gut microbiota in patients was different from that of controls and that introduction of KD has an impact on the composition of the gut microbiota.

Fig. 2 Microbial beta diversity analysis. Principal component analysis (PCA) of a taxonomic and b functional profiles. Controls before (white circles), Controls after (ivory squares), Patients before (green circles), Patients after (red squares). Taxonomic profiles of individuals at the phylum level c Full size image

PCA of the functional composition of the gut microbiota (Fig. 2b) at SEED subsystem level 3 reproduced clustering of controls and less clustering of patients as seen in Fig. 2a. At time point one, patient samples were shifted along both x and y axis compared to controls and at time point two this shift was even more pronounced, indicating differences in the functional profile of the gut microbiota between patients and controls and that KD influences the functional composition.

Analysis of taxonomic changes in the gut microbiota during KD

Dominant phyla in fecal samples from patients and controls at time point one were Firmicutes, Bacteroidetes, and Actinobacteria (Fig. 2c), which are typically found in the human gut.11 Here intraindividual changes were detected in both groups. In the control group relative abundance of Actinobacteria, had decreased in several individuals. Other phyla appeared more stable in the parents compared to the patients, where we observed both a decrease in relative abundance of Actinobacteria and Firmicutes and an increase in relative abundance of Bacteroidetes and Proteobacteria.

Next, we applied the linear discriminant analysis (LDA) effect size (LEfSe) method by Segata et al.12 for statistical analysis at all taxonomic levels. Nineteen discriminative features at various taxonomic levels were detected (Fig. 3a). At the phylum level, relative abundance of Actinobacteria was significantly diminished and Proteobacteria increased proportionally 3 months after starting KD. The decrease in Actinobacteria relative abundance could mainly be attributed to a decrease of relative abundance of the genus Bifidobacterium (Fig. 3b), with a mean relative abundance before KD: 15.8%; 3 months into KD: 3.9%. Within the genus Bifidobacterium, two species were significantly decreased: Bifidobacterium longum (mean relative abundance before KD: 8.1%; 3 months after starting KD: 2.4%) and B. adolescentis (before KD: 3.2%; 3 months into KD: 0.2%). The mean relative abundance of Eubacterium rectale and genus Dialister also decreased during KD (2.5% before; 0.5% after, and 2.2% before; 0.4% after, respectively). Genus Escherichia was more relative abundant 3 months after starting KD (mean: 3.1% before; 8.5% after), which was largely due to an increase of Escherichia coli.

Fig. 3 Statistical analysis of taxonomic profiles. Significant changes at all taxonomic levels during dietary intervention (KD) using the linear discriminant analysis (LDA) effect size (LEfSe) method; p < 0.05, LDA > 4.0 a. Cladogram of significant changes at all taxonomic levels during KD b. Green: Patients' time point 1, Red: Patients' time point 2 Full size image

Applying the same parameters to the parental dataset, relative abundance of one unclassified strain of Eubacterium siraeum was significantly increased from time point one to two. None of the features found significantly changed in the patients were identified here.

Analysis of functional changes in the gut microbiota during KD

Significant functional changes in subsystem relative abundances and effect size were calculated (Table 2). Twenty-nine subsystems changed significantly due to KD. Relative abundance of 26 pathways was diminished after 3 months on KD and 3 became more relative abundant. The group with most pathways changed was carbohydrate metabolism, showing reduction of fructooligosaccharides (FOS) and raffinose utilization, sucrose utilization, glycogen metabolism, lacto-N-biose I and galacto-N-biose metabolic pathway; Fermentations: lactate, pentosephosphate pathway; and formaldehyde assimilation: ribulose monophosphate pathway. Analysis of the parental dataset did not detect any discriminative features at either p < 0.01 or p < 0.05 and LDA > 2.0.

Table 2 Significant functional changes at SEED level 3 during KD using the LDA effect size method Full size table

Computational decomposition of post-KD functional shifts was used to predict potential taxonomic contributions to observed functional changes. This analysis suggested that the decrease in Bifidobacterium relative abundance and increase in Escherichia relative abundance both contribute to the decreased relative abundance of multiple pathways, including carbohydrate metabolism pathways (Fig. 4a). In contrast, increases in pathway relative abundance was predicted to be more genus specific. For example, the increase in both Escherichia and Bacteroides relative abundances was inferred to be a potential driver of the increase in the Hemin transport system pathway, a function that has indeed been previously observed in both Escherichia13,14,15 and Bacteroides16,17,18 species. Interestingly, this decomposition also inferred that the increase in Eggerthella relative abundance might drive the increase in the succinate dehydrogenase subsystem (Fig. 4b). This subsystem contains genes encoding fumarate reductase subunits. According to the KEGG19 database annotation of the Eggerthella lenta genome, it encodes three sequences related to fumarate reductase subunits.

Fig. 4 Taxonomic drivers of functional shifts associated with KD. Taxonomic contributions to the shift in each function are shown as bars for functions enriched pre-KD compared to post-KD a and functions enriched post-KD compared to pre-KD b. Bar length represents the size of the contribution. For each function, the position of the bar indicates the type of contribution. The top (bottom) bars represent contributions from genera with higher (lower) relative abundance in the enrichment group (pre-KD for a, post-KD for b). Bars to the left (right) of the vertical black line represent contributions to decreased (increased) function abundance in the enrichment group. The red diamonds indicate the observed increase in relative median function abundance in pre-KD samples a or post-KD samples b. Colors indicate the genus, with genera from the same phylum sharing similar colors: green for Actinobacteria; orange for Bacteroidetes; teal, blue, and purple for Firmicutes; and red for Proteobacteria. Genus labels within bars are provided for genera with notable contributions where bars were wide enough to accommodate labels Full size image

These links between taxonomic and functional changes in our pilot dataset are based on computational inference and may require validation and confirmation in a larger cohort. However, they are useful to generate intriguing hypotheses concerning the taxonomic drivers of observed functional shifts.

Analysis of butyrate production potential

E. rectale, a major butyrate-producing species in the human gut, significantly decreased proportionally in patients during KD. To analyze whether this might impact the total butyrate synthesis potential of the whole community, a bacterial butyrate synthesis gene database20 was used to identify reads from genes involved in butyrate production in our shotgun metagenomic dataset. RPKM (reads per kilo base per million) values were calculated for all genes of the known microbial butyrate production pathways upstream of bcd/etfAB (butanoyl-CoA dehydrogenase, E.C.: 1.3.1.109), as there are downstream overlaps between pathways. gcdB of the glutarate pathway was excluded owing to recruitment of many false positives.21 Four butyrate-producing pathways are currently identified in human gut microbial communities20; the acetyl-CoA, the glutarate, the 4-aminobutyrate, and the lysine pathway. KD intervention did not have any overall significant impact on the relative abundance of any of these pathways (Fig. 5 a, b and Fig. Supplementary Figure 2 a, b). Interestingly though, we found that genes from the most abundant butyrate-producing pathway in the healthy human gut microbiome, the acetyl-CoA pathway, were less abundant in our patients compared to controls already before starting KD. Here a very weak but inverse correlation was detected between age of the patients and relative gene abundance indicating that the lower relative abundance of genes of the acetyl-CoA pathway in our patients (Fig. 5c) compared to our controls before KD was not due to the age difference between both groups. Rather, increasing age in our patients might increase the difference to the controls. An opposite trend was shown for the 4-aminobutyrate pathway (Fig. 5d). This pathway was shown to be less relative abundant in the healthy gut microbiome compared to the acetyl-CoA pathway.20 In our cohort, genes of this pathway were more frequently identified in patients compared to controls before KD and a possible, yet not significant, further proportional increase was experienced during KD. Relative gene abundance seemed to slightly increase with age of the patients, indicating that this observation cannot be explained by age differences in the patient and control groups either. The lysine pathway did not show any significant differences in relative gene abundances within or between the groups at either time point (Supplementary Figure 2a). In the glutarate pathway, the relative abundance of one out of the six genes (hgCoAdA) was significantly decreased in the patients (Supplementary Figure 2b). However, the relative abundance of the concomitant enzymatic subunits hgCoAdB and hgCoAdC or other genes involved in this pathway was not decreased. It therefore seems unlikely that the glutarate pathway relative abundance is significantly different in any group of samples.