Speech samples were recorded during psychiatric interviews as answers to two different requests: “Please report a recent dream” and “Please report your waking activities immediately before that dream”. Each report was transcribed and represented as a speech graph, in which every word represented a node and every temporal connection between consecutive words represented an edge. The visual inspection of speech graphs suggests that dream reports (Figure 1b) vary more across groups than waking reports from the same subjects (Figure 1c).
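The graph construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the software used in the study: tokenization by lowercased whitespace splitting, counting each distinct ordered word pair as one edge, and reading LSC as the largest strongly connected component are all assumptions, and the function names are hypothetical.

```python
def build_speech_graph(text):
    """Return (nodes, edges) of a directed word graph.

    Assumptions: tokens are lowercased, whitespace-separated words;
    every distinct word is a node; every distinct ordered pair of
    consecutive words is a directed edge.
    """
    words = text.lower().split()
    nodes = set(words)
    edges = set(zip(words, words[1:]))
    return nodes, edges


def reachable(start, adjacency):
    """Set of nodes reachable from `start` by following directed edges."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def largest_strongly_connected(nodes, edges):
    """Size of the largest set of mutually reachable nodes.

    Brute-force reachability; adequate for graphs of a few hundred words.
    """
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
    reach = {n: reachable(n, adjacency) for n in nodes}
    best = 0
    for u in nodes:
        size = sum(1 for v in reach[u] if u in reach[v])
        best = max(best, size)
    return best


if __name__ == "__main__":
    # Made-up sentences, not study data.
    recursive = "i saw a dog and i saw a cat"
    linear = "i woke up then ate breakfast"
    for text in (recursive, linear):
        nodes, edges = build_speech_graph(text)
        print(len(nodes), len(edges), largest_strongly_connected(nodes, edges))
```

In this toy example the sentence that returns to earlier words forms a cycle, so most of its nodes are mutually reachable, whereas the strictly sequential sentence has no cycle at all.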

A semantic and grammatical inspection of the most frequent words, loops and their corresponding exit nodes showed few differences across dream and waking reports produced by psychotic and control subjects, with major overlap in word repertoire across groups (Supplementary Fig. S1). At the structural level, however, irrespective of meaning, clear contrasts emerged. Waking reports in all groups were typically sequential, with little recursiveness, reflecting the linearity of chronological narrative. Dream reports, in contrast, were quite convoluted when produced by bipolar and control subjects.

The SGA obtained for all the words in each report (Supplementary Tables S2 and S3) mostly agreed with the SGA obtained with smaller samples (n = 8 per group) and with the use of lexemes (ref. 10), which require syntactical analysis. While dream-related graphs showed overall good classification quality and significant SGA differences between schizophrenic subjects and the two other groups (bipolar and control subjects), waking-related graphs failed to differentiate between any of the groups for any SGA (Figure 3a, Supplementary Table S4). We also found that nearly all SGA differed between dream and wake reports from bipolar and control subjects (Figure 3a).

Figure 3 SGA using raw data (full reports) differentiate psychopathological groups. (a) SGA boxplots with significant differences among schizophrenic, bipolar and control groups indicated in red and significant differences between dream and waking reports indicated in blue. (N = 20 per group; Kruskal-Wallis test followed by two-sided Wilcoxon Rank-sum test with Bonferroni correction with α = 0.0167). (b) Rating quality measured by AUC, sensitivity and specificity, using all attributes. Notice that dream reports categorize the groups much better than waking reports. (c) Rating quality for the distinction between dream and waking reports. While reports from bipolar and control subjects can be sorted, schizophrenic subjects yield reports that fail to differentiate dream from waking.

Schizophrenic subjects produce dream reports with a significantly smaller word count (WC) than bipolar and control subjects, and most SGA are strongly correlated with WC (Figure 4). It is therefore possible that the differences between schizophrenic subjects and the two other groups derive solely from differences in verbosity, which could hinder the clinical applicability of the method. Indeed, bipolar and control subjects used more words than schizophrenic subjects when reporting a dream, making more complex graphs than when reporting on waking (Figure 3a). In contrast, schizophrenic subjects showed impoverished graphs for both dream and waking, without any SGA difference between the two and with overall low values of most SGA (Figure 3a).

Figure 4 Linear correlation between SGA and word count (WC). Only L1, Density, Diameter, ASP and CC did not present a significant linear correlation with WC. (a) Dream reports. (b) Waking reports.

To rule out the influence of verbosity, we analyzed the reports using a moving window of fixed word length (10, 20 and 30 words) with a step of 1 word. Each report yielded a population of graphs from which we calculated mean SGA. This procedure revealed that schizophrenic subjects yielded significantly less connected graphs (smaller LCC and LSC) and fewer edges (E) than bipolar and control subjects, for every word length tested and for both dream and waking (Figure 5a for word length = 30). Small graphs (word length = 10 and 20) showed smaller internal distances (Diameter and ASP) in schizophrenic subjects than in control subjects, for both dream (word length 10: Diameter P = 0.0001, ASP P = 0.0001; word length 20: Diameter P = 0.0007, ASP P = 0.0004) and waking (word length 10: Diameter P = 0.0021, ASP P = 0.0019; word length 20: Diameter P = 0.0013, ASP P = 0.0006). Additionally, dream-related small graphs had smaller ATD (word length 10 P = 0.0028; word length 20 P = 0.0106) and waking-related small graphs had smaller distances (word length 10 ASP P = 0.0140; word length 20 Diameter P = 0.0054, ASP P = 0.0043) in schizophrenic subjects, in comparison with bipolar subjects. Altogether the data show that reports from schizophrenic subjects, irrespective of originating from dream or waking, were characterized by small and poorly connected graphs, in comparison with bipolar and control subjects (Supplementary Table S2).
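The moving-window procedure can be illustrated with a short sketch. It assumes a report is already tokenized into a list of words and uses a hypothetical `count_edges` attribute (distinct ordered pairs of consecutive words) as a stand-in for any SGA; neither the function names nor that edge definition come from the study. Reports shorter than the window yield no value, which mirrors the exclusion of short waking reports from the 30-word analysis.

```python
def count_edges(words):
    """Stand-in graph attribute: number of distinct ordered pairs of
    consecutive words (an assumed definition of E)."""
    return len(set(zip(words, words[1:])))


def windowed_mean(words, window, attribute, step=1):
    """Mean of `attribute` over all windows of `window` consecutive words.

    Returns None when the report has fewer words than the window length,
    so that such reports can be excluded from the analysis.
    """
    if len(words) < window:
        return None
    values = [
        attribute(words[start:start + window])
        for start in range(0, len(words) - window + 1, step)
    ]
    return sum(values) / len(values)


if __name__ == "__main__":
    # Made-up report, not study data.
    report = "i was at home and i was at work and then i woke up".split()
    for length in (10, 20, 30):
        print(length, windowed_mean(report, length, count_edges))
```

Because every window contains the same number of words, differences in the resulting means cannot be attributed to report length alone.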

Figure 5 SGA controlled for verbosity differentiate psychopathological groups due to dream reports. (a) SGA boxplots for 30-word speech graphs show significant differences among schizophrenic, bipolar and control groups indicated in red and significant differences between dream and waking reports indicated in blue (N = 20 per group for dream reports; Kruskal-Wallis test followed by two-sided Wilcoxon Rank-sum test with Bonferroni correction with α = 0.0167). Eight subjects reported on waking events using fewer than 30 words (for waking reports, N = 17 for the schizophrenic and control groups and N = 18 for the bipolar group). (b) Rating quality measured by AUC, sensitivity and specificity, using all attributes. Raw data were compared with mean data obtained using analysis windows of fixed word length (10, 20 and 30 words per window). (c) The rating quality for the SGA-based distinction between dream and waking reports varies considerably across groups, reaching a maximum among bipolar subjects and a minimum among schizophrenic subjects. (d) Group sorting using dream-related SGA is better than classifications based on psychometric scores or waking-related data.

The reports produced by bipolar subjects, on the other hand, were very different depending on their source: dream events were reported with more recurrence (L3) and connectivity (ATD), higher density, smaller distances (diameter and ASP) and higher clustering coefficient (CC) than waking events (Figure 5a). Control subjects also reported dreams differently (with more E and larger LSC), and only schizophrenic subjects showed no SGA difference between dream and waking reports (Figure 5a). When related to dreams, reports from bipolar subjects yielded less connected graphs (smaller LCC and LSC) with fewer nodes (N) than reports from control subjects (Figure 5a). We also found graphs with smaller distances when using word length = 10 (Diameter P = 0.006 and ASP P = 0.0071), denoting smaller and less complex graphs in bipolar than in control subjects. None of these differences between bipolar and control subjects occurred in waking-related reports (Figure 5a).

To further explore dream versus waking differences in the reports of psychotic patients, we trained a Naïve Bayes classifier to differentiate among the groups using all SGA as inputs, with SCID results as the gold standard. Schizophrenic subjects could be sorted from bipolar and control subjects with AUC between 0.6 and 0.86 for both dream and waking graphs (Figure 3b, Figure 5b, Supplementary Table S5), but only dream-related graphs could sort bipolar from control subjects (Figure 5b). Using raw data, it was possible to sort dream from waking reports among bipolar (AUC = 0.753) and control subjects (AUC = 0.807) (Figure 3c). Using an analysis window with length of 30 words, which provided the best accuracy for group classification, it was possible to automatically sort dream and waking reports among bipolar (AUC = 0.794) and control subjects (AUC = 0.65) (Figure 5c). This contrasts with reports from schizophrenic subjects, which showed no structural differences between dream and waking (Figure 3c, Figure 5c). Overall, the triple sorting of schizophrenic, bipolar and control subjects based on automatically selected attributes (E, LSC and ASP for dream reports; E and LCC for waking reports; word length = 30) was substantially better for dream-related SGA than for waking-related SGA or psychometric scores (Figure 5d).
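For readers less familiar with the AUC values quoted here, the following sketch shows one standard way to compute the metric for a two-class problem: the fraction of (positive, negative) pairs in which the positive case receives the higher classifier score, with ties counted as one half. It is not the authors' classification pipeline; the function name and the example scores are hypothetical.

```python
def auc(scores_positive, scores_negative):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as 0.5).
    """
    if not scores_positive or not scores_negative:
        raise ValueError("both classes need at least one score")
    wins = 0.0
    for p in scores_positive:
        for n in scores_negative:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))


if __name__ == "__main__":
    # Made-up classifier scores, not study data.
    group_a = [0.9, 0.8, 0.4]
    group_b = [0.5, 0.3, 0.2]
    print(round(auc(group_a, group_b), 3))
```

Under this reading, an AUC of 0.5 corresponds to chance-level sorting and an AUC of 1 to perfect separation of the two groups.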

We investigated correlations between dream-related SGA and the psychopathological symptoms captured by PANSS and BPRS, considering all 60 subjects. Using the attributes that best differentiated schizophrenic subjects from other groups (E, LCC and LSC), we found significant anti-correlations with negative and cognitive symptoms (Figure 6, Supplementary Fig. S2), known to be more frequent among schizophrenic subjects than among individuals with other psychotic syndromes (ref. 7). Subjects that reported dream graphs with fewer edges or smaller connected components (LCC, LSC) scored higher on PANSS, on the negative PANSS subscale and on PANSS questions regarding flattened affect, poor contact, difficulty in abstract thinking and loss of spontaneity or fluency in speech; these subjects also scored higher on BPRS questions about emotional retraction and flattened affect (Figure 6a). Significant anti-correlations in waking reports occurred only between LCC and general psychotic symptoms: subjects that reported on waking with lower LCC presented higher scores on the PANSS question about judgment and critical capacity and on the BPRS question regarding incoherent speech (Figure 6b).

Figure 6 Dream-related SGA are anti-correlated with specific psychopathological symptoms. (a) Spearman's rho for correlations between individual questions of the PANSS and BPRS scales and SGA obtained from dream reports (N = 60). Note the significant anti-correlations between SGA (E, LCC and LSC) and psychometric variables including total PANSS, PANSS negative subscale and some negative and cognitive symptoms such as flattened affect, poor contact, difficulty in abstract thinking, loss of spontaneity or fluency in speech in PANSS; as well as emotional retraction and flattened affect in BPRS. A 30-word moving window was used for data analysis. Circles indicate P values smaller than the Bonferroni corrected α = 0.00006. (b) Same as before but for waking reports (N = 52). Note the significant anti-correlations for LCC and general psychotic symptoms measured on both scales (loss of criticism in PANSS and incoherent speech in BPRS).
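Spearman's rho, the statistic reported in Figure 6, is the Pearson correlation of the ranks of two variables, with tied values sharing their average rank. The sketch below illustrates that textbook definition only; the function names and the example values are hypothetical, and significance testing and the Bonferroni correction are not reproduced.

```python
def average_ranks(values):
    """1-based ranks of `values`; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Made-up values, not study data: a graph attribute that decreases
    # as a symptom score increases gives a negative rho.
    graph_attribute = [28, 25, 22, 20, 15]
    symptom_score = [1, 2, 2, 4, 6]
    print(round(spearman_rho(graph_attribute, symptom_score), 3))
```

A negative rho of this kind is what the text calls an anti-correlation: lower values of a graph attribute go together with higher symptom scores.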

Finally, to simulate the comparison of an actual psychiatric clinical assessment with a scenario in which graph analysis was employed, we compared the performances of binary classifiers trained with 1) selected SGA from both dreaming and waking, 2) PANSS and BPRS total scores and 3) a combination of both. The attributes selected were those with significant correlation with psychometric scores: E, LCC and LSC for dream reports and LCC for waking reports (Figure 6). We found that SGA sufficed to successfully sort the three groups, differentiating schizophrenic from control subjects with AUC = 0.941, bipolar from control subjects with AUC = 0.722 and schizophrenic subjects from bipolar subjects with AUC = 0.768 (Figure 7a). The psychometric scales were able to properly sort schizophrenic from control subjects (AUC = 0.955) and bipolar from control subjects (AUC = 0.935), but failed to differentiate schizophrenic subjects from bipolar subjects (AUC = 0.376). For a combination of SGA and standard scale scores, schizophrenic subjects were sorted from bipolar subjects with AUC = 0.748, bipolar subjects were sorted from control subjects with AUC = 0.928 and schizophrenic subjects were nearly perfectly sorted from control subjects with AUC = 0.993. Triple group sorting was better for SGA (AUC = 0.767) than for scales (AUC = 0.731) and was optimized by their combination (AUC = 0.849; Figure 7a). To assess the general applicability of the method, reports in Portuguese were translated to English, German, French and Spanish. Figure 7b shows that group classification is remarkably similar across the five most prevalent Western languages.