The experiment was organized in short blocks. In each block, subjects were presented with a specific sequence of spatial locations, which they were asked to continue. The eight possible locations, forming a symmetrical octagon, were constantly visible on the computer screen (Fig 1B). On a given trial, the locations forming the beginning of the chosen sequence were flashed sequentially, and then the sequence stopped. The subject’s task was to guess the next location by clicking on it. As long as the subject clicked on the correct location, he or she was asked to continue with the next one. In case of an error, the sequence was restarted from the beginning: the entire sequence of locations was flashed again, the mistake was corrected, and the subject was again asked to predict the next location. For each sequence, the procedure was initiated by showing only the first two items. Thus, starting with the 3rd location in the sequence, subjects were given a single opportunity to venture a guess at each step. To introduce the task, participants were always presented first with a “repeat” sequence of clockwise or counterclockwise rotating locations. The order of subsequent sequences was randomized.

On each block, a spatial sequence consisting of a succession of 16 locations was presented. These sequences are shown with blue and green labels in Fig 1C. In total, each participant was presented with two “repeat”, two “alternate” and two “2squares” sequences, each pair spanning the two directions of rotation around the octagon. Two “2arcs”, four “4segments” and one “4diagonals” were also presented in order to test the comprehension of all four axial symmetries and of rotational symmetry; in these cases, the direction of rotation was randomized. One exemplar of “2rectangles” and one of “2crosses” were also randomly selected. Finally, two irregular sequences were picked randomly among the 768 sequences of maximal complexity. The starting point of each sequence was picked randomly among the subset of the eight octagon locations that preserved the global shape.

Specific planned comparisons were performed in order to finely probe the understanding of hierarchical sequence structure. For example, in “4segments”, the even data points correspond to the application of the 1st-level, shallower regularity (axial symmetry), while the odd data points result from a change of starting point, and thus reflect a deeper, 2nd-level regularity that involves a non-adjacent temporal dependency (subjects must remember the starting point of a sub-sequence of 2 items). Consequently, comparing performance on such data points provides information about the representation of nested rules in our paradigm.
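To illustrate, the following sketch (hypothetical variable names and placeholder data; not the authors’ analysis code) shows how such a planned comparison could be set up for “4segments”: even and odd data points are averaged separately per subject and compared with a paired non-parametric test. A Wilcoxon signed-rank test stands in here for the two-condition comparison; the analyses reported below use Friedman’s test.

```python
# Illustrative planned comparison for "4segments" (placeholder data, not the original code).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
errors = rng.random((23, 14))                       # 23 subjects x ordinal positions 3-16
positions = np.arange(3, 17)

lvl1 = errors[:, positions % 2 == 0].mean(axis=1)   # even positions: 1st-level rule (axial symmetry)
lvl2 = errors[:, positions % 2 == 1].mean(axis=1)   # odd positions: 2nd-level rule (new starting point)

# Paired non-parametric comparison of the two rule levels across subjects
stat, p = stats.wilcoxon(lvl1, lvl2)
print(f"1st vs 2nd level: W = {stat:.1f}, p = {p:.3f}")
```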

The data consisted of a discrete measure of performance (correct or error) for each subject, each sequence item, and each ordinal position from the 3rd to the 16th. Because those data were discrete (even after averaging performance over a subset of sequences or ordinal data points), we used Friedman’s test for paired data (a non-parametric analogue of a parametric repeated-measures ANOVA). When necessary, we used a Bonferroni correction for multiple comparisons (across 14 data points for educated adults, 8 data points for other subjects). To quantify the evolution of performance over time, we calculated for each subject the Spearman rank correlation of error rates with ordinal position, and compared the mean correlation coefficient to 0 using a Student t-test. When the evolution of performance over time was evaluated on a small number of ordinal positions (3 or 5, as happens in experiments 2–4), we used Friedman’s test for multiple conditions. Finally, whenever we needed to compare performance between groups of subjects on a specific condition (e.g. adults and children, as will arise in experiment 2), given that we had discrete measures (correct or error), we used Fisher’s exact test when the number of measures per subject was 1 or 2, and the Wilcoxon rank-sum test for independent samples when comparing the means of 3 or more conditions.
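For concreteness, here is a minimal sketch of this pipeline using SciPy (hypothetical variable names and placeholder data; this is not the analysis code used in the study). It assumes the error data are arranged as a subjects × ordinal-positions matrix.

```python
# Minimal sketch of the statistical pipeline described above (illustrative only).
# `errors` is a hypothetical (subjects x positions) array of error rates,
# with columns corresponding to ordinal positions 3-16.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
errors = rng.random((23, 14))                     # placeholder data: 23 subjects

# Friedman test across ordinal positions (non-parametric analogue of a repeated-measures ANOVA)
chi2, p_friedman = stats.friedmanchisquare(*errors.T)

# Bonferroni correction for point-by-point comparisons (14 data points for educated adults)
alpha_corrected = 0.05 / errors.shape[1]

# Evolution of performance over time: per-subject Spearman correlation of error rate
# with ordinal position, then a one-sample t-test of the coefficients against 0
positions = np.arange(3, 17)
rhos = np.array([stats.spearmanr(positions, subj).correlation for subj in errors])
t_stat, p_trend = stats.ttest_1samp(rhos, 0.0)

# Between-group comparisons on a given condition: Fisher's exact test for 1-2 measures
# per subject (2x2 correct/error counts), Wilcoxon rank-sum test otherwise
table = [[10, 3], [6, 7]]                         # illustrative contingency table
odds, p_fisher = stats.fisher_exact(table)
stat, p_ranksum = stats.ranksums(errors[:12].mean(1), errors[12:].mean(1))
```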

Results.

As a baseline, we first examined the performance with “irregular” 8-item sequences, which contained no obvious geometrical regularity. The evolution of average performance across the two successive repetitions is shown as a background gray curve in all panels of Fig 2. The mean error rate decreased across trials (mean rank correlation of error rate with ordinal position: ρ = -0.51 ± 0.05, Student t-test: t(22) = 10.3, p < 7×10−10). This improvement could be decomposed into two contributions: rote memory and anticipation. First, performance was better in the second half of each block, i.e. during the repetition of the sequence, than in the first half, when the sequence was introduced, indicating rote memory (Friedman test: F = 15.7, p < 10−4; point-by-point comparisons revealed a significant difference at all but the last location, ps < 0.05). Second, performance improved even within the first half, before the full sequence had been presented (anticipation; ρ = -0.4 ± 0.08, Student t-test: t(22) = 5.1, p < 4×10−5). This finding indicates that subjects took advantage of the fact that the 8 locations were sampled without replacement, thus narrowing the choice of remaining locations. Yet memory for past locations was not perfect, as shown by the fact that performance on data points 7 and 8 remained worse than the chance level expected if subjects perfectly avoided past locations (respectively 85 ± 6% vs 50%, and 54 ± 8% vs 0% errors; one-sample Wilcoxon signed-rank tests: both ps < 0.001).
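These chance levels follow from simple counting: if a subject perfectly excludes all previously shown locations, only 9 − k candidate locations remain at ordinal position k of an 8-item sequence, so uniform guessing among them yields an expected error rate of 1 − 1/(9 − k). A short illustrative computation (not part of the original analysis):

```python
# Expected error rate at each ordinal position of an 8-item irregular sequence,
# assuming perfect exclusion of previously seen locations and uniform guessing
# among the remainder (illustrative only).
for k in range(3, 9):
    remaining = 8 - (k - 1)            # candidate locations left at position k
    chance_error = 1 - 1 / remaining
    print(f"position {k}: {remaining} candidates, expected error {chance_error:.0%}")
# -> position 7: 2 candidates, 50% errors; position 8: 1 candidate, 0% errors
```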

Fig 2. Performance of adult participants in experiment 1. Top panels show the evolution of error rate across successive steps (data points 3–16 in adults) for each regular sequence (error bars = 1 SEM). The gray curve in the background shows the error rate for irregular sequences, which serve as a baseline. Bottom panels show the percentage of responses at a given location for each data point. White dots indicate the correct location. Vertical dashed lines mark the transition between the two 8-item subsequences that constitute the full 16-item sequences. https://doi.org/10.1371/journal.pcbi.1005273.g002

Irregular sequences served as a baseline against which to compare the regular sequences. In every regular sequence, the mean error rate was significantly lower than in the irregular baseline (“repeat”: 2.5 ± 0.9%; “alternate”: 25.5 ± 4%; “2arcs”: 15 ± 1.4%; “2squares”: 23.5 ± 3.7%; “4segments”: 15 ± 1.4%; “4diagonals”: 27 ± 4%; “2rectangles”: 38 ± 3.2%; “2crosses”: 27.5 ± 3.2%; “irregular”: 59.5 ± 3.8%; Friedman tests, all ps < 0.001). Moreover, in every case, participants performed significantly better than baseline even before the full presentation of the 8-item sequence (average error rate of data points 6–8 for “repeat”: 0%; “alternate”: 19.6 ± 6.1%; “2arcs”: 8 ± 2.1%; “2squares”: 16.7 ± 5%; “4segments”: 4 ± 2%; “4diagonals”: 17.4 ± 4.7%; “2rectangles”: 33.3 ± 5.2%; “2crosses”: 18.8 ± 5.2%; and “irregular”: 69.6 ± 3.8%; all ps < 10−4).

Thus, sequence regularity facilitated both rote memory and anticipation. Crucially, as predicted, these effects were captured by our measure of complexity: the mean error rate was highly correlated with K across sequences (for all data points: Spearman’s ρ = 0.75 ± 0.04, Student t-test: t(22) = 21, p < 10−11; for data points 6–8: ρ = 0.73 ± 0.04, t(22) = 21, p < 10−9; Fig 3, top panel). Furthermore, complexity in our language gave a better account of adults’ behavior than alternative encoding strategies that did not use geometrical features such as rotations and symmetries, but relied only on the distance between successive locations. We computed two variants of sequence complexity devoid of geometrical content: the normalized jump length, i.e. the average distance between successive locations in a sequence (total jump length divided by the number of jumps); and complexity in a degraded language whose only primitives were ±1, ±2, ±3, +4, and repetition (S1 Fig). In both cases, obvious outliers were observed (e.g. the complexity of “4segments” in the second case reached the maximum value of 16, which is inconsistent with the data). Moreover, correlations of those measures with total error rate were significantly lower than those obtained with the full language (normalized jump length: ρ = 0.60 ± 0.03, t(44) = 3.23, p = 0.003; complexity in degraded language: ρ = 0.51 ± 0.03, t(44) = 4.88, p < 10−4).
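The jump-length control can be made concrete in a few lines. The sketch below is illustrative only: it assumes that distance is counted as the minimal number of steps around the octagon (an assumption about the metric, not a statement of the authors’ implementation), and the example sequences are hypothetical.

```python
# Normalized jump length: mean distance between successive locations, where distance
# is taken here as the minimal number of steps around the octagon (assumed metric).
def normalized_jump_length(seq):
    dists = []
    for a, b in zip(seq, seq[1:]):
        d = (b - a) % 8
        dists.append(min(d, 8 - d))        # shortest way around the octagon
    return sum(dists) / len(dists)

repeat_seq = [i % 8 for i in range(16)]    # "+1" rule applied throughout
irregular  = [0, 5, 2, 7, 1, 4, 6, 3] * 2  # arbitrary illustrative sequence
print(normalized_jump_length(repeat_seq))  # -> 1.0
print(normalized_jump_length(irregular))   # larger average jumps
```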

Fig 3. Complexity predicts error rates. For each sequence, the y axis represents the mean error rate, and the x axis the sequence complexity, as measured by minimal description length. Panels show data from French adults (top, experiment 1), preschool children (middle, pooling over experiments 2 and 3), and Munduruku teenagers and adults (bottom, experiment 4). For each group, a regression line is also plotted and the Spearman correlation coefficient is displayed. In French children and Munduruku adults, the “4diagonals” and “2crosses” are clear outliers; as explained in the main text, the regression can be improved by assuming that their “language of thought” does not include the rotational symmetry P. https://doi.org/10.1371/journal.pcbi.1005273.g003

We then examined the pattern of errors in each regular sequence. Unsurprisingly, for the “repeat” sequence, which only consisted of the repeated application of the +1 or -1 rule, all error rates verged on 0 and were far below the baseline (all ps < 0.001 corrected). The fact that subjects were already able to complete the sequence after seeing only the first two items suggests that they quickly recognized and applied the primitives +1 and -1, and treated repetition as a default assumption.

For “alternate”, after a systematic error at the 3rd data point (error rate = 95%), the error rate continuously decreased over the first half of the sequence (mean correlation coefficient: ρ = -0.68 ± 0.06, Student t-test: t = 11.8, p < 5×10−11) and dropped to 15 ± 6% at the 7th data point. Even though “alternate” induced more errors than “repeat” (overall: F = 23, p < 10−6), performance was significantly better than for “irregular” (all ps < 0.05 corrected, except at the 3rd and 5th data points). Thus, although “alternate” was more difficult than “repeat”, participants were able to identify and combine the rules +1 and +2.

For “2arcs” and “2squares”, performance profiles were similar. At all data points except the 5th, 9th, 13th and 16th, error rates were significantly below the baseline (all ps < 0.05 corrected). The data points with high performance correspond to the application of the lowest-level rule (+1 for “2arcs” and +2 for “2squares”), providing evidence that this superficial rule was quickly learned. On the contrary, data points 5, 9 and 13, corresponding to the application of the higher-level rule, exhibited more errors than their neighbors (Friedman test: F = 23, p = 2×10−6). At data point 5, the error rate was not significantly below the irregular baseline in “2squares”, and it was even worse than baseline in “2arcs” (error rate at the 5th data point in “irregular”: 70 ± 6%; “2arcs”: 91 ± 4%, F = 6.23, p = 0.013; “2squares”: 76 ± 8%, F = 0.69, p = 0.41). Errors at this point consisted primarily in the continued application of the lower-level rule. Importantly, however, performance on data points 5, 9 and 13 improved over time (“2arcs”: Friedman test: F = 37, p < 9×10−9; “2squares”: F = 18.6, p < 9×10−5), and the error rate at data point 13 fell significantly below baseline in “2arcs” (p < 0.05 corrected), indicating that subjects eventually learned both the 1st- and 2nd-level rules.

For “4segments”, the error rate fell significantly below baseline for all data points (all ps < 0.001 corrected), except points 3 and 9. Within each block of 8 items, the error rate decreased quickly and continuously to 0 (rank correlations for the 1st half: ρ = -0.82 ± 0.02, t(22) = 36.4, p < 0.001; and the 2nd half: ρ = -0.62 ± 0.04, t(22) = 15.8, p = 2×10−13). These results suggest that the 1st- and 2nd-level rules forming the “4segments” sequence were easily identified and applied. Separate analyses indicated that the mean error rate was similar for horizontal, vertical, and oblique symmetries (vertical: 11.5 ± 1.6%; horizontal: 16.1 ± 2.8%; oblique: 16.8 ± 2.1% and 15.5 ± 2.5%; Friedman test for differences between the four types of symmetries: F = 4.3, n.s.). Thus, adult participants easily identified all axial symmetries.

The performance in “4diagonals” indicated that rotational symmetry was harder to identify than the other symmetries (comparison of “4diagonals” and “4segments”: respectively 27.3 ± 4% vs 15 ± 1.4% errors, F = 7.3, p = 0.007). A sawtooth pattern (Fig 2) indicated that even data points had systematically lower error rates than odd ones (Friedman test: F = 18, p < 3×10−5), suggesting that the application of rotational symmetry (1st-level rule) was easier than the rotation of the starting point (2nd-level rule). Even data points exhibited error rates significantly lower than baseline (all ps < 0.02; ps < 0.001 corrected except for data points 10, 14 and 16). On the contrary, odd data points exhibited no difference with baseline, again suggesting that the 2nd-level rule was harder to understand than the 1st-level one. Nevertheless, there was a small but significant improvement over time on both odd and even data points (rank correlation for odd data points: ρ = -0.4 ± 0.07, t = 5.5, p < 2×10−5; for even data points: ρ = -0.39 ± 0.06, t = 7.32, p < 6×10−11).

In “2rectangles”, as in “2squares”, data points 5, 9 and 13 corresponded to the application of the deepest (3rd-level) rule. None of these exhibited an error rate lower than the baseline (data point 5: 60.9 ± 10.6% vs 69.6 ± 6.2%, F = 0.28, p = 0.6; data point 9: 78.2 ± 9% vs 54.3 ± 8.5%, F = 4, p = 0.046; data point 13: 47.8 ± 10.9% vs 41.3 ± 9.5%, F = 0.69, p = 0.4), and there was no improvement over time (Friedman test: F = 4.1, p = 0.13), suggesting that participants did not manage to understand how the starting point of the rectangle changed. At the immediately subsequent data points 6, 10 and 14, which corresponded to the construction of the first side of the rectangle, performance improved compared to points 5, 9 and 13 (respectively 46 ± 7% vs 62 ± 5% errors, F = 2.88, p = 0.089), although it was still not significantly lower than baseline (Fs < 0.5, ps > 0.4). At subsequent points (7, 8, 11, 12, and 15, 16), the error rate further improved (14 ± 4% errors, Friedman comparison with the 3rd-level rule: F = 22, p < 3×10−6) and became significantly lower than baseline (all ps < 0.05 corrected), indicating that the 1st- and 2nd-level rules needed to complete the rectangle were systematically learned.

Finally, for “2crosses”, the performance profile resembled that of “4diagonals”: on even data points, the error rate was systematically lower than the baseline (all ps < 0.03 corrected, except at the 14th data point) and globally lower than the error rate on odd data points (F = 10.7, p = 0.001), indicating that participants easily identified the most superficial rule. Additional evidence for a 3-tiered organization was observed. The error rate was significantly higher on data points 5, 9 and 13, corresponding to the starting point of the cross (3rd-level rule, 41 ± 7% errors), than on data points 7, 11 and 15, corresponding to the starting point of the second branch of the cross (2nd-level rule, 26.1 ± 7% errors; Friedman comparison between the 2nd and 3rd levels: F = 4.45, p = 0.035). No such difference was seen between data points 5, 9, 13 and 7, 11, 15 in “4diagonals” (F = 1.9, p = 0.17). On data points 7, 11 and 15, the error rate was in turn significantly higher than on the subsequent data points 8, 12 and 16, corresponding to the completion of the cross (1st-level rule, 4.35 ± 3.3% errors; Friedman comparison between the 1st and 2nd levels: F = 5.33, p = 0.021). On data points 6, 10 and 14, corresponding to the construction of the first branch of the cross (17.4 ± 5.2% errors), the error rate was also significantly lower than on data points 5, 9 and 13 (F = 9.3, p = 0.002). Finally, on data points 3, 5, 11 and 15, the error rate was not significantly lower than the baseline. In summary, the 2nd- and 3rd-level rules, though eventually learned, were harder to grasp than the 1st-level rule.