Developing curricula for developing countries

Many children in developing countries grow up in economically poor environments and often also suffer from poorly performing educational systems. Dillon et al. designed inexpensive, locally sourced games (five for mathematics and five for social cognition) for use in preschools in Delhi. They measured the effects of these interventions 3, 9, and 15 months later. Compared with those who played social games, the kids who played math games showed enhanced performance on both nonsymbolic and symbolic math assessments at the 3-month time point. However, only the nonsymbolic improvements persisted for as long as a year.

Science, this issue p. 47

Abstract

Many poor children are underprepared for demanding primary school curricula. Research in cognitive science suggests that school achievement could be improved by preschool pedagogy in which numerate adults engage children’s spontaneous, nonsymbolic mathematical concepts. To test this suggestion, we designed and evaluated a game-based preschool curriculum intended to exercise children’s emerging skills in number and geometry. In a randomized field experiment with 1540 children (average age 4.9 years) in 214 Indian preschools, 4 months of math game play yielded marked and enduring improvement on the exercised intuitive abilities, relative to no-treatment and active control conditions. Math-trained children also showed immediate gains on symbolic mathematical skills but displayed no advantage in subsequent learning of the language and concepts of school mathematics.

Enrollment and attendance in primary school in developing countries have greatly expanded over the past few decades (1, 2), but children’s learning outcomes remain poor. In 2014, 87% of Indian children in grade 2 and 52% of Indian children in grade 5 could not read a simple passage of text that they should have been able to read by grade 2 (3). Poorly adapted curricula may be partly to blame (4–6); such curricula build on the verbal and mathematical skills that preschool children with educated parents gain by interacting with family members who can read, count, and calculate. But first-generation school children may be hampered by a lack of opportunities to engage, as preschoolers, with literate and numerate adults during activities that exercise basic verbal and numerical abilities (7–9).

This problem can be addressed either by adapting the level of instruction in primary school (10) or by bolstering children’s experiences during the preschool years. Some early childhood interventions have targeted parents, training them to interact with or support their children (11–16). Alternatively, preschools for poor children, led by educated adults who play games that exercise children’s cognitive abilities, may better prepare them for school.

This idea is intuitively appealing and has received considerable support from both academics and policy-makers (17). Indeed, there is evidence that preschool education influences later life outcomes. In the United States, a number of observational studies have found substantial short- and long-term impacts of the flagship preschool program, Head Start (18–20). However, a recent large-scale randomized study found only small and short-term effects of Head Start, perhaps because Head Start may not be much better than the alternative preschool choices available to poor U.S. children (21). In developing countries, several of the studies reviewed in Engle et al. (22) also found positive effects of preschool access on child development. For example, in one recent randomized trial in Mozambique, access to preschool increased children’s school enrollment, fine motor skills, and problem solving, although not their later language development (23).

Many scholars have emphasized the importance of preschool quality (24), but little work has revealed what constitutes a quality program. In the United States, even carefully designed preschool mathematics curricula based on cognitive science (such as the Building Blocks program) have produced only small effects for only a portion of the students at a portion of the measured time points (25). Rigorous randomized controlled trials in resource-poor settings have found that training programs for preschool teachers in Chile (26) or Malawi (16) had no effect on children’s learning.

These results underscore how little we know about how to train teachers to prepare children for primary school: The teacher training or the curriculum they implemented might not have been intense enough, or the teaching practices and curricula themselves might not have been effective. Moreover, if such practices and curricula are ineffective, we do not know enough about what was trained to draw more general conclusions from the findings. Is the basic intuition—that exercising children’s spontaneously developing cognitive abilities in preschool leads to greater school achievement—wrong? Or did the chosen curricula fail to engender, in poor children, the skills that develop spontaneously in preschool children in wealthy families and communities?

To address this question, we designed a game-based mathematics curriculum for poor children in the slums of Delhi, India. The curriculum is based on decades of cognitive science research on the spontaneous development of children’s numerical and spatial reasoning. We then tested the effectiveness of this curriculum in a large-scale field experiment. We found that our intervention effectively and durably improved children’s spontaneously developing numerical and spatial abilities, and we were therefore able to test whether this improvement led in turn to an increase in children’s learning of the symbolic mathematics taught in school. Our study is thus the first to field-test a central conjecture of contemporary basic research in psychology and cognitive science, which has, formally or informally, motivated the development of most modern preschool curricula: that children’s learning of the symbolic mathematics taught in school would be facilitated by adult-led activities that exercise their intuitive cognitive abilities during the preschool years. In particular, we focus on numerical and spatial abilities that emerge in infancy and function throughout life among people from diverse cultures (27–37).

Despite the importance of this conjecture, most of the evidence supporting it comes from the laboratory rather than from the field. A small number of controlled training experiments constitute this literature. Like adults (38), elementary school children who are trained to add or compare arrays of dots on the basis of number show enhanced performance on the kinds of symbolic arithmetic problems presented in school, both in the United States (39) and in Pakistan (40). However, these studies, and other controlled training experiments focusing on spatial skills, measure symbolic mathematical gains over very short time periods [from immediately after training to ≤1 month after training; e.g., (39, 41, 42)], providing no insight on whether any of these gains would persist or enhance learning of new mathematical concepts. A large body of longitudinal research has probed relations between early- and later-developing mathematical abilities across more diverse populations and at longer time scales (43–46), but these studies could be indicative of natural correlations among different abilities, rather than of causal relations (47). Individuals who are mathematically talented, or who receive rich exposure to mathematical material in their homes, may perform well on intuitive, nonsymbolic mathematical tasks as well as learned, symbolic mathematical tasks, even if the abilities underlying these skills and tasks are not causally related.

This study investigates the basic cognitive mechanisms promoting children’s learning of mathematics by designing a curriculum that provides children who have minimal access to books, board games, or literate and numerate adults an opportunity to exercise these informal numerical and spatial skills, and by testing the curriculum’s efficacy after the first year of formal schooling. Moreover, it demonstrates an approach to developing and testing a cheaply implementable intervention to improve children’s school readiness in resource-poor contexts, which, if effective, could be scaled up across preschools.

Intervention and experimental design

The intervention took place over a 4-month period and involved 214 preschools in Delhi, India. These preschools were run by our partner organization, Pratham, a large nonprofit focused on improving and evaluating education throughout India. In poor neighborhoods of urban areas, many children now attend such preschools. They are not systematically run by the government, but are often private or, like Pratham’s, run by a nonprofit organization. Our math games curriculum was designed to be scalable and easy to implement in such a context: We used inexpensive, locally printed materials, and locally hired adults administered the games after 2 to 4 days of training. Children in all of the preschools were of mixed ages, but only the children who were expected by their teachers to begin primary school after the completion of the intervention were assessed and treated. The final sample included 1540 children (mean age, 4.9 years; range, 2 to 12 years). Almost all of the children were between 3 and 7 years of age (97.1%) and most were 4 to 5.5 years old (83.8%).

Each preschool was randomly assigned to one of the three treatment conditions. In the math condition (70 schools), children played five games (Figs. 1 and 2) (48) that build on intuitive numerical and geometric abilities that emerge spontaneously in the first 3 years, that are associated with achievement in school mathematics, and that encourage children to communicate using the language and symbols of primary school mathematics through social play with literate and numerate adults as well as peers. Two games tasked children to add and compare large sets of dots based on their relative numerosity: abilities that are universal (34), emerge in infancy (49), and correlate with mastery of symbolic arithmetic in children (43, 50) and adults (44).
A third game required that children establish exact one-to-one correspondence relations between sets of one to four two-dimensional shapes and sequences of one to four movements on a linear board, relating numerical magnitudes to positions on a line: abilities that emerge in infancy (51, 52) and produce short-term enhancements in children’s symbolic number concepts (7, 53). Finally, two games challenged children to find a geometric property (e.g., shape, parallelism, connectedness) that distinguished one figure from a group of others, or to place objects at locations indicated on a set of small-scale geometric maps: two early-developing, universal abilities (54–57) that are believed to promote learning of a variety of mathematical concepts (42).

Fig. 1. Materials from three math games and the corresponding social games. The math games focused on comparison of numerical magnitudes (top left), categorization of different shapes (middle left), or symbol reading based on an analysis of the features of a geometric form (bottom left). The corresponding social games focused on comparison of emotional intensities (top right), categorization of different emotional expressions (middle right), or symbol reading based on an analysis of a face’s gaze direction (bottom right). One pair of corresponding math and social games (top) involved sorting cards into one of two piles, depending on the color of the larger number or more intense expression of happiness. Another pair of corresponding games (middle) involved finding the figure that did not belong with the other figures based on its shape or expression. A third pair of corresponding games (bottom) involved using the shape of a figure or the gaze direction of a face on a 20 cm × 20 cm map to find a corresponding location on a 1 m × 1 m mat, which appeared at varied orientations; children placed an object on the location on the mat that was indicated on the map. The dot arrays in the top left math game were created with Panamath (82), a free program for generating numerical stimuli. The faces in the top two social games were obtained from Gao and Maurer (72), who adapted them from the face battery created by Tottenham et al. (83). The faces at middle right have been pixelated for display purposes only; children played this game with nonpixelated faces.

Fig. 2. Children in the intervention playing the math and social versions of the linear board game. In the math game (top), there was one deck of face-down cards; children spun a spinner whose arrow indicated how many cards they could choose from the deck. When turned over, the cards displayed either one to four small figures or an “X”; children moved their token forward on the board by one space for each figure on their card(s). In the social game (bottom), there were two decks of face-down cards, each with a different color on their back. The spinner depicted a face; when spun, its gaze indicated which colored deck(s) children could choose from. When turned over, each card displayed another face looking at a colored dot; children moved their token forward according to the colors on the board indicated by the gaze direction on the cards. Children therefore used either number or gaze to establish correspondences among the spinner, the cards, and the movements of a token on a board.

The overall cost for a group of six children to play the games for 4 months was $316 (table S22). This figure includes the cost of materials as well as a teacher’s and monitor’s time and training. The materials represented $217 of these costs, which suggests that if these games were scaled up, the actual operating costs would be substantially lower because materials could be reused and produced in larger quantities.

In the no-treatment control condition (72 schools), children received a systematic preschool curriculum designed by Pratham.
This curriculum targeted five main aspects of child development: physical development, language development, social and emotional development, cognitive development, and creative development. Perhaps most relevant to learning mathematics in school, children played memory games, learned about sequences and matching, and learned numbers (as words and arabic numerals) and spatial concepts (such as small/big and near/far). For three 1-hour sessions per week, children in the math condition played games instead of receiving the Pratham curriculum. During game play, the Pratham teachers focused on the younger children in their classes (who did not participate in game play), thereby reducing the time devoted to the regular Pratham curriculum for the older children. We did not specify what Pratham content teachers should reduce in order to evaluate a realistic intervention in which a preschool would choose to replace part of their curriculum with ours. It was possible that the math games could have had either positive or negative effects on primary school outcomes, regardless of their mathematical content: They could have had a positive effect if the games themselves were more effective than the current practices, or a negative effect if the symbolic skills provided by the regular curriculum were more immediately useful. To distinguish between these possibilities and to test the specific effects of the games’ mathematical content, our experimental design included a third group of schools assigned to the active control condition (70 schools). Children in these schools played games that followed the same rules and procedures as the math games and were comparably challenging and engaging (48) but focused on two social cognitive abilities that are critical to assessing the intentions of others: emotion reading (58) and gaze following (59) (Figs. 1 and 2).
Like the abilities exercised by the math games, these abilities arise in infancy (60, 61) and predict later cognitive skills (62). They are also thought to foster language development and pedagogical learning in early childhood (63) and may be related to future labor market success (64, 65). The games in the active control condition therefore were truthfully presented to teachers and children as potentially valuable for enhancing school readiness. Because these games have the same rules as the math games, they further allowed us to distinguish the general effects of game play (e.g., communication, language, taking turns, etc.) from the specific effects of the mathematical content. In the math and active control conditions, each game was introduced to children with easy practice problems, and children progressed as a class through a diversity of material during regular game play. As the intervention progressed, classes were also presented with more difficult problems to maintain children’s engagement and interest in the games (48). Progression to these more difficult problems was gradual, as the games were meant to encourage in children a sense of confidence and success with the game content (66). There was no presumption that each class group would necessarily complete all or even many levels of the games. We created several levels in order to keep children engaged throughout the duration of the intervention. Game play sessions were run by intervention teachers hired by Pratham: typically, young women with a high school education but no college degree. These teachers received brief training from our research team on how to play the games with children and how to evaluate children’s performance. Each intervention teacher was responsible for two preschools, in which she led three 1-hour sessions per week and kept notes during each session about the game, level, and deck that was played and the individual performance of each child. 
To monitor the implementation of the program, a separate team of “process monitors” made unannounced visits to the preschools, collecting data on game play frequency, adherence to the rules, and children’s attention to and facility with the game content (48).
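The cost figures reported above for one game-play group ($316 in total, of which $217 was materials, for six children over 4 months) can be broken down with simple arithmetic. The sketch below is illustrative only: the function name and dictionary keys are hypothetical, and treating the non-material remainder as a rough floor on operating cost assumes, beyond what the text states, that reused materials would cost nothing.

```python
# Figures reported in the text (table S22); everything else is derived.
TOTAL_COST_USD = 316      # one group of six children, 4 months of game play
MATERIALS_COST_USD = 217  # portion of the total spent on materials
GROUP_SIZE = 6


def cost_breakdown(total, materials, group_size):
    """Return derived cost figures for one game-play group (hypothetical helper)."""
    # Remainder covers the teacher's and monitor's time and training.
    staffing = total - materials
    return {
        "staffing_total": staffing,
        "materials_share": materials / total,
        "per_child_total": total / group_size,
        "per_child_staffing": staffing / group_size,
    }


breakdown = cost_breakdown(TOTAL_COST_USD, MATERIALS_COST_USD, GROUP_SIZE)
for name, value in breakdown.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions, materials account for roughly two-thirds of the reported cost, which is consistent with the expectation that reuse and bulk production would lower operating costs at scale.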

Evaluation: Data collection and empirical specifications

Assessments evaluating the effects of the intervention were administered at four time points: during the month before the intervention (baseline), 0 to 3 months after the intervention (endline 1), 6 to 9 months after the intervention (endline 2), and 12 to 15 months after the intervention (endline 3). For all tests, children were tested individually on a laptop computer by local nonexperts trained by our team to administer assessments to young children. Assessors were unaware of children’s condition assignments and had not been involved in the game play curricula. Unlike the games, the assessments were presented in a nonsocial context, and they included difficult problems, challenging time constraints, and no informative feedback. Tests of children’s concepts and skills built on Pratham’s experience evaluating children’s learning of mathematics throughout India (67, 68). Tests of nonsymbolic numerical and geometric abilities were based on research in cognitive science assessing these abilities in children and adults (43) in diverse cultures, including remote cultures with minimal education (34, 54). School-relevant assessments focused on comparing and adding numbers presented as words and arabic numerals (69) and answering verbal questions about shape properties, similarity, and symmetry (70). Social skills were measured by evaluating children’s sensitivity to gaze direction (71) and emotional expressions (72). Standardized measures of intelligence were not given, both because of resource limitations and because the aim of the intervention was to enhance children’s learning of school mathematics.
Because mathematics learning relates to children’s mastery of language, to their developing executive functions, and to their motivation to tackle challenging problems, we also presented children with assessments of language and reading based on Pratham’s tests of these abilities, and we adapted tests of executive function (73) and motivation for school learning (74) from tests that are widely used in cognitive science laboratories. Children in the math games, social games, and no-treatment preschools exhibited similar baseline achievement as well as similar characteristics across basic demographic measures (Table 1).

Table 1. Demographic information and baseline scores for children randomized to the three conditions of this study. Individual tests of joint equality of the math treatment, social treatment, and no-treatment control (with standard errors clustered at the school level) for each measure (48) revealed no differences between groups. A χ² test of joint equality across all measures also revealed no difference (composite χ² test P value, 0.944).

The baseline assessment and, for some children, the first endline assessment were presented to children in their preschools; the remaining endline assessments were presented to children in their homes. We surveyed 94%, 87%, and 84% of the original sample at the three successive endlines; 80% of the children were surveyed at all three endlines. There were no significant differences in the baseline test scores, demographic variables, or treatment statuses of those who dropped out of the study and those who completed the assessments at all of the time points (table S1). At endlines 2 and 3, respectively, 83% and 91% of the tested children were enrolled in primary school, and the proportion was similar in all treatment conditions (table S13).
Following a prespecified “intention to treat” design, we included children in all assessments whether or not they were enrolled in primary school [see (48) for analyses comparing children who did and did not progress to primary school] or received the intervention assigned to their preschool. The randomized design enabled straightforward analysis. The main results are apparent by comparing the descriptive statistics for each test across the different conditions at all three endlines. We performed joint Fisher randomization inference tests of statistical significance on these basic comparisons, evaluating the hypothesis that children in the math treatment perform better on math questions than children in the social treatment or in the no-treatment control (table S2) (48). Our analyses were based on a regression specification, which was preregistered, along with a complete preanalysis plan, on socialscienceregistry.org (48). We used the following specified regression framework:

$$y_{i,j} = \beta_0 + \beta_1\,\mathrm{math}_j + \beta_2\,\mathrm{social}_j + \beta_3\,\mathrm{age}_{i,j} + \beta_4\,\mathrm{gender}_{i,j} + \beta_5\,\mathrm{baseline}_{i,j} + \varepsilon_{i,j} \qquad (1)$$

where $y_{i,j}$ represents the endline value of an outcome for child $i$ in school $j$; $\mathrm{math}_j$ is an indicator variable for whether school $j$ was treated with the math games; $\mathrm{social}_j$ is an indicator variable for whether school $j$ was treated with the social games; $\mathrm{age}_{i,j}$ is age, in months, of child $i$ in school $j$; $\mathrm{gender}_{i,j}$ is an indicator variable for gender of child $i$ in school $j$; $\mathrm{baseline}_{i,j}$ is the baseline value of an outcome for child $i$ in school $j$; $\varepsilon_{i,j}$ is an error term for child $i$ in school $j$; and $\beta_0$ through $\beta_5$ are the regression coefficients. Because the treatment was administered to all children in a given school, the standard errors are clustered at the school level. Our analysis plans also called for a specification without a baseline control. The two specifications revealed largely the same findings (48). Our primary outcomes comprise four measures.
Each outcome is based on a composite Z-score (computed by taking an average of the Z-scores for each individual on each test, relative to the mean and standard deviation of the control group’s average baseline performance on that test), and so the coefficients for these outcomes can be interpreted as effect sizes in terms of standard deviations. The “math composite” includes all of the math tests, the “nonsymbolic composite” includes the math tests of approximate numerical comparison and of finding a deviant shape, and the “symbolic composite” includes the math tests assessing knowledge of number words and shape names, abilities to compare and add numbers presented as words and/or symbols, and (at endlines in which they were presented) facility in answering verbal questions concerning relations of shape similarity and symmetry. At the baseline and first endline, the symbolic tests probed abilities that develop spontaneously in children living in educated families, prior to the start of schooling (for example, children’s mastery of ordinary terms for shapes, such as “egg”). At the two later endlines, these tests focused primarily on abilities that are taught in school (for example, children’s mastery of geometric terms for shapes, such as “rectangle”). The “social composite” includes a test probing sensitivity to gaze direction and (at endlines in which it was administered) a test probing knowledge of emotion words.
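The composite Z-score described above can be illustrated with a short sketch. It assumes that each test score is standardized against the control group's baseline mean and sample standard deviation for that test (the usual Z-score convention) and that the composite averages over whichever tests were administered to the child. The function name, test labels, and toy numbers are hypothetical and are not taken from the study's data.

```python
from statistics import mean, stdev


def composite_z(child_scores, control_baseline):
    """Average a child's per-test Z-scores (hypothetical helper).

    child_scores: dict mapping test name -> the child's score on that test.
    control_baseline: dict mapping test name -> list of baseline scores
        from the no-treatment control group on the same test.
    Only tests present in child_scores contribute, so a test that was not
    administered at a given endline is simply left out of the average.
    """
    z_scores = []
    for test, score in child_scores.items():
        reference = control_baseline[test]
        z_scores.append((score - mean(reference)) / stdev(reference))
    return mean(z_scores)


# Toy data in percent correct (illustrative values only).
control_baseline = {
    "numerical_comparison": [40, 50, 60],
    "deviant_shape": [10, 20, 30],
}
child = {"numerical_comparison": 60, "deviant_shape": 20}

# One SD above the reference mean on the first test, at the mean on the second.
print(composite_z(child, control_baseline))
```

Because every test is expressed in control-group standard deviation units before averaging, a regression coefficient on such a composite reads directly as an effect size in standard deviations.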

Results

On the basis of the data collected by the teachers and process monitors during the game play, we first asked whether the two game-based interventions were implemented, engaged children’s interest, and led to improved performance over the course of the intervention. Both the math and social games were played regularly, and most children attended to the game play. In the preschools where the math and social games were played, all of the children attended to the games in 52% and 53% of the observed sessions, respectively. Most schools progressed through all of the materials included in the first level of play in at least one of the five games (93% in the math treatment and 89% in the social treatment), and most classrooms remained engaged with the games through the materials of the first two levels (table S6). Children performed well in the first level of each game (between 61% correct and 87% correct, depending on the game, in the first two rounds of play with those materials), and their scores improved 7 percentage points, on average, between the first two and last two times that a level was played (table S6). Thus, we successfully designed a scalable preschool math games curriculum that was implemented as intended, led to progress within the game itself, and engaged children with its content. The mean percentages of correct responding for each test, treatment group, and endline are reported in Table 2, and they tell a clear story. At endline 1, children in the math games group had a higher proportion of correct responding on all math tests than did the children in the two control groups. For example, they scored 36% correct on the test of geometric sensitivity (chance = 17%), whereas the no-treatment and social treatment groups scored 25% and 29%, respectively. In contrast, and as expected, children in the social games group had a higher proportion of correct responding on gaze sensitivity.
At endlines 2 and 3, children in the math games group still performed best on the nonsymbolic math tests, but not on the symbolic measures targeting the concepts taught in school, which were very similar across the three groups. Children in the social games group still performed better on the test of gaze sensitivity.

Table 2. Descriptive statistics for each test, for each treatment group, and for each endline. Mean percentages of correct responding are listed in each cell; standard deviations are in parentheses.

Results from Fisher permutation tests (table S2) confirm the statistical significance of these findings. Compared to the no-treatment control, we reject the hypothesis of no effect of the math treatment on all of the math assessments, the symbolic assessments, and the nonsymbolic assessments for all endlines taken together and for endline 1 individually. At endlines 2 and 3, we reject the hypothesis of no effect overall and no effect on the nonsymbolic assessments, but not on the symbolic assessments. In contrast, we do not reject the hypothesis of no effect of the social games compared to the no-treatment control, for all endlines taken together and for each endline individually. To summarize these results effectively, we used our preregistered regression framework. We first tested whether, immediately after the intervention, children who had exposure to the math games curriculum had higher scores on the math assessments than those who did not. Consistent with the descriptive statistics, at the first endline, the math games led to a significant increase in the overall math composite: 0.25 standard deviations versus the no-treatment control [t(213) = 5.88, P < 0.001]. There was also an impact of the social games on the overall math composite compared to the no-treatment control, but this effect was smaller than that of the math games (Fig. 3 and Table 3).
Playing the math games therefore had a positive effect on children’s subsequent performance on tests evaluating their sensitivity to number and geometry, relative both to children who received only the regular preschool curriculum and to children who played games with similar rules and materials but with no mathematical content.

Fig. 3. Z-scores of children’s performance on the three primary math outcome measures after the intervention, but before the start of primary school. Colored bars show the impact of the math and social treatments on each outcome measure. Error bars represent standard errors clustered at the school level; coefficient estimates indicate differences from the no-treatment control. On the coefficients, asterisks indicate a rejection of the null hypothesis of no difference compared to the no-treatment control (omitted category): *P < 0.05, **P < 0.01, ***P < 0.001.

Table 3. Coefficients from a linear regression model estimated using ordinary least squares, controlling for age, gender, and baseline test scores for each of the four main outcomes. Assessment time points consist of 3-month intervals beginning immediately after the intervention (before the start of primary school), 6 months after the intervention (midway through the first year of primary school), and 12 months after the intervention (after 1 year of primary school). The first two rows for each endline panel compare math and social treatments to no treatment (respectively), the third row indicates the results of a two-sided test of equality between the math and social coefficients, and the fourth row presents the no-treatment control group’s mean performance. Standard errors (in parentheses) are clustered at the school level. *P < 0.05, **P < 0.01, ***P < 0.001.

The math games led to a particularly large increase in the nonsymbolic math composite: 0.42 standard deviations above the no-treatment control [t(213) = 7.34, P < 0.001].
The social games also led to an increase on the nonsymbolic math composite, but that increase was smaller than that of the math games (Fig. 3 and Table 3). The social games had a similarly large impact on the social skills measure (0.44 standard deviations), whereas the impact of the math games on this measure was smaller (Table 3). These findings suggest that the difference between the impacts of the two treatments was due to their different content, replicating and extending prior evidence that children’s early-developing sensitivity to mathematical and social information improves with experience and exercise (42, 75, 76). Do the gains in preschool children’s intuitive, nonsymbolic numerical and geometric skills lead to improvements in their knowledge of the symbols and language of formal mathematics? Consistent with this possibility, the children who played the math games outperformed the children in the no-treatment control group by 0.13 standard deviations on the symbolic math composite [t(213) = 2.70, P = 0.007] at the first endline, whereas the children who played the social games did not. Nonetheless, the children in the math games group showed only a relatively small advantage over those in the social games group on this composite measure; the difference was significant only at the 10% level (Fig. 4 and Table 3).

Fig. 4. Z-scores of children’s performance on the three math composite measures at endlines 2 and 3. Colored bars show the impact of the math and social treatments on each measure midway through the first year of primary school (EL2) and after 1 year of primary school (EL3). Error bars represent standard errors clustered at the school level; coefficient estimates indicate differences from the no-treatment control: **P < 0.01, ***P < 0.001.

Further analyses of the first endline focused on children’s performance on the individual assessments.
Relative to the no-treatment control, the math treatment had individual impacts on the tests probing nonsymbolic numerical and geometric abilities as well as the tests probing knowledge of number words, Arabic numerals, and shape names, although not the test of simple verbal arithmetic (table S4). There was no impact of the math or social games on the test of executive function, although performance on that test showed moderate test-retest reliability and strong effects of age (tables S3, S8, S12, and S14). The measure of motivation also showed no impact of the math or social games, but children performed poorly and inconsistently on this measure (table S15). The effects of the math games are therefore not attributable to changes in executive function, but we cannot determine whether they depend on changes to children’s motivation. According to prior research, the effects are unlikely to be rooted in children’s expectations about the effects of nonsymbolic mathematical training (77). As in laboratory-based studies (38–40, 42, 78), we thus observed that nonsymbolic mathematical training caused short-term gains in symbolic mathematical outcomes. It is noteworthy that these gains were evaluated over the 3 months that followed the end of the intervention, a substantially longer follow-up than that of a typical laboratory study in this domain. Moreover, the gains in mathematical skills after the intervention were due to the specific math games training, as opposed to the rule-based structure of the games or the increased attention of children in that treatment group. This training intervention, implemented in a field environment with minimally trained teachers and assessors, captured and sustained children’s interest over months of game play. The results thus suggest that it is possible to translate findings from basic cognitive science research into field experiments in children’s everyday environments.
In light of these findings, we asked whether children’s mathematical gains persisted in the longer term. The benefits of most educational interventions are short-lived, even when initial gains are significant (79) and especially when training is not reinforced by “booster” sessions (80). Remarkably, this was not the case in our study. At the two later endlines, 6 months and 1 year later, the overall math composite remained significantly improved in the math games group, and the gains were stable in magnitude between endlines 2 and 3 [compared to the no-treatment control group, 0.12 standard deviations at endline 2, t(208) = 2.74, P = 0.007; 0.14 standard deviations at endline 3, t(213) = 2.77, P = 0.006] (Fig. 4 and Table 3). Although the effect of the math treatment was smaller at endlines 2 and 3 than at endline 1, the increase in math ability that the children in the social games group experienced at endline 1 vanished by endline 2 (0.01 standard deviations compared to the no-treatment control group). Thus, the differential impact of the math versus social games on the overall math composite was remarkably constant over the endline assessments (Table 3). The gains on the nonsymbolic composite by children in the math games group proved enduring through the first year of primary school: 0.29 standard deviations compared to the no-treatment control group at endline 2 [t(208) = 4.59, P < 0.001] and 0.32 standard deviations at endline 3 [t(213) = 4.32, P < 0.001]. The persistence of these improvements is striking because children had no access to the game materials after the intervention ended, and their homes and schools provided no opportunities to engage in related game activities or anything resembling them. Do such enduring gains in preschool children’s nonsymbolic mathematical skills also improve their readiness to learn new mathematical content in primary school?
The answer here was a decisive “no.” By midway through the first year of primary school, the effect of the math games training on the symbolic composite measure had disappeared. Although the math games caused persistent gains in children’s nonsymbolic mathematical abilities, they failed to enhance children’s readiness for learning the new symbolic content presented in primary school. To better interpret these negative findings, we asked whether the symbolic math assessments used in the later endlines were unreliable or invalid for this population. Contrary to these concerns, there were strong intertemporal relations (i.e., test-retest reliability) for each test for children in the no-treatment control group, indicating that our tests were highly reliable (table S5). Moreover, older children in the no-treatment control group performed better than younger children on each of the math tests, which suggests that the tests were indeed sensitive to developmental changes in these abilities (table S16). Although the difficulty of the assessments increased across the three endlines, children showed stable performance on the symbolic math tests (Table 3 and table S4). These findings suggest that the assessments were valid measures of children’s symbolic mathematical knowledge and that children were in fact learning mathematics during the first year of primary school. Finally, we asked whether the failure to show persistent gains in symbolic mathematics, despite enduring gains in nonsymbolic abilities, resulted from inherent differences between the ways that Indian and Western children think about or learn mathematics. Contrary to this possibility, the Indian preschool children in the no-treatment condition exhibited the characteristic profile of correlations across skills found in Western children. 
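The reliability and cross-skill checks discussed here rest on correlations between paired scores. As a minimal sketch, assuming that test-retest reliability is summarized by a Pearson correlation between the same children's scores at two time points (the supplementary tables may use a different statistic), the computation looks like this. The scores are invented and the names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of paired scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant scores")
    return sxy / math.sqrt(sxx * syy)


# Invented scores for the same five control-group children at two endlines.
endline2 = [3, 5, 6, 8, 9]
endline3 = [4, 5, 7, 8, 10]

r = pearson_r(endline2, endline3)
print(f"test-retest r = {r:.2f}")
```

A high value indicates that children keep roughly the same rank order across assessments, which is what a reliable test should show in an untreated group.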
Among children in the United States, there are strong correlations between nonsymbolic and symbolic numerical abilities (43, 81) as well as correlations between sensitivity to the shapes of geometric forms and abilities to interpret geometric maps (30). Similarly, the performance of the Indian children on the nonsymbolic math composite strongly correlated with performance on the symbolic math composite, both within and across time points. We also replicated the more specific correlations in the mathematical cognition literature, relating nonsymbolic numerical acuity to symbolic number abilities, and relating sensitivity to geometric forms to performance on the tests of verbal geometric reasoning. Indeed, each of these correlations survived controls for performance on the tests in the other domain, both within and across time points (Table 4). These parallels between our findings and those of laboratory studies of children in the United States and other developed countries suggest that the negative findings of our intervention do not stem from differences between the mathematical concepts of children in poor and wealthy countries, nor from differences in the ways that those concepts were measured in the lab and in the field. On the contrary, the preschool mathematical abilities revealed by laboratory studies of Western children proved to be both generalizable and robust, but an intervention exercising those abilities failed to enhance poor Indian children’s learning of school mathematics.

Table 4. Coefficients from linear regression models estimated using ordinary least squares and controlling for age and gender. Top: This model illustrates the relation between nonsymbolic numerical discrimination (43) and a symbolic numerical composite score (including tests probing knowledge of number words and simple arithmetic) calculated with separate regressions at each contemporaneous time point [baseline (BL) and three endlines (EL)] as well as across time points. All regressions control for the effects of nonsymbolic and symbolic geometric abilities. Bottom: This model illustrates the relation between nonsymbolic geometric sensitivity and a symbolic geometric composite score (including tests probing knowledge of shape words and judgments about shape properties) calculated with separate regressions at each contemporaneous time point [baseline (BL) and three endlines (EL)] as well as across time points. All regressions control for the effects of nonsymbolic and symbolic numerical abilities. *P < 0.05, **P < 0.01, ***P < 0.001.
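The kind of model summarized in Table 4, an ordinary least squares regression of a symbolic score on a nonsymbolic score while holding other variables fixed, can be sketched in a few lines of Python. This is a hedged illustration, not the authors' code: the solver, the variable names, and the noise-free toy data are all hypothetical, and only three predictors are used (a nonsymbolic numerical score, age, and a single geometry control) rather than the full set of controls described in the caption.

```python
def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y.

    rows: one list of predictor values per child (an intercept is prepended).
    Returns [intercept, b1, b2, ...].
    """
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(xi[i] * xi[j] for xi in X) for j in range(k)]
        + [sum(xi[i] * yi for xi, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear")
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col:
                f = A[r][col]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


# Invented predictors per child: (nonsymbolic numerical score, age in years,
# geometry control score).
rows = [
    (0.2, 4.5, 0.3),
    (0.5, 5.0, 0.1),
    (0.4, 4.8, 0.6),
    (0.9, 5.2, 0.4),
    (0.7, 4.6, 0.8),
    (1.0, 5.5, 0.2),
    (0.3, 5.1, 0.7),
]
# Noise-free outcome built from known coefficients so recovery can be checked.
symbolic = [0.5 + 2.0 * a + 0.3 * b + 1.0 * c for a, b, c in rows]

coefs = ols(rows, symbolic)
print("coefficient on nonsymbolic score, holding controls fixed:", round(coefs[1], 3))
```

The coefficient on the nonsymbolic score is the quantity of interest: it describes the association that remains after age and the other-domain score are accounted for, which is the sense in which the correlations "survived controls".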

Discussion

This study underscores the importance of field experiments to elucidate universal cognitive mechanisms underlying children’s learning of mathematics. Previous research, based on robust correlations and laboratory studies using short-term training, raised the strong possibility that (i) universal, early-emerging mathematical abilities would improve with exercise over the preschool years, and (ii) such exercise would enhance children’s subsequent learning of primary school mathematics. Our study demonstrates that the first part of the conjecture was correct, but not the second. Children’s readiness for learning formal mathematics in India appears to require something more than improvement in nonsymbolic numerical and geometric skills through games that make mathematics fun and show children that they can and do improve in this domain.

On the positive side, our results show that it is possible to translate the subtle manipulations of the laboratory into implementable interventions in the field. Children learned, played, and enjoyed the games. Their intuitive mathematical abilities improved with practice, and these improvements persisted more than a year after the completion of a game-based intervention that exercised them, despite the removal of the games and the absence of any similar resources to sustain children’s gains. The assessments of children’s cognitive gains, based not on standardized tests of intelligence but on laboratory-based measures of sensitivity to number and geometry, yielded findings that are highly similar to the findings from laboratory experiments in developed countries, despite large differences in the conditions under which the tests were administered and in the lives and environments of the children who took them.
They revealed strong correlations between poor preschool children’s early-emerging and intuitive numerical and geometric abilities and the symbolic mathematical content of primary school, just as in other populations from developed countries. Finally, parallel to the short-term results found in laboratory-based training experiments, the improvements in children’s intuitive mathematical abilities had a positive impact on their simultaneous learning of numerical and spatial language and symbols, which were used in the preschools where children played the games. Nonetheless, the preschool intervention had no evident effect on children’s subsequent learning of mathematics in primary school. We conclude that a preschool intervention that effectively fosters an attunement of intuitive mathematical skills, in social and communicative contexts, is not sufficient to promote children’s later learning of school mathematics, at least as that learning is measured at the end of the first year of primary school and in the Indian context. This finding echoes the negative findings from other randomized controlled trials in developing countries, and it suggests a possible explanation. Preschool interventions may fail unless they are designed to complement a central feature of primary school in these settings—that is, a strictly symbolic curriculum. Indeed, exploratory analyses show that the children who returned to preschools after the intervention showed more enduring effects of the math treatment than those who went on to primary school (table S13). These findings suggest two ways to redesign the intervention to make it more successful. First, a math treatment might be more effective at fostering school readiness if the games were presented in a way that connects their nonsymbolic mathematical content directly to the mathematical language and symbols used in school. 
For example, children could be introduced to mathematical language and symbols along with the card and board games that mainly exercise their intuitive abilities, or they could play versions of the games that alternate between pictorial materials and materials presenting words and symbols. Second, nonsymbolic math games training might be more effective if training coincided with children’s learning of formal mathematics rather than preceding that learning. Future field experiments could test these and other possibilities. Our findings underscore both the promise and the necessity of rigorous testing of reforms to school curricula inspired by basic science, using scalable programs over extended time frames in the environments in which those curricula will be implemented. Laboratory-based experiments provide the most sensitive setting for discovering the cognitive and neural underpinnings of children’s learning, but they alone do not reveal the causal factors that produce knowledge over long time spans, nor the most effective means for enhancing that knowledge in school. For those questions, cognitive science and public policy may advance in tandem, through research in homes and in classrooms, testing interventions that combine the diverse processes that together allow children to master new cognitive challenges.

Supplementary Materials
www.sciencemag.org/content/357/6346/47/suppl/DC1
Materials and Methods
Supplementary Text
Figs. S1 to S20
Tables S1 to S22
References (84–92)

http://www.sciencemag.org/about/science-licenses-journal-article-reuse This is an article distributed under the terms of the Science Journals Default License.