Participants

Thirty healthy right-handed participants (20 females; mean age ± SD = 24.2 ± 2.1 years) with no history of neurological or psychiatric disorders gave their informed consent to participate in this study. They were assigned to three groups: 1) real tDCS during mathematics video game training (n = 10, 4 males; mean age ± SD = 24.6 ± 3.8 years); 2) sham tDCS during mathematics video game training (n = 10, 2 males; mean age ± SD = 23.9 ± 2.5 years); and 3) an active control group that received real tDCS during non-mathematical visuospatial tasks (n = 10, 4 males; mean age ± SD = 24.1 ± 2.0 years). The groups did not differ in gender distribution, χ²(2) = 1.2, p = 0.55. Participants were matched across groups on their performance on a standardized mathematics test (Wechsler Individual Achievement Test, Second UK Edition, WIAT-II UK). We did not assess participants’ previous video gaming experience, so we could not examine whether it incidentally differed between the groups. This study was given ethical approval by the Berkshire Research Committee and the methods were carried out in accordance with approved guidelines.
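The gender comparison can be reproduced from the group counts reported above. The sketch below assumes a Pearson chi-square test of independence on the 3 × 2 table of group by gender (the text does not name the test variant), and the function name is illustrative. With 2 degrees of freedom, the p-value has the closed form exp(−χ²/2).

```python
import math

# (males, females) per group, taken from the participant description:
# real tDCS, sham tDCS, active control (n = 10 each).
counts = [(4, 6), (2, 8), (4, 6)]

def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

chi2, dof = chi_square_independence(counts)
# For dof == 2 the chi-square survival function is exactly exp(-x / 2).
p_value = math.exp(-chi2 / 2)
print(f"chi2({dof}) = {chi2:.2f}, p = {p_value:.2f}")  # chi2(2) = 1.20, p = 0.55
```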

Video game

Based on our previous research showing more effective outcomes for training that involves body movements16, we designed an adaptive video game that requires participants to indicate the position of fractions on a visually presented number line by moving their body from side to side. Participants’ movements were captured by a motion-detecting device, KINECT™ (Fig. 1c). Participants performed four practice trials before their first training session. Response times (RTs) and accuracy (the difference between the correct and estimated positions on the number line) were recorded.

Analyses were performed on data up to level 3 because four participants did not perform beyond this level on the first day (2 tDCS; 2 sham). These levels were categorized by difficulty as Easy, Medium and Hard, and were selected through a pilot study prior to the current experiment (Fig. 1b; for the full stimulus list, see Supplementary Table S1). Each fraction level was allocated 4 levels of ‘precision’, which specify the amount of deviation from the target allowed for a correct response. The lowest precision corresponds to ±7% allowed deviation based on the number line range and the highest to ±4%, with two intermediate levels (±6%, ±5%) defined in steps of 1%. We chose these levels of precision because pilot data indicated that they were appropriate for our targeted population: the easiest level (7%) was the least demanding but not too easy, while the most difficult level (4%) was challenging but not unattainable. The levels in between (6%, 5%) provided gradually more challenging trials to stretch the capacity of our participants. Taken together, the level of difficulty was defined by the fraction category and the level of precision within that category. The fraction category was defined by the size of the fractions (Fig. 1b), which increases gradually from Easy to Hard and therefore requires more precise mapping on the number line. The more difficult fractions do not share the lowest common multiple with the easier fractions (see Supplementary Table S1).

The precision requirement was calculated based on the deviation from the target within each category of fractions (Easy, Medium, Hard). At the easiest precision level (7%), an answer was accepted as correct as long as it fell within ±7% of the target fraction. At the most demanding precision level (4%), participants were required to map within ±4% of the target for the trial to be considered correct. For example, when the Easy fraction 3/5 (0.6) is presented, a response is considered correct if it falls within 0.558–0.642 at the 7% level and within 0.576–0.624 at the 4% level. Each trial consisted of the presentation of a fraction challenge, a response window and an immediate assessment of the response with feedback (Fig. 1a). Participants started the game by mapping positions of Easy fractions with the lowest precision, ±7% from target. After 3 consecutive correct answers, they were promoted to the next level requiring a higher precision of response, i.e., ±6%, followed by ±5% and ±4%. After 3 consecutive incorrect answers, participants were demoted by a category or precision level. For example, participants would be demoted to the lower precision level of 7% if they produced 3 incorrect responses at 6% precision, or to Easy fractions at the 4% precision level if they produced 3 incorrect responses at the Medium level at 7% precision. The game was adaptive in order to challenge participants close to their maximal capacity. Note that to account for all levels of the game achieved by our participants, we conducted an analysis of inter-individual variability in performance based on the overall levels achieved by our participants (not restricted to level 3).
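The acceptance window and the promotion and demotion rule described above can be sketched as follows. This is a minimal illustration and not the game's actual code. It rests on several assumptions: the allowed deviation is taken as a proportion of the target value, which reproduces the 3/5 worked example; the 12 combinations of category and precision form a single ordered ladder; the streak counters reset after each move; and the ladder is clamped at both ends. All names are hypothetical.

```python
from dataclasses import dataclass

CATEGORIES = ["Easy", "Medium", "Hard"]
PRECISIONS = [0.07, 0.06, 0.05, 0.04]  # allowed deviation, loosest first
STREAK = 3  # consecutive answers needed to move up or down

def tolerance_window(target, precision):
    """Accepted range, taking the deviation as a proportion of the target (assumption)."""
    margin = target * precision
    return target - margin, target + margin

def is_correct(response, target, precision):
    low, high = tolerance_window(target, precision)
    return low <= response <= high

@dataclass
class Staircase:
    step: int = 0           # 0 = Easy at 7%, 11 = Hard at 4%
    correct_run: int = 0
    incorrect_run: int = 0

    @property
    def level(self):
        category = CATEGORIES[self.step // len(PRECISIONS)]
        precision = PRECISIONS[self.step % len(PRECISIONS)]
        return category, precision

    def update(self, correct):
        """Record one trial outcome and move one step after a streak of 3."""
        if correct:
            self.correct_run += 1
            self.incorrect_run = 0
        else:
            self.incorrect_run += 1
            self.correct_run = 0
        max_step = len(CATEGORIES) * len(PRECISIONS) - 1
        if self.correct_run == STREAK:
            self.step = min(self.step + 1, max_step)
            self.correct_run = 0  # assumption: streak restarts after promotion
        elif self.incorrect_run == STREAK:
            self.step = max(self.step - 1, 0)
            self.incorrect_run = 0  # assumption: streak restarts after demotion

game = Staircase()
for outcome in [True, True, True]:
    game.update(outcome)
print(game.level)  # ('Easy', 0.06)
```

Starting the same object at `step=4` (Medium at 7%) and feeding it three incorrect outcomes moves it to Easy at 4%, matching the demotion example in the text.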

Transcranial direct current stimulation

During real stimulation, 1 mA tDCS was delivered to the bilateral dlPFC (right, F4: anode, left, F3: cathode) for 30 minutes on 2 separate days (within 3 days) during mathematics video game training. The dlPFCs were chosen as stimulation sites, as these are key areas involved in learning17 including mathematical learning4,18 and are hubs for a range of domain-general, executive functions17. Therefore, we inferred that these would be ideal stimulation sites to maximise the potential of transfer effects.

We chose to apply a right-anodal, left-cathodal montage for several reasons: (1) Anodal tDCS to the left dlPFC with a reference electrode on the contralateral supraorbital region improved performance on digit span, but only in the forward order19. In contrast, when repetitive transcranial magnetic stimulation (rTMS) at 1 Hz was applied over the right dlPFC to transiently disrupt its function, performance on both forward and backward digit span was impaired20 (note that in contrast to rTMS at 1 Hz, which has an interference effect, the anodal electrode in our study, which is assumed to influence cortical excitability, was placed above the right dlPFC); (2) we chose to adopt a bilateral instead of a unilateral tDCS montage (stimulating the contralateral dlPFC instead of using a reference electrode) because other studies have shown stronger and more specific effects of anodal tDCS with bilateral compared to unilateral montages21,22,23, and a previous study showed that such a montage is more effective in producing localised current flows24; and (3) we chose the current montage rather than the opposite montage (right-cathodal, left-anodal) as the latter was shown to impair learning in another training paradigm25. Stimulation was delivered via a wireless tDCS cap with two 25 cm² circular sponge electrodes (Neuroelectrics, Barcelona). The current densities were not localised to one hemisphere (Fig. 1d).

The sham group received the same training as the tDCS group, but stimulation was only applied for 30 seconds (15 seconds ramp-up, 15 seconds ramp-down at the beginning and at the end of the training). This brief stimulation is assumed to produce negligible effects on neuronal populations beneath the stimulation electrodes, but induces scalp sensations that are indistinguishable from real tDCS26,27. The active control group was stimulated with the same protocol as the real tDCS group, but during non-mathematical visuospatial tasks (from the Wechsler Abbreviated Scale of Intelligence, WASI-II test).

Active Control Training

Participants in the active control group were given 15 minutes to perform each of the Block Design and Matrix Reasoning subtests of the WASI-II (total duration of up to 30 minutes).

In the Block Design task, participants were asked to arrange a few blocks into 10 specified designs or items shown in the Stimulus Book. Each side of the blocks could be purely red, purely white or divided into 2 equal triangles (half red, half white). Participants were given 4 blocks for the first 5 items and 9 blocks for the following 4 items. They had two practice trials using items 1 and 2 to ensure that they understood the task. They were then required to solve the first 5 items within 60 seconds for each item and the following 4 items within 120 seconds for each item. They were allowed two trials for the first 2 items and were given a score of 2 for a correct response on the first trial, a score of 1 for a correct response on the second trial and a score of 0 for incorrect responses on both trials. For the remaining items, they were scored based on the time taken to solve each item. For items 3–9, they were given a score of 4 for correctly assembled designs within 21–60 seconds, a score of 5 within 16–20 seconds, a score of 6 within 11–15 seconds and a score of 7 within 1–10 seconds. For item 10, they were given a score of 4 for recreating the design correctly within 66–120 seconds, a score of 5 within 46–65 seconds, a score of 6 within 31–45 seconds and a score of 7 within 1–30 seconds. For each of items 11–13, they were given a score of 4 for recreating the design correctly within 76–120 seconds, a score of 5 within 56–75 seconds, a score of 6 within 41–55 seconds and a score of 7 within 1–40 seconds. The maximum raw score is 71. If a participant completed the Block Design task in less than 15 minutes, they were required to wear the tDCS cap for the remaining duration while they prepared for the second task, Matrix Reasoning (e.g., filling in their personal details).
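The scoring bands listed above can be written as a small lookup. The sketch below follows the time bands exactly as stated in this paragraph and is not taken from the WASI-II manual. It assumes that an incorrect design, or a correct one completed after the last band, scores 0, and that a fractional time falls into the next slower band. The function and variable names are hypothetical.

```python
# (first item, last item, [(upper time bound in seconds, score), ...]),
# transcribed from the bands listed in the text.
TIME_BANDS = [
    (3, 9, [(10, 7), (15, 6), (20, 5), (60, 4)]),
    (10, 10, [(30, 7), (45, 6), (65, 5), (120, 4)]),
    (11, 13, [(40, 7), (55, 6), (75, 5), (120, 4)]),
]

def score_two_trial_item(solved_on_trial):
    """Items 1-2: 2 points if solved on trial 1, 1 point on trial 2, else 0."""
    return {1: 2, 2: 1}.get(solved_on_trial, 0)

def score_timed_item(item, seconds, correct=True):
    """Items 3-13: time-banded score for a correct design, 0 otherwise."""
    for first, last, bands in TIME_BANDS:
        if first <= item <= last:
            if not correct:
                return 0  # assumption: incorrect designs earn no credit
            for upper, score in bands:
                if seconds <= upper:
                    return score
            return 0  # assumption: no credit beyond the last band
    raise ValueError(f"item {item} has no time bands")

print(score_timed_item(5, 18))   # 5
print(score_timed_item(10, 50))  # 5
print(score_timed_item(12, 39))  # 7
```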

In the Matrix Reasoning task, participants were given 30 incomplete matrices or series from the Stimulus Book and were required to select the response option that completes each matrix or series. A correct response was given a score of 1 and an incorrect response was given a score of 0. The maximum raw score is 35. In the event that the participant completed the task in less than 15 minutes, they were required to wear the tDCS cap for the remaining duration.

Please note that, in line with the WASI-II administration guidelines, the active control group did not receive any feedback on their performance, and participants’ training consisted of repeating the tasks on the next day. The two tasks were not presented in alternated order across or within participants over the three testing days. Overall, each participant completed 10 Block Design items and 30 Matrix Reasoning items on each day.

Cognitive assessments

Cognitive assessments were conducted before, immediately after and 2 months after training in order to assess the generalizability of training (transfer) and longevity of effects. Participants were tested on their mathematical achievement and working memory capacity (verbal and visuospatial). Mathematical achievement was tested using the Wechsler Individual Achievement Test 2nd UK Edition (WIAT-II, UK). This included tests on numerical operations and mathematical reasoning. The composite scores were used to control for any differences in mathematical achievement during group allocation and to assess the effect of individual differences on training outcomes for real and sham tDCS. Verbal and visuospatial working memory capacities were assessed using Digit Span and Corsi blocks respectively (both forward and backward for each test).

Statistical Analyses

Response times and accuracy (absolute deviation of answer from target) up to level 3 and within ±3 standard deviations (SD) of the mean were analysed using 4-way mixed analyses of variance (ANOVAs). Time (day 1, day 2) × category (Easy, Medium, Hard fraction problems) × precision (±7%, ±6%, ±5%, ±4% from the exact answer based on the number line range) were the within-subject factors and group (tDCS, sham) was the between-subject factor. Data collected two months later were analysed using a 3-way ANCOVA with category, precision and group as factors. We included day 1 RTs as a covariate to control for baseline performance, as they correlated with performance 2 months later (n = 20, Pearson r = 0.72, p < 0.01; Spearman r = 0.84, p < 0.01). Note that on day 1, one participant from the tDCS group did not perform up to level 3 and was therefore excluded from the RT and accuracy analyses and from the data provided in Table 1.
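The ±3 SD trimming and the two correlation coefficients mentioned above can be illustrated with the standard library alone. This is a sketch under stated assumptions: the trimming is applied to a single list of values using the sample SD (the text does not say whether trimming was done per participant or per condition), and Spearman's coefficient is computed as Pearson's r on average ranks. All names are hypothetical and no study data are used.

```python
import math
import statistics

def trim_outliers(values, n_sd=3.0):
    """Keep values within n_sd sample standard deviations of the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [v for v in values if abs(v - mean) <= n_sd * sd]

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman_r(x, y):
    """Spearman rank correlation: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))
```

Reporting both coefficients, as the authors do, guards against a single extreme value or a monotonic but non-linear relationship driving the result.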

Table 1 Game RT and accuracy with improvements within and between groups.

Overall performance on the game was taken into account when we analysed the relationship between the overall levels achieved (the sum of all precision levels in all categories achieved) and participants’ baseline mathematics abilities using both Pearson and Spearman correlation analyses. Note that one participant was excluded for being an outlier (>6 SD from the mean). When we assessed the gain from the training at the end of the game, performance on day 1 was controlled for as baseline performance, as it was correlated with performance on day 2. We controlled for day 1 performance instead of subtracting it, as this has been recommended as the method best suited to our design28,29.
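The difference between controlling for day 1 and subtracting it can be seen in a toy example. The sketch below is not the authors' analysis: it uses hypothetical, noise-free data and a plain least-squares fit of day 2 on day 1 plus a group indicator, solved through the normal equations. All names and values are illustrative assumptions.

```python
from statistics import mean

def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        rest = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - rest) / m[r][r]
    return x

def ols(design, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Hypothetical toy data (not from the study): the groups differ at baseline.
day1 = [10.0, 12.0, 14.0, 16.0, 11.0, 13.0, 15.0, 17.0]
group = [0, 0, 0, 0, 1, 1, 1, 1]
day2 = [1.0 + 0.5 * d + 2.0 * g for d, g in zip(day1, group)]

# Covariate adjustment: day2 ~ intercept + day1 + group.
design = [[1.0, d, float(g)] for d, g in zip(day1, group)]
intercept, baseline_slope, adjusted_group_effect = ols(design, day2)

# Gain-score alternative: group difference in (day2 - day1).
gain = [post - pre for pre, post in zip(day1, day2)]
gain_group_effect = mean(gain[4:]) - mean(gain[:4])

print(round(adjusted_group_effect, 6), round(gain_group_effect, 6))  # 2.0 1.5
```

In this toy case the covariate-adjusted estimate recovers the built-in group effect of 2.0, while the gain-score difference gives 1.5 because the groups start from different baselines and the day 1 slope is not 1.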