Working memory capacity is notoriously limited to a handful of items, creating one of the central bottlenecks of human cognition, but can be improved by training. The neural basis of this improvement remains a matter of debate, as human imaging studies have produced contradictory results about the mechanisms that effect improved capacity. To resolve this controversy, we recorded neuronal activity from monkeys while they were being trained to improve their ability to maintain multiple stimuli in memory. Our results reveal that improvement of working memory is effected by a more distributed activation of the prefrontal cortex and invariant temporal dynamics of neuronal activity. These changes render the prefrontal network more robust, allowing it to maintain more items in memory.

The amount of information that can be stored in working memory is limited but may be improved with practice. The basis of improved efficiency at the level of neural activity is unknown. To investigate this question, we trained monkeys to perform a working memory task that required memory for multiple stimuli. Performance decreased as a function of number of stimuli to be remembered, but improved as the animals practiced the task. Neuronal recordings acquired during this training revealed two hitherto unknown mechanisms of working memory capacity improvement. First, more prefrontal neurons became active as working memory improved, but their baseline activity decreased. Second, improved working memory capacity was characterized by less variable temporal dynamics, resulting in a more consistent firing rate at each time point during the course of a trial. Our results reveal that improved performance of working memory tasks is achieved through more distributed activation and invariant neuronal dynamics.

Working memory is the ability to maintain and manipulate information in mind (1). The capacity of human working memory is notoriously limited; only a handful of items can be held in memory over a period of seconds, creating a central limitation of human cognition (2). Individual abilities are reliant on working memory capacity (3, 4). Recent results suggest that capacity can be improved by training in working memory tasks (5, 6). The extent to which performance improvements generalize to tasks that were not part of the training has been a matter of intense debate (7, 8). No less contentious has been the idea that computerized training can improve cognitive function in healthy adults (9).

Human fMRI studies have produced conflicting results about the effects of training, with some studies suggesting increases (10–15) and others decreases in activity (16–19). The former are interpreted as reflecting a higher level of activation or recruitment of a larger cortical area, the latter as suggestive of improvements in efficiency (20). Humans are able to effectively reduce working memory load by grouping or “chunking” multiple stimuli (21). For example, a series of 10 digits comprising a phone number can be remembered more easily as a set of three groups of three to four numbers. The effects of training remain speculative, however, and the concept of efficiency is poorly defined at the neural level.

Working memory is thought to be mediated by persistent activity in a network of interconnected neurons behaving as a bump attractor and representing remembered stimuli in the peak of network activity, which may drift in time, resulting in loss of precision (22–24). Neurons that are excited by their preferred stimuli remain active in the delay period of a working memory task. When the capacity of the network is exceeded, information about an item may decay or merge with another item, resulting in loss of information about this item (SI Appendix, Fig. S1). The changes in neuronal activity that allow the network to increase its capacity are unknown. Computational studies simulating networks of neurons generating persistent activity show that improved capacity could be achieved through increased excitatory coupling, resulting in increased activity representing stimuli in the delay period, or through reduced external drive, resulting in lower baseline and stimulus-driven firing rate (25, 26). However, recent work has revealed considerable dynamics in the time course of delay-period activity, and their effects on models of capacity are unclear (27). Alternative mechanisms have also been proposed as the neural correlate of working memory, some of which do not depend on elevated activity during the delay period at all (28, 29). The site of information maintenance has also been debated, with some studies suggesting that information about stimuli is maintained in posterior areas, rather than the prefrontal cortex itself (30, 31). Our study sought to determine the changes in neural activity that effect improvement in working memory performance after practice.

Results

We trained two monkeys to perform a working memory task that required them to remember the spatial locations of multiple stimuli appearing on a visual display and to indicate if a second display with an equal number of stimuli was identical to the first (Fig. 1 A and B). The monkeys’ performance declined monotonically as a function of the number of stimuli as the load of information maintained in working memory increased (Fig. 1C and SI Appendix, Fig. S2). The effect of number of stimuli on performance was highly significant (one-way ANOVA, F(4,545) = 35.5, P = 2.25 × 10⁻²⁶ for monkey EL; F(4,105) = 26.65, P = 2.90 × 10⁻¹⁵ for monkey DA). Based on the pattern of correct and error responses, we were able to determine the working memory capacity K (Fig. 1D) in this task, defined as the set size multiplied by (hits − false alarms) and divided by (1 − false alarms). Behavioral sessions and neuronal recordings were collected over a period of several weeks, during which performance of the animals improved gradually (Fig. 1 E and F). A linear regression of capacity on successive recording days showed a positive slope for each animal (b = 0.022, F(1,46) = 14.290, P = 4.5 × 10⁻⁴ for monkey EL; b = 0.040, F(1,20) = 1.952, P = 0.178 for monkey DA), although the slope reached statistical significance only for monkey EL. We relied on a median split to distinguish between sessions of low and high performance based on estimated capacity. The monkeys achieved an overall performance level of 76.7% correct trials and capacity K = 2.24 items in the low-performance sessions and 82.6% correct trials and K = 3.33 in the high-performance sessions. Errors that differentiated sessions of high performance from those of low performance involved mostly displays with four to five stimuli, as evidenced by differences in capacity between these groups of sessions plotted as a function of number of stimuli (Fig. 1G).
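The capacity estimate and the median split described above can be illustrated with a short sketch. It assumes that "hits" and "false alarms" are rates between 0 and 1 and that the verbal definition groups as K = N × (H − FA) / (1 − FA); the function names, the handling of sessions that fall exactly on the median, and the example values are hypothetical and not taken from the study.

```python
from statistics import median


def capacity_k(set_size, hit_rate, false_alarm_rate):
    """Capacity K = N * (H - FA) / (1 - FA).

    Assumes hit_rate and false_alarm_rate are proportions in [0, 1].
    """
    if not 0.0 <= false_alarm_rate < 1.0:
        raise ValueError("false_alarm_rate must be in [0, 1)")
    return set_size * (hit_rate - false_alarm_rate) / (1.0 - false_alarm_rate)


def median_split(session_capacities):
    """Split sessions into low and high groups around the median capacity.

    Assumption: sessions exactly at the median go to the low group.
    Returns two lists of session indices (low, high).
    """
    cutoff = median(session_capacities)
    low = [i for i, k in enumerate(session_capacities) if k <= cutoff]
    high = [i for i, k in enumerate(session_capacities) if k > cutoff]
    return low, high


# Hypothetical example: five stimuli, 80% hits, 20% false alarms.
example_k = capacity_k(5, 0.8, 0.2)

# Hypothetical per-session capacities, split into low and high groups.
low_sessions, high_sessions = median_split([2.1, 2.4, 3.0, 3.5])
```

With these made-up rates, a five-item display gives K = 3.75, meaning the estimate is always bounded by the set size and falls as false alarms approach hits.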

Fig. 1. Working memory capacity task and behavior. (A) Successive frames illustrate the sequence of events in the match/nonmatch task. The monkeys were required to remember the locations of all of the squares in the cue stimulus during a delay period. A second display then appeared, which contained the same number of stimuli. If one of the squares appeared at a new location, the display constituted a nonmatch; if the displays were identical, they constituted a match. Two choice targets of green and blue color appeared at the top and bottom location (randomly alternating in different trials), and the monkey was required to saccade to a green target if the two sequential displays matched each other or to a blue target otherwise. One animal was trained in a variation of this task that required a lever release for a matching stimulus instead of choice targets. (B) The 24 possible locations where stimuli could appear in the spatial match/nonmatch task. (C) The percentage of correct trials is shown as a function of the number of stimuli in the display for the two monkeys. (D) Estimated capacity in low- and high-performance sessions is plotted for the two monkeys. (E) Capacity estimated in successive daily sessions for monkey DA when neurophysiological recordings were also obtained. Line represents linear regression. (F) Capacity for monkey EL. (G) Difference in capacity between sessions of high and low performance, which were determined based on a median split separately for each monkey, plotted as a function of number of stimuli.

Neural Responses to Multiple Stimuli. We recorded from areas 8 and 46 of the dorsolateral prefrontal cortex while the animals were performing the task (Fig. 2A). A total of 305 neurons were obtained (218 and 87 neurons from the two monkeys, respectively). Of those, 111 neurons (n = 61 for monkey EL, n = 50 for monkey DA) exhibited significant selectivity across different displays (one-way ANOVA, P < 0.05) and therefore could be informative about the displays that needed to be maintained in memory. We relied on these selective neurons for most analyses; results from all neurons are also reported in some figures and in the SI Appendix. The mean firing rate of selective neurons exhibited a highly dynamic time course, starting to increase before the first stimulus display even appeared on the screen (time −1 to 0 in Fig. 2B), peaking shortly after the appearance of the stimulus, decreasing further after the stimulus disappeared, but increasing again at the end of the delay period and during the second stimulus presentation (Fig. 2B and SI Appendix, Fig. S3).

Fig. 2. Firing rate for displays of varying stimuli. (A) Schematic diagram of the monkey brain highlighting areas where recordings were performed. Recordings in dorsolateral prefrontal cortex (PFC) sampled areas 8 and 46. AS, arcuate sulcus; PS, principal sulcus. (B) PSTH represents mean population activity obtained during presentation of varying stimuli. (C and D) Mean firing rate is shown, averaged across displays of different numbers of stimuli, during the cue period (C) and the delay period (D). Data from two monkeys (n = 111 neurons that were selective to the stimuli during the cue period or the delay period). Bars represent SEM.

Across the population of selective neurons, mean firing rate increased monotonically as a function of number of stimuli during the cue presentation period (Fig. 2 B and C).
As the location of the stimuli was randomized in each session, displays with more stimuli were more likely to activate a neuron, and higher levels of activity were elicited across the population. The difference in firing rate between displays with different numbers of stimuli was highly significant (repeated-measures ANOVA, F(4,440) = 25.3, P = 7.0 × 10⁻¹⁹). During the delay period (Fig. 2D), a generally higher firing rate was present for displays with more stimuli, but the effect was less consistent (repeated-measures ANOVA, F(4,440) = 3.88, P = 4.12 × 10⁻³). It is important to emphasize that this increase in firing rate for more stimuli appearing at randomized locations applied to the overall population activity pooled together. When we examined responses involving a single stimulus appearing in the receptive field of a neuron, activity decreased as additional stimuli were added outside the receptive field (SI Appendix, Fig. S4, red trace). The same was true when two stimuli appeared in the receptive field and additional stimuli appeared outside (SI Appendix, Fig. S4, dark blue trace), and so on. This is akin to effects of crowding and lateral inhibition (32, 33). As increasing numbers of stimuli were added outside the receptive field, the firing-rate difference reached statistical significance for one stimulus in the receptive field (one-way ANOVA, F(4,443) = 5.38, P = 3.08 × 10⁻⁴) and for two stimuli in the receptive field (F(3,317) = 3.41, P = 0.018), but not for three stimuli (F(2,181) = 1.23, P = 0.293). In addition to the cue and delay periods, we also examined neural activity during the presentation of the second display, and we distinguished between match and nonmatch responses (SI Appendix, Fig. S5).
Across the population of prefrontal neurons, responses to the second presentation of stimuli were generally higher when the display constituted a nonmatch rather than a match, an effect often referred to as repetition suppression (34). This effect, too, was sensitive to the number of stimuli. The absolute difference between match and nonmatch responses decreased as a function of number of stimuli in the display (one-way ANOVA, F(4,630) = 2.73, P = 0.029).

Firing Rate Changes Associated with Performance Improvement. As our neurophysiological recordings were obtained over a period of time during which performance in the task improved (Fig. 1 E and F), it was possible to examine the neural changes that accompanied increased ability to maintain a greater memory load. A total of 203 neurons were recorded in low-performance sessions based on a median split depending on performance (as in Fig. 1G). Of those, 53 (26%) were selective for the stimulus pattern. A total of 102 neurons were recorded in high-performance sessions. The percentage of neurons selective for stimulus pattern increased substantially to 57% (58 of 102 neurons). This increase in percentage of selective neurons was statistically significant (χ² test, P = 1.39 × 10⁻⁷). We hypothesized that, as performance of the working memory task improved, neuronal firing rate among neurons selective for the stimuli would also increase, as previous studies of the relationship between neuronal activity and working memory performance across sessions in simpler working memory tasks have shown (35). Unexpectedly, we found that prefrontal activation generally decreased in sessions of higher performance (Fig. 3 A and B). A two-way ANOVA with number of stimuli and low/high performance as factors revealed that the main effect of performance was significant in the fixation period (F(1,545) = 18.44, P = 2.08 × 10⁻⁵; Fig. 3C), the cue period (F(1,545) = 10.85, P = 1.05 × 10⁻³; Fig. 3D), and the delay period (F(1,545) = 5.86, P = 0.016; Fig. 3E). For the sample period (Fig. 3F), prefrontal activation was slightly higher for sessions of higher performance, although the difference did not reach significance (F(1,545) = 3.27, P = 0.071).

Fig. 3. Activity in low- and high-performance sessions. (A and B) Population PSTH obtained during presentation of multiple stimuli for (A) low-performance sessions and (B) high-performance sessions. Data from two monkeys (n = 111 neurons). (C) Mean firing rate for different numbers of stimuli in low- and high-performance sessions during the fixation period. (D–F) Raw firing rate and evoked firing rate after subtracting the baseline fixation rate in the cue period (D), delay period (E), and sample period (F). (G) Autocorrelation function plotting correlation coefficient between firing rate in the first 500 ms of fixation and every successive 500-ms interval for low- and high-performance sessions.

Similar results were obtained if we split sessions purely chronologically, rather than based on performance (SI Appendix, Fig. S6). These analyses were based on correct trials. As the improvement in performance involved a decrease in the rate of trials that end up being incorrect, we also repeated our analysis by using correct and error trials (SI Appendix, Fig. S7). The results including the error trials were essentially identical; a significant decrease in activity was present for the fixation period (F(1,545) = 18.73, P = 1.79 × 10⁻⁵), the cue period (F(1,545) = 10.97, P = 9.90 × 10⁻⁴), and the delay period (F(1,545) = 6.98, P = 8.48 × 10⁻³) as a function of cumulative number of sessions. Because the difference in activity between low- and high-performance sessions was already present in the baseline fixation period, we examined the evoked neuronal firing rate relative to the baseline and compared this measure between sessions of low and high performance (Fig. 3 D–F, dotted lines). For this evoked rate, no significant difference between low- and high-performance sessions was found in the cue period (F(1,545) = 0.37, P = 0.543), but prefrontal activation was now higher in high-performance sessions in the delay period (F(1,545) = 22.49, P = 2.71 × 10⁻⁶) and the sample period (F(1,545) = 49.45, P = 6.12 × 10⁻¹²). The same effect was present if we divided sessions chronologically (SI Appendix, Fig. S6).
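The comparison of selective-neuron proportions reported earlier in this section (53 of 203 neurons in low-performance sessions vs. 58 of 102 in high-performance sessions) can be checked with a 2 × 2 χ² test. The text does not state which variant of the test was used, so the sketch below assumes a Pearson χ² test without continuity correction; the function name is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for [[a, b], [c, d]].

    Returns the chi-square statistic and its P value for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, the chi-square survival function
    # equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Rows: low- and high-performance sessions.
# Columns: selective and non-selective neurons (counts taken from the text).
chi2, p_value = chi_square_2x2(53, 203 - 53, 58, 102 - 58)
```

Under this assumption the counts give χ² of about 27.7 and a P value of about 1.4 × 10⁻⁷, which agrees with the value reported in the text.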
The combined effect of a greater proportion of neurons being activated and a lower level of absolute firing rate of these neurons resulted in a more distributed representation of stimulus information across the prefrontal population. This effect could be seen when we performed a receiver operating characteristic (ROC) analysis comparing responses for the best vs. worst display among displays of equal numbers of stimuli for all neurons recorded in the low- and high-performance sessions (Fig. 4 and SI Appendix, Fig. S8 for the delay period). A greater proportion of neurons achieved values of area under the ROC curve greater than 0.75 (a midpoint between chance and perfect performance) in the high-performance sessions.

Fig. 4. ROC analysis. (A and B) ROC values for each recorded neuron in low-performance (A) and high-performance sessions (B) are shown. Dark red colors indicate high selectivity between different stimulus displays of the same number. Dashed white lines indicate the ROC value of 0.75. Data from two monkeys (n = 305 neurons).

We also examined the selectivity for match or nonmatch responses in low- and high-performance sessions (SI Appendix, Fig. S5G). Although, across the population, a substantial difference in firing rate was evident for match and nonmatch responses during the presentation of the second stimulus, this proved not to be predictive of low vs. high performance. A two-way ANOVA with factors number of stimuli and low/high performance group revealed no main effect of number of stimuli (F(4,625) = 1.81, P = 0.126), performance (F(1,625) = 0.75, P = 0.386), or interaction between number of stimuli and performance group (F(4,625) = 0.71, P = 0.584). The results remained essentially unchanged when we repeated this analysis including error and correct trials (F(1,625) = 1.29, P = 0.256 for main effect of performance; SI Appendix, Fig. S5H).
A negligible difference was evident between high- and low-performance conditions particularly for five-stimulus displays, for which the greatest improvement in performance was observed. For this reason, and although the match/nonmatch effect was predictive of behavior across all sessions (SI Appendix, Fig. S5F), it could not account for the working memory performance improvement.
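The ROC analysis described in this section compares a neuron's responses to its best and worst displays. One common way to obtain the area under the ROC curve is as the probability that a randomly chosen trial from the best display has a higher firing rate than a randomly chosen trial from the worst display, with ties counted as one half. The sketch below uses that formulation as an assumption, since the exact computation is not given in the text; the function name and the firing rates are hypothetical.

```python
def roc_area(best_rates, worst_rates):
    """Area under the ROC curve for two samples of single-trial firing rates.

    Computed as the fraction of (best, worst) trial pairs in which the best
    display evoked the higher rate; ties contribute one half.
    """
    if not best_rates or not worst_rates:
        raise ValueError("both samples must contain at least one trial")
    wins = 0.0
    for b in best_rates:
        for w in worst_rates:
            if b > w:
                wins += 1.0
            elif b == w:
                wins += 0.5
    return wins / (len(best_rates) * len(worst_rates))


# Hypothetical firing rates (spikes per second) for one neuron.
area = roc_area([10.0, 12.0, 14.0], [8.0, 9.0, 11.0])

# The 0.75 criterion mentioned in the text, applied to this example neuron.
exceeds_criterion = area > 0.75
```

A value of 0.5 corresponds to chance and 1.0 to perfect separation, so 0.75 sits halfway between the two, as noted in the text. In this made-up example, eight of the nine trial pairs favor the best display, giving an area of about 0.89.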

Changes in Neuronal Dynamics. Effects of working memory enhancement may not be limited only to mean firing rate changes between conditions. Indeed, the effects of dynamics in the representation of stimulus information have recently begun to be appreciated (27). We therefore proceeded to examine changes in the neural activity beyond simple changes in firing rate averaged across entire epochs. An additional difference between low- and high-performance sessions was that the envelope of firing rate changes became much more stereotypical in high-performance sessions [evidenced by the smoothness of curves in the population peristimulus time histogram (PSTH) of Fig. 3 A and B]. We quantified that change by performing a demixed principal component analysis (dPCA), which identifies components of firing rate related to different aspects of the task and stimuli (36). Similar to PCA, this technique reduces the dimensionality of the neuronal firing rate matrix, identifying the dimensions that capture the most variance in the neuronal population firing rate across conditions. Importantly, dPCA segregates variance into components that take into account experimental parameters, which, in our case, included the time of stimulus presentation, the different displays used, and the categorical decision between match and nonmatch. The results of this analysis indicated that “condition-independent” components (i.e., components unrelated to which stimulus display appeared on the screen or whether the trial involved a match or a nonmatch) represented 50% of the total firing rate variance in sessions of low performance and increased to 64% in sessions of high performance (pie charts in Fig. 5). In other words, a larger proportion of variance could be explained by changes in firing rate that tracked the time course of the trial (e.g., ramping in anticipation of the initial stimulus, cue transient, decline of activity in the delay period, sample transient).
In contrast, components representing the stimulus and mixtures of stimulus and other parameters decreased from 46% in low-performance sessions to 32% in high-performance sessions (Fig. 5). The analysis also confirmed that representation of decision variables (essentially whether a stimulus was a match or nonmatch in the context of our task) changed little between low- and high-performance sessions: 24% of the total firing rate variance was accounted for by decision variables and mixtures in the low-performance sessions, compared with 18% in the high-performance sessions.

Fig. 5. dPCA analysis of low-performance (Left) and high-performance sessions (Right). The top row of graphs represents the first two condition-independent components of dPCA analysis, the second row represents the first two stimulus-related dPCA components, the third row represents the first decision-related dPCA component, and the bottom row represents the first stimulus/decision mixture dPCA component. Data from two monkeys (n = 199 neurons with sufficient numbers of trials for this analysis from low-performance sessions; n = 92 for high-performance sessions).

As a result of the invariant dynamics of firing in high-performance sessions, the firing rate of a neuron during the fixation period was highly predictive of firing during the delay period in the high-performance sessions and did not simply reflect a “baseline” level of activity. As a way to quantify the regularity of firing rate time course, we performed an autocorrelation analysis, measuring the correlation coefficient between the firing rate of a neuron at the first 500 ms of the fixation period and subsequent 500-ms intervals (Fig. 3G). Increases and decreases in firing rate were highly stereotypical, resulting in much greater positive and negative deviations in the high- vs. low-performance sessions. This effect could also be seen in the error bars of Fig. 3E (SI Appendix, Fig. S6H shows sessions split chronologically).
The SD of delay period firing rate minus fixation firing rate was considerably lower in the high-performance (σ = 3.2 spikes per second) than in the low-performance sessions (σ = 12.7 spikes per second). Note that variability of the absolute rate of firing during the delay period did not differ appreciably between the high- and low-performance sessions (error bars for solid lines vs. dotted lines in Fig. 3E). As the differences in firing patterns between low- and high-performance sessions were already evident in the fixation period, before the appearance of the first stimulus display, we calculated the time constant of the autocorrelation of firing rate in the fixation period (SI Appendix, Fig. S9). This is a measure of the neuron’s intrinsic time scale that quantifies the stability of firing dynamics (37). The time constant increased from 97 ms in the low-performance sessions to 131 ms in the high-performance ones, a significant difference (permutation test, P = 7.77 × 10⁻⁵).
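The time constant reported above can be read as the decay constant of an exponential fitted to the autocorrelation as a function of time lag. The fitting procedure is not described in this section, so the following is only a minimal sketch that assumes a single exponential without an offset term, R(lag) = A × exp(−lag / τ), fitted by least squares on the logarithm of the autocorrelation. The function name and the synthetic values are hypothetical.

```python
import math


def fit_time_constant(lags_ms, autocorr):
    """Estimate tau (ms) from autocorrelation values at the given lags.

    Assumes R(lag) = A * exp(-lag / tau), so log R is linear in lag with
    slope -1 / tau. All autocorrelation values must be positive.
    """
    if len(lags_ms) != len(autocorr) or len(lags_ms) < 2:
        raise ValueError("need at least two matching lag/autocorrelation pairs")
    if any(r <= 0 for r in autocorr):
        raise ValueError("log-linear fit requires positive autocorrelation")
    logs = [math.log(r) for r in autocorr]
    n = len(lags_ms)
    mean_lag = sum(lags_ms) / n
    mean_log = sum(logs) / n
    sxx = sum((x - mean_lag) ** 2 for x in lags_ms)
    sxy = sum((x - mean_lag) * (y - mean_log) for x, y in zip(lags_ms, logs))
    slope = sxy / sxx
    return -1.0 / slope


# Synthetic autocorrelation generated with a known time constant of 131 ms.
lags = [50, 100, 150, 200, 250, 300]
synthetic = [0.8 * math.exp(-lag / 131.0) for lag in lags]
tau_estimate = fit_time_constant(lags, synthetic)
```

On noise-free synthetic data the fit recovers the generating time constant. Real autocorrelation estimates are noisy and can dip below zero at long lags, in which case a nonlinear fit with an offset term would be the more appropriate choice.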

Relationship with Behavior. Not all changes we identified between low- and high-performance sessions were predictive of behavior in the task. To identify the critical aspects of neuronal activity that were associated with performance, we compared activity patterns in correct and error trials in the high-performance sessions (SI Appendix, Fig. S10 A and B). Firing rate differences were generally subtle between correct and error trials. Whereas a large decrease in baseline activity was evident in high-performance compared with low-performance sessions (Fig. 3), the average rates in correct and error trials were very similar (SI Appendix, Fig. S10C) and no significant difference was evident (paired t test, t(58) = 0.4563, P = 0.650). Collapsed across all stimulus conditions, trials when a cue elicited higher activity were slightly more likely to result in errors (paired t test, t(58) = 2.835, P = 6.33 × 10⁻³). The same was true for the sample presentations (SI Appendix, Fig. S10C). The strongest predictor of performance was the regularity of the firing rate time course. The correlation coefficient between firing rate during the fixation period and every other time point during the trial was greatly reduced (i.e., flattened) in error trials (SI Appendix, Fig. S10D). In other words, trials in which firing rate did not ramp up or down during the time course of the trial were more likely to result in errors. The differences between sessions of high vs. low performance and correct vs. error trials were additive; the most pronounced dynamics were observed in correct trials of high-performance sessions and the least in error trials of low-performance sessions (SI Appendix, Fig. S10 D–F).