Protocol

Seventy-one healthy volunteers underwent a protocol consisting of memory encoding and both immediate and delayed retrieval, with retrieval tests organized around a ~12 h retention interval containing wakefulness or sleep (Fig. 1A). Participants in the wake group (N = 25) performed encoding and immediate testing (Test 1) in the morning, and delayed testing at night (Test 2). This timing was reversed for participants in the sleep group (N = 46). Furthermore, 62-channel EEG was recorded throughout the procedure in the sleep group to allow sleep staging as well as future examinations of the neural correlates of encoding, retrieval, and memory reactivation during sleep.

Figure 1 Protocol overview. (A) Study timeline. (B) Trial structure at encoding (with lateralized stimulus presentation). (C) Trial structure at retrieval (with central stimulus presentation).

During encoding, participants maintained fixation in the center of the screen while a series of visual stimuli was briefly presented in either the left or right visual field (VF) (Fig. 1B). Importantly, all pictures shown on one side were of mild positive emotional valence and low arousal (“neutral valence”), while those presented on the opposite side were of strong negative valence and high arousal (“negative valence”); this manipulation served to facilitate future decoding of category-specific EEG signatures (see Discussion). Subjects rated each item’s perceived valence immediately following its presentation on a continuous scale from very negative to very positive. Participants were not told that their memory would be tested later to render the encoding phase incidental. Assignment of emotional category to VF was counterbalanced across subjects. This additional factor SIDE (negative-left/neutral-right: N = 36; negative-right/neutral-left: N = 35) was included to ensure that any differences between negative and neutral categories could not be due to asymmetric emotional processing by the two hemispheres40,41,42.

During the immediate and delayed retrieval tests, subjects viewed previously seen (“old”) and novel (“new”) stimuli presented in the center of the screen (as opposed to lateralized; Fig. 1C). Different old and new items were used for immediate and delayed tests. Subjects then used sequentially presented continuous scales to rate each item’s 1) old/new status (recognition memory), 2) VF of presentation during encoding (contextual memory), and 3) perceived valence. For items rated “new”, subjects rated VF as if they were actually “old” items (see Methods).

Control Analyses

We first assessed whether time of day affected vigilance and whether that, or the VF of negative item presentation, had any effect on valence ratings or sleep architecture.

Vigilance

We determined subjective vigilance in two ways both prior to encoding and prior to the delayed test. Sleepiness, as assessed by the Stanford Sleepiness Scale43 (SSS), did not differ between the sleep and wake groups either before encoding (sleep: 2.6 ± 0.8; wake: 2.2 ± 0.6; Wilcoxon Rank Sum: Z = 1.7, P = 0.09) or after the 12 h interval (sleep: 2.5 ± 1.1; wake: 2.4 ± 0.9; Z = 0.2, P = 0.81). Similarly, self-reported ability to concentrate (continuous scale from 0–100) did not differ systematically pre-encoding (sleep: 74.4 ± 15.6; wake: 80.6 ± 13.8; t(69) = 1.7, P = 0.10) or before the delayed test (sleep: 74.1 ± 18.4; wake: 75.8 ± 15.9; t(69) = 0.4, P = 0.70), indicating no time of day effect on vigilance.

Encoding Valence

During encoding, subjects rated the emotional valence of each picture on a scale from –1 (negative) to +1 (positive) immediately following its lateralized presentation. An ANOVA with between-subject factor GROUP (sleep/wake) and within-subject factor VALENCE (negative/neutral) on subjects’ trial-averaged ratings indicated that negative items were rated significantly more negatively than neutral ones (main effect of VALENCE), with no effect of GROUP or GROUP*VALENCE interaction (Table 1–encoding). Post-hoc tests indicated that the valence effect was present in both groups (sleep: t(45) = 30.4, P < 10^−30; wake: t(24) = 16.0, P < 10^−13), whereas subjective ratings did not differ systematically between the sleep and wake groups for either neutral (t(69) = 1.7, P = 0.09) or negative (t(69) = 0.8, P = 0.41) items, indicating no time of day effect on perceived valence. Moreover, single-subject independent t tests comparing negative and neutral items indicated that each of the 71 individuals rated negative items significantly more negatively than neutral items (all P adj < 10^−14, where P adj indicates the adjusted P value using the False Discovery Rate [FDR]44). We further added SIDE (negative-left/negative-right) as a between-subject factor, but found no significant main or interaction effects involving this factor (all F(1,67) < 2.5, all P > 0.12), indicating that the VF where each emotional category was presented did not impact ratings. Thus, these results establish that subjects can accurately gauge the emotional valence of stimuli presented away from fixation, regardless of presentation side, and that these effects are similar in the morning and evening.
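To make the FDR adjustment concrete, the following is a minimal Python sketch of the Benjamini-Hochberg step-up procedure. It assumes that this is the FDR variant used (the text cites a reference for FDR but does not name the procedure). The function name `fdr_bh` and the example P values are hypothetical and are not data from this study.

```python
def fdr_bh(p_values):
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(p_values)
    # Indices of the P values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up P values, for illustration only.
example_p = [0.001, 0.02, 0.03, 0.5]
example_adjusted = fdr_bh(example_p)
```

A comparison is then called significant when its adjusted value falls below the chosen FDR level (for example 0.05).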

Table 1 Valence ratings at encoding and immediate test (mean ± SD).

Sleep Architecture

Overall sleep architecture for the 46 overnight participants was in line with typical values for young, healthy subjects (Table 2). We compared sleep parameters as a function of SIDE to examine if the VF where negative vs. neutral items had been presented affected subsequent sleep. Time and percentages spent in each sleep stage did not differ between groups. We did observe differences significant at an uncorrected threshold in parameters related to sleep quality (total sleep time, sleep efficiency, sleep latency), which were driven by two outliers in the negative-left group with unusually long sleep latencies (76 and 81.5 min; all others < 36 min), one of whom also had the lowest observed sleep efficiency (72.6%). However, no comparison survived correction for multiple testing. Thus, macroscopic sleep structure did not vary depending on the hemisphere initially processing negative or neutral information prior to sleep.

Table 2 Sleep architecture parameters (mean ± SD).

Contextual Memory

We next analyzed subjects’ memory for the VF in which each item had been presented during encoding (i.e., contextual memory). Continuous left/right ratings were dichotomized and relabeled “correct” and “incorrect” according to original presentation side. In keeping with previous approaches13,45, we focus below on contextual memory for correctly recognized items (i.e., hits).
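The dichotomization and scoring step can be sketched as follows. This is an illustrative Python sketch, not the analysis code used in the study. It assumes that both continuous scales are coded so that a midpoint of 0 separates “new” from “old” and “left” from “right”; the function name, the tuple layout, and the example trials are hypothetical.

```python
def contextual_accuracy(trials, midpoint=0.0):
    """Percent of hits whose dichotomized side rating matches the encoding side.

    Each trial is a tuple (old_new_rating, side_rating, true_side) for an item
    that was actually shown at encoding. Ratings above `midpoint` count as
    "old" / "right"; ratings at or below it count as "new" / "left"
    (the tie rule is an assumption).
    """
    # Keep only correctly recognized old items (hits).
    hits = [(side, true) for old_new, side, true in trials if old_new > midpoint]
    if not hits:
        return None  # accuracy is undefined without hits
    correct = sum(
        ("right" if side > midpoint else "left") == true for side, true in hits
    )
    return 100.0 * correct / len(hits)


# Made-up trials: three hits (two with the correct side) and one miss.
example_trials = [
    (0.8, 0.6, "right"),
    (0.4, -0.2, "right"),
    (-0.5, 0.9, "right"),
    (0.9, -0.7, "left"),
]
example_accuracy = contextual_accuracy(example_trials)
```

Under this scoring, chance performance is 50%, which is the reference value for the t tests reported for the immediate test.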

Immediate Test

At immediate testing, subjects correctly indicated the original presentation side significantly above chance in each condition (negative-sleep: 73.9 ± 14.0%; neutral-sleep: 74.7 ± 16.9%; negative-wake: 75.6 ± 16.5%; neutral-wake: 78.3 ± 15.5%; t tests vs. 50%: all P < 10^−7). An ANOVA with between-subject factor GROUP and within-subject factor VALENCE did not reveal significant baseline differences (all F(1,69) < 0.8, all P > 0.37; Table S3). Additional analyses including factor SIDE are presented in Supplementary Information.

Change over 12 h

Next, we evaluated the change in contextual memory across the 12 h interval (delayed – immediate). As predicted, we observed a selective preservation of contextual memory across 12 h for negative items in the sleep group (Fig. 2A). Whereas performance decreased across wake for negative (−11.1 ± 13.8%; one sample t test vs. zero: t(24) = 4.0, P = 0.0005) and neutral items (−5.5 ± 15.2%; t(24) = 1.8, P = 0.08), and for neutral items across sleep (−7.8 ± 14.2%; t(45) = 3.7, P = 0.0005), memory for negative items was unchanged across sleep (0.2 ± 14.3%; t(45) = 0.1, P = 0.94). Across conditions, there was a significant GROUP*VALENCE interaction (F(1,69) = 5.7, P = 0.02), a main effect of GROUP (F(1,69) = 4.2, P = 0.04), but no effect of VALENCE (F(1,69) = 0.2, P = 0.67). Post hoc tests indicated that the change in contextual memory for negative items in the sleep group was significantly different from that for neutral items in the same group (paired t(45) = 2.4, P = 0.02), and from that for negative items in the wake group (independent t(69) = 3.2, P = 0.002). The comparison with the neutral-wake condition was in the same direction as for the other two comparisons, but did not reach significance (t(69) = 1.6, P = 0.12). Within the wake group, forgetting of negative and neutral items did not differ significantly (t(24) = −1.5, P = 0.26). Further analyses indicated that these effects did not depend substantially on whether negative items had been presented to the left or right during encoding (Supplementary Information).
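As an illustration of the change-score logic, the sketch below computes per-subject change (delayed – immediate) and the one-sample t statistic against zero. It is a minimal Python sketch under stated assumptions: the function name `one_sample_t` and the accuracy values are hypothetical, and the P value lookup is omitted because it requires the t distribution, which the standard library does not provide.

```python
import math
import statistics


def one_sample_t(values, mu=0.0):
    """Return (t statistic, degrees of freedom) for a one-sample t test."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 in the denominator)
    t = (mean - mu) / (sd / math.sqrt(n))
    return t, n - 1


# Made-up contextual accuracy (%) for four subjects, for illustration only.
immediate = [80.0, 75.0, 70.0, 90.0]
delayed = [70.0, 72.0, 65.0, 80.0]

# Negative values indicate forgetting across the interval.
change = [d - i for d, i in zip(delayed, immediate)]
t_stat, dof = one_sample_t(change)
```

With this sign convention, forgetting yields a negative t statistic; reporting its absolute value alongside a negative mean change, as in the paragraph above, is a presentational choice.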

Figure 2 Change in memory across 12 h. (A) Contextual memory for hits was selectively preserved for negative items in the sleep group. (B) Recognition memory (hit rate) dropped similarly for each condition. Error bars reflect standard error of the mean with between-subject variability removed.
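One common way to obtain error bars with between-subject variability removed is to center each subject's scores on the grand mean before computing the standard error per condition. The Python sketch below illustrates that normalization. It is an assumption that this is the approach behind the figure, since the caption does not name the method, and no further correction factor is applied here. The function name and the data are hypothetical.

```python
import math
import statistics


def within_subject_sem(data):
    """SEM per condition after removing each subject's overall mean.

    `data` is a list of subjects, each a list of scores (one per condition).
    """
    subject_means = [statistics.fmean(row) for row in data]
    grand_mean = statistics.fmean(subject_means)
    # Shift every subject so that all subjects share the grand mean.
    normalized = [
        [score - s_mean + grand_mean for score in row]
        for row, s_mean in zip(data, subject_means)
    ]
    n = len(data)
    n_conditions = len(data[0])
    return [
        statistics.stdev([row[c] for row in normalized]) / math.sqrt(n)
        for c in range(n_conditions)
    ]


# Made-up scores: three subjects, two within-subject conditions.
example_data = [[10.0, 12.0], [20.0, 24.0], [30.0, 33.0]]
example_sem = within_subject_sem(example_data)
```

In this example the subjects differ widely in overall level, yet the normalized error bars are small because the condition difference is consistent within subjects.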

Contextual Memory Control Analyses

An exit questionnaire indicated that 73% of subjects (52/71) became explicitly aware of the mapping between VF and emotional category at some point during the protocol. These proportions were very similar for the sleep (74%, or 34/46) and wake (72%, or 18/25) groups (χ² = 0.03, P = 0.86). A further breakdown of the participants who became aware of the VF-valence contingency indicated that 63% reached this insight prior to the 12 h interval (sleep: 62%, wake: 67%; encoding: 25% [sleep: 26%; wake: 22%], immediate test: 38% [sleep: 35%; wake: 44%]), while only 12% first realized this relation at delayed test (sleep: 15%; wake: 6%). The remaining 25% (sleep: 24%; wake: 28%) could not specify when they first became aware of the rule. Subject proportions were similarly distributed over these categories for the sleep and wake groups (χ² = 1.28, P = 0.73). Thus, these findings indicate that when subjects attained explicit awareness of the VF-valence relation, they typically extracted this rule early in the protocol, with no indication that time of day and/or sleep vs. wakefulness affected the time course of reaching this insight.

Regardless of explicit insight, it is possible that, during retrieval, participants determined each item’s original presentation side solely based on its perceived valence. To address this issue, we examined baseline performance at immediate test in two sets of analyses. Because the sleep and wake groups did not differ at immediate test with respect to either contextual memory or valence, groups were combined.

First, we performed within-subject across-trial correlations, where we correlated items’ valence ratings with their corresponding continuous “correct side/incorrect side” confidence ratings. This was done separately for negative and neutral, and old and new, items. We then extracted the slopes of these regression lines to examine the link between item valence and judgment of encoding side at the group level. Slopes differed significantly from zero for all four item categories (all t(70) > 5.4, all P < 10^−6), indicating that more extreme valence ratings were associated with higher-confidence scores towards the correct side. We reasoned that if subjects exclusively relied on item valence to determine item placement, these relations should be similar for old and new items. However, slopes differed significantly between old and new items for both valence categories (negative-old: −0.42 ± 0.32; negative-new: −0.30 ± 0.37; t(70) = 3.8, P = 0.0003; neutral-old: 0.33 ± 0.31; neutral-new: 0.23 ± 0.35; t(70) = 2.6, P = 0.01), indicating that side judgments for old items went beyond the “baseline” association with valence seen for new items.
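The per-subject slope extraction can be sketched as below. This minimal Python sketch assumes that side confidence is regressed on valence rating with ordinary least squares; the text does not specify the regression direction or fitting method. The function names, subject labels, and trial values are hypothetical.

```python
def ols_slope(x, y):
    """Ordinary least squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("valence ratings have no variance")
    return sxy / sxx


def subject_slopes(trials_by_subject):
    """Map each subject to the slope of side confidence on valence rating.

    Each trial is a tuple (valence_rating, correct_side_confidence).
    """
    return {
        subject: ols_slope([v for v, _ in trials], [c for _, c in trials])
        for subject, trials in trials_by_subject.items()
    }


# Made-up trials for negative old items from two subjects.
negative_old = {
    "s01": [(-0.9, 0.8), (-0.5, 0.4), (-0.1, 0.0)],
    "s02": [(-0.8, 0.5), (-0.4, 0.3), (0.0, 0.1)],
}
slopes = subject_slopes(negative_old)
```

The resulting per-subject slopes would then enter group-level tests: a one-sample t test against zero within each item category, and a paired comparison between old and new items.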

Second, if subjects merely relied on item valence, contextual memory for hits should not differ from performance for items with other recognition statuses (i.e., correct rejections, misses, and false alarms), as valence information is equally available in each of these cases. However, when we compared baseline VF ratings for hits to the other three categories, separately for negative and neutral items, we found significantly higher performance for hits for 5/6 comparisons (paired t tests, all P adj < 0.00004) except for hits vs. false alarms for negative items (P adj = 0.70). To follow up on the latter finding, we examined the change in performance on the contextual task for false alarms across 12 h. Unlike the differential retention of encoding side for previously presented negative and neutral items across sleep, no such valence-related difference was seen for proportions of false alarms assigned to the “correct” side (negative: −1.5 ± 22.9%; neutral: −7.5 ± 32.9%; t(43) = 1.0, P = 0.34), suggesting that the emotional advantage was tied to previously encoded items.

Combined, these findings suggest that while valence-based strategies contributed to judgments about encoding side, contextual memory for hits relied on additional memory-related processes. Moreover, the empirical finding that sleep, but not wake, selectively stabilizes contextual memories of negative, but not neutral, items cannot be explained by a pure valence-based account.

Contextual Memory and Sleep Parameters

Next, we examined whether overnight changes in contextual memory for hits were related to sleep parameters (all variables in Table 2, except WASO, sleep efficiency, and sleep latency). Separate analyses for negative, neutral, and pooled emotional categories indicated a positive correlation between percentage of time spent in NREM (N1 + N2 + N3) sleep and contextual memory change pooled across emotional categories (Fig. 3; Spearman R = 0.33, robust regression P = 0.003). A corresponding negative correlation was found for REM percentage. These correlations remained significant after correction for multiple comparisons (P adj = 0.05). We also observed a negative correlation with minutes spent in REM (R = –0.33, P = 0.01), although this relation did not survive correction for multiple comparisons (P adj = 0.16). Correlations with individual emotional categories were not significant for any sleep parameter (all P adj > 0.40), nor were correlations between proportion of time spent in NREM sleep and contextual memory change different for the two emotional categories (z test for correlated correlations: Z = 0.34, P = 0.63).

Figure 3 Overnight change in contextual memory for hits, pooled across emotional categories, was positively related to NREM sleep percentage. P value (uncorrected) and regression line from robust fit, R value from Spearman correlation.

In sum, larger proportions of NREM (and smaller proportions of REM) sleep are related to better retention of contextual details regardless of emotional category, suggesting that NREM sleep promotes consolidation of episodic memories in a valence-independent fashion.

Item Recognition

Next, we assessed old/new recognition memory by dichotomizing subjects’ continuous old/new response to each stimulus as “old” or “new”, and then calculating the standard metrics of hit rate (HR) and discriminability (d′). Absolute performance at immediate and delayed test for these metrics, as well as for the false alarm rate (FAR), is reported in Table S1.
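For reference, discriminability is conventionally computed as d′ = z(HR) − z(FAR), where z is the inverse of the standard normal cumulative distribution. The Python sketch below illustrates this with a log-linear correction that keeps rates away from 0 and 1. The correction is an assumption, since the text does not state how extreme rates were handled, and the function name and trial counts are hypothetical.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """d' = z(HR) - z(FAR), with a log-linear correction for extreme rates."""
    # Adding 0.5 to counts and 1 to totals avoids rates of exactly 0 or 1,
    # for which the inverse normal CDF is undefined (assumed correction).
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)


# Made-up counts: 40 hits and 10 misses on old items,
# 10 false alarms and 40 correct rejections on new items.
example_d = d_prime(40, 10, 10, 40)
```

A value of zero indicates no ability to discriminate old from new items, and larger values indicate better discrimination.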

Change over 12 h

We evaluated changes in recognition memory across the 12 h interval (delayed – immediate) as a function of GROUP and VALENCE (Fig. 2B). HR decreased significantly for both negative and neutral stimuli in both groups (one-sample t tests vs. zero: all P < 10^−6), indicating robust forgetting. However, neither GROUP nor VALENCE affected the rate of forgetting (statistics in Table 3–HR). Adding SIDE as a between-subject factor did not reveal additional significant main or interaction effects involving this factor (all F(1,67) < 2.5, all P > 0.12), indicating that forgetting was similar when negative items had originally been presented in the left or right VF. Similar to findings for HR, while subjects’ ability to discriminate old from new items decreased significantly in each condition (one-sample t tests vs. zero: all P < 10^−4), forgetting was not affected by GROUP or VALENCE (all F(1,69) < 1.9, all P > 0.17; Table 3–d′). Again, adding SIDE did not yield significant effects (all F(1,67) < 2.7, all P > 0.10). Changes in FAR are reported in Table S2.

Table 3 Change in recognition memory across 12 h (mean ± SD).

For the sleep group, we examined whether sleep parameters correlated with overnight memory changes. For each recognition metric (HR, FAR, d′), we performed separate analyses for negative, neutral, and pooled emotional categories. We found no significant correlations (FDR correction across 39 comparisons for each recognition metric: all P adj > 0.58).

In sum, while overall recognition memory deteriorated markedly across a 12 h interval, wake vs. sleep did not affect the overall rate of forgetting, nor did it impact retention of negative and neutral items differently, contrasting markedly with the effects reported for contextual memory.

Emotional Valence

Next, we examined whether and how sleep influences subjective ratings of item valence from immediate to delayed testing. We assessed valence separately for old and new pictures to determine whether changes in emotional reactivity are specific to memorized items.

Immediate Test

We first sought to ensure that baseline valence ratings at immediate test differed reliably between negative and neutral items, and did so similarly for the sleep and wake groups. Separate ANOVAs for old and new items indicated highly significant main effects of VALENCE, with no effect of GROUP and no GROUP*VALENCE interaction, indicating no effect of time of day on emotional ratings (Table 1, Test 1–old and Test 1–new). Post hoc paired t tests indicated that negative and neutral ratings differed for each combination of sleep/wake and old/new (all P < 10^−13), and for every individual (all P adj < 10^−5). VF of presentation at encoding (SIDE) did not impact affective ratings at immediate test, for either old or new items (all F(1,67) < 1.7, all P > 0.20).

Adding the factor OLD/NEW to the VALENCE*GROUP ANOVA again revealed a significant effect of VALENCE (F(1,69) = 1126.6, P < 10^−43), as well as an OLD/NEW*VALENCE interaction (F(1,69) = 28.0, P < 10^−5), but no effects involving GROUP (all other F(1,69) < 1.8, P > 0.18). Post hoc tests revealed that the OLD/NEW*VALENCE interaction stemmed from more negative ratings to negative items for old compared to new items (sleep: t(45) = 3.3, P = 0.002; wake: t(24) = 2.3, P = 0.03; Table 1), and from old neutral items being rated more positively than new neutral items in both groups (sleep: t(45) = 3.2, P = 0.002; wake: t(24) = 2.3, P = 0.03; Table 1). These findings indicate that items previously encountered during encoding elicit a stronger emotional response at test than novel items, with this emotional potentiation effect similar for negative and neutral items and independent of time of day.

Change over 12 h

We next analyzed changes in valence ratings across the 12 h interval (delayed – immediate; Fig. 4). An ANOVA with factors OLD/NEW, GROUP, and VALENCE yielded significant OLD/NEW*VALENCE (F(1,69) = 11.9, P = 0.001), GROUP*VALENCE (F(1,69) = 4.5, P = 0.04), and VALENCE (F(1,69) = 8.8, P = 0.004) effects (all other F(1,69) < 1.3, P > 0.26). Post hoc tests indicated that whereas a period of wake did not result in significant changes for any condition (one sample t tests vs. zero, all P > 0.16), an interval of sleep did.

Figure 4 Change (delayed – immediate) in affective ratings to centrally presented items. Error bars reflect standard error of the mean with between-subject variability removed.

Specifically, sleep led to less negative ratings of negative items for old (t(45) = 3.0, P = 0.004), but not new (t(45) = 0.2, P = 0.84) items, suggesting that sleep reduces the emotionality of previously memorized negative items. Moreover, sleep led to less positive ratings of neutral items (which, overall, had positive ratings at immediate test), but did so for both old (t(45) = 4.1, P = 0.0002) and new (t(45) = 3.4, P = 0.002) items, suggesting a depotentiation of emotional reactivity to mildly positive material regardless of item novelty. Indeed, while for negative items we observed a significant old/new difference in emotional change across sleep (t(45) = 3.3, P = 0.002), this effect was absent for neutral items across sleep (t(45) = 0.5, P = 0.64). No significant old/new differences were found for wake (negative: t(24) = 1.6, P = 0.11; neutral: t(24) = 1.8, P = 0.09), although, numerically, values followed the pattern seen in sleep. In line with this observation, emotional attenuation was stronger in sleep vs. wake only for the neutral new items (t(69) = 2.0, P = 0.05). In contrast, sleep and wake did not differ significantly for any of the other conditions (all t(69) < 1.5, P > 0.16), indicating that sleep’s effect on emotional ratings was primarily the reduction of positive affect to previously unseen neutral items.