Participants

Twenty right-handed subjects (11 females, age 20–31 years) with no history of neurological or sleep disorders participated in this study. They filled in questionnaires about their sleep habits and had an interview with a sleep specialist prior to recordings. Sleep habits matched the general population standards. Participants were monitored for 7–10 days prior to the recording session through actigraphy and sleep diaries to ensure stable sleep/wake rhythms. The sample size was determined based on previous studies on (i) sensory processing during full-night polysomnographic recordings50 and (ii) learning of acoustic noise using electroencephalographic (EEG) recordings30. This protocol was approved by the local ethics committee (Comité de Protection des Personnes, Ile-de-France I, Paris, France).

Sleep study and noise-memory paradigm

On the day of the recordings, participants were first familiarized with the stimuli used in our protocol (white-noise acoustic stimuli). They were equipped for polysomnographic recordings and performed an initial pre-sleep phase while remaining awake (Fig. 1c, 41 ± 1 min, mean ± standard error of the mean (SEM) across participants), which consisted of detecting repetitions in acoustic noise (see below). They then went to bed and were asked to perform the same task for as long as they were awake. Stimuli were continuously presented over the whole night (sleep phase: 494 ± 20 min). Finally, upon awakening, participants underwent a memory test (post-sleep phase) without being explicitly told so; i.e., they were instructed to keep on detecting repetitions within noise trials (89 ± 2 min). Polysomnographic equipment was removed at the end of the post-sleep phase.

We used a variant of the noise-memory paradigm (Fig. 1b)29, which had been optimized for EEG recordings30. Recording sessions were preceded by a short familiarization phase during which we played sounds with or without repetitions to participants while indicating to them which stimuli did or did not include repeated patterns. Then, each recording session was divided into three phases (Fig. 1c). In the initial pre-sleep phase, participants were instructed to discriminate between the following: (i) noise (N) stimuli (duration: 3.5 s), i.e., acoustic stimuli made of ever-changing white noise and thus devoid of any repeating sequence, and (ii) repeated-noise (RN) stimuli, in which a 0.2 s white-noise target was presented 5 times to listeners (Fig. 1b). In RN trials, the noise targets were interleaved with ever-changing white-noise fillers to keep stimulus duration similar to N trials (3.5 s). The first target was presented 0.8 s after stimulus onset and targets were presented every 0.5 s. Because both targets and fillers were made of white noise (no sample-to-sample predictability), such concatenation is seamless, as illustrated in Supplementary Fig. 2 (no change in sound envelope, for example). Repeated noise targets differed from one trial to the other. In the RN condition, we thus introduced a repetition of the same piece of acoustic information within but not across trials (Fig. 1b). A different set of RN targets was presented to each participant (Fig. 1a). Unknown to participants, another set of repeated targets (N = 5 for each participant) was randomly selected to be recurrently presented across the entire pre-sleep phase. Such trials were termed RefRN stimuli and correspond to the presentation of the same targets both within and across trials. Classically, RefRN trials are associated with improved repetition-detection performance compared to RN trials29,30,31, 69. From the perspective of participants, RefRN and RN trials differed only through prior exposure, as they shared the same structure. Thus, the difference in performance between RefRN and RN trials can be used to titrate longer-term perceptual learning. The ability to differentiate RN trials from N trials, on the other hand, may involve the rapid formation of memory to noise30 (Fig. 1a). We provide two audio exemplars of N (Supplementary Audio 1) and RN/RefRN (Supplementary Audio 2) stimuli.

Lastly, a fourth type of stimulus (Reference Noise, RefN) was introduced to balance the number of trials with and without repeating patterns. In these trials, the 0.2-s-long noise snippets used to build RefRN trials were used and injected every 0.5 s. However, contrary to RefRN trials, a single RefN trial was built from different RefRN targets. Thus, there was no within-trial repetition of a target in RefN trials, but RefN trials did contain fragments that had previously been played to participants (RefRN targets) and potentially learnt. Our expectation was that RefN trials would prompt participants to wrongly indicate the presence of repetitions because of these known fragments. However, these RefN trials did not differ from N trials in the pre-sleep phase, in either behavior or EEG recordings, and they were thus not analyzed further.

Response handles were attached to participants’ hands, and participants were instructed to indicate the presence of a repeating pattern by pressing the right or left handle (the ‘response-side/stimulus-condition’ mapping was counterbalanced across subjects). Response side and reaction times (RTs) were recorded for further analysis. Participants were instructed to remain awake and to respond to stimuli during the entire pre-sleep phase while keeping their eyes closed. Stimuli were played every 5.5 to 7.5 s (jitter: uniform distribution) with a break every 64 trials.

In the sleep phase, similar stimuli were used. N and RN stimuli were freshly generated for each N or RN trial. However, different sets of RefRN targets were played in periods of wakefulness (same as the pre-sleep phase), NREM (N = 5 NREM RefRNs), and REM sleep (N = 5 REM RefRNs) according to an online assessment of vigilance states (Fig. 1c). In practice, when participants were awake, only the RefRN trials containing the wake targets were played (wake RefRN). In NREM sleep (NREM2 and 3), the NREM set of RefRN was played to participants, and, in REM sleep, the REM set was used. Each time participants awoke, the RefRN list was set back to wake RefRN targets. In addition, when the NREM or REM RefRN sets were played, the duration of stimuli was increased (6 s instead of 3.5 s in wakefulness) in order to double (10 vs. 5) the number of within-trial repetitions in RefRN and RN trials. Yet, the general structure (0.2-s-long targets separated by 0.3-s-long white-noise fillers) was conserved. Participants were instructed to respond to stimuli as long as they remained awake and to resume responding in case of an awakening. They were verbally reminded to do so by the experimenter in case of prolonged awakening without responses (no response while participants were awake and stimuli were being played for about 5 min). Stimuli were played every 6.5–9.5 s in wakefulness and every 9–12 s in sleep (jitter: uniform distribution).

Finally, in the post-sleep phase, participants were tested on all RefRN targets presented in the pre-sleep and sleep phases (N = 5 wake, NREM and REM RefRN targets per participant) along with 5 new RefRN targets. Task instructions remained the same (detection of repetitions in noise) and participants were not informed of the presence of previously presented noises. Each RefRN was tested in a separate 5-minute block along with freshly generated RN and N trials and was presented 8 times. The order of presentation of wake, NREM, REM, and new RefRN was randomized. Stimuli were played every 5.5–7.5 s (jitter: uniform distribution). Participants were instructed to remain awake and to respond to stimuli throughout the post-sleep phase. However, in some cases, participants failed to indicate the presence or absence of repetitions. Post-sleep blocks with more than 20% of trials without responses were excluded from our analysis (17 out of 400 blocks in 20 participants). Participants never received feedback on their responses in the pre-sleep, sleep and post-sleep phases.

All stimuli were randomly generated to create acoustic white noise (sampled at 44,100 Hz). Each stimulus is therefore made of thousands of normally distributed numbers. White-noise stimuli have, on average, a flat spectrum and a constant amplitude envelope, and are devoid of short-term regularities (no sample-to-sample predictability) or salient features, making the detection of any pattern very difficult (Supplementary Fig. 2). In addition, as stimuli were randomly generated, prior exposure could be precisely controlled, as the probability for each participant of having encountered the exact same noise segments before the experiment is close to 0. White-noise learning paradigms therefore provide a unique opportunity to investigate the learning of novel sensory information. Stimuli were presented to participants using the Psychtoolbox extension70 for Matlab (Mathworks Inc., Natick, MA, USA) and were played at 50 dB (soundcard: Echo Indigo, Echo Digital Sound Corp., Santa Barbara, CA, USA) through a loudspeaker placed near the bed to ensure comfortable listening conditions while minimally disturbing sleep.
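The stimulus structure described in this section can be illustrated with a short sketch. It is not the code used in the study (stimuli were generated and played in Matlab); the function names, the fixed seed and the use of Python's standard library are assumptions made for illustration. Only the sampling rate, target duration, target period, first-target onset and trial duration come from the text.

```python
import random

FS = 44_100          # sampling rate in Hz (from the text)
TARGET_S = 0.2       # target duration in seconds
PERIOD_S = 0.5       # interval between successive target onsets in seconds
FIRST_ONSET_S = 0.8  # onset of the first target in seconds


def white_noise(n_samples, rng):
    """Return n_samples independent draws from a standard normal distribution."""
    return [rng.gauss(0.0, 1.0) for _ in range(n_samples)]


def make_trial(duration_s, target=None, n_targets=0, rng=None):
    """Build an N trial (no target) or an RN/RefRN trial (target repeated n_targets times)."""
    rng = rng or random.Random()
    # Start from ever-changing noise; the parts left untouched act as fillers.
    stim = white_noise(round(duration_s * FS), rng)
    if target is not None:
        for k in range(n_targets):
            start = round((FIRST_ONSET_S + k * PERIOD_S) * FS)
            stim[start:start + len(target)] = target  # same length, so duration is unchanged
    return stim


rng = random.Random(0)  # hypothetical seed, used only to make the sketch reproducible
target = white_noise(round(TARGET_S * FS), rng)
n_trial = make_trial(3.5, rng=rng)                               # wake N trial
rn_trial = make_trial(3.5, target=target, n_targets=5, rng=rng)  # wake RN trial
```

In this sketch, an RN trial draws a fresh target for every trial, whereas a RefRN trial would reuse the same stored target across trials; the longer sleep trials would correspond to a 6 s duration with 10 targets.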

Contrasts of interest and expected results

As thoroughly discussed recently30, the noise-memory paradigm allows exploring the rapid formation of memory to noise at different time scales. The fact that listeners could discriminate between RN and N trials demonstrates their ability to detect the reoccurrence of a nondescript noise segment embedded in running noise after only a few presentations (max: 5 in the pre-sleep phase) and despite the statistical similarity between targets and fillers. Therefore, the RN vs. N contrast reveals the formation of a form of shorter-term memory to noise (Fig. 1a, right). In contrast, RefRN and RN stimuli have identical structures (Fig. 1b). They only differ through participants’ prior exposure. Improvement in repetition-detection performance for RefRN compared to RN trials can only be explained by the formation of longer-term memory to noise (time scales of minutes or hours; Fig. 1a, right). Such longer-term learning of acoustic noise has been confirmed by several studies29,30,31, 69. Importantly, Agus and colleagues showed that such learning was preserved after 2 weeks29. We thus used the RefRN vs. RN contrast to focus on longer-term memory (across-trial) while the RN vs. N contrast was used to target shorter-term memory (within-trial; Fig. 1a). The RefRN vs. N contrast focuses on the cumulative effect of shorter- and longer-term memory.

Electrophysiological recordings

Participants were equipped for polysomnographic recordings according to the AASM guidelines38. We continuously recorded electroencephalographic (EEG, N = 19 derivations, 10–20 montage), electro-oculographic (EOG, N = 2 derivations, placed above and under the right and left canthus, respectively), electromyographic (EMG, one derivation on the chin and two derivations on the right and left abductor pollicis brevis (thumb flexor muscle), recording muscle activity associated with hand responses), and electrocardiographic (ECG, N = 1 derivation) data in parallel with video monitoring. To ensure the reliability of data collection through hours of recordings, AgCl electrodes were attached to participants’ scalp using an adhesive paste (EC2, Natus Neurology Inc., Middleton, WI, USA). This technique, while minimizing electrode displacement, limits the number of channels that can be recorded. Electric signals were amplified through a B1IP or B2IP MEDATEC amplifier (Medical Data Technology SPRL, Bruxelles, Belgium). The signal corresponding to the EEG and EOG channels was recorded as the difference in voltage between each sensor and a ground electrode placed on participants’ scalp, near the vertex (i.e., near Cz). EEG electrodes were re-referenced offline to the averaged mastoids, and EOG electrodes were re-referenced to the opposite mastoids. During recordings, both EEG and EOG were re-referenced to the opposite mastoids. EMG consisted of bipolar derivations with two recording electrodes placed a few centimeters apart on participants’ skin. EEG, EOG, ECG, and EMG data was recorded at a 200 Hz sampling rate. Impedances of scalp electrodes were generally below 5 kΩ. An external channel was used to synchronize EEG data with stimulus presentation times.

Participants were constantly monitored during both wakefulness and sleep. As explained above, during the sleep phase, a given set of RefRN (wake, NREM, or REM) was selected according to participant’s vigilance state. To do so, the vigilance state was assessed online using standard guidelines38 by an experienced scorer (TA) and confirmed offline by two scorers (TA and DL) blinded to experimental conditions (see below and Supplementary Table 1).

Behavioral indices of perceptual learning

To behaviorally assess listeners’ ability to detect the presence of repeating noise segments, we computed their sensitivity to the presence of these repetitions by means of a d′ index71. The d′ index has the advantage of taking into account participants’ biases for one response (presence of repetitions) or the other (absence), facilitating the averaging across participants. A significant deviation of d′ from 0 indicates participants’ ability to reliably discriminate the two conditions of interest at the group level. d′ indexes were computed for RefRN and RN conditions independently and for each participant (see Eqs. 1 and 2):

$$d_{\mathrm{RefRN}}^{\prime} = z\left(\mathrm{Hit}_{\mathrm{RefRN}}\right) - z\left(\mathrm{FA}_{\mathrm{N}}\right)$$ (1)

$$d_{\mathrm{RN}}^{\prime} = z\left(\mathrm{Hit}_{\mathrm{RN}}\right) - z\left(\mathrm{FA}_{\mathrm{N}}\right)$$ (2)

where z(x) corresponds to the z-score for proportion x, Hit_RefRN corresponds to the proportion of correct responses for RefRN trials, Hit_RN corresponds to the proportion of correct responses for RN trials, and FA_N corresponds to the proportion of incorrect responses for N trials. Extreme performances (100%/0%) were adjusted to the equivalent of half of a single correct/incorrect response71 to avoid infinite d′ values. As previously shown, RefRN trials were associated with higher d′ indexes compared to RN trials (Fig. 2a, top).
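As an illustration of Eqs. 1 and 2, the following sketch computes d′ from response counts, including the adjustment of extreme proportions described above. The function names and the count-based interface are assumptions made for this example; z(x) is taken as the inverse of the standard normal cumulative distribution function.

```python
from statistics import NormalDist


def adjusted_rate(n_yes, n_trials):
    """Proportion of 'repetition' responses, with 0% and 100% moved by half a response."""
    if n_yes == 0:
        return 0.5 / n_trials
    if n_yes == n_trials:
        return (n_trials - 0.5) / n_trials
    return n_yes / n_trials


def d_prime(n_hits, n_signal_trials, n_false_alarms, n_noise_trials):
    """d' = z(Hit) - z(FA), with Hit from RefRN or RN trials and FA from N trials."""
    z = NormalDist().inv_cdf  # inverse CDF of the standard normal distribution
    hit = adjusted_rate(n_hits, n_signal_trials)
    fa = adjusted_rate(n_false_alarms, n_noise_trials)
    return z(hit) - z(fa)
```

For example, `d_prime(40, 40, 0, 40)` remains finite because the 100% hit rate and the 0% false-alarm rate are replaced by 39.5/40 and 0.5/40, respectively.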

Reaction times (RTs) also capture the formation of memory traces to noise30. Typically, RefRN trials lead to faster responses, often anticipating the end of the stimulus presentation window (<3.5 s; Fig. 2a, middle). We therefore combined the improvement in response accuracy and rapidity to titrate the amount of learning. To do so, we used the Behavioral Efficacy (BE) index, which we used in a similar experimental context30. Inspired by the Inverse Efficiency Score72, BE was defined as follows:

$$\mathrm{BE}_{\mathrm{RefRN}} = d_{\mathrm{RefRN}}^{\prime} \times \left(\frac{\mathrm{RT}_{\mathrm{N}}}{\mathrm{RT}_{\mathrm{RefRN}}}\right)$$ (3)

$$\mathrm{BE}_{\mathrm{RN}} = d_{\mathrm{RN}}^{\prime} \times \left(\frac{\mathrm{RT}_{\mathrm{N}}}{\mathrm{RT}_{\mathrm{RN}}}\right)$$ (4)

where RTs for RefRN and RN trials were computed from stimulus onset. Intuitively, BE increases with d′ and is further increased when the RTs to the stimulus of interest are faster than the N baseline. BE was higher for RefRN trials compared to RN trials (Fig. 2a, bottom).
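Eqs. 3 and 4 can be sketched as follows. The function name is hypothetical, and summarizing each condition's RTs by their mean is an assumption made for this example, since the text does not specify the summary statistic.

```python
from statistics import mean


def behavioral_efficacy(d_prime, rts_noise, rts_condition):
    """BE = d' x (RT_N / RT_condition); RTs are measured from stimulus onset."""
    # Assumption: each condition's RTs are summarized by their arithmetic mean.
    return d_prime * (mean(rts_noise) / mean(rts_condition))
```

With this definition, a condition answered faster than the N baseline yields a ratio above 1 and therefore a BE larger than d′ alone, and slower responses reduce BE.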

Behavioral data was analyzed in the pre-sleep (Fig. 2a) and post-sleep (Fig. 3) phases but not in the sleep phase, owing to the absence of behavioral responses during sleep. Trials without responses were not included in behavioral analyses. In the sleep phase, RefRN targets were presented according to participants’ vigilance state. However, in the course of the night, some of these RefRN targets were presented around microawakenings, as assessed by a double offline scoring (N = 38 out of 100 RefRN targets in NREM sleep and 18 out of 100 in REM sleep). The isolated presentation of NREM targets during wakefulness can hardly explain the suppressive effects observed for NREM targets. Nevertheless, to avoid this confound and to make sure that the positive effect for REM targets could not be due to these awakenings, BE in the post-sleep phase was also computed after excluding all NREM or REM RefRN targets heard around (micro)awakenings (Supplementary Fig. 4c), which led to results identical to those in Fig. 3.

Offline sleep scoring of polysomnographic recordings

Polysomnographic data was analyzed using a combination of SPM (Functional Imaging Laboratory, Univ. College London, London, UK), FieldTrip73, and EEGlab74 toolboxes running on Matlab (Mathworks Inc., Natick, MA, USA).

Polysomnographic data (EEG, EOG, EMG, and ECG data) was preprocessed according to established guidelines. EEG data was high-pass filtered above 0.1 Hz and then low-pass filtered below 30 Hz (5th order two-pass Butterworth filters). EMG data was band-pass filtered between 60 and 80 Hz (5th order two-pass Butterworth filter). In addition, EEG, EOG, EMG, and ECG were notch-filtered around 50 Hz to reduce line noise. Vigilance states were assessed online using standard guidelines38 by an experienced scorer (TA) and confirmed offline on 20-s-long windows by two scorers (TA and DL) blinded to experimental conditions. Polysomnographic data was continuously scored on 20-s-long windows as follows: wakefulness, NREM sleep stage 1 (N1), NREM sleep stage 2 (N2), NREM sleep stage 3 (N3), tonic REM sleep (tREM), and phasic REM sleep (pREM). The NREM sleep stages are labeled here as NREM1, NREM2, and NREM3 to avoid confusion with the ERP nomenclature. Only the Fz, C3, C4, and Pz EEG derivations from the classical 10–20 montage were used for scoring. The disappearance of the rhythms associated with wakefulness, such as alpha oscillations ([8–10] Hz), and the appearance of slow rolling eye movements were indicative of the transition to NREM1. NREM sleep hallmarks (K complexes and sleep spindles) marked the transition to deeper stages of NREM sleep (NREM2 and NREM3). REM sleep was characterized by the recovery of an EEG signal similar to wakefulness coupled with a highly reduced EMG and the occasional presence of rapid eye movements (REMs) performed with eyelids closed. Epochs of REM sleep containing at least one REM were scored as phasic REM sleep, while epochs of REM sleep without any REM were scored as tonic REM sleep. In addition, epochs showing signs of arousal (body movements, increase in alpha oscillations, or oscillations above 16 Hz) in association with trial onsets were marked, and the corresponding trials were not included in the sleep analyses.

Supplementary Fig. 1 shows representative examples of these different sleep stages and Supplementary Table 1 summarizes sleep scoring across participants. In addition, the spectral profiles of sleep stages were in accordance with the literature (Supplementary Fig. 7b, c). Finally, in NREM sleep, slow waves and sleep spindles were detected using automated algorithms to perform quantitative analyses on the influence of these sleep patterns (see below). Spatial distributions of average densities are shown in Supplementary Fig. 8, which are again in accordance with the literature44, 62.

As the offline scoring was performed post hoc, in some cases the scoring of a given trial did not correspond to the RefRN list that was played at that time. This may be due to errors during the online assessment of vigilance states or to the participant suddenly transitioning to a different sleep stage. To avoid potential confounds, the offline scoring was used as a reference and the corresponding trials were discarded from the analyses of sleep recordings. On average, for the NREM list, this happened 2.5 ± 0.6 times (mean ± SEM) in wakefulness, 4.7 ± 0.8 in NREM1, and 5.5 ± 1.1 times in REM sleep (compared to 207.6 ± 10.0 RefRN trials on average in NREM2, and 144.7 ± 9.1 in NREM3). For the REM list, this happened 1.1 ± 0.3 times in wakefulness, 3.3 ± 0.9 in NREM1, 3.5 ± 0.9 times in NREM2, and never in NREM3 (compared to 80.8 ± 7.4 RefRN trials on average in REM sleep).

Identification and detection of sleep cycles and rhythms

Sleep cycles were individualized using participants’ hypnograms (97 cycles in 18 participants, 5.6 ± 0.2 per participant, mean ± SEM). As sleep cycles had different durations (86 ± 3.6 min), we normalized cycle length in order to average variables of interest across cycles (N = 18 bins). The progression within the cycles was therefore expressed as a percentage of the total duration (Fig. 7; Supplementary Fig. 9). Eighty-two (82) cycles in 18 participants were eventually included in the analysis, the others not having enough RefRN or RN trials (at least 20 trials per condition and per bin, see below). Two participants were not included in the sleep-cycle analysis due to the difficulty of clearly identifying sleep cycles.
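The normalization of cycle length can be sketched as a mapping from a time within a cycle to one of the 18 bins. The function name and the choice of equal-width bins, with the cycle end assigned to the last bin, are assumptions made for this example.

```python
def cycle_bin(t, cycle_start, cycle_end, n_bins=18):
    """Map a time point inside a sleep cycle to a bin index between 0 and n_bins - 1."""
    progress = (t - cycle_start) / (cycle_end - cycle_start)  # fraction of the cycle elapsed
    if not 0.0 <= progress <= 1.0:
        raise ValueError("t lies outside the cycle")
    # The cycle end (progress == 1.0) is assigned to the last bin.
    return min(int(progress * n_bins), n_bins - 1)
```

Because progress is expressed as a fraction of each cycle's own duration, cycles of different lengths contribute to the same bins and can be averaged.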

Slow waves and sleep spindles were detected in NREM sleep using algorithms that have been presented in detail elsewhere75, 76. For each slow wave, we extracted its onset, peak-to-peak amplitude (amplitude), down-to-up state slope (slope), number of negative peaks, and spatial expanse (i.e., for a given channel of reference, here Cz, and for each slow wave, the proportion of channels also showing a slow wave in a 100 ms window centered on the reference slow wave’s starting point). For each spindle, we computed its frequency by extracting the peak in power (estimated through a Fast-Fourier Transform, FFT) within a [11, 16] Hz window. Spindles with a frequency below 13 Hz were classified as slow spindles and spindles with a frequency above 13 Hz as fast spindles75. Scalp distributions of the densities of detected events are shown in Supplementary Fig. 8. It is worth noting that the slow-wave detection replicated well recent findings on the changes in slow-wave properties from light to deep NREM62. In particular, the density of slow waves and the number of negative peaks robustly increased during sleep cycles while their slope or spatial expanse decreased (Supplementary Fig. 9). As for the spindle detection, it replicated the known frontal distribution of slow spindles and centroparietal distribution of fast spindles77.

In order to compute the percentage of trials associated with slow waves, fast, and slow sleep spindles (Fig. 6b), we examined, for each trial in NREM2 and 3 stages, whether a slow wave or fast or slow sleep spindle was detected during the presentation of the stimulus. The channel used corresponded to the electrode with the highest density for the corresponding graphoelement (slow waves and slow spindles: Fz; fast spindles: Pz, green dot in Supplementary Fig. 8).

Electrophysiological indexes of perceptual learning

Electrophysiological (EEG) data was first high-pass filtered above 0.1 Hz (5th order two-pass Butterworth filter) and then epoched on large temporal windows ([−14, 14] s) around stimulus onsets. EEG data was then low-pass filtered below 20 Hz (5th order two-pass Butterworth filter), and a notch filter at 50 Hz was also applied to reduce line noise. A second epoching on shorter windows was performed ([−2, 7] s). Data was corrected for baseline activity after each epoching by subtracting prestimulus activity for each EEG derivation. Minimal artifact rejection was applied for the ERP (Figs. 2 and 4) and power analyses (Fig. 5): trials for which the maximal absolute value of the EEG signal in at least one of the central electrodes (C3, C4, and Cz) was higher than a given threshold (500 μV) were excluded from our analyses. We set a high threshold here to avoid discarding high-amplitude slow oscillations (slow waves, K complexes) as artifacts. Muscular artifacts were not corrected for by other means. It is worth noting that, in the sleep analyses, muscular activity and movements minimally impacted the EEG recordings, as trials associated with arousals were marked and discarded during the offline scoring. On average, 0.59 ± 0.34% of epochs were removed in NREM2, 0.84 ± 0.48% in NREM3, and 0.65 ± 0.67% of epochs in REM sleep (mean ± SEM across 20 participants). In this study, we focused mainly on central electrodes (C3, C4, and Cz in the 10–20 montage) as these electrodes show the largest responses to sounds35 and noise repetition30. All analyses were performed on these electrodes except when stated otherwise. When analyzing the data from C3, C4, and Cz together, MEPs, spectral power or ITPC were computed on each channel independently. The results of these analyses were then averaged across channels for each participant. The corresponding statistical analyses and plots therefore show brain activity averaged across central electrodes.
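The amplitude criterion used for artifact rejection can be written as a one-line test per trial. The data layout (a dictionary mapping channel names to lists of samples in μV) and the function name are assumptions made for this sketch.

```python
def keep_trial(epoch, threshold_uv=500.0, channels=("C3", "C4", "Cz")):
    """Return False if any central channel exceeds the absolute threshold (in microvolts)."""
    # epoch is assumed to map channel names to lists of samples in microvolts.
    return all(max(abs(v) for v in epoch[ch]) <= threshold_uv for ch in channels)
```

Only the central electrodes are inspected, so a large deflection on another channel does not by itself lead to the rejection of a trial.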

We focused on either stimulus-locked event-related potentials (ERPs; Supplementary Fig. 7) or target-locked MEPs (memory-evoked potentials: Figs. 2, 4). Stimulus-locked ERPs show EEG potentials triggered by the transition from silence to noise irrespective of experimental conditions. Indeed, we here focused on the late auditory-evoked potentials (AEPs35) occurring within the first 500 ms following stimulus onset and therefore before any presentation of a RefRN or RN target (starting at 800 ms). As classically observed, these AEPs present stereotypical and state-dependent profiles35.

We also computed target-locked MEPs. For target-locked MEPs, the EEG signal was high-pass filtered above 1 Hz instead of 0.1 Hz to remove slow drifts (as in ref. 30). A particularity of these MEPs is that they were computed within the stimulus presentation window. Because white noise is devoid of significant fluctuations in acoustic energy or salient perceptual landmarks that usually trigger ERPs (e.g., silence-to-noise transition in the case of AEPs), any deviation from the N condition for RefRN and RN trials can be interpreted as an indication that the brain had detected the presence of the repeated noise segment. We termed the ERPs associated with repeated targets ‘Memory-Evoked Potentials’ (MEPs) to emphasize the fact that they parallel perceptual learning30. Comparing AEPs and MEPs can provide a means to explore the neural mechanisms underlying MEPs and in particular whether they share common generators. Target-locked MEPs were first averaged from the 2nd to the last target (wake trials: 4 targets per trial; sleep trials: 9) for each trial and then averaged across trials for each participant and condition. A baseline correction (baseline: [−0.1, 0] s before target onset) was applied to each target.

Time–frequency decompositions were performed using the EEGlab toolbox74 on the EEG data epoched around stimulus onsets (Fig. 5). We employed the wavelet method. For a given scalp sensor, we obtained the decomposed signal S(t,f) for each time point (t) and frequency (f) in its complex representation:

$${S_{t,f}} = {A_{t,f}}{{\rm e}^{i\varphi_{t,f}}}$$ (5)

where A(t,f) reflects the amplitude of the EEG signal at a given frequency and time and φ(t,f) reflects its phase.

Power response for each condition and vigilance state was extracted from this time–frequency decomposition (Fig. 5). Power response was normalized by pre-stimulus onset activity ([−0.25, 0] s) and expressed on a log-scale as decibels.
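The baseline normalization of power can be sketched for a single frequency as follows. The function name is hypothetical, and two details are assumptions not stated in the text: power is taken as the squared amplitude A² of the decomposition, and the baseline is its mean over time points t with −0.25 ≤ t < 0 s.

```python
import math


def power_db(power, times, baseline=(-0.25, 0.0)):
    """Express power at one frequency in dB relative to its mean pre-stimulus value."""
    # Assumption: the baseline interval includes its start and excludes stimulus onset.
    base = [p for p, t in zip(power, times) if baseline[0] <= t < baseline[1]]
    mean_base = sum(base) / len(base)
    return [10.0 * math.log10(p / mean_base) for p in power]
```

On this scale, 0 dB means no change from baseline, +10 dB a tenfold increase in power and −10 dB a tenfold decrease.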

Inter-trial phase coherency (ITPC) was also computed using wavelets. We focused on a frequency band ([1.5, 3.5] Hz) around the target presentation rate (2 Hz, i.e., one target every 0.5 s) based on previous studies30, 31 and the pre-sleep phase (Supplementary Fig. 3). ITPC describes how reproducible the phase of the EEG signal is across trials for a given condition and participant. Thus, high ITPC values across participants indicate that each participant exhibited a reproducible phase for the corresponding time and frequency (for a given condition), even if the particular phase differed between participants. To compute ITPC, we extracted the phase of the signal for each time and frequency (φ(t,f)) and averaged it across n trials using Euler’s formula:

$$\mathrm{ITPC}_{t,f} = \sqrt{\left(\frac{1}{n}\sum_{n}\cos\left(\varphi_{t,f}\right)\right)^{2} + \left(\frac{1}{n}\sum_{n}\sin\left(\varphi_{t,f}\right)\right)^{2}}$$ (6)
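As an illustration, ITPC at one time point and frequency can be computed as the length of the average unit phase vector across trials. This sketch assumes the conventional normalization, in which the cosine and sine sums are each divided by n before squaring, so that ITPC lies between 0 and 1; the function name is hypothetical.

```python
import cmath


def itpc(phases):
    """Length of the mean unit phase vector across trials (phases in radians)."""
    n = len(phases)
    # exp(i*phi) = cos(phi) + i*sin(phi) (Euler's formula); the modulus of the mean
    # equals the square root of the squared mean cosine plus the squared mean sine.
    return abs(sum(cmath.exp(1j * phi) for phi in phases) / n)
```

Identical phases across trials give a value of 1, whereas phases spread evenly around the circle give a value close to 0.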

The presence of ERPs and the increase in ITPC are tightly linked: ERPs (and MEPs) lead to higher ITPC values as they have a reproducible shape across trials. We recently showed that the increase in ITPC associated with noise learning could be explained by the presence of MEPs30. However, ITPC has several advantages compared to ERPs: (i) it allows targeting a certain frequency range; (ii) contrary to ERPs, it is not affected by high-amplitude physiological events (such as slow waves) or artifacts; (iii) it can capture non-time-locked activity (see ref. 30 for a comparison between ERPs and ITPC in the context of the noise-memory paradigm). We therefore focused on ITPC to compute an EEG index of perceptual learning (Figs. 4b, 6 and 7).

Such an EEG index was particularly useful during sleep, when behavioral responses are abolished, preventing the computation of any behavioral index of learning. Based on our previous work and on Supplementary Fig. 3a showing an increase in ITPC for RefRN and RN trials around 2 Hz in the pre-sleep phase, we extracted the average ITPC on a [1.5, 3.5] Hz window and during stimulus presentation (pre-sleep phase and memory test: [0.8, 3.8] s; sleep phase: [0.8, 5.5] s). In the pre-sleep phase, the ITPC around 2 Hz was larger for RefRN compared to RN and correlated with behavioral performance (Fig. 2d). Thus, ITPC appeared here as a good proxy to assess the occurrence of perceptual learning and quantify it. ITPC was computed on C3, C4, and Cz channels separately and then averaged across these channels for each participant separately.

Lastly, as can be noted in Eq. 6, ITPC depends on the number n of trials on which it is computed. We kept this number identical across conditions: in the pre- and post-sleep phases, for each participant, the condition with the smallest number of trials was chosen as the reference and the same number of trials was randomly picked from the other, more numerous condition. During the night, ITPC was computed by dividing all sleep cycles into fixed windows of 20 stimulus presentations (either RefRN or RN). This was done either by cycle when focusing on the within-cycle dynamics (Fig. 7b-c) or over the entire night (Figs. 4b and 7d). These fixed windows were slid trial by trial. Thus, if a cycle (or a night) contained n RefRN trials, we obtained n − 19 ITPC values (which were then grouped into 18 bins for each cycle, Fig. 7b). When examining the entire sleep recordings, we pooled data across participants. As the number of windows differed between RefRN and RN trials, statistical tests are unpaired in Fig. 4b and we subtracted the average ITPC for RN trials from the ITPC values for RefRN trials in Fig. 7d.
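The two ways of holding n constant can be sketched as follows. The function names are hypothetical, the phases are assumed to be given for one time point and frequency, and ITPC is computed with the conventional normalization (values between 0 and 1).

```python
import cmath
import random


def itpc(phases):
    """Length of the mean unit phase vector across trials (phases in radians)."""
    return abs(sum(cmath.exp(1j * phi) for phi in phases) / len(phases))


def sliding_itpc(phases, window=20):
    """ITPC over windows of `window` consecutive trials, slid one trial at a time."""
    # n trials yield n - window + 1 values, i.e., n - 19 for windows of 20 trials.
    return [itpc(phases[i:i + window]) for i in range(len(phases) - window + 1)]


def equalize(trials_a, trials_b, rng):
    """Randomly subsample the larger condition to the size of the smaller one."""
    n = min(len(trials_a), len(trials_b))
    return rng.sample(trials_a, n), rng.sample(trials_b, n)
```

For instance, a cycle containing 50 RefRN trials yields 31 ITPC values, each computed on exactly 20 trials, and a cycle with fewer than 20 trials yields none.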

To obtain the power spectra displayed in Supplementary Fig. 7, we used a fast Fourier transform (FFT) and extracted the power for all trials together (and not per condition). We then averaged it in time across the entire epoch ([−2, 7] s).

Statistics

Parametric statistics were used (Student t-tests to compare pairs of variables, Pearson’s method for correlations) when variables could be approximated by a normal distribution (Kolmogorov–Smirnov test). Otherwise, we used nonparametric statistics (Wilcoxon rank-sum test (U-test) to compare conditions and Spearman’s method for correlations). All tests applied here were two-tailed. When comparing two distributions or a distribution with a reference value, we estimated the effect size using Hedges’ g 78.
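As an illustration of the effect-size estimate, the sketch below computes Hedges' g for two independent samples. It covers only this case (not the comparison with a reference value), and it uses the common approximation of the small-sample correction factor, 1 − 3/(4(n1 + n2) − 9); both choices are assumptions, since the text does not specify the variant that was used.

```python
from statistics import mean, variance


def hedges_g(x, y):
    """Hedges' g for two independent samples (approximate small-sample correction)."""
    n1, n2 = len(x), len(y)
    # Pooled variance from the unbiased sample variances of the two groups.
    pooled_var = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    cohens_d = (mean(x) - mean(y)) / pooled_var ** 0.5
    correction = 1.0 - 3.0 / (4.0 * (n1 + n2) - 9.0)
    return cohens_d * correction
```

The correction factor is always below 1 and approaches 1 as the samples grow, so g is slightly smaller in magnitude than the uncorrected standardized difference.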

For Fig. 3c, two data points were detected as outliers when using an algorithm based on the ‘median absolute deviation’ method (see ref. 43 and the ‘robust correlation’ toolbox for Matlab). We therefore also reported the correlation coefficients when excluding these two data points in the Results section. However, as these correlation coefficients were not obtained when including all participants, they should be considered with caution.

For the null results illustrated in Fig. 3a, a Bayesian method (Bayes factors) was used to test the plausibility of the null hypothesis34. These Bayes factors are reported in the text. A Bayes factor between 3 and 20 is usually considered positive evidence for the null hypothesis, while a Bayes factor between 20 and 150 reflects strong evidence for the null hypothesis79.

We also used a stepwise regression analysis (with forward selection) to examine the respective influence of NREM and REM sleep substages on the learning effects observed upon awakening. The aim here was to better assess the impact of sleep stages on perceptual learning while taking into account the fact that the numbers of trials in these sleep stages are not independent of each other.

Statistics used for time and time–frequency plots were corrected for multiple comparisons by means of a cluster-permutation approach80. The rationale is the following: each cluster consisted of the samples (in a 1D (time plots) or 2D (time–frequency) space) that consecutively passed a specific threshold (here, P < 0.05 except for Fig. 7c where P < 0.01). The cluster statistic was defined as the sum of the t-values of all the samples within the cluster. Then, we compared the statistic of each cluster with the maximum cluster statistics of 1000 random permutations and obtained a nonparametric P-value (Pcluster). Significant clusters are displayed as horizontal bars or contours on plots; Pcluster values are reported in the text and figure legends.
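The logic of the cluster-permutation approach can be sketched for the 1D (time) case. The sketch makes several assumptions that are not specified in the text: a paired design summarized by per-participant difference time series, permutations obtained by randomly flipping the sign of each participant's series, a t-threshold supplied by the caller (e.g., 2.093 for a two-tailed P < 0.05 with 19 degrees of freedom), and a P-value computed as (1 + count)/(1 + number of permutations). All function names are hypothetical.

```python
import math
import random


def t_values(diffs):
    """One-sample t-value at each time point; diffs holds one series per participant."""
    n = len(diffs)
    out = []
    for samples in zip(*diffs):
        m = sum(samples) / n
        var = sum((v - m) ** 2 for v in samples) / (n - 1)
        out.append(m / math.sqrt(var / n) if var > 0 else 0.0)
    return out


def cluster_sums(tvals, t_threshold):
    """Sum of t-values over each run of consecutive supra-threshold samples of one sign."""
    sums, current, sign = [], 0.0, 0
    for t in tvals:
        s = (t > t_threshold) - (t < -t_threshold)  # +1, -1, or 0
        if s != 0 and s == sign:
            current += t  # the current cluster continues
        else:
            if sign != 0:
                sums.append(current)  # close the previous cluster
            current, sign = (t, s) if s != 0 else (0.0, 0)
    if sign != 0:
        sums.append(current)
    return sums


def cluster_permutation_test(diffs, t_threshold, n_perm=1000, seed=0):
    """Return the observed cluster statistics and their permutation P-values."""
    rng = random.Random(seed)
    observed = cluster_sums(t_values(diffs), t_threshold)
    null_max = []
    for _ in range(n_perm):
        flipped = []
        for series in diffs:
            sign = rng.choice((-1.0, 1.0))  # flip the whole series of one participant
            flipped.append([sign * v for v in series])
        sums = cluster_sums(t_values(flipped), t_threshold)
        null_max.append(max((abs(s) for s in sums), default=0.0))
    p_values = [(1 + sum(m >= abs(c) for m in null_max)) / (1 + n_perm)
                for c in observed]
    return observed, p_values
```

Comparing each observed cluster with the distribution of the maximum cluster statistic across permutations is what controls for multiple comparisons over time points. The 2D (time–frequency) case follows the same logic, with clusters defined over neighboring time–frequency samples.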

When computing ITPC on small windows throughout the entire sleep recordings (Figs. 4b and 7d) or across sleep cycles (Fig. 7b), we used mixed-effect models to take into consideration the trial-wise and subject-wise variances separately. Subject identity was considered as a random effect. Mixed-model analyses were performed in R (R Development Core Team) with the ‘lme4’ and ‘lmerTest’ R packages. In Figs. 4b and 7d, we examined the influence of stimulus condition and sleep stages on ITPC values. We estimated the significance of the interaction between these two variables by comparing a model including only fixed effects with a model including fixed effects and their interaction. In Fig. 7b, to test the influence of δ-power on ITPC, we compared a model including δ-power as a predictor with a model considering only subject identity as a random factor. All model comparisons were performed with chi-square (χ²) tests. The corresponding χ² and P-values are reported in the text.

Data availability

All the relevant data is available upon reasonable request. Inquiries should be directed to the corresponding author.