Patients and recordings

The data were collected in 20 recording sessions in 9 patients (mean age 39, range 19–50, 5 females) with pharmacologically intractable epilepsy. Extensive noninvasive monitoring did not yield concordant data corresponding to a single resectable epileptogenic focus. The patients were therefore implanted with chronic depth electrodes for 7–10 days to determine the seizure focus for possible surgical resection31. Electrode locations were based exclusively on clinical criteria and were verified by postimplant computed tomography (CT) coregistered to preimplant magnetic resonance imaging (MRI). Each electrode consisted of a flexible polyurethane probe containing nine 40-μm platinum–iridium microwires protruding ~4 mm into the tissue beyond the tip of the probe. Eight microwires were active recording channels and were referenced to the ninth, lower-impedance microwire. The differential signal from the microwires was amplified using a 128-channel Blackrock™ system, filtered between 0.3 and 7500 Hz and sampled at 30 kHz. Here we report data from sites in the hippocampus, amygdala, entorhinal cortex, parahippocampal gyrus, anterior cingulate cortex, presupplementary motor area and supplementary motor area. In these areas, 33 ± 19 units were recorded per session. All sessions were conducted at the patient’s bedside in a quiet setting. All studies conformed to the guidelines of the Medical Institutional Review Boards at University of California at Los Angeles and Tel-Aviv Sourasky Medical Center, and all patients provided written consent.

Unit identification and spike sorting

Spike detection and sorting were applied to the continuous recordings using the well-established “wave-clus” software75. Briefly, extracellular microwire recordings were high-pass filtered above 300 Hz, a threshold was set at 5 SD above the median noise level, and detected events were sorted using superparamagnetic clustering. After sorting, the clusters were classified as noise, single-unit or multi-unit based on: (i) the spike shape and its variance; (ii) the ratio between the spike peak value and the noise level; (iii) the interspike interval (ISI) distribution of each cluster; and (iv) the presence of a refractory period for the single units, i.e., fewer than 1% of spikes with an ISI below 3 ms32. Forty-one percent of the units were classified as single units (see Supplementary Table S1). On average, 1.4 ± 0.6 units were identified per wire. We computed several spike sorting quality measures for all identified units (Supplementary Fig. 2): (a) the percentage of ISIs below 3 ms was 0.84% ± 1.83% (0.35% ± 0.42% for units classified as single units); (b) the ratio between the peak-to-peak amplitude of the mean waveform of each cluster and the SD of the residuals was 7.9 ± 3.3 (SNR76; 10.1 ± 3.7 for single units); (c) the median L-ratio77, which is the amount of contamination of a given cluster based on the Mahalanobis distance of spikes not included in the cluster from the center of the cluster, divided by the total number of spikes in the cluster, was 0.03 (SD = 0.69).
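The refractory-period criterion (iv) and quality measure (a) can be illustrated with a minimal sketch. The function names and the representation of a cluster as a list of spike times in milliseconds are assumptions made for illustration and are not taken from the wave-clus software.

```python
def isi_violation_percent(spike_times_ms, refractory_ms=3.0):
    """Percentage of interspike intervals (ISIs) shorter than refractory_ms."""
    times = sorted(spike_times_ms)
    isis = [later - earlier for earlier, later in zip(times, times[1:])]
    if not isis:
        return 0.0  # fewer than two spikes: no ISIs to evaluate
    violations = sum(1 for isi in isis if isi < refractory_ms)
    return 100.0 * violations / len(isis)


def passes_refractory_criterion(spike_times_ms, max_percent=1.0):
    """True if fewer than max_percent of the ISIs fall below 3 ms."""
    return isi_violation_percent(spike_times_ms) < max_percent
```

For example, a cluster with spikes at 0, 10, 12, 30 and 50 ms has one ISI of 2 ms out of four ISIs, so 25% of its ISIs violate the refractory period and it would not qualify as a single unit under this criterion.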

Stimuli and procedure

Visual stimuli were generated in MATLAB with the Psychophysics Toolbox extension78, running on a 15-inch Apple MacBook Pro laptop. Stimuli were presented on the laptop screen at 60 Hz with a screen resolution of 1440 × 900 pixels. Responses were collected using a Logitech F310 gamepad.

Selectivity screening session

In a first recording session, usually conducted early in the morning, a large number of images (107 ± 25) of famous people, landmarks, animals, objects and family members were presented to the patient. This set was composed based on each patient’s preferences. Images of 300 × 300 pixels (5° visual angle) were presented for 1 s, followed by a blank screen of 0.5–1 s, and repeated 6–8 times in a pseudorandom order, while patients were engaged in simple discrimination tasks (e.g., person/other, building/other, man/woman). Images that elicited the strongest responses in the screening session (using the same procedure as in ref. 38) were selected for use in the binocular rivalry (BR) session that took place a few hours later. Data from the selectivity screening session are not presented here. Importantly, the BR session (see below) included a non-rivalrous condition in which images were presented normally to both eyes. Units were selected for rivalry/replay analysis based on the results of this condition and not on the screening session results.

Binocular presentation

During the BR session, the visual stimulation to the left and right eyes was independently controlled using one of the following two methods79:

(1) Red-blue goggles (n = 6 patients): The two visual streams were presented in either red or blue at the center of the screen. Each lens passes only one of the streams, so that the two streams fall on corresponding retinal locations of the two eyes.

(2) Mirror stereoscope (n = 3 patients: patients 2–4 in Supplementary Table 1; one of the five sessions of patient 4 was with red-blue goggles): Patients viewed the screen via an adjustable mirror stereoscope (SA200LT, Stereo Aids, Australia; www.stereoaids.com.au). The visual streams of the left and right eyes were presented at different horizontal locations on the screen and were projected onto corresponding locations of the two retinae using the stereoscope.

Identical vergence cues (black and white dashed frames around the images, a 440 × 440 pixel cross (7° visual angle) behind the frames, and a 20 × 20 pixel fixation cross (0.3°) at the center of the images) were presented to both eyes, starting 1 s before the beginning of each condition of the BR session and continuing throughout it, to ensure binocular fusion.

Note that the mirror stereoscope, used in the first three patients, was cumbersome to use in the clinical setting. We therefore switched to red-blue goggles, which were more convenient and familiar to patients. Notably, the separation between the two eyes might be incomplete with the red-blue goggles, but this would only make it more difficult to detect changes in firing locked to the perceptual transitions.

Binocular rivalry session

Depending on the time available with the patient, a variable number of image pairs (M = 2.70 ± 0.73) was used in each session. Images that elicited the strongest responses in the screening session were usually paired with images that did not elicit a response in the same units. Before recording, patients were carefully instructed in the details of the task and were trained with a demo of rivalry, in which transitions physically occurred on the screen, so that the experimenter could confirm that they performed the task correctly. Patients then wore red-blue goggles (or viewed the screen via a stereoscope; see “Binocular presentation” section above) and completed one or more repetitions of the following three conditions for each pair of images:

Non-rivalrous condition: A slide informing the patient of the assignment of one button (either the left or the right arrow of the gamepad) to each of the two images was presented. Patients were asked to press the assigned button for each image appearing on the screen. Each of the two images (5° visual angle) was presented binocularly to both eyes (hence no binocular conflict; Fig. 1a) 8–10 times, for 1 s or longer, until a correct response was made. The order of presentation was pseudorandom, and 1-s blanks were interleaved between images. To ensure correct button assignment, the non-rivalrous condition was stopped only after 8–10 successive correct responses to each image in the non-rivalrous condition that preceded the first rivalry block, and after four correct responses in subsequent blocks.

Rivalry condition: The two images (5° visual angle) were presented simultaneously, one to each eye (see “Binocular presentation” section above), for either 90 or 120 s. This type of presentation creates BR, in which each image dominates conscious perception for a certain period while the other image is perceptually suppressed. These dominance and suppression periods reverse irregularly, interleaved with periods of mixed percept (transition/piecemeal periods; Fig. 1b). Note that the physical stimulus is constant; hence the perceptual transitions are internally driven. Patients were asked to report the perceptual dominance onset of each image by pressing and holding the button assigned to that image, and to release that button as soon as something in the image started to change (dominance offset/transition onset11). This reporting scheme provided four behavioral events: image1 dominance onset (button A press); image2 dominance onset (button B press); emergence of image1 = image2 dominance offset = image2→image1 transition onset (button B release); and emergence of image2 = image1 dominance offset = image1→image2 transition onset (button A release). To avoid unbalanced durations due to ocular dominance, the two images were switched between the eyes in the middle of the rivalry condition, by linearly increasing the transparency of the current image in each eye from zero to 100% while linearly decreasing the transparency of the other image from 100% to zero over the course of 1 s. Additionally, two catch trials, in which the same image was presented to both eyes for 1 s, were included. The transparency of one image was linearly ramped up to 100% while the transparency of the other image was ramped down to zero over the course of 1 s before and after the catch trial. Eye-switching and catch-trial periods, as well as the 1 s following these periods, were not included in the rivalry analysis.
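The mapping from button reports to the four behavioral events can be sketched as follows. This is an illustrative reconstruction only: the log format (tuples of time in seconds, button label and action) and the function names are assumptions, not the authors' code.

```python
def label_event(button, action):
    """Map one button report to the behavioral event described in the text."""
    if button not in ("A", "B") or action not in ("press", "release"):
        raise ValueError("button must be 'A' or 'B'; action 'press' or 'release'")
    image = 1 if button == "A" else 2   # button A is assigned to image1
    other = 2 if image == 1 else 1
    if action == "press":
        return f"image{image} dominance onset"
    # Releasing a button marks the offset of that image's dominance,
    # i.e. the onset of the transition towards the other image.
    return f"image{image}->image{other} transition onset"


def dominance_periods(log):
    """Return (image, start_s, end_s) for every press/release pair in the log."""
    periods = []
    pressed_at = {}
    for time_s, button, action in sorted(log):
        if action == "press":
            pressed_at[button] = time_s
        elif action == "release" and button in pressed_at:
            image = 1 if button == "A" else 2
            periods.append((image, pressed_at.pop(button), time_s))
    return periods
```

With this representation, the interval between a release and the next press is a transition/piecemeal period, and a release followed by a press of the same button corresponds to an incomplete transition.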

Replay condition: The four types of reports from the rivalry condition were used to create a matched-duration replay condition that immediately followed the rivalry condition (Fig. 1c). In the replay condition, the same stimulus was always presented to both eyes. During dominance periods of each image (defined as the time between the press and release of the corresponding button), that image was presented to both eyes. During transition/piecemeal periods (defined as the time between the release of one button and the press of the other button), the transparency of the previous image was linearly ramped up to 100% while the transparency of the next image was linearly ramped down to zero in both eyes. During incomplete transition periods, in which a button release was followed by a press of the same button, the transparency of the dominant image was linearly ramped up to 50% while the transparency of the other image was linearly ramped down to 50% over the first half of that piecemeal period; the transparency of the dominant image was then ramped back down to zero while the transparency of the other image was ramped back up to 100% over the second half of that period. The replay stimulation was designed to externally generate a percept that would closely mimic rivalry, thus resulting in a matched sequence of motor responses in the two conditions (see, for example, Fig. 1d). The patients’ task was identical, and patients were not informed that this condition was different from the rivalry condition. The replay condition was included for only one (usually the first) or two repetitions of the rivalry condition.
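The transparency ramps used during replay transitions can be written as a function of the elapsed fraction of the transition period. The sketch below is a minimal illustration under the assumption that both ramps are exactly linear and complementary; the function name and its interface are hypothetical.

```python
def replay_transparency(frac, complete=True):
    """Transparency (in %) of the outgoing and the other image during a replayed transition.

    frac is the elapsed fraction (0 to 1) of the transition/piecemeal period.
    Returns (outgoing_image_transparency, other_image_transparency).
    """
    if not 0.0 <= frac <= 1.0:
        raise ValueError("frac must lie between 0 and 1")
    if complete:
        # Complete transition: the previous image fades out entirely.
        outgoing = 100.0 * frac
    elif frac <= 0.5:
        # Incomplete transition, first half: dominant image fades to 50%.
        outgoing = 100.0 * frac
    else:
        # Incomplete transition, second half: dominant image returns to 0%.
        outgoing = 100.0 * (1.0 - frac)
    return outgoing, 100.0 - outgoing
```

At the midpoint of an incomplete transition both images are at 50% transparency, and by the end of the period the originally dominant image is fully visible again.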

Selectivity analysis

The data from the non-rivalrous condition were used to define selective responses in MTL units. For each image, FR was calculated in three 250-ms bins starting 150 ms after image onset and ending 100 ms before image offset. FR outliers (more than 2 standard deviations above or below the mean) were discarded. For each bin, the FRs in all trials were compared to the FRs in a 250-ms window before all non-rivalrous image presentations (baseline FR) by means of a Mann-Whitney U test, using the Simes correction for multiple comparisons39 and applying a conservative significance threshold of p = 0.005 (ref. 38). This procedure identified both positive (i.e., increases above baseline FR; 72% of responses) and negative (i.e., decreases below baseline FR; 28% of responses) responses. Only MTL units that responded to one or more of the images in the non-rivalrous condition were included in the subsequent activity onset analysis. For each image pair, if a unit responded to only one of the images, this image was considered the “preferred image” for this unit, while the second image was the “non-preferred image”. If a unit responded to both images, the image that elicited the response with the lower p-value, or with the same p-value but in more bins, was considered the “preferred image”, and the other one was considered “non-preferred”.
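As an illustration of the multiple-comparison step, the following sketch applies the standard Simes procedure to the p-values of the three bins. It assumes that the per-bin Mann-Whitney p-values have already been computed and that the Simes-combined p-value is compared against the 0.005 threshold; the exact way the correction was applied in the original analysis is not specified here, and the function names are hypothetical.

```python
def simes_p(p_values):
    """Simes-combined p-value: min over ranks i of n * p_(i) / i."""
    n = len(p_values)
    if n == 0:
        raise ValueError("at least one p-value is required")
    ordered = sorted(p_values)
    return min(n * p / rank for rank, p in enumerate(ordered, start=1))


def is_responsive(bin_p_values, alpha=0.005):
    """True if the bins jointly indicate a response at level alpha (assumed rule)."""
    return simes_p(bin_p_values) <= alpha
```

For instance, bin p-values of 0.001, 0.2 and 0.5 give a combined value of 0.003 and would count as a response, whereas 0.004, 0.5 and 0.9 give 0.012 and would not.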

Critical events for MTL and frontal units

Two critical events were analyzed for each unit: perceptual transition onset (button release) during rivalry and during replay. For MTL units, this pertained to the preferred image only and was analyzed separately for each image pair, whereas for frontal units, which showed no selectivity to specific images, all transition events were analyzed together, across all images of all pairs. The “Individual unit activity onset analysis” (see below) focused on these events. Note that all transition onsets to the unit’s preferred image, whether complete or incomplete (i.e., not leading to full dominance; 24%), were taken into account. For MTL units, only units that were responsive in the non-rivalrous condition were included. For frontal units, all recorded units were subject to this analysis. Supplementary Figs. 12 and 13 show the same analysis for perceptual dominance onset events (button press).

Individual unit activity onset analysis

To identify activity that significantly differs from baseline and to determine its onset, a permutation test was conducted with cluster-based multiple comparison correction across time points80. At each time point in the [−1500, 1000 ms] window around the critical event, the instantaneous FR (iFR; calculated with a sliding square window of 200 ms) was compared to the baseline FR (mean iFR during the [−3000, −2000 ms] window; the first 100 ms were discarded due to the edge effect of the square window smoothing) over trials, using a paired two-tailed t-test. Temporal clusters of significant activity were defined as consecutive significant time points (p < 0.05) with a maximal gap of 100 ms, and were assigned a cluster-level statistic corresponding to the sum of the t-values of the time points belonging to that cluster (t-total). The distribution of maximal absolute cluster-level statistics obtained by chance was estimated by repeating the analysis 1000 times, each time randomly rearranging the spikes in a [−3000, 3000 ms] window of each trial while preserving the ISI distribution of that trial. Only clusters with an absolute t-total in the top 5% of this distribution (p < 0.05) were considered significant. This method allowed us both to detect positive or negative activity (i.e., a significant increase or decrease relative to baseline FR, respectively) and to measure its onset. Of these significant clusters, only the one closest to the critical event (t = 0) was included in subsequent analyses. Clusters that started more than 500 ms after the critical event were discarded. Note that the baseline time window was not completely clean: for 15% of the perceptual transition trials, the [−3000, −2000 ms] time window included the previous dominance onset event, and thus might also have included significant changes in FR. This results from the continuous nature of the paradigm and, critically, only makes it more difficult to detect an effect.
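The cluster-forming step can be sketched as below. This is a minimal illustration, assuming that the per-time-point t-values and significance flags are already available and that the "maximal gap" is measured between successive significant time points; the function name and the tuple layout are hypothetical.

```python
def find_clusters(times_ms, t_values, significant, max_gap_ms=100.0):
    """Group significant time points into temporal clusters.

    Returns a list of (onset_ms, offset_ms, t_total) tuples, where t_total is
    the sum of the t-values of the significant time points in the cluster.
    """
    clusters = []
    current = None  # [onset, time of last significant point, running t-total]
    for time, t_value, is_sig in zip(times_ms, t_values, significant):
        if not is_sig:
            continue
        if current is not None and time - current[1] <= max_gap_ms:
            # Close enough to the previous significant point: extend cluster.
            current[1] = time
            current[2] += t_value
        else:
            if current is not None:
                clusters.append(tuple(current))
            current = [time, time, t_value]
    if current is not None:
        clusters.append(tuple(current))
    return clusters
```

In the permutation step, the same function would be applied to each surrogate data set, and the largest absolute t-total per surrogate would be collected to form the chance distribution against which the observed clusters are compared.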

Population-level activity onset analysis

The above units were further analyzed at the population level. Here, instead of conducting a t-test over trials at each time point, we averaged the iFR across trials for each trace (a unit’s response to a certain image pair) and conducted a t-test over traces at each time point. As different traces had different baseline FRs and either positive or negative activations (i.e., increased or decreased FR relative to baseline FR), we looked at the absolute iFR percent change relative to baseline. To do so, we normalized the iFR time course to the 0–1 range and inverted the time course (1 − time course) of traces that had a negative t-total in the individual unit activity onset analysis. The iFR percent change was calculated as the ratio between the difference (iFR − baseline FR) and the baseline FR. The permuted data (n = 1000) were generated by separately shuffling the spikes of each unit and each trial while preserving the ISI distribution. Traces for which the iFR time course was inverted in the real data were also inverted in the permuted data. Three units with FR lower than 0.5 Hz were discarded from the analysis.
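One way to generate surrogate spike trains that preserve the ISI distribution of a trial is to shuffle the order of its ISIs. The sketch below is an assumption about how such a shuffle could be implemented, not a description of the authors' code: it keeps the first spike time fixed and rebuilds the train from the permuted ISIs, so the last spike time is also preserved.

```python
import random


def shuffle_isis(spike_times_ms, rng=None):
    """Return a surrogate spike train with the same ISIs in random order."""
    if rng is None:
        rng = random.Random()
    times = sorted(spike_times_ms)
    if len(times) < 3:
        return list(times)  # zero or one ISI: nothing to rearrange
    isis = [later - earlier for earlier, later in zip(times, times[1:])]
    rng.shuffle(isis)
    surrogate = [times[0]]  # assumption: the first spike stays in place
    for isi in isis:
        surrogate.append(surrogate[-1] + isi)
    return surrogate
```

Repeating this for every trial of every unit yields one permuted data set; the text above uses 1000 such sets.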

Bayes factor analysis

We calculated the Bayes factor (BF), defined as the ratio of the probability of observing the data given H1 to the probability of observing the data given H0, using JASP (Version 0.8.5; JASP team, 2017). We adopted the convention that a BF less than 0.1 implies strong evidence for lack of an effect (that is, the data are more than ten times more likely to be observed given H0 than given H1), a BF between 0.1 and 0.33 provides moderate evidence for lack of an effect, a BF between 0.33 and 3 suggests insensitivity of the data, a BF between 3 and 10 denotes moderate evidence for the presence of an effect (i.e., H1), a BF greater than 10 implies strong evidence and a BF greater than 100 suggests extreme evidence for the presence of an effect37.
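The interpretation convention can be expressed as a simple lookup. The sketch assumes that the BF expresses evidence for H1 relative to H0, as the thresholds imply; the handling of values that fall exactly on a boundary is an arbitrary choice made for illustration, and the function name is hypothetical.

```python
def interpret_bf(bf10):
    """Translate a Bayes factor (H1 relative to H0) into a verbal category."""
    if bf10 <= 0:
        raise ValueError("a Bayes factor must be positive")
    if bf10 < 0.1:
        return "strong evidence for lack of an effect"
    if bf10 < 0.33:
        return "moderate evidence for lack of an effect"
    if bf10 <= 3:
        return "insensitive data"
    if bf10 <= 10:
        return "moderate evidence for an effect"
    if bf10 <= 100:
        return "strong evidence for an effect"
    return "extreme evidence for an effect"
```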

Automatic trial-by-trial response detection analysis

Trial-by-trial activity onset times relative to the perceptual transition reports were determined by Poisson spike train analysis (Hanes et al., 1995)81. For this procedure, the ISIs of a given unit are processed continuously over the [−1500, +1500 ms] window around the report, and the onset of a spike train is detected based on its deviation from a baseline exponential distribution of ISIs (i.e., an exponential distribution with λ = 1/mean FR in the [−3000, +3000 ms] window across all trials). Only spike train onsets earlier than 0 ms (i.e., earlier than the motor response) were considered. The activity onset time in rivalry and in replay was determined for each trace as the median onset time for that condition, with the constraint that a significant activity onset was detected in at least 40% of the condition’s trials for this trace. An example of this analysis appears in Supplementary Fig. 4a. Note that this analysis can only detect increases in FR relative to baseline, and therefore was not used on traces with a decreased FR response.
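The aggregation of single-trial onsets into one onset per trace and condition can be sketched as follows. The Poisson spike train detection itself is not reproduced here; the sketch assumes that it has already produced one onset time per trial, with None marking trials in which no onset was detected. The function name is hypothetical.

```python
from statistics import median


def condition_onset(trial_onsets_ms, min_fraction=0.4):
    """Median pre-report onset across trials, or None if detected too rarely.

    trial_onsets_ms holds one entry per trial: the detected onset in ms
    relative to the report, or None when no onset was detected.
    """
    if not trial_onsets_ms:
        return None
    # Keep only onsets that precede the motor response (t < 0).
    detected = [t for t in trial_onsets_ms if t is not None and t < 0]
    if len(detected) < min_fraction * len(trial_onsets_ms):
        return None  # onset found in fewer than 40% of the trials
    return median(detected)
```

For example, onsets of −400, −250 and −600 ms detected in three of five trials give a trace onset of −400 ms, whereas a single detection in four trials yields no onset for that trace.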

Bootstrapping analysis

The bootstrapping analysis was aimed at assessing the robustness of the effects found in the population-level activity onset analysis. The analysis was repeated 10,000 times; in each iteration, a random sample of 36/26 traces (corresponding to the number of MTL active traces in rivalry/replay) was selected with replacement (allowing repetition). The population permutation test described above was conducted for each sample, and the onset was determined for both replay and rivalry. The bootstrapped onset distributions were plotted (Supplementary Fig. 9), and their means were calculated. Finally, we calculated the probability of obtaining the actual rivalry onset under the bootstrapped replay distribution, and the probability of obtaining the actual replay onset under the bootstrapped rivalry distribution.
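The resampling scheme can be illustrated with a short sketch. Here `onset_fn` is a hypothetical placeholder for the population permutation test that returns an onset for a sample of traces, and `tail_probability` is only one possible way (a one-sided empirical fraction) of quantifying how likely an observed onset is under a bootstrapped distribution; neither is taken from the original analysis code.

```python
import random


def bootstrap_onsets(traces, onset_fn, n_iter=10_000, rng=None):
    """Resample traces with replacement and collect the onset of each sample."""
    if rng is None:
        rng = random.Random()
    onsets = []
    for _ in range(n_iter):
        # Same sample size as the original set of traces, repetition allowed.
        sample = rng.choices(traces, k=len(traces))
        onset = onset_fn(sample)
        if onset is not None:  # some samples may yield no significant onset
            onsets.append(onset)
    return onsets


def tail_probability(distribution, observed):
    """Fraction of bootstrapped values at or below the observed value."""
    if not distribution:
        raise ValueError("distribution must not be empty")
    return sum(1 for value in distribution if value <= observed) / len(distribution)
```

In use, the rivalry and replay traces would each be bootstrapped separately, and the observed onset of one condition would be evaluated against the bootstrapped distribution of the other.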

Data availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.