For the behavioural analysis, we compared mean RTs for correct responses, error rates and post-error slowing (calculated as the difference in RT between post-error and post-correct trials) between male and female subjects. To this end, we calculated three regression analyses as well as a control analysis (see Methods), and we report Bonferroni-corrected p-values for 14 tests together with 99.9% confidence intervals (CI). All of these analyses included age as a separate factor as well as other regressors to increase the specificity of the observed effects (see below). Results of overall task performance are reported in the Methods and Supplemental Material sections.
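As a minimal sketch of the correction named above: a Bonferroni-corrected p-value can be obtained by multiplying the uncorrected value by the number of tests and capping the result at 1, which would be consistent with several effects being reported as p = 1. The function name and the example p-values are illustrative assumptions and are not taken from the original analysis code.

```python
def bonferroni(p_uncorrected, n_tests=14):
    """Bonferroni-corrected p-value, capped at 1 (n_tests=14 as stated in the text)."""
    return min(1.0, p_uncorrected * n_tests)

# Hypothetical uncorrected p-values, for illustration only.
for p in (0.0004, 0.03, 0.5):
    print(p, "->", bonferroni(p))
```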

Error Rate and RT

First, we found that the total number of errors committed in the task was not modulated by the factor sex (b = 0.04, p = 1, 99.9% CI = −0.19–0.27); males committed on average 155 (14.4%) and females 153 (14.2%) errors. For general RT, we included gender, age and the number of errors in a linear regression model. This revealed that male subjects responded on average around 16 ms faster on correct trials than female subjects (Fig. 1, b = 0.42, CI = 0.20–0.64, p = 7.39 × 10⁻⁹). Participants’ age had no effect on RT (b = −0.06, CI = −0.17–0.05, p = 1). Subjects who made fewer errors also responded more slowly on correct trials (b = −0.31, CI = −0.41– −0.21, p = 9.63 × 10⁻²¹), indicative of individual differences in emphasizing speed or accuracy. Note that we excluded all post-error trials from this analysis so as not to confound the results with error-induced RT changes. On errors themselves, male subjects again responded faster (ΔRT = 9 ms, b = 0.24, CI = 0.01–0.46, p = 0.009).

Figure 1 Sex Effects on Reaction Time and Post-Error Slowing. (A) shows the regression weights of factors sex, age, number of errors on correct trials’ RT revealing a significant effect for factor sex using multiple linear regression analysis. (B) RT broken down by factor sex showing that male subjects responded on average ~16 ms faster than female subjects. (A,C) Participants who committed fewer mistakes also responded slower indicating a speed-accuracy trade-off. There was no effect of age on RT (A). (D) Female participants displayed significantly higher post-error slowing, which corresponds to an increase in RT of 20 ms compared to male subjects (E). The general RT also had a small effect on post-error slowing and participants who responded slower displayed higher post-error slowing (F). Note that for the analysis presented in (A) all correct trials following errors were excluded and thus the higher RT seen in female subjects cannot be explained by the post-error slowing effect. (A,D) display regression weights while (B,C,E,F) display raw values. Error bars = 99.9% CI, * = p < 0.05, ** = p < 10⁻⁴, *** = p < 10⁻⁸ following Bonferroni correction.

Post-Error Slowing

For post-error slowing, we included age, the number of errors, as well as the mean RT in the model in order to investigate whether sex had an effect on post-error slowing over and above the observed difference in RT. We found that the extent to which female subjects slowed their responses following mistakes was significantly larger than in male subjects (b = 0.47, CI = 0.24–0.70, p = 4.29 × 10⁻¹⁰). This corresponded to a post-error slowing increase of 20 ms or 42% compared to male subjects (Fig. 1D,E). Neither age (b = 0.03, CI = −0.15–0.08, p = 1) nor the number of errors committed (b = −0.06, CI = −0.14–0.08, p = 1) had an effect on post-error slowing. However, subjects with generally higher RT also showed higher post-error slowing (b = 0.14, CI = 0.03–0.26, p = 0.0006). We furthermore conducted a control analysis by calculating post-error slowing with respect to the error-preceding trial [35], which accounts for possible general shifts in attention during the task. However, this did not qualitatively alter results (see Methods and Supplemental Material). We also normalized PES by each subject’s RT on correct trials by dividing the PES measure by the mean standard trials’ RT [36] to account for differences in general RTs. Again, results remained qualitatively unchanged, demonstrating a larger RT increase in women (16.1% ± 1.0%) compared to men (11.9% ± 0.9%, b = 0.47, CI = 0.23–0.70, p = 1.05 × 10⁻⁹). Furthermore, an exploratory analysis of sex effects in post-error differences in accuracy revealed no sex effect on post-error increases in accuracy (PIA; corrected p = 1).
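The post-error slowing measure and its normalized variant can be sketched as follows. This is only an illustration under stated assumptions: it compares correct responses that follow an error with correct responses that follow a correct response, and it uses the mean post-correct RT as the normalization baseline. The function name and the trial sequence are hypothetical, and the exact trial selection used in the study is described in the Methods.

```python
from statistics import mean

def post_error_slowing(rts, correct):
    """Return (PES in ms, PES as a fraction of post-correct RT) for one subject.

    Assumption: only correct responses are compared, split by whether the
    preceding trial was an error or a correct response.
    """
    post_error, post_correct = [], []
    for i in range(1, len(rts)):
        if not correct[i]:
            continue  # error trials themselves are not part of either set
        (post_correct if correct[i - 1] else post_error).append(rts[i])
    baseline = mean(post_correct)
    pes = mean(post_error) - baseline
    return pes, pes / baseline

# Hypothetical trial sequence (RT in ms, True = correct response).
rts = [400, 410, 380, 460, 405, 395, 370, 450, 400]
correct = [True, True, False, True, True, True, False, True, True]
pes_ms, pes_rel = post_error_slowing(rts, correct)
print(f"PES = {pes_ms:.1f} ms ({100 * pes_rel:.1f}% of baseline RT)")
```

Expressing PES relative to the baseline RT is what allows the comparison between groups that differ in overall response speed.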

Analysis of Distractibility

As it has been reported that women are more distracted by irrelevant and conflicting task information, we compared RT increases induced by the congruence of the presented stimuli. Incongruent trials led to higher RT across subjects (ΔRT = +62 ms, t(873) = 121.5, p = 0 within machine precision). We then analysed the difference between congruent and incongruent trials (congruency effect). There was a small but significant sex effect (b = 0.30, CI = 0.06–0.54, p = 0.0004), which was caused by women displaying on average a 5 ms larger congruency effect. Furthermore, we tested whether the overall gender-related RT difference was found on both congruent and incongruent trials; the gender effect remained significant in both cases (ps < 10⁻⁵).

Error Related Brain Activity

We used a two-stage analysis approach: first we identified time and location (i.e. electrodes) of maximum error-related activity in the task and then used these for analysis of second level effects. To this end, we employed single-trial robust regression to obtain a regression weight time-course for error-related activity locked to response onset for all electrodes [37]. This model included various regressors to control for possible confounds such as each trial’s congruency, flanker distance and reaction time (see Methods and Supplemental Material for more information about the model). First level regression weights were scaled by their respective standard errors and thus are comparable across subjects and regressors. From this model (Fig. 2A,B) we found the maximum amplitude of negative-going error-related EEG activity at electrode Cz 64 ms following response onset, compatible with the ERN. This was followed by a positive covariation peaking at 226 ms, again at electrode Cz, reflecting the Pe.

Figure 2 Error Effects on EEG Activity. Scalp topographies of response-locked regression weights show the classical ERN and Pe succession (A). Maxima for ERN and Pe were found at 64 ms and 226 ms, respectively, which both displayed central scalp topographies (B). Associated p-values for t-tests of within subject regression weights for a difference from zero are displayed with logarithmic scaling in (C). (D) Shows regular ERPs, which do not account for error-unspecific task effects (see Supplemental Material for details).

Gender Differences in Error-Related Brain Activity

We then used a second level regression model including each participant’s sex, age and the number of committed errors as predictors to model first level results of the error regressor. To determine effects, we used the exact time of global maximum effects (Fig. 2B) from a contrast versus no effect of the first level model. We found a significant effect for predictor sex with a peak observed at electrode Cz. Here, at the time of the maximum effect of the error regressor across all subjects (64 ms), men displayed significantly higher error-related brain activity (Fig. 3, robust regression t(859) = 7.14, p < 10⁻¹¹, averaged regression weights for female subjects: −6.9 ± 3.6, males −9.3 ± 4.4, Cohen’s d = 0.60). No sex effect was observed at the peak of the Pe effect (226 ms, t(859) < 0.1, p = 0.98) and, additionally, participants’ age did not significantly modulate error regressor time-courses (all corrected p > 0.05).
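As a rough plausibility check of the reported effect size, Cohen's d can be approximated from the group means and spreads given above. The sketch assumes that the ± values are standard deviations and that both groups are of roughly equal size, neither of which is stated explicitly in this section; the function name is illustrative.

```python
from math import sqrt

def cohens_d(mean_a, sd_a, mean_b, sd_b):
    """Cohen's d with a pooled SD, assuming equal group sizes."""
    pooled_sd = sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd

# Means and spreads as reported in the text (females vs. males).
d = cohens_d(-6.9, 3.6, -9.3, 4.4)
print(round(d, 2))
```

Under these assumptions the result comes out at about 0.60, in line with the value reported in the text.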

Figure 3 Results of Second Level Regression Analysis for Sex Effects. Displayed are mean regression weights of the error regressor at electrode Cz from the first level analysis for males and females separately. Larger error-related activity was found in male subjects during the time of the ERN and the effect showed a fronto-central scalp distribution (upper topography). Apart from this error regressor effect in the ERN time, no other time points including the Pe (226 ms) showed significant differences (lower topography). The topography plots display second level regression weights and all non-significant (p > 3.3 × 10⁻⁵) electrodes are masked out in white. The grey shaded area marks the time of significant effects that survived Bonferroni correction. Note that the analysis included factors age and error number as regressors of no interest on the second level, which did not significantly alter activity at either time point at this electrode.

We then included an additional regressor in the model that controlled for each participant’s average RT in order to investigate whether the behavioural difference in RT may explain the differences seen in error-related brain activity. We found that RT itself significantly influenced error regression weights and participants with lower RT showed higher amplitudes (64 ms at electrode Cz, robust regression t(859) = 10.75, p < 10⁻²⁴). While inclusion of RT reduced the sex effect on error-related brain activity, it remained significant (t(859) = 5.26, p = 1.79 × 10⁻⁷), indicating that male participants showed higher error-related brain activity in the ERN time window over and above also displaying lower RT. See Supplemental Material for an analysis of regular error-related ERPs.

Gender Prediction based on Multivariate Pattern Classification

Given the current debate over whether a dimorphic distinction between male and female brains is a valid category, we also sought to assess whether these statistical differences could be employed to form a categorical distinction. Therefore, we used multivariate pattern analysis of the peak latency error regression weights of the whole scalp to train a support vector machine on the prediction of participants’ genders. Using 500-fold cross-validation of the data split into training (90%) and independent test sets (10%), we found that the brain response to errors was sufficient to predict a subject’s gender with 71.6% accuracy (chance = 50.0%, permutation test p = 6.67 × 10⁻⁵). A searchlight analysis of the scalp distribution of this information was in accordance with the well-known ERN topography (Fig. 4).
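The validation scheme described above, repeated random splits into 90% training and 10% test data with accuracy averaged over splits, can be sketched as follows. Note the substitution: to keep the example dependency-free, a simple nearest-centroid classifier stands in for the support vector machine used in the study. The synthetic two-class data and all function names are hypothetical.

```python
import random
from statistics import mean

def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [mean(col) for col in zip(*rows)]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def repeated_holdout_accuracy(X, y, n_splits=500, test_frac=0.1, seed=0):
    """Mean test accuracy over repeated random train/test splits."""
    rng = random.Random(seed)
    idx = list(range(len(X)))
    n_test = max(1, round(test_frac * len(X)))
    accuracies = []
    for _ in range(n_splits):
        rng.shuffle(idx)
        test, train = idx[:n_test], idx[n_test:]
        # Fit on the training portion only (stand-in for training the SVM).
        cents = {
            label: centroid([X[i] for i in train if y[i] == label])
            for label in {y[i] for i in train}
        }
        hits = sum(
            min(cents, key=lambda c: sq_dist(X[i], cents[c])) == y[i]
            for i in test
        )
        accuracies.append(hits / n_test)
    return mean(accuracies)

# Synthetic, well-separated two-class data (illustration only).
rng = random.Random(1)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(100)]
X += [[rng.gauss(5, 1), rng.gauss(5, 1)] for _ in range(100)]
y = [0] * 100 + [1] * 100
print(repeated_holdout_accuracy(X, y))
```

Because the classifier never sees the held-out 10% during fitting, the averaged accuracy estimates how well group membership can be predicted for unseen participants.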

Figure 4 Prediction of Gender by Error-Related Brain Activity. The multivariate classification accuracy based on all sensors was 71.6% based on 500-fold cross-validation. The topography map of accuracies suggests that the main informational content for prediction of a subject’s gender based on error-related brain activity was located at central electrodes, overlapping closely with the ERN topography.

Coupling Between ERN and Post-Error Slowing

Next, we investigated possible functional consequences of this differential brain response. We first sought to establish the relationship of single-trial ERN amplitudes to subsequent behavioural adaptation. To this end, we regressed error-related EEG activity at each data point onto reaction times following error trials, including factors of no interest (congruency, response stimulus interval). As expected, we found a negative covariation between EEG amplitudes in the ERN time range, displaying a typical scalp topography (Fig. 5A), and consecutive RT (Cz peak 56 ms, b = −0.33, CI = −0.22– −0.44, p = 7.95 × 10⁻²¹, Fig. 5B,C), strengthening the relationship between ERN and consecutive adaptation in accordance with other studies [20,21]. This result confirms that higher, i.e., more negative, ERN amplitudes are associated with higher consecutive RTs. However, we found no gender differences in the strength of this coupling (Fig. 5D,E, robust regression at 56 ms t(859) = −0.50, p = 0.62 uncorrected), suggesting that the degree to which ERN amplitudes influence behavioural adaptation is similar in males and females.

Figure 5 Coupling Between Neural Signals and Consecutive Adaptation. (A,B) Robust regression coefficients indicate across all subjects that the amplitude of the ERN signal on a given error trial covaries with the following trial’s RT. Thus, higher post-error slowing following error trials is associated with higher ERN amplitudes. This effect is reflected in a negative covariation with a centro-parietal scalp topography (A) and minimal p-values coincide with the time of the ERN peak (C). When investigated for gender effects, we found that the coupling between ERN and post-error slowing did not differ between men and women (E) and no data point survived correction for multiple comparisons at the peak of the ERN (D,F). Note that the effect apparent at the end of the displayed time-window (around 700 ms) as well as the effect around 300 ms likely reflect the actual onset of the next response captured by the regressor itself. Scalp topographies are thresholded at p ≤ 3.3 × 10⁻⁵, shades represent 99.9% CI.

Additionally, in order to clarify similarities between within- and across-subject associations of ERN and PES, we also quantified post-error slowing within subjects (as described above) and regressed it onto error-related brain signals across subjects. However, this analysis revealed no significant association between ERN or Pe related EEG activity and interindividual variance in post-error slowing (all corrected ps > 0.05 at electrode Cz).

Response Conflict Processing

A possible explanation for the observed gender difference in error-related brain responses could be differential response conflict sensitivity or processing between groups, because previous studies found increased ERN amplitudes to be associated with increased response conflict [38,39]. Therefore, we compared the degree of error-activity modulation induced by the manipulation of the distance between flanking and target arrows, a parameter found to reflect response conflict as suggested by computational modelling [38]. As expected, we found a strong effect of distance on error-related activity in the early time of the ERN (Fig. 6A–C), which was larger on trials where flankers appeared further away from the target stimulus, thus inducing high response conflict (Cz peak at 34 ms, b = −0.88, CI = −0.75– −1.01, p = 8.36 × 10⁻⁸⁶). However, we did not observe any difference in this measure depending on gender (Fig. 6D,E, p-value at peak 0.85 uncorrected). This suggests that response conflict processing is comparable between both genders.

Figure 6 Response Conflict Processing. Displayed are results of a first level regression model on error trial EEG for a regressor that coded for the distance between flanking and target arrows. Larger distances caused significantly (C) increased (more negative) ERN amplitudes (B) likely due to increased response conflict in this condition [38] and the effect showed a fronto-central scalp distribution (A). However, no difference between male and female participants was found in a second level regression on this factor (D–F) indicating that differences in conflict processing cannot explain the observed difference in ERN amplitudes between genders.

Comparison of Variances

Another explanation for the observed gender differences could be that women show more variability in their behaviour [24] and possibly also in electrophysiological responses, which could corrupt ERP averages and regression results. Therefore, we compared the variance of RTs as well as the within- and across-subject variance in the latencies of error-related brain responses. However, because RTs generally tend to deviate from a normal distribution and mean RTs and SDs were highly correlated across subjects (r = 0.63, p < 10⁻⁹⁶), we log-transformed RTs prior to analysis. We found a small effect of gender on RT variance (b = 0.19, CI = −0.01–0.40, p = 0.026), indicating slightly higher variances in female subjects. We also compared the difference in variance between correct and post-error trials, obtaining a similar result (b = 0.23, CI = 0.00–0.47, p = 0.013). Furthermore, we included SDs as a separate regressor for the RT analysis across subjects and the effect of gender remained significant (b = 0.29, CI = 0.12–0.47, p = 7.74 × 10⁻⁷). Thus, the reported gender differences for RT and PES cannot be explained by increased variance in behaviour. Note that log-transformation did not qualitatively change RT results reported above.
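One reason a log-transformation helps when SDs scale with mean RT can be illustrated with a small sketch: if one subject is uniformly slower than another by a constant factor, the raw SDs differ, whereas the SDs of the log-transformed RTs are identical. The RT values and the function name below are hypothetical and serve only to illustrate this property.

```python
from math import log
from statistics import stdev

def sd_log_rt(rts_ms):
    """SD of log-transformed RTs for one subject."""
    return stdev([log(rt) for rt in rts_ms])

# Hypothetical RTs (ms); the second subject is uniformly 1.5x slower.
fast = [350, 380, 400, 420, 500]
slow = [1.5 * rt for rt in fast]

print(stdev(fast), stdev(slow))          # raw SDs scale with the mean
print(sd_log_rt(fast), sd_log_rt(slow))  # log-scale SDs agree up to rounding error
```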

For EEG latency measures, we compared the latency of minima in individual trials in the 60 ms surrounding the grand average error-related peak activity. We found no evidence of increased latency variation within female subjects (SD men = 10.5, women = 10.6, b = 0.11, CI = −0.13–0.34, p = 1 corrected). As the same may apply to across group comparisons, we also compared variances of latencies of regression weight minima across male and female participants using Bartlett’s test. This revealed no evidence of a group difference (M men = 67 ± 15 ms, M women = 68 ± 15 ms, p for difference of variance = 0.96, see Supplemental Material for an analysis regarding Pe latency). These findings rule out spurious test statistics induced by confounding differences in within- and across-group variances.
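For the two-group case used here, Bartlett's test can be sketched in a few lines. With two groups the test statistic has one degree of freedom, so its p-value can be written with the complementary error function. This is a minimal illustration with hypothetical latency values, not the original analysis code; in practice a library routine such as `scipy.stats.bartlett` would typically be used.

```python
from math import erfc, log, sqrt
from statistics import variance

def bartlett_two_groups(a, b):
    """Bartlett's test for equal variances in two groups; returns (T, p)."""
    groups = (a, b)
    k = len(groups)
    dfs = [len(g) - 1 for g in groups]
    variances = [variance(g) for g in groups]
    df_total = sum(dfs)  # N - k
    pooled = sum(df * v for df, v in zip(dfs, variances)) / df_total
    numerator = df_total * log(pooled) - sum(
        df * log(v) for df, v in zip(dfs, variances)
    )
    correction = 1 + (sum(1 / df for df in dfs) - 1 / df_total) / (3 * (k - 1))
    t_stat = max(numerator / correction, 0.0)  # guard against rounding below 0
    # With two groups the statistic has 1 degree of freedom, for which the
    # chi-square survival function equals erfc(sqrt(x / 2)).
    return t_stat, erfc(sqrt(t_stat / 2))

# Hypothetical peak latencies (ms) for two groups with similar spread.
group_a = [52, 60, 67, 75, 81]
group_b = [55, 61, 68, 74, 82]
print(bartlett_two_groups(group_a, group_b))
```

A large p-value, as reported above, indicates no evidence that the latency variances differ between groups.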