We first sought to determine whether there was an overall effect of gender across our four gradCPT dependent variables (CO, CE, RT, CV). Previous work in our group has shown that gradCPT performance changes across the lifespan. Because men and women in our sample differed in age, we ran the between-groups MANOVA on age-corrected data (see Methods and [19]). Men and women differed overall in gradCPT performance across our main dependent measures, which were omission errors, commission errors, reaction time, and the coefficient of variation of reaction time, “CV” (F(4, 21,479) = 21.8, p < 0.001, N = 21,484). The overall effect size of gender was small (partial η² = 0.004).

Besides strategic differences, an additional possibility is that the gender differences in error rates and variability were simply driven by reaction time differences between genders. However, this was not the case. When reaction time was also included as a covariate, the gender differences in CV (F(2, 21,480) = 199, p < 0.001), commission errors (F(2, 21,480) = 7.8, p < 0.001), and omission errors (F(2, 21,480) = 31.5, p < 0.001) all remained significant. Thus, the observed gender effects still stand: women make more errors of omission and show greater fluctuation in performance, while men make more errors of commission.

To determine whether the gender differences in commission and omission error rates were driven solely by a shift in strategy (i.e. making more omission errors as a result of cautious responding to avoid commission errors), we tested whether there was still a gender difference in omission errors when controlling for commission errors, and vice versa. The gender difference was still significant in both cases (effect of gender on omission errors when controlling for commission errors, F(1, 21,480) = 65.5, p < 0.001; effect of gender on commission errors when controlling for omission errors, F(1, 21,480) = 6.56, p = 0.01).

For completeness, we also examined gender differences in d’ and criterion (calculated using omission and commission error rates). However, because omission errors and commission errors may have distinct causes, we did not focus on these analyses. On average, women had a slightly lower d’ (F(1, 21,482) = 102, p < 0.001, partial η² = 0.005) and a slightly higher criterion, i.e. more cautious responding (F(1, 21,482) = 266, p < 0.001, partial η² = 0.012).
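As a quick consistency check on the effect sizes above, partial η² can be recovered from a reported F statistic and its degrees of freedom using the standard identity η²p = (F · df_effect) / (F · df_effect + df_error). The sketch below is illustrative only; the function name is hypothetical and is not taken from the authors' analysis code.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    effect_term = f_value * df_effect
    return effect_term / (effect_term + df_error)


# F values and degrees of freedom reported above for d' and criterion.
eta_dprime = partial_eta_squared(102, 1, 21482)
eta_criterion = partial_eta_squared(266, 1, 21482)

print(round(eta_dprime, 3))     # 0.005
print(round(eta_criterion, 3))  # 0.012
```

Both values match the reported effect sizes to three decimal places.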

To determine which measures drove the significant overall effect of gender, we examined the four variables separately. Significant gender differences were found for all four individually (all p < 0.001, Fig 1, Table 1). Men had faster and more consistent reaction times and made fewer omission errors to non-target stimuli (cities). Women made slightly fewer commission errors than men, but this effect was quite small (Fig 1).

Are the observed gender differences related to sociocultural factors?

Although the effect size of gender in our overall sample was small, we observed that the effect of gender was much larger in some countries than in others. To determine the source of this variation, we next examined whether sociocultural differences across countries were associated with the observed gender differences. A significant association with sociocultural factors would provide evidence against a strictly biological explanation of the gender differences we observed. For our first analysis, we used four valid and reliable indices of sociocultural conditions within a country: the Social Institutions and Gender Index (SIGI, family discriminatory code subscale), published by the Organization for Economic Cooperation and Development (OECD) Development Center; the Human Development Index (HDI), published by the United Nations Development Programme; the ratio of female-to-male labor force participation, published by the International Labour Organization and the World Bank; and the poverty rate, published by the Central Intelligence Agency (Table 2). These indices were chosen because we hypothesized, based on the results of Weber et al. (2014), that the conditions they represent could affect gradCPT performance.

We restricted our data set to participants whose country location (from IP address) was recorded during testing (N = 16,606, see S1 Table for details). Each of these 16,606 participants was assigned SIGI, HDI, female-to-male labor force participation, and poverty scores based on their country. We used mixed effects models, with a random effect for country and fixed effects for gender, each index, and the interaction between gender and each index. We found that three of our four indices (all except poverty) were significantly related to gradCPT performance (see Table 3). In particular, lower human development and lower gender equality were associated with slower reaction times, higher CV (more variability), more omission errors and, somewhat paradoxically, slightly fewer commission errors. There was also a significant interaction between gender and three of the four sociocultural indices for omission and commission errors, demonstrating that although overall average performance was affected by sociocultural conditions, men and women were not affected to the same degree (Table 4). There were no significant interactions between index and gender for reaction time or CV, and notably, there was no significant interaction between poverty and gender for any variable.

To ensure that strategy shifts were not driving these effects, we tested whether there was still a gender*sociocultural index interaction for omission error rate when controlling for commission error rate, and vice versa (i.e. testing for gender*sociocultural index interactions for commission errors while controlling for omission errors). The gender*index interactions remained significant in all cases (S2 Table). Finally, to determine whether slower RTs were driving the observed significant effects, we included reaction time as a covariate in all the significant models (commission and omission errors). The overall effects and the interactions between gender and sociocultural index remained significant in all cases.

For illustrative purposes, the impact of gender inequality on performance can be seen in Fig 2, in which the age-corrected gender differences in error rates are compared between the lowest-equality quintile (lowest 8 countries) and highest-equality quintile (highest 8 countries), according to the Gender Inequality Index (http://hdr.undp.org/en/content/gender-inequality-index-gii). Gender differences were larger in unequal conditions than in equal conditions, with men making more commission errors and women making more omission errors. Gender inequality accounted for a 1–2.5% change in error rates.

Fig 2. Gender differences in age-corrected error rates in low and high gender equality conditions. Error bars show standard error. Low and high equality were defined as the countries in the bottom and top quintile of our sample according to the United Nations’ Gender Inequality Index. N = 8 countries per quintile; low equality N = 2,066 (Egypt, India, Pakistan, Bangladesh, Indonesia, South Africa, Brazil, Philippines); high equality N = 1,657 (Germany, Sweden, Denmark, Netherlands, Italy, Norway, Belgium, Finland). https://doi.org/10.1371/journal.pone.0165100.g002

Correlation analyses using sociocultural indices. To further confirm that the sociocultural indices were correlated with the size of the gender difference, we performed Pearson’s correlation analysis between the average gender difference in each country and each of the four indices. For these analyses we excluded a single country from which fewer than 20 women participated (total countries = 40, total N = 16,552). For CV and reaction time we found no significant correlation between the magnitude of gender difference and any indices of social conditions (all |r| < 0.35). However, consistent with the mixed model analysis above, for commission and omission errors we found significant correlations (using the Bonferroni correction) between gender difference and indices of gender equality (Table 5, Fig 3). Specifically, we found that in conditions of lower gender equality, men made more commission errors and women made more omission errors, but as gender equality increased, men and women performed more similarly on both measures. These results, using country averages, supported the results of our mixed model analysis above.

Fig 3. Gender difference in omission and commission error rate versus three sociocultural indices. Social Institutions and Gender Index (SIGI), Human Development Index (HDI), and female/male ratio of labor force participation. Residualized gender difference is average women’s age-corrected score minus average men’s age-corrected score. A negative gender difference indicates that men made more errors than women; a positive gender difference indicates that women made more errors than men. Circle area reflects the number of participants from that country, N = 16,552 people, 40 countries. Linear trendline calculated using unweighted country averages. * indicates significance after FDR correction. https://doi.org/10.1371/journal.pone.0165100.g003

Table 5. Pearson correlation coefficients and corresponding p-values for relationships between sociocultural indices and gradCPT gender differences (women minus men). https://doi.org/10.1371/journal.pone.0165100.t005
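The country-level correlation analysis described above can be sketched as follows. This is a minimal illustration, not the authors' code: the country labels, error rates, and index values are invented, and the index is assumed to be coded so that higher values mean more equality. The gender difference is computed as women minus men, matching the convention of Fig 3 and Table 5.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# HYPOTHETICAL country averages (not data from the study):
# (label, women's omission error rate, men's omission error rate, equality index)
countries = [
    ("A", 0.080, 0.030, 0.2),
    ("B", 0.070, 0.030, 0.4),
    ("C", 0.050, 0.030, 0.6),
    ("D", 0.045, 0.030, 0.8),
    ("E", 0.030, 0.030, 0.9),
]

# Gender difference per country: women minus men.
gender_diff = [women - men for _, women, men, _ in countries]
equality = [index for _, _, _, index in countries]

# Unweighted correlation across countries (each country counts once).
r = pearson_r(equality, gender_diff)
print(f"r = {r:.2f}")
```

Each country contributes one point regardless of how many participants it supplied, as with the unweighted trendline in Fig 3. A significance test and multiple-comparison correction would follow this step and are omitted here.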

Isolating variation in men’s and women’s performance. The above analyses of gender differences used difference scores (women minus men), which are the most common way to describe differences in performance between men and women. However, using difference scores implicitly defines a model in which variation in men’s and women’s performance contributes equally and oppositely to the resultant measure. This model does not invariably hold true [27,28]. Therefore, we sought to determine which of two models better characterized the association between the gradCPT gender difference and sociocultural conditions: 1) women’s performance is considered the condition of interest and men’s performance is considered the control (i.e. women’s performance drives the effects), or 2) the converse, in which men’s performance is considered the condition of interest and women’s performance is considered the control (i.e. men’s performance drives the effects). When we used linear regression to remove variation in men’s scores from variation in women’s scores, all of the significant correlations in Table 5 remained statistically significant. However, removing variation in women’s performance from men’s resulted in no significant correlations. Therefore, a model in which women’s performance is considered the control and men’s performance is the condition of interest does not fit these data as well as the converse model. This suggests that although there is significant shared variance in men’s and women’s performance, it is the unique variance in women’s performance, not men’s, that varies with gender equality across countries.
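The residualization step described above can be illustrated with a short sketch. It assumes simple ordinary least squares with an intercept, applied to country averages; the source says only that linear regression was used, so the exact specification is an assumption. All data values and names below are hypothetical.

```python
import math


def residualize(y, x):
    """Residuals of y after ordinary least squares regression on x (with intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    intercept = mean_y - slope * mean_x
    return [b - (intercept + slope * a) for a, b in zip(x, y)]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# HYPOTHETICAL country-average omission error rates and equality index.
men = [0.030, 0.035, 0.028, 0.032, 0.031, 0.029]
women = [0.080, 0.075, 0.050, 0.048, 0.035, 0.030]
equality = [0.2, 0.3, 0.5, 0.6, 0.8, 0.9]

# Model 1: women are the condition of interest, men are the control.
women_unique = residualize(women, men)
# Model 2: men are the condition of interest, women are the control.
men_unique = residualize(men, women)

print("women | men vs. index:", round(pearson_r(equality, women_unique), 2))
print("men | women vs. index:", round(pearson_r(equality, men_unique), 2))
```

Comparing the two correlations shows which gender's unique variance tracks the index, which is the comparison the paragraph above draws.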

Effect of sociocultural conditions on d’ and criterion. When we examined the relationship between d’/criterion and sociocultural conditions, we found that gender differences in criterion were significantly correlated with indices of gender equality (r = -0.59 with SIGI, p = 0.0009; r = 0.56 with labor force participation, p = 0.002) and human development (r = 0.51 with HDI, p = 0.006), with women responding more cautiously than men in countries with less equality. Criterion scores were not significantly correlated with poverty rate (r = -0.096, p = 0.632). In contrast, gender differences in d’ were not related to sociocultural conditions (all |r| < 0.2). These measures, which collapse omission and commission errors into a single number, could obscure the relationships between error type, gender, and sociocultural conditions. Since omission and commission errors might represent different aspects of behavior [27], we decided not to pursue further analysis of d’ and criterion (see Discussion).
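To make concrete how d’ and criterion collapse the two error types into single numbers, the sketch below uses the standard equal-variance signal detection formulas. The mapping of error rates onto hits and false alarms is an assumption: a key press is treated as the "yes" response, so the hit rate is 1 minus the omission error rate (presses to cities) and the false alarm rate is the commission error rate (presses to mountains). Under this assumed mapping, a higher criterion corresponds to more cautious responding. The function name, the clipping constant, and the example rates are hypothetical.

```python
from statistics import NormalDist


def dprime_and_criterion(omission_rate, commission_rate, eps=1e-4):
    """Equal-variance signal detection measures from the two error rates.

    ASSUMED mapping (not confirmed by the source): pressing is the "yes"
    response, so hits are presses to cities and false alarms are presses
    to mountains.
    """
    z = NormalDist().inv_cdf
    # Clip rates away from 0 and 1 so the inverse normal CDF stays finite.
    hit_rate = min(max(1.0 - omission_rate, eps), 1.0 - eps)
    false_alarm_rate = min(max(commission_rate, eps), 1.0 - eps)
    d_prime = z(hit_rate) - z(false_alarm_rate)
    criterion = -0.5 * (z(hit_rate) + z(false_alarm_rate))
    return d_prime, criterion


# Two HYPOTHETICAL participants: B makes more omission errors and
# fewer commission errors than A.
d_a, c_a = dprime_and_criterion(omission_rate=0.02, commission_rate=0.30)
d_b, c_b = dprime_and_criterion(omission_rate=0.04, commission_rate=0.25)

print(f"A: d' = {d_a:.2f}, criterion = {c_a:.2f}")
print(f"B: d' = {d_b:.2f}, criterion = {c_b:.2f}")
```

In this toy case participant B ends up with a higher criterion and a lower d’ than A, even though the two error types moved in opposite directions, which illustrates why the combined measures can hide error-type-specific effects.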