During the controversy about James Damore’s infamous memo, which I briefly discussed here, I read a lot of nonsense from a lot of people who clearly don’t understand much about this debate. If you have been following the debate about the underrepresentation of women in philosophy, which I discussed in a post a few months ago, you were no doubt familiar with much of that nonsense, for the same confused arguments were used in both cases. In fact, every time the kind of issues discussed by Damore come up, the same nonsense inevitably shows up in the discussion. So I thought it might be a good idea to write a post in which I use a simple model to explain 1) how large disparities can result from differences in abilities and/or preferences between groups even in the absence of discrimination and 2) what effects giving preferential treatment to the members of underrepresented groups can have when such differences exist or even when they don’t.

My goal won’t be to argue that such differences exist, because others have already done so, but only to explain what happens if they exist. (A lot has been written on the evidence that men and women differ in abilities and preferences, as well as on the causes of these differences. For instance, you can read Scott Alexander’s reply to Adam Grant on Slate Star Codex, Artir’s ongoing series of posts about this on his blog, Sean Stevens and Jonathan Haidt’s review of the literature on Heterodox Academy and a piece on Quillette with reflections on Damore’s memo and the controversy that ensued by Lee Jussim, David Schmitt, Geoffrey Miller and Debra Soh. I don’t agree with everything that is said in these articles, but they are pretty good and contain enough references for you to form your own opinion.) I will focus on the underrepresentation of women at Google as an illustration, but it should be obvious how the analysis presented here generalizes. I give some mathematical details for the people who are interested, but don’t worry if you’re not into that (in principle, anyone who finished high school should be able to follow if they read carefully); I also try to explain what’s going on in intuitive terms and use graphs. Toward the end, I will reach a very politically incorrect conclusion, but I will show that it’s probabilistically sound.

Let’s assume that, with respect to ability, both male and female applicants are normally distributed. In other words, let $M \sim N(\mu_M, \sigma_M^2)$ and $F \sim N(\mu_F, \sigma_F^2)$ be the ability of male and female applicants, respectively. For the moment, let’s assume that ability is distributed identically among male and female applicants, hence that $\mu_M = \mu_F$ and $\sigma_M = \sigma_F$. The actual distribution of ability for the people who apply to Google is probably skewed to the right, but it would make the calculations more complicated and I don’t think it would change my qualitative conclusions, which are what I really care about here. (I only provide numerical examples to help you get a sense of the qualitative behavior of the model.) Moreover, I will assume that Google operates under a purely meritocratic regime, by which I mean that it tries to maximize the ability of the people it hires. According to this article from 2014, Google receives about 3 million applications every year and hires only 7,000 people, which means that under the assumption I just made it picks the top 0.23% of applicants and rejects everyone else.

With that kind of decision-making procedure, even if there is no difference in ability between men and women, as long as women tend to be less interested in pursuing a career at Google than men, the company will still hire more men than women. Let $p$ be the proportion of men among the people who apply for a job at Google. The situation can be modeled by a random variable $A$ whose distribution is a mixture of $M$ and $F$. The value of $A$ is obtained by selecting $M$ with a probability of $p$ or $F$ with a probability of $1 - p$ and realizing the value of the selected random variable.

If men tend to be more interested in such a job than women, then $p$ will be greater than 0.5. Let’s assume that $p = 0.8$, which is approximately the proportion of men among computer science graduates in OECD countries. Since we have assumed that there is no difference in ability between male and female applicants and that Google tries to maximize the ability of the people it hires, it follows that 80% of the people it hires are men and only 20% are women. This is exactly what you would have expected without doing any math, but that didn’t prevent many people from denying this totally obvious point after Damore circulated his memo.
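The claim in this paragraph is easy to check by simulation. Here is a minimal sketch (the sample size and seed are arbitrary choices of mine, and the hire rate approximates 7,000 hires out of 3 million applications):

```python
import random

random.seed(0)
N_APPLICANTS = 1_000_000
P_MALE = 0.8        # proportion of men among applicants
HIRE_RATE = 0.0023  # ~7,000 hires out of ~3 million applications

# Ability ~ N(0, 1) for everyone: no difference between men and women.
applicants = [(random.random() < P_MALE, random.gauss(0.0, 1.0))
              for _ in range(N_APPLICANTS)]

# Purely meritocratic hiring: take the top HIRE_RATE fraction by ability.
n_hired = round(N_APPLICANTS * HIRE_RATE)
hired = sorted(applicants, key=lambda a: a[1], reverse=True)[:n_hired]

share_men = sum(is_male for is_male, _ in hired) / n_hired
print(f"share of men among hires: {share_men:.3f}")  # ≈ 0.80
```

Because ability is identically distributed, the share of men among hires simply matches their share among applicants, up to sampling noise.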

Although this fact should be obvious, it’s nevertheless important. It means that, if Google wants to increase the proportion of women among the people it hires, while their proportion among the people who apply doesn’t change because women’s preferences remain the same, it has to hire women who are less competent than some of the men it rejects. This is true even if, as I have assumed so far, ability is distributed identically among male and female applicants. Since there is no doubt that, for whatever reason, women are less interested in software engineering than men, and since Google has little to no influence on women’s preferences, it’s not difficult to predict what the result of giving preferential treatment to women in hiring would be.

People often claim that increasing the representation of some groups that are believed to be disadvantaged doesn’t come at the expense of people who don’t belong to them and they usually dress this up in the language of inclusion. Here is a good example of that kind of nonsense, from Parisa Tabriz, who is a computer security expert at Google.

Inclusion is not a zero-sum game. Making your team or organization a more inclusive place for X does not mean discrimination against 'not X' — Parisa Tabriz (@laparisa) August 4, 2017

The problem is that, in the vast majority of cases, that’s exactly what it means, because companies, universities, etc. typically have little to no influence on a large demographic group’s preferences, which are shaped by forces that are largely outside the influence of any individual company, university, etc. If the members of the groups whose representation you seek to increase are less interested in a job at your company/a degree from your university or, as we shall see, differ in ability from people who don’t belong to those groups, then in practice you are usually going to discriminate against the latter, no matter how much you try to hide it by talking about inclusion or some other fashionable buzzword.

So far, I have assumed that men and women do not differ with respect to ability, but this is hardly obvious. If, as I have assumed, ability is normally distributed among men and women alike, they can only differ from each other because the mean is different and/or because the variance is different. But as I will now explain, if the mean or the variance for men is greater than for women, the underrepresentation of women that would result from purely meritocratic hiring would be even larger, compared to the situation in which there are differences in preferences but not in ability between men and women.

First, suppose that ability is on average slightly higher for men than for women, but that the variance is the same. For instance, let’s say that $\sigma_M = \sigma_F$ but that $\mu_M$ is 10% of a standard deviation greater than $\mu_F$, which is a pretty small difference. Nevertheless, it means that, even if the same number of women applied, there would still be more men among the applicants who make the cut.

As you can see on this graph, if you look at the shaded area, there is a surplus of men among the people who make the cut (it’s the area in dark gray), because the mean for men is higher than for women. (In order for the effect to be easier to visualize, I have used a larger difference in means and a lower cutoff point when I drew the graph, but the logic is exactly the same.) Indeed, if the mean for men is 10% of a standard deviation greater than for women and everything else is the same as before, only 15.4% of the people hired by Google will be women instead of 20%.

Even if the mean is identical for men and women, women would still make up less than 20% of the people Google hires under a meritocratic regime, provided that the variance is larger for men than for women. For instance, let’s say that $\mu_M = \mu_F$ but that $\sigma_M$ is 10% larger than $\sigma_F$, which again is a pretty small difference.

As you can see on the graph, whereas increasing the mean simply translated the distribution to the right without changing its shape, increasing the variance makes the tails of the distribution fatter. (Again, I have used a larger difference in variance, so that the effect would be easier to see.) This means that, if the variance is larger for men than for women, there are more men than women among people with very low ability and among people with very high ability.

If ability is distributed normally, this effect is very strong, i.e. even a small difference in variance between men and women can result in very large disparities at the tails of the distribution. For instance, if the standard deviation for men is 10% larger than the standard deviation for women, as I have assumed above, women will make up only 9.2% of the people hired by Google. Of course, if both the mean and the variance are greater for men than for women, the resulting disparity is even larger. As you can see on the graph below, even in that case, there is still a lot of overlap between men and women, but things are nevertheless pretty different at the right tail of the distribution. For instance, if the mean for men is 10% of a standard deviation higher than for women and the standard deviation for men is 10% larger than for women, women will make up just 6.8% of the people who make the cut. Thus, even though I assumed small differences in ability between men and women, the proportion of women among the people hired by Google has been divided by almost 3 compared to what it was when I assumed that only preferences were different.
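For readers who want to reproduce the figures in the last few paragraphs, here is a sketch of the exact calculation, using only the Python standard library (the structure and names are mine; female ability is normalized to N(0, 1), and the male parameters vary by scenario):

```python
from statistics import NormalDist

P_MALE, HIRE_RATE = 0.8, 0.0023

def share_of_women(mu_m: float, sigma_m: float) -> float:
    """Share of women among hires under purely meritocratic selection."""
    men, women = NormalDist(mu_m, sigma_m), NormalDist(0.0, 1.0)
    tail = lambda d, c: 1.0 - d.cdf(c)  # P(ability > c)

    # Bisect for the cutoff c such that HIRE_RATE of the mixture clears it.
    lo, hi = -10.0, 10.0
    for _ in range(100):
        c = (lo + hi) / 2
        p = P_MALE * tail(men, c) + (1 - P_MALE) * tail(women, c)
        lo, hi = (c, hi) if p > HIRE_RATE else (lo, c)

    return (1 - P_MALE) * tail(women, c) / HIRE_RATE

print(share_of_women(0.0, 1.0))  # identical distributions: 0.20
print(share_of_women(0.1, 1.0))  # mean +0.1 SD: ~0.154
print(share_of_women(0.0, 1.1))  # SD +10%: ~0.092
print(share_of_women(0.1, 1.1))  # both: ~0.068
```

The variance effect dominates at an extreme cutoff like the top 0.23%, which is why a 10% difference in standard deviations moves the share more than a 0.1 SD difference in means.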

This is why people who claim that, given how few women and minorities work at Google, it can’t be giving them preferential treatment in hiring are confused. For instance, here is what Matthew Yglesias of Vox said at the time, which he evidently thought was very clever.

It seems to me that if Google were actually in a pro-diversity ideological echo chamber their engineering staff would be more diverse. — Matthew Yglesias (@mattyglesias) August 9, 2017

But the fact that women only make up 20% of the workforce among tech workers at Google doesn’t show that hiring takes place on a purely meritocratic basis. If men are overrepresented among people who have the kind of ability that makes you qualified for a job at Google, as they are if ability is normally distributed among the people who apply for such a position and the mean and/or the variance is greater for men than for women, then a purely meritocratic process would result in women making up less than 20% of hires. In that case, hiring probably doesn’t take place on a purely meritocratic basis and the company is sacrificing quality on the altar of diversity.

Now, as I said above, I don’t want to discuss the evidence about differences in ability between men and women, because it would require a whole separate post and others have already done it, if not exactly in the way I would have. However, to be clear, there is quite a lot of evidence that men are overrepresented among people who are very good at programming, independently of the fact that men are more likely than women to be interested in programming. For instance, according to this article, only one woman has ever made it to the finals of Code Jam, a programming competition organized by Google. I haven’t been able to verify this myself, but even if there were a few more than one, since we’re talking about hundreds of finalists over the years, this pattern is still overwhelmingly unlikely if we assume that there are no differences in ability between men and women and that women enter the competition at the same rate they graduate in computer science, or even at a much smaller rate. So, at the very least, the hypothesis that ability is not distributed identically among men and women should be taken very seriously.

So far, I have assumed that Google has direct access to the ability of the people who apply for a job, but of course that is not the case. The people in charge of hiring have to estimate the ability of applicants, which inevitably comes with some measurement error. It’s interesting to examine how this uncertainty affects the analysis presented above. A very natural way of modeling a more realistic situation, where the people in charge of hiring at Google don’t have direct access to the ability of applicants but have to estimate it, is to consider a random variable $M' = M + \epsilon_M$ and a random variable $F' = F + \epsilon_F$, which correspond to the perceived ability of the male and female applicants respectively. In that model, $M$ and $F$ are the same as before and correspond to the real ability of the applicants, while $\epsilon_M$ and $\epsilon_F$ are error terms: the perceived ability of applicants is the sum of their real ability and a random error, capturing the fact that employers have to estimate the ability of applicants, which inevitably adds uncertainty because of measurement error.

We assume that $\epsilon_M \sim N(\mu_{\epsilon_M}, \sigma_{\epsilon_M}^2)$ and $\epsilon_F \sim N(\mu_{\epsilon_F}, \sigma_{\epsilon_F}^2)$. Moreover, we assume that $\epsilon_M$ is independent from $M$, and that $\epsilon_F$ is independent from $F$. In other words, although Google’s estimate of the ability of people who apply for a job is somewhat inaccurate, the error does not depend on the actual ability of the applicants. Strictly speaking, this assumption is probably false, but it makes the model simpler. For the moment, we also assume that $\mu_{\epsilon_M} = \mu_{\epsilon_F} = 0$ and $\sigma_{\epsilon_M} = \sigma_{\epsilon_F}$, though we shall briefly consider what happens when $\mu_{\epsilon_M} \neq \mu_{\epsilon_F}$ later. What this means is that, to the extent that the perception of ability is biased, it’s biased to the same extent for men as for women. Finally, for the purpose of the calculations I make below, I will assume that both $\sigma_{\epsilon_M}$ and $\sigma_{\epsilon_F}$ are equal to $0.5\sigma_F$, i.e. the measurement error has half the standard deviation of ability among female applicants.

If ability is distributed identically among men and women, the presence of measurement error doesn’t make any difference, at least as far as the representation of women is concerned. As long as women continue to constitute only 20% of the applicants, they will make up the same proportion of the people who end up being hired. As a result of measurement error, Google will hire some people it wouldn’t have selected otherwise, because more qualified applicants will sometimes have their ability erroneously perceived as lower. But this will occur in exactly the same way for men and women, since for the moment we are assuming there is no difference in ability between them, so it will have no effect on the representation of women. If ability is not distributed identically among men and women, however, things start to get interesting.

For instance, if we assume that $\mu_M$ is 10% of a standard deviation greater than $\mu_F$ but $\sigma_M = \sigma_F$, 15.8% of the people who are hired by Google will be women. Interestingly, this is a little bit more than before, when I made the same assumptions about the difference in ability between men and women but didn’t take into account measurement error. If we assume that $\mu_M = \mu_F$ but that $\sigma_M$ is 10% larger than $\sigma_F$, the share of women among the people who make the cut will be approximately 10.8%. Again, this is a little more than before, under the same assumptions but without measurement error. If the mean for men is 10% of a standard deviation higher than for women and the standard deviation for men is 10% larger than for women, 8.3% of the people who are hired will be women, which is again more than in the absence of measurement error.
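These numbers can be reproduced with a direct calculation: the only change from the no-error case is that selection now operates on perceived ability, whose variance is inflated by the error. (I assume an error standard deviation equal to half the standard deviation of ability among women, which reproduces the percentages quoted here; the code structure is my own sketch.)

```python
from statistics import NormalDist

P_MALE, HIRE_RATE = 0.8, 0.0023
SIGMA_EPS = 0.5  # SD of the measurement error, in units of female ability SD

def share_of_women_noisy(mu_m: float, sigma_m: float) -> float:
    """Share of women among hires when selection uses noisy perceived ability."""
    # Perceived ability = ability + independent N(0, SIGMA_EPS^2) noise,
    # so it is normal with the same mean and inflated variance.
    men = NormalDist(mu_m, (sigma_m**2 + SIGMA_EPS**2) ** 0.5)
    women = NormalDist(0.0, (1.0 + SIGMA_EPS**2) ** 0.5)
    tail = lambda d, c: 1.0 - d.cdf(c)

    lo, hi = -10.0, 10.0
    for _ in range(100):  # bisect for the cutoff on perceived ability
        c = (lo + hi) / 2
        p = P_MALE * tail(men, c) + (1 - P_MALE) * tail(women, c)
        lo, hi = (c, hi) if p > HIRE_RATE else (lo, c)

    return (1 - P_MALE) * tail(women, c) / HIRE_RATE

print(share_of_women_noisy(0.1, 1.0))  # ~0.158 (vs ~0.154 without noise)
print(share_of_women_noisy(0.0, 1.1))  # ~0.108 (vs ~0.092)
print(share_of_women_noisy(0.1, 1.1))  # ~0.083 (vs ~0.068)
```

Intuitively, adding the same noise variance to both groups shrinks the difference between them relative to the spread of perceived ability, which is why the disparity is smaller than without noise.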

Thus, it seems that, if the distribution of ability has a greater mean and/or a larger variance among men than among women, measurement error mitigates the extent to which women end up being underrepresented among the people who make the cut. This is probably not something you were expecting and, as we shall see, the explanation of this phenomenon has a very politically incorrect implication. In order to understand what’s going on, recall that I have assumed that Google hires the $n$ applicants whose ability it deems the highest based on the available evidence, where $n$ is the number of positions it’s trying to fill. Under this regime, you’re hired if and only if your perceived ability is above some cutoff (determined by the distribution of perceived ability among the applicants and the ratio of positions to applicants), which is the same whether you’re a man or a woman. There is no discrimination based on what group you belong to; you only get hired or rejected based on your perceived ability.

This is what most people who took the side of Damore in the controversy about his memo think hiring should be like under a meritocratic regime. They claim that, since Google has plenty of information about each individual applicant, which it can use to estimate their ability, it would be irrational to rely on group differences in order to decide who to hire. This is how they defuse the accusations of sexism against people who agree with Damore. If this story were correct, the existence of group differences would explain why some groups are underrepresented, but it wouldn’t justify using our knowledge of such differences to make hiring decisions, which in a meritocratic regime should be made purely on the basis of the information we have about each individual applicant. If you are even a little bit familiar with the debate about affirmative action, I’m sure you have often heard that sort of claim.

The problem is that, although it’s the kind of thing people are loath to admit, this is actually false. If a meritocratic regime is defined as a decision-making procedure that ensures the average quality of the people hired is maximized, then as long as you know there are differences in ability between groups, meritocracy requires that you discriminate against people on the basis of what group they belong to. This is a dirty little secret that most people on either side of the debate have no idea about, while those who do are very careful not to reveal it, because it’s kind of radioactive. Understanding why is the key to explaining why, in the presence of uncertainty about the ability of the people who apply for a job at Google, the extent to which men end up being overrepresented is somewhat mitigated compared to what it would be if Google could directly access the ability of applicants.

As I will now explain, the reason is that, if men are on average more qualified than women and/or the variance of ability is larger among men than among women, the actual ability of a man will on average be greater than that of a woman even if their perceived ability is exactly the same. (As we shall see, this is not true when their perceived ability is low enough and the variance is larger among men than among women, but it doesn’t matter here since we are only looking at people who score very high on perceived ability.) I imagine that many of you will find this claim outrageous, but if you think about it for a second, you will see that, in other contexts (which are not contaminated by politically sensitive issues), you readily accept it.

For instance, suppose that you’re trying to recruit a lifeguard, who needs to be able to swim very fast. You need to decide which of the two people who applied for the job you’re going to hire. So you have them swim 100 meters and applicant A posts a slightly better time than applicant B. However, B is a former Olympian, whereas A is just a regular guy. (Suppose, moreover, that A and B are roughly the same age, are physically similar, etc.) Although A did a better time than B when you timed them, the right thing to do is clearly to hire B, because former Olympians are usually much faster than people who never competed at that level. It’s very likely that it was just a fluke that A did a better time than B when you timed them. Perhaps B wasn’t feeling well on that day or something like that. Unlike the claim I made about men and women, I’m sure that almost nobody would have any difficulty admitting this, yet the underlying logic is exactly the same.

In case you’re still not convinced, here is another way to make the same point, which is that information about group membership is relevant even when you’re trying to estimate the ability of individuals. You just have to realize that any information about some individual can be construed in terms of group membership. For example, imagine that you’re trying to hire someone for a job that requires good mathematical abilities, so you ask everyone who applies for the score they obtained on the mathematical part of the SAT. Suppose that one of them, let’s call him A, got a score of 720. Another, equivalent way to describe the situation is to say that A belongs to the group of people who scored 720 on the mathematical part of the SAT. So, just like the fact that A is a man, the fact that A got a score of 720 is really information about group membership, although people don’t typically think about it that way. The distinction between specific and generic information that people often make is entirely arbitrary. We could decide to regard the information about A’s gender as specific and the information about his score on the SAT as generic.

Now, although many people will say that A’s gender does not provide any relevant information about his mathematical ability, it wouldn’t occur to anyone to dismiss the information about his score on the mathematical part of the SAT, yet both can be seen as information about group membership. Thus, if you must ignore the information about A’s gender when you estimate his ability, it can’t be because it’s generic information that isn’t specific to A. Indeed, as I noted above, the distinction between generic and specific information is arbitrary. Both the information about A’s gender and that about his score on the SAT can be seen as information about what groups A belongs to, namely the class of men and that of people who scored 720 on the mathematical part of the SAT. As long as we also have information about the way in which mathematical ability is distributed in these groups, knowing that A is a man and that he scored 720 on the SAT are both informative for someone who is trying to estimate A’s mathematical ability.

It’s true that knowing A scored 720 on the mathematical part of the SAT is more informative than knowing he is a man, because the variance of mathematical ability is much smaller among people who scored 720 on the mathematical part of the SAT than among men. This is why most people have the intuition that knowing A is a man doesn’t provide any relevant information about his ability, at least when you know his score on the SAT. But the truth is that, while the greater variance makes it less informative, it’s still informative, and even if you also know that A scored 720 on the SAT, it would be irrational to ignore it. If you do, your estimate of A’s mathematical ability will be less accurate than it would have been if you had taken into account the fact that he is a man, at least if you also have information about how mathematical ability is distributed among men. Moreover, if you know that ability is not distributed identically among women, then in general your estimate of A’s mathematical ability should be different than it would have been if he’d been a woman. This explains why, if B also scored 720 on the mathematical part of the SAT or even a little bit higher but is a woman, you should still hire A, because he’s probably better at math than B.
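The logic of these two examples is just standard normal-normal updating, and it can be sketched in a few lines (all numbers are made up for illustration; nothing here is specific to the SAT or to any particular group):

```python
def posterior_mean(score: float, group_mean: float,
                   group_sd: float, noise_sd: float) -> float:
    """E[ability | observed score], with a normal group prior and normal noise."""
    k = group_sd**2 / (group_sd**2 + noise_sd**2)  # shrinkage weight
    return group_mean + k * (score - group_mean)

# Two applicants with the *same* observed score, from groups whose
# ability distributions differ slightly in the mean (0.1 SD apart).
a = posterior_mean(1.5, group_mean=0.1, group_sd=1.0, noise_sd=0.5)
b = posterior_mean(1.5, group_mean=0.0, group_sd=1.0, noise_sd=0.5)
print(a, b)  # 1.22 vs 1.20: same score, different best estimates
```

The estimate shrinks the observed score toward the mean of whatever group the applicant belongs to, so identical scores yield different best estimates when the group means differ.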

In order to see how strong that effect is, we can use the model I presented above to derive the conditional probability distribution of actual ability given perceived ability for both men and women, i.e. the distributions of $M \mid M' = x$ and $F \mid F' = x$. I wasn’t sure how to derive this, so I asked on Stack Exchange and someone answered me right away. It turns out that

$$M \mid M' = x \sim N\left(\mu_M + \frac{\sigma_M^2}{\sigma_M^2 + \sigma_{\epsilon_M}^2}(x - \mu_M),\ \frac{\sigma_M^2\,\sigma_{\epsilon_M}^2}{\sigma_M^2 + \sigma_{\epsilon_M}^2}\right)$$

and

$$F \mid F' = x \sim N\left(\mu_F + \frac{\sigma_F^2}{\sigma_F^2 + \sigma_{\epsilon_F}^2}(x - \mu_F),\ \frac{\sigma_F^2\,\sigma_{\epsilon_F}^2}{\sigma_F^2 + \sigma_{\epsilon_F}^2}\right)$$

What this means is that, if a man and a woman both have a perceived ability of $x$, their actual ability will on average be

$$E[M \mid M' = x] = \mu_M + \frac{\sigma_M^2}{\sigma_M^2 + \sigma_{\epsilon_M}^2}(x - \mu_M)$$

and

$$E[F \mid F' = x] = \mu_F + \frac{\sigma_F^2}{\sigma_F^2 + \sigma_{\epsilon_F}^2}(x - \mu_F)$$

respectively.

For instance, if we assume that $\mu_M$ is 10% of a standard deviation greater than $\mu_F$ but $\sigma_M = \sigma_F$, then on average 0.59% of the applicants have a real ability at least as high as that of a woman whose perceived ability is just sufficient to make the cut at Google, whereas on average only 0.56% of the applicants have a real ability at least as high as that of a man who has the same perceived ability. Similarly, if we assume that $\mu_M = \mu_F$ but that $\sigma_M$ is 10% larger than $\sigma_F$, then on average 0.63% of the applicants have a real ability at least as high as that of a woman whose perceived ability is just sufficient to make the cut at Google, whereas on average only 0.48% of the applicants have a real ability at least as high as that of a man who has the same perceived ability. Finally, if we assume that $\mu_M$ is 10% of a standard deviation greater than $\mu_F$ and that $\sigma_M$ is 10% larger than $\sigma_F$, then on average 0.66% of the applicants have a real ability at least as high as that of a woman whose perceived ability is just sufficient to make the cut at Google, whereas on average only 0.48% of the applicants have a real ability at least as high as that of a man who has the same perceived ability.
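The first of these figures can be checked numerically. The sketch below finds the perceived-ability cutoff, computes the expected real ability of a man and of a woman exactly at that cutoff, and then asks what share of all applicants exceeds each level (the error standard deviation of half the female ability SD is an assumption on my part, chosen because it reproduces the percentages quoted in this post):

```python
from statistics import NormalDist

P_MALE, HIRE_RATE, S_EPS = 0.8, 0.0023, 0.5

def tails_at_posterior_means(mu_m: float, sigma_m: float):
    """Share of applicants whose real ability exceeds the expected real
    ability of a woman / a man whose perceived ability equals the cutoff."""
    men_p = NormalDist(mu_m, (sigma_m**2 + S_EPS**2) ** 0.5)  # perceived
    wom_p = NormalDist(0.0, (1.0 + S_EPS**2) ** 0.5)
    tail = lambda d, c: 1.0 - d.cdf(c)

    lo, hi = -10.0, 10.0
    for _ in range(100):  # bisect for the cutoff on perceived ability
        c = (lo + hi) / 2
        p = P_MALE * tail(men_p, c) + (1 - P_MALE) * tail(wom_p, c)
        lo, hi = (c, hi) if p > HIRE_RATE else (lo, c)

    # Posterior mean of real ability at perceived ability c (shrinkage).
    a_man = mu_m + sigma_m**2 / (sigma_m**2 + S_EPS**2) * (c - mu_m)
    a_woman = 1.0 / (1.0 + S_EPS**2) * c

    # Share of all applicants whose *real* ability exceeds each level.
    men_r, wom_r = NormalDist(mu_m, sigma_m), NormalDist(0.0, 1.0)
    frac = lambda a: P_MALE * (1 - men_r.cdf(a)) + (1 - P_MALE) * (1 - wom_r.cdf(a))
    return frac(a_woman), frac(a_man)

print(tails_at_posterior_means(0.1, 1.0))  # ≈ (0.0059, 0.0056)
print(tails_at_posterior_means(0.0, 1.1))
print(tails_at_posterior_means(0.1, 1.1))
```

The same function can be evaluated for the other two scenarios discussed in the text.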

This explains the phenomenon we observed above, namely that when the mean and/or the variance of ability is greater for men than for women, the presence of measurement error somewhat mitigates the extent to which men end up being overrepresented beyond what you would expect based on the fact that more of them apply. It’s also worth noting that, in some cases, not only $E[M \mid M' = x] > E[F \mid F' = x]$ but also $\operatorname{Var}(M \mid M' = x) > \operatorname{Var}(F \mid F' = x)$. In other words, when a man and a woman have the same perceived ability, the real ability of the man is higher on average, but there is also more uncertainty about his ability relative to that of the woman. Thus, if you are risk averse, this mitigates the extent to which you should discriminate against women. On the other hand, if you are risk seeking, it only gives you more reason to discriminate against them.

We can also derive the conditions under which $E[M \mid M' = x] > E[F \mid F' = x]$ in the model. In what follows, I continue to assume that $\sigma_{\epsilon_M} = \sigma_{\epsilon_F} = \sigma_\epsilon$, even when $\sigma_M \neq \sigma_F$. It’s easy to show that, when $\mu_M \neq \mu_F$ and $\sigma_M = \sigma_F = \sigma$, then $E[M \mid M' = x] - E[F \mid F' = x] = \frac{\sigma_\epsilon^2}{\sigma^2 + \sigma_\epsilon^2}(\mu_M - \mu_F)$, which does not depend on $x$, so $E[M \mid M' = x] > E[F \mid F' = x]$ if and only if $\mu_M > \mu_F$. Hence, in the case where $\mu_M > \mu_F$, $E[M \mid M' = x] > E[F \mid F' = x]$ for every value of $x$. Again, even without doing any math, this is exactly what you would have expected.

Similarly, it can be shown that, when $\mu_M = \mu_F = \mu$ but $\sigma_M \neq \sigma_F$, $E[M \mid M' = x] > E[F \mid F' = x]$ if and only if

$$(\sigma_M^2 - \sigma_F^2)(x - \mu) > 0$$

Thus, in the case where $\sigma_M > \sigma_F$, $E[M \mid M' = x] > E[F \mid F' = x]$ if and only if

$$x > \mu$$

This completes the discussion of the case in which $\mu_M = \mu_F$ but $\sigma_M \neq \sigma_F$.

Finally, when both $\mu_M \neq \mu_F$ and $\sigma_M \neq \sigma_F$, it can easily be shown that $E[M \mid M' = x] > E[F \mid F' = x]$ if and only if

$$(\sigma_M^2 - \sigma_F^2)\,x > \mu_F(\sigma_M^2 + \sigma_\epsilon^2) - \mu_M(\sigma_F^2 + \sigma_\epsilon^2)$$

Therefore, in the case where $\sigma_M > \sigma_F$, $E[M \mid M' = x] > E[F \mid F' = x]$ if and only if

$$x > \frac{\mu_F(\sigma_M^2 + \sigma_\epsilon^2) - \mu_M(\sigma_F^2 + \sigma_\epsilon^2)}{\sigma_M^2 - \sigma_F^2}$$

You can use these formulas to see how the model behaves for different values of the parameters.
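Within the model, you can also check numerically where the crossover occurs, i.e. the perceived-ability threshold below which the woman’s expected real ability is actually higher and above which the man’s is. This sketch uses the running illustrative parameters (mean +0.1 SD, standard deviation +10%, error SD 0.5, all assumptions of mine):

```python
mu_m, mu_f = 0.1, 0.0
s2_m, s2_f, s2_e = 1.1**2, 1.0**2, 0.5**2  # variances of ability and error

def post_mean(x: float, mu: float, s2: float) -> float:
    """Expected real ability given perceived ability x (normal shrinkage)."""
    return mu + s2 / (s2 + s2_e) * (x - mu)

# Threshold at which the two posterior means coincide.
x_star = (mu_f * (s2_m + s2_e) - mu_m * (s2_f + s2_e)) / (s2_m - s2_f)

below = post_mean(x_star - 1, mu_m, s2_m) < post_mean(x_star - 1, mu_f, s2_f)
above = post_mean(x_star + 1, mu_m, s2_m) > post_mean(x_star + 1, mu_f, s2_f)
print(round(x_star, 3), below, above)  # ≈ -0.595, True, True
```

Since Google’s cutoff sits far above this threshold, only the region where the man’s expected real ability is higher matters for hiring decisions at the margin.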

A common argument in favor of affirmative action is that it merely corrects the bias that, if minorities did not receive preferential treatment, would operate against them. In the model I’m using, this amounts to saying that $\mu_{\epsilon_F} < \mu_{\epsilon_M}$, i.e. that the perceived ability of the members of the disadvantaged group is biased downward. This argument is a defense of affirmative action on meritocratic grounds. The claim is that affirmative action will actually improve the quality of the people who are hired. The obvious problem with this argument is that, in order to say that, you must be able to show that, in the absence of affirmative action, there would be a bias against minorities, which is often difficult because it’s arguably false. But it’s not the only problem with this attempt to mount a defense of affirmative action on purely meritocratic grounds.

What I have just explained shows that, even if you can show that such a bias exists, it may still be the case that affirmative action reduces the quality of the people who are hired. Indeed, if minorities are on average less qualified, then if you want to maximize the ability of the people you hire you should discriminate against them. (In the model I’m using, assuming $\sigma_M = \sigma_F = \sigma$, in order for $E[M \mid M' = x] = E[F \mid F' = x]$, so that it would be optimal not to take group membership into account, it has to be the case that

$$\mu_{\epsilon_M} - \mu_{\epsilon_F} = \frac{\sigma_\epsilon^2}{\sigma^2}(\mu_M - \mu_F)$$

The resulting bias against women that is necessary to maximize the ability of the people you hire rapidly increases as the differences in ability between men and women become larger.) This makes the hurdle of mounting a defense of affirmative action on meritocratic grounds even more difficult. It suggests that, if you want to defend affirmative action, it’s probably safer to reject meritocracy. Indeed, I’m not saying that people should discriminate against minorities, only that if you accept meritocracy as I have defined it, then that’s what you ought to do.

Another common argument doesn’t try to justify giving preferential treatment to women and minorities, but claims that we ought to increase outreach efforts toward them. It’s one I often encountered when I was arguing that, if women are underrepresented in philosophy, it’s mostly because they tend to be less interested in philosophy than men. Some people acknowledge the point, but insist that, despite the dangers of this approach that I have warned against, we must nevertheless attempt to increase interest in philosophy among women. They claim that, if women are less interested in philosophy than men, the profession is depriving itself of many individuals who would be very good at philosophy. In other words, because women tend to be less interested in philosophy, we are currently not tapping into a large pool of philosophical talent. This argument also has a meritocratic flavor. The people who make it often acknowledge that, as we have seen, affirmative action will typically reduce the quality of the field even if there is no difference in ability between men and women. But they point out that, if we can somehow make women more interested in philosophy without giving them preferential treatment, we’ll actually increase the quality of the field.

Now, it’s true that we’d improve the quality of the field by increasing the number of women who pursue a career in philosophy, thereby reducing the underrepresentation of women in the field. However, this isn’t because doing so would increase the proportion of women in philosophy, for the same result could be achieved by increasing the number of men, which would further reduce the proportion of women! What improves the quality of the field is only that we increase the number of people who pursue a career in it. It doesn’t matter whether they are men or women. The mechanism at work here is just that, as more people pursue a career in philosophy, the number of people in the field with very high philosophical ability increases. It has nothing to do with the proportion of women among the people who are interested in philosophy per se. I guess one could argue that, insofar as women currently show less interest in philosophy than men, they are the lower-hanging fruit. But this is hardly obvious, for although men are more interested in philosophy than women, only a tiny proportion of them are interested in pursuing a career in the field. Moreover, even if we could somehow change women’s preferences and make them like philosophy more, we’d only deprive other fields of high-ability people.

I want to insist that nothing about what one should do obviously follows from anything I have said in this post, and I caution against thinking it has straightforward practical consequences. Figuring out what one should do requires not only that one understand the mathematical issues I have been exploring above, but also that one answer a host of complicated philosophical and empirical questions. For instance, I have shown that, if we accept a purely meritocratic conception of what should determine who gets hired and men are overrepresented among the most qualified applicants, then one should discriminate against women even in individual cases, because of the presence of measurement error. But this doesn’t show that we should accept a purely meritocratic conception of hiring or that men are indeed overrepresented among the most qualified applicants. However, unless you understand the mathematical issues I have been discussing in this post, you won’t be able to address either the philosophical or the empirical issues well.

Indeed, while the issues I have explored in this post don’t force you to adopt any position in particular about affirmative action, not only do they constrain what you can say once you have answered the philosophical and empirical questions, but they should also inform any discussion of the philosophical and/or empirical issues. For instance, you may have thought that meritocracy was the way to go, until you realized it can mean that people should sometimes discriminate against underrepresented groups. Since one person’s modus ponens is another’s modus tollens, this observation could be part of a philosophical argument against meritocracy as I have defined it. Similarly, if you don’t understand that differences in preferences and/or ability between groups can result in large disparities even in the presence of affirmative action, you may conclude like Matthew Yglesias that affirmative action doesn’t exist in cases where it probably does.

Indeed, it’s really amazing what kind of nonsense people write about affirmative action, even when they should know better. For instance, here is something I hear all the time, but in this case the person who says it is a professor of sociology at a reputable university.

I teach this every semester in my undergrad class and their heads explode pic.twitter.com/dsVcntP1b4 — Tressie Mc (@tressiemcphd) August 5, 2017

In fact, according to this paper from 2004, the odds of being admitted to an elite university are 50% higher if you’re a woman, after you control for race, SAT scores, etc. They are 550% higher if you’re black, 418% higher if you’re an athlete, and 305% higher if you’re a legacy. (The claim that men are the greatest beneficiaries of affirmative action in college admissions comes from this article in the New York Times, where absolutely no evidence is given to support it, presumably because there isn’t any.) There is plenty of room for legitimate disagreement on affirmative action, but as long as people don’t get clear on the relevant facts, the debate will continue to be plagued by that kind of nonsense.