Last Thursday I gave a talk at the American Philosophical Association’s Central Division meetings about patterns in publication and citation in some of the field’s major journals. I have a more extensive analysis of the data that’s almost done, but that deserves a paper of its own rather than a post. Here I’ll confine myself mostly to descriptive material about some broad trends, together with a bit of discussion at the end.

I examined patterns of publication and citation in four high-prestige, nominally general-interest journals within Anglo-American philosophy. The present analysis grew out of some earlier work. A while ago, I took a sample of twenty years’ worth of articles in the Journal of Philosophy, Mind, the Philosophical Review and Nous. I used it to construct a co-citation network. By examining the references contained in the 2,100 articles in this dataset, and seeing which ones tended to be cited together, it’s possible to construct a picture of the works the philosophers publishing in those journals have been talking about, and how those works are related. Our 2,100 articles between them cite about 34,000 items, forming a network with about a million edges. The network was relatively easy to thin out in part because philosophers don’t cite much. The articles in the dataset typically cited only about fifteen items, which is very low in comparison to other fields. This is a benefit from my point of view. It makes citation more informative in philosophy than it is in other fields. By looking at which items tended to be most cited together, a nice picture emerges of the discipline’s topical structure as it existed in these journals during this period. You can read the original post and follow-up items for more details on the methods and results.
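For readers who want a concrete picture of what "seeing which ones tended to be cited together" involves, here is a minimal sketch in Python. It is not the code used for the analysis. The function names, the threshold, and the toy reference lists are all hypothetical, and the thinning rule (drop pairs co-cited fewer than some number of times) is just one simple possibility.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(reference_lists):
    """Count how often each pair of cited items shares a reference list."""
    edges = Counter()
    for refs in reference_lists:
        # Deduplicate within an article, then sort so that each pair
        # always appears in one canonical order.
        for a, b in combinations(sorted(set(refs)), 2):
            edges[(a, b)] += 1
    return edges


def thin(edges, min_weight=2):
    """Keep only pairs co-cited at least `min_weight` times (assumed rule)."""
    return {pair: w for pair, w in edges.items() if w >= min_weight}


# Hypothetical toy data: each inner list is one article's references.
toy_articles = [
    ["item_A", "item_B", "item_C"],
    ["item_A", "item_B"],
    ["item_C", "item_A"],
]

toy_edges = cocitation_counts(toy_articles)
toy_strong = thin(toy_edges, min_weight=2)
```

Each article contributes one edge for every pair of items in its reference list, so short reference lists keep the graph sparse.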

Figure 1. Co-Citation Network of top 500 most-cited items over 20 years. Main topical locations fancifully labeled; items authored by women marked with red dots.

One feature of these results was that in the network of the five hundred or so most-cited items, only nineteen were by women. Some quite large components of the graph (including, for example, the part I think of as Epistemic Island) had no women authors at all. Within the network, the most-cited author, David Lewis, had twice as many items on the graph as all of the women combined. These findings fed into a wider debate amongst philosophers about women in the field and prompted me to take a closer look at the gender composition.

A natural thing to do at this point would be to code all 34,000 items in the citation network, not just the top five hundred most-cited ones, and do some analysis from there. This is much easier to say than to do. The records are messy. More often than not items are cited by last name and initial only. Throwing those away is far too lossy and would introduce a lot of bias in any estimates. This and other features of the raw data make it difficult to process. We’re still working on more or less clever ways to tackle this issue, but in the meantime what we’ve done is take a look not at the network of cited items but at the 2,100 or so published articles that form the basis of the dataset. The talk at the APA was about those papers, as is the rest of this post.

My student Nick Bloom and I coded the 2,100 papers by gender. This was tricky at times, but we were careful. (As Nick somewhat bitterly remarked at one point during the validation phase, “Only in Philosophy do Hilary, Shelley, and Jody all turn out to be dudes.”) These papers were written by just over 1,100 authors. Again, these are all the papers published between 1993 and 2013 in Nous, Phil. Review, Mind, and J. Phil. We set aside co-authored papers for now, but in any case philosophy is a field where co-authorship is extremely rare.

Publications by Women in Each Journal

Let’s begin by looking at how often women are published in each journal. For the data as a whole, 87.5 percent of articles are by men and 12.5 percent by women. This varies a bit by journal. For all four journals considered together the percentage of female authors rose from about eleven percent in 1993 to about fourteen percent in 2013. But the annual totals bounce around a bit. Figure 2 shows the annual percentage of articles in each journal written by women over the twenty years.

Figure 2. Percent of articles by women in each journal, 1993-2013.

Within each journal, as we would expect, we see some variability. Nous puts out more issues than the other three, so its trend line is a bit smoother. It also managed to publish more than zero articles by women every year between 1993 and 2013. By contrast, J. Phil., Mind, and Phil. Review each have two years in the time series where they published no women at all.

The Matthew Effect is a Harsh Mistress

If 87.5 percent of articles are by men, what about the rate of citation? The data includes a measure of how many total citations each article has received, based on mentions in the large database of journals tracked by Web of Science. This under-counts the true number because it doesn’t count citations in books, or in journals not tracked by WoS. On the other hand, WoS has wide coverage and counts should be consistent. Figure 3 is a histogram showing the distribution of total citations to all articles in the data.

Figure 3. Citation Frequency. Publication is not the same as influence.

The story here is rather sobering and, if you’re familiar with the literature on citation, unsurprising. Citation counts are highly skewed. Even though these are all peer-reviewed articles published in high-prestige journals, almost a fifth of them are never cited at all, and just over half of them are cited five times or fewer. A very small number of articles are cited more than twenty or thirty times. Getting cited just twenty-five times is enough to put a paper in the top decile of the distribution. (As I said, philosophers don’t cite each other much. Although English-speaking philosophy is an outlier within the humanities in many ways, its very low rate of citation to other work is one of the most consequential ways philosophy remains much more like the English department than anything that goes on over in the Science or Engineering buildings.) The top one percent of papers are cited seventy-five times or more. The most-cited paper in the data has just shy of 300 citations. To make the long tail of this distribution easier to see, we can put the x-axis on a log-like scale.

Figure 4. Citation counts, IHS scale.
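Assuming "IHS" in the caption stands for the inverse hyperbolic sine, the transformation can be sketched in a few lines of Python. The appeal of this scale for citation counts is that it behaves like a logarithm for large values but is still defined at zero, which matters when almost a fifth of the articles have no citations. The helper name `ihs` is mine, and the example counts are just the thresholds mentioned in the text.

```python
import math


def ihs(x):
    """Inverse hyperbolic sine: log(x + sqrt(x**2 + 1)).

    Unlike log(x), this is defined at x = 0, where it equals 0.
    """
    return math.asinh(x)


# Counts taken from the thresholds discussed in the text.
example_counts = [0, 5, 25, 75, 300]
transformed = [ihs(c) for c in example_counts]

# For large counts the transform is very close to log(2 * x).
gap_at_300 = abs(ihs(300) - math.log(2 * 300))
```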

Are Articles Written by Women Cited Less Often?

From the co-citation analysis we already know that within the articles published in our four journals women make up just 3.5 percent of the 500 most-cited items. We don’t have a baseline for the number of potentially citeable items here in general, nor do we know whether that 3.5 percent is proportional to the number of women amongst the full count of cited items. (This was one of the motivations for wanting to code all 34,000 by gender.) For the case of the articles themselves, though, we do have a base rate: 87.5 percent of the published articles are by men, and 12.5 percent are by women. If we add up the total citations held by those articles, we find that articles written by men have 88 percent of the citations, and those by women have 12 percent of the citations. So at this level of resolution, things are proportional in the sense that the share of citations to articles by women lines up with the overall share of articles by women. On the average, articles by women are not cited less often than articles by men. It’s the very low base-rate of articles by women that’s driving things.
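The comparison in this section is simple arithmetic: each group's share of articles set against its share of total citations. A minimal sketch follows, using made-up toy records rather than the real data. The function name and the record layout are assumptions for illustration only.

```python
from collections import defaultdict


def shares(records):
    """Return {group: (share of articles, share of citations)}.

    `records` is an iterable of (group, citation_count) pairs.
    Assumes at least one record and at least one citation overall.
    """
    n_articles = defaultdict(int)
    n_citations = defaultdict(int)
    for group, citations in records:
        n_articles[group] += 1
        n_citations[group] += citations
    total_articles = sum(n_articles.values())
    total_citations = sum(n_citations.values())
    return {
        g: (n_articles[g] / total_articles, n_citations[g] / total_citations)
        for g in n_articles
    }


# Hypothetical toy records, NOT the real data: seven articles in one
# group and one in the other, with invented citation counts.
toy_records = [
    ("men", 10), ("men", 0), ("men", 4), ("men", 6),
    ("men", 0), ("men", 8), ("men", 0), ("women", 4),
]
toy_shares = shares(toy_records)
```

In the toy data both shares for the smaller group come out at one eighth, which is the sense of "proportional" used here: the two numbers in each pair match.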

Skewed Citation Rates and Gender

We’re not quite done, though. Overall, citations are proportional, given the low base rate of women in the field. At the same time, rates of citation in general are extremely skewed. It’s worth looking more closely at what these two things mean together. First, let’s look at articles spread out by age, as naturally the older pieces will tend to have more citations. Note that the age-range here doesn’t track cumulative citations, so we don’t see the true careers of individual articles. Instead we have a snapshot of the current state of articles of different ages. In Figure 5, below, each article is a dot; blue dots are male-authored, red dots are female-authored. We fit lines for men and women, using a model appropriate to the overdispersed count data (based on a negative binomial distribution). The y-axis is transformed so that the skew doesn’t dominate everything else. The points are jittered a little just to make them easier to distinguish from one another.

Figure 5. Citation Frequency by Age of Article, for Men and Women.
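A note on "overdispersed": in a Poisson model the variance of the counts equals their mean, while citation counts typically have a variance far larger than the mean. The sketch below is not the regression behind Figure 5. It is a much simpler moment-based check, assuming the common negative binomial parameterization in which the variance is mu + alpha * mu**2. The function name and the toy counts are hypothetical.

```python
from statistics import mean, variance


def nb_dispersion(counts):
    """Method-of-moments estimate of alpha in Var = mu + alpha * mu**2.

    alpha near zero is what a Poisson model would imply; a clearly
    positive alpha signals overdispersion. Assumes the mean is nonzero.
    """
    mu = mean(counts)
    var = variance(counts)  # sample variance (n - 1 denominator)
    return (var - mu) / mu ** 2


# Hypothetical toy counts with a long right tail, NOT the real data.
toy_counts = [0, 0, 1, 2, 2, 3, 5, 8, 25, 75]
toy_mean = mean(toy_counts)
toy_variance = variance(toy_counts)
toy_alpha = nb_dispersion(toy_counts)
```

When the variance sits far above the mean, as it does in these toy counts, a plain Poisson fit would badly understate the uncertainty, which is the reason to reach for a negative binomial model.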

As you can see, the fits overlap fully. The error range is wider for women than for men because there are so few of them. On the average there’s no difference. But as we’ve said, on the average hardly anyone is getting cited, be they man or woman. If you look at the top half of Figure 5, you can see that above 75 citations or so there aren’t any red dots. By definition very few people make it to the upper end of the distribution. In this case none of them are women. We can investigate this more closely by separating the articles by gender and fitting lines picking out the average number of citations to the median article, to articles at the 90th percentile, and to the top one percent of articles. Figure 6 shows what that looks like.

Figure 6. Citations by Age at the 50th, 90th, and 99th percentiles.
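The lines in Figure 6 are fitted across article age. As a simpler illustration of the underlying idea, here is a sketch that ignores age and just reads off the 50th, 90th, and 99th percentiles of citation counts within each group. It uses the nearest-rank definition of a percentile, which is my choice for the example and not necessarily what produced the figure. The names and toy records are hypothetical.

```python
def percentile(values, p):
    """Nearest-rank percentile for an integer p between 1 and 100.

    Returns the smallest value such that at least p percent of the
    data are at or below it.
    """
    ordered = sorted(values)
    # Integer ceiling of p * n / 100, avoiding floating-point rounding.
    rank = max(1, -(-p * len(ordered) // 100))
    return ordered[rank - 1]


def summarize(records, levels=(50, 90, 99)):
    """Group (group, citations) records and report percentiles per group."""
    by_group = {}
    for group, citations in records:
        by_group.setdefault(group, []).append(citations)
    return {
        g: {p: percentile(v, p) for p in levels}
        for g, v in by_group.items()
    }


# Hypothetical toy records, NOT the real data.
toy_quantile_records = (
    [("a", v) for v in range(1, 101)] + [("b", v) for v in (3, 1, 2)]
)
toy_summary = summarize(toy_quantile_records)
```

With a small group the upper percentiles are pinned to the largest observed value, which is one reason estimates at the 99th percentile are so much noisier for the smaller group.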

Here we see that the median article for both men and women (that’s the red line) shows the same path from about two to about ten citations over the age range. The green line shows how things look at the 90th percentile (very successful articles, relatively speaking), and again things are broadly similar. But for the top one percent of articles things look different. A much larger gap opens up. Transforming the y-axis back to a linear scale makes this clear.

Figure 7. The One Percent in Philosophy.

Participation and Agenda-Setting

In many cases we don’t care a great deal about life at the very top of a distribution. Should we in this case? I think so, in part for the reasons that the original co-citation data brought out. The key question is, who gets to be a focal point for discussion? Successful, highly-cited articles don’t simply accrue status rewards in the abstract (or just actual money rewards for their authors). They also become centers of gravity that define what a field is about. Newcomers must orient themselves to those articles and the debates they have begun. So, success in an academic field isn’t just skewed in the way that a winner-take-all market or a tournament setting is skewed. Success means you structure the substance of the field. It’s like an Olympic event where the path taken by the winners also sets the shape of the track for future competitors.

Think of it this way. Getting published in a high-prestige journal is very difficult. All of the articles in our dataset are, by definition, already highly successful just in virtue of being selected for content and quality in the peer-review process. That skewness goes all the way back, so to speak, in the sense that most people who get Ph.D.s never publish much (or at all) in academic journals, and most people who publish peer-reviewed articles never publish in their field’s highest-prestige journals. But, sad to say, even clearing this very high bar is no guarantee that anyone will want to talk about your work. Most of the time, publishing in a high-status journal is in effect a permit to talk about other people’s work. Your contribution must be framed by the central items in your field. And while it’s hard to clear that bar, a much, much smaller number of articles move from being items in the literature to being topics of the literature. Most of the articles in our dataset are “contributions to the literature” in the first sense, but not the second. In this case the agenda-setters, as opposed to the participants, are all men.

As I’ve said before, an academic discipline is a kind of exclusive conversation. Even for very successful entrants, “participation” usually means being present as a contributor, but not as a topic-maker. It’s a little like attending an ongoing public debate. You need a ticket to enter and sit in the audience. You may be called on by the moderator to ask a question or make a point from the floor. But the agenda for the discussion is set by a much smaller panel of people up on the stage—people who started out as audience members. Most of academic life has this structure, from departmental talks to conference panels and plenaries to journal exchanges. A key issue then becomes how work gets selected for attention up on the stage, who gets engaged with from the stage, so to speak, and how this process plays out as new audience members come in the door.