Walk Hickey published an article on April 1 on FiveThirtyEight, titled The Dollar-And-Cents Case Against Hollywood’s Exclusion of Women. The article examines the relationship between movies' finances and their portrayals of women using a well-known heuristic call the Bechdel test. The test has 3 simple requirements: a movie passes the Bechdel test if there are (1) two women in it, (2) who talk to each other, (3) about something besides a man.

Let me say at the outset, I like this article: It identifies a troubling problem, asks important questions, identifies appropriate data, and brings in relevant voices to speak to these issues. I should also include the disclaimer that I am not an expert in the area of empirical film studies like Dean Keith Simonton or Nick Redfern. I've invested a good amount time in criticizing the methods and findings of this article, but to Hickey's credit, I also haven't come across any scholarship that has attempted to quantify this relationship before: this is new knowledge about the world. Crucially, it speaks to empirical scholarship that has exposed how films with award-winning female roles are significantly less likely to win awards themselves [5], older women are less likely to win awards [6], actresses' earnings peak 17 years earlier than actors' earnings [7], and differences in how male and female critics rate films [8]. I have qualms about the methods and others may be justified in complaining it overlooks related scholarship like those I cited above, but this article is in the best traditions of journalism that focuses our attention on problems we should address as a society.

Hickey's article makes two central claims:

We found that the median budget of movies that passed the test...was substantially lower than the median budget of all films in the sample. We found evidence that films that feature meaningful interactions between women may in fact have a better return on investment, overall, than films that don’t.

I call Claim 1 the "Budgets Differ" finding and Claim 2 the "Earnings Differ" finding. The results, as they're summarized here are relatively straightforward to test whether there's an effect of Bechdel scores on earnings and budget controlling for other explanatory variables.

But before I even get to running the numbers, I want to examine the claims Hickey made in the article. The interpretations he makes about the return on investment are particularly problematic interpretations of basic statistics. Hickey reports the following findings from his models (emphasis added).

We did a statistical analysis of films to test two claims: first, that films that pass the Bechdel test — featuring women in stronger roles — see a lower return on investment, and second, that they see lower gross profits. We found no evidence to support either claim. On the first test, we ran a regression to find out if passing the Bechdel test corresponded to lower return on investment. Controlling for the movie’s budget, which has a negative and significant relationship to a film’s return on investment, passing the Bechdel test had no effect on the film’s return on investment. In other words, adding women to a film’s cast didn’t hurt its investors’ returns, contrary to what Hollywood investors seem to believe. The total median gross return on investment for a film that passed the Bechdel test was \$2.68 for each dollar spent. The total median gross return on investment for films that failed was only \$2.45 for each dollar spent. ...On the second test, we ran a regression to find out if passing the Bechdel test corresponded to having lower gross profits — domestic and international. Also controlling for the movie’s budget, which has a positive and significant relationship to a film’s gross profits, once again passing the Bechdel test did not have any effect on a film’s gross profits.

Both models (whatever their faults, and there are some as we will explore in the next section) apparently produce an estimate that the Bechdel test has no effect on a film's financial performance. That is to say, the statistical test could not determine with a greater than 95% confidence that the correlation between these two variables was greater or less than 0. Because we cannot confidently rule out the possibility of there being zero effect, we cannot make any claims about its direction.

Hickey argues that passing the test "didn't hurt its investors' returns", which is to say there was no significant negative relationship, but neither was there a significant positive relationship: The model provides no evidence of a positive correlation between Bechdel scores and financial performance. However, Hickey switches gears an in the conclusions, writes:

...our data demonstrates that films containing meaningful interactions between women do better at the box office than movies that don’t...

I don't know what analysis supports this interpretation. The analysis Hickey just performed, again taking the findings at their face, concluded that "passing the Bechdel test did not have any effect on a film’s gross profits" not "passing the Bechdel test increased the film's profits." While Bayesians will cavil about frequentist assumptions --- as they are wont to do --- and the absence of evidence is not evidence of absence, the "Results Differ finding" is not empirically supported in any appropriate interpretation of the analysis. The appropriate conclusion from Hickey's analysis is "there no relationship between the Bechdel test and financial performance," which he makes... then ignores.

What to make of this analysis? In the next section, I summarize the findings of my own analysis of the same data. In the subsequent sections, I attempt to replicate the findings of this article, and in so doing, highlight the perils of reporting statistical findings without adhering to scientific norms.