Film Dialogue from 2,000 screenplays, Broken Down by Gender and Age

Lately, Hollywood has been taking so much shit for rampant sexism and racism. The prevailing theme: white men dominate movie roles. But it’s all rhetoric and no data, which gets us nowhere in terms of having an informed discussion. How many movies are actually about men? What changes by genre, era, or box-office revenue? What circumstances generate more diversity?

We didn’t set out trying to prove anything, but rather to compile real data. We framed it as a census rather than a study. So we Googled our way to 8,000 screenplays and matched each character’s lines to an actor. From there, we compiled the number of words spoken by male and female characters across roughly 2,000 films, arguably the largest undertaking of script analysis, ever.

Let’s begin by examining dialogue, by gender, for just Disney films.

Screenplay Dialogue, Broken-down by Gender

2,000 Screenplays: Dialogue Broken-down by Gender

Only High-Grossing Films: Ranked in the Top 2,500 by US Box Office*

*Domestic gross over $45M, inflation-adjusted. Using IMDB box office, 2,500 have hit this threshold.

In January 2016, researchers reported that men speak more often than women in Disney’s princess films. We validated this claim and doubled the sample size to 30 Disney films, including Pixar. The results: 22 of 30 Disney films have a male majority of dialogue.

Even in films with female leads, such as Mulan, the dialogue swings male. Mushu, her protector dragon, has 50% more words of dialogue than Mulan herself.

This dataset isn’t perfect. As with Mulan, a plot can center around a character even though the dialogue doesn’t reflect it. And all of our data is based on screenplays, which are not a perfect transcription of a film.

Methodology

For each screenplay, we mapped characters with at least 100 words of dialogue to a person’s IMDB page (which identifies people as an actor or actress). We did this because minor characters are poorly labeled on IMDB pages.

This has unintended consequences: Schindler’s List, for example, has women with lines, just not over this threshold. That means a more accurate result would be 99.5% male dialogue instead of our result of 100%.

There are other problems with this approach as well: films change quite a bit from script to screen. Directors cut lines. They cut characters. They add characters. They change character names. They cast a different gender for a character. We believe the results are still directionally accurate, but individual films will definitely have errors.
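The effect of the 100-word threshold can be illustrated with a minimal Python sketch. The tuple layout, the function name `gender_share`, and the word counts below are all assumptions invented for illustration; they are not the project's actual data format or figures. The toy numbers are chosen so that the threshold turns a 99.5% male result into 100%, the same kind of distortion described for Schindler’s List.

```python
from collections import defaultdict

MIN_WORDS = 100  # threshold from the methodology: characters below it are dropped


def gender_share(characters, min_words=MIN_WORDS):
    """Return {gender: fraction of words} over characters meeting the threshold.

    `characters` is a list of (name, gender, word_count) tuples. This shape
    is an assumption for illustration, not the project's real schema.
    """
    totals = defaultdict(int)
    for _name, gender, words in characters:
        if words >= min_words:
            totals[gender] += words
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {gender: count / grand_total for gender, count in totals.items()}


# Toy numbers, invented for illustration only.
toy_film = [
    ("Lead", "m", 6000),
    ("Rival", "m", 3950),
    ("Neighbor", "f", 50),  # under 100 words, so excluded by the threshold
]

with_threshold = gender_share(toy_film)               # only the two male roles count
without_threshold = gender_share(toy_film, min_words=0)  # every role counts
```

With the threshold applied, the toy film comes out entirely male; with it removed, the 50-word female role brings the male share down to 9,950 of 10,000 words.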

2,000 Screenplays: Dialogue Broken-down by Gender

Each screenplay has at least 90% of its lines categorized by gender. If you notice a missing character from the analysis, their lines may be in the remaining 10%. If a character was cut from the film but is present in the screenplay, we inferred his or her gender based on the script’s pronouns.

Across thousands of films in our dataset, it was hard to find a subset that didn’t over-index male. Even romantic comedies have dialogue that is, on average, 58% male. For example, Pretty Woman and 10 Things I Hate About You both have lead women (i.e., characters with the most dialogue). But the overall dialogue for both films is 52% male, due to the number of male supporting characters.

How many screenplays have women as lead characters? In 22% of our films, actresses had the most dialogue (i.e., they were the lead). Women are more likely to have the second-most dialogue, which occurs in 34% of films. The most abysmal stat: women occupy at least 2 of the top 3 roles in only 18% of our films. That same scenario for men occurs in about 82% of films.
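The "lead" and "top 3 roles" definitions above come down to ranking characters by word count. Here is a minimal sketch, assuming each film is a list of (name, gender, word_count) tuples; the function names and the two toy films are hypothetical and not drawn from the dataset.

```python
def rank_by_dialogue(characters):
    """Sort (name, gender, word_count) tuples from most to fewest words."""
    return sorted(characters, key=lambda c: c[2], reverse=True)


def lead_gender(characters):
    """Gender of the character with the most dialogue, or None if empty."""
    ranked = rank_by_dialogue(characters)
    return ranked[0][1] if ranked else None


def women_in_top_three(characters):
    """Count how many of the three largest speaking roles are female."""
    top_three = rank_by_dialogue(characters)[:3]
    return sum(1 for _name, gender, _words in top_three if gender == "f")


# Toy films with invented word counts, for illustration only.
toy_films = {
    "Film A": [("A1", "f", 3000), ("A2", "m", 2500), ("A3", "m", 2000), ("A4", "m", 1500)],
    "Film B": [("B1", "m", 4000), ("B2", "f", 3000), ("B3", "f", 1000)],
}

female_led = [title for title, cast in toy_films.items() if lead_gender(cast) == "f"]
two_of_top_three = [
    title for title, cast in toy_films.items() if women_in_top_three(cast) >= 2
]
```

Film A shows how a film can have a female lead and still be majority-male overall: the lead has 3,000 words, but the three male supporting roles add up to 6,000.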

Aging out of Hollywood: Men vs. Women

For each film, we also determined the age of each cast member at the time of its release. This allowed us to quantify whether there is a bias toward younger women in Hollywood (or conversely, whether men enjoy a longer career).
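One way to turn "age at the time of release" into a percent of dialogue per age range is sketched below. The age brackets, the (gender, age, word_count) layout, and the function names are illustrative assumptions, not the project's actual bucketing or code.

```python
from collections import defaultdict
from datetime import date


def age_at_release(birth_date, release_date):
    """Whole years between an actor's birth and the film's release."""
    had_birthday = (release_date.month, release_date.day) >= (
        birth_date.month,
        birth_date.day,
    )
    return release_date.year - birth_date.year - (0 if had_birthday else 1)


def age_bucket(age):
    """Assign an age to a bracket; these cut points are illustrative assumptions."""
    if age <= 21:
        return "21 and under"
    if age <= 31:
        return "22-31"
    if age <= 41:
        return "32-41"
    if age <= 64:
        return "42-64"
    return "65 and over"


def share_by_bucket(roles):
    """roles: list of (gender, age, word_count). Returns {gender: {bucket: share}}."""
    words = defaultdict(lambda: defaultdict(int))
    for gender, age, count in roles:
        words[gender][age_bucket(age)] += count
    shares = {}
    for gender, buckets in words.items():
        total = sum(buckets.values())
        if total:  # skip genders with no recorded words
            shares[gender] = {b: n / total for b, n in buckets.items()}
    return shares
```

Shares are computed within each gender, so the brackets for women sum to 100% and the brackets for men sum to 100%, which makes the two age distributions directly comparable.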

Percent of Dialogue by Actors’ Age Among 2,000 Screenplays, all genres, all years

The amount of dialogue, by age range, is completely opposite for women versus men. Dialogue available to women who are over 40 years old decreases substantially. For men, it’s the exact opposite: there are more roles available to older actors.

Here’s another look at the same data, but for every age:

Why we made this

This project was born out of the less-than-stellar response to our analysis of films that fail the Bechdel Test. Commenters were quick to point out that the Bechdel Test is flawed and there are justifiable reasons for films to fail (e.g., they are historic). By measuring dialogue, we have a much more objective view of gender in film.

Many readers are drawing conclusions that were anecdotally obvious to women in the film industry. But nobody wanted to do the grunt work of gathering the data. We spent weeks just matching scripts to IMDB pages. It’s still not perfect, but we’re now in a much better place than “you know...women are never love-interests when they’re older than 40. ¯\_(ツ)_/¯”

All of our sources are available in this Google Doc and as much data as we can share (without getting sued) is available here on Github. Here's an FAQ that addresses concerns about the methodology and data. Or if you don’t know how to code, here’s an easy way to comb through every film, genre, and year.

All Films’ Dialogue, by Cast Member and Gender
