Author’s note: This post is based on a longer research paper that can be accessed here. Ariel Shin approached the same problem in this recent post. This analysis uses a different methodology and examines some other issues to continue to shed light on performance disparities in competitive debate. Thank you to Chris Palmer for providing Tabroom.com data and to Professor Claudia Goldin, Professor Lawrence Katz, and Priya Shanmugam for helpful comments.

Summary

I examine results from 89 National Circuit Lincoln Douglas competitions from the 2011-12 to 2015-16 seasons using Tabroom.com data. On average, I find that women are about 4 percentage points less likely than men to win preliminary debate rounds. A win-loss gap of 2.5 to 3 percentage points persists despite controlling for individual and tournament characteristics, suggesting that the results are not solely driven by differences in debate experience or the quality of debate programs. An analysis of the 2016 graduating cohort suggests that part of the gender gap is a result of differential attrition: Women who debate at least once as sophomores are 2.5 percentage points less likely than men to debate as juniors. Moreover, the gap appears to be much larger in rounds 1 and 2 of a tournament than in other preliminary rounds, though it is impossible to determine whether this gap is due more to differences in experience or preparation going into the tournament or to tournament-specific factors such as judge biases. However, I find no evidence that being assigned a female judge improves female debaters’ performance in specific rounds.

Introduction

Anecdotal evidence suggests the existence of a serious gender gap in competitive success in the high school debate community [1]. Only one woman finished in the top sixteen in the Lincoln Douglas division at the 2016 Tournament of Champions. While discussions of gender disparities abound (see, e.g., Timmons & Boyer 2013), few have quantitatively examined the determinants of success in Lincoln Douglas debate. Shin (2016) demonstrates a positive association between gender and competitive success in Lincoln Douglas tournaments and distinguishes between male and female performance across different kinds of debate programs. However, further research is needed to determine whether this association is partly driven by confounding variables. Moreover, potential solutions are missing from the existing literature.

Although this paper deals with the somewhat narrow topic of gender gaps in one segment of a competitive high school extracurricular activity, it lies at the intersection of several parts of the labor economics literature. First, it relates to previous research on inequality in outcomes in academic competitions (e.g., Smith 2013) and to research on sex-based hiring (e.g., Goldin and Rouse 2000). Second, the paper ties into research on the economics of education. In particular, previous studies have linked instruction by a same-gender teacher to improved student performance and teachers’ perceptions of students (Dee 2007). Others have found mixed or negative results (Antecol et al. 2012). This paper studies the effect of a female judge on female students in a high-stakes academic environment.

Competitive debate is also a particularly attractive setting in which to examine gender stereotypes. Many of the factors cited as barriers to female success in the workplace, including criticism of attire, voice, and tone, may exist in debate because of its focus on unavoidably subjective evaluations of communicative and argumentative skills. This paper’s focus on national-level debate tournaments combines education and competition and thus bridges the gap between the two sets of academic literature.

There are many possible theoretical explanations for gender-related differences in performance: Judges may be biased against women, male debaters may train more or have better access to coaches, women who experience overt sexism or more latent hostility may choose to leave the activity, etc. Without an empirical analysis, it is impossible to either quantify the gender gap or explore its potential underlying mechanisms.

Data

To include a comprehensive sample of competitions, I cross-reference all tournaments on Tabroom.com with a list of National Circuit tournaments that offer bids for the 2015-16 season listed on the Tournament of Champions website. Although many smaller tournaments use other tournament tabulation software, all but one of the tournaments that offer bids to quarterfinalists or round of 16 participants appear in the Tabroom.com data [2]. I attempt to include as many tournaments as is possible given the dataset. My sample contains 89 bid tournaments from the 2011-12 through 2015-16 seasons, a list of which can be found in Appendix C of the full paper.

Since Tabroom allows but does not require coaches to indicate the gender of debaters or judges, about 21% of observations for debaters and 50% for judges are initially missing gender labels. I adopt three strategies to assign genders to missing observations. First, I use 1990 Census data containing about 5,500 common baby names. The Census data corresponds to people who were 25 years old in 2015, which is a reasonable approximation for judges (who are often college or graduate students) as well as debaters. In cases where the same name appears in both the male and female Census lists, I assign the more common gender associated with the name. Second, I merge the Tabroom data with a list of common names of South Asian origin I found on GitHub, a website where programmers and researchers can share code and datasets. Third, I manually assign gender in what I believe are clear-cut cases [3]. After the three procedures, 99% of debaters and 96% of judges have assigned genders. The vast majority of the improvement is due to the official Census data.
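The name-matching step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used for the paper: the function names, the "First Last" name format, and the toy frequency numbers are all hypothetical, and the real Census files would first need to be loaded into the two dictionaries.

```python
def build_name_lookup(male_counts, female_counts):
    """Map each lowercase first name to 'M' or 'F', whichever is more common."""
    lookup = {}
    for name in set(male_counts) | set(female_counts):
        m = male_counts.get(name, 0)
        f = female_counts.get(name, 0)
        if m == f:
            continue  # exact tie: leave the name unassigned
        lookup[name] = "M" if m > f else "F"
    return lookup


def assign_gender(full_name, reported, lookup):
    """Keep a coach-reported label if present; otherwise infer from first name."""
    if reported in ("M", "F"):
        return reported
    parts = full_name.strip().split()
    if not parts:
        return None
    # Assumes names are stored as "First Last" (an assumption of this sketch).
    return lookup.get(parts[0].lower())


# Toy frequencies (hypothetical numbers, not actual Census values).
male_counts = {"michael": 1000, "jordan": 300, "daniel": 800}
female_counts = {"catherine": 700, "jordan": 120, "daniella": 90}
lookup = build_name_lookup(male_counts, female_counts)

print(assign_gender("Jordan Lee", None, lookup))
print(assign_gender("Catherine Smith", "", lookup))
```

Names that appear in neither list come back as `None`, which corresponds to the observations that remain unlabeled until the later steps.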

Results

General Summary Statistics

Table 1 reports summary statistics for men and women separately. There are three key take-aways. First, men comprise about 60% of the competitors and an even higher fraction of total observations. On average men in the database compete in 42 preliminary rounds while women compete in about 35. Second, men win a higher fraction of debates: There is a 3.7 percentage point male-female win gap in preliminary rounds. Finally, the performance gap in elimination rounds is even larger. Men are 12 percentage points more likely to win an elimination round than women. These differences are all statistically significant (see Table 1).

Table 1. Summary Statistics: Performance of National Circuit Lincoln Douglas Debaters by Gender

Note: Table 1 reports summary statistics for high school Lincoln Douglas debate tournament results on Tabroom.com. The unit of observation is an individual. a The total number of identifiable unique competitors (those whose genders are either labeled or can be inferred using Census data) is 4,666. b Restricted to National Circuit tournaments with 6 preliminary rounds. c Speaker points are generally awarded on a scale of 0-30; in practice, the scale is about 25-30, with 27.5-28 being typical for an average varsity debater. Standard errors in parentheses.

Source: Tabroom.com National Circuit Lincoln Douglas Debate competition results for a sample of 89 tournaments spanning the 2011-12 to 2015-16 seasons. See Appendix C for list of tournaments.

Are Women Leaving the Activity?

One explanation for differential performance may be that women are less likely to continue debating for all four years of high school. If male debaters are more likely than female debaters to persist in the activity, all else being equal they will accumulate more experience and perform better. Two data points suggest that this is the case. First, the average graduation year for men is about 2 months earlier than for women. Moreover, 46% of current high school freshmen are female compared to only 33% of those who graduated high school in 2015. However, this statistic could be misleading: If more women have begun competing in Lincoln-Douglas debate over the last few years, we would expect there to be relatively more young female debaters even absent differential attrition.

To resolve this issue, I restrict the sample to the cohort of debaters graduating from high school in 2016. I then calculate a “participation gap,” defined as the difference between the fractions of male and female debaters who, conditional on having debated as sophomores, also debate as juniors and as seniors. Table 2 shows that women who debated in at least one tournament as sophomores are about 2.5 percentage points less likely than men to debate as juniors. However, the participation gap does not seem to grow from junior to senior year. There is thus some evidence that women are more likely to quit National Circuit Lincoln-Douglas debate than men. Still, the lack of an increase in the participation gap between junior and senior year cautions against too strong an interpretation of these results. Moreover, it is impossible to see whether women switch to a different kind of debate or stop debating entirely, so the information is imperfect.
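As a rough illustration of the participation-gap calculation, the Python sketch below computes, for one graduating cohort, the share of sophomore-year debaters who also appear in the following two seasons. The record layout, function names, and toy data are hypothetical, and the sketch assumes the junior and senior rates are each computed separately against the sophomore base; the paper's exact construction may differ.

```python
from datetime import date


def season_of(d):
    """Starting year of the Aug 1 to Jul 31 season that contains date d."""
    return d.year if d.month >= 8 else d.year - 1


def participation_rates(rounds, grad_year=2016):
    """rounds: iterable of (debater_id, gender, grad_year, date_of_round)."""
    soph_season = grad_year - 3  # 2013-14 season for the class of 2016
    seasons, gender = {}, {}
    for debater, g, gy, d in rounds:
        if gy != grad_year:
            continue  # keep only the chosen cohort
        seasons.setdefault(debater, set()).add(season_of(d))
        gender[debater] = g
    rates = {}
    for g in ("M", "F"):
        # Base group: debaters of this gender with at least one sophomore round.
        base = [p for p in seasons if gender[p] == g and soph_season in seasons[p]]
        if not base:
            continue
        rates[g] = {
            "junior": sum(soph_season + 1 in seasons[p] for p in base) / len(base),
            "senior": sum(soph_season + 2 in seasons[p] for p in base) / len(base),
        }
    return rates


# Toy records (hypothetical debaters and dates).
rounds = [
    ("a", "M", 2016, date(2013, 10, 1)), ("a", "M", 2016, date(2014, 10, 1)),
    ("a", "M", 2016, date(2015, 10, 1)),
    ("b", "M", 2016, date(2013, 11, 1)), ("b", "M", 2016, date(2014, 11, 1)),
    ("c", "F", 2016, date(2014, 1, 15)), ("c", "F", 2016, date(2015, 2, 1)),
    ("d", "F", 2016, date(2013, 9, 1)),
    ("e", "F", 2016, date(2014, 9, 1)),   # never debated as a sophomore
    ("f", "M", 2015, date(2013, 10, 1)),  # different cohort
]
rates = participation_rates(rounds)
junior_gap = rates["M"]["junior"] - rates["F"]["junior"]
print(rates, junior_gap)
```

Fixing the cohort and conditioning on sophomore participation is what separates attrition from changes in who enters the activity.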

Table 2. Debate Participation of 2016 Graduate Cohort Over Time

(Conditional on Having Debated as Sophomores)

Note: Table 2 reports the fraction of high school Lincoln Douglas debaters graduating in 2016 who debated as juniors and as seniors, conditional on having debated as sophomores. The unit of observation is an individual. Restricted to first six preliminary rounds at National Circuit tournaments listed in Appendix C. Column 3 contains results from t-tests for difference in means with standard errors in parentheses. I define “debating as sophomores” as students graduating in 2016 who debate in at least one preliminary round from August 1, 2013 through July 31, 2014. “Juniors” is the equivalent but for the August 1, 2014 to July 31, 2015 period. “Seniors” is the equivalent but from August 1, 2015 to present.

Source: Tabroom.com National Circuit Lincoln Douglas Debate competition results for a sample of 89 tournaments spanning the 2011-12 to 2015-16 seasons. See Appendix C for list of tournaments.

A Larger Gap in Preset Rounds?

Table 3 reports a final set of summary statistics that divides rounds into those that are randomly paired (“preset” rounds 1 and 2) and those that are power-paired (rounds 3 through 6). The male-female win-loss gap is 5.7 percentage points in rounds 1 and 2, twice as large as the gap in rounds 3 through 6. While it is impossible to know whether this represents a meaningful difference without accounting for sources of omitted variable bias, there are at least two possible explanations for this pattern.

First, large differences in performance-related characteristics of men and women at tournaments would, other things equal, lead to larger win-loss gaps by gender in rounds 1 and 2 than in later debates. For example, since power-paired rounds match debaters according to skill as determined by their previous performance at the tournament, if women on average have less debate experience than men, they would perform worse in preset rounds and better in power-paired rounds. These performance-related characteristics could encompass male debaters having more experience due to their age, receiving more attention from their school’s coaches, feeling more welcome or comfortable at debate competitions, having access to better research due to larger social networks, etc. These factors could lead men to perform better than women even if judges are unbiased.

A second explanation for the pattern observed in the data is judge discrimination. Suppose a female debater loses a preset round due to a judge’s unconscious bias against her. All else equal, in a later power-paired round the female debater’s skill level should exceed that of a male opponent whose record would reflect his skill alone as opposed to some weighted average of skill and harm from past judge discrimination. In other words, past judge discrimination forces women into a lower bracket, which makes it easier for women to win future rounds because their opponents will tend to be worse. Power-pairing may partially reduce the effect of judge biases by pairing female debaters against less skilled opponents in subsequent rounds.

Basic summary statistics thus suggest that there is a substantial gender-based performance gap at Lincoln-Douglas tournaments. Of course, absent controlling for confounding variables, these relationships are subject to omitted variable bias. To see whether the gender gap is robust to the inclusion of controls, I proceed by implementing two empirical strategies.

Table 3. Performance of National Circuit Lincoln Douglas Debaters by Gender: Preset versus Power-Paired Rounds





Note: Table 3 reports sample means of various performance metrics for high school Lincoln Douglas debaters by gender. The unit of observation is a person-round (an individual’s win or loss in a debate). Column 3 contains results from t-tests for difference in means. a Rounds 1 and 2 of tournaments are almost always “Randomly Paired” or “Preset,” meaning debaters are paired against each other randomly. Rounds 3 onward are “Power Paired,” meaning debaters are paired against each other on the basis of their previous record and speaker points. For instance, a debater with a 4-0 record generally would only debate someone who also has a 4-0 record. b Each observation denotes an individual’s win or loss in a specific debate (not the individual).

Source: Tabroom.com National Circuit Lincoln Douglas Debate competition results for a sample of 89 tournaments spanning the 2011-12 to 2015-16 seasons. See Appendix C for list of tournaments.
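To make the preset versus power-paired comparison concrete, here is a minimal Python sketch that computes the male-female win-rate gap and a large-sample standard error for one round type. The record layout and toy data are hypothetical. The standard error uses the usual unpooled formula for a difference in two proportions, which is an assumption of this sketch: the paper's t-tests may use a different variance estimate, and the formula ignores the fact that every round has exactly one winner and one loser.

```python
from math import sqrt


def win_gap(records, preset):
    """records: iterable of (gender, round_number, won).

    Returns (male win rate minus female win rate, standard error)
    for preset rounds (1 and 2) or power-paired rounds (3 onward).
    """
    stats = {"M": [0, 0], "F": [0, 0]}  # [wins, rounds] per gender
    for g, rnd, won in records:
        if (rnd <= 2) != preset:
            continue  # skip rounds of the other type
        stats[g][0] += int(won)
        stats[g][1] += 1
    (wm, nm), (wf, nf) = stats["M"], stats["F"]
    pm, pf = wm / nm, wf / nf
    se = sqrt(pm * (1 - pm) / nm + pf * (1 - pf) / nf)
    return pm - pf, se


# Toy person-round records (hypothetical).
records = [
    ("M", 1, True), ("M", 1, True), ("M", 2, True), ("M", 2, False),
    ("F", 1, True), ("F", 1, False), ("F", 2, True), ("F", 2, False),
    ("M", 3, True), ("M", 4, False), ("F", 3, False), ("F", 5, True),
]
preset_gap, preset_se = win_gap(records, preset=True)
power_gap, power_se = win_gap(records, preset=False)
print(preset_gap, preset_se, power_gap, power_se)
```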

Empirical Strategies

(1) Linear Probability Model

First, I use a standard linear probability model to estimate the association between gender and in-round performance while holding a set of control variables fixed. The technical details are in the main paper, but the basic idea is that I want to account for variables that could mask important nuances in data on the gender gap. For example, if a female freshman is debating a male senior, the fact that the man has three more years of experience should matter for the round’s outcome, so failing to account for graduation year would generate a misleading correlation between gender and the likelihood of winning. Moreover, controlling for confounding variables can also hint at potential solutions. If most of the gender gap is due to women on average having less experience than men (if there are many female freshmen debating male seniors but not the other way around), that may tell us something different than if there’s still a gender gap between debaters of equal seniority.

Table 4 shows the results of the linear model. The numbers represent the association between the variable listed in the corresponding row (e.g., JV x Female means being a woman in JV) and the likelihood of winning a debate round. Each column adds additional variables to the previous column. So, in the baseline model, “-0.0379” means being female is associated with being 3.79 percentage points less likely to win a debate round.
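The reading of the baseline coefficient can be checked with a small Python sketch. With no controls, ordinary least squares on a 0/1 win indicator and a 0/1 female indicator returns a slope equal to the female win rate minus the male win rate, and an intercept equal to the male win rate. The data below are made up for illustration, and the helper name is hypothetical; the paper's actual regressions add controls and clustered standard errors that this sketch omits.

```python
def ols_slope_intercept(x, y):
    """Closed-form OLS for one regressor: slope = cov(x, y) / var(x)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Toy person-round data (hypothetical): 1 = female, 1 = won the round.
female = [0, 0, 0, 0, 1, 1, 1, 1]
won = [1, 1, 1, 0, 1, 1, 0, 0]

slope, intercept = ols_slope_intercept(female, won)

# Direct comparison of group win rates.
male_rate = sum(w for f, w in zip(female, won) if f == 0) / female.count(0)
female_rate = sum(w for f, w in zip(female, won) if f == 1) / female.count(1)
print(slope, intercept, female_rate - male_rate)
```

In this toy sample the slope is -0.25, meaning women are 25 percentage points less likely to win, which is the same way "-0.0379" is read as 3.79 percentage points in the baseline column.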

Two main take-aways:

First, after accounting for graduation year, the preliminary round performance gap shrinks from 3.8 to 3.5 percentage points, suggesting that differences in graduation year account for only about 10% of the overall gap.

Second, the gap is no longer significant when accounting for school and division (column 3). This may initially suggest that the gap is a result of differences specific to school debate programs or tournament divisions. However, disaggregating performance by round reveals that there is still a 3.08 percentage point gap in rounds 1 and 2. Averaging performance across all preliminary rounds would thus be misleading, since power-pairing masks the gender gap by matching debaters based on their previous performance. That women are 3 percentage points less likely to win preset rounds thus suggests that the gender gap does exist and is substantial.

Table 4. Determinants of Competitive Success in High School Lincoln Douglas Debate (Linear Model): Preset Rounds

Note: Table 4 reports the coefficients from an ordinary least squares (OLS) regression. The unit of observation is a person-round (an individual’s win or loss in a debate). The dependent variable is 1 if the debater won the round and 0 if not. Individual controls include school fixed effects and graduation year. Standard errors in parentheses are clustered at the tournament-round level. a Six observations are missing school information, so including school as an individual control slightly reduces the sample size.

Source: Tabroom.com National Circuit Lincoln Douglas Debate competition results for a sample of 89 tournaments spanning the 2011-12 to 2015-16 seasons. See Appendix C for list of tournaments.

(2) Individual Fixed-Effects Model

Goldin & Rouse (2000) study the effect of screens that hide applicants’ identities on the success of female musicians who apply to symphony orchestras. By following the same individuals over time and exploiting variation in audition type (with versus without a screen), they can control for fixed individual characteristics such as race or innate musical talent.

I use the same approach to examine whether judge gender affects the success of female debaters. Essentially, this can be thought of as following individual debaters across rounds to see whether changes in judge gender affect women’s debate performance. Unlike the previous model, the individual fixed-effects model accounts for all characteristics that are fixed over time. These include race, any “innate” debate talent or IQ, and even income under the assumption that family income varies little over a debater’s high school career.
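The logic of individual fixed effects can be illustrated with the "within" transformation: subtract each debater's own average from both the outcome and the regressor, then run OLS on the demeaned data. Anything constant for a debater drops out. The Python sketch below is a simplified illustration, not the paper's specification: it uses a single regressor (think of an indicator for drawing a female judge), a made-up continuous outcome so the arithmetic is exact, and it omits the tournament and year fixed effects the paper includes.

```python
from collections import defaultdict


def within_slope(rows):
    """rows: list of (debater_id, x, y).

    Returns the OLS slope of y on x after demeaning both by debater,
    which is the one-regressor individual fixed-effects estimator.
    """
    groups = defaultdict(list)
    for pid, x, y in rows:
        groups[pid].append((x, y))
    num = den = 0.0
    for obs in groups.values():
        mean_x = sum(x for x, _ in obs) / len(obs)
        mean_y = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            num += (x - mean_x) * (y - mean_y)
            den += (x - mean_x) ** 2
    return num / den


# Toy data (hypothetical): outcome = debater-specific level + 0.1 * x.
# Debater "a" has level 0.8, debater "b" has level 0.3.
rows = [
    ("a", 0, 0.8), ("a", 1, 0.9),
    ("b", 0, 0.3), ("b", 1, 0.4), ("b", 1, 0.4),
]
slope = within_slope(rows)
print(slope)
```

The estimator recovers the 0.1 effect even though the two debaters have very different baseline levels, because only within-debater variation in x is used. The same property is why a regressor that never changes for a debater, such as the debater's own gender, cannot be estimated on its own in this model.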

I find that there is no relationship between judge gender and female success in any of the model’s specifications (Table 5). Female judges do not reduce the gender gap in preset rounds, as shown by the insignificant coefficient on Female x Judge Female x Preset Round.

One possible interpretation of this is that judge bias is less important than factors such as previous debate experience in determining female success. On the other hand, if judge biases are a product of how women are generally perceived (e.g., through societal stereotypes), one would not necessarily expect female judges to be unbiased arbiters. In other words, the lack of a statistically significant relationship between being assigned a female judge for a debate and winning the debate is impossible to interpret causally.

However, the individual fixed-effects model lends further credence to the idea that the gender gap persists in preset rounds: A woman is 2.58 percentage points less likely to win a preset round (Table 5, Col. 4). The evidence for a preset-round gender gap is thus robust to different empirical specifications. (Note that this model does not generate overall associations between gender and round wins because the debater’s gender does not exhibit variation over time.)

Table 5. Determinants of Competitive Success in High School Lincoln Douglas Debate (Individual Fixed-Effects Model): Judge Gender & Preset Rounds

Note: Table 5 reports the coefficients from a linear regression tracking individuals across tournaments. The unit of observation is a person-round (an individual’s win or loss in a debate). The dependent variable is 1 if the debater won the round and 0 if not. Standard errors in parentheses are clustered at the individual level. a All columns include individual, tournament, and year fixed effects.

Source: Tabroom.com National Circuit Lincoln Douglas Debate competition results for a sample of 89 tournaments spanning the 2011-12 to 2015-16 seasons. See Appendix C for list of tournaments.

Conclusion

Summary statistics and two empirical models yield similar conclusions: men are substantially more likely than women to win preliminary debate rounds at National Circuit Lincoln Douglas debate competitions. The basic linear model puts the gap at 4 percentage points, while the full linear regression with individual, tournament, and year fixed effects and the individual fixed-effects model both suggest a gap of 2.5 to 3 percentage points.

Including a control variable for debaters’ graduation year seems to account for some of the gap’s narrowing. This is not necessarily a positive sign for those concerned about female success, for it suggests that women tend to be younger (and therefore less experienced) than male debaters. A simple analysis of debaters graduating in 2016 who debated at least once as sophomores confirms that women are over 2 percentage points less likely than men to debate the following season, which could explain the experience-based component of the gender gap.

To be clear, these findings are suggestive rather than causal and leave many questions unanswered. For instance, the research does not shed light on whether judge biases are an important factor at debate competitions. The lack of evidence that having more female judges improves female performance suggests that tournament administrators and coaches concerned with female success should not expect to eliminate the gender gap by hiring female judges.

Of course, female representation at tournaments may have positive effects on female debaters outside the narrow context of individual wins and losses associated with the judge’s gender. Future research could explore this possibility by tracking female success across different kinds of competitions, perhaps examining whether women benefit from there being a higher fraction of female judges, from peer effects, or from other policies tournaments adopt to reduce the gender gap. Gaining a more nuanced understanding of why men enjoy more competitive success than women would help stakeholders in the community minimize the likelihood that women are denied opportunities and treated unfairly in an activity that is meant to empower everyone instead of leaving some people behind.

Full analysis (includes complete footnotes and appendices):

Tartakovsky 2016 – “Gender Disparities in Competitive High School Debate”

Footnotes

[1] In this paper, gender refers most closely to gender identity. See footnote 1 of the full paper for a more detailed explanation of the construction of the gender variable.

[2] Tabroom.com is now the most popular system for National Circuit LD tournaments, but some tournaments use other systems such as Joy of Tournaments, so not all competitions appear in the data.

[3] These cases include what are likely to be misspellings or abbreviations of debater and judge names by coaches (e.g., “Micahel” becomes “Michael,” “Catheri” becomes “Catherine,” and “Danella” becomes “Daniella”).

References

“2016 Accepted At Larges.” 2016. Tournament of Champions. University of Kentucky.

Antecol, Heather, Ozkan Eren, and Serkan Ozbeklik. March 2012. “The Effect of Teacher Gender on Student Achievement in Primary School: Evidence from a Randomized Experiment.” IZA Discussion Paper Issue 6453.

Dee, Thomas S. 2007. “Teachers and the Gender Gaps in Student Achievement.” Journal of Human Resources 42(3): 528-554.

“Final Places in Lincoln Douglas,” Tournament of Champions, 2016, Tabroom.com.

Flowers, Andrew. “The Most Common Unisex Names in America: Is Yours One of Them?” Five Thirty Eight, 10 Jun. 2015.

Gates, Gary J. and Michael D. Steinberger. 2010. “Same-Sex Unmarried Partner Couples in the American Community Survey: The Role of Misreporting, Miscoding and Misallocation.” Williams Institute Working Paper.

Goldin, Claudia and Cecilia Rouse. Sep. 2000. “Orchestrating Impartiality: The Impact of ‘Blind’ Auditions on Female Musicians.” American Economic Review 90(4): 715-741.

“Past Lincoln-Douglas Debate Topics.” 2016. National Speech and Debate Association.

Shin, Ariel. May 2016. “A Statistical Analysis of the Gender Gap.” VBriefly. Victory Briefs Institute.

Smith, Jonathan. 2013. “Peers, Pressure, and Performance at the National Spelling Bee.” Journal of Human Resources 48(2): 265-285.

Timmons, Cindi and Bekah Boyer. Jan. 2014. “Women in Debate: Working Toward a More Complete Picture.” VBriefly. Victory Briefs Institute.

United States Census Bureau. Oct. 1995. “Frequently Occurring Names in the U.S. – 1990.” Census.gov.