Significance Many scholars have argued that discrimination in American society has decreased over time, while others point to persistent racial and ethnic gaps and subtle forms of prejudice. The question has remained unsettled due to the indirect methods often used to assess levels of discrimination. We assess trends in hiring discrimination against African Americans and Latinos over time by analyzing callback rates from all available field experiments of hiring, capitalizing on the direct measure of discrimination and strong causal validity of these studies. We find no change in the levels of discrimination against African Americans since 1989, although we do find some indication of declining discrimination against Latinos. The results document a striking persistence of racial discrimination in US labor markets.

Abstract This study investigates change over time in the level of hiring discrimination in US labor markets. We perform a meta-analysis of every available field experiment of hiring discrimination against African Americans or Latinos (n = 28). Together, these studies represent 55,842 applications submitted for 26,326 positions. We focus on trends since 1989 (n = 24 studies), when field experiments became more common and improved methodologically. Since 1989, whites receive on average 36% more callbacks than African Americans, and 24% more callbacks than Latinos. We observe no change in the level of hiring discrimination against African Americans over the past 25 years, although we find modest evidence of a decline in discrimination against Latinos. Accounting for applicant education, applicant gender, study method, occupational groups, and local labor market conditions does little to alter this result. Contrary to claims of declining discrimination in American society, our estimates suggest that levels of discrimination remain largely unchanged, at least at the point of hire.

The American racial landscape has changed in fundamental ways since the Civil Rights Movement of the 1960s. During that time, sweeping legal and social reforms reduced the barriers facing African Americans in many important domains (1, 2). A rising African American middle class and a growing acceptance of the principles of inclusion led some to conclude that racial discrimination had declined to the point that it was no longer a primary determinant of life chances for African Americans and Latinos (2, 3).

Supporting this perspective, a variety of indicators pointed toward a reduction of discriminatory treatment. Surveys indicated that whites increasingly endorsed the principle of equal treatment regardless of race (4). Rates of high school graduation for whites and African Americans converged substantially, and the black–white test score gap declined (5, 6). Large companies increasingly recognized diversity as a goal and revamped their hiring to curtail practices that disadvantaged minority applicants (7). With the election of the country’s first African-American president in 2008, many concluded that the country had finally moved beyond its troubled racial past (8).

Despite clear signs of racial progress, however, on several key dimensions racial inequality persists and has even increased. For example, racial gaps in unemployment have shown little change since 1980 (9, 10), and the black–white gap in labor force participation rates among young men widened during this time (11). Recently, the Black Lives Matter movement shone a spotlight on the ongoing struggles with racism and discrimination experienced by people of color in interactions with law enforcement. The election of Donald J. Trump as the 45th President of the United States with the support of anti-immigrant and white nationalist groups highlighted the persistence of racial resentment (12).

In light of persistent racial gaps in key social and economic indicators, some scholars have challenged prevailing assumptions about waning discrimination. Indeed, while expressions of explicit prejudice have declined precipitously over time, measures of stereotypes and implicit bias appear to have changed little over the past few decades (13–15). In this view, far from disappearing, racial bias has taken on new forms, becoming more contingent, subtle, and covert (15–18).

What can we reliably say about trends in discrimination over time? Has the role of race appreciably diminished across the board, or are there important domains in which little racial progress has been achieved? Answers to these questions are important for understanding the sources of persistent racial inequality.

In this study, we examine trends in racial and ethnic discrimination in American labor markets based on a meta-analysis of every available field experiment of hiring discrimination (with fieldwork dates through December 2015). Meta-analysis is a body of formal methods to synthesize data from a population of existing studies. Field experiments of hiring discrimination are experimental studies in which fictionalized matched candidates from different racial or ethnic groups apply for jobs. These studies include both resume audits, in which fictionalized resumes with distinct racial names are submitted online or by mail (e.g., ref. 19), and in-person audits, in which racially dissimilar but otherwise matched pairs of trained testers apply for jobs (e.g., ref. 20).

The field experimental method is a design with high causal (internal) validity because it benefits from aspects of experimental design. The experimenter carefully manages the application process, which provides control over many potential confounding variables. The exact basis of causal inference across the two main forms of field experiment, resume and in-person audits, is somewhat different. In the typical resume audit, clues indicating race (such as a racially identifiable name) are randomly assigned to otherwise similar resumes, allowing for treatment and control groups to be equated through randomization. In in-person audits, matched pairs of trained testers who differ on the basis of race but are otherwise similar apply for jobs; the between-race contrast is grounded in matching pairs of applicants to make them as similar as possible in all employment-relevant characteristics except race. Both resume and in-person audit methods provide a strong basis from which to draw conclusions about hiring discrimination, particularly relative to the nonexperimental methods widely used in the literature, including by all prior studies of discrimination trends over time (ref. 21 and SI Appendix, section 1).

We use meta-analytic techniques to investigate change in hiring discrimination over time based on all existing US field experimental studies of labor market discrimination. Our procedure follows three basic stages: First, we identified all existing studies, published or unpublished, that use a field experimental method and that provide contrasts in hiring-related outcomes between equally qualified candidates from different racial or ethnic groups. Second, we coded key characteristics of the studies into a database for our analysis based on a coding rubric. This produced 24 studies containing 30 estimates of discrimination against African Americans and Latinos since 1989, together representing 54,318 applications submitted for 25,517 positions. Finally, we performed a random-effects meta-regression to identify trends over time.

We assess discrimination for each study using the ratio of the proportion of white applications that received “callbacks”—or invitations to interview—to the corresponding proportion for African-American or Latino applications. We calculated the proportions based on counts of the number of callbacks received by each group (white/African American/Latino) within each study. This discrimination ratio, measured at the study level, is the outcome in our meta-regression. Other methods of calculating hiring disparities between groups produced substantively similar results (SI Appendix, section 8).
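As a concrete illustration, the study-level discrimination ratio can be computed directly from callback counts. The counts below are hypothetical and are not taken from any study in the sample:

```python
def discrimination_ratio(callbacks_white, apps_white,
                         callbacks_minority, apps_minority):
    """Ratio of the white callback proportion to the minority callback
    proportion for a single audit study."""
    return (callbacks_white / apps_white) / (callbacks_minority / apps_minority)

# Hypothetical study: 500 applications submitted per group
ratio = discrimination_ratio(100, 500, 74, 500)
print(round(ratio, 2))  # 1.35, i.e., whites received ~35% more callbacks
```

A ratio of 1.36, for instance, corresponds to the 36% white-over-black callback advantage reported in the abstract; a ratio of exactly 1 would indicate equal treatment.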

We analyze the relationship of discrimination ratios to years in which the data were gathered to provide an estimate of the trend in discrimination. Specifically, we regress the log of the discrimination ratio on year of survey, with controls for key characteristics of the studies, using meta-regression. Meta-regression is a procedure similar to standard regression, except covariates are measured at the level of the study rather than the level of the individual, and the outcome is an effect from the study of interest (in our case, the outcome is the estimate of discrimination against African Americans or Latinos). Materials and Methods discusses further methodological and modeling details.

Discussion Contrary to widespread assumptions about the declining significance of race, the magnitude and consistency of discrimination we observe over time is a sobering counterpoint.

We note that our results do not address the possibility that hiring discrimination may have substantially dropped in the 1960s or early 1970s, during the civil rights era when many forms of direct discrimination were outlawed, as some evidence suggests (1). Further, our results pertain only to discrimination at the point of hire, not at later points in the employment relationship such as in wage setting or termination decisions. Social psychological theories would predict hiring to be most vulnerable to the influence of racial bias, given that objective information is limited or unreliable (23–25). Likewise, from an accountability standpoint, discrimination is less easily detected, and therefore less costly to employers, at the point of hire (26). It may be the case, then, that more meaningful reductions in discrimination have taken place at other points in the employment relationship not measured here. What our results point to, however, is that at the initial point of entry—hiring decisions—African Americans remain substantially disadvantaged relative to equally qualified whites, and we see little indication of progress over time.

These findings lead us to temper our optimism regarding racial progress in the United States. At one time it was assumed that the gradual fade-out of prejudiced beliefs, through cohort replacement and cultural change, would drive a steady reduction in discriminatory treatment (27). At least in the case of hiring discrimination against African Americans, this expectation does not appear borne out.

We find some evidence of a decline in discrimination against Latinos since 1989. The small number of audit studies including Latinos limits both our ability to include controls and the precision of our estimates; the decline is only marginally significant statistically (P = 0.099). More evidence is needed to establish the trend in hiring discrimination against Latinos with greater certainty.

Our results point toward the need for strong enforcement of antidiscrimination legislation and provide a rationale for continuing compensatory policies like affirmative action to improve equality of opportunity. Discrimination continues, and with regard to African Americans we find little evidence that it is disappearing or even gradually diminishing. Instead, we find the persistence of discrimination at a distressingly uniform rate.

Materials and Methods Our procedure follows three basic stages: first, to identify all existing field experiments of hiring discrimination; second, to develop a coding rubric and to code studies to produce a database of their results; and third, to perform a statistical meta-analysis to draw conclusions from the combined results. We discuss each of these steps in turn.

Identifying Relevant Studies. We aimed to include in our meta-analysis all existing studies, published or unpublished, that use a field experimental method and that provide contrasts in hiring-related outcomes between different race and ethnic groups in the United States. This includes both in-person audit studies and resume studies (or correspondence studies). We also required that contrasts of hiring outcomes between race or ethnic groups were made for groups that were on average equivalent in their labor market relevant characteristics, since otherwise discrimination estimates are confounded with the difference in nonracial characteristics.

We used three methods to identify relevant field experiments: searches in bibliographic databases, citation searches, and an email request to corresponding authors of field experiments of race-ethnic discrimination in labor markets and other experts on field experiments and discrimination.

We began with a bibliographic search. Our search covered the following bibliographic databases and working paper repositories: Thomson’s Web of Science (Social Science Citation Index), ProQuest Sociological Abstracts, ProQuest Dissertations and Theses, LexisNexis, Google Scholar, and NBER working papers. We searched for some combination of “field experiment” or “audit study” or “correspondence study” and sometimes included the term “discrimination,” with some variation depending on the search functions of the database.
We also searched two French-language indexes, Cairn and Persée, and two international sources: IZA discussion papers (a German working paper archive) and ILO International Migration Papers.

Our second technique for identifying relevant studies relied on citation search. Working from the initial set of studies located through bibliographic search, we examined the bibliographies of all review articles and eligible field studies to find additional field experiments of hiring discrimination.

The last technique was an email request to authors of existing field experiments of discrimination. From our list of audit studies identified by bibliographic and citation search, we compiled a list of email addresses of authors of existing field experiments of discrimination. To this we added the addresses of authors of literature review articles on field experiments. Our email request asked for citations or copies of field discrimination studies published, unpublished, or ongoing. We also asked that authors refer us to any other researchers who may have recent or ongoing field experiments. The email requests were conducted in two phases. In the initial wave, 131 apparently valid email addresses were contacted. We received 56 responses. We also sent out a second wave of 68 emails, which consisted of additional authors identified from the initial wave of surveys and some corrected email addresses. We received 19 responses to this second wave of email surveys.

Overall, our search located 34 studies that were US-based field experiments of hiring and included contrasts between white and nonwhite applicant profiles that were on average equivalent in their labor-market-relevant characteristics (e.g., education, experience level in the labor market). Six studies were excluded for various reasons, as explained in SI Appendix, section 6. Our remaining 28 studies yielded 24 estimates of discrimination against African Americans and nine against Latinos relative to whites.
Coding and Selection of Analysis Period (1989–2015). We coded key characteristics of the studies into a database for our analysis. Coding was based on a coding rubric, which listed each potentially relevant characteristic of the research and included coding instructions. To develop the rubric, we initially read several studies and, based on this, developed an initial coding rubric of factors we thought might influence measured rates of discrimination. The initial rubric was reviewed and updated by all authors of this study for completeness. It was subsequently refined as coding progressed. Each study was coded independently by two raters, with disagreement resolved by the first author. See SI Appendix, section 7 for more discussion of coding procedures. A list of coded characteristics for the 1989–2015 studies is shown in SI Appendix, Tables S1 and S2.

Fieldwork periods of the studies range from 1972 to 2015 for African Americans and from 1989 to 2015 for Latinos. For most analyses in this paper, we focus on the period 1989–2015. We focus on this period because the data are sparse before it (only four studies before 1989) and because our reading of the early studies indicates key methodological differences that may affect their results. Resume audits typically signal race by using race-typed names on resumes, but the pre-1989 studies either indicated race directly on the resume [McIntyre et al. (28) put “Race: BLACK” on the minority resumes and nothing about race on the “white” resumes] or attached photos to resumes (a procedure used by Newman; ref. 29). Excluding the early studies leaves us with 21 estimates of discrimination against African Americans and nine against Latinos from 24 studies (six studies include estimates of discrimination against both African Americans and Latinos).

The Meta-Analysis Model. A meta-analysis aggregates information from across studies to produce an estimate of an effect of interest (30).
In this study, our basic measure of discrimination is the discrimination ratio. This is the ratio of the percentage of callbacks for interviews received by white applicants to the percentage of callbacks for interviews received by African Americans or Latinos. Formally, if c_w is the number of callbacks received by whites, c_m is the number of callbacks received by African Americans or Latinos, n_w is the number of applications submitted by white applicants, and n_m is the number of applications submitted by African-American or Latino applicants, then the discrimination ratio is (c_w/n_w)/(c_m/n_m). Ratios above 1 indicate whites received more positive responses than African Americans or Latinos, with the amount above 1 multiplied by 100 indicating the percentage higher callbacks for whites relative to the minority group. Because audit studies equate groups on their nonracial characteristics either through matching and assignment of characteristics (in-person audits) or through random assignment (most resume audits), no further within-study controls are required. SI Appendix, section 8 discusses potential alternative measures of discrimination using the difference in proportions and the odds ratio, and presents alternative results using these measures. Our basic result—no decline in discrimination against African Americans over time—holds using both of the alternative measures, whereas evidence of a decline in discrimination for Latinos appears somewhat stronger with the difference in proportions or the odds ratio.

The goal of a meta-analysis is to combine information across studies. This requires measuring the information each study contains about discrimination against a group. The information each study provides is inversely proportional to the square of the SE of the discrimination ratio. We calculate the SE of the ratio from counts reported in each study, accounting for audit pairs in the design when possible.
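The pairing correction can be sketched with a generic delta-method calculation for correlated proportions. The function below is an illustration under standard formulas, not the authors' exact implementation (which is given in SI Appendix, section 9), and the pair counts are invented:

```python
import math

def se_log_ratio_paired(n_both, n_white_only, n_minority_only, n_neither):
    """Approximate delta-method SE of the log callback ratio for a paired
    audit design. Inputs are counts of tester pairs by joint outcome:
    both testers called back, only the white tester, only the minority
    tester, or neither. Illustrative sketch, not the paper's exact formula."""
    n = n_both + n_white_only + n_minority_only + n_neither
    p_w = (n_both + n_white_only) / n      # white callback proportion
    p_m = (n_both + n_minority_only) / n   # minority callback proportion
    var_w = p_w * (1 - p_w) / n
    var_m = p_m * (1 - p_m) / n
    cov = (n_both / n - p_w * p_m) / n     # covariance induced by pairing
    var_log = var_w / p_w**2 + var_m / p_m**2 - 2 * cov / (p_w * p_m)
    return math.sqrt(var_log)

# Hypothetical study of 500 tester pairs
print(se_log_ratio_paired(n_both=60, n_white_only=40,
                          n_minority_only=15, n_neither=385))
```

When the covariance term is positive, as it typically is for testers matched on the same jobs, the paired SE is smaller than the unpaired one, which is why treating paired studies as unpaired slightly overestimates their SEs.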
In cases where information on paired outcomes is available from the study (counts of pairs in which both the white and the nonwhite tester receive a callback, only the white tester does, only the nonwhite tester does, or neither does), we calculated SEs of discrimination ratios accounting for the pairing (see SI Appendix, section 9 for details and formulas). For studies that are not paired between whites and nonwhites, or where paired outcomes are not reported, we use formulas for the SE for unpaired groups. This formula will slightly overestimate the SE of the effect for studies that are paired but that we treat as unpaired due to lack of information about outcomes at the pair level, underweighting these studies a bit in computing the overall effect and slightly inflating the overall cross-study SE.

Of course, field experiments vary in their characteristics, such as the geographic area they cover, the exact job sectors covered, and details of their methodology. To account for this variability in understanding the time trend, we use two procedures. First, we include controls, discussed further below, for many study characteristics. Second, to capture sources of variability not covered by the covariates, we use a random-effects specification (22). Random effects incorporate a variance component capturing variation in outcomes across studies that is due to unobserved study-level factors. Random effects are recommended whenever there is reason to believe that the effect in question is likely to vary as a function of design features of the study, rather than representing a single underlying effect that is constant over the whole population. This is surely the case in our analysis, as we expect that the level of racial discrimination may depend on the year of the study, the situation the study considers (e.g., the occupational categories), the skill level of the applicants, and so on.
The random effect increases the SEs of estimates to correctly account for variability among studies in drawing inferences about the overall trend. More formally, random-effects meta-analysis allows the true effects of race on the callback rate in each situation estimated by each study, θ_i, to vary between studies by assuming that they have a normal distribution around a mean effect, θ. If y_i is the discrimination ratio in the ith study, then the meta-analysis model is as follows: ln(y_i) = θ + u_i + e_i, where u_i ∼ N(0, τ²) and e_i ∼ N(0, σ_i²). Here, τ² is the between-study variance, estimated as part of the meta-analysis model, while σ_i² is the variance of the log response ratio in the ith study, estimated from study counts as described above. Following standard practice in the meta-analysis literature, we log the response ratio to reduce the asymmetry of the ratio.

Meta-regression allows that the rate of discrimination is a function of a vector of k characteristics of the studies and effects, x, plus (in the random-effects specification) residual study-level heterogeneity (between-study variance not explained by the covariates). The model assumes the study-level heterogeneity follows a normal distribution around the linear predictor: ln(y_i) = x_i β + u_i + e_i, where u_i ∼ N(0, τ²) and e_i ∼ N(0, σ_i²), and where β is a k × 1 vector of coefficients (including a constant) and x_i is a 1 × k vector of covariate values in study i (including a 1 for the constant). Estimation is by restricted maximum likelihood. For details, see SI Appendix, section 9.

To explore trends over time, we include covariates for the year of fieldwork of the study. In the simplest models, the only covariate is this time trend. In later models, we include a more extensive set of predictors to control for other factors that might confound the time trend. These additional controls include resume audit vs.
in-person audit as the study method, gender and education level of the fictitious applicants, occupations tested, unemployment rates at the field sites used for testing, criminal background of some fictitious applicants, and region of the country. For discussions of why these controls were selected, see SI Appendix, section 4 (see SI Appendix, Tables S1 and S2 for descriptive statistics on the controls; for a discussion of trends in covariates, see SI Appendix, section 10).
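The modeling pipeline (log ratios, within-study variances, a between-study variance component, and a weighted regression on fieldwork year) can be sketched as follows. This simplified version uses the unpaired variance formula and a method-of-moments (DerSimonian-Laird) estimate of τ² in place of the REML estimation the paper uses, and the study counts are invented for illustration:

```python
import numpy as np

def log_ratio_and_var(c_w, n_w, c_m, n_m):
    """Log discrimination ratio and its delta-method variance
    (standard unpaired formula for a log risk ratio)."""
    y = np.log((c_w / n_w) / (c_m / n_m))
    v = 1 / c_w - 1 / n_w + 1 / c_m - 1 / n_m
    return y, v

# Hypothetical studies: (white callbacks, white apps, minority callbacks,
# minority apps, fieldwork year). Counts are illustrative only.
studies = [
    (120, 600, 88, 600, 1990),
    (90, 400, 65, 400, 2001),
    (150, 800, 110, 800, 2014),
]

y = np.array([log_ratio_and_var(*s[:4])[0] for s in studies])
v = np.array([log_ratio_and_var(*s[:4])[1] for s in studies])
year = np.array([s[4] for s in studies], dtype=float)

# DerSimonian-Laird (method-of-moments) estimate of between-study
# variance tau^2; the paper itself estimates tau^2 by REML
w = 1 / v
q = np.sum(w * (y - np.sum(w * y) / w.sum()) ** 2)
c = w.sum() - np.sum(w ** 2) / w.sum()
tau2 = max(0.0, (q - (len(y) - 1)) / c)

# Random-effects meta-regression: weighted least squares of the log
# ratio on fieldwork year, with weights 1 / (v_i + tau^2)
W = 1 / (v + tau2)
X = np.column_stack([np.ones_like(year), year - year.min()])
beta = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (W * y))
print(beta)  # [intercept, slope]; slope is the annual change in the log ratio
```

A slope indistinguishable from zero for the African-American contrast is what the paper's finding of "no change over time" corresponds to in this framework.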

Acknowledgments We thank Anthony Heath, Fenella Fleischmann, Matthew Salganik, Frank Dobbin, András Tilcsik, Donald Green, David Neumark, Hedwig Lee, two anonymous PNAS reviewers, and the editor for comments; Larry Hedges for methodological advice; and Jim Cheng Chen and Joshua Aaron Klingenstein for excellent research assistance. We have received financial support for this project from the Russell Sage Foundation and the Institute for Policy Research at Northwestern University.

Footnotes Author contributions: L.Q. designed research; L.Q., O.H., and A.H.M. performed research; L.Q. and O.H. analyzed data; and L.Q., D.P., O.H., and A.H.M. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1706255114/-/DCSupplemental.