Abstract An important goal of the scientific community is broadening the achievement and participation of racial minorities in STEM fields. Yet, professors’ beliefs about the fixedness of ability may be an unwitting and overlooked barrier for stigmatized students. Results from a longitudinal university-wide sample (150 STEM professors and more than 15,000 students) revealed that the racial achievement gaps in courses taught by more fixed mindset faculty were twice as large as the achievement gaps in courses taught by more growth mindset faculty. Course evaluations revealed that students were demotivated and had more negative experiences in classes taught by fixed (versus growth) mindset faculty. Faculty mindset beliefs predicted student achievement and motivation above and beyond any other faculty characteristic, including their gender, race/ethnicity, age, teaching experience, or tenure status. These findings suggest that faculty mindset beliefs have important implications for the classroom experiences and achievement of underrepresented minority students in STEM.

INTRODUCTION Despite decades of research and millions of dollars in federal funding aimed to understand and ameliorate the underrepresentation of diverse individuals in the STEM (science, technology, engineering, and mathematics) pipeline, Black, Latino, and Native American students [underrepresented racial/ethnic minorities (URM)] continue to underperform academically relative to their White peers (1). While these racial achievement gaps are determined by multiple (e.g., economic and structural) factors, they may be exacerbated by subtle situational cues from STEM professors that reinforce racial stereotypes about which social groups are more or less likely to have ability in STEM (2). The cues hypothesis suggests that threatening situational cues in STEM settings, such as the diagnosticity of a test (2–4), can cause URM students to become concerned about being judged in terms of ability stereotypes, resulting in a loss of motivation, intellectual underperformance, and larger racial achievement gaps in STEM classes (5–7). This study examines the role of a novel situational cue to stereotype underperformance—STEM college professors’ beliefs about the fixedness or malleability of ability (8)—and explores whether these faculty beliefs are associated with URM students’ motivation and their academic achievement in those professors’ STEM courses. People’s mindsets (also known as implicit theories or lay theories) are their beliefs about the fixedness or malleability of human characteristics like intelligence or personality (8). Faculty members who espouse fixed mindset beliefs endorse the idea that intelligence and ability are fixed, innate qualities that cannot be changed or developed much. In contrast, faculty who espouse growth mindset beliefs endorse the idea that ability is malleable and can be developed through persistence, good strategies, and quality mentoring. 
Fixed mindset professors are more likely to judge a student as having low ability based on a single test performance (9) and to use unhelpful pedagogical practices, like encouraging students to drop difficult courses (e.g., “not everyone is meant to pursue a STEM career”) (9). Faculty who endorse fixed mindset beliefs think that some students have strong, innate intellectual abilities, while others do not. Which students might those be? Pervasive cultural stereotypes suggest that White and Asian students are more naturally gifted in STEM than Black, Latino, and Native American students. Because these American cultural stereotypes impugn the intellectual abilities of URM students, we predicted that faculty who endorse fixed mindset beliefs may be particularly demotivating to URM students, resulting in lower performance among URM students in courses taught by fixed (versus growth) mindset faculty. Classic findings regarding the influence of teacher beliefs on students’ performance demonstrate that when teachers have lower expectations for their students, those students become less motivated and perform more poorly in those teachers’ classes (10). These Pygmalion effects are even stronger for URM students (11, 12). We hypothesized that STEM professors’ fixed beliefs about intelligence and ability would lead URM students to experience lower motivation and to underperform relative to their non-stereotyped peers—a pattern consistent with stereotype threat theory. Classic studies that document stereotype threat underperformance effects typically manipulate threatening (versus nonthreatening) situational cues in the learning environment, such as an experimenter’s race/ethnicity/gender, and assess students’ intellectual performance as the primary indicator of stereotype threat (2, 7, 13, 14). Drawing on this theoretical framework, the present study examines the role of college professors’ mindsets as a situational cue that triggers URM underperformance in STEM courses. 
We argue that if STEM faculty who endorse fixed mindset beliefs engender stereotype threat among URM students, we should observe lower student motivation and substantially larger racial achievement gaps in those professors’ courses compared to courses taught by STEM professors who endorse growth mindset beliefs. The present study investigates undergraduate STEM faculty’s self-reported mindset beliefs and their implications for student motivation and performance. Previous research has examined students’ perceptions of faculty beliefs (15), yet no study, to our knowledge, has examined actual self-reported mindset beliefs of STEM faculty as a predictor of student performance. Furthermore, the effects of teacher beliefs have only been examined among young children (16) and have not been applied in undergraduate populations, where career decisions and trajectories are more salient. We test our hypothesis in a longitudinal, university-wide sample of STEM faculty—the largest sample to date of faculty mindset beliefs combined with student records.

RESULTS To test our hypothesis, we examined the links between faculty mindset beliefs and the racial achievement gaps in those faculty members’ courses across seven semesters (2 years) and more than 15,000 undergraduate student records. Using a validated two-item lay beliefs about intelligence measure (8), we surveyed STEM faculty (N = 150; 40.8% response rate) at a large, selective public university (e.g., “To be honest, students have a certain amount of intelligence, and they really can’t do much to change it”; α = 0.91, M = 3.87, SD = 1.46). All 13 STEM departments (e.g., Astronomy, Biology, Computer Science, Mathematics, and Physics) at the university were represented in the sample. More than half (55.3%) of the sample was tenured, and the average STEM teaching experience was 18.4 years. The percentage of female and URM faculty in the sample was similar to the demographics of STEM faculty nationwide (faculty sample: 26.7% female, 4.7% URM; nationwide: 20.4% female, 5.2% URM) (1). University records provided course grades for all students [N = 15,466; 7172 women (46.4%); 1685 URM (10.9%)] enrolled in all of the courses (n = 634) taught by the STEM faculty respondents over seven academic terms. Thus, student-level data in this study represent a census (the entire population of individuals in a setting) rather than a sample that is used to estimate the population. A multilevel regression model accounted for the nested nature of the data (students nested within courses, nested within faculty) and controlled for confounding factors such as students’ previous achievement (SAT scores) and all available course and faculty characteristics (17). All variables were standardized so that coefficients from the multilevel model can be interpreted as effect sizes (18). 
Last, we added partially crossed random effects to the model because students could enroll in multiple courses from the same faculty member or in courses from multiple faculty members in the sample across the seven academic terms (19). Table S1 provides fixed effects estimates from the model. On average, all students performed more poorly in STEM courses taught by faculty who endorsed more fixed (versus growth) mindset beliefs (B = 0.08, P = 0.011). However, consistent with stereotype threat and the cues hypothesis, fixed faculty mindset beliefs were more strongly associated with lower course performance among Black, Latino, and Native American (URM) students (B = 0.12, P = 0.001) than among White and Asian students (non-URM; B = 0.08, P = 0.010; group × faculty mindset interaction: B = 0.04, P = 0.041; Fig. 1). On average, non-URM students earned 0.14 grade point average (GPA) points (on a 4.0 scale) higher than URM students, yet in courses taught by faculty who endorsed more of a fixed mindset (−1 SD), the racial achievement gap grew to 0.19 GPA points (URM GPA = 2.71; non-URM GPA = 2.90). However, in courses taught by faculty who endorsed more of a growth mindset (+1 SD), the racial achievement gap shrank to 0.10 GPA points (URM GPA = 2.96; non-URM GPA = 3.06). Thus, the racial achievement gap was nearly twice as large in courses taught by college professors who endorsed fixed (versus growth) mindset beliefs about students’ ability.

Fig. 1. Faculty mindset beliefs predict the racial achievement gap in STEM courses. Predicted values are computed from the interaction between faculty mindset beliefs (fixed = −1 SD, growth = +1 SD) and students’ URM (Black, Hispanic, Native American) status. Error bars represent ±1 SE.

Which STEM faculty are more likely to endorse fixed mindset beliefs? Do faculty who endorse fixed mindset beliefs tend to be men or women? White, Asian, or URM? 
Men and women faculty were just as likely to endorse fixed mindset beliefs (B = 0.14, P = 0.648; Table 1), and there were no mindset differences by faculty race/ethnicity (B = 0.03, P = 0.956). As social desirability and awareness regarding mindset beliefs grow (20), it is possible that the explicit endorsement of fixed mindset beliefs may be generational such that older (versus younger) faculty members may be more likely to endorse them. Similarly, it is possible that tenured (versus untenured) faculty with more (versus less) college teaching experience may endorse more fixed mindset beliefs. Yet, we find no evidence that endorsement of fixed mindset beliefs differs by professors’ age, tenure status, or years of college teaching experience (all Ps > 0.35). It could also be that fixed mindset beliefs might be more common in certain STEM disciplines (21). However, we found that fixed mindset beliefs transcended STEM disciplines and were endorsed equally across the 13 STEM disciplines in our sample (all Ps > 0.14). Thus, it seems that fixed mindset beliefs are not gendered, generational, endorsed only by majority group members, simply a function of accumulated teaching experience, or more concentrated in certain STEM disciplines.

Table 1. Faculty characteristics predicting faculty mindset beliefs. Higher scores on faculty mindset beliefs reflect a more growth mindset. Gender was coded as follows: female = 1, male = 0. Race/ethnicity was coded as follows: URM (Black, Hispanic, Native American) = 1, non-URM (White, Asian) = 0. Tenure status was coded as follows: tenured = 1, nontenured = 0. Biology was used as the reference group for STEM discipline dummy codes.

Exploring other faculty characteristics as additional predictors of URM underperformance
Do faculty characteristics alone exacerbate or attenuate URM underperformance, and are fixed mindset beliefs more threatening when they come from faculty with certain demographic characteristics? 
For example, is it worse for URM students when a White professor endorses fixed (versus growth) mindset beliefs? Studies of students’ prototypes of scientists and engineers demonstrate that students often conjure images of older white men as the gatekeepers of science (22); therefore, it is plausible that faculty with these characteristics may be more likely to activate stereotype threat among URM students, resulting in larger racial achievement gaps in these professors’ classes. We explored the role of all available faculty characteristics in our dataset (i.e., faculty gender, race/ethnicity, age, tenure status, and teaching experience) as (i) additional predictors of URM underperformance and as (ii) potential moderators of the faculty mindset effects. Same-race role models and exam proctors have been shown to buffer URM students against stereotype threat underperformance in experimental laboratory settings (13, 23, 24); however, we found that URM (versus non-URM) faculty did not have smaller racial achievement gaps in their classes (B = 0.30, P = 0.215). Moreover, professors’ racial identity did not buffer URM students against the negative effects of fixed faculty mindset beliefs (faculty race/ethnicity × mindset interaction: B = −0.11, P = 0.502)—fixed mindset beliefs were equally bad for URM students when they were endorsed by White or URM professors. Similar findings emerged for faculty gender (all Ps > 0.24). Perhaps faculty who are older, have more teaching experience, or are tenured experts in their field are more identity threatening for URM students, especially when they endorse fixed mindset beliefs. Yet, professors’ age, teaching experience, and tenure status did not predict the racial achievement gaps in their classes (all Ps > 0.19), nor interact with their mindset beliefs to predict URM students’ grades (all Ps > 0.41). 
Demonstrating the strong impact of faculty mindset beliefs, when faculty demographics, mindset beliefs, and students’ URM status (and all interactions between these variables) were included in the model, the mindset beliefs of professors remain the consistent predictor of the racial achievement gap in their courses (table S2). This suggests that faculty mindset beliefs are powerfully associated with URM students’ intellectual performance—above and beyond that of other faculty characteristics such as their professors’ gender, race/ethnicity, age, teaching experience, and tenure status. What is it like to be a student in classes taught by faculty who endorse more of a fixed (versus growth) mindset? If professors communicate their beliefs through verbal and nonverbal behavior (9), then professors who endorse fixed mindset beliefs should be less likely to use pedagogical practices that emphasize learning and the potential for growth and development (9, 25, 26). What would be the point of emphasizing learning, growth, and development if you do not believe that students can grow their skills and abilities? Without faculty emphasis on learning, growth, and development, we expected that students would report being less motivated to do their best work in these professors’ classes. If students are less motivated, then they should be less likely to recommend these professors’ courses to others. It is possible that faculty who endorse fixed mindset beliefs create more demanding courses—requiring students to spend more time studying and preparing for their course. If this is true, then differences in students’ performance and psychological experiences might be explained by the demands of these courses (instead of professors’ mindset beliefs). Four semesters of students’ average course evaluation responses for all courses taught by all faculty respondents shed light on students’ experiences in these professors’ courses. 
Student-level responses were unavailable because of confidentiality concerns, so we were unable to examine racial/ethnic differences in students’ classroom experiences. We tested multilevel models, controlling for course and faculty characteristics, to account for courses nested within faculty. Consistent with the theory that faculty’s fixed mindset beliefs are demotivating to students, students reported less “motivation to do their best work” in classes taught by faculty who endorsed more fixed mindset beliefs (B = 0.09, P = 0.028) (Fig. 2 and table S3). Students also reported that fixed mindset professors were less likely to use pedagogical practices that “emphasize learning and development” (B = 0.09, P = 0.005). Exploratory mediation analyses of responses to these two questions (see the Supplementary Materials) revealed that these demotivating pedagogical practices statistically explained the effect of faculty mindset on course grades for both URM and non-URM students, although this effect was larger for URM students. Thus, faculty who endorsed more fixed mindset beliefs used less motivating pedagogical practices (at least as reported by students), and these practices were associated with lower course performance for all students on average and especially for URM students.

Fig. 2. Faculty mindset beliefs predict students’ experiences in STEM courses. Predicted values are computed from the mean of faculty mindset (fixed = −1 SD, growth = +1 SD). Error bars represent ±1 SE. ns, not significant. *P < 0.05 and **P < 0.01.

Given that faculty who endorsed fixed mindset beliefs used less motivating pedagogical practices than faculty who endorsed growth mindset beliefs, it is not surprising that students were less likely to recommend these courses to others (B = 0.08, P = 0.006). Faculty mindset beliefs did not predict the amount of time that the course required (B = −0.04, P = 0.350). 
This finding suggests that fixed mindset professors do not demand more of students—at least from the students’ perspective—than do growth mindset professors; the amount of time that students reported studying or preparing outside of class remained the same across courses taught by fixed and growth mindset professors.
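
The achievement-gap figures reported earlier in this section follow directly from the predicted GPAs at −1 SD and +1 SD of faculty mindset. The base R sketch below reproduces that arithmetic from the four reported values. The data frame `predicted` and its column names are illustrative assumptions, not objects from the authors' analysis code.

```r
# Predicted course GPAs as reported in the Results
# (fixed = -1 SD, growth = +1 SD on the faculty mindset measure).
# The object and column names here are hypothetical, chosen for illustration.
predicted <- data.frame(
  faculty_mindset = c("fixed", "growth"),
  urm_gpa         = c(2.71, 2.96),
  non_urm_gpa     = c(2.90, 3.06),
  stringsAsFactors = FALSE
)

# Racial achievement gap = non-URM GPA minus URM GPA, rounded to 2 decimals.
predicted$gap <- round(predicted$non_urm_gpa - predicted$urm_gpa, 2)

# Ratio of the gap under fixed mindset faculty to the gap under growth mindset faculty.
gap_ratio <- predicted$gap[1] / predicted$gap[2]

print(predicted)
print(round(gap_ratio, 1))
```

The gaps come out to 0.19 and 0.10 GPA points, a ratio of about 1.9, which is the basis for describing the gap as nearly twice as large under fixed mindset faculty.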

DISCUSSION Our findings suggest that faculty mindset beliefs predict students’ experiences in their STEM courses and the magnitude of the racial achievement gaps in these courses. We found that the racial achievement gaps in courses taught by more fixed mindset faculty were twice as large as those in courses taught by more growth mindset faculty. To our knowledge, this study examines the largest sample of STEM courses (>600) and students (>15,000) to date, including more than 1600 URM students. Moreover, it is the first to examine the association of professors’ self-reported mindset beliefs with their own students’ grades, demonstrating the implications of faculty mindset beliefs for URM underperformance in STEM courses. Supplemental analyses show that faculty beliefs that are most proximal to students’ experiences (that is, the beliefs of the specific professor who is teaching one’s class) matter more for students’ performance in that class than do discipline-level faculty beliefs (that is, the average faculty beliefs within a STEM discipline). Together, these findings suggest that the mindset beliefs of STEM college professors shape the motivation and achievement of students in their classes, and these beliefs matter especially for URM students in their classes. Professors’ beliefs about the nature of intelligence are likely to shape the way they structure their courses, how they communicate with students, and how they encourage (or discourage) students’ persistence (9). These malleable teaching practices have important implications for the motivation, learning, and achievement of all students in their classes. However, we argue that faculty beliefs about which students “have” ability in STEM might constitute a greater barrier for URM students because fixed mindset beliefs may make group ability stereotypes salient, creating a context of stereotype threat. 
Recent research suggests that when stigmatized students expect to be stereotyped by fixed mindset institutions, they experience less belonging, less trust, and more anxiety and become less interested (27, 28), suggesting that fixed mindset faculty might also engender these adverse outcomes among students. In the present research, we were unable to assess students’ stereotype threat experiences directly, as this would have required a survey assessment on a prohibitively large scale (e.g., more than 15,000 students). However, it is important to note that most of the stereotype threat literature, including the original demonstrations of stereotype threat in the context of race and gender (2, 29), documented the presence of stereotype threat by assessing intellectual performance and demonstrating greater underperformance by stigmatized groups in the context of negative situational cues (e.g., test diagnosticity). Thus, our results are consistent with this measurement tradition as well as with stereotype threat theory. Future research could measure students’ experiences of threat in response to faculty mindset beliefs. We found that fixed mindset beliefs are not concentrated within certain STEM disciplines. Instead, they appear to be distributed relatively evenly among faculty across STEM disciplines, suggesting that the negative effects of these beliefs may be found across departments, colleges, and likely at other universities. Beliefs that are concentrated within disciplines pose additional problems for stigmatized students. Previous research published in Science shows that professors’ beliefs about brilliance (i.e., whether top performance in a field requires brilliance) when aggregated to the discipline level correlate with the number of women and racial minorities enrolled in American Ph.D. programs (21), suggesting that brilliance beliefs—at the field level—may discourage the pursuit of advanced education among stigmatized groups. 
The present research complements this work by examining how more traditional mindset beliefs—here, professors’ beliefs about the fixedness (or malleability) of intelligence—shape undergraduate students’ classroom experiences, their performance, and the racial inequalities in those particular professors’ courses. This work suggests that faculty mindset beliefs could be an important predictor of future decisions regarding the pursuit of advanced education in specific STEM fields. Future research could test this possibility. Fixed mindset beliefs were also uncorrelated with faculty identities (e.g., gender, race/ethnicity, and age) and experiences (e.g., tenure and teaching experience), suggesting that fixed mindset beliefs are problematic for students, regardless of the faculty member’s background. However, there are reasons to be optimistic here. Fixed mindset beliefs are changeable. Studies have shown that cost-effective educational interventions can help people develop more of a growth mindset (30, 31). Thus, professors’ mindset beliefs may be a potential lever to creating identity-safe college classrooms (32)—learning environments where all students, regardless of race/ethnicity, feel that they are valued and encouraged to reach their full potential. Millions of dollars in federal funding have been earmarked for student-centered initiatives and interventions that combat inequality in higher education and expand the STEM pipeline. Rather than putting the burden on students and rigid structural factors, our work shines a spotlight on faculty and how their beliefs relate to the underperformance of stigmatized students in their STEM classes. Investing resources in faculty mindset interventions could help professors understand the impact of their beliefs on students’ motivation and performance and help them create growth mindset cultures in their classes at little to no cost. 
If more faculty create growth mindset cultures in their classes, then this could increase students’ motivation and engagement in STEM—potentially inspiring more URM students to pursue STEM careers. Even a small increase in STEM course grades could mean the difference between receiving credit for the course, retaining financial aid, and/or advancing toward a STEM degree. In this study, 150 faculty taught more than 15,000 students in just 2 years’ time, underscoring the pervasive influence each college faculty member possesses. Faculty-centered interventions may have the unprecedented potential to change STEM culture from a fixed mindset culture of genius to a growth mindset culture of development while narrowing STEM racial achievement gaps at scale (33).

MATERIALS AND METHODS

Participants
All currently employed STEM faculty (including adjuncts, lecturers, postdocs, and graduate students) who had taught at least one course at the university were recruited by email invitation. Emails were obtained from university records. In total, 483 STEM faculty were contacted, and 197 provided usable data (40.8%). We excluded 45 faculty who had not taught at least one undergraduate course within the previous 2 years and 2 faculty who did not answer the two mindset beliefs questions. The final sample included 150 faculty across 13 STEM departments: Astronomy, Biology, Biochemistry, Biotechnology, Chemistry, Cognitive Science, Computer Science, Economics, Geological Science, Informatics, Mathematics, Physics, and Statistics. See the Supplementary Materials for a comparison of STEM faculty who opted in to the study with those who opted out.

Faculty survey measures
Participants completed the survey online and were told to “consider the undergraduate students you teach (or have taught) at [the university] when responding to these questions.” Faculty mindset beliefs were measured with two items (i.e., “To be honest, students have a certain amount of intelligence, and they really can’t do much to change it”; “Your intelligence is something about you that you can’t change very much”; α = 0.91) on a 1 (strongly agree) to 6 (strongly disagree) scale. Higher scores on the faculty mindset belief measure represented a more growth mindset. Teaching experience was measured with one item (“How many years have you been teaching in your field?”). Participants were asked to provide their gender, race/ethnicity, and age. Tenure status was collected from university records.

Student variables
University records provided students’ gender, race/ethnicity, first-generation status, and SAT scores for all students (N = 15,466; 46.4% women) enrolled in all of the courses (n = 634) taught by the STEM faculty respondents over seven academic terms. 
Black, Hispanic, Native American/Alaska Native, and Native Hawaiian/Pacific Island students were categorized as underrepresented minority (URM; n = 1685; 10.9%). White and Asian students were categorized as the majority group (n = 13,781; 89.1%). Students who did not provide the university with their race/ethnicity or were designated as having “two or more races” were excluded from analysis (n = 3271). Students were categorized as first generation if neither parent/guardian had obtained a 4-year college degree (n = 2255; 14.6%). If a student took the ACT instead of the SAT, then their ACT composite was converted to an SAT score. Students who did not provide the university with an SAT or ACT score were excluded from analysis (n = 440).

Course grades
Course grades were obtained from university records for all students (N = 15,466) in all courses taught by the faculty members in our sample for seven semesters (2 years) preceding the faculty survey. Grades were provided on a 4.0 scale (A/A+ = 4.0, A− = 3.7, B+ = 3.3, B = 3.0, B− = 2.7, C+ = 2.3, C = 2.0, C− = 1.7, D+ = 1.3, D = 1.0, D− = 0.7, F = 0.0).

Course-level variables
University records provided course characteristics, such as the number of students enrolled in each course and the course level (i.e., 100, 200, 300, or 400 level). A 100-level course is typically an introductory course, whereas a 400-level course is typically a more advanced course. Of the 634 courses included in the sample, 24.0% were 100-level, 23.3% were 200-level, 31.7% were 300-level, and 21.0% were 400-level courses.

Course evaluations
Four semesters of students’ average course evaluation responses for all courses taught by the faculty in our sample were collected from university records. Course evaluations at this university were standardized across all courses and intended to be used for faculty development (i.e., to help faculty improve teaching) and for tenure and promotion decisions. 
At the end of the semester, students answered two questions concerning the professor’s pedagogical practices (i.e., “How much did the instructor motivate you to do your best work?” and “How much did the instructor emphasize student learning and development?”) and one question concerning their overall recommendation of the instructor (i.e., “How likely would you be to recommend this course with this instructor?”) on a 1 (not at all) to 4 (a lot/very likely) scale. Students answered one question concerning the amount of time the course required (i.e., “Compared to other courses you’ve taken how much time did this course require?”) on a 1 (much less time) to 5 (much more time) scale. Additional evaluation questions were asked of students by the university; however, only the evaluation questions reported above were publicly available online; therefore, our analyses were limited to these four questions. Courses with fewer than five enrolled students were not included in analyses to make sure that results were not biased by low response rates. Student-level responses were unavailable because of confidentiality concerns; for this reason, we were unable to examine racial/ethnic differences in students’ classroom experiences.

Hierarchical models
We used hierarchical linear modeling to account for the nested structure of the data (17). To examine the factors that affect student course grades, we tested a three-level model in which students (level 1) were nested in courses (level 2) and courses were nested within faculty (level 3). The model included partially crossed random effects because students could take courses from more than one faculty member (19). 
In the model, we controlled for all available student characteristics (gender, race/ethnicity, first-generation status, and SAT scores), all available course characteristics (course enrollment and three dummy variables that account for course level), and all available faculty characteristics (gender, race/ethnicity, age, years of teaching experience, and tenure status). See tables S4 to S6 for correlations among variables at each level. Missing data were handled by listwise deletion. The slope of student race/ethnicity was allowed to vary by course to estimate the cross-level interaction between faculty mindset and student race/ethnicity. The intraclass correlation coefficient (ICC) for course section (level 2) was 0.06, indicating that course sections accounted for 6% of the variance in student grades. The ICC for faculty (level 3) was 0.09, indicating that faculty accounted for 9% of the variance in student grades. The model was fitted using the lme4 package (34) for R version 3.3.1 (35) using restricted maximum likelihood. We used the lmerTest package to obtain P values for fixed effects (36). T tests used the Satterthwaite approximations to degrees of freedom. All continuous variables were standardized. Categorical variables were coded as follows: female = 1, male = 0; URM (Black, Hispanic, Native American) = 1, non-URM (White, Asian) = 0; first-generation = 1, continuing-generation = 0; tenured = 1, nontenured = 0. We added three dummy codes to control for course level, with level 100 as the reference group (i.e., level 200 = 1 and level 100 = 0). Specifically, we estimated a model using the following R code, which was adapted from Bates et al. 
(34):

M1 <- lmer(Student_Course_Grade ~ Faculty_Mindset*Student_Race +
    Student_Firstgeneration + Student_Gender + Student_SAT +
    Faculty_Gender + Faculty_Teaching_Experience + Faculty_Tenure_Status +
    Faculty_Age + Faculty_Race +
    Course_Enrollment + Course_200Level + Course_300Level + Course_400Level +
    (1 | Student_ID) + (Student_Race | Faculty_ID/Course_ID))

To examine average course evaluations, we tested a two-level model in which courses (level 1) were nested within faculty (level 2). In this model, we controlled for the same course characteristics (course enrollment and three dummy variables that account for class level) and faculty characteristics (gender, race/ethnicity, age, years of teaching experience, and tenure status) as the previous model. The ICC for faculty (level 2) ranged from 0.51 to 0.60, depending on the question, indicating that faculty accounted for approximately 51 to 60% of the variance in students’ course evaluation responses. The following R code was used to estimate the models:

M2 <- lmer(Course_Evaluations ~ Faculty_Mindset +
    Faculty_Gender + Faculty_Teaching_Experience + Faculty_Tenure_Status +
    Faculty_Age + Faculty_Race +
    Course_Enrollment + Course_200Level + Course_300Level + Course_400Level +
    (1 | Faculty_ID))
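
The variable coding described in this section (the 4.0-point grade scale, the three course-level dummy codes with level 100 as the reference group, and standardized continuous predictors) can be sketched in base R. This is a minimal illustration on hypothetical toy records, not the authors' preprocessing code: the data frame `toy` and its columns `letter_grade`, `course_level`, and `sat` are assumptions made for the example, and the values in it are invented.

```r
# Hypothetical toy records; values are illustrative only, not study data.
toy <- data.frame(
  letter_grade = c("A", "B+", "C-", "F"),
  course_level = c(100, 200, 300, 400),
  sat          = c(1100, 1250, 1300, 1450),
  stringsAsFactors = FALSE
)

# Grade points on the 4.0 scale listed under "Course grades".
grade_points <- c("A+" = 4.0, "A" = 4.0, "A-" = 3.7,
                  "B+" = 3.3, "B" = 3.0, "B-" = 2.7,
                  "C+" = 2.3, "C" = 2.0, "C-" = 1.7,
                  "D+" = 1.3, "D" = 1.0, "D-" = 0.7, "F" = 0.0)
toy$grade <- unname(grade_points[toy$letter_grade])

# Three dummy codes for course level; level 100 is the reference group
# (all three dummies equal 0 for a 100-level course).
toy$Course_200Level <- as.integer(toy$course_level == 200)
toy$Course_300Level <- as.integer(toy$course_level == 300)
toy$Course_400Level <- as.integer(toy$course_level == 400)

# Standardize a continuous variable to mean 0 and SD 1, so that model
# coefficients can be read on a standardized scale.
toy$sat_z <- as.numeric(scale(toy$sat))

print(toy)
```

Columns prepared this way correspond to the kinds of variables named in the model formulas above (for example, `Course_200Level`), although the actual variable construction used in the study may have differed.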

SUPPLEMENTARY MATERIALS Supplementary material for this article is available at http://advances.sciencemag.org/cgi/content/full/5/2/eaau4734/DC1
Supplemental Analyses
Table S1. Fixed effects estimates predicting students’ grades in STEM courses.
Table S2. Testing the role of other faculty characteristics.
Table S3. Fixed effects estimates predicting course evaluations.
Table S4. Correlations among the variables at level 1 (student).
Table S5. Correlations among the variables at level 2 (course).
Table S6. Correlations among the variables at level 3 (faculty).
Table S7. Discipline-level mindset beliefs.
Fig. S1. Mediation models for URM and non-URM students.
References (37–39)

This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial license, which permits use, distribution, and reproduction in any medium, so long as the resultant use is not for commercial advantage and provided the original work is properly cited.

Acknowledgments: We thank N. Yel for statistical support throughout this project and the members of the Mind and Identity in Context Lab, Indiana University. Funding: This work was supported by NSF grants DRL-1450755 and HRD-1661004 awarded to M.C.M. and a Russell Sage Foundation grant (87-15-02) awarded to M.C.M. Author contributions: All authors designed the research and collected the data. E.A.C. analyzed the data. E.A.C. and M.C.M. wrote the manuscript with input from K.M. and D.J.G. Competing interests: The authors declare that they have no competing interests. Data and materials availability: All de-identified data, code, and materials are available upon request and by IRB approval. In compliance with IRB policies, group characteristics will only be shared when there are 10 or more individuals within the group to preserve participants’ anonymity. Additional data related to this paper may be requested from the authors.