Submitted on June 23, 2011

Many believe that current teacher evaluation systems are a formality, a bureaucratic process that tells us little about how to improve classroom instruction. In New York, for example, 40 percent of all teacher evaluations must consist of student achievement data by 2013. Additionally, some are proposing the inclusion of alternative measures, such as “independent outside observations” or “student surveys.” Here, I focus on the latter.

Educators for Excellence (E4E), an “organization of education professionals who seek to provide an independent voice for educators in the debate surrounding education reform”, recently released a teacher evaluation white paper proposing that student surveys account for 10 percent of teacher evaluations.

The paper quotes a teacher saying: “for a system that aims to serve students, young people’s interests are far too often pushed aside. Students’ voices should be at the forefront of the education debate today, especially when it comes to determining the effectiveness of their teacher.” The authors argue that “the presence of effective teachers […] can be determined, in part, by the perceptions of the students that interact with them.” Also, “student surveys offer teachers immediate and qualitative feedback, recognize the importance of student voice […].” In rare cases, the paper concedes, “students could skew their responses to retaliate against teachers or give high marks to teachers who they like, regardless of whether those teachers are helping them learn.”

But student evaluations are not new.

Anonymous student evaluations are widely used at US universities for tenure and promotion decisions. Their purpose is to gauge the teaching quality of faculty members and to help them improve their instruction. Yet the general perception among faculty members is that student evaluations achieve neither of these goals. Social psychologists, in particular, have a good understanding of how such evaluations work and what they really reflect.

One of the most compelling experiments investigating biases in student evaluations – with the irresistible and descriptive title “Motivated Stereotyping of Women: She’s Fine if She Praised Me but Incompetent if She Criticized Me” – was done by social psychologists Sinclair and Kunda (see here). The researchers administered a test of 10 open-ended questions to 50 male students. Each student was later given feedback on how he did via one of four randomly selected pre-recorded videos featuring two evaluators, a man and a woman, and two scripts, one praising the student’s performance and one criticizing it. After receiving their feedback, students were asked to rate their evaluator. Students who received positive feedback gave roughly the same ratings to male and female evaluators. Among those who were given negative feedback, the female evaluator was rated significantly lower than the male. Sinclair and Kunda ran a second experiment where observers (male students who did not take the test) rated the evaluator. They found that observers’ ratings were uncorrelated with evaluators’ genders. The authors concluded that when people are criticized, they unconsciously maintain self-esteem by using negative stereotypes about the criticizer to discount their negative feedback. The researchers found the same effect for race (see here).

In an older experiment by Kaschak, a group of 50 students (50 percent female) was given descriptions of professors and their teaching methods and asked to rate them. For this group, half the professors were listed as female and the other half as male. A second group of students (n=50, 50 percent female) was given the same descriptions, but with the genders of the professors switched. Although the professors’ gender did not affect the ratings of the female students, male students tended to rate the female professors lower (see here).

Psychological research has investigated the concept of “shifting standards” (see here and here), which suggests that when people are called on to make a judgment, they do so in relation to a point of reference that is often determined by stereotypes about social groups. For example, Bennett pointed to a gendered shift in the standard that is used to assess university instructors. She asked undergraduate students how much attention they expected and received from their instructors, and also asked them to rate their instructors’ availability outside of class. Students expected and reported getting more time from women than from men, yet were more likely to rate women as “not available enough.” The students’ reference point for “enough” clearly shifted to a higher level for women teachers (for similar studies, see here and here).

The literature on biased student evaluations is extensive. The above are but a few examples. So, we actually have a lot of evidence, not only that students’ perceptions of their teachers are unconsciously biased, but also about how and why. The major problem is not that some students will use teacher ratings in spiteful ways. The more serious problem is that it’s difficult to design a useful set of survey questions that circumvent students’ innate perceptual biases, which typically and unconsciously discriminate against traditionally devalued social groups (e.g., women, African Americans). Difficult, but not impossible.

In a previous post, I argued that there is no such thing as an unbiased evaluation. Here, I sound a further note of caution by focusing on a specific context, the student-teacher relationship. As of right now, few, if any, evaluation systems tell us much about student expectations; these surveys are simply not designed with the intention of detecting biases. In light of the evidence, it’s clear that those involved in designing new methods of teacher evaluation must find ways to assess the effectiveness of the goals and outcomes of the teaching process itself – not the perceived effectiveness of the teacher – which will hopefully be less subject to students’ unconscious judgments and expectations.

- Esther Quintero