Physics teachers graded this answer to an exam question. (Screenshot: ETH Zürich / Sarah Hofer)

In an online test, Hofer asked secondary school physics teachers to grade an exam answer. She presented 780 participants from Switzerland, Germany and Austria with the same question from the field of classical mechanics and the exact same fictitious – and only partially correct – student answer. The only thing that the ETH scientist varied in the experiment was a short, introductory written statement: half of the participants were led to believe that they were grading an answer from a “male student”, the other half that it came from a “female student”. Hofer left the participants in the dark about the purpose of her study and instead presented it as a cross-comparison of two different methods for correcting exams.

The participants graded the physics task differently. In her analysis, Hofer compared the range of grades given to the supposed female students with those given to the supposed male students. The good news: for teachers who had taught for at least ten years, the gender of the student had no influence on the grade. The bad news: teachers in Switzerland and Austria who had taught for fewer than ten years gave the girls a significantly lower grade than the boys. As an example: teachers with five or fewer years of professional experience graded girls lower by 0.7 (Switzerland) and 0.9 (Austria) of a grade on average.

When stereotypes influence

“Teachers with less teaching experience are possibly more guided by the bias that girls are worse at physics than boys when grading,” says Hofer. Earlier studies have already provided evidence that girls have to work harder for the same grades in science-related subjects, but most of those studies looked at the field of mathematics. The present study is the most comprehensive and most recent one for the field of physics and the German-speaking countries.

It is known that biases and stereotypes have an impact on grading when the evaluator does not have enough information or is extremely stressed or overwhelmed, says Hofer. “Teachers with less experience are apparently more influenced by contextual information such as gender.”

Mixed picture in Germany

The results of the new study for German secondary school teachers with fewer than ten years of teaching experience are curious: the male teachers graded the girls and boys the same, while the female teachers behaved like their Swiss and Austrian colleagues and graded the girls more poorly. German female teachers with five or fewer years of experience graded the girls lower by 0.9 of a grade on average. Hofer and Elsbeth Stern, Professor for Empirical Educational Research, could not explain these special circumstances based on the data available. One possible explanation is that German male teachers are more sensitised than their colleagues in the other countries studied due to promotion programmes for girls in the STEM fields (science, technology, engineering and mathematics). However, as Hofer points out, such programmes exist in all three countries.

In addition to gender, the researcher also varied the stated specialisation of the fictitious students (languages vs. science) in the introduction to the online test; specialisation did not affect the grade.

Girls are not rewarded for effort

For ETH professor Stern, the poorer grades for girls, as demonstrated in this study, are part of a more fundamental problem: “Girls and women cannot count on being rewarded for their effort.” At times they will be graded too generously, at other times too harshly. Their grades do not reflect actual performance as well as they do for boys and men, which makes it difficult for them to find their direction. “As a girl, when you already have the feeling in school that you won't be fairly graded in sciences, then you tend to lose interest in these subjects,” says Stern. Instead, scientifically gifted women all too frequently turn to other subjects in which they are likely to be more strongly promoted. This is something that should be taken into account in the ongoing STEM promotion programmes.

“Grades are the feedback that students receive for their performance, and they strongly affect their self-perception, motivation and willingness to make an effort,” says Hofer. “Teachers should therefore take grades very seriously,” says Stern. Likewise, even greater attention should be paid to grading during teacher training. This will be done in the teacher training at ETH Zurich.

But even more fundamentally, stereotypes should be critically scrutinised, especially at school, says Hofer. When grading exam questions, a more structured approach with clear criteria could help teachers grade objectively and block out stereotypes. “It would be important for teachers to use an evaluation scheme for each exam that outlines how many points should be awarded for which partial answers and that clearly defines what counts as a careless mistake and what counts as a consequential error.” It would also be helpful if teachers covered the student's name when grading.
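
To make the idea of such an evaluation scheme concrete, here is a minimal Python sketch. It is not taken from the study: the criterion names, point values and the size of the deduction for careless mistakes are all hypothetical. It also assumes that a consequential error is penalised only once, so a later step that is correct given an earlier wrong result still counts as achieved. The scoring function takes no student name or gender as input.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    name: str      # hypothetical label for one partial answer
    points: float  # points awarded if this partial answer is correct


# Hypothetical scheme for one exam question (values are illustrative only).
EXAMPLE_RUBRIC = [
    Criterion("identifies relevant forces", 2.0),
    Criterion("sets up equation of motion", 2.0),
    Criterion("states result with unit", 1.0),
]

# Hypothetical fixed deduction per careless mistake (e.g. a slip in arithmetic).
CARELESS_MISTAKE_DEDUCTION = 0.5


def score_answer(rubric, achieved, careless_mistakes=0):
    """Return the points for one answer under the given rubric.

    `achieved` is the set of criterion names judged correct. Under the
    assumption stated above, a step that is correct given an earlier wrong
    result (a consequential error) is still listed in `achieved`.
    """
    known = {c.name for c in rubric}
    unknown = set(achieved) - known
    if unknown:
        raise ValueError(f"criteria not in rubric: {sorted(unknown)}")
    total = sum(c.points for c in rubric if c.name in achieved)
    total -= careless_mistakes * CARELESS_MISTAKE_DEDUCTION
    return max(total, 0.0)  # never award a negative score


def max_score(rubric):
    """Return the maximum number of points the rubric allows."""
    return sum(c.points for c in rubric)
```

With this sketch, an answer that meets the first two hypothetical criteria and contains one careless mistake would be scored by calling `score_answer(EXAMPLE_RUBRIC, {"identifies relevant forces", "sets up equation of motion"}, careless_mistakes=1)`. Two teachers who agree on which partial answers are correct then arrive at the same score.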