Imagine if marathon runners were ranked simply by taking their average time over every course. Some courses are clearly harder than others, and runners can choose which races to enter, so a runner could always improve her ranking by refusing to run on difficult courses. Even sillier would be to rank runners by their average finish position across all races: a world-class professional could just run against high-schoolers and finish first without even trying.

Strangely, the current system for evaluating American college students manages to achieve this extraordinary level of silliness. Since students can select their own courses, and grades from all courses count equally, they are rewarded for taking easier courses and punished for taking harder ones. A first-year student taking introductory English literature gets exactly the same credit as her classmate who precociously jumps into graduate-level literary analysis.

College grade-point averages (GPAs) are not merely a matter of pride. Medical schools, law schools and consulting firms, among other popular post-graduation destinations, have strict GPA cut-offs; any student who fails to make the grade will struggle to have their application even seen by a human. From an individual student’s standpoint, it’s completely rational to optimise for GPA, even at the expense of other considerations. From the university’s point of view, though, that expense is vast.

That’s because a running race exists to test current ability; the training itself has happened long before. College, by contrast, is supposed to be the place where the training actually happens. That makes college GPA significantly worse than our imagined marathon ranking. If you believe that, even occasionally, taking a harder course would help a student learn more, the GPA system incentivises students to take courses where they’ll learn less. In other words, if you believe that college helps make people smarter, the GPA system makes us collectively dumber.

Universities have certainly shown concern over GPA in the past. Princeton University and Wellesley College, among others, spent over a decade experimenting with policies to curb grade inflation, mandating a fixed cap on the number of A-grades or a cap on the average GPA in most of their courses. But while inflation-curbing policies can help equalise the difficulty of different courses, they don’t change the incentive for each student to take the easiest courses possible.

And there is always a way to find easier courses. In a college where professors can grade at their own discretion, some might be known for grading more generously. But even in universities with enforced grading standards, different courses necessarily attract students with different levels of ability: the students in an intermediate course will always be tougher competition than the students in the introductory class. And some courses are easier for specific students, even if they’re not easy for students as a whole: it’s not uncommon in Ivy League schools to see fluent foreign-language speakers feign some degree of ignorance in order to take introductory language courses where they’ll get perfect grades with minimal effort.

There are two potential solutions to this general class of problems. One is to create a standardised test that all students must take, thereby making outcomes directly comparable. This is what happened at the high-school level, for better or worse, with the scholastic aptitude tests (SATs) in reading, writing and maths: every student sits the same assessment, whatever courses they took. However, at the university level, it seems implausible to imagine all students across different specialties taking a single, unified test.

The other solution is to find a way to compare and calibrate different grades from different courses. Over the years, many academics – including Valen Johnson, then of Duke University, and Jonathan Caulkins and his co-authors at Carnegie Mellon – have proposed methods that would try to control for both differences in instructor grading and different levels of student ability. The correction for instructor stringency relies on the idea that all instructors are (implicitly) ranking their students from top to bottom, regardless of the grade cut-offs they use. If we work from those rankings rather than from the raw grades, we can compare students across instructors who give very different grades. The correction for student ability relies on tracking the same students across their various courses, and inferring the average student ability in each course as a result. The adjusted GPA measures reward students both for getting good grades and for taking difficult courses – in short, for working and learning as much as possible.
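To make the idea concrete, here is a minimal sketch of one way such an adjustment could work. It is a deliberately simplified additive model, not the method from either proposal mentioned above: it assumes each grade is roughly a student's ability plus a course's leniency, and estimates both by alternating averages. The function names, the student and course labels, and the grades are all invented for illustration.

```python
from collections import defaultdict


def raw_gpa(records):
    """Plain average grade per student; records are (student, course, grade)."""
    totals, counts = defaultdict(float), defaultdict(int)
    for student, _course, grade in records:
        totals[student] += grade
        counts[student] += 1
    return {s: totals[s] / counts[s] for s in totals}


def adjusted_gpa(records, iterations=200):
    """Fit grade ~ ability[student] + leniency[course] (an assumed model).

    Alternates between re-estimating abilities given leniencies and
    leniencies given abilities. Leniencies are centred on zero so that
    abilities stay on the ordinary grade scale.
    """
    ability = {s: 0.0 for s, _c, _g in records}
    leniency = {c: 0.0 for _s, c, _g in records}
    for _ in range(iterations):
        # Ability: a student's average grade after removing course leniency.
        totals, counts = defaultdict(float), defaultdict(int)
        for s, c, g in records:
            totals[s] += g - leniency[c]
            counts[s] += 1
        ability = {s: totals[s] / counts[s] for s in totals}
        # Leniency: a course's average grade after removing student ability.
        totals, counts = defaultdict(float), defaultdict(int)
        for s, c, g in records:
            totals[c] += g - ability[s]
            counts[c] += 1
        leniency = {c: totals[c] / counts[c] for c in totals}
        # Centre leniencies; otherwise the model is only fixed up to a shift.
        mean = sum(leniency.values()) / len(leniency)
        leniency = {c: v - mean for c, v in leniency.items()}
    return ability, leniency


# Hypothetical transcript: both students share "core"; Ana also takes a
# graduate course, Ben an introductory one.
records = [
    ("ana", "core", 3.5), ("ana", "grad", 3.0),
    ("ben", "core", 3.0), ("ben", "intro", 3.5),
]
print(raw_gpa(records))           # both students average 3.25
ability, leniency = adjusted_gpa(records)
print({s: round(a, 2) for s, a in ability.items()})  # ana 3.5, ben 3.0
```

In this toy transcript the two students have identical raw GPAs, but the shared course reveals that Ana outperforms Ben, so the model concludes that the graduate course is the harder one and credits her for taking it. The approach only works when students overlap across courses; a course taken by a group with no other courses in common with anyone else cannot be calibrated against the rest.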

At the time of Johnson’s proposal in 1997, Duke University briefly flirted with implementing his alternative GPA (as a supplement to the traditional GPA), but after a rambunctious public consultation the proposal was narrowly defeated in a faculty vote. The opposition mostly came from students and members of the humanities departments, who were worried that their courses would be deemed less demanding. As far as I can tell, no university has ever implemented an alternative GPA.

One sad feature of the college GPA situation is that so many of the parties are responding rationally to the incentives they face. Students facing GPA cut-offs are behaving rationally by taking easier courses for better grades, even if they would prefer to take harder courses in a system that didn’t punish that choice. Graduate schools and employers might think GPA is a critically flawed metric of student ability, but still end up using it for want of better alternatives. Individual professors get better student evaluations and easier lives if they grade generously, and can’t individually correct for differences in course difficulty anyway.

What remains a mystery, though, is why universities play along. While any transition in the GPA system might prove contentious in the short run, the examples of Princeton and Wellesley show a willingness to suffer short-term costs for a longer-term good. As houses of learning, universities should want to encourage students to challenge themselves as much as possible. Surely the first step would be to stop punishing those students. The adjusted GPA measures are ready to go; all they need is for ambitious universities to fire the starting gun.