

This article is an update to a marathon prediction calculator that originally appeared on Slate in October 2014. Since then, the statisticians behind the calculator have updated the numbers and the formula on which it is based, and the author of the article has become a FiveThirtyEight staff writer. As a result, FiveThirtyEight has reworked the calculator, and we are running this follow-up article on both websites.

A few years ago, I contacted Andrew Vickers to pick his brain about statistical methods in clinical research. Vickers, a statistician at Memorial Sloan Kettering Cancer Center, was happy to discuss stats with me, but what he really wanted to talk about was running. Specifically, he wanted to talk about the race time predictors commonly found online, such as this one at Runner’s World UK, that use your time from one race to predict your finish time for a race of another distance. You type in your 5K result, for instance, and it tells you what time to expect for a 10K or a marathon.


These online calculators are usually based on an algorithm published back in 1981 by an engineer named Peter Riegel. The concept is simple: as the race distance increases, the maximum pace you can maintain decreases, which means your 10K time will be more than just double your 5K time. The Riegel formula accounts for this slowing by incorporating a “fatigue factor,” a constant known here as \(k\).

The Riegel equation:

$$\text{predicted race time} = \text{time run in earlier race} \cdot \left(\frac{\text{distance of race you’re predicting}}{\text{distance of earlier race}}\right)^k$$

Vickers had scrutinized the Riegel formula and thought that it underestimated marathon times. To bolster his point, he told me that he’d recently run a 2:59 marathon, yet the race calculators would have predicted a finish time of 2:48 based on his most recent half-marathon result. If he’d paced his race based on that prediction, he might have set himself up to run out of gas before the finish.
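To make the Riegel equation concrete, here is a minimal Python sketch. It assumes \(k = 1.06\), the value commonly cited for Riegel's formula; that number, the function names and the 25-minute 5K are illustrative assumptions rather than details from Vickers. The last step inverts the formula to see roughly what half-marathon time would produce a 2:48 prediction. That half-marathon time is inferred, not one Vickers reported.

```python
HALF_MARATHON_KM = 21.0975
MARATHON_KM = 42.195


def riegel_predict(known_time_s, known_dist_km, target_dist_km, k=1.06):
    """Scale a known race time by the distance ratio raised to the power k.

    k=1.06 is an assumed default (the commonly cited Riegel exponent).
    """
    return known_time_s * (target_dist_km / known_dist_km) ** k


def fmt_hms(seconds):
    """Format a duration in seconds as h:mm:ss."""
    total = round(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


# Hypothetical runner: a 25:00 5K projected to 10K.
ten_k = riegel_predict(25 * 60, 5.0, 10.0)
print("Predicted 10K:", fmt_hms(ten_k))

# Invert the formula: which half-marathon time yields a 2:48 marathon prediction?
predicted_marathon_s = 2 * 3600 + 48 * 60
implied_half = predicted_marathon_s / (MARATHON_KM / HALF_MARATHON_KM) ** 1.06
print("Implied half-marathon:", fmt_hms(implied_half))
```

Because a marathon is exactly twice a half-marathon, the prediction is just the half-marathon time multiplied by \(2^k\), so everything hinges on the exponent. Under the assumed 1.06, a half-marathon of roughly 1:20:35 maps to the 2:48 prediction that Vickers missed by 11 minutes.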

He proposed a project: I’d write a story about his misgivings regarding the Riegel formula and ask readers to fill out a form with their recent race times and a bit of other relevant information, and he’d use the data to create a more accurate formula. In April 2014, Slate published my initial story on the subject and included a link to Vickers’s survey.

The survey received 2,497 responses, which Vickers and his colleague Emily Vertosick used to look for factors that are linked to race performance and to come up with a better formula for predicting finishing times, which I wrote about at Slate. Now they’ve published their work in the journal BMC Sports Science, Medicine and Rehabilitation.

The data set they were working from isn’t perfect: We solicited survey responses, which means the sample of respondents wasn’t really random, and we relied on people to self-report their times and other information, which they don’t always do accurately. But even with imperfect data, it looks like Vickers was right — the Riegel formula worked great for distances up to the half-marathon, but it underestimated marathon finishing times by 10 or more minutes for half of the runners in his sample. That’s a “humongous problem” for runners who use these calculators to plan their races, Vickers said, because pacing is crucial in marathons, where starting out too swiftly can cause runners to hit a wall of exhaustion long before the finish.

Start a marathon too slowly, and you can recoup some of the time you lost by picking up the pace when you find yourself with something left in the final miles, said Stephen Seiler, a sports scientist at University of Agder in Norway who was not involved in the study, “but there is no good way to get the monkey off your back if you have gone out too hard.” The optimal way to run a fast marathon is with an even pace, so that any changes in speed come as a kick at the end. Finding your ideal pace requires estimating the finishing time you’re capable of running, and that’s what prediction calculators are meant to help you pinpoint.

The Riegel formula is modeled on world-record performances, but Vickers suspected that the numbers would be different for runners who weren’t at the world-class level. Recreational runners take at least an hour more to run their marathons compared with elite runners, so the margin for error is greater, Seiler said. “When an elite runner bonks, their performance decline in absolute terms is smaller than when the rec runner bonks.”

Another factor to consider: Some people are naturally faster at shorter distances than longer ones. “An elite runner who is strong on endurance and weak on speed will have a doubling time around 10 seconds per mile,” said Ken Young, a statistician with the Association of Road Racing Statisticians. In other words, he said, a top runner who can finish a 5K at 5:20 per mile can usually run a 10K at 5:30 per mile. “Similarly, a runner strong on speed and weak on endurance may have a doubling time of 20 sec/mile (or more),” Young said, so a 5:20 pace for 5K might translate into a 10K pace of 5:40. This difference in people’s natural abilities limits the accuracy of generalized algorithms that predict runners’ times, Young said.
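Young's doubling times can be translated into the exponent of the Riegel equation, which shows why one constant cannot fit everyone. The sketch below does that arithmetic for his two examples. The helper name is hypothetical, and the conversion is an illustration of the idea, not a calculation Young described.

```python
import math


def implied_exponent(pace_short_s, pace_long_s, distance_ratio=2.0):
    """Exponent k that reproduces the slowdown between two race paces.

    The ratio of finish times is distance_ratio * (pace_long / pace_short);
    the Riegel equation says that same ratio equals distance_ratio ** k.
    """
    time_ratio = distance_ratio * (pace_long_s / pace_short_s)
    return math.log(time_ratio) / math.log(distance_ratio)


pace_5k = 5 * 60 + 20  # 5:20 per mile, in seconds

# Endurance-type runner: 10 seconds per mile slower when the distance doubles.
k_endurance = implied_exponent(pace_5k, pace_5k + 10)

# Speed-type runner: 20 seconds per mile slower when the distance doubles.
k_speed = implied_exponent(pace_5k, pace_5k + 20)

print(f"endurance-type runner: k = {k_endurance:.3f}")
print(f"speed-type runner:     k = {k_speed:.3f}")
```

Under these assumptions the endurance-type runner comes out near \(k = 1.04\) and the speed-type runner near \(k = 1.09\). The 1.06 commonly cited as Riegel's constant sits between the two, so a single value will be too pessimistic for one runner and too optimistic for the other.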

Still, Vickers and Vertosick were determined to try to improve on the Riegel formula, and their survey asked runners to rate themselves on a 10-point scale from “endurance runner” to “speed demon” to account for some of these differences in abilities. Using the survey data, the researchers created their new formula by randomly splitting the results into several groups. They used one group to develop a new formula, then they tested it on another group to validate the new equation. The Riegel formula relies solely on previous race times, but Vickers wanted to look for other factors that might better predict finishing times.
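The develop-then-validate approach can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the runners are synthetic (not the survey responses), the "true" exponent of 1.10 is invented, there are only two groups, and fitting a single exponent is a stand-in for whatever model the researchers actually built.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

HALF_KM, FULL_KM = 21.0975, 42.195
DIST_RATIO = FULL_KM / HALF_KM


def make_runner():
    """Synthetic runner: half-marathon and marathon times in seconds."""
    half_s = random.uniform(80, 150) * 60
    k_true = random.gauss(1.10, 0.03)  # invented exponent with runner-to-runner noise
    return half_s, half_s * DIST_RATIO ** k_true


runners = [make_runner() for _ in range(1000)]

# Randomly split the sample into a development group and a validation group.
random.shuffle(runners)
dev, val = runners[:500], runners[500:]


def fit_exponent(group):
    """Least-squares fit of k in log(full / half) = k * log(distance ratio)."""
    mean_log_ratio = statistics.mean(math.log(full / half) for half, full in group)
    return mean_log_ratio / math.log(DIST_RATIO)


def median_abs_error_min(group, k):
    """Median absolute marathon prediction error, in minutes, for exponent k."""
    errors = [abs(half * DIST_RATIO ** k - full) / 60 for half, full in group]
    return statistics.median(errors)


k_fit = fit_exponent(dev)  # developed on one group only
print(f"exponent fitted on development group: {k_fit:.3f}")
print(f"validation error, fitted exponent: {median_abs_error_min(val, k_fit):.1f} min")
print(f"validation error, k = 1.06:        {median_abs_error_min(val, 1.06):.1f} min")
```

The point of the split is that the validation runners played no part in choosing the exponent, so the error measured on them is an honest estimate of how the formula would do on new runners.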

It turned out that several factors were correlated with quicker finishes. “People who run more miles have faster times, and people who ran intervals and tempo runs had faster times,” Vickers said. Runners who incorporated interval workouts into their training ran about 3 percent faster than those who didn’t. “We found that intervals helped about the same amount, no matter what the length of the race, and the same was true for mileage,” Vickers said. Tempo runs, on the other hand, corresponded to faster times for short races more so than for long ones.

In the survey sample, women were about 20 percent slower than men at the 5K, but the difference dropped to 10 percent for the marathon. This finding contrasts with results from elite runners, where the range of differences in world-best performances between men and women is much smaller, between 10 percent and 12.5 percent across all distances. That discrepancy doesn’t surprise sports scientist Ross Tucker. He said that research in South Africa has shown that if a man and a woman have similar times over one distance, the woman will usually be faster than the man at a longer race and slower than the man at a shorter one. “So if you and I are matched at 10K,” Tucker told me, “then you’ll likely be faster than me at the 21K and marathon, but I’ll most likely be better than you at 5K.”

After analyzing the relationships between these factors, Vickers and Vertosick found that two factors were the best predictors of final race times: average weekly training mileage and previous race times. Their new formula uses these two inputs to calculate a predicted time.

You can try it out here and let us know how it does.