Success in Formula 1 is as much about consistency as outright speed. It is easy to point to historical examples of drivers being outscored by their teammates across a year despite being the much stronger driver over a single lap. In 1984, Prost outqualified his champion teammate Lauda 15-1, but was narrowly beaten by Lauda to the title. He took this lesson to heart and reformed his driving style, focusing on race pace and consistently bringing home points. Things came full circle when Senna outqualified Prost 28-4 from 1988-1989, but Prost outscored Senna 186-163 (and 154-150 in points that counted towards the championship). The difference was partly down to bad luck, but also down to Prost’s more pragmatic approach. A contemporary example is Button vs. Hamilton at McLaren from 2010-2012. While Hamilton dominated qualifying 44-14, he was ultimately outscored 672-657 by Button, through a combination of bad luck and inconsistency.

Drivers who keep out of trouble may not gain the same attention as flashier drivers like Gilles Villeneuve, Ayrton Senna, and Ronnie Peterson, but they understand the demands of the sport. Points are not won in qualifying and, since 1960, they are not won for setting fastest laps either.

So who are the cleanest drivers in the sport’s history, and who are most likely to end the race in a wall? I’m sure some names immediately spring to mind.

To answer this, I compiled some data on the likelihood of drivers retiring due to driver-related DNFs. I defined these as races in which a driver had any of the following happen:

DNF due to a crash or collision.

Disqualification from the race due to driver conduct (e.g., black-flagged for actions on track).

Running out of fuel within the last 5 laps of a race — I included this because fuel management has been an important skill at several points in the history of the sport.

Voluntarily withdrawing during a race without any mechanical or other problems.

Non-classified finishes (i.e., failing to complete a satisfactory fraction of the total race distance) that were not attributed to mechanical problems.

The vast majority of driver-related DNFs were DNFs due to crashes or collisions. I didn’t attempt to attribute fault when it came to collisions (since this can become subjective), nor did I include data on races where the driver crashed or went off track but was able to finish the race.

I ran these statistics for all drivers who have scored at least 3 wins and all drivers who are currently active. I also picked two special examples for reference:

Ukyo Katayama, who has the greatest number of driver-related DNFs per start of any driver with 50 or more starts.

Andrea de Cesaris, who has a reputation as one of the most crash-prone drivers in the sport’s history.

For each driver, I calculated the average number of starts per driver-related DNF. The results are shown in the graph below.

Four of the current drivers (Chilton, Ericsson, Kvyat, and Magnussen) have not yet had a single driver-related DNF, so they do not appear on the chart.

The crash champion is Ukyo Katayama, with one crash every 3.2 starts. Famous for his incredible start-line crash at Estoril in 1995, he puts even Andrea de Cesaris to shame. Among the world champions, “Hunt the Shunt” is a clear leader, with one driver-related DNF every 4.6 starts, although Damon Hill is not far behind with one driver-related DNF every 6.1 starts.

At the other end of the spectrum, Juan Manuel Fangio emerges as one of the safest drivers in history, crashing out of a race only once in his 50 starts. In fact, he may not have even been to blame for that crash (at Spa 1953), as some sources attribute the crash to a steering problem. In an era where a single crash could easily be fatal, Fangio usually drove within his limits. The same could be said of Clark, Gurney, Stewart, McLaren and Hulme. Clark and McLaren nevertheless lost their lives to crashes caused by mechanical failures.

A few stereotypes are also put to rest. Senna and Schumacher were aggressive drivers, but they were not especially crash-prone, despite their reputations among many newer fans. Mansell, Piquet, Lauda, and Hakkinen were all more likely to end races at their own hands. Farina, who was considered a dangerous driver who often put other drivers out of races, had fewer driver-related DNFs than Ascari, Moss, Hawthorn, Brabham, Collins, and Brooks. Among the modern drivers, Maldonado and Grosjean are quite crash-prone, but Sutil has them both beaten, with one driver-related DNF every 5.2 starts.

Hamilton is the most crash-prone of the current world champions, with one driver-related DNF every 11.0 starts. However, he isn’t as far behind Button as one might expect, and he has a similar crash rate to Prost. Alonso stands out as the safest of the current world champions, with just one driver-related DNF every 19.9 starts.

Bianchi and Bottas rank ahead of Alonso, but there is significant uncertainty in their crash rates, due to their small number of starts. To give some more robust estimates, we can see how the current drivers would rank if they were to each have 2 driver-related DNFs in the 16 remaining races this season.
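As a rough sketch of that projection, the snippet below adds an assumed number of future races and driver-related DNFs to a driver's record and recomputes starts per driver-related DNF. The function name and the two example records are made up for illustration; they are not taken from the data set behind the graphs.

```python
def projected_starts_per_dnf(starts, driver_dnfs, extra_races=16, extra_dnfs=2):
    """Starts per driver-related DNF after adding assumed future races and DNFs."""
    return (starts + extra_races) / (driver_dnfs + extra_dnfs)


# Hypothetical records as (starts, driver-related DNFs), for illustration only.
records = {
    "short career": (20, 0),
    "long career": (216, 11),
}

for label, (starts, dnfs) in records.items():
    raw = starts / dnfs if dnfs else float("inf")
    projected = projected_starts_per_dnf(starts, dnfs)
    print(f"{label}: raw {raw:.1f}, projected {projected:.1f}")
```

The effect is that a driver with few starts is pulled strongly towards the assumed rate of one driver-related DNF per 8 races, while a long career barely moves.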

A note regarding uncertainty

As fans of a sport, we often think of results in an absolute sense. There is no question who won the 2008 season or how many times Schumacher crashed in his career. These are just facts. The graphs of starts per driver-related DNF are based on these facts.

However, from a statistical perspective, it makes sense to think about uncertainty in these measurements. Michael Schumacher had 30 driver-related DNFs in his 288 starts, giving him one driver-related DNF every 9.6 starts. By comparison, Mika Hakkinen had one driver-related DNF every 9.5 starts. Is that difference meaningful, or might it just be down to random chance (variation in the data sample)? If we could somehow rerun history many times, how often would we expect Schumacher to have a higher crash rate than Hakkinen?

To estimate the uncertainty in our measurements, we need to make some assumptions about the underlying statistical distribution from which these samples are drawn. A not unreasonable assumption is that driver-related DNFs have a Poisson distribution, with the mean of that distribution varying from driver to driver. Using the Poisson distribution, we can exactly calculate our degree of certainty in the mean rate. For a 95% confidence interval (i.e., we are 95% sure that the “true” mean lies within this interval), Schumacher has between 6.7 and 14.2 starts per driver-related DNF, while Hakkinen has between 5.9 and 16.3 starts per driver-related DNF.
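For readers who want to reproduce an interval like this, here is a minimal sketch. It assumes the total number of driver-related DNFs is a single Poisson count and uses the standard exact construction: find the means at which the observed count sits in each 2.5% tail. This may not be exactly the calculation used for the figures above, and the function names are my own. It uses only the Python standard library.

```python
import math


def poisson_cdf(k, mu):
    """P(X <= k) for X ~ Poisson(mu), summed term by term in log space."""
    if k < 0:
        return 0.0
    return sum(math.exp(i * math.log(mu) - mu - math.lgamma(i + 1))
               for i in range(k + 1))


def bisect_root(f, lo, hi, iterations=200):
    """Root of a monotonic function f on [lo, hi], given a sign change."""
    f_lo = f(lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def exact_poisson_interval(count, confidence=0.95):
    """Exact interval for a Poisson mean, given one observed count."""
    tail = (1.0 - confidence) / 2.0
    floor = 1e-12  # keeps log(mu) defined
    wide = count + 10.0 * math.sqrt(count + 1.0) + 10.0  # well above the upper bound
    if count == 0:
        lower = 0.0
    else:
        # Mean at which seeing `count` or more has probability `tail`.
        lower = bisect_root(
            lambda mu: 1.0 - poisson_cdf(count - 1, mu) - tail, floor, count)
    # Mean at which seeing `count` or fewer has probability `tail`.
    upper = bisect_root(
        lambda mu: poisson_cdf(count, mu) - tail, max(count, floor), wide)
    return lower, upper


def starts_per_dnf_interval(starts, dnfs, confidence=0.95):
    """Convert the interval on the DNF count into starts per DNF."""
    lower, upper = exact_poisson_interval(dnfs, confidence)
    fewest = starts / upper
    most = starts / lower if lower > 0 else math.inf
    return fewest, most


# Schumacher's record as quoted above: 30 driver-related DNFs in 288 starts.
low, high = starts_per_dnf_interval(288, 30)
print(f"{low:.1f} to {high:.1f} starts per driver-related DNF")
```

With Schumacher's numbers this should land close to the 6.7 to 14.2 quoted above. Note that the bounds swap when converting: the upper bound on the DNF count gives the lower bound on starts per DNF.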

In other words, it is very difficult to be certain — in a statistical sense — whether one driver was objectively more crash-prone than another in most cases, even for drivers with relatively long careers. We can be almost certain that Alonso (with a 95% confidence interval of 11.1 to 39.9) is a safer driver than Sutil (with a 95% confidence interval of 3.6 to 9.2), but we can’t generally say much more than that based on these data alone!

Adjusting for other DNFs

One factor that could skew the statistics is car reliability. A driver with an unreliable car might break down before they have the chance to crash. This is particularly important given the trend towards much greater reliability over the past decade. Andrea de Cesaris had 103 non-driver DNFs in his 208 starts, which may have significantly reduced his number of driver-related DNFs.

It’s impossible to know for sure how many driver-related DNFs were prevented by non-driver DNFs (i.e., all other types of DNFs), but we can make a quick and dirty approximation. Let’s assume that driver-related DNFs and non-driver DNFs are independent events, with respective probabilities of Pd and Pf. In any given race, there is the chance of neither of these events occurring, one of these events occurring, or both of these events occurring. In the event that both would have occurred, the actual DNF would be due to whichever event occurred first.

We’ll now make the slightly naughty assumption that driver-related DNFs and non-driver DNFs occur with the same distribution with respect to laps into the race. This is a bit naughty because accidents are probably skewed towards the beginning of the race (when cars are running closer together) and mechanical failures are probably slightly skewed towards the end of the race (after the car has been running for a long time). Nevertheless, if we use this assumption for a first approximation, it means that in races where both a driver-related DNF and a non-driver DNF were going to occur, each type of DNF has a 50% chance of occurring first.

In this case, the total probability of a driver-related DNF is

P(driver failure and no non-driver failure) + 0.5P(driver failure and non-driver failure) = Pd(1-Pf) + 0.5PdPf,

and the total probability of a non-driver DNF is

P(non-driver failure and no driver failure) + 0.5P(non-driver failure and driver failure) = Pf(1-Pd) + 0.5PdPf.

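These equations can be solved for Pd and Pf for each driver, using the observed fractions of starts that ended in each type of DNF. We can then estimate an expected number of driver-related DNFs for each driver, if they had never suffered any non-driver DNFs. This is just NPd, where N is the driver’s number of starts.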
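One way to carry out that solution is sketched below. Write D and F for the observed fractions of starts ending in driver-related and non-driver DNFs, so that D = Pd(1 - 0.5Pf) and F = Pf(1 - 0.5Pd). Subtracting gives Pd - Pf = D - F, and substituting back leaves a quadratic in Pd. The function name and the example record are hypothetical, chosen only to show the size of the adjustment.

```python
import math


def solve_underlying_probabilities(starts, driver_dnfs, other_dnfs):
    """Invert D = Pd(1 - Pf/2) and F = Pf(1 - Pd/2) for Pd and Pf."""
    d = driver_dnfs / starts  # observed driver-related DNF fraction
    f = other_dnfs / starts   # observed non-driver DNF fraction
    # Substituting Pf = Pd - (d - f) into the first equation gives
    # Pd^2 - (2 + d - f) * Pd + 2d = 0.
    b = 2.0 + d - f
    discriminant = max(b * b - 8.0 * d, 0.0)  # guard against rounding below zero
    # Take the smaller root: it goes to 0 when there are no driver-related DNFs.
    p_driver = (b - math.sqrt(discriminant)) / 2.0
    p_other = p_driver - (d - f)
    return p_driver, p_other


# Hypothetical record, for illustration only: 100 starts,
# 10 driver-related DNFs and 40 non-driver DNFs.
starts = 100
p_driver, p_other = solve_underlying_probabilities(starts, 10, 40)

adjusted_dnfs = starts * p_driver  # expected driver DNFs with a perfectly reliable car
print(f"Pd = {p_driver:.3f}, Pf = {p_other:.3f}")
print(f"raw starts per driver-related DNF: {starts / 10:.1f}")
print(f"adjusted starts per driver-related DNF: {starts / adjusted_dnfs:.1f}")
```

In this made-up case the adjustment raises the driver-related DNF probability from 0.10 to about 0.13, because an unreliable car hides some of the crashes that would otherwise have happened.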

I used this adjusted estimate to recompute the number of starts per driver-related DNF for each driver, as shown below.

The overall ordering of the drivers is largely preserved by this adjustment. At the lower end, Hunt is still just below de Cesaris (a driver he held in very low esteem), but things are improved somewhat for Sutil.

For drivers who have relatively infrequent driver-related DNFs, the differences may seem unimportant — especially given the uncertainty discussed above — but remember that a single extra DNF can easily decide a championship. For example, Webber or Hamilton could have won the title in 2010, if not for untimely crashes. Alonso and Vettel have each averaged about one driver-related DNF per season, whereas Hamilton typically has two, and Raikkonen falls somewhere in the middle.

Championship years and yearly fluctuations

For a given driver, there can be significant fluctuations in the number of driver-related DNFs from year to year. For example, Nigel Mansell had 1 driver-related DNF in 1987, but then 5 driver-related DNFs in 1988. This is not unexpected, given we are dealing with statistics of small numbers. A plausible model for the frequency of driver-related DNFs is the Poisson distribution, for which the coefficient of variation (i.e., the standard deviation divided by the mean) is inversely proportional to the square root of the mean.
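To get a feel for how large these swings can be, the sketch below computes the Poisson coefficient of variation and the chance of an unusually quiet or unusually bad season. The mean of 3 driver-related DNFs per season is a hypothetical value chosen for illustration, not a figure for Mansell or any other driver.

```python
import math


def poisson_pmf(k, mean):
    """P(X = k) for X ~ Poisson(mean)."""
    return math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))


def coefficient_of_variation(mean):
    """Standard deviation divided by mean; for a Poisson this is 1/sqrt(mean)."""
    return math.sqrt(mean) / mean


season_mean = 3.0  # hypothetical average driver-related DNFs per season

p_at_most_one = sum(poisson_pmf(k, season_mean) for k in range(2))
p_at_least_five = 1.0 - sum(poisson_pmf(k, season_mean) for k in range(5))

print(f"coefficient of variation: {coefficient_of_variation(season_mean):.2f}")
print(f"P(1 or fewer in a season): {p_at_most_one:.2f}")
print(f"P(5 or more in a season): {p_at_least_five:.2f}")
```

Under this assumed mean, both a season with at most one driver-related DNF and a season with five or more come up roughly one year in five, so a jump from 1 to 5 in consecutive years needs no change in the driver's underlying rate.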

We might expect crash frequency to vary across a driver’s career in some cases, due to changes in experience, ability, or driving style. For example, Felipe Massa had one driver-related DNF every 5.0 starts in his first two years in the sport, but since 2005 has had one driver-related DNF every 17.7 starts. Meanwhile, Michael Schumacher had one driver-related DNF every 6.5 starts from 1991-1995, but then settled down to one driver-related DNF every 12.2 starts from 1996-2006. On his return from 2010-2012, he increased to a rate of one driver-related DNF every 9.7 starts.

One interesting observation from the data is that in 50 of the 64 seasons of Formula 1, the champion has had fewer driver-related DNFs than their personal career average. This is likely due to an interplay of several factors that are difficult to disentangle. First, a driver is naturally more likely to win the title in a year in which they lose fewer points through crashes. Second, fewer crashes may be a sign of stronger form, which will also improve the likelihood of a driver winning the title. Third, a driver who is often running near the front may have less opportunity to tangle with other drivers.

Traditional metrics for driver success have focused on the peaks of achievement, such as the greatest number of wins or poles. Equally important, I would argue, is minimizing the number of poor performances.