Because of the injuries Tito Rabat suffered in a nasty crash at Silverstone, Xavier Simeon moved to Rabat’s garage, leaving Simeon’s seat in the Avintia team vacant for the Misano race.

Christophe Ponsson, who currently races in the Spanish Superstock 1000 series, landed the seat despite having zero experience on a MotoGP bike or anything similar. Ponsson was so unfamiliar with a MotoGP prototype that he felt that learning to ride his Ducati GP16 was “as if I start again riding on a bike”.

Ponsson slipped into his leathers and ended FP1 with a best lap of 1’40.038, 7.4 seconds slower than the fastest lap of the session (and outside the 107% rule), and 4.0 seconds behind the rider in front of him.

In FP2, he improved his lap time by 2 seconds and ended 6.0 seconds behind the session’s best lap, this time inside the 107% rule. He kept improving and on Saturday he qualified with a lap of 1’37.180, 4.8 seconds slower than Lorenzo’s pole position lap. On Sunday, Ponsson finished the race dead last with a fastest lap of 1’37.375, 4.7 seconds slower than the race fastest lap.
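The 107% checks mentioned above can be reproduced with a few lines of arithmetic. This is only a sketch: it assumes the rule means "best lap no slower than 107% of the session's fastest lap", and the fastest-lap values are reconstructed from the rounded gaps quoted in the text, so they are approximations rather than official timing data. The function name `within_107_percent` is ours.

```python
def within_107_percent(lap_s: float, fastest_s: float) -> bool:
    """Return True if lap_s is no slower than 107% of the fastest lap."""
    return lap_s <= 1.07 * fastest_s

# FP1: best lap 1'40.038, about 7.4 s off the fastest lap of the session.
fp1_lap = 100.038
fp1_fastest = fp1_lap - 7.4  # approximate, derived from the rounded gap

# FP2: about 2 s quicker than FP1, about 6.0 s off the fastest lap.
fp2_lap = fp1_lap - 2.0      # approximate
fp2_fastest = fp2_lap - 6.0  # approximate

fp1_ok = within_107_percent(fp1_lap, fp1_fastest)
fp2_ok = within_107_percent(fp2_lap, fp2_fastest)
print("FP1 inside 107%:", fp1_ok)
print("FP2 inside 107%:", fp2_ok)
```

With these approximate numbers the FP1 cut-off sits at roughly 99.1 seconds and the FP2 cut-off at roughly 98.5 seconds, which agrees with the text: outside in FP1, inside in FP2.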

So Ponsson’s speed increased over the weekend: he started 7.4 seconds off the pace (a bit more if you compare his very first laps) and ended 4.7 seconds off the fastest pace.

Best lap times are a good measure of a rider’s performance over a whole session; however, they do not capture the evolution during the session, lap by lap. Did Ponsson become faster with each lap of each session? How much did he improve per lap during the race, considering that he rode by himself for all 26 laps?

If we want to look at the pace improvement during a session we need to take into account the lap times for all laps of the session.

Ponsson and Dovizioso lap times during the race:

In the previous plot we can see the lap times of Ponsson and Dovizioso (for reference). Both riders had a slow first lap, starting from the grid, and then settled into a fairly constant pace: Dovizioso close to 93 seconds (1’33”) and Ponsson around 98 seconds (1’38”). In addition, Dovizioso lost around 2 seconds on his last lap, celebrating his victory, and Ponsson slowed down considerably on the laps in which he was being lapped.

But if we look closely, we can see that Ponsson in fact improved a bit during the race, starting with low 1’38” and dropping to high 1’37” towards the end of the race.

But how can we measure that subtle improvement between laps?

One approach is to fit a relatively simple model to the data, one that captures the pace at each lap and how it changes lap by lap. The simplest such model is a linear regression, where we assume a constant pace improvement per lap. But we need to make sure that the slow laps (start of the race, blue flags, or celebration laps) are not used to compute the pace. For that we need a robust regression.

Thus the motogpdata elders decided to use a RANSAC regressor (sklearn) to robustly fit a straight line to the lap times. The robust bit is important, because it means that the algorithm decides for us which laps should be used to compute the pace and which should not. RANSAC does this by repeatedly picking a small random subset of the data points, fitting the model to it, and classifying every point as an inlier (close enough to the fitted model) or an outlier (too far away from it). After a number of iterations, the model supported by the most inliers is selected.
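To make the idea concrete, here is a minimal sketch of a robust straight-line fit with sklearn's `RANSACRegressor`. The lap times are synthetic (generated in the code, not the real Misano data), and the `residual_threshold` of half a second is our own choice for the example, not necessarily the setting used for the plots in this post.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, RANSACRegressor

# Synthetic race: 26 laps improving by 0.02 s per lap, plus small noise.
# These are made-up numbers for illustration only.
rng = np.random.default_rng(0)
laps = np.arange(1, 27)
times = 98.2 - 0.02 * laps + rng.normal(0, 0.05, size=laps.size)
times[0] += 6.0   # slow first lap from the grid
times[14] += 4.0  # slowed down while being lapped
times[15] += 3.0

X = laps.reshape(-1, 1)  # sklearn expects a 2D feature array

# residual_threshold (in seconds) is an assumption made for this sketch.
ransac = RANSACRegressor(LinearRegression(), residual_threshold=0.5, random_state=0)
ransac.fit(X, times)

slope = ransac.estimator_.coef_[0]           # seconds gained or lost per lap
intercept = ransac.estimator_.intercept_     # estimated pace at "lap 0"
outlier_laps = laps[~ransac.inlier_mask_]    # laps ignored by the fit

print(f"pace change per lap: {slope:+.3f} s")
print("laps treated as outliers:", outlier_laps)
```

A negative slope means the rider is getting faster. The three artificially slow laps sit several seconds above the rest, so they fall outside the threshold and do not influence the fitted line.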

So we took the data from the MotoGP statistics website and put together a small spreadsheet with Ponsson’s lap times and the lap times of Dovizioso for comparison, as Dovizioso was the winner of the race and fast all weekend.

We then used the magic of Python and pandas to load that data in a one-liner, parse and clean it a bit, and used the lap times to perform the linear regression using RANSAC.
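The loading and parsing step could look roughly like the sketch below. The column names and the inline CSV are assumptions made for the example (the real spreadsheet layout is not shown here); the three lap times are the ones quoted earlier in the text. The helper `lap_time_to_seconds` is ours.

```python
import io
import re

import pandas as pd

# Hypothetical layout: one row per lap time, written as M'SS.mmm.
csv_text = """rider,session,time
Ponsson,FP1,1'40.038
Ponsson,Qualification,1'37.180
Ponsson,Race,1'37.375
"""

def lap_time_to_seconds(text: str) -> float:
    """Convert a lap time such as 1'37.180 into seconds (97.180)."""
    # Accept a straight apostrophe, a curly one, or a prime as the separator.
    match = re.fullmatch(r"(\d+)['’′](\d+(?:\.\d+)?)", text.strip())
    if match is None:
        raise ValueError(f"unrecognised lap time: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)

df = pd.read_csv(io.StringIO(csv_text))  # the one-liner load
df["seconds"] = df["time"].map(lap_time_to_seconds)
print(df)
```

Working in plain seconds makes the regression straightforward, since every lap time becomes a single float.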

Ponsson and Dovizioso pace during the race:

The greyed out points are lap times considered outliers by the RANSAC algorithm, and thus have not been used to compute the pace improvement.

The line for each rider represents the estimated pace during the session. The nearly horizontal lines indicate that the race pace did not change much during the race. Ponsson improved 0.018s per lap, compared with a 0.005s improvement per lap for Dovizioso. Ponsson started with a pace of 1’38.2 and ended 0.4s faster with a pace of 1’37.8, as we had inferred from the previous plot. Dovizioso started with a pace of 1’33.3, and ended 0.1s faster, at 1’33.2. Ponsson ended the session 4.6s slower than Dovizioso.

We then repeated the analysis with the 6 other timed sessions of the weekend: FP1, FP2, FP3, FP4, Qualification, and Warm Up.

Ponsson and Dovizioso pace improvement per session:

Ponsson did improve during each session, averaging an improvement of 0.2 seconds per lap during the bulk of practice (FP1, FP2, and FP4). However, between his pace at the end of FP4 (1’37.9) and at the end of the race on Sunday (1’37.8), he improved by only one tenth of a second over 45 laps.

With complete disregard for the fact that each session is different (sometimes a rider is chasing a quick lap, at other times trying the wrong kind of tyres), we can crudely concatenate all laps from all sessions in chronological order and analyse how Ponsson’s pace improved over the entire weekend.
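One way to do that concatenation with pandas is sketched below. The session order follows the weekend schedule described above; the column names and the handful of rows are placeholders invented for the example, not real lap times.

```python
import pandas as pd

# Chronological order of the timed sessions of the weekend.
SESSION_ORDER = ["FP1", "FP2", "FP3", "FP4", "Qualification", "Warm Up", "Race"]

# Placeholder rows (made-up values), deliberately out of order.
laps = pd.DataFrame({
    "session": ["Race", "FP1", "FP2", "FP1", "Race"],
    "lap": [1, 2, 1, 1, 2],
    "seconds": [104.0, 100.5, 99.0, 103.0, 98.2],
})

# An ordered categorical makes sort_values follow the schedule, not the alphabet.
laps["session"] = pd.Categorical(laps["session"], categories=SESSION_ORDER, ordered=True)
weekend = laps.sort_values(["session", "lap"]).reset_index(drop=True)

# A single running lap counter across the whole event, used as the x axis.
weekend["weekend_lap"] = weekend.index + 1
print(weekend)
```

The `weekend_lap` column then plays the same role that the lap number played in the single-session fits.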

Ponsson and Dovizioso pace evolution during the whole event:

During the 109 laps analysed of Ponsson’s weekend, he averaged an improvement of 0.042 seconds per lap. In comparison, Dovizioso had an improvement roughly a third as big, at 0.015 seconds per lap.

However, our linear model’s assumption that pace progressed uniformly during the weekend doesn’t seem to match the progression of Ponsson’s lap times. In fact Ponsson improved a lot from FP1 to FP4, but not much during qualification, warm up, and race.

In order to capture that change in pace improvement we tried to fit a polynomial function to the data points, using the same RANSAC estimator and the magic of sklearn pipelines.
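A polynomial version of the robust fit can be sketched by chaining `PolynomialFeatures` and `RANSACRegressor` in a pipeline. As before, the data is synthetic (a curve that improves quickly and then flattens, with a few slow laps added), and the threshold and the rescaling of the lap counter to the range 0 to 1 are our own choices for the example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, RANSACRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic weekend of 109 laps: big gains early, small gains late.
rng = np.random.default_rng(0)
laps = np.arange(1, 110)
x = (laps - 1) / (laps.max() - 1)          # rescale to 0..1 for conditioning
times = 97.8 + 5.5 * (1 - x) ** 3 + rng.normal(0, 0.1, size=laps.size)
times[5::12] += 5.0                        # a few made-up slow laps

X = x.reshape(-1, 1)

# Expand x into [x, x^2, x^3], then fit those features robustly.
model = make_pipeline(
    PolynomialFeatures(degree=3, include_bias=False),
    RANSACRegressor(LinearRegression(), residual_threshold=0.5, random_state=0),
)
model.fit(X, times)

pace = model.predict(X)                    # estimated pace at every lap
inliers = model[-1].inlier_mask_           # laps used by the final fit

print(f"estimated pace, first lap: {pace[0]:.1f} s")
print(f"estimated pace, last lap:  {pace[-1]:.1f} s")
print("laps treated as outliers:", laps[~inliers])
```

Because the polynomial expansion happens inside the pipeline, the RANSAC step still sees an ordinary linear regression problem, just with three features instead of one.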

Ponsson and Dovizioso pace evolution during the whole event (3rd order polynomial):

In this case, we can see that the model better captures the progression of Ponsson’s pace, with a pronounced improvement per lap between FP1 and FP4 and a much smaller improvement towards the end.

So was Ponsson’s improvement good enough? Probably not.

While he ended the weekend 5.5 seconds faster than he started (see last plot), he was still 4.4 seconds per lap slower than the race winner. In addition, as the weekend progressed, his lap time improvements became smaller and smaller. He started the race with a pace of 1’38.2 and ended with a pace of 1’37.8. During the 27 laps that he rode alone during the race, his improvement per lap (0.02s) was only a tenth of what he had achieved during the free practice sessions (0.20s per lap). This suggests that even if Ponsson had been given more time, he would have improved only marginally and remained far behind every other rider.

That was in fact the general opinion around the paddock. Several riders argued that it was not safe to ride with a rider 4 seconds slower a lap, and the selection committee of Dorna, FIM, and IRTA refused Ponsson as a substitute rider, forcing Avintia to find a new rider. The Ducati GP16 was raced by Jordi Torres in Aragon.

[Code]

A note of caution when fitting higher order functions:

When deciding what function to fit to the data we need to be careful: fitting overly complex models to poor data (a small number of samples, too many outliers, etc.) can lead to overfitting. See, for example, what happens when we use a third order polynomial on each session’s data separately:

Ponsson and Dovizioso pace evolution during each session (3rd order polynomial):

In this case a cubic fit has too many parameters to estimate (4) for the data available (as few as 7 laps during qualification), and thus it is too flexible and unstable, especially for sessions with few laps.

For instance, in the last plot we can see that during qualifying (7 laps) the model says that Ponsson’s pace started at 1’34.6 and ended at 1’39.4, which makes little sense. In this case, the pace calculated using linear regression seemed to better capture the real pace.
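The parameter-count problem can be seen on a toy example. The sketch below uses plain least squares through `numpy.polyfit` rather than RANSAC, and seven made-up lap times, so it illustrates the general effect and does not reproduce the qualifying plot.

```python
import numpy as np

# Seven made-up lap times (seconds) for a short session.
laps = np.arange(1, 8)
times = np.array([98.9, 97.6, 97.3, 99.8, 97.4, 97.2, 98.5])

def fit_and_score(degree: int):
    """Least-squares polynomial fit; returns coefficients, fitted values, RSS."""
    coeffs = np.polyfit(laps, times, degree)
    fitted = np.polyval(coeffs, laps)
    rss = float(np.sum((times - fitted) ** 2))  # residual sum of squares
    return coeffs, fitted, rss

lin_coeffs, lin_fit, lin_rss = fit_and_score(1)
cub_coeffs, cub_fit, cub_rss = fit_and_score(3)

print(f"linear: {len(lin_coeffs)} parameters, RSS = {lin_rss:.2f}")
print(f"cubic:  {len(cub_coeffs)} parameters, RSS = {cub_rss:.2f}")
print(f"linear pace, first to last lap: {lin_fit[0]:.1f} -> {lin_fit[-1]:.1f}")
print(f"cubic pace, first to last lap:  {cub_fit[0]:.1f} -> {cub_fit[-1]:.1f}")
```

The cubic always matches the laps it was fitted to at least as closely as the straight line, because the line is a special case of the cubic. With only 7 laps and 4 parameters, that closer match says little about the rider's underlying pace.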

Also, while we could use the fitted functions to extrapolate when Ponsson’s pace would be similar to Dovizioso’s, one single race weekend is not enough data and we can be led to wrong conclusions, as very well illustrated in xkcd: