When the playoffs come around in January, the big question will be which team eventually wins the Lombardi Trophy. From a numbers standpoint, defense, while undoubtedly important in the playoffs, is hard to predict, so we will focus on predicting offensive play, mainly from the quarterback and his surrounding passing offense.

When predicting the playoff performance of quarterbacks and passing offenses, we are in danger of falling for one of two extremes. After the 2016 regular season, the New York Giants found themselves in the playoffs with the 27th-ranked passing offense (as measured by EPA per pass play), and if you had hoped the Giants would make a strong playoff push based on a “Playoff Eli” myth stemming from five years earlier, you were left disappointed. One year later, the Minnesota Vikings were three-point favorites on the road in Philadelphia based on an excellent season from Case Keenum. We all know how this ended, and based on career performances, we could have anticipated that there was not much of a difference between Nick Foles and Keenum.

While a prior based on a playoff sample from multiple years ago surely holds no predictive power, a prior based on a quarterback’s whole career does, and accounting for it can help us make better predictions, as the examples above illustrate. The goal of this article is to investigate how to balance the most recent performance (that is, the performance during the 2019 regular season, or only during the last eight games of that season) against the priors we have about a quarterback based on his career performance.

Career statistics matter

We start with a primer on what could become important next August: going into a season, what is more important for predicting performance, a quarterback’s last season or his whole career? To measure performance, we use the PFF passing grade and expected points added (EPA) per pass play, but first we have to establish what we mean by career statistics. We want to account for sample size, since the career of someone like Matt Ryan should give us more confidence in his evaluation than the career of someone like Deshaun Watson.

For that purpose, we use Bayesian inference to update our beliefs about NFL quarterbacks, as introduced by our own Kevin Cole. Instead of simply taking the career average, we use the career performance to update our prior beliefs to a posterior mean, with the sample size playing a large role. For example, Patrick Mahomes has generated 0.28 EPA per pass play with the Chiefs’ offense, but because his sample is relatively small compared to other quarterbacks, our posterior mean for him and the Chiefs is “only” 0.23 EPA per pass play, which is still the highest posterior mean of all active quarterbacks. On the flip side, Drew Brees’ career average (0.185 EPA per pass play) and his posterior mean (0.181 EPA per pass play) are almost identical, as his large sample size leads us to believe that his observed performance basically equals his true performance.
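To show how sample size drives this kind of update, here is a minimal sketch that assumes a simple normal-normal model with known play-level variance. It is not the model behind the numbers above: the function name, the prior mean of 0.0 and the pseudo-count of 1,000 dropbacks are hypothetical values chosen only for illustration.

```python
def posterior_mean(sample_mean, n_dropbacks, prior_mean, pseudo_count):
    """Shrink an observed per-play average toward a prior mean.

    Normal-normal conjugate update with known variance. `pseudo_count`
    is (play-level variance / prior variance), i.e. how many dropbacks
    of evidence the prior is treated as being worth. (Assumed model.)
    """
    weight = n_dropbacks / (n_dropbacks + pseudo_count)
    return weight * sample_mean + (1 - weight) * prior_mean


# Hypothetical inputs: the same observed average, two different sample sizes.
PRIOR_MEAN = 0.0       # assumed league-wide prior, not a real estimate
PSEUDO_COUNT = 1000    # assumed strength of the prior, in dropbacks

small_sample = posterior_mean(0.28, 1000, PRIOR_MEAN, PSEUDO_COUNT)
large_sample = posterior_mean(0.28, 9000, PRIOR_MEAN, PSEUDO_COUNT)

print(f"small sample: {small_sample:.3f}")
print(f"large sample: {large_sample:.3f}")
```

With the same observed average, the larger sample ends up much closer to the observed value, which is the pattern described above: a short career is pulled noticeably toward the prior, while a long one barely moves.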

We’ve found that going back more than 3,000 dropbacks (roughly five seasons for a full-time starter) no longer increases predictive power, which is why a quarterback’s career stats in this article refer to his last 3,000 dropbacks. We’ve also found that the Bayesian method yields higher predictive power than simply taking career averages (for both EPA per pass play and the PFF passing grade, the R-squared value increases by more than 0.02 when using the Bayesian posterior mean), which justifies our approach.
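The two ingredients of that comparison, a rolling career window and an R-squared between a predictor and the following season’s outcome, can be sketched as follows. This is only an illustration under stated assumptions: both helper names are hypothetical, the input is assumed to be a chronologically ordered list of per-dropback EPA values, and R-squared is taken as the squared Pearson correlation.

```python
def career_window(epa_by_dropback, max_dropbacks=3000):
    """Return the most recent `max_dropbacks` values.

    Assumes the list is ordered oldest to newest; shorter careers
    are returned unchanged.
    """
    if max_dropbacks <= 0:
        return []
    return epa_by_dropback[-max_dropbacks:]


def r_squared(predictor, outcome):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(predictor)
    mean_x = sum(predictor) / n
    mean_y = sum(outcome) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(predictor, outcome))
    var_x = sum((x - mean_x) ** 2 for x in predictor)
    var_y = sum((y - mean_y) ** 2 for y in outcome)
    return cov * cov / (var_x * var_y)
```

In use, one would compute a predictor per quarterback (a plain career average or a posterior mean over `career_window(...)`) and compare the `r_squared` of each predictor against next-season performance.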