Today’s guest post/contest comes from Thomas McDermott, a frequent guest contributor. As always, we thank him for his hard work.

Author’s note regarding the numbers: The Win Probability (WP) numbers shown below were, for the most part, generated using a formula presented by Wayne Winston in his book Mathletics, and subsequently improved upon by Pro Football Reference for their WP model.

The heart of the formula is the Excel NORMDIST function, which returns the normal distribution (the probability) for a given mean and standard deviation. I have made some minor adjustments to this formula, but it is basically the same. The formula requires the use of Expected Points data; the EP dataset I use comes from Brian Burke’s Advanced Football Analytics site (when it was active), and I have adjusted those numbers for era. Since the formula falls apart in certain areas – most importantly, the 4th quarter when the game is close – I abandon it and use other data to generate a WP number. Field goal success rates, 4th down success rates, drive results, PFR’s Play Index and the recently provided play-by-play data from Ron Yurko on GitHub, are some of my alternate sources. I realize there’s a “black box” aspect to Win Probability analysis (if you can find two models that completely agree, let me know), since it’s not something that can be easily checked, and perhaps especially suspect when the author openly states that he adjusts the numbers “manually”. To that I can only say that my intent is to provide as accurate a picture as possible of the games that I analyze, and I’m open to any suggestions, comments or questions. Thanks.
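For readers who want to see the shape of such a formula, here is a minimal Python sketch of a normal-distribution WP estimate. It is not the exact formula used for the numbers in this post: the standard deviation constant (13.45), the way Expected Points is folded into the mean, and the half-credit for ties are all assumptions made for illustration, and the function and parameter names are hypothetical. `NormalDist(...).cdf(x)` plays the role of Excel's cumulative NORMDIST.

```python
from statistics import NormalDist

# Assumed value for the spread of final margins over a full game;
# the constant actually used for this post's numbers is not stated.
ASSUMED_GAME_SD = 13.45


def win_probability(margin, spread, minutes_remaining,
                    expected_points=0.0, game_sd=ASSUMED_GAME_SD):
    """Rough WP for one team from a normal model of the final margin.

    margin            current lead for the team (negative if trailing)
    spread            Vegas line from the team's side (negative = favored)
    minutes_remaining game time left, must be greater than 0
    expected_points   EP of the current possession from the team's side
    """
    if minutes_remaining <= 0:
        raise ValueError("minutes_remaining must be positive")
    frac = minutes_remaining / 60.0
    # Projected final margin: current lead, plus the value of the current
    # drive, plus whatever part of the pregame edge is still unplayed.
    mean = margin + expected_points - spread * frac
    # Uncertainty shrinks as the clock runs down.
    sd = game_sd * frac ** 0.5
    dist = NormalDist(mu=mean, sigma=sd)
    p_win = 1.0 - dist.cdf(0.5)
    p_tie = dist.cdf(0.5) - dist.cdf(-0.5)  # counted as half a win
    return p_win + 0.5 * p_tie
```

Calling `win_probability(0, -4.5, 60)` gives a pregame number for a 4.5-point favorite. A model of this kind gets unreliable late in close games, which is the situation where the formula is abandoned in favor of other data.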

Below is a table showing the performances of Tom Brady and Nick Foles in last year’s Super Bowl (“Rank” refers to how that stat ranks amongst all Super Bowl QB performances, and Value is explained in this footnote):

The numbers say Brady had the better day, and it’s true that his performance could be viewed as the best in Super Bowl history. But after the game, in the glow of the Eagles’ victory, I think a lot of fans agreed with what HOF QB Steve Young said on NFL Primetime:

“MVP Nick Foles outdueled (my italics) Tom Brady in the biggest offensive show in the history of the Super Bowl, and he made the plays that scored the touchdown that turned the game in the last part of the fourth quarter, more than anyone else that had done it, against [unintelligible] the Brady era…”

We can forgive Young’s jumbled praise toward the end of his statement because he’s pretty pumped up, and perhaps we can forgive him for crowning Foles without, possibly, looking at both QBs’ numbers. But on that second item, I’m not sure Young would change his mind, even if he had the numbers in front of him. It appears to me that for Young, Foles “outdueling” Brady isn’t about who has the most yards or the most touchdowns or the best completion percentage. It’s about which QB made those key plays in crucial moments that significantly altered the outcome of the game…you know, the plays that scored the touchdown that turned the game. Young is saying that in this game, Foles was more “clutch” than Brady. Was he right?

I don’t think it’s a question that really can be answered, even putting aside all the baggage with the term “clutch”. But, if we wanted to take a shot at it, the Win Probability Added (WPA) stat would be the best place to start. As I’m sure most readers of this website know, WPA is simply the difference between a team’s chances of winning the game (its win probability, WP) at the beginning of a play and its WP at the end of that play. That difference tells us how much a play increased or decreased a team’s chances of winning, which is what we want to know if we’re trying to measure “clutch”. To find out the QB’s impact on the game, we can just add up all the WPA for the plays in which he passed, ran or caught the ball.
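As a quick illustration of that definition, here is a small Python sketch. The `Play` record and its field names are hypothetical, and the WP values in the usage note are made up for the example, not taken from any game.

```python
from dataclasses import dataclass


@dataclass
class Play:
    wp_before: float   # team's WP at the start of the play
    wp_after: float    # team's WP at the end of the play
    qb_involved: bool  # QB passed, ran or caught the ball


def play_wpa(play):
    """WPA of one play: WP at the end minus WP at the start."""
    return play.wp_after - play.wp_before


def qb_total_wpa(plays):
    """Sum WPA over only the plays in which the QB was involved."""
    return sum(play_wpa(p) for p in plays if p.qb_involved)
```

For example, with `[Play(0.50, 0.60, True), Play(0.60, 0.55, False), Play(0.55, 0.50, True)]` the handoff in the middle is ignored and the two QB plays are netted against each other.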

I’ve processed WPA numbers for Brady and Foles in this last Super Bowl, as well as every other quarterback in every other Super Bowl. I’m going to present my “findings” in a series of posts over the next few weeks; the first one is today, covering Super Bowls 36-52 (2001-2017).

Before diving into the table, there are some ground rules to cover:

The QB gets credit for every play where he ran, passed, was sacked, or caught the ball. If he’s responsible for a delay of game penalty, he takes the WPA hit for that. Other than delay of game penalties (if it’s deemed that the QB is responsible), plays that are deemed “no play” do not affect the QB’s WPA (e.g., the QB does not receive credit for pass interference calls). If a QB throws a completion and then the receiver fumbles away the ball, it is a neutral play for him (he neither gains nor loses WPA). The QB gets full credit for the entire play in which he makes a completion – if the receiver catches a 1-yard dump off and then runs 75 yards for a TD, the QB gets credit for all of it. There is one situation where I split WPA: penalty plays that add yards at the spot of the penalty (unnecessary roughness, etc.). The QB gets credit to where the play would have ended without the penalty; if the play results in a first down and the play would have ended short of the first down marker, he gets credit to at least the first down marker.
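The ground rules above can be sketched as a small crediting function. This is only an illustration: the dictionary keys are hypothetical names, and the one case where WPA is split (yards added at the spot of a penalty) is left out because it depends on where the play would have ended.

```python
def credited_wpa(play):
    """Return the share of a play's WPA charged to the QB.

    `play` is a dict with hypothetical keys:
      wpa                  WPA of the whole play
      qb_involved          QB ran, passed, was sacked or caught the ball
      no_play              play wiped out by a penalty
      qb_delay_of_game     delay of game judged to be the QB's fault
      receiver_fumble_lost completion followed by a lost fumble
    """
    if play.get("no_play", False):
        # Only a delay of game pinned on the QB counts against him.
        if play.get("qb_delay_of_game", False):
            return play["wpa"]
        return 0.0
    if not play.get("qb_involved", False):
        return 0.0
    if play.get("receiver_fumble_lost", False):
        # Completion, then the receiver loses the ball: neutral for the QB.
        return 0.0
    # Otherwise the QB gets the entire play, yards after catch included.
    return play["wpa"]
```

Summing `credited_wpa` over a game's plays gives a QB total under these rules, apart from the split-penalty plays that would need to be handled by hand.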

We’re almost there. I’ve tried to cram as much relevant information in here as possible to provide a full picture of not only the quarterback’s performance, but the context of the game he was in. Below are some explanations of some of the terms:

Vegas Spread : Negative number means the QB’s team was favored by that much at closing (per PFR box score in all cases)

WP at Start : The QB’s team’s probability of winning at the beginning of the game (calculated from the Vegas spread)

WPA : The QB’s total WPA in the game. This number includes the Vegas spread in the calculations and is generally the number that I prefer in my evaluations.

WPA NoSprd : The QB’s WPA assuming both teams have a 50% chance to win the game at the start. This is useful if you think the Vegas spread is unfairly skewing the numbers in a certain direction.

WPA Sup : The total WPA for the QB’s team’s defense and special teams. Note that this does not include the team’s own rushing WPA. This stat includes the Vegas Spread.

Value : This is a measure of statistical production used frequently by Chase Stuart in various articles; it’s ANY/A above expectation multiplied by the number of dropbacks (see here for a post ranking Super Bowl performances by Value).
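For anyone unfamiliar with the stat, here is a rough Python sketch of Value. ANY/A uses the standard definition (passing yards plus 20 per touchdown, minus 45 per interception, minus sack yards, divided by attempts plus sacks). The baseline passed in as `expected_any_a` is an assumption for illustration, since how "expectation" is set is not spelled out here, and the stat line in the usage note is invented.

```python
def any_a(pass_yards, pass_tds, interceptions, sack_yards, attempts, sacks):
    """Adjusted Net Yards per Attempt."""
    dropbacks = attempts + sacks
    if dropbacks == 0:
        raise ValueError("no dropbacks")
    adjusted_yards = pass_yards + 20 * pass_tds - 45 * interceptions - sack_yards
    return adjusted_yards / dropbacks


def value(pass_yards, pass_tds, interceptions, sack_yards, attempts, sacks,
          expected_any_a):
    """ANY/A above an assumed baseline, multiplied by dropbacks."""
    rate = any_a(pass_yards, pass_tds, interceptions, sack_yards, attempts, sacks)
    return (rate - expected_any_a) * (attempts + sacks)
```

A made-up line of 300 yards, 3 TDs, 1 INT, 2 sacks for 10 yards on 38 attempts gives an ANY/A of 7.625; against an assumed baseline of 6.0 that is 1.625 adjusted yards per dropback over 40 dropbacks.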

And finally, the table. As you can see, Foles did indeed “outduel” Brady: his WPA is +1.00 and Brady’s is +0.61.

There’s so much here I want to discuss, but since I’ve gone on long enough, I’ll limit myself to four comments:

Because WPA is only looking at the impact of plays in the context of winning and losing, it can be inherently unfair. Russell Wilson’s interception at the end of SB 49 was over 4 times as costly as Tom Brady’s two interceptions – combined – in that same game. You may not agree with that assessment, but keep in mind that this stat is supposed to work that way. Brady’s interceptions came earlier, when the outcome of the game was somewhat in doubt; Wilson’s came at a moment when it appeared that the Seahawks had the game all but wrapped up. I’ve already noted this in the “ground rules”, but it’s worth repeating: the QB, Wilson, is taking the full brunt of that WPA loss. There is no parsing to account for receiver error, play calling, great defensive play, etc. But what’s cool is that since we have the effect of the play as a whole, we can divvy that up any way we see fit.

Because of this “unfairness”, WPA is not very useful in evaluating quarterback talent. This quote by Brian Burke will put us in the right mindset: “WPA is what I call a narrative stat. Its purpose is not to be predictive of future play or to measure the true ability of a player or team. It simply measures the impact of each play toward winning and losing.” Wilson’s interception might be the most impactful play in Super Bowl history (I think it is), but the idea that that tells us anything about Wilson’s ability or how he performs in clutch situations is ridiculous.

Since everything must always, at some point, be about Brady/Manning: Tom Brady earns his reputation for playing well in the clutch, as he has four appearances in the Top 10 (of course, this will change once we add more games), and two appearances in the Top 3! His best Super Bowl game, to me, is the last one: he is statistically off the chart (Value), and he has a high WPA. Unfortunately for fans of “The Sheriff”, Peyton Manning does not fare so well in this ranking, as he has two performances in the Bottom 5. What’s interesting to me about his Super Bowls is that his best game is probably the one most detractors point to as evidence that he is a choker: SB 44 against the Saints, in which he threw the pick-six. His worst game, against the Panthers in 2015, results in him getting a ring. I never liked QB wins as a stat, at least for small sample sizes like the Super Bowl, but after looking at this data, I now really detest it.

Where I think things really get interesting is when we look at WPA Support, and I have to go Brady/Manning here again. It’s been shown by a few analyses – this one by Adam Steele covers regular season numbers, this one, also by Adam, covers the Brady/Manning playoffs, and this one by James Hanson covers support for Brady, Brees, Manning and Rodgers – that generally speaking, Tom Brady has had some solid support (good defenses and special teams) surrounding him for most of his career, and Peyton Manning not as much. But there’s an interesting paradox here: Brady’s support in the Super Bowls, apart from his first one, has generally been disappointing, from a clutch perspective (and from an overall perspective on the last game). In crucial moments, it gives way, setting the stage for Tom Terrific to save the day. On the flip side, Manning has had two Super Bowls where he really benefits from great defensive play: in 2006 against the Bears and in 2015 against the Panthers, the latter a game in which it’s safe to say that the defense “won the game”. I think this would be an interesting study just using traditional stats.

Thanks for reading, hope you found this as interesting as I did.