Unless you have Tom Brady or another truly elite quarterback, it’s hard to get to the Super Bowl without a quality quarterback playing on a rookie contract. Veteran quarterbacks are very expensive, so it’s very important to find a college quarterback who has the upside to run hot in the playoffs while still playing on his rookie contract. Historically, rookie quarterbacks don’t reach the Super Bowl, so my focus in this column is to find players who can succeed in seasons two through four. Everything you will see here is trying to optimize Pro Football Reference’s Approximate Value Per Game (AV/G) metric, strictly in seasons two through four.

What you’ll see below is a bunch of statistics from both college and the NFL. We’re searching for college stats -- production, efficiency, size, combine, etc. -- that have historically led to NFL success. Some stats are much better than others, meaning you can lock onto the few that have been correlated while ignoring the ones that haven’t. Now, there are variables in the NFL Draft QB Stats Report that aren’t in this piece because those variables don’t have historical data, which I need to build correlations and models. Variables like Marginal Efficiency, Marginal Explosion, and 20+ yard percentage may end up being more correlated than the stats I’m looking at here, but I need a few more years of data before I can be more confident in those results.

Correlation Coefficients:

The chart below will look complicated, but it’s really not. The left side of the chart has the college stats, and the top of the chart has the NFL stats. The darker the color (either green or red), the more correlated the two stats are. If there isn’t a lot of color in the cell, then the two stats have little to no relationship. For example, the college rushing statistics and the NFL rushing statistics are dark green, meaning that college rushing has historically been positively correlated with NFL rushing. On the other hand, “Ball Velocity” has been negatively correlated (red cell) with “NFL INT%”, meaning that the higher the velocity of passes at the NFL Combine, the lower the interception percentage in the NFL. Lastly, college passing TD% has close to zero correlation (white cell) with NFL yards per carry. All of these examples should make sense if you actually took the time to read this paragraph. … The numbers shown in the chart are correlation coefficients. If you want R-squared, just square the number. I chose correlation coefficients so the colors can show whether a correlation is positive or negative.
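For anyone who wants to see the "just square the number" step in action, here is a minimal Python sketch. It assumes the chart uses the standard Pearson correlation coefficient, which the column doesn't state outright. The function name pearson_r and the six data points are made up purely for illustration; they are not real quarterback numbers.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one stat never varies")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical values for illustration only; NOT real quarterback data.
ball_velocity_mph = [52, 54, 55, 57, 59, 60]
nfl_int_pct = [3.4, 3.1, 3.3, 2.6, 2.4, 2.2]

r = pearson_r(ball_velocity_mph, nfl_int_pct)  # sign gives the cell color
r_squared = r ** 2                             # share of variation explained

print(f"r = {r:.2f}, R-squared = {r_squared:.2f}")
```

The sign of r is what separates a green cell from a red one. Squaring throws the sign away, which is why R-squared alone can't tell you the direction of the relationship.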

Statistics that have historically NOT been correlated to early NFL success for QBs:

1. NFL Combine Height (R-squared = 0.00)

2. College INT (0.00)

3. NFL Combine Vertical Jump (0.00)

4. NFL Combine Weight (0.00)

5. NFL Combine Shuttle (0.00)

6. NFL Combine Hand Size (0.01)

7. College INT/G (0.01)

8. College INT% (0.03)

9. College Rushing Y/G (0.03)

10. NFL Combine Agility Score (0.03)

Here’s how you read the number in the parentheses, using the first one as the example: “0 percent of the variation in NFL Approximate Value Per Game in NFL seasons two through four can be explained by the quarterback’s NFL Combine height.”
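If you want to map these R-squared values back onto the correlation chart, take the square root. The sketch below is a rough illustration that uses a handful of the R-squared values listed in this column; the helper name abs_correlation is made up for this example. Because the listed values are rounded to two decimals, the results are approximate, and the square root only recovers the size of the correlation, not whether it is positive or negative.

```python
import math

# R-squared values copied from the lists in this column (rounded to two decimals).
r_squared_by_stat = {
    "College Fantasy Points Per Game": 0.17,
    "College Total Passing Yards": 0.10,
    "NFL Combine Ball Velocity": 0.07,
    "College INT%": 0.03,
    "NFL Combine Height": 0.00,
}

def abs_correlation(r_squared):
    """Size of the correlation coefficient implied by an R-squared value."""
    return math.sqrt(r_squared)

for stat, r2 in r_squared_by_stat.items():
    print(f"{stat}: R-squared = {r2:.2f}, |r| is about {abs_correlation(r2):.2f}")
```

An R-squared of 0.17 works out to a correlation of roughly 0.41 in absolute terms, so even the best single stat in this column is a moderate signal at most.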

Old school football guys love themselves a tall and sturdy NFL quarterback, but I’m here to tell you not to listen to that nonsense. Height, weight, and hand size may have been important in some older studies, but the NFL has expanded its quarterback qualifications in recent years, which gave smaller quarterbacks a chance to play. Some of these quarterbacks have been pretty darn good, including Russell Wilson (71 inches), Drew Brees (72 inches), Michael Vick (72 inches), and Patrick Mahomes (9.25-inch hands). These players likely wouldn’t have met the minimum size thresholds at the time of their drafts, but they showed that the old thinking was too conservative.

Of the on-field stats, the interception metrics have really not mattered. Both total interceptions and interceptions per game have historically not been correlated with early NFL success, and even college INT% has been so-so. I believe the NFL has made too much of interceptions, so buying low on quarterbacks with a few more interceptions is a wise strategy, as long as the rest of the profile checks out.

Statistics that have historically been slightly correlated to early NFL success (AV/G) for QBs:

1. College Fantasy Points Per Game (R-squared = 0.17)

2. College Total Passing Touchdowns (0.10)

3. College Total Passing Yards (0.10)

4. College Rushing TD/G (0.08)

5. College YPC (0.07)

6. NFL Combine Ball Velocity (0.07)

7. College Passing TD/G (0.07)

8. College Passer Rating (0.07)

9. College AYPA (0.07)

10. College Rushing TD% (0.07)

11. College YPA (0.06)

12. NFL Combine Speed Score (0.06)

13. NFL Combine 40-Yard Dash (0.06)

14. NFL Combine Broad Jump (0.06)

15. NFL Combine Cone Drill (0.06)

If you have been grinding these College Football #Streets as I have, you have a slight advantage in dissecting these quarterback prospects. Of all the individual stats I have looked at, College Fantasy Points Per Game has been the most correlated with early NFL success… by far! If you weren’t reading my weekly CFB DFS columns (do you even have a life bruh?), then I’ll give you some insight on this past year’s strategy: Pick Kyler Murray and print all the money.

Anyways, total production stats (passing yards and touchdowns) have been better indicators than the efficiency and per game stats historically. Also, AYPA has been slightly, slightly more correlated to early NFL success than YPA. That’s because touchdowns matter.

Out of all the NFL Combine data, ball velocity has historically been the most correlated with early NFL success, but it’s a metric that has only been tracked since 2008, so there’s a sample size warning here (n=36). We’ll take a deeper look at it later, but take a quick glance at the relationships between ball velocity and NFL YPG, TDPG, and INT% in the above chart. It looks like a pretty awesome little stat, and one that Benjamin Allbright (@AllbrightNFL) and others have been utilizing for their quarterback evaluations. I will be doing the same.

Conclusion:

The NFL Draft largely remains a total guessing game. The R-squared between the overall draft pick and a quarterback’s approximate value per game in NFL seasons two through four is just 0.24, and only EDGE and LB have been more correlated with their overall draft position. But it’s important to remember that a high draft position leads to more playing time, even when it’s unjustified. The decision makers have the incentive to keep their own jobs, so they really don’t want to sit their high draft picks. This inflates the R-squared between overall draft pick and NFL success for all positions.

However, the college football player data that goes back years and years hasn’t been very valuable for predicting NFL success either. The things that are somewhat helpful for evaluating quarterbacks are college fantasy points, total touchdowns, total yards, some of the rushing data, and ball velocity at the NFL Combine. Everything else largely doesn’t matter. That includes height, weight, hand size, and interceptions.

But there’s hope. I’m optimistic about the future of NFL Draft analytics, especially at quarterback where we have tons of plays/throws/runs to analyze, with more time and effort going into predicting college football outcomes. I used some of the newest data in my 2019 NFL Draft QB Stats Report, and it’s only a matter of time before we get to build models with those inputs -- I’m waiting for more historical data for the new stats. Lastly, there are ways to use rankings and mock drafts that lead to better results. You’ll see the results of those when I come out with the Analytics Top-300 Big Board.

On the next page, we will get away from overall early NFL success and look at what’s correlated with individual statistics like fantasy points, passing yards, passing touchdowns, interceptions, and YPA. There are some interesting findings when we begin to plot these data points on graphs. Join me!

Data Notes: The quarterbacks sampled (n=110) met the following requirements: they had a total Approximate Value of at least 2.0 during NFL seasons two through four and were drafted between 2000 and 2017. Some of the quarterbacks who qualified didn’t play in the FBS, so their college statistics were omitted, but I kept them in the sample while researching the NFL Combine. It’s up to you if you want to extrapolate these college production/efficiency findings to the FCS level. Also, not every quarterback participated in the NFL Combine, but the sample (n=99 for the 40-yard dash) is large enough to take these NFL Combine results seriously. I chose the year 2000 to keep the sample large enough to have reasonable results without reaching so far back that it strays from “Today’s NFL,” which is vastly different from the 1980s. Lastly, since this sample is looking at NFL seasons two through four, it’s up to you to extrapolate these findings to rookie seasons and seasons coming after the rookie contract. I’d guess that the data would change marginally, but the major takeaways would more-or-less stay the same. ... In order to establish a baseline for what “NFL success” is, I’ve decided to use Pro Football Reference’s Approximate Value statistic. While not perfect, it’s a good enough metric for the column (and it will help me analyze all positions, which will ultimately let me produce an Analytics Top-300 Big Board). If you aren’t familiar with approximate value per game (AV/G), here is how the top 2018 NFL QBs finished in AV.