3.1 Descriptive Model

In this part of our study we present our descriptive generalized linear model. In particular, we build a Bradley-Terry model to understand the factors that impact the probability of a team winning an American football game. This model is later used in our future matchup prediction engine, FPM, which we describe in Section 3.2.

Let $W_{ij}$ denote the binary random variable that represents the event of home team i winning the game against visiting team j: $W_{ij} = 1$ if the home team wins the game and 0 otherwise. Our model for $W_{ij}$ provides the probability of the home team winning the game given the set of input features, i.e., $y = \Pr(W_{ij} = 1 \mid \mathbf{z})$. The input of this model is a vector z of features that can potentially impact the probability of a team winning.
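As a sketch of the functional form, and assuming the logistic link that is standard for Bradley-Terry-style regression models (the symbols $\beta_0$ for the intercept and $\boldsymbol{\beta}$ for the coefficient vector are our notation here, not necessarily the one used elsewhere in the paper), the modeled probability can be written as:

```latex
\Pr(W_{ij} = 1 \mid \mathbf{z})
  = \frac{1}{1 + e^{-\left(\beta_0 + \boldsymbol{\beta}^{\top}\mathbf{z}\right)}}
```

Under this assumption, a positive coefficient means that a larger home-minus-visiting differential in the corresponding feature is associated with a higher home win probability.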

The features we use as the input for our model include:

Total offensive yards differential: This feature captures the difference between the home and visiting teams’ total yards (rushing and passing) produced by their offense in the game.

Penalty yards differential: This feature captures the differential between the home and visiting teams’ total penalty yards in the game.

Turnovers differential: This feature captures the differential between the total turnovers committed by the two teams (i.e., interceptions thrown, fumbles recovered by the opposing team, and turnovers on downs).

Possession time differential: This feature captures the differential of the ball possession time between the home and visiting team.

Passing-to-Rushing ratio r differential: The passing-to-rushing ratio r for a team corresponds to the fraction of offensive yards gained by passing:

\[ r = \frac{\text{passing yards}}{\text{passing yards} + \text{rushing yards}} \tag{3} \]

This ratio captures the offense’s balance between rushing and passing. A perfectly balanced offense will have r = 0.5. We emphasize that r refers to the actual yardage produced and not to the passing/rushing attempts. The feature included in the model is the differential between $r_{\text{home}}$ and $r_{\text{visiting}}$.

Power ranking differential: This is the current difference in rankings between the home and the visiting teams. A positive differential means that the home team is stronger, i.e., ranks higher, than its opponent. For the power ranking we utilize SportsNetRank [19], which uses a directed network that represents win-loss relationships between teams. SportsNetRank indirectly captures the schedule strength of a team and has been shown to provide a better ranking of teams than the simple win-loss percentage.
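The feature list above can be sketched in code as follows. This is a minimal illustration rather than the authors' implementation: the `TeamGameStats` record, its field names, and the sample numbers are hypothetical, and the ranking is assumed to be available as a score where a larger value means a stronger team.

```python
from dataclasses import dataclass


@dataclass
class TeamGameStats:
    """Box-score totals for one team in one game (hypothetical field names)."""
    passing_yards: float
    rushing_yards: float
    penalty_yards: float
    turnovers: int
    possession_minutes: float
    ranking_score: float  # assumption: larger value = stronger team


def total_yards(team):
    """Total offensive yards (rushing and passing)."""
    return team.passing_yards + team.rushing_yards


def pass_ratio(team):
    """Ratio r: fraction of offensive yards gained by passing."""
    return team.passing_yards / total_yards(team)


def feature_vector(home, away):
    """Home-minus-visiting differentials for the six listed features."""
    return {
        "total_yards": total_yards(home) - total_yards(away),
        "penalty_yards": home.penalty_yards - away.penalty_yards,
        "turnovers": home.turnovers - away.turnovers,
        "possession_time": home.possession_minutes - away.possession_minutes,
        "pass_ratio": pass_ratio(home) - pass_ratio(away),
        "power_ranking": home.ranking_score - away.ranking_score,
    }


# Made-up numbers, for illustration only.
home = TeamGameStats(300, 100, 50, 1, 32.0, 0.6)
away = TeamGameStats(200, 150, 70, 2, 28.0, 0.4)
z = feature_vector(home, away)
```

All differentials are ordered as home minus visiting, so the sign of each entry is always read from the home team's perspective.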

Before delving into the details of the descriptive model, we perform some basic analysis that compares the game statistics and metrics used to obtain the features included in our regression model. In particular, for each continuous game statistic $s_i$ (e.g., total offensive yards) we compare the pairs $(s_{i,j}^{+}, s_{i,j}^{-})$ with a paired t-test, where $s_{i,j}^{+}$ ($s_{i,j}^{-}$) is the value of $s_i$ for the winning (losing) team of the jth game in our dataset. Table 1 depicts the results of the two-sided paired t-tests for our continuous statistics together with the home team advantage observed in our data. As we can see, all the differences are significantly different from zero (at the significance level of α = 0.01). Fig 1 further presents the empirical cumulative distribution function (ECDF) of the paired differences for all the statistics, as well as the probability mass function (PMF) of the distribution of wins among home and visiting teams. For example, we can see that in only 20% of the games did the winning team have more turnovers than the losing team. We further perform the Kolmogorov-Smirnov test on the ECDFs of the considered statistics for the winning and losing teams. The tests reject the null hypothesis at the significance level of α = 0.01 in all cases, that is, the cumulative distributions of the statistics are statistically different for the winning and losing teams.
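To make the paired comparison concrete, the following sketch computes the paired t statistic for one game statistic using only the Python standard library. The function name and the sample values are illustrative assumptions; turning the statistic into a p-value would additionally require the Student t distribution with the returned degrees of freedom, which is not shown here.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(winner_values, loser_values):
    """Return (t, degrees of freedom) for the per-game differences
    winner - loser of a single game statistic."""
    if len(winner_values) != len(loser_values):
        raise ValueError("need one winner and one loser value per game")
    diffs = [w - l for w, l in zip(winner_values, loser_values)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two games")
    # Standard error of the mean difference (sample standard deviation).
    std_err = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / std_err, n - 1


# Made-up turnover counts for three games, for illustration only.
t_stat, dof = paired_t_statistic([3, 5, 7], [1, 2, 3])
```

A large absolute value of the statistic relative to the t distribution with `dof` degrees of freedom indicates that the mean paired difference is unlikely to be zero.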

Fig 1. Empirical cumulative distribution function for the paired differences of each feature. Based on the Kolmogorov-Smirnov test the features’ ECDFs for the winning and losing teams are statistically different (at the significance level α = 0.01). The probability mass function for the home team advantage is also presented. https://doi.org/10.1371/journal.pone.0168716.g001

Our basic data analysis above indicates that the distribution of the statistics considered is significantly different for the winning and losing teams. However, we are interested in understanding which of them are good explanatory variables for the probability of winning a game. To this end, we use our data to train the Bradley-Terry regression model and obtain the results presented in Table 2. Note that, as might be evident from the discussion above, we do not explicitly incorporate a feature that distinguishes between the home and the visiting team. Nevertheless, the response variable is the probability of the home team winning, while the features capture the differential of the respective statistics between the home and road team (i.e., the difference is ordered). Therefore, the intercept essentially captures the home team advantage (or lack thereof, depending on the sign and significance of the coefficient). In fact, setting all of the explanatory variables equal to zero yields a response of $\Pr(W_{ij} = 1 \mid \mathbf{0}) = 0.555$, which is equal to the home team advantage discussed above. Furthermore, all of the coefficients, except the one for the possession time differential, are statistically significant. However, the impact of the various factors, as captured by the magnitude of the coefficients, ranges from weak to strong. For example, the number of total yards produced by the offense seems to have the weakest correlation with the probability of winning a game (i.e., empty yards). In contrast, committing turnovers quickly deteriorates the probability of winning the game, and the same is true for an unbalanced offense. Finally, in S1 Text we present a standardized version of our model.
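The relationship between the intercept and the home team advantage can be checked with a short calculation. The sketch below assumes a logistic link (an assumption on our part, since the link function is not restated in this section); the only number taken from the text is the 0.555 baseline probability.

```python
import math


def logistic(x):
    """Map a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def logit(p):
    """Inverse of the logistic function (log-odds)."""
    return math.log(p / (1.0 - p))


# Baseline home win probability reported in the text (all differentials zero).
home_advantage = 0.555

# Under a logistic link, this is the intercept implied by that baseline.
implied_intercept = logit(home_advantage)

# Round trip: with all features at zero, only the intercept remains.
baseline_probability = logistic(implied_intercept)
```

The implied intercept is positive, which matches the reading that the home team is favored when the two teams are otherwise even on every feature.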

While the direction of the effects of these variables is potentially intuitive for the coaching staff of NFL teams, the benefit of our quantitative approach is that it assigns a specific magnitude to the importance of each factor. Clearly, the conclusions drawn from the regression cannot and should not be treated as causal. Nevertheless, they provide a good understanding of what is correlated with winning games. For example, if a team wins the turnover battle by 1, it can expect an approximately 20% gain in winning probability (all else being constant), while a 10-yard differential in penalty yardage is correlated with just a 5% difference in winning probability. Hence, while almost all of the factors considered are statistically significant, some of them appear to be much more important, as captured by the corresponding coefficients, and point to parts of the game a team could potentially work on. Again, this descriptive model does not provide a cause-effect relationship between the covariates considered and the probability of winning.

Before turning to the FPM predictive engine, we would like to further emphasize and reflect on how one should interpret and use these results. For example, one could be tempted to focus on the feature whose coefficient exhibits the maximum absolute magnitude, that is, the differential of ratio r, and conclude that calling only run plays will increase the probability of winning, since the negative differential with the opposing team will be maximized. However, this is clearly not true, as every person with basic familiarity with American football knows. At the same time, the regression model is not contradicting itself. The model developed, like any data-driven model, is valid only for the range of values that the input variables cover. Outside of this range, the generalized linear trend may or may not hold. For example, Fig 2 depicts the distribution of ratio r for the winning and losing teams. As we can see, our data cover approximately the range r ∈ [0.3, 0.98], and the trend should only be considered valid within this range (and potentially within a small ϵ outside of it). It is also interesting to observe that the mass of the distribution for the winning teams is concentrated around r ≈ 0.64, while for the losing teams it is concentrated at a larger value (r ≈ 0.8). In the same figure we also present a table with the range that our features cover for both winning and losing teams.

Furthermore, to reiterate, the regression model captures merely correlations (rather than cause-effect relations). The fact that some of the statistics involved in the features are themselves correlated (see Fig 3) and/or are the result of situational football makes it even harder to identify real causes. For instance, there appears to be a small but statistically significant negative correlation between ratio r and possession time. In addition, a typical tactic followed by teams leading in a game towards the end of the fourth quarter is to run the clock out by calling running plays. This can lead to a problem of reverse causality: the leading team ends up with a reduced ratio r compared to the counterfactual r expected had it continued its original game plan, which can artificially deflate the actual contribution of the r differential to the probability of winning. Similarly, teams that are trailing towards the end of the game will typically call plays involving long passes in order to cover more yardage faster. However, these plays are also riskier and lead to turnovers more often, thereby inflating the turnover differential feature. This is always a problem when a field experiment cannot be designed and only observational data are available. While we cannot claim causal links between the covariates and the output variable, in what follows we present evidence against the presence of reverse causality in the scenarios described above.
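The range-of-validity caveat discussed above can be made explicit with a simple guard before the fitted trend is applied to a new input. This is a hedged sketch: the function name and the `eps` tolerance are hypothetical, and the bounds are the approximate range of r quoted in the text.

```python
# Approximate range of ratio r covered by the data, as quoted in the text.
R_OBSERVED_RANGE = (0.3, 0.98)


def within_observed_range(value, observed_range=R_OBSERVED_RANGE, eps=0.0):
    """Return True if `value` lies inside the observed range,
    optionally widened by a small tolerance `eps` on each side."""
    low, high = observed_range
    return (low - eps) <= value <= (high + eps)


# A run-only offense (r = 0) falls outside the observed range, so the
# fitted trend should not be extrapolated to it.
run_only_ok = within_observed_range(0.0)
typical_ok = within_observed_range(0.64)
```

The same check could be applied to any other feature once its observed range is read from the table in Fig 2.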

Fig 2. Model validity. Our model is trained within the range of input variable/statistics values on the left table. The figure on the right presents the probability density function for r for the winning and losing instances respectively. https://doi.org/10.1371/journal.pone.0168716.g002

Reverse Causality: In what follows we examine the potential for reverse causality. To fast forward to our results, we do not find strong evidence for it. To reiterate, one of the problems with any model based on observational data is the direction of the effects captured by the model. For example, in our case teams that are ahead in the score towards the end of the game follow “conservative” play calling, that is, they run the football more in order to minimize the probability of a turnover and, more importantly, to use up valuable time on the clock. This can lead to a decreasing ratio r. Therefore, the negative coefficient for the r differential in our regression model might be capturing reverse causation: winning teams artificially decrease r due to conservative play calling at the end of the game. Similarly, teams that are behind in the score towards the end of the game follow a more “risky” game plan, which might lead to more turnovers (i.e., trailing leads to turnovers, rather than the other way around).

One possible way to explore whether this is the case is to examine how the values of these two statistics change over the course of the game. We begin with ratio r. If the reverse causation hypothesis were true, then the ratio r for the winning team would have to decrease over the course of the game. To examine this hypothesis, we compute the ratio r at the end of each quarter for both the winning and losing teams. Fig 4 presents the results. As we can see, during the first quarter there is large variability in the value of r, as one might expect, mainly due to the small number of drives. After the first quarter, however, the value of r stabilizes. There is a slight decrease (increase) for the winning (losing) team during the fourth quarter, but this change is not statistically significant. Therefore, we can more confidently reject the existence of reverse causality for ratio r.
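The per-quarter computation can be sketched as follows. We assume here that r at the end of a quarter is computed from the cumulative yardage up to that point, which is one reasonable reading of the description and not something the text states outright. The function name and the yardage values are illustrative.

```python
from itertools import accumulate


def cumulative_pass_ratio(passing_by_quarter, rushing_by_quarter):
    """Ratio r at the end of each quarter, computed from cumulative
    passing and rushing yards up to that quarter."""
    cum_pass = accumulate(passing_by_quarter)
    cum_rush = accumulate(rushing_by_quarter)
    ratios = []
    for p, r in zip(cum_pass, cum_rush):
        total = p + r
        # r is undefined while no offensive yards have been gained.
        ratios.append(p / total if total else float("nan"))
    return ratios


# Made-up per-quarter yardage for one team, for illustration only.
r_by_quarter = cumulative_pass_ratio([60, 80, 70, 30], [20, 40, 30, 70])
```

In this made-up example the ratio drifts downward in the fourth quarter, which is the pattern the reverse causation hypothesis would predict for a team protecting a lead.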

Fig 4. Evolution of r through the game. Ratio r is stable after the first quarter for both winning (left figure) and losing (right figure) teams, allowing us to reject the reverse causation hypothesis for r. https://doi.org/10.1371/journal.pone.0168716.g004

We now focus our attention on turnovers and the potential reverse causation with respect to this feature. To examine this hypothesis, we obtain from our data the time within the game (at minute granularity) at which turnovers were committed by the winning and losing teams. We then compare the paired difference for the turnover differential up to the end of the third quarter for each game. Our results show that the winning teams commit fewer turnovers than their losing opponents by the end of the third quarter (p-value < 0.01), further supporting the view that avoiding turnovers is associated with ultimately winning the game. Of course, as we can see from Fig 5, there is a spike in turnovers towards the end of each half (and smaller spikes towards the end of each quarter). These spikes can potentially be explained by the urgency to score, since either the drive will stop when the half ends or the game will be over, respectively. However, regardless of the exact reasons for these spikes, the main point is that by committing turnovers, either early in the game (e.g., during the first three quarters) or late, the chances of winning the game are significantly reduced.
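The third-quarter cutoff used in this comparison can be illustrated with a short sketch. It assumes that each turnover is recorded as the elapsed game minute in which it occurred and that regulation consists of four 15-minute quarters. The function names and the sample timestamps are hypothetical.

```python
# End of the third quarter in elapsed game minutes (three 15-minute quarters).
THIRD_QUARTER_END_MIN = 45


def turnovers_through(turnover_minutes, cutoff=THIRD_QUARTER_END_MIN):
    """Count the turnovers committed up to and including the cutoff minute."""
    return sum(1 for minute in turnover_minutes if minute <= cutoff)


def early_turnover_diff(winner_minutes, loser_minutes,
                        cutoff=THIRD_QUARTER_END_MIN):
    """Winner-minus-loser turnover count through the cutoff for one game."""
    return (turnovers_through(winner_minutes, cutoff)
            - turnovers_through(loser_minutes, cutoff))


# Made-up turnover timestamps for one game, for illustration only.
diff = early_turnover_diff([12, 58], [7, 29, 44, 59])
```

Collecting this difference over all games gives the paired sample that is tested against zero; a negative mean indicates that winners had already committed fewer turnovers before any late-game play calling could take effect.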

In conclusion, our model provides quantifiable and actionable insights, but they need to be carefully interpreted when designing play actions based on it.