Two posts ago, I implemented a Hierarchical Bayesian model of the Premier League. The model, introduced by Gianluca Baio and Marta A. Blangiardo, modeled scoring in soccer as a Poisson process, with the log scoring intensities a linear function of the teams' attacking/defending strengths plus a home field advantage. By fitting the model to the league as a whole, they are able to estimate teams' attacking strengths while 'controlling' for the defending strengths of their opponents, and vice versa.

Since writing that post, I've wanted to reproduce the model for (American) football. But a Poisson process isn't as natural a fit for football as it is for soccer. It wouldn't capture the fight-for-field-position process of football. Moreover, because football is more structured, we have more data to work with: we can get much more granular than the game-outcome level.

In this post, I describe a model of the football drive as a piecewise exponential competing risks survival model. I then fit an example implementation, embedding the drive model within a Hierarchical Bayesian model of the NFL.

The Piecewise Exponential Competing Risks Survival Model

I learned about the piecewise exponential model via the course materials of Germán Rodríguez, who is based at the Office of Population Research at Princeton. Rodríguez calls it his "favorite model in the context of competing risks," and I'm borrowing heavily from him in this explanation of the model; see here (pdf) and here.

The model assumes there are intervals, each of which has a constant baseline hazard rate. With $K$ intervals bounded by $0 = \tau_1 < \tau_2 < ... < \tau_{K+1} = \infty$, the baseline hazard for death by cause $j$ is a step function: $$\lambda_{j0}(t) = \lambda_{jk}, \text{ for } t \in [\tau_k, \tau_{k+1})$$

Bringing in covariates $x$ and coefficients $\beta_j$, the hazard rate for cause $j$ at time $t$ in interval $k$ is: $$\lambda_j(t,x) = \lambda_{j0}(t)e^{x'\beta_j} = e^{\alpha_{jk} + x'\beta_j}$$ where $\alpha_{jk} = \log \lambda_{jk}$ is the log baseline risk for cause $j$ in interval $k$.

Thus, failures of type $j$ in interval $k$ to people with covariate values $x_i$ are distributed Poisson with mean $$\mu_{ijk} = E_{ik}e^{\alpha_{jk} + x'_i\beta_j} $$ where $E_{ik}$ is the total exposure of people with covariates $x_i$ in interval $k$ (to all causes). In other words, the likelihood function asks, 'how probable is it that we'd see $failures_{ijk}$ in a $Poisson(\mu_{ijk})$ distribution?', for every combination of $i$, $j$, $k$. (Note that this means covariates have to be discretized.)
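As a minimal sketch of that likelihood, the snippet below computes $\mu_{ijk}$ for each cell and sums the Poisson log-probabilities. The function names, the dictionary layout, and all numbers are hypothetical and chosen only for illustration; this is not the code used to fit the model in this post.

```python
import math

def cell_mean(exposure, alpha_jk, x, beta_j):
    """mu_ijk = E_ik * exp(alpha_jk + x_i' beta_j)."""
    linear = alpha_jk + sum(xv * bv for xv, bv in zip(x, beta_j))
    return exposure * math.exp(linear)

def poisson_logpmf(count, mean):
    """Log-probability of `count` events under a Poisson with the given mean."""
    if mean == 0.0:
        # No exposure: only a count of zero is possible.
        return 0.0 if count == 0 else float("-inf")
    return count * math.log(mean) - mean - math.lgamma(count + 1)

def log_likelihood(cells):
    """Sum the Poisson log-probabilities over every (i, j, k) cell."""
    total = 0.0
    for cell in cells:
        mu = cell_mean(cell["exposure"], cell["alpha"], cell["x"], cell["beta"])
        total += poisson_logpmf(cell["failures"], mu)
    return total

# Hypothetical toy cells: failures observed, exposure, log baseline hazard,
# one discretized covariate and its coefficient.
cells = [
    {"failures": 3, "exposure": 120.0, "alpha": math.log(0.02), "x": [1.0], "beta": [0.1]},
    {"failures": 0, "exposure": 40.0, "alpha": math.log(0.05), "x": [0.0], "beta": [0.1]},
]
print(log_likelihood(cells))
```

Because each cell contributes independently, the same function works whether the cells are grouped by cause, by interval, or by covariate pattern.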

What makes the model handy for competing risk is that for the overall risk, just sum over the causes: $$\lambda(t,x) = \sum_{j=1}^m\lambda_j(t,x) = \sum_{j=1}^me^{\alpha_{jk} + x'\beta_j} $$

And given an observed death, the conditional probability that the death was by cause $j$ is just $j$'s risk divided by the overall risk: $$\pi_{jk} = \frac{e^{\alpha_{jk} + x'\beta_j}}{\sum_{r=1}^me^{\alpha_{rk} + x'\beta_r}}$$
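To make the two formulas above concrete, here is a small sketch that computes the cause-specific hazards in one interval, their sum, and the conditional probability of each cause. The two causes, the baseline values, and the coefficients are made-up numbers for illustration only.

```python
import math

def cause_hazards(alphas_k, x, betas):
    """Hazard of each cause j in interval k: exp(alpha_jk + x' beta_j)."""
    return [
        math.exp(alpha + sum(xv * bv for xv, bv in zip(x, beta)))
        for alpha, beta in zip(alphas_k, betas)
    ]

def cause_probabilities(hazards):
    """pi_jk: each cause's hazard divided by the overall hazard."""
    overall = sum(hazards)
    return [h / overall for h in hazards]

# Hypothetical interval with two causes and a single covariate.
alphas_k = [math.log(0.03), math.log(0.01)]  # log baseline hazards
betas = [[0.2], [-0.1]]                      # one coefficient vector per cause
x = [0.0]                                    # covariate switched off

hazards = cause_hazards(alphas_k, x, betas)
print(sum(hazards))                  # overall hazard in this interval
print(cause_probabilities(hazards))  # conditional cause probabilities
```

With the covariate switched off, the probabilities reduce to the ratio of the baseline hazards.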

Survival analysis is usually concerned with an entity's survival in time. Here, we'll look at a drive's survival in yards. Drives that move backwards will be counted as 0-yard drives.

In a conventional application of the piecewise exponential survival model, the constant-baseline-risk intervals are age intervals - e.g. 0-1 months, 1-3 months, 3-6 months, etc. That is, they are relative to the entity whose survival is being modeled.

But there's no reason these intervals have to be relative to the entity. In the case of the football drive, they can represent fixed regions of the football field itself.

The idea is to pick areas on the field that have a similar baseline hazard rate. Intuitively, this means picking areas on the field within which an offense or a defense is going to think similarly, especially with regard to risk averseness. So, for example, I defined the following intervals, where yards 51 and up indicate the opponent's territory:

0-13 yards: offense pinned against their goal line, so has to call quick-developing plays

13-75 yards: the 'normal' zone

75-100 yards: 'extended red-zone'; offense thinks 'worst case, we kick a field goal' and focuses on ball control; defense has very small area to worry about

So if a drive starts on the 10 yard line and wants to survive to the endzone, it has to survive the interval-0 hazard rate for 3 yards, the interval-1 rate for 62 yards, and the interval-2 rate for 25 yards. If a drive starts on the 77, it only has to survive the interval-2 hazard rate for 23 yards. And so on.
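The bookkeeping in that example can be sketched as a small function that splits a drive into yards of exposure per zone. The zone edges are the ones listed above; the function name and signature are hypothetical.

```python
ZONE_EDGES = [0, 13, 75, 100]  # boundaries of the three zones described above

def exposure_by_zone(start, end, edges=ZONE_EDGES):
    """Yards a drive spends in each zone when it goes from `start` to `end`.

    Drives that move backwards are treated as 0-yard drives.
    """
    end = max(end, start)
    return [
        max(0, min(end, hi) - max(start, lo))
        for lo, hi in zip(edges[:-1], edges[1:])
    ]

print(exposure_by_zone(10, 100))  # [3, 62, 25]
print(exposure_by_zone(77, 100))  # [0, 0, 23]
```

These per-zone yardages play the role of the exposures $E_{ik}$ in the Poisson likelihood.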

Intuitively, it's like a drive has to run a gauntlet, with a difficulty that varies by zone. To be clear, the gauntlet is such that each zone isn't pass/fail, but rather death-by-exposure, so that it can occur anywhere within the zone - indeed, it's more likely to occur earlier in the zone than later.
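One way to picture the gauntlet is to compute the probability of surviving it, and to sample where a drive dies. The sketch below assumes a single overall per-yard hazard in each zone (the sum over causes); the rates are invented for illustration and are not fitted values. Sampling relies on the memoryless property of the exponential: draw a distance within the current zone, and if it overshoots the zone boundary, start afresh in the next zone.

```python
import math
import random

ZONE_EDGES = [0, 13, 75, 100]  # zone boundaries described above

def survival_to_endzone(start, rates, edges=ZONE_EDGES):
    """P(drive starting at `start` reaches the endzone) = exp(-sum_k rate_k * yards_k)."""
    cumulative_hazard = 0.0
    for lo, hi, rate in zip(edges[:-1], edges[1:], rates):
        yards = max(0, hi - max(start, lo))  # yards still to cover in this zone
        cumulative_hazard += rate * yards
    return math.exp(-cumulative_hazard)

def sample_drive_end(start, rates, rng, edges=ZONE_EDGES):
    """Sample the yard line where a drive dies; returns edges[-1] for a touchdown."""
    position = start
    for lo, hi, rate in zip(edges[:-1], edges[1:], rates):
        if position >= hi:
            continue  # the drive starts beyond this zone
        if rate > 0:
            distance = rng.expovariate(rate)  # exponential distance until death
            if position + distance < hi:
                return position + distance
        position = hi  # survived this zone, move to the next one
    return edges[-1]

rates = [0.05, 0.02, 0.03]  # hypothetical per-yard hazards for the three zones
rng = random.Random(0)
print(survival_to_endzone(10, rates))
print(sample_drive_end(10, rates, rng))
```

Within a zone the sampled distance has a decreasing exponential density, which is why death is more likely early in a zone than late.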

A drive can die by 'running out of steam' - a punt, field goal attempt, unsuccessful 4th down attempt - or by a turnover. While there's no requirement to do so, I model these two causes as competing risks because they have different consequences for the starting field position of the other team. Also, the piecewise baseline hazard rates probably vary: imagine a team focusing more on ball control as it gets closer to the endzone, knowing that as long as they retain the ball, they can kick a field goal.

Covariates can include team-specific parameters (attacking strength, home field advantage), team-interval-specific parameters (red zone attacking strength), drive-situation-specific parameters (weather, game situation (is the offense team losing badly, or down by just a score?)), and game-specific parameters (importance). This flexibility is a key strength of the model.

As an approach to modeling football, it has (at least) these weaknesses:

it makes no distinction between run vs. pass

it cannot represent drives that end 'before' they began - e.g. a drive starting on the 20 but ending on the 10 because of a sack or penalty. No safeties.

it doesn't include special teams, although this could be remedied by adding a special-teams model

it doesn't account for clock management

Example Implementation

In the remainder of the post, I show the results of an example implementation of the model. I've embedded the drive model within a Hierarchical Bayesian framework, so that team-specific parameters are drawn from common distributions.

I fit the model to 2014 regular season data from NFLsavant.

I used the following constant-baseline-risk intervals. Note that yards 50+ are in the opponent's territory.

0-13 yards

13-75 yards

66-100 yards

Covariates: Drive Death by Punt/Field Goal

I used team-specific attacking/defending 'strength' covariates. My first attempts suffered from overshrinkage (see my earlier post for a discussion of shrinkage in Hierarchical Bayesian models), so I tried the technique used by Baio/Blangiardo in their second model of the Serie A:

One possible way to avoid this problem is to introduce a more complicated structure for the parameters of the model, in order to allow for three different generating mechanism, one for the top teams, one for the mid-table teams, and one for the bottom-table teams. Also, in line with Berger (1984), shrinkage can be limited by modelling the attack and defense parameters using a non central t (nct) distribution on ν = 4 degrees of freedom instead of the normal of § 2.

Using $t$ to indicate team, first we define latent group parameters $grp^{att}(t)$ and $grp^{def}(t)$, with Dirichlet(1,1,1) priors. Team specific attacking and defending strengths are modelled as: $$att_t \sim nct(\mu_{grp(t)}^{att}, \tau_{grp(t)}^{att}, \nu)$$ $$def_t \sim nct(\mu_{grp(t)}^{def}, \tau_{grp(t)}^{def}, \nu)$$

To associate the latent groups with a performance tier, the priors on the $\mu$s constrain them to be positive/negative. In this example specification, a negative coefficient means a lower hazard (longer drive), while a positive coefficient means a higher hazard (shorter drive). So for the 'top' group, the priors are: $$\mu_1^{att} \sim truncNormal(0, 0.001, -3, 0)$$ $$\mu_1^{def} \sim truncNormal(0, 0.001, 0, 3)$$

And for the 'bottom' group, the priors are: $$\mu_3^{att} \sim truncNormal(0, 0.001, 0, 3)$$ $$\mu_3^{def} \sim truncNormal(0, 0.001, -3, 0)$$

I fix $\mu_2^{att}$ and $\mu_2^{def}$ to 0, unlike Baio/Blangiardo, who draw them from a Normal distribution.

Each set ($att$s and $def$s) is subject to a sum-to-zero constraint, e.g. $\sum_{t=1}^Tatt_t = 0$.
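One common way to impose a sum-to-zero constraint is to draw unconstrained values and subtract their mean. The sketch below shows only that centering step; whether the implementation here enforces the constraint in exactly this way is an assumption, and the numbers are made up.

```python
def center(values):
    """Shift values so that they sum to zero, preserving their differences."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

att_star = [0.30, -0.10, 0.50, -0.20]  # hypothetical unconstrained strengths
att = center(att_star)
print(att, sum(att))
```

Centering leaves the differences between teams untouched, so the constraint removes only the overall level, which is otherwise confounded with the baseline hazard.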

These covariates are used in all zones:

Team-specific home field advantage, applied when the defending team is home. Home parameters are drawn from $Normal(\mu_{home}, \sigma_{home})$. Unlike the other team-specific parameters, $home_t$ is not subject to a sum-to-zero constraint. (In future work, I'll try giving the attacking team a home advantage as well.)

$offense\_winning\_greatly$, a binary flag indicating whether the team possessing the ball is winning by more than 16 points.

$offense\_losing\_badly$, a binary flag indicating whether the team possessing the ball is losing by more than 16 points.

$two\_minute\_drill$, a binary flag indicating whether the drive began in the last two minutes of a half.
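The three situation flags above could be derived from the game state at the start of a drive roughly as follows. The function name and its arguments are hypothetical, and reading "the last two minutes of a half" as 120 seconds or fewer remaining is an assumption.

```python
def situation_flags(offense_score, defense_score, seconds_left_in_half, threshold=16):
    """Binary drive-situation covariates, evaluated when the drive begins."""
    margin = offense_score - defense_score
    return {
        "offense_winning_greatly": int(margin > threshold),
        "offense_losing_badly": int(margin < -threshold),
        # Assumption: "last two minutes" means at most 120 seconds remain.
        "two_minute_drill": int(seconds_left_in_half <= 120),
    }

print(situation_flags(offense_score=24, defense_score=7, seconds_left_in_half=500))
```

Note that a 16-point margin sets neither score flag, since the definitions above say "more than 16 points".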

Covariates: Drive Death by Turnover

Because we observe fewer deaths-by-turnover, we should cut down on the covariates to make sure the model is identifiable. With that in mind, the covariates I used in this implementation are:

Team specific ball-retention abilities $att_t \sim Normal(0, \sigma_{att})$

Home field advantage, applied when the defending team is home. Note that it's not team-specific here.

$offense\_winning\_greatly$, a binary flag indicating whether the team possessing the ball is winning by more than 16 points.

$offense\_losing\_badly$, a binary flag indicating whether the team possessing the ball is losing by more than 16 points.

$two\_minute\_drill$, a binary flag indicating whether the drive began in the last two minutes of a half.

Assessing Model Fit with Simulation

In order to simulate a game, we have to make some assumptions. The major assumptions are:

Special Teams: There are no kickoff returns; all kickoffs result in a touchback. If a drive dies by punt/field goal, the team on offense kicks a field goal if they're past some point on the field (I used 66), and all field goal attempts are good. A punt and return always net out to some constant number of yards (I used 38).

The Clock: The clock time elapsed by a drive is a linear function of the drive distance. Touchdowns, field goals, and punts all take a certain constant amount of time off the clock.

Ties: Ties are decided by coin flip.



The ties assumption is easily fixed, but I haven't had the chance yet. Simulation code is here.
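As a rough illustration of the special-teams assumptions (and not the linked simulation code), the sketch below resolves a drive that died by punt/field goal. The 66-yard field goal line and the 38-yard punt net come from the assumptions above. The touchback spot at the 20, treating "past" as "at or beyond", and handling punts that reach the end zone as touchbacks are my own assumptions; the function name is hypothetical.

```python
FIELD_GOAL_LINE = 66   # from the assumptions above
PUNT_NET_YARDS = 38    # from the assumptions above
TOUCHBACK_START = 20   # assumption: touchbacks put the ball on the 20

def resolve_stalled_drive(end_yard):
    """Return (points_for_offense, opponent_start_yard) for a punt/field goal death.

    Yard lines are measured from each offense's own goal line.
    """
    if end_yard >= FIELD_GOAL_LINE:
        # Every field goal attempt is good; the following kickoff is a touchback.
        return 3, TOUCHBACK_START
    landing = end_yard + PUNT_NET_YARDS
    if landing >= 100:
        # Assumption: a punt that nets into the end zone is a touchback.
        return 0, TOUCHBACK_START
    # Flip the field to express the spot from the receiving team's perspective.
    return 0, 100 - landing

print(resolve_stalled_drive(70))  # (3, 20)
print(resolve_stalled_drive(30))  # (0, 32)
```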

After fitting the example implementation on 2014 regular season data, and using these assumptions, I used the model to simulate the 2014 season 900 times. For each simulated season, I drew a set of parameter values from the posterior distribution.

The model + simulation recreates some macro-level features of football, such as the distribution of drive starting points. I think the discrepancy here is largely driven by my assuming away special teams, although again, this could be remedied by a sub model.