During the NFL playoffs there was a controversy as to whether the New England Patriots intentionally deflated footballs below league standards while playing the Indianapolis Colts. The NFL hired attorney Ted Wells to investigate the matter, and he recently released his findings, which concluded the Patriots did deflate the footballs. “The Wells Report” is actually two reports: a legal-style report prepared by Wells, and a scientific report from Exponent, a scientific consulting firm hired to research the matter. As discussed in a Los Angeles Times article in February 2010, Exponent’s neutrality has been questioned because the firm has argued, for big tobacco, that secondhand smoke does not cause cancer, and has argued that Chevron dumping toxic waste doesn’t pose any health concerns. I will show that there are major mistakes in the statistical argument against the Patriots. The clearest way to explain these mistakes is through the following blueprint for fabricating a false conclusion, which is embedded in the Wells report, by intention or accident:

Perform preliminary analysis assuming unwanted factors do not exist and prove desired result beyond a reasonable doubt. State result prominently and quietly mention it is preliminary.

Using a different model, include unwanted factors in detailed analysis. Conclude desired result is true in a weak, non-statistical manner.

DO NOT update the preliminary analysis with the unwanted factors, to avoid contradicting the preliminary result.

In the legal document, emphatically lead with preliminary analysis result and follow up with the weaker detailed analysis result.


Determining intent is complicated because each report has plausible deniability. After the preliminary analysis, the technical report examines all the factors. The legal report quotes directly from the technical report, obscuring the miscommunication between reports. Now, here are the details.



Step One: Preliminary Analysis

Before starting an examination of the factors affecting footballs and measurements, Exponent performs a preliminary analysis assuming that there are no differentiating factors. Exponent wants to check whether the Patriots are obviously innocent, in which case there is no need to bother with a detailed study. Exponent expresses this in the executive summary and final conclusion on p. 64:

This difference in the magnitude of the decrease in average pressure between the Patriots and the Colts footballs, as measured at halftime, was determined to be statistically significant, regardless of which gauges were used pre-game and at halftime. Therefore, the reasons for this difference were an appropriate subject for further investigation. (Emphasis added.)


This analysis shows a 0.4% likelihood of the difference occurring under the assumption that there are no factors to vindicate the Patriots. However, this result only proves the need to check whether there are factors explaining the pressure difference. Later, the Exponent report will reveal that there are factors affecting pressure, rendering this analysis quite meaningless. Intentional or not, the statement of this result is confusingly labeled “Conclusions”, making this preliminary result easily misinterpreted as an actual final conclusion. The report is, after all, entitled “The Effect of Various Environmental and Physical Factors on the Measured Internal Pressure of NFL Footballs”: only the initial preliminary analysis assumes no such factors exist.

Step Two: Detailed Analysis with Unwanted Factors

There is a factor that affects halftime pressure measurements: the Patriots footballs were measured before the Colts footballs. The footballs are inside before the game, then outside in the cold for the first half, and then back inside for halftime measurement. As the football temperature falls and rises, so does the football pressure. It’s the Ideal Gas Law. Below, I have included figure 24 and figure 26 of the Exponent report to show there would be a pressure increase of roughly 0.6-0.7 psi during halftime. There are many curves in these figures because Exponent is examining wetness and gauge usage, but all the curves have roughly the same shape and increase. The exact measurement times are not known, but some generalities are known:

The Patriots footballs were measured before the Colts footballs so the Patriots footballs should have increased less.



The referees measured footballs during most of halftime so the increase between the first and last football measured would be almost the full 0.6-0.7 psi.



The Colts footballs were measured at the end of halftime because the referees ran out of time and only measured 4 footballs.



The referees inflated the Patriots footballs to 13.0 psi at halftime and this probably occurred in between measuring the Patriots and Colts footballs.


As a crude estimate, the Patriots football pressure drop should be roughly 0.3 psi less than the Colts or about half the full halftime increase. A finer estimate is 0.2-0.4 psi. Examples of times giving 0.2 and 0.4 psi increases are 6-10 and 3-10 minutes respectively.
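The Ideal Gas Law effect described above can be sketched numerically. The following is a minimal Python sketch, assuming a rigid football (constant volume), a standard atmospheric pressure of 14.7 psi, and illustrative indoor and field temperatures of 70°F and 48°F. None of these values come from this article, and the function names are hypothetical. It converts gauge pressure to absolute pressure, scales by the ratio of absolute temperatures, and converts back.

```python
# Sketch of the constant-volume Ideal Gas Law (P1 / T1 = P2 / T2) for a football.
# ASSUMPTIONS (not from the article): atmospheric pressure and both temperatures.

ATMOSPHERIC_PSI = 14.7  # assumed standard atmospheric pressure


def fahrenheit_to_rankine(temp_f: float) -> float:
    """Convert Fahrenheit to the absolute Rankine scale."""
    return temp_f + 459.67


def gauge_pressure_after_temp_change(
    gauge_psi: float,
    start_f: float,
    end_f: float,
    atmospheric_psi: float = ATMOSPHERIC_PSI,
) -> float:
    """Return the new gauge pressure after the ball's air moves from start_f to end_f."""
    absolute_start = gauge_psi + atmospheric_psi  # gauge -> absolute pressure
    ratio = fahrenheit_to_rankine(end_f) / fahrenheit_to_rankine(start_f)
    return absolute_start * ratio - atmospheric_psi  # absolute -> gauge pressure


if __name__ == "__main__":
    indoor_f, field_f = 70.0, 48.0  # hypothetical temperatures
    pregame_psi = 13.0              # a Colts pregame reading quoted in the article
    on_field = gauge_pressure_after_temp_change(pregame_psi, indoor_f, field_f)
    rewarmed = gauge_pressure_after_temp_change(on_field, field_f, indoor_f)
    print(f"on field: {on_field:.2f} psi, fully rewarmed: {rewarmed:.2f} psi")
```

Under these assumptions the pressure falls in the cold and recovers as the ball warms back up indoors, which is why the order of the halftime measurements matters.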


There are several scenarios Exponent considered. The referee said the Colts football pressure was either 13.0 or 13.1 psi before the game. There is an obvious error in the two gauge measurements (12.50, 12.95 psi) for the Colts third football because the first gauge was broken and always showed a measurement 0.4 psi higher than the other gauge. Exponent suggests two options, switching the measurement entries or discarding the measurement, and decides switching is preferred. Dropping the third football can be justified by arguments such as this: the error could have been a typo, and the second gauge measurement is really 12.05 or 12.15 psi. From the following table, the difference between the Patriots and Colts average pressure drops for these scenarios ranges from 0.57 to 0.73 psi. Therefore, the 0.3 psi time adjustment factor explains roughly half the difference, rendering the preliminary analysis meaningless as promised.


| Pregame Colts pressure (psi) | Switch 3rd measurement: average difference (psi) | Discard 3rd measurement: average difference (psi) |
|---|---|---|
| 13.0 | 0.73 | 0.67 |
| 13.1 | 0.63 | 0.57 |
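The "roughly half" claim can be checked with simple arithmetic. The short Python sketch below divides the crude 0.3 psi time adjustment by each average difference in the table above; the variable and function names are hypothetical, and the figures are copied from the table.

```python
# Share of the Patriots-Colts pressure-drop difference explained by the
# crude 0.3 psi time adjustment, for each scenario in the table above.

TIME_ADJUSTMENT_PSI = 0.3  # crude estimate from the text

# (Colts pregame psi, handling of the 3rd Colts measurement) -> average difference (psi)
AVERAGE_DIFFERENCES = {
    (13.0, "switch"): 0.73,
    (13.0, "discard"): 0.67,
    (13.1, "switch"): 0.63,
    (13.1, "discard"): 0.57,
}


def fraction_explained(difference_psi: float, adjustment_psi: float = TIME_ADJUSTMENT_PSI) -> float:
    """Return the fraction of the difference accounted for by the adjustment."""
    return adjustment_psi / difference_psi


if __name__ == "__main__":
    for (pregame_psi, handling), difference in AVERAGE_DIFFERENCES.items():
        share = fraction_explained(difference)
        print(f"pregame {pregame_psi} psi, {handling} 3rd: {share:.0%} explained")
```

Across the four scenarios the adjustment accounts for between about 41% and 53% of the difference.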

Although Exponent explores the time adjustment factor in detail, Exponent decides not to update the preliminary analysis to include this factor. This is either an accidental mistake or an intentional effort to avoid vindicating the Patriots. Instead, Exponent decides to analyze if the Patriots halftime pressures are plausible on their own rather than analyze the difference between the Patriots and Colts pressures. This is a much more complicated problem because the theoretical pressure value depends on a rat’s nest of unknown assumptions involving time, temperature, wetness and gauge usage. Exponent reproduces numbers consistent with the Patriots measurements, but only with assumptions they feel are unlikely. This conclusion is not presented in a statistical framework and there is no probability or statistical significance associated with this conclusion. This result is actually the Exponent report’s main conclusion.


Step Three: Do Not Update Preliminary Analysis with Unwanted Factors

Now it’s time to insert what Exponent omitted and update the preliminary analysis with an adjustment factor to account for the Patriots footballs being measured before the Colts footballs. The following results are for the Exponent statistical model used in the preliminary analysis with the adjustment added to the Patriots halftime measurements:

| Adjustment (psi) | Switch 3rd measurement: p-value | Discard 3rd measurement: p-value |
|---|---|---|
| 0.0 | 0.004 | 0.017 |
| 0.1 | 0.010 | 0.037 |
| 0.2 | 0.026 | 0.077 |
| 0.3 | 0.062 | 0.154 |
| 0.4 | 0.140 | 0.289 |
| 0.5 | 0.290 | 0.499 |


As in the Exponent report, I will use the p-value to measure the evidence against the Patriots. The p-value must be below 0.05 to be considered statistically significant, and a p-value above 0.10 shows little to no evidence of wrongdoing. Assuming the Colts pregame pressure is 13.1 psi instead of 13.0 psi is the same as adding 0.1 psi to the adjustment. The 0.0 psi adjustment results are the same as the preliminary statistical analysis in the Exponent report. Focus on the adjustments of 0.2 psi and larger. There are virtually no scenarios with statistically significant results, and many cases show no evidence of wrongdoing. If the 3rd measurement is switched, there is no statistical significance once the adjustment is at least 0.3 psi; if it is discarded, there is none once the adjustment is at least 0.2 psi. The conclusion from the preliminary analysis has been completely reversed by including this factor. The preliminary analysis claims there is a 0.4% probability of the difference occurring naturally; yet in truth, there is almost no evidence of wrongdoing and certainly nothing statistically significant.
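The thresholds above can be applied mechanically to the tabulated p-values. The Python sketch below does not rerun Exponent's statistical model; it only reads the p-values from the table and finds the smallest adjustment at which each scenario stops being significant. The names are hypothetical.

```python
# Apply the 0.05 and 0.10 thresholds from the text to the tabulated p-values.
# The p-values are copied from the table; the statistical model is NOT re-run.

SIGNIFICANCE_LEVEL = 0.05      # below this: statistically significant
LITTLE_EVIDENCE_LEVEL = 0.10   # above this: little to no evidence

# handling of the 3rd Colts measurement -> {adjustment (psi): p-value}
P_VALUES = {
    "switch": {0.0: 0.004, 0.1: 0.010, 0.2: 0.026, 0.3: 0.062, 0.4: 0.140, 0.5: 0.290},
    "discard": {0.0: 0.017, 0.1: 0.037, 0.2: 0.077, 0.3: 0.154, 0.4: 0.289, 0.5: 0.499},
}


def classify(p_value: float) -> str:
    """Label a p-value using the two thresholds described in the text."""
    if p_value < SIGNIFICANCE_LEVEL:
        return "statistically significant"
    if p_value > LITTLE_EVIDENCE_LEVEL:
        return "little to no evidence"
    return "not significant"


def smallest_adjustment_reaching(handling: str, threshold: float):
    """Return the smallest tabulated adjustment whose p-value is >= threshold, or None."""
    for adjustment in sorted(P_VALUES[handling]):
        if P_VALUES[handling][adjustment] >= threshold:
            return adjustment
    return None


if __name__ == "__main__":
    for handling, row in P_VALUES.items():
        for adjustment, p_value in sorted(row.items()):
            print(f"{handling:8s} adj={adjustment:.1f} p={p_value:.3f} -> {classify(p_value)}")
        print(handling, "loses significance at",
              smallest_adjustment_reaching(handling, SIGNIFICANCE_LEVEL), "psi")
```

To model a 13.1 psi Colts pregame pressure, look up the row 0.1 psi higher than the chosen adjustment, since the text treats the two as equivalent.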

Step Four: Reference Preliminary Analysis

Here’s the argument describing the statistical evidence of deflation, from page 114 of the Wells report:

According to both Exponent and Dr. Marlow, the difference in the average pressure drops between the Patriots and Colts footballs is statistically significant. …. In fact, when the halftime measurements are attributed to the gauges most likely to have generated those measurements, there is only a 0.4% likelihood—a fraction of one percent—that the difference in average pressure drops between the teams occurred by chance.


This is a direct quote from the Exponent report’s preliminary analysis. However, I just showed this is completely inaccurate because it doesn’t include the 0.3 psi adjustment to account for the Patriots footballs being measured before the Colts footballs. There is little to no statistical evidence of the difference between the Patriots and Colts football pressure drops being unnatural. The Wells report mentions the weak conclusions from the rest of the Exponent report, afterwards, as minor support, but the 0.4% likelihood is the basis of their statistical argument. Intentional or not, the blueprint has twisted no evidence into a 0.4% likelihood.

Jason Cohen has a PhD in Applied Probability and Statistics from Cornell University and works in quantitative finance. You can follow him on Twitter @jasonicohen.