## Did McNally Inflate One Football in the Washroom?

In today’s post,  I’m going to show the Deflategate data from a new perspective.   Rather than arguing about whether the Patriots used the Logo gauge, I’ve assumed, for the sake of argument, the NFL’s conclusion that the Non-Logo gauge was used, but gone further (as they ought to have done). I’ve “guessed” the amount of deflation that would be required to yield the observations. And, instead of only considering the overall average, I plotted each data point and how the “guessed” deflation would reconcile each data point.

Some very surprising results emerged, one of which raises the question in the title: did McNally inflate one football in the washroom?  If the question doesn’t seem to make sense, read on.

Rather than one guess being applicable to all measurements, I ended up needing four different groups each with a different guessed deflation.  A “good” guess (i.e. one that “worked”) for the majority of balls (7) was 0.38 psi – an interesting number that I’ll discuss in the post.  A good guess for two balls was zero deflation.  But for ball #7, it was necessary to assume that it had been inflated by approximately 0.5 psi in the washroom. One ball was lower than the others (0.76 psi) and remains hard to explain.  The Wells Report reasonably drew attention to variability, but did not address the details of actual variability other than arm-waving and did not actually show that erratic washroom deflation was a plausible explanation for observed variability.

While the approach in today’s post doesn’t appear conceptual,  statistical algorithms, including linear regression,  typically solve inverse problems.  The spirit of today’s post is approaching Deflategate as an inverse problem.  In doing so, I am aware (as Carrick has forcefully observed) that the underlying physical conditions were poorly defined, but people still need to make decisions using the available information as best they can.  I think that the approach in today’s post provides a much more plausible and satisfying explanation of the variation in Patriot pressures than those presented by either Exponent or Snyder or, for that matter, my own previous commentary.

Bear with the explanation of context, as the results are interesting.

Context

The Wells Report reported that the standard deviation of Patriot pressure measurements was 0.40 psi (11 balls), as compared to 0.144 psi (4 balls) for the Colt balls.   Using a standard F-test (and Levene test), this difference is not “statistically significant”.   Snyder’s second criticism was that this finding should have been an end to this particular line of argument.   However, this was not what happened.  One of Exponent’s most important technical findings – indeed, it is the finding that directly precedes their conclusions – was that the higher variability of Patriot footballs relative to Colt footballs was most plausibly explained by the footballs not starting the game “at or near the same pressure”:

Specifically, the fluctuations in the halftime pressures of Patriots footballs exceed in magnitude the fluctuations that can be attributed to the combined effects of the various physical, usage, and environmental factors we examined. Therefore, subject to discovery of an as yet unidentified and unexamined factor, the most plausible explanation for the variability in the Patriots halftime measurements is that the 11 Patriots footballs measured by the officials at halftime did not all start the game at or near the same pressure.

This finding, quoted verbatim, was one of the most critical findings of the Wells Report.

I agree with their conclusion in the very narrow language in which it is expressed – language in which each word matters:

the 11 Patriots footballs measured by the officials at halftime did not all start the game at or near the same pressure

On May 6 – the same date that the Wells Report was released – Exponent submitted a report on simulations in which three untrained employees attempted to deflate 12 balls in 1 minute 40 seconds with a standard needle.  The subjects, after one try, achieved remarkably consistent results, both between subjects and between balls. The average deflation was 0.76 psi with a standard deviation of 0.11 psi. This finding was not discussed in the Wells Report, but it poses a conundrum: the observed variability in Exponent’s washroom simulations is much too low to explain observed Patriot variability.
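The size of the conundrum can be sketched with simple arithmetic. The snippet below is only a rough illustration, and it rests on two assumptions that are mine rather than Exponent’s: that the Colt standard deviation (0.144 psi) is a fair stand-in for "natural" ball-to-ball variability, and that deflation scatter is independent of it, so that variances add.

```python
import math

COLT_SD_PSI = 0.144        # Colt halftime standard deviation (4 balls)
SIM_DEFLATION_SD_PSI = 0.11  # scatter in Exponent's deflation simulations
PATRIOT_SD_PSI = 0.40      # Patriot halftime standard deviation (11 balls)

def combined_sd(baseline_sd, deflation_sd):
    """Standard deviation of the sum of two independent sources of scatter."""
    return math.hypot(baseline_sd, deflation_sd)

# Assumption: Colt scatter approximates the baseline for undisturbed balls.
expected_sd = combined_sd(COLT_SD_PSI, SIM_DEFLATION_SD_PSI)
shortfall = PATRIOT_SD_PSI - expected_sd
print(f"expected SD with simulated deflation: {expected_sd:.3f} psi")
print(f"observed Patriot SD exceeds it by:    {shortfall:.3f} psi")
```

Under these assumptions the combined scatter stays well below the observed 0.40 psi, which is the sense in which simulated deflation is "too consistent" to account for the Patriot variability.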

Indeed, the specific pattern of Patriot pressure variability, when examined in detail, in my opinion, argues against washroom deflation and towards a quite different explanation of the quite unusual actual pattern of variability.

The “Inverse Problem”
The diagram below summarizes my estimates of how much deflation would be required to yield the observations – all measurements are based on the Non-Logo gauge, which was calibrated as being relatively close to the Master Gauge. There’s a lot of information in the diagram (and I’ll post a script deriving the diagrams as documentation). In each panel, I’ve converted pressures to ball temperature using the Ideal Gas Law. This facilitates direct comparison of information from Colt balls to Patriot balls.  I’ve also shown a negative exponential “corridor” from the upper and lower Colt measurements to halftime temperatures for dry and wet footballs. The corridor is presented here as a rough-and-ready but plausible guide.
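For readers who want to reproduce the conversion, here is a minimal sketch of the two ingredients described above: the Ideal Gas Law conversion from gauge pressure to implied ball temperature, and a negative exponential warming curve of the kind used for the corridor. It is not the script behind the diagram. The atmospheric pressure (14.7 psi), the pregame temperature (71 deg F) and the time constant `tau_minutes` are assumptions chosen for illustration.

```python
import math

ATM_PSI = 14.7          # assumed atmospheric pressure (psi)
PREGAME_PSIG = 12.5     # Patriot pregame gauge pressure
PREGAME_TEMP_F = 71.0   # assumed pregame temperature (illustrative only)
RANKINE_OFFSET = 459.67

def implied_temp_f(measured_psig, start_psig=PREGAME_PSIG,
                   start_temp_f=PREGAME_TEMP_F):
    """Temperature implied by a gauge reading at constant volume.

    Uses P_abs / T_abs = constant, with absolute pressure = gauge + ATM_PSI.
    """
    ratio = (measured_psig + ATM_PSI) / (start_psig + ATM_PSI)
    return (start_temp_f + RANKINE_OFFSET) * ratio - RANKINE_OFFSET

def transient_temp_f(minutes, field_temp_f, room_temp_f, tau_minutes):
    """Negative exponential approach from field to room temperature.

    tau_minutes is a hypothetical time constant, not a measured value.
    """
    gap = room_temp_f - field_temp_f
    return room_temp_f - gap * math.exp(-minutes / tau_minutes)

if __name__ == "__main__":
    for psig in (12.5, 12.0, 11.5, 11.0):
        print(f"{psig:5.2f} psig -> {implied_temp_f(psig):5.1f} deg F")
```

Expressing every reading as a temperature is what allows Colt balls (set to a different pregame pressure) and Patriot balls to be placed on the same axis.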

In the left panel, I’ve shown the implied temperatures (pressures) using the half-time Non-Logo measurements. In the right panel, I’ve shown the implied temperatures (pressures) after applying the hypothesized deflation (inflation) shown for each ellipse in the left panel. In each case, I’ve “guessed” the amount of deflation/inflation to apply. Algorithms for many inverse problems begin with a guess, so this is not quite as ad hoc as it looks.

For the largest group of balls (seven), I “guessed” that the deflation would be 0.38 psi. When the implied temperatures were re-calculated using this guess, they fit nicely within the corridor (red + signs). However, if this deflation were applied to the three balls in the two upper ellipses, they end up well above the corridor and this estimate is insufficient for one ball.

Figure 1. Left panel – Nonlogo ball pressures expressed in deg F using the Ideal Gas Law.  The eleven Patriot measurements are divided into three groups indicated by the ellipses.

Next consider the two balls in the ellipse on the upper border of the corridor on the left panel.  If these balls had also been deflated by 0.38 psi, their implied temperature in the right panel would be much too high. Indeed, any deflation for these two balls would move them too high in the right panel.  For these two balls, I “guessed” that there was zero deflation and, using this assumption, they fit nicely in the right panel.

Now for the two outlier balls – one of which is too “warm” and one of which is too “cold”.  The only way of reconciling ball #7 (upper ellipse) is to assume that it had been inflated.   I tried a “guess” of 0.5 psi inflation and, under this assumption, the implied temperature (pressure) fit nicely in the corridor in the right panel.  For the cold outlier, I guessed 0.76 psi deflation and, under this assumption, it too fit into the corridor.

Summarizing the reverse engineering: one can arrive at observed pressures under Non-Logo pregame initialization if one assumes that two balls were left alone, seven balls were deflated by 0.38 psi, one ball deflated by 0.76 psi and one ball inflated by 0.5 psi.   These are not the only solutions to the inverse problem, but they give a yardstick.
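The guess-and-check step can be sketched as follows. This is one possible way of applying a guessed adjustment (shifting the effective starting pressure before converting to temperature) and is not necessarily how the diagram was produced. The atmospheric pressure and the 71 deg F pregame temperature are assumptions. The only measurement used is the 11.85 psi Non-Logo halftime reading for ball #7.

```python
ATM_PSI = 14.7          # assumed atmospheric pressure (psi)
PREGAME_PSIG = 12.5     # nominal Patriot pregame setting
PREGAME_TEMP_F = 71.0   # assumed pregame temperature (illustrative only)
RANKINE_OFFSET = 459.67

# Guessed adjustments from the summary above: (number of balls, psi change).
# Negative = deflation, positive = inflation.
GUESSES = {
    "left alone": (2, 0.0),
    "main group": (7, -0.38),
    "cold outlier": (1, -0.76),
    "ball #7": (1, +0.5),
}

def implied_temp_f(measured_psig, adjustment_psi=0.0):
    """Implied temperature if the ball effectively started at
    PREGAME_PSIG + adjustment_psi (constant-volume Ideal Gas Law)."""
    start_abs = PREGAME_PSIG + adjustment_psi + ATM_PSI
    ratio = (measured_psig + ATM_PSI) / start_abs
    return (PREGAME_TEMP_F + RANKINE_OFFSET) * ratio - RANKINE_OFFSET

total_balls = sum(count for count, _ in GUESSES.values())

# Ball #7: 11.85 psi on the Non-Logo gauge at halftime.
raw_temp = implied_temp_f(11.85)
adjusted_temp = implied_temp_f(11.85, GUESSES["ball #7"][1])
print(f"balls accounted for: {total_balls}")
print(f"ball #7 without adjustment: {raw_temp:.1f} deg F")
print(f"ball #7 with +0.5 psi guess: {adjusted_temp:.1f} deg F")
```

The same function can be applied to any other reading: a guess "works" when the adjusted temperature falls inside the corridor at the time the ball was measured.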

Exponent’s Deflation Simulations

On May 6 – the same date that the Wells Report was released – Exponent submitted a report on simulations in which three untrained employees attempted to deflate 12 balls in 1 minute 40 seconds with a standard needle.  The subjects, after one try, achieved remarkably consistent results, both between subjects and between balls. The average deflation was 0.76 psi with a standard deviation of 0.11 psi.

Discussion

The -0.38 psi Group

First, the largest group of balls implies deflation of 0.38 psi using the Non-Logo gauge – an amount that, by “coincidence”, is exactly equal to the bias between Anderson’s Logo Gauge and Non-Logo gauge.  An alternative explanation for these particular observations is that these balls were measured pregame using the Logo gauge.  Exponent (and the NFL) rejected the possibility of Logo gauge initialization of Patriot balls, because Anderson’s pregame measurements were more or less consistent with pregame measurements by the two teams themselves.  This topic has been discussed before.  I’ll return to it after discussing the other groups.

The 0 psi Group

Secondly, notwithstanding my past advocacy of Logo gauge initialization of Patriot balls, it is implausible that the two balls in the 0 psi group were initialized pregame with the Logo Gauge, since their implied temperature would be too warm. (They would move well above the corridor in the right panel.)  If these two balls were initialized with the Non-Logo gauge and not deflated, they fit nicely.  So is there any evidence or record of two Patriot balls being treated differently?   How about this:

Anderson recalls that most of the Patriots footballs measured 12.5 psi, though there may have been one or two that measured 12.6 psi. No air was added to or released from these balls because they were within the permissible range. According to Anderson, two of the game balls provided by the Patriots measured below the 12.5 psi threshold. Yette used the air pump provided by the Patriots to inflate those footballs, explaining that he “purposefully overshot” the range (because it is hard to be precise when adding air), and then gave the footballs back to Anderson, who used the air release valve on his gauge to reduce the pressure down to 12.5 psi.

We already know that none of the officials at half-time paid any attention to which gauge they were using and inattentively switched gauges between Patriot and Colt measurements.  Once Anderson put his gauge back into his pocket, it would be random which gauge was used next.  Suppose that Anderson measured eleven Patriot balls using the Logo gauge, finding two underinflated even with this gauge, and then put his gauge back into his pocket, making a fresh draw when it was time to re-gauge the two Patriot balls. But this time, he drew the Non-Logo gauge and deflated the Patriot balls to 12.5 psi (Non-Logo gauge), yielding the two balls in this second group.

The Outliers

Ball #7 in the above solution to the inverse problem has an estimated inflation of 0.5 psi. The reverse engineered (inverse) inflation of 0.5 psi is, by another “coincidence”, the difference in pregame pressure between Patriot and Colt balls. Indeed, this prompted the guess.  Could it be possible that one of the Colt balls ended up among the Patriot balls?  When I looked back, it turned out that there was contemporary evidence of such a possibility.   A news story shortly after the game reports an interview with D’Qwell Jackson, a Colt linebacker who intercepted a Brady pass in the second quarter.  Jackson said that the Patriots were using a Colt football “late in the first half”:

Jackson does, however, recall one interesting moment during the first half that has something to do with the latest controversy. He recalls, during a television timeout, there was an especially long delay that prompted him to approach an official.

The game official mentioned something about their efforts to locate a usable football. Shortly after, Jackson noticed that the Patriots were using the Colts‘ footballs late in the first half. Jackson said it was odd to him that New England couldn’t find a football to use, especially in the AFC Championship Game.

It seems crazy that one of the “Patriot” balls was actually a Colt ball, but the extra pressure in ball #7 requires an explanation. The idea of McNally inflating ball #7 in the washroom seems even crazier.

This leaves the single “cold” outlier.  Its implied deflation (Non-Logo initialization) is about 0.76 psi – a typical amount observed in Exponent’s deflation simulation. Ironically this illustrates a major problem in Exponent’s deflation simulations: if the 12 Patriot balls had been deflated according to the Exponent simulations, they would all be in this range. Instead, ten of the eleven balls measured at halftime are well above this level.  On the other hand, the pressure (temperature) is too low to be explained simply by the use of the Logo gauge.

Variability in the Wells Report

With the above background on nuances of variability, I’ll now consider the analysis in the Wells Report and Snyder’s criticism.

The Wells Report reported that the standard deviation of Patriot pressure measurements was 0.40 psi (11 balls), as compared to 0.144 psi (4 balls) for the Colt balls, but its authors were unable to find that this difference was “statistically significant” because the small number of Colt balls meant that standard tests had little power.
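The lack of significance can be checked from the two reported standard deviations alone. The sketch below estimates a one-sided p-value for the variance ratio by Monte Carlo simulation under the null hypothesis of equal variances and normally distributed measurements. It is a stand-in for the standard F-test (not Exponent’s calculation), and the trial count and seed are arbitrary choices of mine.

```python
import random

def sample_variance(values):
    """Unbiased sample variance (divides by n - 1)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)

def variance_ratio_p_value(sd_a, n_a, sd_b, n_b, trials=20000, seed=0):
    """One-sided Monte Carlo p-value for the variance ratio sd_a^2 / sd_b^2.

    Under the null, both samples are drawn from the same normal
    distribution; we count how often the simulated ratio is at least
    as large as the observed one.
    """
    observed = (sd_a / sd_b) ** 2
    rng = random.Random(seed)
    exceed = 0
    for _ in range(trials):
        a = [rng.gauss(0.0, 1.0) for _ in range(n_a)]
        b = [rng.gauss(0.0, 1.0) for _ in range(n_b)]
        if sample_variance(a) / sample_variance(b) >= observed:
            exceed += 1
    return observed, exceed / trials

# Patriot SD 0.40 psi (11 balls) versus Colt SD 0.144 psi (4 balls).
ratio, p_value = variance_ratio_p_value(0.40, 11, 0.144, 4)
print(f"variance ratio: {ratio:.2f}, one-sided p-value: {p_value:.3f}")
```

With only four Colt balls the denominator variance is so poorly determined that even a ratio of nearly eight lands between the 5% and 10% levels, which illustrates the low power mentioned above.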

In their simulations, Exponent observed that the difference in measurements between extremes of dry and wet balls was relatively small as follows:

“the maximum differential observed between the dry and wet footballs tested under the same conditions was only approximately 0.3 psig”.

This text is somewhat inconsistent with Figure 27 which shows a differential of ~0.5 psi, but that’s a different story. Either way, the observed pairwise differences between Patriot measurements (“fluctuations” in Exponent’s terminology) were much higher than those observed in their simulations – an issue that Snyder did not confront.  Exponent presented the table of pairwise differences shown below:

Table 1.  Exponent’s table showing pairwise differences in Patriot measurements. Exponent commented as follows: “There are seven pairs of measurements (highlighted in orange and red) in which the drop in pressure between the earlier ball tested and the later ball tested is greater than or equal to 0.75 psig, and there are three pairs of measurements (highlighted in red) in which the drop in pressure between the earlier ball tested and later ball tested is greater than or equal to 1.0 psig.”

From this, they concluded that variations in wetness could not account for the very large variations in Patriot ball pressures and that the Patriot balls measured by officials did not “all start the game at or near the same pressure” (though they didn’t define “near”):

Specifically, the fluctuations in the halftime pressures of Patriots footballs exceed in magnitude the fluctuations that can be attributed to the combined effects of the various physical, usage, and environmental factors we examined. Therefore, subject to discovery of an as yet unidentified and unexamined factor, the most plausible explanation for the variability in the Patriots halftime measurements is that the 11 Patriots footballs measured by the officials at halftime did not all start the game at or near the same pressure.

However, these findings need to be interpreted in light of the analysis in Figure 1.  The orange and red cells all occur in rows 1, 6 and 7, rows that correspond to the three balls in the two upper ellipses in Figure 1.  In the analysis presented above, these three balls did not start the game at the same pressure as the other eight balls because of different gauging and selection, not because of washroom deflation.

Snyder

Brady’s expert witness, Edward Snyder, argued that it was “improper” to proceed with further comparison of variations, once the initial comparison had not yielded a “statistically significant” result:

Secondly, Exponent looked at the variation and the measurements between the Patriots’ balls and the Colts’ balls at halftime. They compared the variances. And despite conceding that there was no statistically significant difference between the two, they went ahead and drew conclusions, but those conclusions are improper.

Later, Snyder expounded further that it was not “sound practice” to draw conclusions from an analysis that did not result in a finding of “statistical significance”:

Q. Let’s go to the next slide. And what did Exponent conclude as a statistical matter about variability?
A. No statistical — no statistically significant difference.
Q. Did they stop there?
A. No. They continued, which is striking, because, whereas in the difference in difference analysis, they adopted the standard five percent as the benchmark, here, they said, no, we will just continue on and reach conclusions. And it’s right here at the bottom. So without having found anything that’s nevertheless they have a statement that begins in their report, “therefore.”

Q. And in your experience, as a statistical matter, is it a sound practice to draw conclusions  from an analysis which doesn’t reach statistical significance?
A. No.

Exponent rejected Snyder’s criticism on this (and other points) on two grounds. First, their analysis was not limited to the comparison of variability between Patriot and Colt balls, but also included comparison to their simulations.

Then we went and did all that physical testing. We saw the effect of all those other parameters, the effect or no effect of those parameters. We looked at that and then we went back and looked at the variability of the data comparing, at the same time looking at the variation of the balls, individual balls. And could we account in the difference in pressures based on other physical factors. And the ranges and variability of factors were not predicted by the effect of, say, ball wetness and ball dryness that we saw. So we went back and said, you know, there is variability in here. The statistical analysis you can’t conclude, but based on a review of the fluctuations in the data and looking at the physical experiments that we did, we concluded that there is a difference there and that difference is most likely the differences in starting pressure of the footballs, two different analyses. The statistical analysis did not preclude us from going back and looking at the physical realities that we measured. And that’s what we did to come to that conclusion.

Second, they also argued that the p-value of the F-test entitled them to take notice, even if it was greater than 5%:

And similarly, if you take Case 2 which is making a larger adjustment, the reported p-value’s in the neighborhood of .2, a little bit above .2. Again, if you do the analysis without imposing an equal variances assumption, you get a p-value that’s below ten percent. So it’s statistically significant at the ten percent level, not at the five percent level. The other important point in thinking about statistical significance is that it’s not a black or white line at .05. And there’s no direct way that you can connect .05 certainly to a legal standard for preponderance of evidence. So it’s not that if you are .04, it’s more likely than not, and if you are .06, it’s less likely than not. We have to be clear about that. So for all of those reasons, I think that first finding is without foundation.

Because Exponent had also purported to justify their statistical analyses without consideration of timing as a sort of “preliminary” gatekeeping, Kessler scored some points as to why they didn’t do the same thing with their analysis of differences in variability.  In my opinion, this was a better rhetorical than analytic point.   I don’t have any issue with Exponent analysing variability; only that they didn’t do enough analysis or that their analysis wasn’t insightful enough.

Snyder also speculated that the greater variability among Patriot balls arose from timing, pointing out that the Colt balls, being measured later, were closer to the asymptote, but did not quantify the impact of this observation.

Q. Even putting aside the fact that Exponent’s results were not statistically significant, are you aware of any explanation for greater variability among Patriots’ balls compared to Colts’ balls?

A. I’m not here to offer scientific insights. I don’t know if the first-half conditions could lead to more variance. I’m just going to focus on the scientific guidance provided by Exponent. And recognizing that the Colts’ balls were measured some time in here (indicating). They are measured at a relatively flat part of the curve (indicating). And if you sample from a relatively flat part of the curve, you get less variance. And this was not considered by Exponent when they made this comparison and reached the “therefore” conclusion.

While timing has some effect on the variability, in my opinion, it is a secondary issue, and does not explain the actual variability.

Conclusion

The variability in Patriot pressures is larger than the variability in Colt pressures, but the form of variability is very odd and, in my opinion, is more indicative of inconsistent gauges and even mistaken inclusion of a Colt ball, than of erratic washroom deflation.

Exponent dismissed the idea that Patriot balls might have been initialized using the Logo gauge on the grounds that Anderson’s pregame measurements were more or less consistent with each team’s own measurements prior to tendering the balls.   I’ve argued (as had AEI and MacKinnon) that it was entirely possible that Anderson switched gauges between Patriot and Colt pregame measurements (as did NFL officials at halftime even under its heightened scrutiny).  This scenario removes the similarity of Colt team pregame measurements as an issue, since it adopts the assumption that Colt balls were initialized with the Non-Logo gauge, and only requires similarity between Patriot pregame measurements and Anderson’s Logo gauge. There are two ways that this could have happened, both of which were alluded to in Kessler’s examination. One possibility is that the Patriot gauge, like Anderson’s gauge, was older and had gone slightly off calibration.  A second possibility is that the Patriots had done their pregame measurements while the balls were still warm from gloving, rather than waiting for them to cool down.  While Kessler raised these issues, he didn’t close off either one, getting lost on the relevance of gloving after setting up the issue.  The other alternative – and one not squarely addressed in the hearing – was that the Patriots had, for some inexplicable reason, deflated their balls by the implausibly small amount of 0.38 psi, an amount that is the exact amount of the inter-gauge bias of Anderson’s gauge – a coincidence that, in my humble opinion, is wildly more improbable than the Patriots using an old and slightly off-calibration gauge or doing their pregame measurements when the balls were still slightly warm from gloving.

However, as discussed above, simply assuming Logo gauge calibration doesn’t solve the problem of variability and, in particular, it creates real interpretation problems for three balls that end up being too “warm”, for which I’ve proposed other alternatives. I recognize that reasonable people may regard these alternative explanations as themselves implausible. But, when the details are examined, erratic washroom deflation is not nearly as plausible an explanation for the observed variation as Brady critics assume.

1. Posted Aug 10, 2015 at 5:14 PM | Permalink

You are so much more thorough than Exponent! Thank you for sharing your many analyses.

In support of your suggestion that a Colts ball was included among the Pats balls are footnotes 3 and 47 in Exponent’s report, “According to information provided by Paul, Weiss, we understand that the Patriots may have delivered 13 primary balls prior to the game, but it is clear that only 11 were measured at halftime.” Additionally, see footnote 39 in Wells; the deflation exercise that Exponent reported in Appendix 2 was also on 13 balls.

• MikeN
Posted Aug 10, 2015 at 7:03 PM | Permalink

One ball was intercepted, and one was caught for a touchdown and taken out, and 11 were measured at halftime. This is the basis for concluding they started with 13. No one remembers the Pats actually preparing 13 balls.

2. Posted Aug 10, 2015 at 5:55 PM | Permalink

deserves a bold imho:

The other alternative – and one not squarely addressed in the hearing – was that the Patriots had, for some inexplicable reason, deflated their balls by the implausibly small amount of 0.38 psi, an amount that is the exact amount of the inter-gauge bias of Anderson’s gauge

3. Chris
Posted Aug 10, 2015 at 6:46 PM | Permalink

Hi Steve:

What is the variance of the group of balls that may have been measured by the logo gauge? I assume it’s much closer to the colts balls.

All of the explanations you make are plausible. This is the problem with the lack of information on the pregame procedure and two gauges that are very different.

I still don’t understand why the Pats (and Tom Brady) would try to cheat to save 0.38psi. I seriously doubt it would make any difference.

Chris

• Chris
Posted Aug 11, 2015 at 11:09 AM | Permalink

To answer my own question, the standard deviation of the -0.38 group of balls is 0.16.

4. MikeN
Posted Aug 10, 2015 at 7:10 PM | Permalink

Another possibility, we are only seeing one group of Colt footballs. It could be that the other 8 balls have greater variability. There is about a 12% chance of getting a range <.45 by picking 4 of the Pats 11 footballs. About 20% for .7 or less.
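A percentage of this kind can be obtained by brute force, since there are only 330 ways of choosing 4 balls from 11. The sketch below shows the enumeration. The function name is hypothetical, and the numbers in the usage example are an abstract toy list, not the halftime measurements, which would need to be supplied from the Wells Report.

```python
from itertools import combinations
from math import comb

def share_of_subsets_within_range(values, subset_size, max_range):
    """Fraction of all subsets of the given size whose range
    (max minus min) is strictly less than max_range."""
    subsets = list(combinations(values, subset_size))
    hits = sum(1 for s in subsets if max(s) - min(s) < max_range)
    return hits / len(subsets)

# Number of distinct 4-ball draws from 11 balls.
n_draws = comb(11, 4)

# Toy illustration only (NOT football pressures): pairs from four points.
toy_share = share_of_subsets_within_range([0.0, 1.0, 2.0, 3.0], 2, 1.5)
print(n_draws, toy_share)
```

Passing the eleven Patriot readings with `subset_size=4` and `max_range=0.45` would give the corresponding fraction for the case described above.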

• Steve McIntyre
Posted Aug 11, 2015 at 12:20 AM | Permalink

Another possibility, we are only seeing one group of Colt footballs. It could be that the other 8 balls have greater variability. There is about a 12% chance of getting a range <.45 by picking 4 of the Pats 11 footballs. About 20% for .7 or less.

It’s possible, but it’s also possible – and IMO much more likely – that the higher variability of Patriot footballs is because they are more variable. In this post, I’ve tried to argue that such higher variability does not necessarily imply deflation, as opposed to NFL cock-ups.

In retrospect, I should perhaps have focused more on ball #7. It had a Non-Logo halftime measurement of 11.85 psi, only 0.3 psi less than the Non-Logo measurement of the last Colt ball (12.15 psi), with the Logo gauge differential 0.25 psi. One has to assume pressure gain of 0.3 psi or so, so the pressure of Patriot ball #7 is more or less equivalent to Colt ball #15, which is supposed to have been set 0.5-0.6 psi higher. This is not easy to explain, especially if the Patriot ball is supposed to have been deflated by a “substantial” amount.

5. MikeN
Posted Aug 10, 2015 at 7:15 PM | Permalink

For the two footballs inflated by Anderson, there is the possibility that the internal temperature of the air in those footballs would have increased due to inflation, by about 1.5 °C:
T2 = T1 (P2/P1)^n, where n = (γ-1)/γ and γ = 1.4 for air.
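The adiabatic estimate can be checked numerically. The sketch below applies the formula with absolute pressures (gauge plus an assumed 14.7 psi atmosphere). The starting pressure of 12.0 psig and the starting temperature of 21 °C are hypothetical values chosen for illustration, since the comment only says that the two balls were below 12.5 psi.

```python
GAMMA_AIR = 1.4   # ratio of specific heats for air
ATM_PSI = 14.7    # assumed atmospheric pressure (psi)

def adiabatic_final_temp_k(t1_kelvin, p1_psig, p2_psig, gamma=GAMMA_AIR):
    """Final temperature after adiabatic compression from p1 to p2.

    T2 = T1 * (P2 / P1) ** ((gamma - 1) / gamma), with P in absolute units.
    """
    exponent = (gamma - 1.0) / gamma
    pressure_ratio = (p2_psig + ATM_PSI) / (p1_psig + ATM_PSI)
    return t1_kelvin * pressure_ratio ** exponent

# Hypothetical case: a ball at 12.0 psig and 21 C pumped up to 12.5 psig.
start_kelvin = 21.0 + 273.15
rise = adiabatic_final_temp_k(start_kelvin, 12.0, 12.5) - start_kelvin
print(f"adiabatic temperature rise: {rise:.2f} K")
```

A rise of this size would be an upper bound, since real inflation is not perfectly adiabatic and the air quickly exchanges heat with the bladder.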

6. Posted Aug 10, 2015 at 9:22 PM | Permalink

As always your analysis is thorough and compelling. But all this seems to be immaterial given the allegations made by the league and the issues under consideration by the court.

It is surprising that the league would spend \$2,500,000 on an investigation concerning the results of measurements from a gauge that costs \$22 on Amazon. The money would have been better spent on a league wide metrology program.

Steve: I realize that the appeal issues concern the CBA and whether Goodell’s disposition of the case is justified under the CBA. I just happen to think that there’s an interesting analytic situation. While it doesn’t seem related to climate, I think that at least some readers might discern why I’m interested in the matter for reasons outside of football.

• MikeN
Posted Aug 11, 2015 at 1:16 AM | Permalink

You mean natural variation causing a change that has been blamed on man?

• Posted Aug 11, 2015 at 3:00 PM | Permalink

There are reasons outside of football? Fans won’t be happy.

• MikeN
Posted Aug 12, 2015 at 3:41 PM | Permalink

Could you put up a comparison of the Pats footballs with the 12 YAD trees?

7. Posted Aug 11, 2015 at 4:37 AM | Permalink

In light of their original blasé attitude regarding the accuracy of the football pressure observations, the NFL’s post hoc concern and the case they’ve built around flimsy data are astounding.

The responses by those who disagree with the NFL’s charges, New England sports talk radio for example, are eerily familiar. “Deflategate” has been locally redubbed as “Framegate.” Apparently the league and the other owners were out to get the Patriots. The story continues that the same owners would now have Goodell fired for handling the situation so poorly.

8. Salamano
Posted Aug 11, 2015 at 10:40 AM | Permalink

Is there a way to submit all this kind of stuff as a sort-of Amicus to Judge Richard M. Berman, to assist in his helping the parties resolve the situation and/or rule on the situation?

9. John Faulstich
Posted Aug 11, 2015 at 10:59 AM | Permalink

Berman will not rule on the merits of the question as to whether there was deflation of the balls, but rather on the process by which the NFL judged Brady, and determined and imposed the penalty.

But if I am Brady or Kessler I would weave the numerous errors by Wells and Exponent into why I would not accept a compromised penalty including suspension.

10. mpainter
Posted Aug 11, 2015 at 11:11 AM | Permalink

“Once Anderson put his gauge back into his pocket, it would be random which gauge was used next.”

###

This is important: Anderson was carrying two gauges around in his pocket.

Has it been stated at what point it became generally realized that his Logo gauge was off by 0.38 psi?

11. John Faulstich
Posted Aug 11, 2015 at 11:14 AM | Permalink

It should have been apparent January 19 when they took the halftime and post game measurements by the two gauges – perhaps not the exact offset, but the fact that there was an offset of about 0.4 psi.

12. JakeM
Posted Aug 11, 2015 at 11:47 AM | Permalink

Steve,

A retired research scientist has found another serious problem with Exponent’s transient curves. Not in the math, but in the underlying experimental methodology to create them:

“Exponent used a football on a stand to predict the temperature increase and hence pressure increase of the footballs while they were in the locker room at half time.”

“Exponent carried out an experiment where the football on the stand was cooled to 48°F, which was the temperature of the footballs when they were brought in from the field at half time.

The football on the stand was then surrounded by 72°F air to simulate conditions in the locker room and warms up relatively quickly. As the temperature increases the pressure increases.

The Wells report states that it was raining at half time and the temperature on the field was 48°F. The ball boys collected the Patriot’s game balls from the field and placed them in a ball bag. The ball bag and footballs were cold and wet and as mentioned previously they were taken to the officials locker room and placed against the back wall in the ball bag. They remained in the ball bag until they were tested by the officials.

The 72°F air in the locker room can’t get to the footballs in the ball bag to warm them up in the same manner as the football on the stand so they remain cold and the pressure does not increase significantly.”

Exponent’s transient curves are much steeper than if they had simulated the real conditions.

I have some issues with some of his work, and he is a Patriots fan, but his main point regarding the balls being in a more insulated environment up until testing is a very valid one.

http://www.deflategatedeflated.com/

Steve: thanks for the link. I too had wondered about the ball bag and thought that this effect needed to be studied. Since even a light windbreaker retards heat exchange, it seemed plausible to me that the ball bag would function like a windbreaker. It’s interesting to see the numbers. Though it seems to me that he proves too much – his transients are much too low for the Colt balls. I’ll post on this.

• JakeM
Posted Aug 11, 2015 at 1:10 PM | Permalink

The fault I find is that, just as Exponent’s curve is artificially too steep, his is likely too flat.

In his experiment he should have removed balls one at a time at proper intervals to simulate real events. As the number of balls in the bag decreased, one would surmise that their insulating effect would decrease too.

• Steve McIntyre
Posted Aug 11, 2015 at 1:40 PM | Permalink

As discussed previously, Exponent’s transients for the Logo gauge weren’t calculated with the Logo gauge.

For my own calculations, I’ve limited myself to the implied transient to Colt measurements from half-time temperatures, with a plausible differential for wet balls. These give transients that seem plausible.

• JakeM
Posted Aug 11, 2015 at 7:13 PM | Permalink

The exact location of each ball in the bag may also explain some variation. A ball at the outside edge of the bag, insulated mostly by the bag itself, may warm faster than a ball surrounded by other cold balls and further away from warm air exposure.

Unaware of temperature effects, an official could easily grab a ball in the center of the bag, followed by one close to the surface, next one down in the bag, etc.

• mpainter
Posted Aug 11, 2015 at 1:58 PM | Permalink

The rate of re-pressurization would have been dependent on the wetness of the ball, and I believe that a simulation would show that a wet ball, taken from the bag, would warm more slowly than a dry ball.

There are two reasons for such a supposition:

1. The heat capacity of water is high and a wet ball would absorb more heat than a dry ball, for any given incremental rise in ball temperature. This simply means that a wet ball would require more heat than a dry ball to warm to room temperature, hence warming would be delayed.

2. The relative humidity of the official’s room, where the ball pressure was measured at halftime, was low, at 20%. This means that much of the heat absorbed from the air by a wet ball would be returned to the air of the official’s room as latent heat of evaporation. This would have retarded the warming of the ball.

IMO, the Wells Report fails to take proper account of the effect of ball wetness on the rate of warming.
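The first of the two reasons above can be sketched with a lumped heat-capacity model, in which the warming time constant is proportional to the total heat capacity of the ball plus any absorbed water. This is only an illustration under stated assumptions: the ball mass (0.41 kg), the effective specific heat of the ball (1500 J/kg·K) and the mass of absorbed water (20 g) are hypothetical, and the sketch ignores the evaporative cooling described in the second reason, which would slow warming further.

```python
WATER_SPECIFIC_HEAT = 4186.0  # J/(kg*K), specific heat of liquid water


def warming_time_constant_ratio(ball_mass_kg, ball_specific_heat,
                                water_mass_kg,
                                water_specific_heat=WATER_SPECIFIC_HEAT):
    """Ratio of wet-ball to dry-ball warming time constants.

    Assumes the same heat transfer coefficient and surface area for both
    balls, so the time constant scales with total heat capacity.
    """
    dry_capacity = ball_mass_kg * ball_specific_heat
    wet_capacity = dry_capacity + water_mass_kg * water_specific_heat
    return wet_capacity / dry_capacity


# Hypothetical inputs, for illustration only.
BALL_MASS_KG = 0.41
BALL_SPECIFIC_HEAT = 1500.0  # J/(kg*K), assumed effective value
ABSORBED_WATER_KG = 0.02

ratio = warming_time_constant_ratio(BALL_MASS_KG, BALL_SPECIFIC_HEAT,
                                    ABSORBED_WATER_KG)
print(f"wet/dry time constant ratio: {ratio:.3f}")
```

Under these assumed numbers the wet ball warms roughly 14% more slowly from heat capacity alone. The real effect depends on how much water the leather actually absorbed, which was not measured.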

13. editstet
Posted Aug 11, 2015 at 11:58 AM | Permalink

I think that the main problem is a belief that the ball boys and then Anderson were overly precise in their measurements. The inaccuracies of the gauges, the speed at which they would check the balls likely meant that some balls were a little overinflated and some a little underinflated. If you’ve ever pumped your own tires you know that sometimes a little air may escape, and even cars with tire warnings allow for some variance. A couple tenths psi here, a couple tenths there and soon you’ve blown everything up into deflategate.

14. Posted Aug 11, 2015 at 12:55 PM | Permalink

I’m chuckling while shaking my head.

My impression of the whole affair is that several people/organizations did bad work, in haste and some incompetence.

Hence it is not likely possible to make an accurate case.

(Though if Stephen were in charge there’d at least be a reasonable conclusion, which may be that it is impossible to determine guilt or innocence thus the case should be dismissed.)

15. pdtillman
Posted Aug 11, 2015 at 2:34 PM | Permalink

Um, Steve, maybe you need to rename the blog “Sports Audit”?

• Posted Aug 11, 2015 at 2:50 PM | Permalink

A foolish consistency is the hobgoblin of little minds.

16. Chris
Posted Aug 12, 2015 at 1:04 AM | Permalink

Steve:

Are these differences based on dry balls? If so, could the negative outlier be the ball on the field for most of the final first half drive? It appears to me that they really didn’t rotate balls much, especially in the last minute of the drive which took forever.

17. Rick
Posted Aug 12, 2015 at 10:32 AM | Permalink

Has it been linked before? Brady appears to be the victim of Bugs Bunny Logic.

18. Posted Aug 12, 2015 at 7:40 PM | Permalink

An unflattering courtroom sketch of Brady is leading to the usual japes online. Here’s my current favourite

19. David L. Hagen
Posted Aug 13, 2015 at 3:05 PM | Permalink

The Legal Issue

“What is the direct evidence that implicates Mr. Brady?” Judge Richard M. Berman repeatedly asked NFL lawyer Daniel L. Nash

• MikeN
Posted Aug 13, 2015 at 3:35 PM | Permalink

NFL might be wishing they’d gone to Minnesota.

• mpainter
Posted Aug 14, 2015 at 4:45 PM | Permalink

Interesting, but I can’t see a settlement unless Goodell craters. Brady has nothing to lose by forcing the issue, imo. He loses if he fails to clear his name and wins through vindication.

• MikeN
Posted Aug 14, 2015 at 6:39 PM | Permalink

The procedural issues are considerably against the NFL. Having the judge arguing the facts as well puts them in a tough spot.

20. pats1251
Posted Aug 14, 2015 at 8:02 AM | Permalink

Page 68 of the Wells Report:

“As noted, eleven different Patriots game balls were tested by the game officials during halftime, with each ball tested by each of two officials. The football intercepted by the Colts was not included in the group of eleven Patriots footballs tested. Nor was a football that Patriots fullback James Develin had caught for a touchdown in the first half, which the Patriots set aside for him to retain as a memento. Based on the evidence, we believe that the Patriots game ball bag initially contained thirteen footballs, rather than twelve. In fact, when interviewed by NFL Security on the night of the AFC Championship Game, Jim McNally volunteered that the Patriots game ball bag may have included thirteen footballs. McNally’s statement—which we were unable to discuss with McNally because the Patriots refused to make McNally available for a follow-up interview—was consistent with information from Walt Anderson, who said that it was “certainly possible” that the Patriots provided a thirteenth ball because teams often include an extra ball or two when inclement weather is expected. Subtracting the intercepted ball and the Develin touchdown ball results in a total of eleven Patriots game balls available for halftime testing.”

“Certainly possible” that the Patriots had a 13th ball before the game. I’d say it’s just as “certainly possible” that the 13th ball was the Colts ball Jackson referenced being on the field while the Pats were on offense.

• pats1251
Posted Aug 14, 2015 at 8:04 AM | Permalink

FYI – that is in footnote 39 if anyone is looking for it. True to form, Wells buried an important point in a footnote, and cast it aside with this “certainly possible” wishy-washy hand-waving BS.

• chuckrr
Posted Aug 14, 2015 at 8:32 AM | Permalink

So there were two balls that were never tested or reinflated at halftime? Were those balls ever tested after the game?

• pats1251
Posted Aug 14, 2015 at 8:59 AM | Permalink

Correct.

The Brady Interception and the Develin touchdown ball (reserved for memorabilia) were not tested.

That means, assuming the Patriots came in with 12 balls, a 13th was introduced. We already know, thanks to D’Qwell Jackson, that a Colt ball ended up in play while the Patriots took the field. This is a much more intuitive explanation for why there are 13 balls than Wells gives. Why would the team provide an extra ball in case of rain when they already have a backup set of 12? Just b/c McNally and Anderson couldn’t rule out there being a 13th ball doesn’t mean that when a better solution presents itself, it should not be investigated.

More shoddy, selective, and misleading work by Wells.

• pats1251
Posted Aug 14, 2015 at 9:00 AM | Permalink

*by “thanks to D’Qwell Jackson” I mean that we have knowledge of the Colts ball being in Patriots play. NOT that he introduced the ball.

Sorry for any confusion.

• chuckrr
Posted Aug 14, 2015 at 11:07 AM | Permalink

This point has probably already been made, but if those balls had been tested after the game it would have pretty much proved the case one way or the other.

• chuckrr
Posted Aug 14, 2015 at 11:09 AM | Permalink

Or at least been the strongest evidence

• MikeN
Posted Aug 14, 2015 at 10:21 AM | Permalink

If the referees had not made a record of their halftime measurements, we would have been told that it is not plausible that they switched gauges between measurements.

21. Carrick
Posted Aug 14, 2015 at 11:34 AM | Permalink

Steve is right about my skepticism of how much you can learn forensically.

I think there are just too many possibilities to allow any particular scenario to be ruled out. As I said on a previous thread, in my opinion the best you can do is show that a particular scenario is consistent with the data (demonstration of plausibility).

It remains in my opinion more plausible that New England employees tampered with the footballs (there are just a lot of coincidences that have just magically lined up otherwise).

Here are a few addressing this post:

What if McNally didn’t deflate one or more footballs, or was erratic in how much each football was deflated? It seems virtually impossible to me to disentangle this one from any other competing and/or contributing factors.

What if the type of treatment of the football surface preferred by New England resulted in a larger variance? As I mentioned in a previous comment, we know that New England roughed the surface of their footballs and applied more oil to it. This could yield less water absorbed than with the Indy footballs, and could yield potentially wildly different warming curves for the different New England footballs.

Anyway, it has been suggested here that a better testing protocol needs to be implemented. I’d suggest that a video camera recording the testing is paramount here, along with pressure sensors that are accurate to 0.1 psi or better.

Had Brady fully cooperated with the investigation and had they found nothing incriminating in that, the way this has played out would have been a travesty. As admitted by the NFLPA, unfortunately Brady didn’t cooperate fully (assuming he was actually innocent here), and this was probably on the (bad) advice of Michael Yee and the lawyers involved for the NE Patriots.

• Posted Aug 14, 2015 at 4:37 PM | Permalink

Brady, or pretty much any player is probably hesitant to set a precedent of allowing the league to subpoena personal phones. The NFL had the CBA negotiation to secure that right to subpoena private phones, and they didn’t do it. Brady would be adding it for all players by precedent. I would imagine the NFLPA told him not to. And frankly he shouldn’t, they had the team cell phones with McNally and Jastremski. Nothing was on them between Brady and them.

• MikeN
Posted Aug 14, 2015 at 6:46 PM | Permalink

Yes a number of coincidences, but each one is individually more plausible than not.
Logo gauge use was testified to by the referee, tipping the scales in its favor. Wells didn’t test Patriot and Colt gauges.

‘Deflator’ text from May- combined with the Jets game, I think it is exculpatory. Saying the referees are at fault and not McNally means deflator must refer to something else.

Stop in bathroom, more probable than not it is because he had to use the bathroom. The release of this video in my opinion would settle the issue. I suspect if he were breathing heavy or sweating it would have been reported by Wells.

• Steve McIntyre
Posted Aug 15, 2015 at 7:43 AM | Permalink

Carrick says:

“As I mentioned in a previous comment, we know that New England roughed the surface of their footballs and applied more oil to it.”

I don’t think that we “know” that. I recall reading somewhere that Brady did not want Lexol on the balls because they were expecting wet weather and had asked for the balls to be heavily gloved instead. I think that that was in the transcript somewhere.

• mpainter
Posted Aug 15, 2015 at 10:52 AM | Permalink

Concerning the Patriot practice of “gloving” the ball, there was testimony by Brady. Apparently the pebbly surface is sanded to achieve a slight abrasion of the “pebbles”. He stated his preference that the pebbling not be sanded flat, but partially removed by sanding. This would result in about half(?) of the ball’s surface being abraded. IMO this would have brought those balls which were exposed to the rain close to the point of saturation. Unprotected leather wicks up moisture quickly.

Dale Syphers, professor of physics at Bowdoin College, Brunswick, Maine, conducted his own tests with wet footballs. He used procedures different from those of Exponent. He found that wet, cold footballs brought into a warm, dry room actually _lost_ pressure. He attributed this pressure loss to evaporative cooling.

Unfortunately, I cannot find any details on his controls and procedures. He also admits to being a Patriots fan.

22. James Evans
Posted Aug 14, 2015 at 12:29 PM | Permalink

I’m struggling to understand. I mean, who cares?

If you devote so much time and effort to something so trivial – are you honing your skills, or do you really not get what’s remotely important in life?

23. JakeM
Posted Aug 14, 2015 at 12:50 PM | Permalink

Yes, I believe the attorney stated it this way:

“Your honor, on advice of his attorney, Mr. Brady declined to throw himself in the well, the NFLPA wasn’t involved in any discussions with Mr. Brady on the likelihood of him floating or sinking, or if that would help to determine his status as a witch.”

“It remains in my opinion more plausible that New England employees tampered with the footballs (there are just a lot of coincidences that have just magically lined up otherwise).”

This turns the truth on its head. In order for the NFL to make any case that actual air was let out, they have to magically* line everything up and ignore every bit of contrary evidence.

*magically as in a stage magician who gets the gullible to look at the wrong things. Hey look! He destroyed his phone!! Hey, and 50+ year old dude had to take a pee! A pee!!!

24. Posted Aug 14, 2015 at 1:04 PM | Permalink

From John Dowd, another perspective here.

25. 1sky1
Posted Aug 14, 2015 at 4:52 PM | Permalink

Nobody knows more about tricks with footballs than Lucy Brown. Too bad she wasn’t consulted.

Posted Aug 16, 2015 at 3:10 PM | Permalink

They got married??

26. John Faulstich
Posted Aug 14, 2015 at 7:04 PM | Permalink

for any interested in the court appeal http://profootballtalk.nbcsports.com/2015/08/14/nflpa-calls-goodell-ruling-a-smear-campaign-and-a-propaganda-piece/

27. Posted Aug 15, 2015 at 5:50 PM | Permalink

I have a very much more plebeian question: have the Patriots, and in particular Tom Brady, been apprised of the findings shown on this site? This question may have been asked elsewhere, but I have only had limited time to skim your site on this issue, since I am busy preparing for the ice caps to melt and stocking up on air conditioners to make a killing . . . or should I be hoarding firewood and fiberglass insulation? It’s so hard to keep track these days.

Steve: I’ve been in touch with Daniel Goldberg, who acknowledged my email. I don’t think that he necessarily understood the points, other than that they were supportive.

28. John Faulstich
Posted Aug 16, 2015 at 10:16 AM | Permalink

The author has sent to Kessler among others. https://climateaudit.org/2015/08/08/letter-to-daniel-marlow-on-exponent-error/ I sent Stacy James a link to the site as well.

29. John Faulstich
Posted Aug 17, 2015 at 8:54 AM | Permalink

If anyone (i.e. the NFL) cared about getting the right answer, they should sit you down with the Exponent guys and have all of the calculations done in front of them. The Patriots or Brady I’m sure would support that in a second. The NFL? I doubt it.

Posted Aug 23, 2015 at 12:50 AM | Permalink

geez steve…
9 out of 10 posts since late june on the pats and deflategate?
really?

• mpainter
Posted Aug 23, 2015 at 2:01 PM | Permalink

Judith Curry at her Climate, Etc. blog has a recent post concerning genetically modified food research and the antics of the organic food zealots. Interesting read; over 500 responses so far.

• David S
Posted Aug 27, 2015 at 5:32 AM | Permalink

At least the GMO stuff is of some interest to overseas readers. Would it be possible to have a separate site called NFL audit?

• John M
Posted Aug 27, 2015 at 7:55 AM | Permalink

For the two Davids…

http://startbloggingonline.com/

• Posted Aug 28, 2015 at 10:40 PM | Permalink

The NFL is more popular than the climate debate. If this site attracts a few new readers who may look critically at how statistics can be used, it’s probably a good thing IMO. This is the site that got me to start looking closer after all.

31. Neville
Posted Aug 24, 2015 at 7:21 PM | Permalink

This is O/T but I hope everyone has a chance to look at this latest report from the OZ Climate Council. That’s Flannery, Steffen etc. Just unbelievable, let’s hope Steve has the time to ponder their nonsense.
More heatwaves and higher temps to come, plus much higher SLR etc, etc. But UAH sat data shows that OZ hasn’t warmed for over 17 years and no warming for USA for over 18 years and no warming over the south polar region for over 35 years. Who to believe???

http://www.climatecouncil.org.au/climate-change-2015-growing-risks-critical-choices

• kim
Posted Aug 25, 2015 at 5:32 AM | Permalink

The rest never left us.
=============

32. Posted Sep 2, 2015 at 12:14 PM | Permalink

Steve, there’s another explanation for the one ball: overcorrection.

Steve: I don’t follow what you mean. Also, only two balls were reflated pregame. Three anomalies need to be explained.

33. Rob
Posted Sep 2, 2015 at 7:32 PM | Permalink

Love the analysis! Also, wanted to bring one point to your attention. In your article in the Financial Times you mentioned that the Pats should have only had 10 balls. (One given as a souvenir + the Jackson INT ball to bring us to the original 12 balls pregame.)

That however is incorrect. The souvenir ball was given out during the 3rd quarter according to published reports. So there should have been 11 Pats balls, not 10 as you indicated in the article.

Steve: “James Develin caught a one-yard pass from Tom Brady to put the Patriots up 14-0 early in the AFC Championship game.”

• Rob
Posted Sep 2, 2015 at 10:54 PM | Permalink

Steve, I believe you are confused. Do you have any intel that Develin kept the ball? I would love to know your source if so. Because honestly that would be quite interesting if it was proved true. Just another example of the league twisting the facts. However, I unfortunately think you are just getting the facts a little twisted.

The only ball that we publicly know was kept as a souvenir was given to a fan by LaFell in the 3rd Quarter.

• Rob
Posted Sep 2, 2015 at 11:07 PM | Permalink

Steve, I never read that part of the Wells Report about the James Develin ball being tested. I stand corrected. Sorry for the misunderstanding.

Steve: no problem. For reference of others, here is the Wells Report footnote. This connection was brought to my attention by a reader in a comment in an earlier thread.

As noted, eleven different Patriots game balls were tested by the game officials during halftime, with each ball tested by each of two officials. The football intercepted by the Colts was not included in the group of eleven Patriots footballs tested. Nor was a football that Patriots fullback James Develin had caught for a touchdown in the first half, which the Patriots set aside for him to retain as a memento. Based on the evidence, we believe that the Patriots game ball bag initially contained thirteen footballs, rather than twelve.

34. Sam
Posted Nov 1, 2015 at 11:24 AM | Permalink

What a lot of wasted effort to define the physics involved when a proper reading of Rule two would lead one to conclude that Rule two in fact doesn’t preclude deflating the balls before game time after they have been tested.

35. Posted Jan 24, 2016 at 7:33 AM | Permalink

The New York Times quotes MIT professor John Leonard: “I am convinced that no deflation occurred and that the Patriots are innocent. It never happened.” And mentions other professors who have come to similar conclusions.

Final jab: “[Y]ou can’t ignore the laws of science, unless, of course, you’re the N.F.L.”