One of the benefits of panel regressions is that they force you to spell out your null hypothesis clearly. In this case the null is: the models and the observations have the same trend over 1979-2009. People seem to be gasping at the audacity of assuming such a thing, but you have to assume it in order to test model-obs equivalence.
Under that assumption, using the Prais-Winsten panel method (which is very common and is coded into most major stats packages), the variances and covariances turn out to be as shown in our results, as do the parameters for testing trend equivalence, and the associated t and F statistics turn out to be large relative to their distributions under the null. That is the basis of the panel inferences and conclusions in MMH.
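For readers who want the hypothesis in symbols, here is a stylized version. It is a simplification with one trend coefficient per group and AR(1) errors, not the full MMH specification, and every symbol in it (y, a, b, d, D, rho) is an assumption introduced for illustration only.

```latex
\[
\begin{aligned}
y_{it} &= a_i + b\,t + d\,(D_i \times t) + u_{it}, \qquad u_{it} = \rho_i\,u_{i,t-1} + e_{it},\\
H_0 &:\ d = 0, \qquad t = \frac{\hat{d}}{\operatorname{se}(\hat{d})}.
\end{aligned}
\]
```

Here i indexes the series, t the year, and D_i equals 1 for a model series and 0 for an observational series, so d is the difference between the modeled and observed trends. Under the null the two trends are equal, and a large value of the t statistic (or of the corresponding F statistic when several restrictions are tested jointly) counts as evidence against equivalence.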
It appears to me that what our critics want to do is build into the null hypothesis some notion of model heterogeneity, which presupposes a lack of equivalence among models and, by implication, observations. But if the estimation is done based on that assumption, then the resulting estimates cannot be used to test the equivalence hypothesis. In other words, you can’t argue that models agree with the observed data using a test estimated on the assumption that they do not. As best I understand it, that is what our critics are trying to do. If you propose a test based on a null hypothesis that models do not agree among themselves, and it yields low t and F scores, that does not mean the hypothesis of consistency between models and observations has passed a test. It is a contradictory test: if the null is not rejected, it cannot imply that the models agree with the observations, since model heterogeneity was part of the null when estimating the coefficients used to construct the test.
In order to test whether modeled and observed trends agree, test statistics have to be constructed based on an estimation under the null of trend equivalence. Simple as that. Panel regressions and multivariate trend estimation methods are the current best methods for doing the job.
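To make the mechanics concrete, here is a minimal sketch in Python. It is not the MMH code and not a full panel estimator: it assumes a single model series and a single observational series, takes their difference, and tests whether that difference has zero trend using a one-step Prais-Winsten correction for AR(1) errors. The function names `ols2` and `trend_difference_test` and the synthetic data are hypothetical, chosen only for illustration, and the sketch omits the cross-series covariance terms that a real panel method would estimate.

```python
import math
import random


def ols2(x1, x2, y):
    """OLS of y on two regressors (no intercept is added automatically).

    Returns (b1, b2, residuals, estimated variance of b2).
    """
    s11 = sum(a * a for a in x1)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s22 = sum(b * b for b in x2)
    s1y = sum(a * c for a, c in zip(x1, y))
    s2y = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    resid = [c - b1 * a - b2 * b for a, b, c in zip(x1, x2, y)]
    sigma2 = sum(e * e for e in resid) / (len(y) - 2)
    var_b2 = sigma2 * s11 / det  # (2,2) element of sigma2 * inverse(X'X)
    return b1, b2, resid, var_b2


def trend_difference_test(model, obs):
    """Test the null that model and obs share the same linear trend.

    Returns (estimated trend difference, t statistic, estimated AR(1) rho).
    Assumption: one-step Prais-Winsten, AR(1) errors, two series only.
    """
    d = [m - o for m, o in zip(model, obs)]
    n = len(d)
    const = [1.0] * n
    time = [float(i) for i in range(n)]

    # Step 1: plain OLS to obtain residuals and an AR(1) estimate.
    _, _, e, _ = ols2(const, time, d)
    rho = sum(e[i] * e[i - 1] for i in range(1, n)) / sum(x * x for x in e)

    # Step 2: Prais-Winsten transform. Unlike Cochrane-Orcutt, the first
    # observation is kept and rescaled by sqrt(1 - rho^2).
    w = math.sqrt(1.0 - rho * rho)

    def pw(v):
        return [w * v[0]] + [v[i] - rho * v[i - 1] for i in range(1, n)]

    # Step 3: OLS on the transformed data; the slope is the trend difference.
    _, slope, _, var_slope = ols2(pw(const), pw(time), pw(d))
    return slope, slope / math.sqrt(var_slope), rho


if __name__ == "__main__":
    random.seed(0)
    years = range(31)  # 1979-2009 inclusive
    # Synthetic series, invented for illustration only.
    obs = [0.01 * i + random.gauss(0, 0.05) for i in years]
    model = [0.03 * i + random.gauss(0, 0.05) for i in years]
    slope, t_stat, rho = trend_difference_test(model, obs)
    print(f"trend difference = {slope:.4f}, t = {t_stat:.2f}, rho = {rho:.2f}")
```

The point of the sketch is the direction of the inference: the t statistic is computed under the null that the trend difference is zero, so a large value is evidence against equivalence. A test built the other way around would not support that reading.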
Now if the modelers want to argue that “of course” the models do not agree with the observations because they don’t even agree with each other, and that it would be pointless even to test whether they match observations because everyone knows they don’t, or words to that effect, then let’s get that on the table ASAP, because there are a lot of folks who are under the impression that GCMs are accurate representations of the Earth’s climate.