I will go approximately 50-50 for a while on posting statistical and non-statistical notes. Today’s another statistical note. It’s a bit technical, but some of the statistical findings from econometrics on autocorrelated series are highly applicable to climate and, while there is occasional citation of econometric literature in climate articles and occasional forays by econometricians into climate, the diffusion seems very incomplete at present, with climate scientists often using quite (in my opinion) naive and inadequate techniques. So I’m trying to bring some statistical findings on the impact of autocorrelation to the attention of people interested in climate series.
I’m learning some of this as I go, so I’m just a one-eyed man here. Some references have been sent to me on earlier notes and I’ll comment on most of them a little later, after I review a couple more papers. I’m also going to return to the specific examples of spurious significance cited in our GRL article – the extraordinarily high RE statistics from simulated PC1s combined with insignificant R2 statistics.
One of the reasons for discussing Granger-Newbold and Phillips is to show their approach to "spurious" regression statistics (here t-statistics and F-statistics) and why other statistics need to be considered to ensure that there is no mis-specification in the model. In the case of our MBH98 critique, we argue that it is the RE statistic that is spurious and that the R2 statistic, in this case, provides a cross-check. In the examples below, it is the t- and R2 statistics that are spurious and the DW statistic is a cross-check. The point is the need for care and due diligence, rather than magic bullets. Perhaps the discussion of Phillips and other texts on spurious significance will also illuminate the approach that we used in our GRL article and why we discuss "spurious significance".
Phillips has a remarkable series of articles on spurious regression (see a listing here), a few of which are listed below. One of them deals directly with the issue of trends versus random walks in time series analysis, an issue that I was wondering about in connection with the satellite series.
Phillips commences with a discussion of Granger and Newbold, commenting that their simulations gave “dramatic demonstration” of the “failure of conventional test procedures”. Phillips summarizes Granger and Newbold as having shown that many time series of interest can be represented by ARIMA-type processes and are often near random walks; and that regressions between such time series frequently have high R2 statistics, combined with highly autocorrelated residuals, indicated by very low DW statistics – a situation which we’ve seen in some important climate series and (I suspect) prevalent in many others not discussed so far.
Phillips points out that no one explained “what exactly goes wrong” with the conventional tests. He then introduced a very high level of mathematical sophistication to the analysis of what, until then, had been a pretty low-brow problem, demonstrating an “asymptotic theory” expressed in terms of “functionals of Wiener processes”, “weak convergence in probability measure”, “functional central limit theorems”. The entire procedure appears really remarkable to me, not just because of the mathematical sophistication, but because the mathematical tools used here seem to come from out of the blue. Who would have thought to discuss humdrum and humorous problems, like regressions of South African wine sales against Honduran birth rates, in terms of martingale convergence?
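The Granger-Newbold situation described above can be reproduced in a few lines. The following is only a minimal sketch, not their original experiment: the sample size of 50, the 1000 replications, the Gaussian innovations and the helper names `ols_diagnostics` and `simulate_random_walk_regressions` are all assumptions chosen for illustration. It regresses one random walk on another, independent, random walk and records the conventional slope t-statistic, the R2 and the Durbin-Watson statistic.

```python
import numpy as np


def ols_diagnostics(x, y):
    """Regress y on a constant and x; return (slope t-statistic, R2, Durbin-Watson)."""
    n = len(y)
    X = np.column_stack([np.ones(n), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    rss = resid @ resid
    # Conventional OLS covariance, which assumes uncorrelated errors.
    cov = (rss / (n - 2)) * np.linalg.inv(X.T @ X)
    t_slope = coef[1] / np.sqrt(cov[1, 1])
    r2 = 1.0 - rss / np.sum((y - y.mean()) ** 2)
    # Durbin-Watson: near 2 for uncorrelated residuals, near 0 for strong
    # positive autocorrelation.
    dw = np.sum(np.diff(resid) ** 2) / rss
    return t_slope, r2, dw


def simulate_random_walk_regressions(n_obs=50, n_sims=1000, seed=0):
    """Regress pairs of independent random walks (illustrative settings)."""
    rng = np.random.default_rng(seed)
    out = np.empty((n_sims, 3))
    for i in range(n_sims):
        x = np.cumsum(rng.standard_normal(n_obs))  # random walk 1
        y = np.cumsum(rng.standard_normal(n_obs))  # independent random walk 2
        out[i] = ols_diagnostics(x, y)
    return out


results = simulate_random_walk_regressions()
reject_rate = np.mean(np.abs(results[:, 0]) > 1.96)
median_r2 = np.median(results[:, 1])
median_dw = np.median(results[:, 2])
print(f"share of |t| > 1.96: {reject_rate:.2f}")
print(f"median R2:           {median_r2:.2f}")
print(f"median DW:           {median_dw:.2f}")
```

Although the two series are unrelated by construction, the nominal 5% t-test should reject far more often than 1 time in 20, while the Durbin-Watson statistic should sit well below 2, which is the cross-check referred to in the text.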
The results are summarized in his Theorem 1 here, which is not quite as hard to read as it looks, if you set aside the various integrals of Wiener processes as just being complicated, non-standard but calculable distributions and don’t worry about them further for now. Phillips’ Theorem 1 says of regression between random walks:
1) the regression estimate of the slope coefficient does not converge to a constant, but has a random distribution (see 1a);
2) the regression estimate of the intercept not only does not converge to a constant, but has a diverging random distribution (see 1b);
3) the regression R2 does not converge to 0, but has a random distribution as N → ∞ (see 1e);
4) the t-statistics for both α and β do not go to a limiting distribution (as in usual regressions) but increase infinitely (diverge) at a rate of sqrt(N) as N → ∞ (see 1c, 1d);
5) thus, based on a nominal critical value of 1.96 for the t-statistic, the t-test is biased towards showing that a relationship is statistically significant when it isn’t (that is, towards rejecting the null hypothesis of no relationship). Phillips observed that Granger and Newbold recommended a benchmark of 11.2 (rather than 1.96); Phillips pointed out that the benchmark of 11.2 had no meaning in itself, but simply represented the asymptotic distribution in simulations where N=50, which in this case was 1.96*sqrt(N) with N=50, “a rather neat confirmation”;
6) the Box-Pierce Q-statistic diverges with N (not sqrt(N)) as N → ∞;
7) the Durbin-Watson statistic “converges in probability” to 0.
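The scaling behaviour in points 4) and 7) can be checked numerically. This is a rough sketch under assumed settings (Gaussian random walks, 300 replications per sample size, sample sizes of 50, 200 and 800; the helper names are hypothetical), not a reproduction of Phillips’ derivation. If the t-statistic diverges at rate sqrt(N), the median of |t| should grow with N while the median of |t|/sqrt(N) stays roughly stable, and the median Durbin-Watson statistic should shrink towards 0.

```python
import numpy as np


def slope_t_and_dw(x, y):
    """OLS of y on a constant and x; return (slope t-statistic, Durbin-Watson)."""
    n = len(y)
    X = np.column_stack([np.ones(n), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    rss = resid @ resid
    cov = (rss / (n - 2)) * np.linalg.inv(X.T @ X)
    t_slope = coef[1] / np.sqrt(cov[1, 1])
    dw = np.sum(np.diff(resid) ** 2) / rss
    return t_slope, dw


def summarize(n_obs, n_sims, rng):
    """Median |t| and median DW over regressions of independent random walks."""
    t_abs = np.empty(n_sims)
    dw = np.empty(n_sims)
    for i in range(n_sims):
        x = np.cumsum(rng.standard_normal(n_obs))
        y = np.cumsum(rng.standard_normal(n_obs))
        t, d = slope_t_and_dw(x, y)
        t_abs[i] = abs(t)
        dw[i] = d
    return np.median(t_abs), np.median(dw)


rng = np.random.default_rng(1)
sample_sizes = [50, 200, 800]  # illustrative choices
summary = {n: summarize(n, 300, rng) for n in sample_sizes}

for n, (med_t, med_dw) in summary.items():
    print(
        f"N={n:4d}  median |t| = {med_t:6.2f}  "
        f"median |t|/sqrt(N) = {med_t / np.sqrt(n):5.2f}  "
        f"median DW = {med_dw:5.3f}"
    )
```

The middle column is the one to watch: a roughly constant |t|/sqrt(N) is what a sqrt(N) divergence rate implies, so any fixed critical value is eventually exceeded almost always.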
The issue is not just that the t-statistic fails in 1 out of 20 cases: the problem is that, as N goes up, the t-statistic fails nearly always (on the side of showing significance where none exists). The problem isn’t limited to the t-statistic, but applies to many other common statistics. The one ray of light in this is the Durbin-Watson statistic. Phillips observes that “all of these results differ from the conventional theory of regression with stationary processes”. He stated that, when the “correct asymptotic theory” is used, there are no surprises in the Granger-Newbold results.
This theory applies to random walk (unit root) processes and a vast literature has developed on testing for unit roots. Phillips pointed out that the usual asymptotic theory still works when x and y are generated by independent stable autoregressive processes (i.e. ρ<1), in which case the estimates for α and β converge in probability to 0. However, most calibration periods in paleoclimate are relatively short, e.g. MBH of 79 years. If “low-frequency” effects are sought by averaging or smoothing, the effective calibration period can be reduced much further (even by an order of magnitude). While "near-integrated processes" (AR1 coefficient >0.9) work out OK as N → ∞, it turns out their finite-sample properties have many characteristics in common with random walks (unit roots).
Most of my interest is in finite sample situations and so I’ve looked more at what happens to “near-integrated” series (i.e. ρ near to 1), rather than the unit root literature. As a result of the recent discussion of satellite temperatures, I’ve also looked more at the issue of trends versus random walks and will get to both of them.
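The finite-sample point can be illustrated with a small simulation. This is a sketch under stated assumptions rather than an analysis of any actual proxy data: the series are independent Gaussian AR1 processes, the AR1 coefficient of 0.9 and the 2000 replications are arbitrary illustrative choices, and only the sample length of 79 is taken from the calibration period mentioned above. The helper names `ar1` and `rejection_rate` are hypothetical. It compares how often the nominal 5% t-test rejects "no relationship" for near-integrated series versus white noise.

```python
import numpy as np


def ar1(n_obs, rho, rng):
    """Stationary Gaussian AR1 series x[t] = rho * x[t-1] + e[t], with |rho| < 1."""
    e = rng.standard_normal(n_obs)
    x = np.empty(n_obs)
    # Start from the stationary distribution so no burn-in is needed.
    x[0] = e[0] / np.sqrt(1.0 - rho ** 2)
    for t in range(1, n_obs):
        x[t] = rho * x[t - 1] + e[t]
    return x


def slope_t(x, y):
    """Conventional OLS t-statistic for the slope in y = a + b*x + u."""
    n = len(y)
    X = np.column_stack([np.ones(n), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    cov = (resid @ resid / (n - 2)) * np.linalg.inv(X.T @ X)
    return coef[1] / np.sqrt(cov[1, 1])


def rejection_rate(rho, n_obs=79, n_sims=2000, seed=0):
    """Share of independent AR1 pairs for which |t| exceeds 1.96."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_sims):
        x = ar1(n_obs, rho, rng)
        y = ar1(n_obs, rho, rng)  # generated independently of x
        hits += abs(slope_t(x, y)) > 1.96
    return hits / n_sims


rate_white_noise = rejection_rate(rho=0.0)
rate_near_integrated = rejection_rate(rho=0.9)
print(f"rho = 0.0: rejection rate = {rate_white_noise:.3f}")
print(f"rho = 0.9: rejection rate = {rate_near_integrated:.3f}")
```

With white noise the rejection rate should be close to the nominal 5%; with an AR1 coefficient of 0.9 and only 79 observations it should be several times higher, even though the usual asymptotic theory eventually applies to such series.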
Phillips, P., Understanding Spurious Regressions in Econometrics. Journal of Econometrics, 33, 1986 [30pp] http://cowles.econ.yale.edu/P/cp/p06b/p0667.pdf
Phillips, P., Regression Theory for Near-Integrated Time Series. Econometrica, 56(5), 1988, http://cowles.econ.yale.edu/P/cp/p07a/p0711.pdf
Phillips, P. and J.Y. Park, Statistical Inference in Regressions with Integrated Processes: Part 1. Econometric Theory, 4. http://cowles.econ.yale.edu/P/cp/p07a/p0715.pdf
Phillips, P. and J.Y. Park, Statistical Inference in Regressions with Integrated Processes: Part 2. Econometric Theory, 5, 1989 http://cowles.econ.yale.edu/P/cp/p07a/p0722.pdf
Durlauf, S.N. and P. Phillips, Trends Versus Random Walks in Time Series Analysis. Econometrica, 56(6) http://cowles.econ.yale.edu/P/cp/p07a/p0744.p
Phillips, P., New Tools for Understanding Spurious Regressions. Econometrica, 66(6), 1998 http://cowles.econ.yale.edu/P/cp/p09b/p0966.pdf