Mann’s technique did exactly the opposite. The bristlecones were his equivalent of a high-grade hole. Instead of cutting the area of influence like a prudent engineer, his method over-weighted it. Mining promoters would love to promote ore reserves calculated with the equivalent of Mannian principal components. Mining promoters try to use high-grade outliers to run the stock – Myles Allen’s press release is very much what mining promoters like to do. Mining promoters are prohibited by law from doing a press release like Myles Allen’s, but obviously do whatever they can to promote their stocks.

The mining perspective on spatial averaging very much influenced my examination of Mann’s data – where the bristlecones do what an outlier high-grade hole does for a mining promotion.

]]>Take a long series of consecutive days on which the temperature is measured at hourly intervals. Estimate the area under the daily curve and correlate it with the max daily temperature, the min daily temperature, or the difference between them. Do you get significant correlations? This is a quick test as to whether max/min temps are a suitable proxy for heat flow, which is the more important parameter.
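A minimal sketch of the test described above, in Python with NumPy. The hourly temperatures here are SYNTHETIC placeholders (a made-up daily cosine plus noise), so the correlations it gives say nothing about real station data; the variable names are illustrative assumptions. Swap in real hourly observations to run the actual test.

```python
import numpy as np

rng = np.random.default_rng(42)
n_days = 365
hours = np.arange(24)

# SYNTHETIC placeholder data (an assumption, not observations):
# each day has its own mean level and diurnal amplitude, plus small noise.
daily_mean = rng.normal(15.0, 4.0, size=n_days)
daily_amp = rng.uniform(3.0, 8.0, size=n_days)
temps = (daily_mean[:, None]
         + daily_amp[:, None] * np.cos(2 * np.pi * (hours - 15) / 24)
         + rng.normal(0.0, 0.5, size=(n_days, 24)))

# Trapezoidal area under each day's hourly curve (degree-hours, spacing = 1 h).
area = ((temps[:, :-1] + temps[:, 1:]) / 2.0).sum(axis=1)

t_max = temps.max(axis=1)
t_min = temps.min(axis=1)
t_range = t_max - t_min

def corr(a, b):
    """Pearson correlation coefficient between two 1-D arrays."""
    return np.corrcoef(a, b)[0, 1]

results = {
    "max": corr(area, t_max),
    "min": corr(area, t_min),
    "range": corr(area, t_range),
}
for name, value in results.items():
    print(f"corr(area, daily {name}) = {value:.3f}")
```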

The maths of correlation is way beyond me now, but I used to do a fair bit in mining scenarios. I keep coming back to Geostatistics, a certain type of mathematics developed by M.David at Fontainebleau in France. In a typical application, one has an ore deposit with a few expensive drill holes through it, which have been assayed every metre. The holes are X m apart. How small does X have to be before the values in hole A can be used to predict values in hole B, and so allow interpolation using a search ellipsoid whose size is determined by the geostatistical analysis?
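One standard geostatistical tool for this kind of question is the experimental semivariogram, which measures how dissimilar samples become as their separation grows. The following is only a rough one-dimensional sketch for regularly spaced samples (say, assays every metre down a single hole); the function name and the toy values are assumptions made for illustration, not real assay data.

```python
import numpy as np

def semivariogram(z, max_lag):
    """Experimental semivariogram for regularly spaced 1-D samples.

    gamma(h) = 0.5 * mean((z[i + h] - z[i])**2), for h = 1 .. max_lag,
    where h is measured in units of the sample spacing.
    """
    z = np.asarray(z, dtype=float)
    gamma = np.empty(max_lag)
    for h in range(1, max_lag + 1):
        d = z[h:] - z[:-h]            # all pairs separated by h samples
        gamma[h - 1] = 0.5 * np.mean(d ** 2)
    return gamma

# Toy values (hypothetical): a steadily increasing grade down the hole.
assays = np.array([0.0, 1.0, 2.0, 3.0])
gamma = semivariogram(assays, max_lag=2)
print(gamma)
```

The separation at which gamma levels off (the "range") is what would set the size of the search ellipsoid; beyond it, one hole tells you little about the other.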

This is the type of calculation I feel is important in relating far-spaced data such as climate data. There are also ways to forward project time-dependent data streams like temps at a single locality, and smoothing methods which can have advantages over weighted rolling means.

I have even extended it recreationally to correlation analysis – examples such as sunspot activity and the annual yield of tomatoes in California, substituting these arcane cases for drill holes A and B. Correlation analysis is possible when one parameter leads or lags the other.

I keep coming in from way out left field because I’m not a climatologist or a good mathematician. But I have seen good mathematics put to good effect when performed by other capable people.

]]>Unless…you believe in Stanley Fish…or the other postmodernists…

]]>Make a synthetic seasonal cycle from a sine wave of period one year and sample it every “season” (i.e. once every three months) starting anywhere you like in the series — a pretty reasonable thing to do — I expect there are countless observational records with 3-month sampling increments. Now, calculate the Durbin-Watson statistic for this data — it comes out to be close to 2, indicating independent non-autocorrelated data! Similarly, the lag-1 correlation is zero, again indicating independent values! Now, estimate the uncertainty in the mean, assuming that the data is uncorrelated — i.e. since the data is apparently independent, you would apparently use (amplitude of sine wave)/sqrt(2 x number of values in series), which is WRONG again — the actual uncertainty is around (amplitude of sine wave)/(number of values in series), which is very different.
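The sine-wave experiment above is easy to try. This is a rough sketch in Python with NumPy; the amplitude, starting phase and record length (101 quarterly samples, deliberately not a whole number of years) are arbitrary assumptions chosen for illustration.

```python
import numpy as np

def durbin_watson(x):
    """Durbin-Watson statistic of a series after removing its mean."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return np.sum(np.diff(x) ** 2) / np.sum(x ** 2)

def lag1_correlation(x):
    """Sample lag-1 autocorrelation coefficient."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    return np.sum(d[:-1] * d[1:]) / np.sum(d ** 2)

amplitude = 1.0
phase = 0.7                          # arbitrary starting point in the cycle
t = np.arange(101) / 4.0             # time in years, one sample per "season"
x = amplitude * np.sin(2 * np.pi * t + phase)

dw = durbin_watson(x)
r1 = lag1_correlation(x)
naive_se = amplitude / np.sqrt(2 * x.size)   # assumes independent values
actual_error = abs(x.mean())                 # the true mean of the sine is zero

print(f"Durbin-Watson      = {dw:.3f}")
print(f"lag-1 correlation  = {r1:.3f}")
print(f"naive std error    = {naive_se:.4f}")
print(f"actual mean error  = {actual_error:.4f}")
print(f"amplitude / N      = {amplitude / x.size:.4f}")
```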

So what can we say about Durbin-Watson: that it is flawed? That an “audit” has shown that it can give completely incorrect results?

No — I think it just shows that nothing in statistics is simple, that there is no “right” way to do statistics and that most statistical tests make a good few assumptions, some of which may be quite false — in this case that the correlation between adjacent points is a good indicator of the overall serial correlation.

]]>Steve (#13): I actually disagree with the Granger and Newbold criterion that you quote (“It has been well known for ….. whatever the value of R2 observed.”). I also believe that the following statement of yours is misleading:

“I’ve posted up a graphic showing the “trend” in satellite temperatures. Formally, this “trend” is generated by a regression of the data against time. Here the DW statistic is DW = 0.4445 (p-value < 2.2e-16), clearly failing the test for autocorrelated residuals."

since it implies that the trend should be questioned just because the DW statistic indicates autocorrelation of the residuals. I would rather not spend time getting into details such as the "Gaspe calibration", but here are my general beliefs on Durbin-Watson and regression.

Firstly, the Durbin-Watson test is a very stringent test for autocorrelation. It detects "autocorrelation" even if only adjacent values are correlated. Consider the following example. Generate a series of 100 independent random numbers from a statistical distribution of standard deviation S. The series may be symbolised by "A B C D ….." where each letter represents a random number (you'll have to make up some new symbols after "Z" to get to 100 numbers). The Durbin-Watson statistic for this series is close to 2, from which you correctly deduce that the data is not autocorrelated (at least for adjacent values). Now introduce intermediate points into the series, each equal to the preceding one, so the series becomes "AABBCCDD ….". The Durbin-Watson statistic for this second series is close to 1, so it completely fails the "no correlation" test. However, the second series contains JUST AS MUCH INFORMATION as the first (100 independent numbers). Linear statistics such as the mean or trend are essentially THE SAME for both series. The only thing that is different is the way that you estimate the uncertainty. For example, the standard error (i.e. the standard deviation of the mean) of the first series is S/sqrt(100), since we know the values are independent. However, for the second series, we may be tempted to think that the standard error is S/sqrt(200), since there are 200 values in the series. This would be wrong, as there are only 100 INDEPENDENT values in the series (i.e. the number of degrees of freedom), so the standard error is actually S/sqrt(100), the same as for the first series. So there is nothing inherently WRONG in the second series, even though it fails the Durbin-Watson test for independence. It is in fact just as "valuable" as the first series because it contains exactly the same data (if you plotted the two series, they would be almost indistinguishable). You just have to be more careful estimating the uncertainty in statistics (e.g. the mean or trend) derived from the second series.
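The "A B C D" versus "AABBCCDD" comparison can be sketched in a few lines of Python with NumPy. The random seed and the choice of a normal distribution are assumptions made only for illustration.

```python
import numpy as np

def durbin_watson(x):
    """Durbin-Watson statistic of a series after removing its mean."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return np.sum(np.diff(x) ** 2) / np.sum(x ** 2)

rng = np.random.default_rng(0)
S = 1.0
original = rng.normal(0.0, S, size=100)   # "A B C D ..."
doubled = np.repeat(original, 2)          # "A A B B C C D D ..."

dw_original = durbin_watson(original)
dw_doubled = durbin_watson(doubled)

# Standard error of the mean for the first series, and the tempting
# but wrong value for the second series if all 200 points are
# treated as independent.
se_original = original.std(ddof=1) / np.sqrt(original.size)
naive_se_doubled = doubled.std(ddof=1) / np.sqrt(doubled.size)

print(f"DW original = {dw_original:.3f}, DW doubled = {dw_doubled:.3f}")
print(f"mean original = {original.mean():.4f}, mean doubled = {doubled.mean():.4f}")
print(f"SE original = {se_original:.4f}, naive SE doubled = {naive_se_doubled:.4f}")
```

Doubling leaves the mean unchanged and exactly halves the Durbin-Watson statistic, while the naive standard error shrinks by roughly sqrt(2) even though no information has been added.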

Secondly, the Durbin-Watson test only considers the NOISE (e.g. the residual of a linear regression); it "cares" nothing about the SIGNAL (e.g. any underlying trend). The signal-to-noise ratio is all important here. Consider a signal (a linear trend) plus noise made up of 100 values defined by i + r(i), where i goes from 1 to 100 and r(i) is a set of 100 random numbers with a standard deviation of unity. The first value is therefore 1 + r(1) and the last value is 100 + r(100), and each r is given roughly by r(i) = +/- 1. Plot this series and it quite clearly looks like a straight line with unit slope and a tiny amount of noise. Do a linear regression on this data and you will find a slope close to unity. The Durbin-Watson statistic for the residuals will be close to 2, indicating no autocorrelation. Now generate a second series, which is similar to the first except that you reject values with even "i" (i.e. r(2), r(4), r(6) etc), replacing them with r(2)=r(1), r(4)=r(3), r(6)=r(5) etc. Again, if you plot this, it will look like a straight line of unit slope with a tiny amount of noise. If you do a linear regression, you will obtain a unit slope. However, this time the Durbin-Watson statistic will be close to 1, which indicates autocorrelation. Do you now reject the estimated slope as suspect, just because the Durbin-Watson test has shown autocorrelation? Of course not. As above, you just have to be a bit more careful estimating the uncertainty of that slope.
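The trend-plus-noise example can be sketched the same way. This is a rough illustration in Python with NumPy; the seed and the normal noise are assumptions, and `fit_slope_and_dw` is a helper name made up for this sketch.

```python
import numpy as np

def durbin_watson(resid):
    """Durbin-Watson statistic of regression residuals."""
    resid = np.asarray(resid, dtype=float)
    return np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)

rng = np.random.default_rng(1)
i = np.arange(1, 101)                      # i = 1 .. 100
r = rng.normal(0.0, 1.0, size=100)         # noise with unit standard deviation

# Second noise series: r(2)=r(1), r(4)=r(3), r(6)=r(5), ...
r_paired = r.copy()
r_paired[1::2] = r_paired[0::2]

def fit_slope_and_dw(y):
    """Least-squares straight-line fit; returns the slope and DW of the residuals."""
    slope, intercept = np.polyfit(i, y, 1)
    resid = y - (slope * i + intercept)
    return slope, durbin_watson(resid)

slope1, dw1 = fit_slope_and_dw(i + r)
slope2, dw2 = fit_slope_and_dw(i + r_paired)

print(f"independent noise: slope = {slope1:.4f}, DW = {dw1:.3f}")
print(f"paired noise:      slope = {slope2:.4f}, DW = {dw2:.3f}")
```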

Finally, the significance of a linear regression depends mainly on six things:

a. the actual trend in the data (for which you have only an estimate), T,

b. the number of data points, N,

c. the autocorrelation length of the residuals, C,

d. the uncertainty in each data point (which you may or may not know), DP,

e. the standard deviation of the residuals, DR, and

f. the length of the record, L.

If you don't know (DP), you generally assume it is the same as (DR), which means that you cannot judge whether a straight line is a good fit to the data. If you DO know (DP), then you can tell that a "straight line fit" is a good model if (DP) is the same order as (DR). If (DP) is significantly less than (DR), then you know a straight line is not a good fit (for example, a parabola may be a better fit).

So let's consider only the case where a straight line IS a good fit. The uncertainty in the trend is of order DR/(L x sqrt(N)) if there is no autocorrelation, and of order DR/(L x sqrt(L/C)) = DR/sqrt(L^3 / C) if there is correlation (since autocorrelation essentially reduces the number of independent points to L/C). (Please note the "of order" caveat here: I'm omitting lots of constants (which are of order 1) just to give you a feel for the problem.) You can derive this result simply by realising that the result of a linear regression is quite similar to what you get by taking the mean of the first half of the record and the mean of the second half, and dividing the difference of these means by the record length (again, I'm ignoring any annoying constants). As the autocorrelation is reduced, C approaches the sampling interval L/N and L/C becomes N.

The Durbin-Watson test only provides a warning that C is larger than L/N (the sampling interval) and that you need to estimate the uncertainty in the trend from DR/sqrt(L^3 / C) rather than from DR/(L x sqrt(N)) (which is the case for uncorrelated residuals). The test CERTAINLY doesn't suggest that you should not use the data for trend estimation.
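To get a feel for how much the two order-of-magnitude estimates differ, here is a small Python sketch that simply evaluates them. It keeps the same "of order" caveat (constants of order 1 are dropped), and the numerical values of DR, L, N and C below are hypothetical, chosen only for illustration.

```python
import math

def trend_uncertainty_uncorrelated(DR, L, N):
    """Order-of-magnitude trend uncertainty for uncorrelated residuals."""
    return DR / (L * math.sqrt(N))

def trend_uncertainty_correlated(DR, L, C):
    """Order-of-magnitude trend uncertainty with autocorrelation length C."""
    return DR / math.sqrt(L ** 3 / C)

# Hypothetical values: residual standard deviation 0.2, a 25-year record
# sampled monthly (N = 300), so the sampling interval is L / N years.
DR, L, N = 0.2, 25.0, 300
dt = L / N

u_white = trend_uncertainty_uncorrelated(DR, L, N)
u_same = trend_uncertainty_correlated(DR, L, C=dt)       # C equal to the sampling interval
u_corr = trend_uncertainty_correlated(DR, L, C=6 * dt)   # C of six sampling intervals

print(f"uncorrelated:         {u_white:.5f}")
print(f"C = sampling interval: {u_same:.5f}")
print(f"C = 6 intervals:       {u_corr:.5f}")
```

With C equal to the sampling interval the two formulas agree, and an autocorrelation length of six sampling intervals inflates the uncertainty by a factor of sqrt(6), without changing the estimated trend itself.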

]]>“Putting a top limit of 0.3 on the autocorrelation coefficient seems completely unjustified to me. There’s lots of evidence of coefficients >0.9.”

I’m not sure what you mean — we are trying to find a threshold for the autocorrelation coefficient below which we are confident that the residuals are uncorrelated (or at least that any autocorrelation has little effect on the results of a linear regression). If you picked a critical value of 0.9, many highly correlated series of residuals would appear uncorrelated. Are you suggesting that we should view a series with an autocorrelation coefficient of 0.9 as NOT autocorrelated?

]]>It is a bit like attempting to estimate the trend in global average temperature by only looking at data observed on every Christmas Day: the residuals could well appear uncorrelated using the Durbin-Watson test, but the trend would ONLY represent the trend in “Christmas Day” global average temperature. If I sampled more frequently (e.g. every month), then I would learn about the seasonal cycle and be able to estimate better the trend in global average temperature (for example, it is quite likely that the trend in July temperatures is different from the trend in December temperatures). However, the Durbin-Watson statistic would now indicate autocorrelation of the residuals, which, as I have pointed out before and will point out again, does NOT necessarily disqualify the trend. It just means that I have to do things a bit differently (e.g. remove both a trend AND the seasonal cycle in the regression) and/or estimate the uncertainty in the trend appropriately.

(AND STEVE — CAN YOU DELETE #18? – IT IS JUST THE BEGINNING OF MY “LOST” POSTING)

]]>I looked at the Emery and Thomson text as well, especially section 3.15.1 Trend estimates and the integral time scale. They don’t provide any references for the integral time scale technique. I don’t claim to have encyclopaedic knowledge of statistical literature, but I haven’t run into this technique in general statistical literature. It’s possible that it does something similar to what’s done in HAC (heteroskedasticity and autocorrelation consistent) estimation of covariance matrices in economics, but I’m not sure. It’s an interesting topic to think about, which I’m doing. It would be nice to see an actual statistical discussion of the integral time scale technique – maybe I’ll email the authors and inquire as to their source.

]]>