Willis writes: A couple of things.
First, I’ve digitized all of the Hegerl proxy data, and placed it here. I sampled it at ~three-year intervals, and interpolated the actual years.
Second, I took a look at their reconstruction method. They say:
“The first step of the reconstruction technique is to scale the individual proxy records to unit standard deviation, weigh them by their correlation with decadal NH 30-90°N temperature (land or land and ocean, depending on the target of reconstruction) during the period 1880 to 1960, and then average them.”
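For anyone who wants to poke at this step, here is a minimal sketch of a correlation-weighted composite as the quoted passage describes it. It is not the authors' code. The function names, the centering of each series before scaling, and the choice to divide by the number of proxies (rather than by the sum of the weights) are all assumptions made for illustration.

```python
import math

def pearson_r(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def standardize(series):
    """Center a series and scale it to unit sample standard deviation.
    (Centering is an assumption; the quote only mentions scaling.)"""
    n = len(series)
    mean = sum(series) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in series) / (n - 1))
    return [(v - mean) / sd for v in series]

def weighted_composite(proxies, target, start):
    """Correlation-weighted average of standardized proxies.

    proxies: list of equal-length decadal series (hypothetical inputs)
    target:  decadal instrumental temperatures for the calibration window
    start:   index in each proxy where the calibration window begins
    """
    end = start + len(target)
    scaled = [standardize(p) for p in proxies]
    # Weight = correlation with the target over the calibration window only.
    weights = [pearson_r(z[start:end], target) for z in scaled]
    n = len(scaled)
    # Simple average of the weighted series (normalization is an assumption).
    composite = [sum(w * z[i] for w, z in zip(weights, scaled)) / n
                 for i in range(len(scaled[0]))]
    return weights, composite
```

With a calibration window of 1880 to 1960 at decadal resolution, `target` has only eight values, so each weight is a correlation computed from eight pairs of numbers.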
Now, except for one proxy, they are using decadally smoothed data. But for the reconstruction procedure, they have averaged out their data to the decadal level. Thus, they are basing their entire reconstruction on how well it fits vs. eight data points (the eight decades from 1880 to 1960) … it seems like this alone should put their error levels from “floor to ceiling”. Let me go see …
Yes, there is not a single squared correlation (r^2) in the lot that is significant. In fact, the series are so short (only 8 data points = 6 degrees of freedom) and the autocorrelation is so strong that the p-value for the r^2 can only be calculated for four of them. The best of these four is W. Greenland, p = 0.23. Statistically meaningless. The other ten, once they are adjusted for autocorrelation, have less than one degree of freedom, so their p-value cannot even be calculated.
Thus, none of the r^2 values is statistically different from zero, and their method falls apart.
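The comment does not say which autocorrelation adjustment was used, so the following is only a sketch of one common approach: shrink the sample size using the lag-1 autocorrelations of both series, then compute a two-sided p-value from the t distribution with the reduced degrees of freedom. The formula n_eff = n(1 - r1x·r1y)/(1 + r1x·r1y), the function names, and the numerical integration of the t density are all choices made here for illustration; a different adjustment will give different numbers.

```python
import math

def pearson_r(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def lag1_autocorr(x):
    """Lag-1 autocorrelation of a series."""
    n = len(x)
    m = sum(x) / n
    num = sum((x[i] - m) * (x[i + 1] - m) for i in range(n - 1))
    den = sum((v - m) ** 2 for v in x)
    return num / den

def t_two_sided_p(t, df, steps=2000):
    """Two-sided p-value for Student's t, by Simpson's rule on the density.
    Works for non-integer df, which the adjustment below produces."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    pdf = lambda u: c * (1 + u * u / df) ** (-(df + 1) / 2)
    t = abs(t)
    h = t / steps
    area = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    area *= h / 3
    return max(0.0, 1.0 - 2.0 * area)

def adjusted_p(x, y):
    """Return (r, adjusted df, p). p is None when df falls below one,
    i.e. when the p-value cannot sensibly be calculated."""
    n = len(x)
    r = pearson_r(x, y)
    rho = lag1_autocorr(x) * lag1_autocorr(y)
    n_eff = n * (1 - rho) / (1 + rho)   # assumed adjustment, see lead-in
    df = n_eff - 2
    if df < 1:
        return r, df, None
    if abs(r) >= 1:
        return r, df, 0.0
    t = abs(r) * math.sqrt(df / (1 - r * r))
    return r, df, t_two_sided_p(t, df)
```

With eight decadal points the unadjusted degrees of freedom are already only six, so even moderate autocorrelation leaves very little to work with.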
Here’s a spaghetti graph of the contestants …
Only 4 of the 14 have an r^2 that is better than a straight line with regard to the Jones data …
The result of the correlation weighting procedure is a dataset that is more autocorrelated than most of the individual datasets … so we can’t calculate the p-value of it either.
A bozo test of the value of their method, I suppose, would be to compare individual correlations with the first four decades of the Jones data, and then see how well they do in the next four decades. A little bit of “out-of-sample” test … I’ll do this using the smoothed data, rather than the decadally averaged data as they have done, to get a more accurate result. Hang on a few minutes … OK, thanks for waiting. Here are the results …
YIKES … they have almost no correlation at all with the earlier four decades of Jones data, only with the later decades. The overall correlations don’t have any relationship with either half, and the two halves have no correlation with each other … and these are the proxies that we’re going to depend on for temperatures a thousand years ago?!?
Can you say “fails the out-of-sample test”? … I knew you could.
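The split-half check described above is easy to reproduce on any set of aligned series. Here is a rough sketch, assuming each proxy has already been trimmed to the same time steps as the instrumental series; the function name and input layout are hypothetical, and none of the actual proxy or Jones values are included.

```python
import math

def pearson_r(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def split_half_correlations(proxies, target):
    """Correlate each proxy with the target over the first half,
    the second half, and the full calibration period.

    proxies: list of series, each aligned with and as long as `target`
    target:  instrumental temperature series over the calibration period
    """
    half = len(target) // 2
    early = [pearson_r(p[:half], target[:half]) for p in proxies]
    late = [pearson_r(p[half:], target[half:]) for p in proxies]
    full = [pearson_r(p, target) for p in proxies]
    return early, late, full
```

If the proxies really track temperature, a proxy that correlates well in one half should also correlate well in the other. One way to check is to feed the `early` and `late` lists back into `pearson_r`: a value near zero across the proxies means the correlation in one period says nothing about the correlation in the other.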
PS – How can the correlations be so different for the different periods? Easy. Here’s Jones versus some selected datasets. You can see why the correlations are so radically different during different time frames.