I’ve been looking at MBH98 confidence interval estimation. There are many puzzles as to methodology. Here I’ll touch on some archiving oddities.

Figure 1. Standard deviation of MBH98 Reconstruction Steps

The procedure for confidence interval estimation is described in MBH98 as follows. (I’ll discuss these methods on another occasion; I’ve bolded the one point discussed below.)

Several checks were performed to ensure a reasonably unbiased calibration procedure. The histograms of calibration residuals were examined for possible heteroscedasticity, but were found to pass a χ² test for gaussian characteristics at reasonably high levels of significance (NH, 95% level; NINO3, 99% level). The spectra of the calibration residuals for these quantities were, furthermore, found to be approximately “white”, showing little evidence for preferred or deficiently resolved timescales in the calibration process. Having established reasonably unbiased calibration residuals, we were able to calculate uncertainties in the reconstructions by assuming that the unresolved variance is gaussian distributed over time. **This variance increases back in time (the increasingly sparse multiproxy network calibrates smaller fractions of variance), yielding error bars which expand back in time.**

MBH98 did their reconstruction in steps, using a different proxy roster in each step. They reported that the confidence intervals widen in the earlier portion of the data set. Figure 1 above shows the σ (standard deviation) values used in the confidence interval calculations. There are a number of oddities in this graph.

First, the original SI (now deleted at Nature in an extraordinary breach of policy) reported 11 calculation steps. Most of the step changes in the above graph coincide with step changes in the MBH98 reconstruction. However, the graphic above shows a step at 1650 that is not referred to in the original SI. I’ve also noticed elsewhere in the archived information that the archived RPC3 (reconstructed temperature principal component) commences in 1650. Further, the original SI indicated a step at 1730, but no step at 1730 is illustrated above. Based on the above, it is very difficult to conclude that MBH98 (or its Corrigendum) accurately reported the reconstruction steps.

Secondly, MBH98 used fewer proxies in the earlier steps. However, σ declines from the 1750-1760 step (using 89 proxies) to the AD1700 step (using 74 proxies), and declines further still in the otherwise undescribed AD1650 step, which presumably has 65-71 proxies. Taken at face value, the 18-24 proxies added between the AD1650 step and the 1750-1760 step somehow increase the standard error.

Finally, σ for the 1400-1450 step is slightly lower than σ for the 1450-1500 step. Remarkably, σ for the 1400-1450 step is identical to 7 decimal places to σ for the 1500-1600 step. One is tempted to conclude that something has been spliced incorrectly and that the confidence interval for the controversial 1400-1450 step is incorrect in all illustrations.
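For anyone who wants to run this kind of check on an archived σ series, here is a minimal sketch in Python. It assumes the series is piecewise constant (one σ per reconstruction step) and available as parallel lists of years and values. The function names `find_steps` and `repeated_sigmas` are my own, and the numbers in the example are made up for illustration; they are not the archived MBH98 values.

```python
from itertools import groupby

def find_steps(years, sigmas):
    """Collapse a piecewise-constant sigma series into (start, end, sigma) steps."""
    steps = []
    idx = 0
    for sigma, run in groupby(sigmas):
        n = len(list(run))  # length of this run of identical values
        steps.append((years[idx], years[idx + n - 1], sigma))
        idx += n
    return steps

def repeated_sigmas(steps, decimals=7):
    """Group steps whose sigma agrees to the given number of decimal places."""
    seen = {}
    for start, end, sigma in steps:
        seen.setdefault(round(sigma, decimals), []).append((start, end))
    # Keep only sigma values shared by more than one step.
    return {s: spans for s, spans in seen.items() if len(spans) > 1}

# Made-up illustrative values, NOT the archived MBH98 numbers.
years = list(range(1400, 1410))
sigmas = [0.15] * 3 + [0.17] * 4 + [0.15] * 3

steps = find_steps(years, sigmas)
dupes = repeated_sigmas(steps)
```

Listing the step boundaries found this way against the steps reported in the SI would show any unreported or missing steps, and a non-empty `dupes` flags non-adjacent steps that share a σ value.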

## 4 Comments

Steve,

How did they report the error bars? Was it just the σ or some function of the σ (along the lines of 2 × σ)?

What were the error bars for the period 1900 to 1980? How do these compare to the error bars reported with the surface instrument record?

Jeff, the error bars are ±2 sigma. There’s a curious difference between MBH98 and MBH99 error bars which I just noticed and will post. There is a lot of work in econometrics on variance estimation when there’s autocorrelation, but obviously MBH98 doesn’t consider this. It looks like they just calculated the standard error of the residuals in the calibration period (not even the verification period!?!) as an estimate. Steve
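The calculation described in the reply above can be sketched as follows. This is only an illustration of taking the standard error of calibration residuals and attaching a ±2 sigma band; whether MBH98 applied a degrees-of-freedom correction is not stated here, so the sketch uses a plain root mean square. The function names and the numbers are hypothetical.

```python
import math

def residual_sigma(observed, fitted):
    """Root-mean-square of calibration residuals (no degrees-of-freedom correction)."""
    residuals = [o - f for o, f in zip(observed, fitted)]
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))

def two_sigma_band(reconstruction, sigma):
    """Lower and upper bounds at +-2 sigma around each reconstructed value."""
    return [(x - 2 * sigma, x + 2 * sigma) for x in reconstruction]

# Hypothetical calibration-period values, for illustration only.
observed = [0.1, -0.2, 0.3, 0.0]
fitted = [0.0, -0.1, 0.1, 0.1]

sigma = residual_sigma(observed, fitted)
band = two_sigma_band([0.0, 0.5], sigma)
```

Note that a single σ estimated this way is applied unchanged to every year within a step, which is why the plotted σ is a step function.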

Dear Steve, I simply think that these numbers were invented in such a way that they “would look plausible” to a fast reader. The fact that MBH98 was certainly not a perfect manipulation with the numbers where every detail can be measured has already been demonstrated in MM2003.

Generally, it looks puzzling why an older period should be more accurate than a newer one, but on the other hand, it can probably happen, can’t it? More generally, I think that the error margin for the 15th century is many, many times higher than what they say – it’s enough to look at different papers that try to answer the same question.

Hi, Lubos. If the error bars are decreasing with fewer proxies, it would indicate that the proxies being added are worse than noise, I guess. Even so, it’s hard to figure out theoretically.

If you look at my next post, the standard error of the residuals is higher than the standard deviation of the temperature series being modeled. So one would conclude that there is no skill in the model.
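To make the "no skill" reasoning concrete, here is a small sketch. It assumes both quantities are measured over the same period, and uses the generic score 1 - SSE/SST, which is my choice for illustration and not necessarily the statistic used in MBH98. Whenever the residual standard error exceeds the standard deviation of the target series, this score is negative. The data are made up.

```python
def skill_score(observed, fitted):
    """1 - SSE/SST: positive only if the fit beats the mean of the observations."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - f) ** 2 for o, f in zip(observed, fitted))  # residual sum of squares
    sst = sum((o - mean_obs) ** 2 for o in observed)           # variation about the mean
    return 1 - sse / sst

# Made-up values in which the residuals are larger than the target's own variation.
observed = [0.1, -0.1, 0.2, -0.2]
fitted = [-0.2, 0.2, -0.1, 0.3]

score = skill_score(observed, fitted)
```

A negative score means that simply using the mean of the observations would have had smaller errors than the model.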

But the statistics of MBH98 are really simplistic. These series have a lot of serial correlation. They also have low-frequency red noise (at all scales up to ice ages), so the variance/standard deviation in a relatively short period like a 79-year calibration period would be an under-estimate. There’s a lot of research in econometrics on this.
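As a rough illustration of the under-estimate, suppose the series behaves like an AR(1) process with lag-1 autocorrelation `phi`. That is an assumption for the sketch, not something established for these series, and 0.9 is an arbitrary value. For a stationary series with autocorrelations ρ_k, the expected sample variance over n points is the true variance times 1 - (2/(n-1)) Σ (1 - k/n) ρ_k, summed over k = 1 to n-1, and for AR(1) ρ_k = phi^k.

```python
def expected_variance_ratio(n, phi):
    """E[sample variance] / true variance for an AR(1) series of length n."""
    # Weighted sum of autocorrelations rho_k = phi**k for lags 1..n-1.
    weighted = sum((1 - k / n) * phi ** k for k in range(1, n))
    return 1 - 2 * weighted / (n - 1)

# 79 matches the calibration period length; phi = 0.9 is an assumed value.
ratio_white = expected_variance_ratio(79, 0.0)  # uncorrelated case
ratio_red = expected_variance_ratio(79, 0.9)    # strongly autocorrelated case
```

With no autocorrelation the ratio is exactly 1; under the assumed `phi` it falls to roughly 0.8, so a variance estimated from a window that short would be biased low. Slower red noise than AR(1) would make the shortfall larger.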

Even on the data shown here, I don’t see how their reconstruction achieves a confidence interval below natural variation.

Regards, Steve