Continued from here.
The Dirty Laundry residual datasets for AD1000, AD1400 and AD1600 were each calculated using Mann's "sparse" instrumental dataset, but the resulting sigmas and RE(calibration) statistics do not match the reported values. In contrast, the Dirty Laundry residual dataset for the AD1820 step, which was calculated by Tim Osborn of CRU because Mann "couldn't find" his copy of the AD1820 residual data, used a different MBH98 target instrumental dataset: the "dense" instrumental series.
Question: is it possible that Mann had two versions of the residual data: sparse and dense? And that he chose the dense version for MBH98 statistics (sigma, RE_calibration) because it yielded “better” statistics, but inadvertently sent the sparse version (with worse values) to Osborn?
This appears to be exactly what happened. Using the Dirty Laundry values for the 1902-1980 reconstruction against the MBH98 dense temperature series yields an exact replication of the reported MBH98 calibration RE and sigma (standard error of residuals) for the AD1400 and AD1600 steps, and of the reported MBH99 calibration RE for the AD1000 step.
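For readers who want to check this sort of replication themselves, here is a minimal sketch (in Python) of the calculation assumed above: RE is computed relative to the calibration-period mean of the target, and sigma is the standard deviation of the calibration residuals. The function and variable names are mine, and whether a degrees-of-freedom correction is applied to sigma is an assumption.

```python
import numpy as np

def calibration_stats(recon, target):
    """Return RE (reduction of error) and sigma (standard error of residuals)
    for a reconstruction against an instrumental target over the calibration period.
    RE = 1 - SS(residuals) / SS(target about its calibration-period mean)."""
    recon = np.asarray(recon, dtype=float)
    target = np.asarray(target, dtype=float)
    resid = target - recon
    re = 1.0 - np.sum(resid**2) / np.sum((target - target.mean())**2)
    sigma = resid.std(ddof=0)   # assumption: no degrees-of-freedom correction
    return re, sigma

# Hypothetical usage: recon_ad1400 = reconstruction for the AD1400 step,
# dense_1902_1980 = dense instrumental target, both restricted to 1902-1980.
# re_cal, sigma_cal = calibration_stats(recon_ad1400, dense_1902_1980)
# Substituting the sparse target instead gives the (worse) Dirty Laundry values.
```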

Conclusion: We KNOW that MBH98 calculated residual series using the sparse target because they were sent to Osborn in the Dirty Laundry email and shown in the MBH99 submission Figure 1a. We KNOW that MBH98 calculated residual series using the dense target because of the reported RE_calibration and sigma values in MBH98. The corollary is that MBH98 calculated two sets of residual series, then selected the "better" values for display without disclosing the worse values or the selection operation itself.
MBH99 confidence intervals are related to MBH98 confidence intervals, but different. They were a longstanding mystery during the heyday of the Climate Audit blog. In the next post, I'll review MBH99 confidence intervals. We're a bit closer to a solution, and maybe a reader will be able to figure out the balance.
Over and above this particular issue is another, even more fundamental, issue: the use of calibration-period residuals to estimate confidence intervals in the face of a massive failure of verification-period r^2 values. Prior to Climategate, I had written several posts and comments raising the problem of massive overfitting in the calibration period through a little-discussed MBH98/99 step involving a form of inverse regression (closer to PLS regression than to OLS regression: some intuitions of OLS practitioners have to be set aside). There are some very interesting issues and problems arising from this observation, and even some points of potential mathematical interest. I'll try to elaborate on this in a future post.
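To see why calibration-period residuals are a poor basis for confidence intervals when verification r^2 fails, consider the following toy sketch. It is mine, not a replication of the MBH98 inverse-regression step, and the proxy count and period lengths are assumptions chosen only for illustration: fitting many pure-noise "proxies" to a target over a short calibration window produces a flattering calibration r^2 that evaporates in the verification window.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy illustration of calibration-period overfitting (not MBH98's actual
# stepwise procedure): regress a target on many pure-noise "proxies" over a
# calibration window, then score the same fit on a withheld verification window.
n_cal, n_ver, n_proxies = 79, 48, 50    # assumed lengths, roughly 1902-1980 / 1854-1901
target = rng.standard_normal(n_cal + n_ver)
proxies = rng.standard_normal((n_cal + n_ver, n_proxies))

X_cal, y_cal = proxies[:n_cal], target[:n_cal]
X_ver, y_ver = proxies[n_cal:], target[n_cal:]

# Least-squares fit on the calibration period only
A_cal = np.column_stack([np.ones(n_cal), X_cal])
beta, *_ = np.linalg.lstsq(A_cal, y_cal, rcond=None)
fit_cal = A_cal @ beta
fit_ver = np.column_stack([np.ones(n_ver), X_ver]) @ beta

def r2(obs, fit):
    return np.corrcoef(obs, fit)[0, 1] ** 2

print("calibration r^2:", round(r2(y_cal, fit_cal), 2))   # high, purely from overfitting
print("verification r^2:", round(r2(y_ver, fit_ver), 2))  # near zero
```

Confidence intervals derived from the calibration residuals of such a fit would look reassuringly tight while the fit itself has no out-of-sample skill.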
Postscript
There is a substantial and surprisingly large difference between the two MBH98 target instrumental series (see diagram below). The sparse series, according to MBH98, is the "subset of the gridded data (M′ = 219 grid-points) for which independent values are available from 1854 to 1901"; the dense series is calculated for 1902-1980 from 1082 gridcells. In the 1902-1980 (MBH calibration) period, there is considerably more variability in the sparse series.
[Figure: MBH98 sparse vs dense target instrumental series, 1902-1980]
5 Comments
Waiting for someone to chime in: “It was 25 years ago! Who cares!”
Jeff, how about (with due apologies):
“It was twenty five years ago today
When Sgt. Mann taught the band to play
They’ve been going in and out of style
But they’re guaranteed to raise a smile
So may I introduce to you
The act you’ve known for all these years
Sgt. Mann’s Lonely Hearts Club Band”
I don’t think there are enough apologies in the universe…
Jeff Alberts:
Because it illustrates how analyses can be botched, and because it illuminates the competency of some climate catastrophists.
I know, Keith. I was just channeling my inner leftist. And now I feel dirty.