Barton’s question 7d asked MBH about other verification statistics. We’ve discussed their withholding of the R2 statistic here, here and here. In our GRL article, we pointed out that their 15th century reconstruction also failed the CE statistic, another verification statistic used in dendroclimatic studies [Cook et al, 1994].
Mann’s retort was that climatologists "prefer" the RE statistic. I recently noticed an interesting discussion of the CE statistic in Mann and Rutherford [2002].
Mann and Rutherford state:
We made use of two distinct diagnostics of the data variance resolved by the reconstructions [see e.g. Cook et al., 1994] including the reduction of error ("RE") which uses the calibration period climatology as a baseline (this is called "β" by Mann et al., [1998]), and the coefficient of efficiency ("CE") which instead employs the verification period climatology as a baseline [equations reformatted here because I don't know how to do math terminology on WordPress]:
RE = 1 - Σ (û[i] - u[i])^2 / Σ (u[i] - ū[cal])^2
CE = 1 - Σ (û[i] - u[i])^2 / Σ (u[i] - ū[ver])^2
RE = 0 and CE = 0 represent the scores for a reconstruction which simply specifies the climatological mean, and thus only positive values of the statistics indicate statistically significant skill. CE = 0 is a more challenging threshold since, unlike RE, CE does not reward reconstructing an observed change in mean relative to the calibration period.
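To make the difference between the two baselines concrete, here is a minimal Python sketch of both statistics as defined above. The function name `re_ce` and the numbers are made up for illustration; they are not taken from Mann and Rutherford or from MBH98. The toy "reconstruction" gets the shift in mean relative to the calibration period right but captures none of the year-to-year variation.

```python
from statistics import mean

def re_ce(observed, reconstructed, calibration_mean):
    """Return (RE, CE) over a verification period.

    observed, reconstructed: verification-period values (same length).
    calibration_mean: mean of the observed series over the calibration period.
    """
    sse = sum((o - r) ** 2 for o, r in zip(observed, reconstructed))
    verification_mean = mean(observed)
    ss_cal = sum((o - calibration_mean) ** 2 for o in observed)   # RE baseline
    ss_ver = sum((o - verification_mean) ** 2 for o in observed)  # CE baseline
    return 1 - sse / ss_cal, 1 - sse / ss_ver

# Hypothetical verification data: observations sit below a calibration mean of 0.0.
obs = [-0.4, -0.2, -0.3, -0.1, -0.5]
# Hypothetical reconstruction: a flat line at the verification mean.
rec = [-0.3] * 5

re_stat, ce_stat = re_ce(obs, rec, calibration_mean=0.0)
print(f"RE = {re_stat:.3f}, CE = {ce_stat:.3f}")
```

In this toy case RE comes out strongly positive while CE is essentially zero, which is the behaviour described in the quotation: RE rewards getting the change in mean right, CE does not.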
Their Table 1 shows reconstructions from pseudoproxies constructed to have a relationship to temperature. Of the 29 reconstructions from pseudoproxies, 13 have RE values over 0.9 and a cumulative total of 24 of 29 have a value exceeding 0.8. Thus, very high RE values seem to be obtained from pseudoproxies with an actual relationship to temperature. 13 of 18 had CE values exceeding 0.5 and 20 of 29 had values exceeding 0.3. So again, reconstructions with pseudoproxies yield high CE values.
Our GRL article reported that cross-verification statistics for the 15th century MBH98 reconstruction were (inter alia): R2: 0.02; CE: -0.35.
So Mann and Rutherford [2002] demonstrates use of the CE statistic by two people holding themselves out as climatologists, who did not limit their statistical discussion to the problematic RE statistic.
Here’s another sentence in Mann and Rutherford [2002] that’s fraught with significance:
Uncertainties in the reconstructions can be estimated from the residual uncalibrated verification period variance.
This is a point that has bothered me in the MBH98 confidence interval calculations, which I may have mentioned before. If you have calibration and verification periods, then confidence intervals should be calculated from the verification period, as stated here by Mann and Rutherford [2002]. Obviously I don’t rely on them as authoritative on statistical issues; the point can be confirmed elsewhere. However, in MBH98, they used the calibration period residuals.
Here’s the rub: for the 15th century reconstruction, there was a decently high calibration R2 (from sheer curve fitting), while the verification period R2 was approximately 0. A verification R2 of approximately 0 means that the confidence intervals are not reduced below natural variation. There is NO reduction of uncertainty. The use of calibration residuals is not acceptable and should not have been permitted by alert referees.
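The effect of choosing one set of residuals over the other can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the MBH98 procedure: the data are invented, the helper `residual_sigma` is hypothetical, and a 2-sigma half-width assumes roughly normal, independent residuals.

```python
from math import sqrt
from statistics import mean, pstdev

def residual_sigma(observed, reconstructed):
    """Root-mean-square residual between observed and reconstructed values."""
    squared = [(o - r) ** 2 for o, r in zip(observed, reconstructed)]
    return sqrt(mean(squared))

# Invented numbers: a close fit in calibration, no skill in verification.
cal_obs = [0.1, -0.2, 0.3, 0.0, -0.1, 0.2]
cal_rec = [0.1, -0.1, 0.2, 0.0, -0.1, 0.1]
ver_obs = [-0.3, 0.1, -0.2, 0.2, -0.1, 0.0]
ver_rec = [0.1, -0.1, 0.1, -0.1, 0.0, -0.3]

cal_half = 2 * residual_sigma(cal_obs, cal_rec)  # half-width from calibration residuals
ver_half = 2 * residual_sigma(ver_obs, ver_rec)  # half-width from verification residuals
natural_half = 2 * pstdev(ver_obs)               # spread of the observations themselves

print(f"calibration-based:  +/- {cal_half:.2f}")
print(f"verification-based: +/- {ver_half:.2f}")
print(f"natural variation:  +/- {natural_half:.2f}")
```

With these made-up series the calibration-based interval is much narrower than the verification-based one, and the verification-based interval is no narrower than the natural variation of the observations, which is the situation described above.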
Mann and Rutherford [2002] may explain another old question that has re-surfaced recently (Crowley in EOS). After we published our first study, Mann said that the data set which we had used was not the correct data set. Mann said that we had asked for data in an Excel spreadsheet, that errors had been introduced in preparation of this spreadsheet for us and that we’d failed to notice the errors: "garbage in, garbage out". These claims were all provably false: I’d asked for an FTP location and had emails to prove it. I had no interest in data in a new form; I was interested in the data as used, which any reader of this blog will recognize as the way that I work. We posted up the correspondence to prove the point, but the lie still continues: it was recently published in EOS by Crowley.
The data set was not prepared for us. When we checked the date on the file at Mann’s FTP site, it was dated August 8, 2002, shortly after the creation of Mann’s FTP site in Virginia on July 31, 2002. So the claim that they had created it for us was a lie as well. (This does not mean that this file was necessarily used in MBH98. There are third alternatives, one of which is mentioned below in connection with Mann and Rutherford, 2002.) Shortly after publication of MM03, Mann deleted the "garbage" file from his FTP site (where it was in Rutherford’s section) together with its incriminating date evidence.
As to our supposedly failing to notice the errors, we spent over 20 pages itemizing errors in MBH98 data, one of which was collation errors. We described the re-collation and re-calculation of principal component series. For Mann to claim that we failed to notice the errors was one more lie. The disinformation continues and has been re-cycled on this board recently by "Mitch". Unfortunately, I have to respond to Crowley’s EOS article some time soon.
At the time, as I’ve mentioned, I was intrigued by a famous scientist making statements that were provably untrue. It sure caught my interest. Also a new directory of proxy data suddenly materialized in November 2003: the "real" books, rather than the ones that we’d been shown before. There was lots of hair on this as well. However, I’ve gotten very tired of these canards continuing to circulate and the credence that they continue to receive.
As to the deleted file, it had the name pcproxy.txt. At the time, I found that there was a diagram on Rutherford’s website containing a graphic with the expression "pcproxy", but didn’t save it. A few weeks later when I went back, it had been vacuumed as well. In this case, fortunately, there was a record of Rutherford’s graphic at the Wayback Machine archive and the graphic with the phrase pcproxy is reproduced below (the date of this graphic was 2001, as I recall):
Figure 1. From Rutherford’s webpages as archived by Wayback Machine. Note the legend in which the phrase pcproxy appears.
Contemporary with MM03, we archived the re-collated data, which did not have the collation errors of Mann’s "garbage" data. In Mann’s data, the 1980 values for as many as 9 series were identical to 7 decimal places. This was not easy to notice in a big data set, but caught my attention when I noticed it. Ultimately this doesn’t "affect" the results other than showing a lack of care and suggesting that a comprehensive examination of their results was required.
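A check for this kind of collation error is easy to automate. The sketch below is only an illustration: the function `duplicated_values`, the series names and the values are hypothetical and are not drawn from pcproxy.txt. It groups the series in a single year's row by value rounded to 7 decimal places and reports any value shared by more than one series.

```python
from collections import defaultdict

def duplicated_values(row, decimals=7):
    """Map each value shared by two or more series to the names of those series.

    row: dict of series name -> value for a single year.
    """
    groups = defaultdict(list)
    for name, value in row.items():
        groups[round(value, decimals)].append(name)
    return {value: names for value, names in groups.items() if len(names) > 1}

# Hypothetical values for one year across five series.
row_1980 = {
    "series_a": 0.0230304,
    "series_b": 0.1175000,
    "series_c": 0.0230304,
    "series_d": 0.0230304,
    "series_e": -0.4012000,
}

dups = duplicated_values(row_1980)
for value, names in dups.items():
    print(f"{value:.7f} appears in {len(names)} series: {', '.join(names)}")
```

Independent proxy series agreeing to 7 decimal places in the same year is very unlikely by chance, so any group reported here is worth tracing back to the collation step.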
So we provided evidence that we didn’t use the "garbage" data. Mann has never provided evidence that he didn’t. Our surmise for some time (since the November 2003 disclosure of the "real" books) was that pcproxy.txt was probably the product of work that Rutherford did between MBH98 and our inquiry. Mann and Rutherford [2002] does not explicitly report on proxy results (as opposed to pseudoproxy results), but certainly evidences Rutherford working on this topic around the date of pcproxy.txt being archived at Mann’s then new Virginia FTP site.
Reference: Michael E. Mann and Scott Rutherford, 2002, Climate reconstruction using "Pseudoproxies", GRL 29(10), 1501, doi: 10.1029/2001GL014554.