Yesterday we noted that “validation” in M08 means that the average of the “late-miss” and “early-miss” RE statistics is above a benchmark of about 0.35. I take no position at present on whether this unusual methodology means anything, though I’m a bit dubious.
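For readers unfamiliar with the statistic, RE (reduction of error) compares a reconstruction’s squared errors against those of a naive benchmark – predicting the calibration-period mean. Here is a minimal sketch of that standard definition, plus the M08-style averaging test with the ~0.35 benchmark as a default; the function names are mine for illustration, not M08’s:

```python
import numpy as np

def re_statistic(obs, est, calib_mean):
    """Reduction of Error: 1 - SSE(reconstruction) / SSE(calibration-mean benchmark).
    RE = 1 for a perfect reconstruction; RE = 0 if no better than the calibration mean."""
    obs, est = np.asarray(obs, float), np.asarray(est, float)
    sse_recon = np.sum((obs - est) ** 2)
    sse_mean = np.sum((obs - calib_mean) ** 2)
    return 1.0 - sse_recon / sse_mean

def m08_validates(re_late, re_early, benchmark=0.35):
    """M08-style test: average of late-miss and early-miss RE above the benchmark."""
    return 0.5 * (re_late + re_early) > benchmark
```

A perfect reconstruction gives RE = 1; one that merely reproduces the calibration mean gives RE = 0, which is why positive RE is conventionally read as skill.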
I also observed that the with-dendro reconstruction adds surprisingly few series to the no-dendro reconstruction in the AD1000 step, since the majority (16 of 19 NH) of dendro series are screened out.
I also noted that the average RE statistic (and thus “validation”) turned out to be sensitive to the inclusion/exclusion of one or two series. In yesterday’s case, the exclusion of the one “passing” bristlecone in the AD1000 network strongly affected the period of validation.
Today, I’m going to report on another interesting calculation, this time revisiting one issue noticed right at the time of M08’s publication and one that I noticed only today; both, however, involve the use of realdata.
First, in September 2008, Jeff Id and I both discussed M08’s replacement of actual Briffa MXD data after 1960 with “infilled” data. Obviously the deletion of post-1960 Briffa data has a long and sordid history. It was also deleted from the IPCC Third Assessment Report spaghetti graph and from the AR4 spaghetti graph. It was the data at issue in the “trick… to hide the decline” email. In this case, actual data was not replaced by temperature data, but by “RegEM data” calculated in Rutherford et al 2005. As a reader notes in comments, Rutherford et al say that they did not use Luterbacher data in making this particular sausage. At this time, I don’t know for sure exactly what is in the RegEM sausage as a substitute for actual MXD data. Actual data for 105 of the 1209 M08 series (those using MXD) had been replaced after 1960 by, shall we say, verification-enhanced RegEM data. I had obtained the realdata from CRU through a successful FOI request in 2008, and for this analysis I replaced the “enhanced” data with realdata.
Second, M08 used performance-enhancing data in the form of 70 Luterbacher gridded series that incorporate actual instrumental data. These data were removed in Mann et al 2009 (Science), and I remove them here as well in both the no-Tilj and realdata cases.
Third, and this was intriguing: yesterday I noted that there were only three dendro series in the AD1000 with-dendro network. I discussed the bristlecone series yesterday. Today I looked at the “Tornetrask” series, for which the citation was Briffa, K.R. et al, 1992 (Clim Dyn) – the “bodged” reconstruction. However, the M08 version runs from AD 1 to 1993 – the time period of Briffa (2000) – yet it matches neither the Briffa (2000) version nor the Briffa et al 2008 Tornetrask-plus-Finland version. Adopting the policy of using realdata whenever possible, I replaced the M08 “Tornetrask” version with realtornetrask data (the Briffa 2000 chronology in this case).
I then re-ran the M08 CPS algorithm on a realdata version, extracting the RE statistics by step as before. Without the performance-enhancing Luterbacher series (fortified with instrumental data) and with real MXD data, the RE statistics after 1500 are noticeably reduced for both late-miss and early-miss versions. Before 1500, the late-miss RE statistic is also reduced significantly (due to no-Tilj and the impact of realtornetrask).
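The CPS (composite-plus-scale) recipe itself is simple: standardize the screened proxies over the calibration window, average them into a composite, and rescale the composite to the calibration-period mean and variance of the instrumental target. The sketch below implements that generic recipe only – it is not M08’s actual code, which additionally handles gridding, screening and stepwise networks:

```python
import numpy as np

def cps(proxies, instr, calib):
    """Generic composite-plus-scale sketch.
    proxies: (n_years, n_proxies) array of screened proxy series
    instr:   (n_years,) instrumental target
    calib:   indices of the calibration window"""
    proxies = np.asarray(proxies, float)
    instr = np.asarray(instr, float)
    # 1. standardize each proxy over the calibration window
    z = (proxies - proxies[calib].mean(axis=0)) / proxies[calib].std(axis=0)
    # 2. average into a composite
    comp = z.mean(axis=1)
    # 3. rescale the composite to the instrumental mean/variance over calibration
    comp = (comp - comp[calib].mean()) / comp[calib].std()
    return comp * instr[calib].std() + instr[calib].mean()
```

With a single proxy that exactly equals the target, the rescaling step recovers the target series, which is a convenient sanity check on the arithmetic.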
Figure 1. Late-miss and early-miss RE statistics for 3 M08-style networks. M08: from archived statistics; no-Tilj: removing contaminated Tiljander sediments and Luterbacher series; realdata: using real Schweingruber MXD data and real Tornetrask data, without the performance-enhancing Luterbacher gridded series that incorporate instrumental data.
I then calculated the average RE statistics used to “validate” the M08 reconstructions. As you see, the realdata network results in substantially lower average RE statistics in all periods.
M08 do not report their benchmark “95% significant” statistic, but by inspection of their diagrams and the archived data cps-validation.xls, the 95% benchmark looks to be about 0.355 or so. This is plotted as a horizontal line.
Figure 2. Average RE statistics for the three cases as above. Solid line shows the “adjusted” M08 RE statistic (basically the “best” RE statistic: if a network with an earlier starting date has a higher RE, that value is used). Dotted line shows the M08 “95% significant” RE benchmark.
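The “adjusted” RE described in the caption amounts to a running maximum: order the stepwise networks from earliest start date and carry forward the best RE seen so far. A sketch of that reading, with hypothetical step data and a function name of my own choosing:

```python
from itertools import accumulate

def adjusted_re(steps):
    """steps: list of (start_year, re_stat) pairs, one per stepwise network.
    Each step is credited with the best RE among networks starting at or
    before its own start year -- i.e. a running max, earliest network first."""
    ordered = sorted(steps)                           # earliest start year first
    best = accumulate((re for _, re in ordered), max)  # running maximum of RE
    return [(year, re) for (year, _), re in zip(ordered, best)]
```

So a high-RE early network props up the “adjusted” statistic for every later step, which is why the adjustment can only raise, never lower, a step’s reported RE.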
In this example using realdata, the with-dendro M08 CPS reconstruction fails “validation” for all intervals. This example (CRU_NH) had the “best” “validation” statistics in the base M08 examples, so I don’t expect other combinations to fare any better.
The script to make these figures from previously collated Notilj and realdata versions is at
http://www.climateaudit.info/scripts/mann.2008/blog_20100808.txt. The runs to make the versions are at