[Steve: Editorial comment – This is John A’s post. I do not agree with his editorial flourishes linking this to models. I view the following as illustrating the defects of sole reliance by multiproxy reconstructions on the RE statistic – a statistic for which there are no distribution tables, which is little known among ordinary statisticians, and which is unstudied outside tree ring circles. This is a battleground issue with respect to Mann and similar studies, and it is important to illustrate how it operates; the post is useful for that, but I don’t view it as showing more than that.]
Dave Stockwell has shown the future of climate prediction with a great new device that allows you to create a new prediction of future climate change that is at least as good as the tree ring proxies were for the past. Arguably, this new technique puts expensive climate modelling exercises like climateprediction.net to shame.
Here’s a prediction of future climate generated each time the page is refreshed. Note that the RE and r2 statistics are calculated automatically.
The data in blue is the instrumental data (courtesy of the CRU) and the red is the prediction of future temperatures for the next 100 years.
Let Dave Stockwell explain:
The validation is based on the 11 points at the end of the temperature record not used in generating the simulated points. Two statistics were calculated and can be seen on the figure:
- The R2 correlation is ubiquitously used for quantifying the strength of association between two variables. An R2 near the critical value of 0.1 would indicate at most a mild correlation; values closer to one indicate a significant relationship.
- The RE reduction of error statistic is used in dendroclimatology and in the “hockey stick” reconstruction of MBH98, where critical values greater than zero are claimed to indicate significance of the model. RE is claimed to be superior to the R2 statistic in WA06.
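The two statistics can be sketched in a few lines. This is a hedged illustration, not Stockwell's actual code: the function names are my own, and using the calibration-period mean as the RE baseline is an assumption, though it matches the usual verification-period convention.

```python
import numpy as np

def r2_stat(obs, pred):
    """Squared Pearson correlation between observed and predicted series."""
    r = np.corrcoef(obs, pred)[0, 1]
    return r ** 2

def re_stat(obs, pred, calib_mean):
    """Reduction of Error: RE = 1 - SSE(prediction) / SSE(baseline),
    where the baseline forecast is the calibration-period mean (assumed
    convention). RE > 0 means the prediction beats that constant baseline."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    sse_pred = np.sum((obs - pred) ** 2)
    sse_base = np.sum((obs - calib_mean) ** 2)
    return 1.0 - sse_pred / sse_base
```

A perfect prediction gives RE = 1 and R2 = 1, and predicting the calibration mean itself gives RE = 0. Note that R2 is invariant to the level and scale of the prediction while RE is not; that asymmetry is what drives the divergence between the two statistics in the demo.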
Hit reload a few times to get a feel for the typical values of the statistics. The R2 statistic is usually close to zero, indicating that the prediction has no statistical skill over the validation period. The RE statistic, however, is always greater than zero, and often greater than 0.5.
MBH98 uses an RE benchmark of zero to indicate significance, yet the random numbers here consistently give RE statistics above that critical value. Using the RE statistic with a critical value of zero would therefore attribute statistical skill to random numbers. That is, under the criteria used in MBH98, random numbers could be regarded as skillful predictors of future temperatures.
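The effect is easy to reproduce. The following is a sketch under assumed inputs, not the code behind the web demo: a simple linear warming trend stands in for the CRU record, and a small-step random walk started at the last calibration value stands in for the "prediction".

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for an instrumental record with a warming trend (an assumption,
# not the CRU data): 100 points rising from 0 to 1.
obs = np.linspace(0.0, 1.0, 100)
calib, verif = obs[:89], obs[89:]   # last 11 points held out, as in the post

def trial():
    # "Prediction": a random walk started at the last calibration value.
    steps = rng.normal(0.0, 0.02, size=len(verif))
    pred = calib[-1] + np.cumsum(steps)
    sse_pred = np.sum((verif - pred) ** 2)
    sse_base = np.sum((verif - calib.mean()) ** 2)
    re = 1.0 - sse_pred / sse_base
    r = np.corrcoef(verif, pred)[0, 1]
    return re, r ** 2

results = np.array([trial() for _ in range(1000)])
print("mean RE:", results[:, 0].mean())   # typically close to 1 here
print("share of trials with RE > 0:", (results[:, 0] > 0).mean())
```

The random walk earns RE credit simply by sitting at the warm end of the record, inheriting the step up from the calibration mean, while R2, which ignores level and scale, remains uninformative. With a benchmark of zero, persistence alone passes the RE test.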
This example illustrates (if the code is correct) a situation, similar to MBH98, where the R2 statistic correctly indicates no statistical skill in the predictions, but the RE statistic erroneously indicates statistical skill.
Conclusions hinge on the choice of statistic and on where you set the benchmark. MM05 obtain a critical value for RE of greater than 0.5 using random red-noise data in a replication of the procedure used in MBH98. The lack of statistical skill in the models is one of the main arguments in MM05 against the reconstruction method of MBH98.
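The MM05 approach to setting the benchmark can be sketched as a Monte Carlo exercise. This is a deliberately simplified stand-in for their replication, not their code: each trial "calibrates" an AR(1) red-noise series against a trending target by linear regression and scores the held-out period; the persistence parameter, series lengths, and target are all assumptions, so it will not reproduce their 0.5 figure, only the procedure.

```python
import numpy as np

rng = np.random.default_rng(1)

target = np.linspace(0.0, 1.0, 100)   # stand-in temperature record (assumption)
n_cal = 89                            # calibration length, as in the demo above

def red_noise(n, phi=0.9):
    """AR(1) series with assumed lag-one autocorrelation phi."""
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.normal()
    return x

def trial_re():
    proxy = red_noise(len(target))
    # "Calibrate": least-squares fit of the target on the pseudo-proxy.
    b, a = np.polyfit(proxy[:n_cal], target[:n_cal], 1)
    pred = a + b * proxy[n_cal:]
    sse_pred = np.sum((target[n_cal:] - pred) ** 2)
    sse_base = np.sum((target[n_cal:] - target[:n_cal].mean()) ** 2)
    return 1.0 - sse_pred / sse_base

res = np.array([trial_re() for _ in range(1000)])
crit = np.percentile(res, 95)   # empirical 95% critical value for RE
```

Significance should then be judged against `crit`, not against zero: a reconstruction's RE is only meaningful if it exceeds what red noise achieves by chance under the same procedure.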
So there you have it: statistical skill or not? If the statistical tests can be equalled or bettered by random numbers with long-term persistence, then the next IPCC review, just like the previous one, will contain no more information to inform policymakers than a table of random numbers. If this is so, why are climate journals still publishing studies that exhibit just this behavior?