I’m now completely file-compatible with Wahl and Ammann. In terms of my algorithm, I had to tweak the procedure for scaling RPCs a little, but the RPCs themselves were identical. It took some patience to reconcile differing data setups. I archived results from a complete run-through of their base case a couple of days ago. They have not archived results yet, but the graphs look very similar.
Our standing prediction for MBH98-type reconstructions is that verification statistics other than the RE statistic will be insignificant. I suppose that another prediction is that the Hockey Team will not report these other verification statistics. Both predictions are correct here. Needless to say, only the RE statistic is at Ammann’s website. The archived program does not even calculate other statistics, although it’s hard to imagine that they have not peeked to see what the other statistics are.
As someone with prospectus experience, I think that these statistics are highly relevant to readers and that they should not be withheld by the Hockey Team. If they then want to argue that verification statistics other than the RE statistic are incorrect, then let them. (They will then have to defend this position against their own use of these other statistics in their other writings where it is to their advantage.) Anyway, here are my calculations of standard verification statistics for the AD1400 step from the Wahl-Ammann run-through. The cross-validation R2 is virtually 0; the sign is correct only 54% of the time, just better than random; the product mean test has a t-distribution and is insignificant; and the CE statistic is negative. No wonder they don’t want anyone to look at these statistics.
| RE (cal) | RE (ver) | R2 (ver) | CE | Sign Test | Product Test |
|----------|----------|----------|----|-----------|--------------|
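For anyone who wants to check the arithmetic, here is a minimal Python sketch of how these statistics are conventionally computed. The function name and the convention of taking anomalies relative to the calibration mean are my choices for illustration; this is not code from the WA archive.

```python
import numpy as np

def verification_stats(obs, est, cal_mean):
    """Standard verification statistics over a verification period.

    obs, est : observed and reconstructed NH series (numpy arrays)
    cal_mean : mean of the observed series over the calibration period
    Anomalies are taken relative to the calibration mean; that convention
    is my assumption, not necessarily the exact MBH98/WA one.
    """
    sse = np.sum((obs - est) ** 2)
    # RE: skill relative to always predicting the calibration mean
    re = 1.0 - sse / np.sum((obs - cal_mean) ** 2)
    # CE: skill relative to predicting the verification-period mean
    ce = 1.0 - sse / np.sum((obs - obs.mean()) ** 2)
    # Cross-validation R2: squared Pearson correlation
    r2 = np.corrcoef(obs, est)[0, 1] ** 2
    # Sign test: proportion of years with agreeing anomaly signs
    sign = np.mean(np.sign(obs - cal_mean) == np.sign(est - cal_mean))
    # Product mean test: t-statistic on products of paired anomalies
    prod = (obs - cal_mean) * (est - cal_mean)
    t = prod.mean() / (prod.std(ddof=1) / np.sqrt(len(prod)))
    return {"RE": re, "CE": ce, "R2": r2, "sign": sign, "product_t": t}
```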
I’ve reconciled so that I can now specify differing methods as parameters, press a button on my function that calculates MBH98 NH temperature reconstructions, and get both an AD1400 step result and a stepwise result. I’ve reconciled my algorithm step by step to WA methods and kept a reconciliation script if anyone is interested. There was a real curiosity in the stepwise reconciliation. My results were identical to about 7 decimal places in both the AD1400 calculation and the stepwise reconstruction, except for 1750-1759, where there was a divergence amounting to about 0.9 sigma in one year. This pointed out an interesting little sensitivity. MBH98 changed the selection of temperature PCs used in different steps. The schedule is shown here. (This information used to be on the old Nature SI, which has been deleted; other than the relict UMass FTP website, this information would no longer be publicly available.) If you look at the selected PCs, you’ll see that the AD1750-1759 step [why does this decade have its own step?] uses PCs 1-3,5,6,8,11,15.
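For what it’s worth, the kind of year-by-year comparison that flagged the 1750-1759 divergence can be expressed in a few lines. This is an illustrative sketch (the names and the choice of sigma scale are mine), not my actual reconciliation script.

```python
import numpy as np

def reconcile(mine, theirs, years, decimals=7):
    """Year-by-year comparison of two reconstructions; divergences beyond
    rounding are reported in sigma units (the std of the comparison series
    is an illustrative choice of scale)."""
    diff = mine - theirs
    sigma = theirs.std(ddof=1)
    for yr, d in zip(years, diff):
        if round(float(d), decimals) != 0:
            print(f"{yr}: divergence of {d / sigma:+.3f} sigma")
    return np.allclose(mine, theirs, atol=10 ** -decimals)
```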
I’ve never been able to determine how MBH98 decided which PCs to retain. It always seemed to me that the lower PCs had no conceivable physical meaning and were mere artifacts of matrix decomposition. MBH do not prove that these lower PCs have any stable meaning – they grandly reify them as "climate field reconstructions". However, WA’s lower PCs using annual data differ considerably from MBH lower PCs. (Also, if the square root of cos(latitude) is used for area weighting in the PC decomposition, as it should be, a different set again emerges.) Anyway, this decade is the only decade in which PCs 6 and 8 are used. In all other periods, PCs 7 and 9 are used instead. I have no idea why. I presumed that this must have been an erratum and substituted PCs 7 and 9 in my parameterization. Wahl and Ammann didn’t – hence the discrepancy. But it makes for an interesting sensitivity result, which I would never have thought of calculating otherwise: using PCs 7 and 9 instead of PCs 6 and 8 can have an impact of up to 0.9 sigma on the reconstruction in a 10-year period.
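For readers who want to see what the cos-latitude weighting amounts to, here is a minimal sketch of a weighted PC decomposition. The variable names and the convention of dividing the weights back out of the spatial patterns are my assumptions for illustration.

```python
import numpy as np

def weighted_temperature_pcs(T, lats_deg, n_pc):
    """PC decomposition of a (time x gridcell) temperature field with
    sqrt(cos(latitude)) area weighting applied to each gridcell column.

    T        : anomalies, shape (n_years, n_cells), centered in time
    lats_deg : latitude of each gridcell in degrees, shape (n_cells,)
    """
    w = np.sqrt(np.cos(np.deg2rad(lats_deg)))   # area weights
    U, s, Vt = np.linalg.svd(T * w, full_matrices=False)
    pcs = U[:, :n_pc] * s[:n_pc]                # temperature PC time series
    eofs = Vt[:n_pc] / w                        # spatial patterns, weights removed
    return pcs, eofs
```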
I’ve not attempted to explore other aspects. However, it does seem presumptuous to assert that you have a 2 sigma confidence interval when such trifles can use up half of your confidence interval before you even start considering real errors. The difference in scaling results from a couple of factors. MBH98 starts with temperature principal components calculated from a centered calculation using a 1902-1993 reference period. The resulting temperature PCs are re-based to a 1902-1980 reference period, and final MBH98 results are reported for a 1902-1980 reference period. After calculating RPCs (reconstructed temperature principal components, which come out of the calculation with a 1902-1980 reference period), WA appear to re-transform them to a 1902-1993 reference period, although MBH98 results and instrumental comparisons are in a 1902-1980 reference period. The effect does not appear to be material, but it is hard to understand why they would do this. I’ve set this up as an optional method parameter so that the re-transformation can be excluded if desired.
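Re-basing between reference periods is nothing more than shifting by the mean over the chosen window. A one-function sketch (the year indexing is assumed):

```python
import numpy as np

def rebase(series, years, ref_start, ref_end):
    """Re-express a series as anomalies relative to a reference period,
    e.g. rebase(rpc, years, 1902, 1980) vs rebase(rpc, years, 1902, 1993)."""
    mask = (years >= ref_start) & (years <= ref_end)
    return series - series[mask].mean()
```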
The other scaling difference resulted from re-scaling the variance of estimated RPCs to the variance of "observed" TPCs. The variance of the MBH98 RPCs does not typically correspond with "observed" variances. This is probably inherent in the MBH98 procedure for calculating the RPCs – which is not really inverse regression, as von Storch et al. characterized it. It’s a two-step procedure in which inverse regression is used to establish a set of proxy models (described in inflated terms in MBH98), and then the mean squared error of the various models is minimized. (This can also be shown to be a regression.) I think that the combination is a type of simple neural network, but, beyond the label, I’m not presently in a position to apply this information. But I’m pretty sure that that would be the direction to look in if one wanted to really examine confidence intervals for this type of process. Ammann added a step to do this re-scaling in April 2005. In our emulations without re-scaling, we also got completely different variances than MBH98; I ended up doing a re-scaling at the NH temperature index stage by re-scaling the variance of our final reconstruction to the variance of the MBH98 reconstruction.
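If I’ve understood the step that Ammann added, it amounts to matching the calibration-period standard deviation of each estimated RPC to that of the corresponding "observed" TPC. Here is a sketch under that assumption; the function name and the mean-matching convention are mine.

```python
import numpy as np

def rescale_rpc(rpc, tpc_obs, cal_mask):
    """Re-scale an estimated RPC so its calibration-period variance matches
    that of the 'observed' TPC, with means matched over the same window."""
    ratio = tpc_obs[cal_mask].std(ddof=1) / rpc[cal_mask].std(ddof=1)
    return (rpc - rpc[cal_mask].mean()) * ratio + tpc_obs[cal_mask].mean()
```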
These are pretty minute differences. Most of our principal results relied on contrasts and are insensitive to this sort of procedural difference. Other results, such as the insignificance of MBH98 verification statistics, are stable to the difference. Nonetheless, for good order’s sake, I’ll re-execute our reported results under WA methods (I’m still thinking about whether to re-transform to 1902-1993 as they do; I can’t see why). I’ll send an email to Ammann, but, since he’s a member of the Hockey Team, it’s unlikely that he’ll provide an explanation – but you never know. Now that I’ve reconciled on the temperature PCs used by Ammann, I’m also going to check the impact of the "simplifications", i.e. the use of annual data and the elimination of MBH98 weights. In a reconciliation, it is never a "simplification" to change some of the parameters. You should start with the exact parameters. Things are seldom what they appear with the Hockey Team. I’ll bet that the "simplifications" yield "improved" RE statistics. We’ll see.
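For concreteness, the method variants discussed in this post reduce to a handful of toggles in my emulation. The labels below are illustrative, not identifiers from my script or from the WA archive.

```python
# Illustrative parameterization of the method variants discussed above;
# the keys are my labels for this post, not code from any archived script.
WA_BASE_CASE = {
    "rebase_1993": True,       # WA's re-transformation to a 1902-1993 reference period
    "pc_substitution": False,  # keep PCs 6 and 8 in the AD1750 step, as WA do
    "rescale_rpc": True,       # the variance re-scaling step added in April 2005
    "annual_data": True,       # "simplification": annual rather than original data
    "mbh_weights": False,      # "simplification": MBH98 weights eliminated
}

# My variant: no 1902-1993 re-transformation, PCs 7 and 9 substituted
MY_VARIANT = {**WA_BASE_CASE, "rebase_1993": False, "pc_substitution": True}
```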