Wahl and Ammann: The Early Returns

I’m progressing nicely with the process of parsing Ammann. The correlation between our AD1400 emulations of the MBH98 reconstructed PC1 is 0.9993201!

The emulations are virtually identical up to scaling/centering. W-A do not use archived MBH temperature PCs, but re-calculate them; this appears to account for the scaling differences identified so far (and they should get washed out). I’m in the process of working through the downstream scaling/centering in W-A, which involves some puzzling features of MBH98 methodology. For the AD1400 step (which is the one in controversy), there is only one PC in the reconstruction, so the final NH reconstruction is going to be a linear transformation of the RPC1. So it looks almost certain that our emulation has been right on the money and that Wahl and Ammann is an almost perfect replication of MM methods – and much closer to MM technically than to MBH. It would have been nice of them to acknowledge this. All of the outstanding questions about MBH98 methods which I’ve pointed out on this weblog and elsewhere will naturally remain outstanding even though the Hockey Team has ventured out of the foxhole with this code. I’m sure that you will all understand the many temptations to editorialize more, but I’ll wait till I’ve done a little more on the code.
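As a rough illustration of why a correlation this close to 1 means "identical up to scaling/centering": the Pearson correlation is unchanged when one series is multiplied by a positive constant and shifted by an offset. The sketch below uses made-up numbers, not the actual RPC1 values, and the names `rpc1_a` and `rpc1_b` are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical stand-in for one emulation of the RPC1 (not real data).
rpc1_a = [0.12, -0.30, 0.05, 0.41, -0.22, 0.18, -0.07, 0.33]

# A second series that differs from the first only by scale and offset.
rpc1_b = [2.5 * v - 1.0 for v in rpc1_a]

r = pearson(rpc1_a, rpc1_b)
print(round(r, 10))
```

So a correlation of 0.999 says nothing about whether the two series share the same units or baseline; any such difference has to be tracked down separately in the scaling steps.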


Figure 1. Comparisons of AD1400 Step RPC1 MM05 versus WA. Left – scatter plot; Right – RPC1s (not re-scaled).

Here are some other odds and ends as progress to date.

Proxy Collation

The order of the proxies in Ammann is a little different from MBH and Ammann provides no index for the proxies. In the AD1400 roster, the Stahle/SWM PC1, usually in the 16th spot, occurs in the 22nd spot. I don’t plan to examine other steps. A first small difference is rounding before normalization of proxies. As far as I can tell, Mann’s practice is to normalize without rounding. Ammann sometimes rounds to the first decimal place before normalizing. The differences can be as much as 0.15 deg C (e.g. below for the Tasmania temperature reconstruction). I don’t suppose that it matters much, but, if he’s trying to replicate, it would be easier to do little things the same. Otherwise, our proxy data set in our MBH98 replication is virtually identical to Ammann.


Figure 2. Difference between MM05 and WA Version of Tasmania Proxy (both normalized), due to varying rounding procedures
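The rounding effect is easy to see in a toy case. This is only a sketch with invented values (not the Tasmania series), and the `normalize` helper is my own assumption about what "normalizing" means here: subtracting the mean and dividing by the sample standard deviation.

```python
import statistics


def normalize(series):
    """Standardize to zero mean and unit sample standard deviation."""
    mean = statistics.mean(series)
    sd = statistics.stdev(series)
    return [(v - mean) / sd for v in series]


# Hypothetical proxy values (illustrative only).
raw = [14.83, 15.27, 14.96, 15.44, 15.05, 14.71, 15.18, 15.32]

# Normalize directly, versus rounding to one decimal place first.
unrounded = normalize(raw)
rounded_first = normalize([round(v, 1) for v in raw])

# Per-value discrepancy between the two normalized versions.
diffs = [abs(a - b) for a, b in zip(unrounded, rounded_first)]
max_diff = max(diffs)
print(max_diff)
```

The discrepancy is small relative to the series itself, but it is not zero, and it is enough to stop two otherwise identical proxy sets from matching exactly.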

Temperature Principal Component Collation

Ammann’s approach seems pretty weird here. In their covering notes, they point out an erroneous argument in MBH98 (also pointed out previously by others, including us): that you need to have more months than gridcells to do principal component calculations. You don’t. Ammann calculates principal components using annual series. The dataset used is an annualized version of the underlying temperature dataset (92 years x 1082 gridcells). Their version is available for download. Their dataset has been regularized to deal with missing data. MBH98 Corrigendum said that they interpolated missing values in the monthly series, but didn’t explain how they dealt with missing opening and closing values. The exact results are affected by the order of annualizing and regularizing, which is not provided. The selection of 1082 gridcells cannot be replicated, but this is not considered by Ammann. I haven’t checked the selections yet. The correlations between Ammann’s temperature PCs and Mann’s temperature PCs decline in the lower order PCs: 0.997 0.965 0.922 -0.846 0.878 -0.059 -0.543 -0.284 -0.240 -0.772 -0.085 0.261 -0.342 0.079 -0.214 -0.189. Here are plots of temperature PCs 6:9. PCs can be calculated from any data set. It’s hard to believe that these PCs have any sort of physical permanence and can be replicated from different periods. Ammann certainly doesn’t examine this loose end.


Figure 3. Differences between Temperature PCs 6-9 between WA and MBH Archived

For what it’s worth, I was able to replicate MBH98 temperature PCs more closely than Ammann [link to Replication #x]. The correlation between the PC1s is very high, so this won’t impact the AD1400 step, but the discrepancies will presumably affect the regional reconstructions, which look pretty dubious to me. (For example, I’m pretty sure that, in the gridcell reconstruction for Vienna where there is a long history, the North American PC1 has more impact than the observed temperature – which seems like the ultimate in lousy RE. The actual temperature dataset is included as a “proxy” and is lost in the reconstruction. At least no one can accuse them of peeking at actual temperature records.) Let me re-iterate that the differences here are between WA and MBH; I’m not here comparing to our own emulations, as we used archived MBH values here.

The Mann-Ammann Cosine Error

There is much to be said for the Biblical advice to remove the “beam in your own eye before trying to remove the mote in your brother’s”. It’s easier said than done and I’m sure that inconsistencies on my part can be pointed out. But it’s still worth doing. Many of you will remember the huffing and puffing about a cosine error in McKitrick and Michaels – which was promptly identified because they made their script public. They promptly acknowledged the error and published a Corrigendum, including an impact assessment. (Despite this, in an effort to block publication of our materials, Mann made false allegations to Natuurwetenschap & Techniek, stating that McKitrick and Michaels had failed to issue a correction notice, but that’s another story.) There has been a rather delicious latent cosine error in MBH98. Von Storch et al. [2004] pointed out that, for the purposes of their EOF calculations, they should have normalized by the square root of the cosine rather than the cosine. I’ve seen a reference to the need to use a square root of the cosine in an article by Wallace of University of Washington [link to come]. If Ammann was not trying to exactly replicate MBH98 temperature PCs (since he has not done so), you would think they would have tidied up this error. The data set used for temperature PC calculation has already been standardized and regularized and there is no code provided for these processes. I’m not sure when I’ll get to trying to reconcile these steps. They don’t appear critical, but, given that they are on much better behaviour about code, it would have been nice to see.
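For readers wondering why the square root matters, here is my understanding of the argument, sketched with made-up latitudes: the area of a latitude-longitude gridcell is proportional to the cosine of its latitude, and multiplying a series by a weight w multiplies its variance by w squared. So weighting by the square root of the cosine makes each gridcell's variance contribution proportional to its area, while weighting by the cosine itself shrinks high-latitude cells twice over. The function name `variance_weight` and the scheme labels are hypothetical.

```python
import math

# Hypothetical gridcell latitudes in degrees (illustrative only).
latitudes = [0.0, 30.0, 60.0]


def variance_weight(lat_deg, scheme):
    """Factor by which a gridcell's variance is scaled after weighting.

    scheme "cos" weights the data by cos(latitude);
    scheme "sqrt_cos" weights the data by sqrt(cos(latitude)).
    """
    c = math.cos(math.radians(lat_deg))
    w = c if scheme == "cos" else math.sqrt(c)
    return w * w  # scaling data by w scales its variance by w squared


for lat in latitudes:
    area = math.cos(math.radians(lat))  # relative gridcell area
    print(
        lat,
        round(area, 3),
        round(variance_weight(lat, "cos"), 3),
        round(variance_weight(lat, "sqrt_cos"), 3),
    )
```

At 60 degrees, a gridcell has half the area of an equatorial one; square-root weighting gives it half the variance weight, whereas cosine weighting gives it only a quarter.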

Calibration Test

The correlation between our coefficients and theirs is 0.9992717. There’s a difference in scaling – I’m not sure why right now, but I can’t see why it would matter (since it merely affects downstream coefficients).

Figure 4. Scatterplot of MM and WA Calibration Coefficients in AD1400 Step


All in all, this is looking pretty good so far. One could scarcely have contemplated that WA should replicate our results so accurately. The irony is delicious.


  1. Jeff Norman
    Posted May 14, 2005 at 6:18 AM | Permalink

    Replicating previously published work without acknowledging the authors of the previous work… Isn’t there a word for that?

  2. Peter Hearnden
    Posted May 14, 2005 at 8:03 AM | Permalink

    Steve, you seem to be able to do an amazing amount of work in a very short time! Is anyone else helping you? Or did you have all this pre-prepared ;). Whatever, if it’s all your own work, well, you’re very industrious and I congratulate you on your efforts.

    Steve: Thanks for what I take to be a compliment. It’s just me against the Hockey Team plus their programmers. How could I have this prepared? I’d never seen Hockey Team code before. I didn’t start on this until yesterday since I was travelling on Thursday. I really didn’t know what to expect when they finally emerged with some code. The reconciliation has been a lot easier than I expected since they programmed in R and their code is very similar to mine, so that it is pretty well possible to pause at each step and check the differences. (You can get results really fast in R.) You would have thought that they would have done the same thing – isn’t that the logical way to try to reconcile calculations? If they had done so, they would have observed the same thing: there’s virtually no difference in the code (so far – I haven’t got to the scaling to temperature). So this raises a question: did they try to reconcile step by step (my code has been public)? If they didn’t, why didn’t they? They are writing a long article criticizing us – shouldn’t they examine the code, especially when it’s a matter of public controversy and Ammann is funded by a public agency? If they did, why didn’t they point out the obvious parallels?

    They’ve blurred the reconciliation a little by using completely different temperature PCs as a starting point – ostensibly because this “simplifies” things. Well, it doesn’t “simplify” a reconciliation. If they want to argue that it doesn’t matter, then that’s fine, but reconcile first and prove it. Computers don’t care whether you’re using monthly or annual data. After I look some more at the final scaling, I’m going to go back and plug in Mann’s archived PCs into their methods and reconcile to my results. If we’re at three 9s correlation already on the 15th century results, it will probably be five 9s using Mann’s PCs.

    When you think about it, it would have been more accurate for them to say that they’ve replicated our results. What a bizarre turn of events. I still have to work through the scaling to NH temperature, but should have that done by the end of the day (unless my granddaughter comes over which will take precedence.) Regards, Steve

  3. Spence_UK
    Posted May 14, 2005 at 9:05 AM | Permalink

    The difference between having code and data to compare versus having no code and data to compare is remarkable. As you’ve ably demonstrated Steve, it allows rapid comparison and the ability to home in on any subtle differences with incredible speed. The benefits to science of picking out subtle differences in otherwise very complex code will hopefully open a few eyes in the climate science world – although perhaps, for political reasons, many will be reluctant to adopt this approach so quickly.

    Glad to see the replication so accurately reflects the M&M work. One tricky aspect of this is how two people can generate such similar results and yet come up with such different conclusions. It is less clear to me how this will play out in the wider community. I will keep watching with great interest 🙂

    To follow up on #2, I think it is a mixture of hard work on Steve’s part and the difference between having source code available to compare (in a common language – R). Talking of which I am tempted to download R… because I habitually use MATLAB, I have to recode things from scratch which is quite time consuming. The ability to straight run some of this stuff would have been useful.

  4. John A
    Posted May 14, 2005 at 12:35 PM | Permalink

    It’s interesting to find that Wahl and Ammann have provided ammunition to shoot down one of their own claims: that the MBH reconstruction is robust to the removal of dendroclimatic data – No, it ain’t.

    Just like Steve’s emulation, once you take out the bristlecones, there’s no hockeystick shape to mine for.

  5. Peter Hearnden
    Posted May 14, 2005 at 12:56 PM | Permalink

    Well, if it all hangs on the bristlecones, let’s see what you say the HS looks like without them. I don’t know for sure but I don’t think I’ve seen such a thing here.

  6. Doug L
    Posted May 14, 2005 at 3:47 PM | Permalink

    I can’t see the graphs to the right of figure one, the menu on the right side jumps over and covers it up. This occurs with my pet browser MYIE2 as well as IE.

    Does it show the size of the 15th century bump?
    (Wait a minute, one commentator (other thread) says this doesn’t matter!)

    You would think from what the hockey teams are saying the bump would be tiny or not there.

    I’m on the edge of my seat but it is of no use!

    I’m just a dabbler been following this story for a few weeks, don’t waste too much valuable time on me, I smell a bunch of rats– go get ’em!

    Hope I posted this in the right spot, I’m going crazy!

    Steve: I’ll try to tidy the page in a day or two, but I’m in a bit of a full court press right now, trying to decode Ammann plus do Replies to 5 (!) Comments. As John A. said, this graphic is just trying to replicate Wahl and Ammann in the 15th century Reconstructed Temperature Principal Component #1 (the active ingredient). So it will be low in the 15th century. The bizarre thing is that Ammann seems to have replicated our calculation method for RPCs rather than Mann’s. Now our method is pretty close to Mann’s and usable for analysis, but his is almost EXACTLY the same as ours. In our EE article, we reported that you get high 15th century under some circumstances and low 15th century under some circumstances, so their results were not robust to the presence/absence of bristlecones (let alone to the presence/absence of tree ring series altogether, as they had misrepresented). We also argued that their reconstruction failed a basic statistical test (R2) and was thus, shall we say, bankrupt. Thus, figuring out why the RE statistic was spuriously high was only of forensic interest (but I think that we’ve done that.)

    They are trying to re-define the issue to show that they can “get” high 15th century under various circumstances – now “robustness” is no longer robustness to dendroclimatic indicators, but saying that they can get similar results in a couple of different ways. We don’t dispute that. But every such way is just a way of bringing the bristlecones in through the back door. They justify this on the basis of improving RE statistics, but it hardly matters if they all fail obligatory R2 tests (which Ammann fails to report!).

    They have a new scenario trying to attenuate the bristlecone argument; it fails, but I don’t have time right now to explain why. It’s interesting to see how they avoid apples and apples calculations – even if only to show a reconciliation. But that also takes a little time to show.

  7. John A
    Posted May 14, 2005 at 4:21 PM | Permalink

    Re: #6

    Doug, the graphs show that MM05 and AM05 reproduce (or should I say emulate?) exactly the same form of the Hockey Stick. The difference is the scaling, and nothing more.
