Wahl, Ritson and Ammann, the authors of two rejected comments on MM05 to GRL – see here and here for our Replies to the rejected comments – have joined forces and published a critical comment on Von Storch et al [2004] in Science, to which von Storch et al have issued a Reply. Realclimate has issued an editorial here.
There’s quite a bit of back story to this exchange, which I’ll cover in more detail on another occasion. For now, here are a few quick observations. I’ve pointed out on this site for a long time both (1) that VZGT had not correctly implemented MBH procedures and (2) that I did not think that VZGT had correctly diagnosed the problems with MBH98.
I had identified at least 3 different problems with the VZGT implementation of MBH: (1) they did not appear to implement a re-scaling step that went unreported in the original MBH98 article. As I pointed out in my AGU05 presentation, the variance differences alleged by VZGT did not exist empirically; (2) in the GRL Comment on MM, VZ did not accurately implement the goofy MBH principal components method, seemingly not fully comprehending just how bad the method was; [Update – May 4, 2006: I’ve reconciled code with Eduardo. Their description in GRL of what they did certainly suggested otherwise, but they did implement the key features of the Mannian PC method – so the differences between us are probably due to the next item.] (3) relying on Jones and Mann 2004, the VZ (and the VZGT) pseudoproxies wildly over-estimated the temperature signal content of MBH proxies and did not allow for "bad apples".
I don’t blame (and didn’t blame) VZGT for any of these "problems"; the fault lies entirely with Mann et al. (1) How could VZGT replicate a re-scaling step that was never mentioned in the original article? (2) While I think that we provided enough information in our articles to decode the MBH principal components method, I can readily understand how people would assume that the method was more reasonable than it really was and think that it wasn’t possible that MBH used such a weird method; but it was possible and it did happen. (3) Absent a detailed investigation of MBH98 proxies of the type that we’ve carried out, anyone relying on MBH information would assign much better behavior to the proxies than is justified.
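For readers unfamiliar with the "Mannian" PC method at issue in point (2): it is short-centering, i.e. series are centered on the mean of the 1902–1980 calibration period rather than on their full-period mean before the principal components are extracted. The sketch below uses purely synthetic data and invented dimensions (581 years, 50 series) to illustrate the mechanics only; it is not the actual MBH or VZGT code.

```python
import numpy as np

rng = np.random.default_rng(0)
n_years, n_series = 581, 50  # illustrative sizes, e.g. an AD1400-1980 network
# red-noise-like synthetic series, scaled to unit standard deviation
X = rng.standard_normal((n_years, n_series)).cumsum(axis=0)
X = X / X.std(axis=0)

def pc1(data, center_rows):
    """Leading principal component after centering on the given rows."""
    centered = data - data[center_rows].mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[0]

full = pc1(X, slice(None))        # conventional: center on the full period
short = pc1(X, slice(-79, None))  # short-centered: last 79 "years" only
```

Because short-centering measures variance relative to the calibration-period mean, series whose calibration-period level departs from their long-term mean contribute disproportionately to PC1 – which is why the method preferentially loads hockey-stick-shaped series.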
To the above three points, Wahl et al add a fourth: that VZGT calibrated on detrended proxies rather than non-detrended proxies. In reply, VZGT argue that Wahl et al overstated the impact of this methodological difference on their results, which they claim remain valid even with calibration on non-detrended proxies. Both Wahl et al., and especially realclimate, gloat over this seeming "error" in the VZGT implementation. However, if VZGT did incorrectly implement this MBH procedure, one can certainly see some basis for the misunderstanding. In a criticism of MM03, Mann et al said:
The use of gridpoint standardization factors based on undetrended data (MM) to unnormalize EOFs that had been normalized by standardization factors of detrended data (MBH98) implies a pattern of bias in the projection of an eigenvector onto the surface temperature field that is increasingly large in regions where the 20th century trend is large.
Similarly in the Corrigendum SI, MBH stated:
Standard deviations were calculated from the linearly detrended gridpoint series, to avoid leverage by non-stationary 20th century trends. The results are not sensitive to this step.
While neither of these points specifically refers to proxies, MBH procedures are so poorly and inaccurately described that one can see why VZGT might innocently assume that, if detrending was the "correct" MBH method in standardizing gridcell standard deviations, then it might very well be what MBH did with proxies. In the Original Supplementary Information to MBH at Nature (now deleted but preserved in a University of Massachusetts mirror), MBH report results from a detrended (DET) run. So even if VZ have inadequately modeled one MBH variant, one could see why they might at least think that they had modeled the DET alternative.
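The distinction being argued over is easy to see numerically: a standard deviation (or calibration coefficient) computed from raw 20th-century data differs from one computed after linear detrending whenever a strong trend is present, because the trend itself contributes variance. A toy sketch with invented numbers (a synthetic series, not any gridcell or proxy actually used by MBH or VZGT):

```python
import numpy as np

t = np.arange(100, dtype=float)
rng = np.random.default_rng(1)
# synthetic "gridcell temperature": linear trend plus noise
series = 0.02 * t + rng.standard_normal(100) * 0.3

# standardization factor from the raw (non-detrended) series
sd_raw = series.std(ddof=1)

# linearly detrend, then recompute the standardization factor
coef = np.polyfit(t, series, 1)
detrended = series - np.polyval(coef, t)
sd_det = detrended.std(ddof=1)
```

With these invented parameters the raw standard deviation is roughly double the detrended one, so rescaling by one rather than the other materially changes the amplitude assigned to the series – which is exactly the kind of undocumented choice that made the MBH method so hard to replicate.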
I must confess to feeling a certain amusement at Mann savaging VZGT for allegedly "incorrectly" implementing his precious methodology. Back in 2003, when we sought clarification of MBH methodology, Mann refused, on the basis that von Storch and Zorita had found his existing disclosure sufficient to implement his methodology (see Mann correspondence). In the Corrigendum SI, Zorita et al 2003 is cited on 2 different occasions as an accurate implementation of MBH methodology. In summer 2004, we advised Nature that the Corrigendum SI remained insufficient; Nature said that anything further was up to the authors. So if VZ subsequently misinterpreted MBH, surely Mann has only himself to blame. All in all, surely it proves a point we’ve been making for a long time: code should be archived so that this sort of confusion is avoided. Even now, code archived by MBH is incomplete – how do they calculate confidence intervals? This mystery would be resolved in 2 seconds by looking at code. Likewise their supposed Preisendorfer calculations. Neither was archived last summer.
Realclimate completely mischaracterizes the handling of the detrending issue by Bürger and Cubasch, accusing them of following VZGT in using detrended calibration. In fact, Bürger and Cubasch carefully distinguish between trended and detrended calibration, analyzing each as separate "flavors". Nothing wrong with that.
The closing paragraph of the VZGT response raises two important issues, which are familiar to readers of this site. They state:
It is commonly accepted that proxy indicators may contain nonclimatic trends. This is particularly true with tree-ring data (8), which were intensively used in the study by MBH98. The calibration and validation of any statistical method using nondetrended data are dangerous, because the nonclimatic trends are interpreted as a climate signal. Only in the case that the trend in the proxy indicators can be ascertained to be of climate origin is a nondetrended calibration and validation permissible.
Does this sound to anyone like they are coming around to recognizing the impact of bristlecones on MBH? They go on:
In the validation period, in contrast, the correlation between the (5-year-smoothed) reconstructed and observed NHT in the validation period 1856 to 1900 is 0.23. This low correlation skill in the validation period has been recently acknowledged (9 – citing Wahl and Ammann, submitted to Climatic Change).
They do not cite M&M – unfairly under the circumstances, since these are points that we originally made and have dragged out of Wahl and Ammann, kicking and screaming. Citing the verification r (correlation), which is not mentioned in Wahl and Ammann, rather than the verification r2, which is so directly associated with our critique, seems a little sly. However, on balance, I’m happy to see them coming round to the points that we made.
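For concreteness, the two statistics being contrasted are the Pearson correlation r between observed and reconstructed temperatures over the verification period, and its square r2, the proportion of verification-period variance explained. A correlation of 0.23, as quoted above, corresponds to r2 of about 0.05 – around 5% of the variance. A toy calculation (the series below are invented, not the actual NHT data):

```python
import numpy as np

def verification_stats(observed, reconstructed):
    """Pearson r and r^2 between two series over a verification period."""
    r = np.corrcoef(observed, reconstructed)[0, 1]
    return r, r * r

rng = np.random.default_rng(2)
obs = rng.standard_normal(45)               # e.g. 45 years, as in 1856-1900
rec = 0.25 * obs + rng.standard_normal(45)  # weakly related "reconstruction"

r, r2 = verification_stats(obs, rec)
```

Reporting r rather than r2 makes a weak verification look less damaging, since squaring a correlation below 1 always shrinks it.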
I think that Mann’s going to regret that Wahl et al have put all of this back on the table. It’s pretty amazing how they complain on the one hand about people criticizing a "10 year old paper" – with Mann, arithmetic within 25% is pretty good – and then continually pick at scabs by publishing stuff like Wahl and Ammann [Climatic Change] and Wahl et al [Science]. Without Wahl and Ammann, I’d have "moved on" to other long overdue studies. However, as long as they keep contesting things – and especially when they do so with misrepresentations and withheld data – then I’m content to keep returning the ball from my side of the court.
NY Times: For Science’s Gatekeepers, a Credibility Gap
From the NY Times, a wide-ranging article on the problems of journal peer-review: