Our Reply to a Rejected Comment to GRL

Four Comments on McIntyre and McKitrick [2005] have been submitted to GRL to date. We reported previously that the Wahl and Ammann comment was rejected (although this has not been acknowledged at the UCAR website). A second comment has now been rejected. I think that there is a good chance that the other two Comments will be published together with our Replies. In our opinion, none of the criticisms has any bearing on our findings; in the two cases still under consideration, we think that interesting issues were raised and that the comment and response will illuminate matters.

When Mann and others talk of supposed refutations of our findings as being in review, readers should bear in mind that the supposed refutations may not actually be refutations and may never see the light of day. In this case, the rejection of the Comment means that readers would not get to see our Reply. I think that our Reply may be of interest to readers so I’ve posted it up below.

Abstract: X [2005] appears to have misunderstood the points in McIntyre and McKitrick [2005a] (herein MM05-GRL) and his note does not overturn any of our conclusions. We showed that the principal component (PC) methodology of Mann et al. [1998] (MBH98) was severely biased, that it overweights problematic bristlecone pine series thereby incorrectly implying these are the dominant pattern in the NOAMER network, and that the MBH98 reconstruction lacked cross-validation skill in the controversial AD1400 step. X bypasses all these issues and instead tests the PC method on the irrelevant exercise of calculating an average tree ring chronology: a step that plays no role in the MBH98 method or in our critique thereof. Furthermore he fails to distinguish between identifying a distinctive pattern in a data set and establishing that a series has a significant relationship to climate.

1. Assessing the Impact of the MBH98 PC Method
X compares two PC methods, the MBH98 short segment standardization, reported in MM05(GRL), and a conventional centered calculation, but the comparison is not in the MBH98 context, which he made no attempt to emulate. X considered matrices re-combined from truncated PC decompositions under the following circumstances: a) both PC methods — MBH98 short-segment and conventional centered; b) two different truncations — 1 PC and 2 PCs. In each case, X calculated the vector of annual averages for each re-combination. He also calculated the vector of annual averages for the data set without PC decomposition and recombination. Under both PC methods, X reported that (1) the difference in annual averages between using 2 PCs and 1 PC was “quite significant”; (2) the contribution after the 2nd PC to the re-combined annual average was “negligible”; (3) if 2 PCs are used, the difference between the MBH98 short-centered method and a conventional method in annual average was “negligible”. X then concluded that our assertions “that the MBH results are fatally flawed by short-centering are groundless.”

With respect, X has proved nothing of the sort. At most, he has shown the biased PC method does not necessarily affect an averaging calculation that plays no role in the matters under dispute. He ignores the question of the impact of the biased PC method on actual MBH98 temperature index calculations and does not even mention our own detailed treatment of this matter in McIntyre and McKitrick [2005b], (herein MM05 (EE)). We did not take up the lengthy topic of how the PC methods affect actual calculations of the NH temperature index in MM05 (GRL), but specifically referred to MM05(EE), where we surveyed various permutations and combinations, sometimes yielding hockey stick shapes and sometimes not. We identified the critical factor in the non-robustness as being due to the impact of bristlecone pines under the various methods. Contrary to X’s implication, we did not carry out a NH temperature index calculation using only 1 PC from the North American network either in MM05 (GRL) or MM05 (EE). In our actual calculations, the fewest that we ever used for this network was 2 PCs, which, ironically, is the number advocated by X, notwithstanding the fact that we disagree with his justification for this number.

Irrespective of X’s assertion that the column means are relatively unaffected under biased PC methods, it is obviously not the case that the “same results” are obtained in all relevant situations, as X claimed. Notably, the PC1s are very different. Using the MBH98 method, the PC1 has a pronounced hockey stick shape and its weights are loaded on a small subset of bristlecone pines. The PC1 using a conventional centered algorithm does not have a hockey stick shape; the bristlecones are demoted to the PC4 and no longer account for the dominant share of variance. While X also notes that a PC1 by definition summarizes the “major share” of variance, he then fails to acknowledge that it is materially different under the two methods.
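The contrast between the two centering conventions can be reproduced in miniature. The following is a minimal sketch only, not the MM05 emulation code: it builds a made-up network of ten noise series, nine sharing a weak common signal and one carrying an upward step in the final "calibration" segment, and compares the PC1 weights when means are taken over the whole period versus the final segment only. The network sizes, the step height, the helper names (`centered`, `pc1_loadings`) and the use of power iteration are all assumptions made for illustration; the division by the short-segment standard deviation that MBH98 also applied is omitted.

```python
import random

random.seed(0)
N_YEARS, N_SERIES, CALIB = 400, 10, 40   # toy sizes, not the NOAMER network

# Series 1..9 share a weak common signal; series 0 is noise plus a late step.
signal = [random.gauss(0, 0.5) for _ in range(N_YEARS)]
network = []
for j in range(N_SERIES):
    if j == 0:
        col = [random.gauss(0, 1) + (3.0 if t >= N_YEARS - CALIB else 0.0)
               for t in range(N_YEARS)]
    else:
        col = [random.gauss(0, 1) + signal[t] for t in range(N_YEARS)]
    network.append(col)

def centered(columns, last=None):
    """Subtract each series' mean, taken over all years or only the last `last` years."""
    out = []
    for col in columns:
        ref = col if last is None else col[-last:]
        mean = sum(ref) / len(ref)
        out.append([v - mean for v in col])
    return out

def pc1_loadings(columns, iters=300):
    """Leading eigenvector of the cross-product matrix, found by power iteration."""
    k = len(columns)
    cross = [[sum(a * b for a, b in zip(ci, cj)) for cj in columns] for ci in columns]
    v = [1.0 / k ** 0.5] * k
    for _ in range(iters):
        w = [sum(cross[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v

full_pc1 = pc1_loadings(centered(network))                # conventional centering
short_pc1 = pc1_loadings(centered(network, last=CALIB))   # short-segment centering
print("weight on stepped series, full centering: ", round(abs(full_pc1[0]), 2))
print("weight on stepped series, short centering:", round(abs(short_pc1[0]), 2))
```

In this toy setup the stepped series should carry most of the PC1 weight under short-segment centering and little of it under conventional centering, because subtracting the late-segment mean inflates that series' sum of squares over the earlier years.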

2. Necessary versus Sufficient Conditions for Significance in PC Calculations
X appears to equate a tree ring series having a distinct pattern in a PC decomposition with it being a valid temperature proxy. He partitions the North American data set into two groups, the bristlecone pines discussed in Graybill and Idso [1993] and all the others. The Graybill-Idso series indeed have a substantially different growth pattern than the rest of the North American tree ring data set, but Graybill and Idso themselves (and others) deny that this pattern is temperature-driven. Indeed, that is one of the principal grounds of our critique of MBH98: that data widely regarded as a nonclimate proxy receive the dominant weighting in the final results. We elaborated this point in MM05(EE), and pointed our GRL readers to that discussion (see para. 13). Merely because a shape term is “different” is not sufficient grounds to conclude that it is a “proxy derived temperature history”. The latter claim has to be established on other grounds. But the available literature, which we survey in MM05(EE), shows these series are singularly unsuited as climate proxies.

At the risk of appearing didactic, we can illustrate the difference between a series having a “distinct shape term” and it being a valid temperature proxy through a simple but vivid counterexample. In the North American AD1400 tree ring network (consisting of 70 chronologies), we substituted 15 weekly technology stock price series sampled over 581 weeks prior to the market peak in April 2000 for the 15 bristlecone series that were top-weighted in the MBH98 PC1. PC calculations on the new network, combining tree ring chronologies and technology stock prices, show that, under both short-segment and centered PC methods, the inclusion of the technology stocks resulted in a “distinct shape term” requiring a separate PC to represent. Indeed, under the MBH98 method, the “shape term” from the technology stocks manifested itself in the PC1.

X uses informal criteria to decide if a “shape term” is different. A more formal method for deciding the number of PCs to retain is Preisendorfer’s Rule N [Preisendorfer, 1988; Overland and Preisendorfer, 1982]. But neither Rule N nor informal identification of a “distinct shape term” is a sufficient condition for significance, merely a necessary one. Passing a Preisendorfer Rule N test would obviously not qualify the technology stocks as climate proxies. The same argument applies to the bristlecone pine series. In terms of X’s argument, the appearance of a distinct “shape term” due to the bristlecone series may in fact identify outliers and imply that the underlying series contributing to the “shape term” ought to be excluded altogether from the data, if they are known on a priori grounds to be invalid for the purpose of the study.
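The necessary-versus-sufficient point can be illustrated with a toy Monte Carlo test in the spirit of Rule N. The sketch below rests on stated assumptions and is not a full implementation: Rule N compares each rank-ordered eigenvalue with a null distribution from uncorrelated noise, whereas this version checks only the leading eigenvalue of the correlation matrix. The three trending series are hypothetical stand-ins for non-climatic data such as stock prices (they are not real price data), and the network sizes, simulation count and function names are arbitrary choices made here.

```python
import random

random.seed(1)
N_OBS, N_SERIES, N_SIM = 200, 8, 40    # toy sizes chosen for speed

def standardize(col):
    """Centre a series on its full-period mean and scale it to unit variance."""
    n = len(col)
    mean = sum(col) / n
    sd = (sum((v - mean) ** 2 for v in col) / n) ** 0.5
    return [(v - mean) / sd for v in col]

def leading_eigenvalue(columns, iters=200):
    """Approximate largest eigenvalue of the correlation matrix by power iteration."""
    z = [standardize(c) for c in columns]
    k, n = len(z), len(z[0])
    corr = [[sum(a * b for a, b in zip(zi, zj)) / n for zj in z] for zi in z]
    v = [1.0 / k ** 0.5] * k
    for _ in range(iters):
        w = [sum(corr[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # The Rayleigh quotient of the iterated vector approximates the top eigenvalue.
    return sum(v[i] * sum(corr[i][j] * v[j] for j in range(k)) for i in range(k))

def white_noise_network():
    return [[random.gauss(0, 1) for _ in range(N_OBS)] for _ in range(N_SERIES)]

# Null distribution: leading eigenvalues of networks of uncorrelated white noise.
null = sorted(leading_eigenvalue(white_noise_network()) for _ in range(N_SIM))
threshold = null[(95 * N_SIM) // 100 - 1]   # 95% of null values lie at or below this

# Test network: three trending "non-climate" series plus five noise series.
trending = [[0.03 * t + random.gauss(0, 1) for t in range(N_OBS)] for _ in range(3)]
network = trending + white_noise_network()[:N_SERIES - 3]

observed = leading_eigenvalue(network)
print("null 95% threshold:", round(threshold, 2), "observed:", round(observed, 2))
print("passes the eigenvalue test:", observed > threshold)
```

The trending series should pass the eigenvalue test comfortably, which says only that they form a pattern distinguishable from noise. Nothing in the test bears on whether that pattern has any relationship to temperature.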

3. Other Issues
X asserts that the first step in MBH98 principal components calculations was to standardize on the 1902-1980 period. He implies that this information came from MBH98 itself, but the original article states only that “conventional” methods were used. His source for this information was our article, but X fails to attribute the observation to us.

X suggests that the small size of the standard deviations of our simulated PC1s is somehow relevant. The standard deviations of our simulated PC1s are quite comparable to the standard deviations of the MBH98 NOAMER PC1. In actual MBH98 procedures (not discussed in X), tree ring PCs are standardized a second time on the 1902-1980 interval. Thus, the multiplication by a factor of 50 complained about by X is inherent in MBH98 methods and his complaint should be directed at them. But since the series then enter into a regression calculation, the matter is moot since changes in scale simply result in changes in regression coefficients.
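The scale argument is easy to check numerically. In the following minimal sketch, with made-up numbers and a hypothetical helper `ols` for a one-predictor least-squares fit, multiplying the predictor by 50 divides the slope by 50 and leaves the fitted values unchanged.

```python
def ols(x, y):
    """Return (intercept, slope) of a one-predictor least-squares fit."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope

x = [0.01, 0.03, -0.02, 0.05, 0.00, -0.04]   # made-up small-amplitude predictor
y = [0.3, 0.5, -0.1, 0.9, 0.2, -0.5]         # made-up target series

a1, b1 = ols(x, y)
x50 = [50 * v for v in x]                    # same predictor, rescaled by 50
a2, b2 = ols(x50, y)

fit1 = [a1 + b1 * v for v in x]
fit2 = [a2 + b2 * v for v in x50]
print("slope ratio:", b1 / b2)               # close to 50, up to rounding
print("max difference in fitted values:", max(abs(p - q) for p, q in zip(fit1, fit2)))
```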

REFERENCES
Cook, E.R., Briffa, K.R. and Jones, P.D. (1994), Spatial regression methods in dendroclimatology: a review and comparison of two techniques. Int. J. of Climatol., 14, 379-402.
Graybill, D. A., and S. B. Idso (1993), Detecting the aerial fertilization effect of atmospheric CO2 enrichment in tree-ring chronologies, Global Biogeochem. Cycles, 7, 81-95.
Mann, M.E., R.S. Bradley and M.K. Hughes (1998), Global-scale temperature patterns and climate forcing over the past six centuries, Nature, 392, 779-787.
Mann, M.E., R.S. Bradley and M.K. Hughes (1999), Northern Hemisphere temperatures during the past millennium: Inferences, Uncertainties, and Limitations, Geophys. Res. Let., 26, 759-762.
McIntyre, S. and R. McKitrick, (2005a), Hockey Sticks, Principal Components and Spurious Significance, Geophys. Res. Let., 32, L03710, doi:10.1029/2004GL021750.
McIntyre, S. and R. McKitrick (2005b), The M&M Critique of the MBH98 Northern Hemisphere Climate Index: Update and Implications, Energy and Environment 16, 69-99.
Overland, J.E. and R.W. Preisendorfer, (1982), A significance test for principal components applied to a cyclone climatology, Mon. Weather Rev., 110, 1-4.
Preisendorfer, R., (1988), Principal component analysis in meteorology and oceanography. Elsevier Science.
X, (2005), Comment on “Hockey sticks, principal components and spurious significance” by S. McIntyre and R. McKitrick, Geophys. Res. Let., 32, XXXX, doi: XXXXXXXXXX.

This entry was written by Stephen McIntyre, posted on Jul 18, 2005 at 1:35 PM, filed under MBH98 and tagged ritson.

8 Comments

Michael Mayson

Posted Jul 19, 2005 at 3:12 AM

Steve, is this one of the comments you think will be accepted?
http://web.mit.edu/~phuybers/www/Hockey/Huybers_Comment.pd

Steve: This one and one by von Storch and Zorita are both in play. We have really nice replies to both articles. It’s frustrating to me that Huybers has put this Comment on the internet. I didn’t think that you were allowed to do this, so we haven’t posted our Replies, which leaves this Comment unrebutted even though there are really excellent reply points.

On the RE point in Huybers, I did new simulations adding in a re-scaling step as MBH98 did it, with white noise series and a simulated PC1. These also had spurious RE statistics. This neatly added to our previous results and completely refuted Huybers’ point. However, the exchange was a net addition and worth being in print. Because the issue is so argumentative, I’m sure that the Manns of the world will focus on the Comment and ignore the Reply, but I guess that’s part of life.

Huybers also argued that PC calculations on tree ring networks should be done using a correlation matrix rather than a covariance matrix (what he called “full normalization”). We used a covariance matrix calculation in our emulation of MBH98 since MBH98 said that they used “conventional” PC methods and the covariance matrix is the default option in conventional PC calculations. Thus calculating PCs using a covariance matrix is hardly an “error” as an implementation of reported MBH98 methods. Ironically, given our criticism of short-segment standardization and Huybers’ argument for “full normalization”, Huybers’ diagrams applied a form of short-segment standardization and it was this short-segment standardization that yielded any rhetorical effect in these diagrams. (This part of his submission may not survive refereeing.) We also did some neat calculations showing that 1) the PCs calculated using the covariance matrix had better properties as predictors of the full network results, standing one of Huybers’ claims on its head; 2) the differences result in added emphasis to bristlecones, which only makes sense if these are valid proxies; 3) a really nice calculation in which we applied a result long known in econometrics, i.e. that the sample standard deviation is a biased under-estimate of long-run variance in a time series with serial autocorrelation. If the series were normalized using a HAC-consistent variance estimator (e.g. Newey-West 1987) then the bristlecones go even lower down in the PC rankings. This was an interesting and clever point in its own right.
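The econometric point about autocorrelated series can be seen in a toy calculation. This is only a sketch under stated assumptions, not the calculation from the Reply: it simulates an AR(1) series with a made-up coefficient of 0.8 and compares the ordinary sample variance with a Bartlett-kernel (Newey-West) long-run variance estimate. The truncation lag of 10 and the function names are arbitrary choices made here.

```python
import random

def autocov(x, lag):
    """Sample autocovariance at the given lag (divisor n)."""
    n = len(x)
    mean = sum(x) / n
    return sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n)) / n

def newey_west_variance(x, max_lag):
    """Bartlett-kernel estimate of long-run variance with linearly declining weights."""
    total = autocov(x, 0)
    for j in range(1, max_lag + 1):
        weight = 1.0 - j / (max_lag + 1)
        total += 2.0 * weight * autocov(x, j)
    return total

random.seed(2)
phi, n = 0.8, 2000                 # assumed AR(1) coefficient and series length
series = [0.0]
for _ in range(n - 1):
    series.append(phi * series[-1] + random.gauss(0, 1))

sample_var = autocov(series, 0)
long_run_var = newey_west_variance(series, max_lag=10)
print("sample variance:   ", round(sample_var, 2))
print("long-run variance: ", round(long_run_var, 2))
```

For a positively autocorrelated series like this one, the long-run estimate should come out several times larger than the sample variance, so dividing by its square root shrinks strongly persistent series more than dividing by the ordinary standard deviation would.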

Wahl and Ammann touched on the issues raised by Huybers in connection with correlation vs covariance matrices and this was one reason for their rejection. There were additional problems with their submission. It contained many misrepresentations and mischaracterizations and would have required repeated editing. It was also structured as a “pyramid scheme” in which points supposedly argued in their Climatic Change submission were carried forward into their GRL piece without any proof or argument. These unsupported points were not merely mentioned in passing, but were carried forward into the conclusions and then into the abstract which you see at the UCAR website. I’m sure that there were a variety of reasons for the WA rejection.

Even if the Huybers comment doesn’t go forward, there are points in our Reply that I’d like to see in play and we would re-format and re-submit these points somewhere.
John

Posted Jul 19, 2005 at 11:45 AM

It’s darkly ironic that after so much carping about “peer reviewed articles” in “quality scientific journals”, Mann is reduced to quoting his own unpublished work and another rejected piece co-authored by his PhD student, in justification of his original study.

Remember that, Steve? It was only six months ago…
Max

Posted Jul 19, 2005 at 3:00 PM

Could it be that there is a “Memory of Water” effect (as I call it) in this whole issue? I mean, sometimes people expect certain results from an experiment and so they are likely to deny any other explanation. It was the same with the French scientist Jacques Benveniste, who thought that water really has a memory (an idea used in homeopathy). An independent study was then done by the BBC and several scientific institutions (Royal Society), backed by James Randi (an American sceptic), which proved everything to be a subjective hoax and showed that there is indeed no evidence (besides placebo effects) for any such thing as a memory of water.

Could it be the same in other areas of science (especially the highly politicized field of climate research)?
Roger Bell

Posted Jul 19, 2005 at 3:57 PM

Can someone fill me in on this business of Comments, please? I’d never heard of them before. Are they unique to GRL or do other journals have them? Is the idea that having “Comments” gives people a chance to criticise a paper without having to write a complete paper themselves?
Ed Snack

Posted Jul 19, 2005 at 5:52 PM

Quick note, the url posted by Michael needs an “f” added to make the .pdf file download. Correct is http://web.mit.edu/~phuybers/www/Hockey/Huybers_Comment.pdf
Michael Mayson

Posted Jul 19, 2005 at 11:58 PM

Ed, thanks for the correction – I seem to have a habit of posting incorrect links!
Larry Huldén

Posted Jul 22, 2005 at 12:50 AM

RE #4
“Comments” are also complete papers, although they comment on particular issues in another paper. You can find them in most journals. They are often important contributions to the understanding of the research results. A “complete paper” might include errors in statistics, omission of data, misinterpretations of causality, or results that have already been published in another context but were missed by the authors: anything that the referees didn’t observe. Because of this, the original authors usually get a chance to “reply” to clarify their position or even admit their mistakes. If the comments happen to include errors or obvious inconsistencies and the “reply” can show it, the comments are usually not published.

The current case of Wahl & Ammann’s submitted paper (“comments”) to GRL is exceptional because it was “published” on the UCAR website before acceptance in GRL. After the “reply” by McIntyre & McKitrick, GRL decided not to publish Wahl & Ammann’s “comments”. The “comments”, however, still remain on the UCAR website as “submitted to GRL”. In this context it is quite fair that the “reply” is also available on the internet.

Steve: The reply that we posted up is to a 4th submission, not to Wahl and Ammann. We were not required to write a Reply to Wahl and Ammann, which was rejected without our needing to write a Reply.
TCO

Posted Sep 20, 2005 at 11:26 PM

Any skinny on why WA got the reject? What is status of stork comment?