Errors Matter #1: the no-PC Alternative

Mann et al. have responded to our criticism by claiming that the errors which we have identified “don’t matter” because they can “get” MBH-type results under several different methods, one of which is through not using any PCs. Ross and I previewed an initial reply to these arguments here and plan to issue a PDF version of our reply. In this post, I’ll amplify our earlier discussion, starting with Mann’s no-PC salvage proposal. For a variety of reasons – abandonment of any pretence at even spatial sampling, non-robustness to bristlecone pines and lack of statistical skill over a range of verification statistics – the no-PC reconstruction fails to salvage MBH98.

Here is a typical statement of how the Hockey Team presents this argument:

We quickly recap the points for readers who do not want to wade through the details: i) the MBH98 results do not depend on what kind of PCA is used, as long as all significant PCs are included, ii) the results are insensitive to whether PCA is used at all (or whether all proxies are included directly), and iii) the results are replicated using a completely different methodology (Rutherford et al, 2005).

This hyperlink restates arguments made at realclimate on Dec. 4, 2004, even before our papers were released, and underpins public statements by realclimate coauthors Gavin Schmidt and William Connolley.

Representations and Warranties: Now that the errors of their PC methodology are being understood, it is my view (and it seems self-evident to me) that it is insufficient for Mann et al. to merely “get” a hockey stick shape some other way – they have to do so while continuing to satisfy the representations and warranties of MBH98, which led to the widespread acceptance of this study. Among the most important such representations are the following:

1. “a reasonably homogeneous spatial sampling in the multiproxy network” was ensured by representation of “densely sampled regional dendroclimatic data sets” through principal components analysis (p. 779);
2. “the long-term trend in NH [temperature] is relatively robust to the inclusion of dendroclimatic indicators in the network” (p. 783, re-stated in Mann et al. [2000], A Further Note);
3. the MBH98 network attained a “high level of skill … in large-scale reconstruction back to 1400” (p. 785). Recently, Mann has amplified this by saying that studies which do not satisfy a number of “statistical verification exercises” should not be considered in climate studies;
4. the proxies were selected according to “objective criteria” [Mann et al., 2000];
5. the proxies are linearly related to large-scale climatic patterns, and are neither local, non-linear nor non-climatic (p. 780).

Let’s look here at the consistency of the no-PC alternative with the first three MBH98 representations.

Even Spatial Sampling: In the 1400-1450 period of the MBH98 Northern Hemisphere temperature reconstruction, “reasonably homogeneous spatial sampling” was achieved by representing 76 North American tree ring chronologies by 3 PC series. Without PC representation (or some other representation method), North American tree ring series would account for 80 of 95 series in this step (of which 20 were bristlecone pine ring width chronologies) – an obvious failure of “reasonably even spatial sampling”. Instead, through PC representation, MBH98 temperature calculations in the regression steps were done with 22 proxy series, of which 7 were still based on North American tree ring networks (2 PC series from the North American network, 1 PC series from the Stahle/SWM network, the extrapolated Gaspé series and 3 Stahle precipitation reconstructions from tree ring chronologies). Even after PC representation, 7 of the 22 series (32%) used in the temperature calculations for this period were based on North American tree ring data, which we believe already strains the limits of “reasonably homogeneous spatial sampling”. A calculation with 80 of 95 proxies being North American tree ring series would be a total abandonment of the premise of reasonably even spatial sampling. Had such a reconstruction been proposed in the first instance, we do not believe that it would have been taken seriously or could have been used to ground major policy initiatives. We see no reason why it should be taken seriously now.
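As a quick check on the arithmetic above, here is the share of the AD1400 roster taken up by North American tree ring series with and without PC representation. The counts are taken from the preceding paragraph; the snippet is just a calculator for illustration, not anything from MBH98, and the helper name is my own.

    # Spatial-sampling arithmetic for the AD1400 step, using the counts
    # quoted in the paragraph above (illustrative only, not MBH98 code).
    def na_treering_share(na_series: int, total_series: int) -> float:
        """Fraction of the proxy roster made up of North American tree ring series."""
        return na_series / total_series

    # MBH98 as published: 22 series enter the regression step, 7 of them
    # derived from North American tree ring networks.
    print(f"With PC representation:    7 of 22 = {na_treering_share(7, 22):.0%}")    # ~32%

    # The no-PC alternative: the North American tree ring series enter
    # directly, accounting for 80 of the 95 series in this step.
    print(f"Without PC representation: 80 of 95 = {na_treering_share(80, 95):.0%}")  # ~84%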

Robustness: this salvage reconstruction does not deal with the issue of non-robustness to bristlecone pines. If the bristlecone pines are removed from this network, there is no hockey stick even under a no-PC method. Robustness was an important representation for the acceptance of MBH98; the no-PC salvage alternative needs to meet this representation and it does not.

Statistical Skill: Mann claimed great statistical skill for MBH98, a claim which we dispute in MM05 (GRL) but which was essential to the acceptance of MBH98. Mann has recommended the following for a series which lacks “skill”:

[it] fails statistical verification exercises, rendering it statistically meaningless and unworthy of discussion in the legitimate scientific literature

Mann does not provide a suite of verification statistics or digital information for the no-PC reconstruction, only the RE statistic for two different versions of it (0.39; 0.33). From related calculations, our surmise is that the R2 statistic for the no-PC reconstruction (as for MBH98 itself in this period) will be approximately 0.0 and not statistically significant. Contrary to claims by Mann et al., we do not argue that analysts should look exclusively at the R2 statistic and ignore the RE statistic; we have argued that they should look at both of these statistics (and others) – hardly a controversial point. In MM05 (GRL), we provided a benchmark distribution of RE statistics under the data mining methods used in MBH98. The RE statistics of the no-PC salvage reconstruction are only slightly above the median of this benchmark distribution. Taken together with the statistically insignificant values of other verification statistics, it is obvious that the salvage reconstruction fails the “statistical verification exercises” invoked at realclimate. Mann has stated in arguments against us that a reconstruction which fails “statistical verification exercises” should not be considered in climate science. What’s sauce for the goose is sauce for the gander.
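For readers who want the definitions: the RE and R2 statistics referred to above are sketched below in a short, generic Python implementation. This is an illustration using the standard formulas and made-up numbers, not Mann et al.’s code or data, and the function names are my own. The toy example at the end shows how a reconstruction can post a positive RE simply by matching the average level of the verification period while its R2 stays near zero – which is why we argue that both statistics (and others) need to be examined.

    # Generic verification statistics (illustrative sketch, not MBH98 code).
    import numpy as np

    def reduction_of_error(obs, recon, calib_mean):
        """RE: improvement over a 'no-knowledge' forecast of the calibration-period mean.
        RE = 1 is a perfect fit; RE <= 0 indicates no skill over that baseline."""
        obs, recon = np.asarray(obs, float), np.asarray(recon, float)
        return 1.0 - np.sum((obs - recon) ** 2) / np.sum((obs - calib_mean) ** 2)

    def verification_r2(obs, recon):
        """Squared Pearson correlation over the verification period."""
        return np.corrcoef(np.asarray(obs, float), np.asarray(recon, float))[0, 1] ** 2

    # Made-up example: a reconstruction that matches the average level of the
    # verification period, but not its year-to-year variations, scores a
    # clearly positive RE while its R2 remains near zero.
    rng = np.random.default_rng(1)
    obs = 0.5 + rng.normal(0, 0.15, 49)      # synthetic verification-period observations
    recon = 0.5 + rng.normal(0, 0.15, 49)    # right level, unrelated wiggles
    print(reduction_of_error(obs, recon, calib_mean=0.0))   # clearly positive
    print(verification_r2(obs, recon))                      # near zero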

Thus, for a variety of reasons – abandonment of even spatial sampling, non-robustness to bristlecone pines and lack of statistical skill over a range of verification statistics – the no-PC reconstruction fails to salvage MBH98. I will discuss the consistency of the other salvage alternatives with MBH98 representations in the next few days.

10 Comments

  1. Louis Hissink
    Posted Feb 12, 2005 at 12:39 AM | Permalink

    Steve,

    One of the more interesting features of the various historical records published in graph form seems to be the policy of displaying past temperatures as temperature anomalies. The various graphs you have here all show anomaly values within a range of 1 degree Celsius. Why does no one publish the actual mean temperature per year over time for the last 10,000 years? This would seem a useful thing to show – or, if that has been done, might there be a problem?

    And a temperature range within 1 degree Celsius would be physically imperceptible, yet all the debate is about variation within this minuscule range – talk about making mountains out of molehills!

  2. John A.
    Posted Feb 12, 2005 at 5:05 AM | Permalink

    Yes, but how much skill does it take to make over 300 errors of collation, foul up statistical sampling using “conventional” PCA (which was nothing of the kind), use invalid proxies (like 1 or 2 cedars to represent the early 15th century), and produce a reconstruction of climate whose inputs cannot be distinguished from random noise?

    I think the real skill of Mann and Co. was selling this shockingly bad piece of statistical manipulation and poor statistical control as the new paradigm of climate reconstruction. The IPCC (or at least the three or four people who control it) took this reconstruction at face value and then communicated it to the world as the key evidence that 20th century warming is anomalous.

    I think it’s fair to say that this study is key to the claim of “greenhouse warming”, a key shibboleth of the global warming industry. As such it cannot be abandoned, for without this study, it is impossible to keep the public alarmed, and the money flowing to the vested interests of eco-alarmism.

  3. John A.
    Posted Feb 12, 2005 at 5:48 AM | Permalink

    One further comment:

    Mann et al. have responded to our criticism by claiming that the errors which we have identified “don’t matter” because they can “get” MBH-type results under several different methods, one of which is through not using any PCs.

    Two points:
    If Mann and Co. can get Hockey Stick type results without bad programming and worse statistical control, then they should withdraw MBH98 and produce the goods. It’s no use saying “our methods and statistical quality were trash before, but we can replicate our results with proper methods and statistically sound data”. Replicating the Hockey Stick (or at least saying that you can) does not validate the poor science that went into MBH98 or 99.

    Bet they can’t though.

  4. Louis Hissink
    Posted Feb 12, 2005 at 6:05 AM | Permalink

    Especially intriguing is the view that temperature anomaly fluctuations of less than 1 Kelvin are considered “significant”. The impression I gain is that mean temperatures are so stable over time that special techniques have to be used to emphasise the minute fluctuations in the mean temperature over time. It reminds me of the enhancement techniques we use in highlighting geochemical data, where extremely low values need to be extracted from the background signal.

    I would say that it is extremely misleading to display the historical temperatures as temperature anomalies using a highly exaggerated vertical scale – my initial impression was that it represented temperature, but on studying the axis labelling you find that it doesn’t. Most lay people would not understand this subtlety and would assume that it represents the historical temperatures.

    Hence my earlier comment about the lack of any published graphs showing the mean global temperature trend over time.

    There are lies, damned lies and statistics.

  5. jon gilbert
    Posted Feb 15, 2005 at 11:41 AM | Permalink

    Steve, you eco-heretic you! How dare you use real scientific methodologies in the context of global warming. As Dr. Pournelle rightly points out on his site *www.jerrypournelle.com*, no one is certain what is happening with climate change, much less why, and legislative efforts to ‘fix’ a problem (which may be neither manmade nor real) are pointless and can lead to far greater damage.

    The scare mongers on the ‘left’ are so devoid of rational thought it would boggle one’s mind, if it were not so tragic and so expected. Sigh.

    Excellent work and an excellent exposé of the political correctness of Nature and, by extension, other ‘peer reviewed’ journals. If they are not willing to publish your paper BECAUSE it is accurate, reproducible, and, most damning of all, in opposition to what all ‘right thinking’ people are supposed to believe, that is their problem. In addition, I found their repeated ‘explanations’ and rejections both spurious and disingenuous. It would have been far more honest of them to simply refuse it because they did not understand the point, or (more accurately) because they did and were afraid of the political backlash that your demolition of the Mann model may have engendered.

    All of which goes to point out the extreme scientific illiteracy of the current climate change debate. Warming, cooling, or stable, no one can say for certain, and until we KNOW what is going on, punishing some industrialized nations (primarily the US) while exempting other heavily industrialized (and industrializing) nations (e.g. the People’s Republic of China) is not the way to ‘fix’ anything. Except the egos of those who are advocating it, of course. And the attorneys.

    Most respectfully.

  6. TCO
    Posted Sep 11, 2005 at 12:50 PM | Permalink

    If the no-PCA version is the way to do the analysis, then publish that. But Mann keeps trying to jump away from defending his PCA methodology. Either it was OK or it wasn’t. He’s trying to avoid addressing the point where he is in danger.

    Pussy.

  7. Stephen Phillips
    Posted Oct 1, 2005 at 11:52 AM | Permalink

    It sounds like you have had to put together your own version of Mann’s no-PC reconstruction, so I’m just wondering how reliable your estimation can be of both its R2 statistic and its other verification statistics. I’d be interested to know what difficulties, if any, were involved in reproducing it, so I can get a better idea of how accurate the skill assessments are.

    Many thanks.

  8. Stephen Phillips
    Posted Oct 13, 2005 at 9:00 AM | Permalink

    Still waiting for an answer on this point, which seems crucial given that the construction of the PCA is so contentious. At present, your article appears to present pure speculation in the form of something authoritative and conclusive. If you were unable to calculate the skilfulness of Mann’s reconstruction, then you should say so clearly, rather than muddying the waters with words like “surmise”. What on earth is that supposed to mean in the context of precise calculations?!

  9. Dave Dardinger
    Posted Oct 13, 2005 at 10:10 AM | Permalink

    Stephen,

    That’s all in the published papers, which are linked in the sidebar under “Articles”. Also, when W&A published their supposedly “independent” version of MBH98, which included actual code which Steve could implement, it also had about 0 for the R statistic, which is probably why W&A didn’t put it in their paper either. If you look around at the articles here (especially under Categories / MBH98, and under replication and wahl and ammann), you’ll find plenty of discussion of what you’re asking.

  10. Steve McIntyre
    Posted Oct 13, 2005 at 11:28 AM | Permalink

    Re #7: Sorry that I missed your query. At the time that I posted this, I hadn’t done a replication of his no-PC method.

    My surmise that the cross-validation R2 will be approximately zero is based on bristlecone imprinting. The no-PC method is just a variation of a bristlecone-dominated reconstruction. This is the basis of my estimate: a bristlecone-dominated reconstruction (using PCs) has a cross-validation R2 of ~0, and so (almost certainly) will a bristlecone-dominated reconstruction developed another way; see the short synthetic sketch after this comment for why the two ways of aggregating a dominated network end up as nearly the same index.

    It wouldn’t do any harm to emulate his no-PC method to show this, and I’ll try to get to it. But before you blame me too much for not doing this, why don’t you blame Mann et al. for withholding this information? And why don’t you ask Mann or Ammann at realclimate what the cross-validation R2 is for the no-PC result? I doubt that you’ll get an answer.

    Please note that the “no-PC” reconstruction abandons any pretence of geographical balance and is overwhelmingly just an index of Southwestern U.S. tree growth (especially bristlecones). This is not to say that PCs are mandated as a way of avoiding this geographical imbalance, only that the geographical imbalance needs to be recognized and dealt with in some way.
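
    A minimal synthetic sketch of the point above, assuming nothing about the actual MBH98 data: when one block of series dominates a network and shares a common signal, the first principal component of the network and the plain average of all series are nearly the same index, so verification statistics computed from either will be nearly the same. The data below are made up and the code is generic Python, not an emulation of Mann’s no-PC method.

        # Synthetic illustration (made-up data; not MBH98 or Mann's no-PC code):
        # when a dominant block of series shares a common signal, PC1 of the
        # network and the simple average of all series are nearly the same index.
        import numpy as np

        rng = np.random.default_rng(42)
        n_years = 581                                     # e.g. an AD1400-1980-style span
        signal = np.cumsum(rng.normal(0, 0.1, n_years))   # shared "growth pulse"

        # 20 series dominated by the shared signal, 5 independent-noise series
        block = signal + rng.normal(0, 0.5, (20, n_years))
        others = rng.normal(0, 1.0, (5, n_years))
        proxies = np.vstack([block, others])

        # "No-PC" index: plain average of the standardized series
        std = (proxies - proxies.mean(axis=1, keepdims=True)) / proxies.std(axis=1, keepdims=True)
        mean_index = std.mean(axis=0)

        # PC1 time series of the same standardized network (via SVD)
        u, s, vt = np.linalg.svd(std, full_matrices=False)
        pc1 = vt[0]

        print("correlation between PC1 and the plain average:",
              round(abs(np.corrcoef(mean_index, pc1)[0, 1]), 3))   # typically > 0.95

    Because the two indices are almost interchangeable in this setup, a verification statistic computed against an independent target series will come out essentially the same whichever aggregation is used.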