Nature’s Statistical Checklist for Authors

Nature’s Guide to Authors includes an excellent statistical checklist with which authors are asked to comply to "ensure statistical adequacy". I’ve reproduced the checklist below, bolding a couple of interesting criteria. Readers of this blog can readily imagine how this checklist would apply to MBH98 or, for that matter, to Moberg et al. [2005].

One wonders sometimes if the left hand knows what the right hand is doing at the big science journals. Nature’s handling of statistics reminds me of Science’s handling of data archiving. In both cases, the policy is terrific, but neither journal seems to have any procedures for implementing the policy for paleoclimate articles. Maybe they are better on medical and biological topics.

As you see below, Nature has a policy requiring that "Any data transformations are clearly described and justified". Whatever else one may think of our criticism of Mann’s PC method, it remains unarguable that the PC methodology used in MBH98 was not "conventional" and that the data transformation was not "clearly described and justified." Obviously the editors and reviewers were unaware of this at the time of the original article. But what about at the time of the Corrigendum? By then, Nature editors were clearly aware of the data transformation prior to the PC calculations. Let’s say that they, in good faith, felt that our own submission had not demonstrated that the data transformation "mattered" in terms of its ultimate effect. That does not excuse their failure to insist on a proper description and justification of the data transformation in the Corrigendum.

One could go on and on. I think that I’ve pointed out the small size of the Moberg data set, as well as the extraordinary non-normality of key data sets. It is inconceivable to me that Moberg et al. could have considered the Nature statistical checklist below and reported that they were in compliance with it. So one presumes that, at no point in the Nature editorial process, did anyone ask the authors to confirm that they had carried out the statistical checks listed in Nature’s policy, or verify that they had. Just imagine what a questionnaire on MBH98 would look like.

The nice thing about policy statements like this is that they give objective standards for evaluating articles like Moberg et al 2005 or even MBH98. I think that I’ll submit Nature’s checklist to the NAS panel. Continue reading

A New Spaghetti Graph

Von Storch and Mann have both said that, in an MBH98-type reconstruction, it is impossible to allocate the impact of individual proxies. This is incorrect as we pointed out in MM05b. My posts on MBH98 Linear Algebra showed this more clearly (or at least in more detail). However, those posts only took the analysis back to the PC series. Since the bristlecones were represented in the PC series, this by itself did not segregate the bristlecone impact, other than indirectly through the PC series, and the connections have not always been as clear to others as they have been to me.

However, since the tree ring PC series are themselves linear combinations of the underlying tree ring networks, with a little more linear algebra, the approach of those posts can be extended to represent the MBH98 NH temperature reconstruction as a linear combination of the individual proxies. This, in turn, makes it possible to group the individual proxies into classes and show the effect of each class.
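The underlying linear algebra is elementary and can be sketched in a few lines (synthetic matrices here, not the actual MBH98 data): if the reconstruction is a linear combination of the PCs, and each PC is itself a linear combination of the proxies, then matrix multiplication collapses the two steps into a single weight per proxy, and the class contributions are just partial sums.

```python
import numpy as np

rng = np.random.default_rng(0)
n_years, n_proxies, n_pcs = 581, 95, 2

X = rng.normal(size=(n_years, n_proxies))  # proxy matrix (synthetic stand-in)
W = rng.normal(size=(n_proxies, n_pcs))    # loadings: PC k is X @ W[:, k]
a = rng.normal(size=n_pcs)                 # regression weights on the PCs

# The reconstruction computed through the PCs...
recon_via_pcs = (X @ W) @ a

# ...is identical to a direct linear combination of the individual
# proxies, with composite weight (W @ a)[i] on proxy i.
proxy_weights = W @ a
recon_direct = X @ proxy_weights
assert np.allclose(recon_via_pcs, recon_direct)

# Grouping proxies into classes just sums their contributions:
classes = rng.integers(0, 9, size=n_proxies)   # hypothetical class labels
class_contrib = np.stack([X[:, classes == c] @ proxy_weights[classes == c]
                          for c in range(9)], axis=1)
assert np.allclose(class_contrib.sum(axis=1), recon_direct)
```

The class contributions sum back to the full reconstruction by construction, which is what makes a graphic of per-class contributions meaningful.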

Here I’ve done the calculations so that I obtain the MBH98 temperature reconstruction (working here only with the 15th century proxies) as a linear combination of the 95 individual proxies in the 15th century network. I’ve used 9 classes, defined by joint continent/proxy type, distinguishing bristlecones from other North American tree rings. (I’ve grouped Gaspé with the bristlecones, because Mann fiddled with this series to get it into the 15th century network.) Thus the classes are: Asia tree rings; Australia tree rings; European ice core; Bristlecones (and Gaspé); Greenland ice core; non-bristlecone North American tree rings; South American (Quelccaya) ice core; South American tree rings.

Figure 1 (top panel) shows the absolute contributions of each continent-proxy class to the MBH98 15th century reconstruction (bristlecones in red). This vividly shows the noise of the other networks. If the final reconstruction were overlaid on this graphic, it would overlap the bristlecone contribution almost exactly. The bottom panel of Figure 1 shows all 9 series in the standardized format of the spaghetti graphs. What the Mann weighting system does is pick out the bristlecones from the noise (by enhancing their weights). On another occasion, I’ll do a similar graphic without the bristlecones (which is the supposed “MM reconstruction”).

You can see quite easily how, by enhancing the weight of the bristlecones and reducing the weight of all the other proxies, you can “get” a hockey stick. You have to work pretty hard to “find” the bristlecones in this pig’s breakfast of noise; that was Mann’s “new” statistical method. If you take the bristlecones out of this system, there is no HS.

Figure: Spaghetti graph showing top- absolute contribution to MBH98 reconstruction (1400-1980 for AD1400 step proxies) by the following groups: Asian tree rings; Australia tree rings; European ice core; Bristlecones (and Gaspé); Greenland ice core; non-bristlecone North American tree rings; South American ice core; South American tree rings. Bottom – all 9 contributors standardized.

A Slight Change to NAS Panel Terms of Reference

Readers may recall the consternation of the NAS Panel when von Storch (and ourselves) started presenting answers to some of Boehlert’s questions. I received notice today from NAS that:

You might also notice that there have been a few minor changes to the Committee’s Statement of Task.

Continue reading

Weblog update: fixed some problems

Sorry about the comment outage a few hours ago. I had to reset the Spam Karma logs and forgot I was supposed to reinitialize them, hence the error. The problem was that some people (including Ross McKitrick) immediately fell foul of the spam filter so I had to tell Spam Karma that these were nice people again. Let me know if you feel unjustly labelled a spammer…

Also fixed (although I don’t know why it stopped working) is the live comment preview. When you start typing your comment, a preview of what it looks like with all the tags in place appears below the comment box.

Steve also has his file uploader back (this is all background stuff that WordPress users will know about but everyone else couldn’t care less about).

There are some other little tweaks I might experiment with which will improve the look of the site somewhat and give Steve more superpowers.

Oh well, back to reading the R manual….

More on PCs

DF criticized my post on principal components yesterday as follows:

Most of your figures for conventional PC analysis are misleading. You are comparing PCA1 to mean as if PCA1 has an intrinsically meaningful scale, when it does not. If you rescaled your comparison plots so that PCA1 and the mean had the same variance, then the results would be nearly indistinguishable (aside from questions of orientation). I do not believe such near equivalence holds for the Mann, offset-centered method.

I disagree with this comment on a number of grounds. In fact, I think that the scaling appropriately illustrates the near-identity of the PC1 and the first HS-shaped series. I agree with DF that, in the circumstances of this example, the rescaled mean approximates the PC1, but I take home an entirely different message: this illustrates the well-known non-robustness of the mean and illustrates the need for climate scientists to use a robust measure of location. Continue reading
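The non-robustness point is worth a one-line numerical illustration (my own toy numbers, not DF’s or anyone else’s data): a single outlier drags the mean arbitrarily far, while the median barely moves.

```python
import numpy as np

x = np.zeros(10)
print(np.mean(x), np.median(x))    # both 0.0 before the outlier

x[0] = 10.0                        # a single outlier among ten values
print(np.mean(x), np.median(x))    # mean jumps to 1.0; median stays 0.0
```

With a robust measure of location (median, or a trimmed mean), one high-variance series cannot dominate the composite the way it does under either the mean or, worse, the PC1.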

Rob Wilson on Bristlecones

Rob Wilson sent in a post on another thread arguing that bristlecones are not as bad a proxy as I would have everyone believe. Unlike realclimate, opposing views are not censored here. In fact, I’m happy to highlight them. I’ll read Rob’s note and reply on another occasion. I’ll only note now that, in our discussions of bristlecones, especially in EE [2005], we relied on specialist publications such as Graybill and Idso [1993], Hughes and Funkhouser [2003], and even (implicitly) IPCC 2AR, as questioning the validity of bristlecones as a temperature proxy, rather than arguing the point ourselves from first principles; otherwise I won’t editorialize further here, but will re-visit the topic on another occasion. Continue reading

Some Principal Components Illustrations

TCO has been pressing about the exact impact of various properties of the MBH PC methodology, asking some "elementary" questions about PC impact. Some readers have criticized him for in effect asking for a tutorial on PC methods. However, if someone asked: where can I find an article showing the statistical properties of PC methods applied to time series, I don’t think that I could give a reference that would be helpful for what we’re talking about, other than our articles, which sort of start in the middle. Some of the properties that concern me are very elementary in mathematical terms, but the surprise that came from our GRL article indicated that these mathematically elementary properties had not been thought about.

Arguably, since Mann proposed using PC methods to extract "signals" from tree ring networks, the obligation to demonstrate the validity of the MBH98 PC1 as a temperature proxy should rest with him. However, he didn’t do so at the time.

Be that as it may, I’ve spent quite a bit of time thinking about the properties of PC methods as a means of recovering "signals". There are two layers of issues with respect to Mannian PC methods: 1) problems with the Mannian method relative to conventional methods; 2) problems with PC methods themselves as applied to tree ring networks. There’s no statistical rule that says that PC methods are an appropriate way of extracting temperature proxies – surely that has to be proven. There are comments in our Reply to von Storch which refer to these issues. (In both our Replies, we introduced some new material because we were trying to be thoughtful. However, in the sound bite world of climate science, no one seems to have picked up on these comments.) Anyway here are a few more illustrations. One nice thing about blogs is that you’re not limited to 12,000 characters.

Figure 1 is constructed as follows: series 1 goes from 0 to 1 between 1902 and 1980, while series 2-10 are 0. All series are then blurred with white noise with a small standard deviation (sd = 0.05). One reason for blurring with white noise is that principal component methods carry out singular value decomposition on matrices, and this avoids singularity. (The singularity may not "matter", but there’s no reason not to avoid it.) As you can see, there is a big difference between the simple average and the PC1. The PC1 is obtained by a linear weighting of the underlying series: the weight on series 1 is 0.9994, which causes it to contribute more than 99.89% of the variance to the "composite" PC1. The simple average (red) is quite different. This illustrates a big difference between PC methods and averaging.


Figure 1. Series 1 goes from 0 to 1 from 1902 to 1980. Series 2-10 are 0. All series blurred with white noise, sd = 0.05. Weight on series 1 is 0.9994.
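The construction is easy to replicate; here is a minimal sketch (numpy’s SVD standing in for whatever PC routine one prefers, and my own seed, so the exact weight will differ slightly from 0.9994). Running it with both 10 and 100 series shows the attenuation point in one pass: enlarging the network shrinks the outlier’s imprint on the mean, but the PC1 keeps loading almost entirely on the outlier.

```python
import numpy as np

def pc1_vs_mean(n_series, seed=1):
    """One ramp series (0 to 1 over 1902-1980) among zero series, all
    blurred with white noise (sd = 0.05); conventional centered PCA."""
    rng = np.random.default_rng(seed)
    years = np.arange(1400, 1981)
    X = rng.normal(scale=0.05, size=(len(years), n_series))
    X[:, 0] += np.clip((years - 1902) / (1980 - 1902), 0, 1)

    Xc = X - X.mean(axis=0)              # center on the full-period mean
    w = np.linalg.svd(Xc, full_matrices=False)[2][0]
    if w[0] < 0:                         # PCs have no intrinsic orientation
        w = -w
    return w, Xc @ w, X.mean(axis=1)

for n in (10, 100):
    w, pc1, mean = pc1_vs_mean(n)
    print(f"n={n:3d}: weight on series 1 = {w[0]:.3f}, "
          f"mean amplitude = {np.ptp(mean):.3f}, "
          f"PC1 amplitude = {np.ptp(pc1):.3f}")
```

The weight on series 1 stays near 1 at either network size, while the mean’s amplitude shrinks roughly as 1/N.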

Figure 2 shows the same set-up but with 100 series in total. As you see, the PC1 is essentially unchanged, but the amplitude of the mean is now reduced to nearly 0. So while averaging over a larger and larger network will gradually attenuate the impact of an outlier, PC methods in this sort of time series context will consistently pick out the high-variance outlier and pass it through essentially unscathed into the PC1. That’s why we use the term "data mining" in connection with PC methods.


Figure 2. As Figure 1, but with 100 series. Label in 2nd panel should read 99 series.

Figure 3 shows the set-up from Figure 1, but using the Mannomatic PC method. In this context, the Mannomatic makes little incremental difference (I’ll show below how it does affect things).


Figure 3. As with Figure 1, but with Mannomatic PC method.
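For concreteness, the decentering step that distinguishes the Mannomatic can be sketched in a few lines (a simplified rendering: it omits MBH98’s rescaling by a detrended calibration-period standard deviation, and the calibration window is the 1902-1980 period). Centering on the short calibration segment rather than the full period is what lets a series that is flat in the calibration period but offset before it carry spurious "variance":

```python
import numpy as np

def decentered_pc1(X, years, calib=(1902, 1980)):
    """PC1 under short-segment ("Mannomatic") centering: subtract each
    series' mean over the calibration period only, not the full-period
    mean. Simplified sketch; omits the detrended-sd rescaling."""
    mask = (years >= calib[0]) & (years <= calib[1])
    Xc = X - X[mask].mean(axis=0)        # the decentering step
    w = np.linalg.svd(Xc, full_matrices=False)[2][0]
    return Xc @ w, w

# A step series: 1 before 1902, 0 afterwards. Its calibration mean is
# ~0, so decentering leaves the pre-1902 offset intact, and the PC1
# loads almost entirely on it.
years = np.arange(1400, 1981)
X = np.random.default_rng(2).normal(scale=0.05, size=(len(years), 10))
X[:, 0] += (years < 1902).astype(float)
pc1, w = decentered_pc1(X, years)
print(f"loading on the step series: {abs(w[0]):.3f}")   # near 1
```

Under full-period centering, the same step series would lose most of its offset and compete with the noise series on more even terms.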

What happens when you have both a front-end HS and a back-end HS? This is illustrated in Figure 4. In an actual PC calculation, the PC series might be pointing up or pointing down (the PC series intrinsically have no orientation). I’ve arranged things so that they point up, since that is Mann’s method. As I’ve pointed out before, the MBH99 HS points down. (The flipping of the PC series is a very different issue from the flipping of individual series to match.) The take-home point here is that, in this set-up, the front-end and back-end HS are allocated opposite signs. Does this "matter"? Well, I happen to think that people should know the time series properties of their methodologies before they are used in big reports. Also, the properties of the PC algorithm that do this do other things as well, so I’m disinclined right now to agree that the properties can be analytically separated (though I don’t preclude changing my mind on this).


Figure 4. Ordinary PC method. Weights are 0.67 and -0.94 to two "dominant" series and under 0.02 for all others.

Figure 5 shows the same set-up using the Mannomatic. So this effect happens under ordinary PC methods as well as the Mannomatic. I’m sure that the effect is more intense or more frequent under the Mannomatic in some sense that could be defined, but I’ve not had occasion to delimit it precisely.

Figure 5. As with Figure 4, Mannomatic version.

For Figure 6, I’ve modified the setup of Figure 1 so that 9 series contain an actual "signal", generated by an ARMA(1,1) process (ar = 0.9; ma = -0.6) to mimic the ARMA features of many actual temperature series. Then I’ve added in white noise as above. I picked a standard deviation for the signal that I thought would illustrate the point, but I didn’t fiddle with it to get this result. Figure 6 shows the PC1 using a conventional calculation. In this case, the outlier pulls the average up a little bit at the close, while the PC1 picks up the signal a little better than the average.


Figure 6. As with Figure 1, but 9 series also have a "signal".
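The signal experiment is also easy to reproduce (a sketch with my own assumed innovation standard deviations and seed, not necessarily the ones behind the figure): nine series share one ARMA(1,1) realization, one series carries the ramp, and everything is blurred with white noise. Under conventional centering, the PC1 tracks the common signal.

```python
import numpy as np

rng = np.random.default_rng(3)
years = np.arange(1400, 1981)
T = len(years)

# ARMA(1,1) "signal": x[t] = 0.9*x[t-1] + e[t] - 0.6*e[t-1]
e = rng.normal(size=T)
signal = np.zeros(T)
for t in range(1, T):
    signal[t] = 0.9 * signal[t - 1] + e[t] - 0.6 * e[t - 1]

X = rng.normal(scale=0.05, size=(T, 10))        # white-noise blur
X[:, 1:] += signal[:, None]                     # series 2-10 carry the signal
X[:, 0] += np.clip((years - 1902) / 78, 0, 1)   # series 1 is the ramp

Xc = X - X.mean(axis=0)                         # conventional centering
w = np.linalg.svd(Xc, full_matrices=False)[2][0]
pc1 = Xc @ w
r = abs(np.corrcoef(pc1, signal)[0, 1])
print(f"|corr(PC1, signal)| = {r:.3f}")         # conventional PC1 tracks the signal
```

Swapping in the short-segment centering of the Mannomatic (sketched earlier) is what tips the PC1 toward the ramp instead of the shared signal.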

Finally, Figure 7 shows the Mannomatic. In this case, the Mannomatic PC1 completely misses the signal and picks up the HS instead.

Figure 7. As Figure 6, but with Mannomatic.

The examples here don’t illustrate the extraction of the HS from red noise series where none of the series have a HS shape (discussed in GRL). This effect was again denied by Mann at New Scientist, but exists nonetheless. Obviously, in the above examples, there is a HS series present in each case. Von Storch and Zorita asserted that the "Artificial Hockey Stick" effect was characteristic only of red noise environments. I think that our Reply to VZ gave a good response to this, by pointing to the effect of "bad apples" – which "steered" the algorithm even more.

Now one reaction to the signal examples might be to say: well, using the Mannomatic, we missed the signal in the PC1, but we got it in the PC2 (which is Preisendorfer-significant). That would be true in this toy example and in examples of practical interest. However, the problem with the Mannomatic is not that, given enough PCs, it fails to recover the "signal", but that it will also recover things that aren’t signals, and they look Preisendorfer-significant. We’ve shown examples with tech stocks – sure, the Mannomatic can pick out tech stocks, but that doesn’t make them temperature proxies.

The Mannomatic has some ability to recover an actual signal, but its search for HS series strongly distorts that recovery. That’s why it finds the bristlecones, which actually do have a HS shape. Remember how we found the bristlecones. Once we noticed the data mining of the Mannian PC method, we asked: what does this do in the North American network? One of the outcomes of MM03 was that this network was isolated as what made MBH stand or fall – we didn’t know that in MM03 and didn’t know why the results were so different. When we applied this analysis to the North American network, all the bristlecones bubbled out. We only found this by matching id-codes one by one to ITRDB identifications (since Mann had not disclosed this effect).

That’s how bristlecones came into the picture. Since the MBH version of the HS depends on bristlecones, that’s why we spend so much time on the question: are the bristlecones valid proxies? I don’t think that they are. But it shouldn’t matter. In synthetic examples where you have an actual signal, you can remove one class of proxies and still get a "robust" result. MBH should not be affected by the presence/absence of bristlecones. The inability to obtain a valid reconstruction without bristlecones (which Wahl and Ammann acknowledge, although they express it in different terms) shows that either all the other proxies are no good or the MBH method is no good or both. Ross’s rhetorical question to MBH is: why even bother with the other proxies?

The effect is particularly damning because they claimed that their HS wasn’t affected by the presence/absence of dendroclimatic indicators altogether. If it’s not robust to bristlecones, this claim is obviously untrue. Has anyone ever seen an answer to this problem from the Hockey Team? This was one of the Barton questions. Mann didn’t answer it. We raised it with the NAS panel and we’ll see if they deal with this thorny question.

The CENSORED DIRECTORY

It often feels like shoveling out a swamp in dealing with the misrepresentations of our stuff. Someone over at Tim Lambert has said that I “originated incorrect information” about Mann’s CENSORED directory:

So my original point stands that McIntyre originated incorrect information such as the idea that the data in the ftp://holocene.evsc.virginia.edu/pub/MBH98/TREE/ITRDB/NOAMER/BACKTO_1400-CENSORED directory only has bristlecone pine proxies removed when it actually has the entire North American tree ring data set and Queen Anne data set removed.

Here are the facts. Continue reading

Letter to Climatic Change

I was asked to review the Wahl and Ammann submission in May 2005 and recently posted up my review here. The first recommendation in my review was that Wahl and Ammann remove all arguments that depended on their rejected GRL article. They didn’t, and now it’s come back and should haunt them. Despite providing a diligent review of the original submission, I was not sent a copy of the revised version for review (it’s at Ammann’s website). I’ve sent the following letter to Schneider today. Continue reading

New Scientist on the Hockey Stick

New Scientist ran a lengthy article on the Hockey Stick. They seem to have talked to everyone involved except Ross and me.

In 2004, even before our GRL article was published, a freelancer for New Scientist had got interested in the story and spent a lot of time interviewing me on the telephone. It got to a very advanced stage and then got spiked by the New Scientist editor, following some ExxonMobil-type disinformation of the kind that Mann sent to Natuurwetenschap to try to prevent publication there.

The editors decided not to publish it after all. Your connections with the oil industry raised doubts in their minds about your disinterested independent researcher status and the scientific corroboration from other groups for Mann’s findings persuaded the editors that the story simply did not stand up.

Continue reading