TCO has been pressing about the exact impact of various properties of the MBH PC methodology, asking some "elementary" questions. Some readers have criticized him for, in effect, asking for a tutorial on PC methods. However, if someone asked where to find an article showing the statistical properties of PC methods applied to time series, I don't think that I could give a reference that would be helpful for what we're talking about, other than our own articles, which sort of start in the middle. Some of the properties that concern me are very elementary in mathematical terms, but the surprise that greeted our GRL article indicated that these mathematically elementary properties had not been thought about.
Arguably, since Mann proposed using PC methods to extract "signals" from tree ring networks, the obligation to demonstrate the validity of the MBH98 PC1 as a temperature proxy should rest with him. However, he didn’t do so at the time.
Be that as it may, I’ve spent quite a bit of time thinking about the properties of PC methods as a means of recovering "signals". There are two layers of issues with respect to Mannian PC methods: 1) problems with the Mannian method relative to conventional methods; 2) problems with PC methods themselves as applied to tree ring networks. There’s no statistical rule that says that PC methods are an appropriate way of extracting temperature proxies – surely that has to be proven. There are comments in our Reply to von Storch which refer to these issues. (In both our Replies, we introduced some new material because we were trying to be thoughtful. However, in the sound bite world of climate science, no one seems to have picked up on these comments.) Anyway here are a few more illustrations. One nice thing about blogs is that you’re not limited to 12,000 characters.
Figure 1 is constructed as follows: series 1 goes from 0 to 1 between 1902 and 1980, while series 2-10 are 0. All series are then blurred with white noise with a small standard deviation (sd=0.05). One reason for blurring with white noise is that principal component methods carry out singular value decomposition on matrices, and this avoids singularity. (The singularity may not "matter", but there's no reason not to avoid it.) As you can see, there is a big difference between the simple average and the PC1. The PC1 is obtained by a linear weighting of the underlying series: the weight on series 1 is 0.9994, which causes it to contribute more than 99.89% of the variance to the "composite" PC1. The simple average (red) is quite different. This illustrates a big difference between PC methods and averaging.

Figure 1. Series 1 goes from 0 to 1 from 1902 to 1980. Series 2-10 are 0. All series blurred with white noise sd=0.05. Weight of series 1 is 0.9994.
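For readers who want to experiment, the Figure 1 calculation can be sketched in a few lines of Python with numpy. The seed and the particular noise draw are my own choices, so the numbers come out near, not equal to, the weights quoted above:

```python
import numpy as np

rng = np.random.default_rng(0)
n_years, n_series = 79, 10                  # 1902-1980 inclusive

# series 1: linear ramp from 0 to 1; series 2-10: zero; all blurred with noise
X = rng.normal(0.0, 0.05, size=(n_years, n_series))
X[:, 0] += np.linspace(0.0, 1.0, n_years)

Xc = X - X.mean(axis=0)                     # conventional centring
_, s, Vt = np.linalg.svd(Xc, full_matrices=False)
weights = Vt[0] * np.sign(Vt[0, 0])         # loadings of the first PC
pc1 = Xc @ weights

# series 1's share of the variance of the "composite" PC1
share = weights[0] ** 2 * Xc[:, 0].var() / pc1.var()
print(round(weights[0], 4))                 # near 1
print(round(share, 4))                      # near 1
```

The point to notice is that the eigenvector loads almost entirely on the one high-variance series, whereas a simple average of `X` would weight every series equally at 0.1.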
Figure 2 shows the same set-up but with 100 series in total. As you see, the PC1 is essentially unchanged, but the amplitude of the mean is now reduced to nearly 0. So while averaging over a larger and larger network will gradually attenuate the impact of an outlier, PC methods in this sort of time series context will consistently pick out the high-variance outlier and pass it through essentially unscathed into the PC1. That's why we use the term "data mining" in connection with PC methods.

Figure 2. As Figure 1, but with 100 series. Label in 2nd panel should read 99 series.
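The scaling contrast between Figures 1 and 2 can be checked directly. A minimal numpy sketch (seed and noise draws again my own, so the exact figures will vary):

```python
import numpy as np

rng = np.random.default_rng(1)
n_years = 79                                 # 1902-1980

def pc1_weights_and_mean(n_series):
    # one ramp "outlier" plus (n_series - 1) pure-noise series
    X = rng.normal(0.0, 0.05, size=(n_years, n_series))
    X[:, 0] += np.linspace(0.0, 1.0, n_years)
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    w = Vt[0] * np.sign(Vt[0, 0])            # orient so series 1 loads positively
    return w, X.mean(axis=1)

w10, m10 = pc1_weights_and_mean(10)
w100, m100 = pc1_weights_and_mean(100)

# crude "ramp amplitude": last decade's mean minus first decade's mean
ramp = lambda m: m[-10:].mean() - m[:10].mean()

# PC1 keeps mining the outlier; the mean's ramp shrinks roughly tenfold
print(round(w10[0], 3), round(w100[0], 3))
print(round(ramp(m10), 3), round(ramp(m100), 3))
```

The outlier's PC1 loading barely moves as the network grows, while its imprint on the simple average is diluted in proportion to 1/N.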
Figure 3 shows the set-up from Figure 1, but using the Mannomatic PC method. In this context, the Mannomatic makes little incremental difference (I'll show below how it does affect things).

Figure 3. As with Figure 1, but with Mannomatic PC method.
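For readers unfamiliar with the "Mannomatic", its distinctive feature is centring each series on its 1902-1980 calibration mean rather than its full-period mean. The sketch below implements only that short-centring step (the full MBH algorithm also rescales by calibration-period standard deviations, which I omit), and assumes a 1400-1980 span so that the calibration window is a proper subset of the record:

```python
import numpy as np

rng = np.random.default_rng(2)
years = np.arange(1400, 1981)
cal = years >= 1902                          # 1902-1980 calibration window

# assumed setup: flat at 0 before 1902, then series 1 ramps to 1
X = rng.normal(0.0, 0.05, size=(len(years), 10))
X[cal, 0] += np.linspace(0.0, 1.0, cal.sum())

def pc1_weights(X, short=False):
    # short=True: "Mannomatic" centring on the calibration mean only
    ref = X[cal] if short else X
    _, _, Vt = np.linalg.svd(X - ref.mean(axis=0), full_matrices=False)
    return Vt[0] * np.sign(Vt[0, 0])

print(round(pc1_weights(X, short=False)[0], 3))  # conventional: series 1 dominates
print(round(pc1_weights(X, short=True)[0], 3))   # short-centred: still dominates
```

With a single hockey-stick outlier, both centring conventions hand the PC1 to series 1, consistent with Figure 3 showing little incremental difference in this particular set-up.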
What happens when you have both a front-end HS and a back-end HS? This is illustrated in Figure 4. In an actual PC calculation, the PC series might be pointing up or pointing down (the PC series intrinsically have no orientation). I've arranged things so that they point up, since that is Mann's method. As I've pointed out before, the MBH99 HS points down. (The flipping of the PC series is a very different issue from the flipping of individual series to match.) The take-home point here is that, in this set-up, the front-end and back-end HS are allocated opposite signs. Does this "matter"? Well, I happen to think that people should know the time series properties of their methodologies before they are used in big reports. Also, the properties of the PC algorithm that do this do other things as well, so I'm disinclined right now to agree that the properties can be analytically separated (but I don't preclude that I might change my mind on this).

Figure 4. Ordinary PC method. Weights are 0.67 and -0.94 to two "dominant" series and under 0.02 for all others.
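The opposite-sign allocation is easy to reproduce with a conventional PC calculation. In the sketch below (span and ramp lengths are my own assumptions, not taken from the figure), the two HS series are negatively correlated after centring, so the leading eigenvector loads them with opposite signs:

```python
import numpy as np

rng = np.random.default_rng(3)
n_years = 581                                # assumed 1400-1980 span

X = rng.normal(0.0, 0.05, size=(n_years, 10))
X[:79, 0]  += np.linspace(1.0, 0.0, 79)      # front-end HS, ramping down to 0
X[-79:, 1] += np.linspace(0.0, 1.0, 79)      # back-end HS, ramping up from 0

Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
w = Vt[0]

# the two HS series dominate the eigenvector, with opposite signs
print(round(w[0], 3), round(w[1], 3))
print(w[0] * w[1] < 0)                       # True
```

After full-period centring, the front-end series sits above its mean early and below it late, while the back-end series does the reverse; their sample covariance is therefore negative, which is what forces the opposite signs.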
Figure 5 shows the same set-up using the Mannomatic. So this effect happens under ordinary PC methods as well as the Mannomatic. I'm sure that the effect is more intense or more frequent in the Mannomatic in some sense that could be defined, but I've not had occasion to delimit it precisely.

Figure 5. As with Figure 4, Mannomatic version.
For Figure 6, I've modified the setup of Figure 1 so that 9 series contain an actual "signal", generated by an ARMA(1,1) process (ar = 0.9; ma = -0.6) to mimic the ARMA features of many actual temperature series. Then I've added in white noise as above. I picked a standard deviation for the signal that I thought would illustrate the point, but I didn't fiddle with it to get this result. Figure 6 shows the PC1 using a conventional calculation. In this case, the outlier pulls the average up a little bit at the close, while the PC1 picks up the signal a little better than the average.

Figure 6. As with Figure 1, but 9 series also have a "signal".
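A sketch of the Figure 6 set-up, with my own choices for span, signal amplitude (sd = 0.1) and seed, since the post does not state them:

```python
import numpy as np

rng = np.random.default_rng(4)
years = np.arange(1400, 1981)               # assumed span; the ramp sits in 1902-1980
n = len(years)

def arma_signal(n, phi=0.9, theta=-0.6, sd=0.1, burn=200):
    # ARMA(1,1) "signal", normalised to a chosen standard deviation
    e = rng.normal(size=n + burn)
    x = np.zeros(n + burn)
    for t in range(1, n + burn):
        x[t] = phi * x[t - 1] + e[t] + theta * e[t - 1]
    x = x[burn:]
    return sd * (x - x.mean()) / x.std()

signal = arma_signal(n)
X = rng.normal(0.0, 0.05, size=(n, 10))
X[:, 1:] += signal[:, None]                 # 9 series share the signal
cal = years >= 1902
X[cal, 0] += np.linspace(0.0, 1.0, cal.sum())  # the HS outlier

Xc = X - X.mean(axis=0)                     # conventional full-period centring
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
pc1 = Xc @ Vt[0]

# the conventional PC1 tracks the shared signal, not the outlier
print(round(abs(np.corrcoef(pc1, signal)[0, 1]), 3))
```

With nine series sharing the signal, its eigenvalue swamps the lone outlier's, so the conventional PC1 correlates strongly with the signal.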
Finally, Figure 7 shows the Mannomatic. In this case, the Mannomatic PC1 completely misses the signal and picks up the HS instead.

Figure 7. As Figure 6, but with Mannomatic.
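The same set-up with short-centring shows the Figure 7 effect. As before, this sketch implements only the short-centring step of the Mannian method (not the standard-deviation rescaling), with my own assumed span and signal amplitude:

```python
import numpy as np

rng = np.random.default_rng(5)
years = np.arange(1400, 1981)               # assumed span
n = len(years)
cal = years >= 1902                         # calibration window

def arma_signal(n, phi=0.9, theta=-0.6, sd=0.1, burn=200):
    # ARMA(1,1) "signal", normalised to a chosen standard deviation
    e = rng.normal(size=n + burn)
    x = np.zeros(n + burn)
    for t in range(1, n + burn):
        x[t] = phi * x[t - 1] + e[t] + theta * e[t - 1]
    x = x[burn:]
    return sd * (x - x.mean()) / x.std()

signal = arma_signal(n)
X = rng.normal(0.0, 0.05, size=(n, 10))
X[:, 1:] += signal[:, None]                 # 9 series share the signal
X[cal, 0] += np.linspace(0.0, 1.0, cal.sum())  # the HS outlier

Xs = X - X[cal].mean(axis=0)                # short-centre on 1902-1980 only
_, _, Vt = np.linalg.svd(Xs, full_matrices=False)
w = Vt[0]
pc1 = Xs @ w

print(round(abs(w[0]), 3))                  # large: PC1 is dominated by the outlier
print(round(abs(np.corrcoef(pc1, signal)[0, 1]), 3))  # low: the signal is missed
```

Short-centring leaves the outlier offset from zero over the whole pre-calibration period, which inflates its apparent variance enough to outweigh the shared signal, so the PC1 latches onto the hockey stick instead.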
The examples here don't illustrate the extraction of the HS from red noise series where none of the series have a HS shape (discussed in GRL). This effect was again denied by Mann at New Scientist, but exists nonetheless. Obviously, in the above examples, there is an HS-shaped series in each case. Von Storch and Zorita asserted that the "Artificial Hockey Stick" effect was characteristic only of red noise environments. I think that our Reply to VZ gave a good response to this, by pointing to the effect of "bad apples", which "steered" the algorithm even more.
Now one reaction to the signal examples might be to say: well, using the Mannomatic, we missed the signal in the PC1, but we got it in the PC2 (which is Preisendorfer-significant). That would be true in this toy example and in examples of practical interest. However, the problem with the Mannomatic is not that, given enough PCs, it fails to recover the "signal", but that it will also recover things that aren't signals, and they look Preisendorfer-significant. We've shown examples with tech stocks: sure, the Mannomatic can pick out tech stocks, but that doesn't make them temperature proxies.
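By way of illustration, a Preisendorfer-style "Rule N" check compares each eigenvalue to a Monte Carlo null distribution. The sketch below uses a white-noise null (the test is often run against red noise as well) and shows the Figure 1 outlier PC1 sailing past the threshold even though the network contains no common signal at all:

```python
import numpy as np

rng = np.random.default_rng(6)
n_years, n_series, n_sims = 79, 10, 200

def top_share(X):
    # share of total variance carried by the first PC
    s = np.linalg.svd(X - X.mean(axis=0), compute_uv=False)
    return s[0] ** 2 / (s ** 2).sum()

# null distribution of the leading eigenvalue share for noise-only networks
null = np.array([top_share(rng.normal(size=(n_years, n_series)))
                 for _ in range(n_sims)])
threshold = np.quantile(null, 0.99)

# the Figure 1 network: one ramp "outlier" plus nine noise series
X = rng.normal(0.0, 0.05, size=(n_years, n_series))
X[:, 0] += np.linspace(0.0, 1.0, n_years)
print(top_share(X) > threshold)             # the outlier PC1 passes the test
```

"Significance" here only says the leading eigenvalue is bigger than noise would produce; it says nothing about whether the PC1 is a signal or a mined outlier.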
The Mannomatic has some ability to recover an actual signal, but its preference for HS-shaped series strongly distorts that recovery. That's why it finds the bristlecones, which actually do have a HS shape. Remember how we found the bristlecones. Once we noticed the data mining of the Mannian PC method, we asked: what does this do in the North American network? One of the outcomes of MM03 was that this network was isolated as what made MBH stand or fall; we didn't know that in MM03 and didn't know why the results were so different. When we applied this to the North American network, all the bristlecones bubbled out. We only found this by matching id-codes one by one to ITRDB identifications (since Mann had not disclosed this effect).
That’s how bristlecones came into the picture. Since the MBH version of the HS depends on bristlecones, that’s why we spend so much time on the question: are the bristlecones valid proxies? I don’t think that they are. But it shouldn’t matter. In synthetic examples where you have an actual signal, you can remove one class of proxies and still get a "robust" result. MBH should not be affected by the presence/absence of bristlecones. The inability to obtain a valid reconstruction without bristlecones (which Wahl and Ammann acknowledge, although they express it in different terms) shows that either all the other proxies are no good or the MBH method is no good or both. Ross’s rhetorical question to MBH is: why even bother with the other proxies?
The effect is particularly damning because they claimed that their HS wasn't affected by the presence/absence of dendroclimatic indicators altogether. If it's not robust to bristlecones, this claim is obviously untrue. Has anyone ever seen an answer to this problem from the Hockey Team? This was one of the Barton questions. Mann didn't answer it. We raised it with the NAS panel and we'll see if they deal with this thorny question.
New Scientist on the Hockey Stick
New Scientist ran a lengthy article on the Hockey Stick. They seem to have talked to everyone involved except Ross and me.
In 2004, even before our GRL article was published, a freelancer for New Scientist had become interested in the story and spent a lot of time interviewing me on the telephone. It got to a very advanced stage and then got spiked by the New Scientist editor, following some ExxonMobil-type disinformation of the kind that Mann sent to Natuurwetenschap to try to prevent publication there.