As noted previously, Tamino’s post did not quote, cite or discuss how our articles reported the key issues, an omission that leaves our research inaccurately represented in the record at Tamino’s site. I’ll discuss a couple of examples. It’s unfortunate that time has to be spent on such matters before dealing with issues like short-centering.
Tamino reports breathlessly that the hockey stick pattern can be observed in the PC4 of the North American tree ring network, because it’s “a pattern in the data” and not an “artifact”:
the hockey stick pattern is still there even with centered PC — which is no surprise, because it’s not an artifact of the analysis method, it’s a pattern in the data.
He later asks, with an accompanying illustration:
Did MM really not get this? Did they really discard the relevant PCs just to copy the bare number of PCs used by MBH, without realizing that the different centering convention could move the relevant information up or down the PC list?
You betcha. When done properly on the actual data, using 5 PCs rather than just 2, the hockey stick pattern is still there even with centered PC — which is no surprise, because it’s not an artifact of the analysis method, it’s a pattern in the data. Here’s a comparison of PC#1 for the North American ITRDB (international tree ring data base) data using the MBH method (red), and PC#4 from using the MM method.
Tamino has misrepresented the research record, as both MM2005 (GRL) and MM2005 (EE) report the occurrence of the hockey stick pattern in the North American PC4, attributing it to the bristlecones.
In MM2005 (GRL), we stated:
Under the MBH98 data transformation, the distinctive contribution of the bristlecone pines is in the PC1, which has a spuriously high explained variance coefficient of 38% (without the transformation – 18%). Without the data transformation, the distinctive contribution of the bristlecones only appears in the PC4, which accounts for less than 8% of the total explained variance.
In MM2005 (EE), we reiterated the same point at a little more length:
In the MBH98 de-centered PC calculation, a small group of 20 primarily bristlecone pine sites, all but one of which were collected by Donald Graybill and which exhibit an unexplained 20th century growth spurt (see Section 5 below), dominate the PC1. Only 14 such chronologies account for over 93% of the variance in the PC1, effectively omitting the influence of the other 56 proxies in the network. The PC1 in turn accounts for 38% of the total variance. In a centered calculation on the same data, the influence of the bristlecone pines drops to the PC4 (pointed out in Mann et al., 2004b, 2004d). The PC4 in a centered calculation only accounts for only about 8% of the total variance, which can be seen in calculations by Mann et al. in Figure 1 of Mann et al. [2004d].
In MM2005 (EE), we further reported the effect of carrying out an MBH-type reconstruction under many permutations (most of which were re-stated by Wahl and Ammann without citing our findings), including the use of both 2 and 5 North American covariance PCs.
If a centered PC calculation on the North American network is carried out (as we advocate), then MM-type results occur if the first 2 NOAMER PCs are used in the AD1400 network (the number as used in MBH98), while MBH-type results occur if the NOAMER network is expanded to 5 PCs in the AD1400 segment (as proposed in Mann et al., 2004b, 2004d). Specifically, MBH-type results occur as long as the PC4 is retained, while MM-type results occur in any combination which excludes the PC4. Hence their conclusion about the uniqueness of the late 20th century climate hinges on the inclusion of a low-order PC series that only accounts for 8 percent of the variance of one proxy roster.
These are not exotic references; the points at issue in Tamino’s posts are specifically and clearly discussed in these articles.
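For readers who want to see the mechanism rather than take anyone's word for it, here is a minimal sketch on synthetic data. It is not the North American network and not the MBH98 code. The 70-series layout, the amplitudes, the three shared oscillations and the "blade" shape are all hypothetical choices made for illustration, and the short-centering here only subtracts the 1902-1980 mean (no rescaling is applied). With these assumed values, the blade pattern should sit in a lower-order PC under full centering and move up to the PC1, with a larger share of the total sum of squares, once the mean is taken over the calibration period only.

```python
import numpy as np

rng = np.random.default_rng(0)
years = np.arange(1400, 1981)              # 581 annual steps
t = years - years[0]
n = years.size

# Hypothetical "blade": flat until 1900, then a linear rise to 1 by 1980.
blade = np.clip((years - 1900) / 80.0, 0.0, None)

def group(pattern, amplitude, count):
    """`count` series sharing one pattern, plus independent white noise."""
    noise = 0.5 * rng.standard_normal((n, count))
    return amplitude * pattern[:, None] + noise

# 56 series dominated by three shared oscillations, 14 carrying the blade.
X = np.hstack([
    group(np.sin(2 * np.pi * 20 * t / n), 1.0, 20),
    group(np.sin(2 * np.pi * 30 * t / n), 0.9, 18),
    group(np.sin(2 * np.pi * 40 * t / n), 0.8, 18),
    group(blade, 2.5, 14),
])

def blade_pc(data, reference_rows):
    """Subtract each column's mean over `reference_rows`, run an SVD, and
    return (0-based index of the PC best matching the blade, that PC's share
    of the total sum of squares, its |correlation| with the blade)."""
    Z = data - data[reference_rows].mean(axis=0)
    U, s, _ = np.linalg.svd(Z, full_matrices=False)
    share = s**2 / np.sum(s**2)
    corr = np.array([abs(np.corrcoef(U[:, k], blade)[0, 1]) for k in range(10)])
    k = int(np.argmax(corr))
    return k, float(share[k]), float(corr[k])

centered = blade_pc(X, slice(None))        # mean over the full record
short = blade_pc(X, years >= 1902)         # mean over 1902-1980 only

for name, (k, share, corr) in [("centered", centered), ("short-centered", short)]:
    print(f"{name}: blade is PC{k + 1}, share {share:.1%}, |r| = {corr:.2f}")
```

The same data go into both calculations; only the reference period for the mean changes. In this sketch the blade is a real pattern in 14 of the 70 series in either case, which is the point made in the quoted passages: the centering convention does not create the pattern, it changes where the pattern ranks and how much weight it is given.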
Wegman also stated that the hockey stick from the bristlecone/foxtails occurred in the PC4 (see Question 10b):
Without attempting to describe the technical detail, the bottom line is that, in the MBH original, the hockey stick emerged in PC1 from the bristlecone/foxtail pines. If one centers the data properly the hockey stick does not emerge until PC4. Thus, a substantial change in strategy is required in the MBH reconstruction in order to achieve the hockey stick, a strategy which was specifically eschewed in MBH. In Wahl and Ammann’s own words, the centering does significantly affect the results
“In the Data”
Here’s a related Tamino straw man. Tamino states:
PCA (centered or not) doesn’t create patterns at all, they have to be there already even to “exhibit a larger variance.”
No one disagrees with this. We stated that the MBH algorithm “mined” for hockey stick patterns; we did not say that it “manufactured” them.
In effect, the MBH98 data transformation results in the PC algorithm mining the data for hockey stick patterns.
Wegman (see question 9b) expressed the point in similar terms:
If the variance is artificially increased by decentering, then the principal component methods will “data mine” for those shapes. In other words, the hockey stick shape must be in the data to start with or the CFR methodology would not pick it up… Most proxies do not contain the hockeystick signal. The MBH98 methodology puts undue emphasis on those proxies that do exhibit the hockey-stick shape and this is the fundamental flaw. Indeed, it is not clear that the hockey-stick shape is even a temperature signal because all the confounding variables have not been removed.
What is Tamino’s point of disagreement on this issue with either Wegman or ourselves?
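The variance inflation that Wegman describes follows from simple arithmetic: for any series of length n, the sum of squares about a calibration-period mean equals the sum of squares about the full-period mean plus n times the squared difference between the two means. The sketch below checks this on two made-up series (a white-noise series and the same series with a hypothetical 20th century ramp added). The series, the ramp amplitude and the 1902-1980 window are assumptions for illustration only.

```python
import numpy as np

years = np.arange(1400, 1981)
calib = years >= 1902                      # assumed calibration window

rng = np.random.default_rng(1)
flat = rng.standard_normal(years.size)     # no trend
# Same noise plus a hypothetical ramp that starts in 1900.
stick = flat + 2.0 * np.clip((years - 1900) / 80.0, 0.0, None)

def sum_sq_about(x, reference):
    """Sum of squared deviations of x from the mean of `reference`."""
    return float(np.sum((x - np.mean(reference)) ** 2))

def inflation(x):
    """Ratio of the short-centered sum of squares to the centered one."""
    return sum_sq_about(x, x[calib]) / sum_sq_about(x, x)

def identity_gap(x):
    """Difference between the two sides of the identity; should be ~0."""
    extra = x.size * (np.mean(x) - np.mean(x[calib])) ** 2
    return sum_sq_about(x, x[calib]) - (sum_sq_about(x, x) + extra)

print("flat  inflation:", round(inflation(flat), 3))
print("stick inflation:", round(inflation(stick), 3))
```

A series whose calibration-period mean sits close to its long-term mean is barely affected, while a series with a 20th century departure has its apparent variance increased. A principal components calculation on short-centered data therefore gives such series extra weight, which is what "mining" refers to.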
As to the PC4, we stated clearly that the hockey stick shape was a distinct pattern in the North American tree ring data set, observable in the PC4 under a centered calculation (as Mann et al. had also shown in their Nature submission placed online and in a post at realclimate). We reported that the pattern could be traced back to the bristlecones and spent a considerable amount of time analyzing bristlecones. By presenting the PC4 without any citation of or reference to our comments on the matter, and then snottily asking “Did MM really not get this?”, Tamino fails to represent our research accurately in his posting.