In the last two days, I’ve argued that it’s insufficient for Mann et al. merely to “get” a hockey stick shape some other way; they have to show that any such salvage reconstruction meets the representations and warranties of MBH98 as to reasonably even spatial sampling, robustness, statistical skill and proxy validity. I’ve shown that 2 of the 3 MBH98 salvage attempts by Mann et al. (the no-PC reconstruction and the Rutherford et al. reconstruction) fail not just on one count, but on numerous counts. Today, I’ll discuss the 5 PC or Preisendorfer Rule N salvage attempt and show that it also fails on numerous counts.
This third salvage attempt (see realclimate Dec. 4, 2004) argues that Preisendorfer’s Rule N is the “standard selection rule” for determining the number of PCs and that this rule was used in MBH98 to determine the number of retained PCs. Mann et al. then argue that, even if they made a mistake in their PC methods, application of Preisendorfer’s Rule N to the North American tree ring network using correctly calculated PCs results in the selection of 5 PCs. They point out (and I agree with this) that the hockey stick shape of the bristlecones appears in the AD1400 North American PC4 using correct calculations. They argue that, once this happens, they “get” a hockey stick shape in the NH temperature calculations. I agree that the presence of the PC4 in the calculations under their methods does result in a hockey-stick shaped NH temperature reconstruction (and stated this in MM05(E&E)), but obviously disagree with other premises of this calculation and with other aspects of the calculation.
I’ll show here that this salvage reconstruction, like the original MBH98 reconstruction, also fails the representations set out in MBH98, which were material to the widespread adoption of the results of this study. The lines of argument are similar to the two previous posts, Errors Matter #1 and Errors Matter #2. I will additionally discuss whether there is such a thing as a “standard selection rule” for the number of PCs, using some recent citations of Mann’s which indicate the opposite, and will re-cap (and put into context) questions in an earlier post about whether Preisendorfer’s Rule N was actually used in MBH98.
First, let me re-cap some of the most important representations and warranties of MBH98:
1. a “reasonably homogeneous spatial sampling” in the multiproxy network was ensured by representing “densely sampled regional dendroclimatic data sets” through principal components analysis (p. 779);
2. “the long-term trend in NH [temperature] is relatively robust to the inclusion of dendroclimatic indicators in the network” (p. 783 and re-stated in Mann et al., A Further Note);
3. the MBH98 network attained a “high level of skill” in large-scale reconstruction back to 1400 (p. 785). Recently, Mann has amplified this by saying that studies which do not satisfy a number of “statistical verification exercises” should not be considered in climate studies;
4. the proxies are linearly related to large-scale climatic patterns, and are neither local, non-linear nor non-climatic (p. 780).
Robustness: We have already discussed the non-robustness of MBH98 results to the presence/absence of bristlecone pines in the no-PC and Rutherford et al. contexts. The same applies to the Preisendorfer context. If the bristlecones are excluded from the network and principal components are calculated without the bristlecones, there is no hockey stick shape. In fact, this result is so strong that even Mann’s incorrect PC method (mining for hockey stick shaped series) was unable to produce hockey stick shaped PCs, as shown in Figure 1 below, which shows the first 5 PC series in the BACKTO_1400-CENSORED directory at Mann’s FTP site. The PC5 actually has a rather amusing 15th century uptick. If the NH temperature calculation is done with these PCs, we have found that you get high 15th century values, virtually identical to the graph illustrated in MM05(E&E). Thus, even under the Preisendorfer salvage reconstruction, the results continue to be non-robust to the presence/absence of bristlecone pines, contrary to an important MBH98 representation about robustness. As we pointed out in MM05 (E&E), Mann et al. were aware of these results, and it is hard to see the justification for claims in MBH98 and later in Mann et al. that their reconstruction was robust to the presence/absence of dendroclimatic indicators in total, when it was known that it was not robust to the presence/absence of bristlecones in the 15th century.
Figure 1. The five PCs in Mann’s BACKTO_1400-CENSORED subdirectory. None have a 20th century hockey stick pattern.
Statistical Skill: We place great emphasis on the analysis of statistical skill presented in MM05(GRL). We pointed out that the RE statistic lacks a theoretical distribution which can be looked up in tables and is determined by simulations. We showed that a much more appropriate benchmark for RE significance was to use MBH98 PC methods on red noise and found a 99% benchmark of 0.59 (as opposed to the 0.0 level used in MBH98.) We also showed that Mann et al. had failed to present a suite of verification statistics, contrary to the recommendations of Cook et al.  and other references, and that consideration of these other statistics would have shown fatal problems in MBH98 statistical significance.
These same problems apply here. Once again, Mann et al. have failed to provide a suite of verification statistics. They report only an RE statistic, now only 0.22, which they claim is “clearly skillful”. Given that this RE statistic is less than the median RE statistic which we obtained from applying Mann’s PC method to red noise, we disagree that this RE statistic is “clearly skillful”. We surmise that the R2 statistic will be approximately 0.0 (and other verification statistics will be similarly insignificant).
According to Mann’s own standards, this reconstruction should not be used in climate matters.
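To make the benchmarking idea concrete, here is a minimal sketch of an RE calculation and a simulated null distribution. It is not the MM05(GRL) procedure, which ran the MBH98 PC method on red noise; as a simplifying assumption, each simulated "proxy" here is a single AR(1) series regressed on the target over the calibration period. The function names, the AR(1) coefficient of 0.9 and the period lengths are all hypothetical choices for illustration.

```python
import numpy as np

def reduction_of_error(obs_verif, pred_verif, calib_mean):
    """RE = 1 - SSE(prediction) / SSE(calibration-period mean as the prediction)."""
    obs_verif = np.asarray(obs_verif, dtype=float)
    pred_verif = np.asarray(pred_verif, dtype=float)
    sse_pred = np.sum((obs_verif - pred_verif) ** 2)
    sse_ref = np.sum((obs_verif - calib_mean) ** 2)
    return 1.0 - sse_pred / sse_ref

def ar1_series(n, rho, rng):
    """One AR(1) red-noise series of length n with lag-one coefficient rho."""
    x = np.empty(n)
    x[0] = rng.standard_normal()
    for t in range(1, n):
        x[t] = rho * x[t - 1] + rng.standard_normal()
    return x

def re_null_quantile(target, n_verif, rho=0.9, n_sims=1000, q=0.99, seed=0):
    """Quantile of RE obtained when pure red noise is calibrated against the target.

    Assumption: the first n_verif values of target are the verification period
    and the remaining values are the calibration period.
    """
    rng = np.random.default_rng(seed)
    target = np.asarray(target, dtype=float)
    verif, calib = target[:n_verif], target[n_verif:]
    scores = np.empty(n_sims)
    for i in range(n_sims):
        noise = ar1_series(len(target), rho, rng)
        # Fit the noise "proxy" to the target over the calibration period only.
        slope, intercept = np.polyfit(noise[n_verif:], calib, 1)
        pred = intercept + slope * noise[:n_verif]
        scores[i] = reduction_of_error(verif, pred, calib.mean())
    return np.quantile(scores, q)
```

A reported RE would then be compared with `re_null_quantile(...)` rather than with 0. The point of the sketch is only that the significance level comes from simulation, so the benchmark depends on what procedure is simulated.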
“Standard Selection Rules” for PCs: Preisendorfer’s Rule N is one possible rule for PC retention, but there are many other possibilities. Jackson [1993] states that there are “few guidelines” and goes on to list the Kaiser-Guttman criterion; bootstrapped Kaiser-Guttman; Scree Plot; Broken-stick; Proportion of total variance; Sphericity test; Bartlett’s test of the equality of λ1; Lawley’s test of λ2; bootstrap eigenvalue-eigenvector. Another quite different and very interesting approach that I’ve seen recently is that of Santhanam and Patra [2004], who focus on information in the eigenvectors, as well as eigenvalues, and argue that the “eigenmodes of the atmospheric empirical correlation matrices that have physical significance are marked by deviations from the eigenvector distribution.”
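To illustrate that different retention rules can give different answers on the same eigenvalue spectrum, here is a minimal sketch of two of the rules in Jackson's list. The function names and the example eigenvalues are made up for illustration, and the broken-stick version shown (stop at the first component that fails) is one common formulation, assumed here rather than taken from any of the papers discussed.

```python
import numpy as np

def kaiser_guttman(eigvals):
    """Retain components whose eigenvalue exceeds the mean eigenvalue.

    For a correlation matrix the mean eigenvalue is 1, which gives the
    familiar 'eigenvalue greater than one' form of the rule.
    """
    ev = np.sort(np.asarray(eigvals, dtype=float))[::-1]
    return int(np.sum(ev > ev.mean()))

def broken_stick(eigvals):
    """Retain leading components whose variance share exceeds the
    broken-stick expectation b_k = (1/p) * sum_{i=k..p} 1/i."""
    ev = np.sort(np.asarray(eigvals, dtype=float))[::-1]
    p = len(ev)
    expected = np.array([np.sum(1.0 / np.arange(k, p + 1)) / p
                         for k in range(1, p + 1)])
    share = ev / ev.sum()
    n_keep = 0
    for observed, threshold in zip(share, expected):
        if observed > threshold:
            n_keep += 1
        else:
            break  # assumption: stop at the first component that fails
    return n_keep

# Hypothetical eigenvalues of a 4-series correlation matrix.
example = np.array([2.0, 1.3, 0.5, 0.2])
print(kaiser_guttman(example), broken_stick(example))
```

On these made-up eigenvalues the first rule retains two components and the second retains none, which is the sort of disagreement that makes it hard to speak of a single "standard" rule.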
A precondition for the applicability of PC methods themselves (not demonstrated in MBH98) is that the data is i.i.d. with normal distributions (which does not appear to be the case with these tree ring networks). In fact, the distributions in tree ring chronologies are not normal. I have not analyzed the specific effect of this on MBH98 calculations, but the effect should have been addressed at the time of the original publication.
In Jackson’s terminology, the Scree Plot more or less corresponds to Preisendorfer’s Rule N, although Jackson notes that the method is “used infrequently by ecologists” [I've looked at some of the ecological literature because of Mann's citation of Urban's class notes on Jan. 6, 2005]. Jackson notes that there are several variations of this method, including regression and Monte Carlo approaches. In Preisendorfer’s original discussion, Preisendorfer’s Rule N is stated in terms of white noise. In the case of autocorrelation, Preisendorfer states that the “effective” number of observations should be reduced. The procedure shown in the example at realclimate on November 22, 2004 is a variation of Preisendorfer using an AR1 model of the tree ring networks.
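For readers who have not seen a Monte Carlo selection rule, the following is a minimal sketch in the spirit of Rule N. It is not the realclimate calculation and not Preisendorfer's exact formulation. The assumptions are mine: eigenvalues are taken from the correlation matrix and scaled to sum to 1, the null is simulated noise of the same dimensions (white noise when `rho` is 0, AR(1) otherwise), the threshold is the 95th percentile at each rank, and retention stops at the first eigenvalue that fails.

```python
import numpy as np

def normalized_eigenvalues(data):
    """Eigenvalues of the correlation matrix (rows = years, columns = series),
    in descending order and scaled to sum to 1."""
    corr = np.corrcoef(data, rowvar=False)
    ev = np.linalg.eigvalsh(corr)[::-1]
    return ev / ev.sum()

def ar1_noise(n_obs, n_series, rho, rng):
    """Independent AR(1) series in each column; rho = 0 gives white noise."""
    x = np.empty((n_obs, n_series))
    x[0] = rng.standard_normal(n_series)
    for t in range(1, n_obs):
        x[t] = rho * x[t - 1] + rng.standard_normal(n_series)
    return x

def rule_n(data, rho=0.0, n_sims=200, q=0.95, seed=0):
    """Number of leading eigenvalues exceeding the simulated noise threshold."""
    rng = np.random.default_rng(seed)
    n_obs, n_series = data.shape
    observed = normalized_eigenvalues(data)
    sims = np.array([normalized_eigenvalues(ar1_noise(n_obs, n_series, rho, rng))
                     for _ in range(n_sims)])
    threshold = np.quantile(sims, q, axis=0)  # one threshold per eigenvalue rank
    n_keep = 0
    for obs, thr in zip(observed, threshold):
        if obs > thr:
            n_keep += 1
        else:
            break  # assumption: stop at the first eigenvalue that fails
    return n_keep
```

Every one of those choices (noise model, percentile, normalization, stopping convention) can change the number of retained PCs, which is why the choice of variation needs to be stated in the methodology.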
Here are some other comments on selection rules. Cummins and Lagerloef [2002] stated:
Application of rule-N, a selection rule based on a Monte Carlo method (Priesendorfer, 1988), would indicate that approximately 32 modes are statistically different than ‘noise’. However, the usefulness of selection rules is disputed (von Storch and Zwiers, 1999) and no significance is attached here to this result.
Hare [1996] stated:
A large number of selection (or retention) rules to determine statistical significance of EOF/PC pairs have been derived, most based on the relative magnitudes of the eigenvalues. The number of EOF/PC modes to retain, and for that matter the meaning of the word "retain", is context-dependent.
It should be noted that because the goal of PCA is essentially utilitarian, the choice of how many axes to retain is ultimately subjective. In practice, either 2 or 3 axes are retained, simply because it is difficult to project more than this onto a printed page.
For example, Franklin et al. [1995] stated:
In the final analysis, the retained components must make good scientific sense (Frane & Hill 1976; Legendre & Legendre 1983; Pielou 1984; Zwick & Velicer 1986; Ludwig & Reynolds 1988; Palmer 1993).
Thus, I don’t see how a reader could conclude that Preisendorfer’s Rule N is anything more than one convention (perhaps a good convention), rather than necessarily the appropriate method, for determining the number of PCs to retain, or how, in the absence of an explicit statement in the methodology, a reader could assume that this method was used for tree ring networks.
A couple of final points: Overland and Preisendorfer  stated that Preisendorfer’s Rule N should be construed as a necessary condition for significance, not a sufficient condition for significance, as follows:
We interpret the rule as indicating that physical interpretation of EOFs is suspect when the corresponding geophysical eigenvalue is less than one generated from the random data set.
Overland and Preisendorfer also provided the following caveat (echoing concerns later expressed by Franklin et al.):
an increasing number of researchers have used principal component analysis to study large data sets in meteorological and oceanographic settings. This method of analysis unfortunately is potentially dangerous in the sense that too much is often required of it or, worse yet, read into its results.
We’ve expressed at considerable length our concerns about bristlecone pines as a proxy for temperature (see MM05 (E&E)). The admitted non-climatic effect in bristlecone pines is inconsistent with MBH98 representations. As discussed elsewhere, the purported adjustment for non-climatic effects in MBH99 was not applied to the 15th century results here in dispute and is far from being proven as an adequate adjustment, especially if world climate history is held to depend on the effectiveness of this adjustment. Indeed, merely stating the problem clearly shows the ludicrousness of the proposed solution.
Selection Rules in MBH98: I have discussed this previously in the post Was Preisendorfer’s Rule N Used in MBH98? There is no statement in MBH98 itself or the SI thereto that Preisendorfer’s Rule N was used to determine the number of retained PC series in tree ring networks.
MBH98 does refer to the use of Preisendorfer’s Rule N in connection with the calculation of temperature principal component series, a different calculation, as follows:
a conventional Principal Component Analysis (PCA) is performed… An objective criterion was used to determine the particular set of eigenvectors which should be used in the calibration as follows. Preisendorfer’s selection rule “rule N” was applied to the multiproxy network to determine the approximate number Neofs of significant independent climate patterns that are resolved by the network, taking into account the spatial correlation within the multiproxy data set.
The temperature principal components calculation used a standard centered calculation, while we have reported elsewhere that the tree ring principal components calculations used an uncentered principal components calculation, after de-centering the data, which Mann has acknowledged is not a "standard" principal components method. MBH98 stated the following about the retention of PC series in tree ring calculations:
Certain densely sampled regional dendroclimatic data sets have been represented in the network by a smaller number of leading principal components (typically 3-11 depending on the spatial extent and size of the data set). This form of representation ensures a reasonably homogeneous spatial sampling in the multiproxy network (112 indicators back to 1820). [our bolds]
This statement obviously contains no reference to the use of Preisendorfer’s Rule N. When we tested the application of Preisendorfer’s Rule N to actual PC retentions in MBH98, it was impossible to verify the actual retentions. In addition to the lack of replication, in several cases, different numbers of PCs were retained from the same network in different calculation steps – see our previous discussion on this.
When we examined this matter, we were not only unable to replicate MBH98 retentions, but, even after prolonged examination of the matter, we were unable to discern any rhyme or reason to the actual retention policy in MBH98. Accordingly, for this and for other reasons, we re-iterated our request to Nature for production of the source code to demonstrate actual MBH98 procedures, which Mann et al. had previously refused. Despite manifold problems with MBH98, Nature refused to intervene. Thus, the production by Mann et al. of a calculation on November 22, 2004, now said to be essential for the replication of MBH98 results, but never previously published, is really quite bizarre. We re-iterate that MBH98 source code should be produced so that this and other inconsistencies can be examined and urge climate scientists and concerned parties to involve themselves in this.
Cummins, Patrick F. and Gary S. E. Lagerloef, 2002, Low Frequency Pycnocline Depth Variability At Station P In The Northeast Pacific, Journal of Physical Oceanography.
Franklin, Scott B., Gibson, David J., Robertson, Philip A., Pohlmann, John T. and Fralish, James S., 1995. Parallel Analysis: a method for determining significant principal components, Journal of Vegetation Science 6: 99-106.
Hare, Stephen, 1996. Low Frequency Climate Variability and Salmon Production, Ph.D. Thesis. Downloaded from http://www.iphc.washington.edu/Staff/hare/html/diss/other/intro.html
Jackson, Donald, 1993. Stopping Rules in Principal components Analysis: a Comparison of Heuristical and Statistical Approaches, Ecology 74(8), 2204-2214. Downloaded from http://www.zoo.utoronto.ca/jackson/pca.pdf.
Overland and Preisendorfer, A Significance Test for Principal Components Applied to a Cyclone Climatology, Mon. Wea. Rev. 110, 1-4.
Santhanam, M.S. and Prabir K Patra, 2004. Statistics of Atmospheric Correlations, Phys. Rev. E. Downloaded from http://arxiv.org/PS_cache/physics/pdf/0105/0105006.pdf