Hu, I’ve just done a post on the inapplicability of North et al 1982 to the problem at hand. (This was something that I’d visited in February, but deserves re-visiting.) Steig’s “tutorial” on this topic suggests that it would not be out of line for Steig to take a refresher course.

I’ve read North now and you’re quite right that he expressly states that he is not trying to derive a stopping rule. He’s considering an application where the EOF’s may have a physical interpretation of interest, and notes that if a given EOF has an eigenvalue that is insignificantly different from that of another EOF, the two EOFs are only determined up to linear combinations, and hence neither will show the underlying pattern by itself.

Preisendorfer’s “Rule N” (which has nothing to do with the sample size N, since it is just the 14th of several alphabetically identified stopping rules he considered) is much more relevant, but still has a problem, IMHO. His null is that all the eigenvalues are equal so that no linear combination has any more explanatory power than any other. The estimated eigenvalues will then be different with probability 1 and will ordinarily decay gently when they are arranged in decreasing order. The exact distribution of the j-th largest estimated eigenvalue is messy since it is the j-th largest of N identically distributed random variables, so he proposes simulating it by Monte Carlo means, taking into account the serial correlation of the data. The distribution of the very largest estimate will be considerably higher than North’s formula evaluated with all lambda’s equal would predict, since it is the largest of N such estimates, and not a single such estimate as in North’s case.
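To make the Monte Carlo idea concrete, here is a minimal sketch of how such critical values might be simulated. It is not Preisendorfer's own procedure: the function name `rule_n_critical_values`, the AR(1) model for serial correlation, the use of the correlation matrix, and the 95% level are all assumptions made for illustration.

```python
import numpy as np

def rule_n_critical_values(n_obs, n_vars, n_sims=300, ar1=0.0, quantile=0.95, seed=0):
    """Simulated critical values for the ranked eigenvalue shares under the
    null that all population eigenvalues are equal (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    shares = np.empty((n_sims, n_vars))
    for s in range(n_sims):
        noise = rng.standard_normal((n_obs, n_vars))
        data = np.empty_like(noise)
        data[0] = noise[0]
        for t in range(1, n_obs):
            # AR(1) recursion; innovations are scaled so each series keeps unit variance
            data[t] = ar1 * data[t - 1] + np.sqrt(1.0 - ar1**2) * noise[t]
        corr = np.corrcoef(data, rowvar=False)
        eigvals = np.linalg.eigvalsh(corr)[::-1]  # eigvalsh is ascending; reverse it
        shares[s] = eigvals / eigvals.sum()
    # One critical value per rank: the chosen quantile of the j-th largest share
    return np.quantile(shares, quantile, axis=0)

crit = rule_n_critical_values(n_obs=100, n_vars=10)
```

Even though every population eigenvalue is equal here, the simulated critical value for the largest share sits well above 1/10 and the values decay with rank, which is the "gentle decay" described above.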

Preisendorfer’s simulated critical values work fine for the first eigenvalue estimate, but once having rejected that the first is equal to the others, I believe they give the wrong distribution for the second, since they are based on the original null that all N (or N-m if m seasonal or other parameters have been estimated) of the eigenvalues are equal. If the first eigenvalue was, say, 40% of the total variance, then there is only .6 as much variance to spread over the remaining N-1 (or N-m-1) eigenvalues under the revised null that these remaining eigenvalues are all equal to one another, but not to the first. Accordingly, the critical value for the second eigenvalue would be only about .6 as high as for the first in this example, and so forth. So the rule tends to stop much too soon. (Remember that the “singular values” generated by SVD of the data matrix are not the eigenvalues themselves of the covariance matrix, but just their square roots.)
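One way to read the modification suggested above is as a sequential test in which the threshold at each step is rescaled by the variance not yet accounted for. The sketch below is only that reading, under assumptions of my own: white-noise data with no serial correlation, a 95% level, and the hypothetical names `largest_share_crit` and `sequential_retained`. The observed shares are made up for the example.

```python
import numpy as np

def largest_share_crit(n_obs, n_vars, n_sims=300, quantile=0.95, seed=0):
    """Simulated critical value for the largest eigenvalue's share of variance
    when all n_vars population eigenvalues are equal (white noise assumed)."""
    rng = np.random.default_rng(seed)
    top = np.empty(n_sims)
    for s in range(n_sims):
        data = rng.standard_normal((n_obs, n_vars))
        eigvals = np.linalg.eigvalsh(np.cov(data, rowvar=False))
        top[s] = eigvals[-1] / eigvals.sum()  # eigvalsh returns ascending order
    return np.quantile(top, quantile)

def sequential_retained(observed_shares, n_obs):
    """Count leading eigenvalues whose share exceeds a threshold that is
    rescaled at each step by the variance left after earlier rejections."""
    p = len(observed_shares)
    remaining = 1.0
    kept = 0
    for k, share in enumerate(observed_shares):
        if p - k < 2:
            break  # nothing left to compare the last eigenvalue against
        threshold = remaining * largest_share_crit(n_obs, p - k)
        if share <= threshold:
            break
        kept += 1
        remaining -= share
    return kept

# Illustrative shares (they sum to 1): the first takes 40% of the variance
shares = [0.40, 0.16, 0.07, 0.07, 0.07, 0.06, 0.06, 0.05, 0.03, 0.03]
kept = sequential_retained(shares, n_obs=100)
```

After the first eigenvalue is accepted, `remaining` drops to 0.6, so the second threshold is roughly 0.6 times what it would be for the largest of nine equal eigenvalues sharing all of the variance.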

If you did have a data set with a pair of equal eigenvalues that were substantially higher than the others, and if a rule like Preisendorfer’s (with or without modification as above) were applied to it and the first lit up as significant, the second would almost surely light up as well. So the event of multiple eigenvalues doesn’t pose a particular problem for stopping rules per se.

Of course, in order for equal eigenvalues to be a meaningful indicator of non-correlation, each variable must have first been normalized to have equal variance. While it may be useful to apply a SVD directly to e.g. Steig’s AVHRR file, as Steig has apparently done, it would not be meaningful to apply a stopping rule like Preisendorfer’s Rule N to the resulting (squared) singular values.
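A small numerical check may help with both points: the squared singular values of the centered data matrix, divided by n-1, are the covariance eigenvalues, and unequal scales alone produce very unequal eigenvalues even when the variables are uncorrelated. The data below are synthetic, with standard deviations chosen arbitrarily for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs = 500
# Three uncorrelated variables with very different scales (sd 1, 3 and 10)
data = rng.standard_normal((n_obs, 3)) * np.array([1.0, 3.0, 10.0])

centered = data - data.mean(axis=0)
singular_values = np.linalg.svd(centered, compute_uv=False)  # descending
cov_eigvals = np.linalg.eigvalsh(np.cov(centered, rowvar=False))[::-1]

# Squared singular values / (n_obs - 1) match the covariance eigenvalues
assert np.allclose(singular_values**2 / (n_obs - 1), cov_eigvals)

# Without normalization the variance shares reflect the scales, not correlation
raw_shares = cov_eigvals / cov_eigvals.sum()

# After standardizing each variable the shares are all close to 1/3
standardized = centered / centered.std(axis=0, ddof=1)
std_eigvals = np.linalg.eigvalsh(np.cov(standardized, rowvar=False))[::-1]
std_shares = std_eigvals / std_eigvals.sum()
```

Here the first raw share is dominated by the sd-10 variable, so a test against an "all eigenvalues equal" null would reject for reasons that have nothing to do with correlation.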

I hope Steig sees this discussion in time to correct his graduate lectures on PCA this week!

BTW, North expressly assumes 0 serial correlation in the data matrix. If the data are serially correlated, perhaps an adequate fixup would be just to adjust his N for “effective DOF” a la Santer, Nychka, Quenouille and Bartlett. But as it stands, his se’s wouldn’t be valid, even for the limited question he addresses.
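For a sense of how much an effective-DOF adjustment could matter, here is a rough sketch. The standard error is North's rule of thumb, lambda times sqrt(2/N). The effective sample size uses the AR(1) factor (1 - r)/(1 + r) familiar from trend and mean estimates; whether that is the right factor for eigenvalue sampling errors is an assumption, and the function names and numbers are hypothetical.

```python
import math

def north_se(eigenvalue, n_samples):
    """North et al. (1982) rule-of-thumb sampling error: lambda * sqrt(2 / N)."""
    return eigenvalue * math.sqrt(2.0 / n_samples)

def effective_n(n_samples, lag1_autocorr):
    """AR(1)-style effective sample size, N * (1 - r) / (1 + r) (assumed form)."""
    return n_samples * (1.0 - lag1_autocorr) / (1.0 + lag1_autocorr)

def separated(lam_big, lam_small, n_samples):
    """True if the eigenvalue gap exceeds the sampling error of the larger one."""
    return (lam_big - lam_small) > north_se(lam_big, n_samples)

n = 300
n_eff = effective_n(n, lag1_autocorr=0.5)      # 300 * 0.5 / 1.5
naive = separated(4.0, 3.5, n)                 # uses the nominal N
adjusted = separated(4.0, 3.5, n_eff)          # uses the effective N
```

With these made-up numbers the pair looks separated at the nominal N but not after the adjustment, which is the sense in which the unadjusted se's would be too optimistic for serially correlated data.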

Unfortunately, it is comments like these that do not get directly into the discussion. They do, however, have a place in our knowledge base and help us individually judge the evidence and methods in papers like Steig et al. (2009).

In case anyone’s curious, here’s my reply. It hasn’t gotten through moderation yet.

Was Ryan finally granted the courtesy of a right to reply?

Instead, what is actually done is closely related to Principal Components Regression, where it is well-known that low variance components may be very important.

Overfitting is not mentioned in the pdf. Is this relevant as well:

A cautionary note on PCR: In practice, zero eigenvalues can be distinguished only by the small magnitudes of the observed eigenvalues. Then, one may be tempted to omit all the principal components with the corresponding eigenvalues below a certain threshold value. But then, there is a possibility that a principal component with a small eigenvalue is a good predictor of the response and its omission may decrease the efficiency of prediction drastically.

(Rao, Toutenburg; Linear Models Least Squares and Alternatives )
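A toy example of the situation the quote warns about, on synthetic data of my own construction (nothing here comes from Steig's data or method): two predictors share a large common signal, but the response depends only on their small difference, so the low-variance component carries essentially all the predictive power.

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs = 400
# Two latent directions: one with high variance, one with low variance
big = rng.standard_normal(n_obs) * 10.0
small = rng.standard_normal(n_obs) * 0.5
# The predictors are mixtures of the two latent directions
X = np.column_stack([big + small, big - small])
# The response depends only on the low-variance direction, plus a little noise
y = 2.0 * small + rng.standard_normal(n_obs) * 0.1

Xc = X - X.mean(axis=0)
yc = y - y.mean()
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
order = np.argsort(eigvals)[::-1]            # largest variance first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
scores = Xc @ eigvecs                        # principal component scores

def r_squared(score, response):
    """R^2 of a one-regressor least-squares fit (inputs already centered)."""
    beta = (score @ response) / (score @ score)
    resid = response - beta * score
    return 1.0 - (resid @ resid) / (response @ response)

variance_share = eigvals / eigvals.sum()
r2_pc1 = r_squared(scores[:, 0], yc)
r2_pc2 = r_squared(scores[:, 1], yc)
```

A rule that keeps components by size of variance would retain PC1 and drop PC2, which holds well under 1% of the predictor variance yet explains nearly all of the response.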

IMO the main problem in Steig’s “educational” post, and also to some extent in the discussion here, is the focus on PC retention as in ordinary PCA. This is not what they are doing in their Antarctic paper. Instead, what is actually done is closely related to Principal Components Regression, where it is well-known that low variance components may be very important. I guess a quote from Jolliffe (1982) is in order.

Hill et al. (1977) give a thorough and useful discussion of strategies for selecting principal components which should have buried forever the idea of selection based on size of variance. Unfortunately this does not seem to have happened, and the idea is perhaps more widespread now than 20 years ago.

References:

Ian T. Jolliffe, A Note on the Use of Principal Components in Regression, Applied Statistics, Vol. 31, No. 3 (1982), pp. 300-303.

R.C. Hill, T.B. Fomby, and S.R. Johnson, Component selection norms for principal components regression, Commun. Statist. - Theor. Meth., A6 (1977), pp. 309-334.

I agree, David.

Here’s a suggestion: If Dr. Steig and others like him have the courage of their convictions, and are genuinely interested in advancing the science, they should bring their findings here to ClimateAudit or some similar website for scrutiny before going out and submitting them to publications that, regardless of how strenuous the peer-review process, may not have the wherewithal to fully ascertain the accuracy of the findings.
