Here’s a comment on handling of Preisendorfer’s Rule N in the code dump; I’ll post some further comments on Preisendorfer on Preisendorfer’s Rule N in a few days. There is NO source code showing the application of Preisendorfer’s Rule N to tree ring networks – a battleground issue, if you will. There is some source code showing the application of a procedure described as "Rule N" to determine the number of temperature PCs to retain, but the procedure in the source code is not consistent either with Rule N of Preisendorfer [1988] or with the procedure illustrated for tree ring networks at realclimate here

Preisendorfer’s Rule N is a simulation procedure (see footnote here for exact quotation) establishing necessary conditions for significance under some restrictive assumptions described in Preisendorfer [1988]. As I’ve noted before here and amusingly here with the dot.com hockey stick , contrary to realclimate claims , Preisendorfer’s Rule N is **not** a method of establishing **sufficient conditions** for statistical significance, bit only for **necessary** conditions. The difference is fundamental.

Preisendorfer [1988] has some very explicit language on this, which I’ll post up in a couple of days. One wonders whether Ammann and Schmidt, the authors of the realclimate Dummies Guide have actually read Preisendorfer. Perhaps the title of their post may describe it more accurately than they intended. The use of Preisendorfer’s Rule N for tree ring networks is not mentioned in MBH98 and tree ring networks do not meet the specific network conditions of Preisendorfer [1988] (although gridded temperature series do.)

The first published mention of the application of Preisendorfer’s Rule N to tree ring networks was at realclimate here, together with an example here. The purpose was to rationalize the inclusion of bristlecones in the NOAMER network if they were dominating the PC4 instead of the PC1. The reason for the intense battle over the esoteric issue of PC retention is that, as long as the bristlecones dominate one of the PCs in the network, they will impart a hockey stick shape to the reconstruction. It doesn’t matter whether it’s the PC1 or the PC4. The main issues are, of course, whether the bristlecones are valid temperature proxies and whether one should rely on a reconstruction which is not robust to the presence/absence of bristlecones.

But an interesting side issue was whether Preisendorfer’s Rule N according to the December 2004 realclimate example was actually used in MBH98 itself. I was able to accurately replicate the diagram at realclimate on the AD1400 NOAMER network and then applied this methodology to every other network/calculation step combination. I was unable to replicate the demonstrated retentions (which were only archived in July 2004) using this supposed procedure, as reported here. The discrepancies in other networks was quite remarkable: it was impossible to demonstrate the selection of 9 PCs for the AD1700 Stahle/SWM network or 2 PCs in the AD1600 Vaganov network. The actual procedure for retaining tree ring PCs was one of the issues that I listed as outstanding in my June 2005 review (# 11 here ). So it was one of the first things that I looked for in the source code dump.

First, there is **nothing** in the source code that shows the application of Preisendorfer’s Rule N to tree ring networks. I’ve looked top to bottom through the code and state this categorically. Given the importance attributed to this procedure at realclimate in December 2004 as a battleground issue, critical to rebutting us, one would have supposed that this important source code would have been produced. While there is no code demonstrating the application of Preisendorfer’s Rule N to tree ring networks, there is some very curious code around page 15 (Notepad printout) referring to "rule N" in the context of temperature PCs (as one alternative method of selection). The comment states:

now determine suggested number of EOFs in training based on rule N applied to the proxy data alone during the interval t > iproxmin (the minimum year by which each proxy is required to have started, note that default is iproxmin = 1820 if variable proxy network is allowed (latest begin date in network) we seek the n first eigenvectors whose eigenvalues exceed 1/nproxy’. nproxy’ is the effective climatic spatial degrees of freedom spanned by the proxy network (typically an appropriate estimate is 20-40)

What does Mann do to implement this? First, he counts the number of proxy series M0 in the calculation step (112 in the 1820 step; 22 in the AD1400 step) and divides this by two to get the value np3. He also uses two other values for np1 and np2 which are said to be:

taking into account spatial dependence between different proxy records, estimate number of climatic spatial degrees of freedom sampled (actually, estimate reasonable range)

We see the following code: *np1 = 60 np2 = 40 np3 = M0/2 ruleN1 = one/float(np1) ruleN2 = one/float(np2) ruleN3 = one/float(np3) *ruleN3 is the parameter that counts: it is 2 / # of proxies in the network i.e 2/112 in the AD1820 network and 2/22 in the AD1400 network. Then Mann does a SVD on the proxy network (this, as elsewhere is a weird method in which he transforms all variables into complex numbers and then uses a 1969 Fortran routine; other than being weird, it isn’t incorrect). The code call is: *call CSVD(AA,nmax1,mmax,N0,M0,IP0,NU0,NV0,S,UU,VV) *Next he calculates the total variance as the sum of squares of the eigenvalues (variables *partsum* and *sumtot*). Next he counts the number of eigenvalues that have a share of total variance exceeding 2/M0 (i.e. 2/112 in the AD1820 network and 2/22 in the AD1400 network) up to a maximum of 30. I don’t know where the 30 comes from.

The retention policy of 2/M0 is not documented here or in MBH98 itself. It’s possible that that this comes from some unpublished simulations, tying in somehow to Preisendorfer’s Rule N. I haven’t tried to replicate the retention of temperature PCs at here according to this procedure, but I would be simply dumbfounded if the selections evidenced at this link can be obtained from this calculation. The rule exemplified here is obviously a different calculation than the calculation illustrated at realclimate here. I have not checked whether this procedure applied to tree ring networks will yield the strange retention pattern of PCs – I doubt it. I am actually baffled as to how any objective methodology can get 2 PCs for the AD1600 Vaganov network and 9 PCs for the AD1700 Stahle/SWM network?

Does this *matter*? Well, they said that their PC retention policy was the "mathematically correct" way to do the calculations. I can’t replicate their calculations. Wahl and Ammann didn’t show a replication of these calculations and I can’t imagine that they could (or else it would be reported). They didn’t **report **that they couldn’t replicate them; they reported the opposite – that they had replicated MBH98 top-to-bottom.

## 16 Comments

Steve,

Perhaps you don’t want to post this. Anyway at

http://www.gfi.uib.no/~nilsg/kurs/notes/node1.html

there are a list of topics for an academic course which mentions Barnett-Preisendorfer CCA.

Underneath is the name David Stephenson 2000-09-02, who probably taught the course. There is a David Stevenson who is George Van Osdol Prof of Planetary Science at Caltech –

http://www.gps.caltech.edu/faculty/stevenson.

Maybe, just maybe Stephenson can give you some help.

Roger Bell

I got this using Google

Please let me know if it helps or not, if not I’ll try again

Steve:Thanks for this. CCA is a little different than PCA. I’ve been re-reading Preisendorfer, which is very nice for mathematically inclined people. His comments about Rule N are really quite shocking when one sees the realclimate usage. Preisendorfer keeps things nice and logical and away from ad hoc Mannian recipes or the ridiculous Dummies Guide by Dumb and Dumber – and here I was going to be serene.Eigenvalues? Back to maths 101😦 I though we were dealing with simple climate issues…….

Whenever I hear eigenvalue or eigenvector it scares me…

Eigenvalue basically just means what you get if you put any whole number (among those allowed by an equation) into the equation. IOW: substituting n= 3 we get Vx = 6.74 * pi / Planck’s constant or whatever.

Oh…it’s like orbitals? An eigenvalue would be the energy?

An eigenvalue can be anything. Energy, momentum (probably an eigenvector in that case), and it doesn’t have to even be quantum mechanics related. If you’ll recall, there’s an equilvance between the matrix mechanics and the Schroedinger equation sort of thing. So I’m sure there’s a branch of pure mathematics which deals in eigenvalues and that sort of thing. Probably related to linear algebra or possibly advanced calculus. But it’s been so many decades since I studied the stuff I only remember a bit of it. Of course nowadays it’s easy to look up a tutorial on-line if you’re interested.

I don’t really know “mattreses”. My mind tuned out when they started talking about determinants. And I’ve never had a full up class on it. Just the snippets that you get in other stuff.

I wish I knew more math. I aced differential calc, integral calc, got a B in multivariable calc(not as quickly as intuitive as the others), and then A in diffeqs.

Haved had that much more than that. some snippets for other courses. Just makes me recognize terms and at times concepts. But not really know much.

Any matrix can be used to transform space. The eigenvectors of the matrix represent those vectors in the space whose direction is not affected by the transformation; the eigenvalues represent the amount by which the eigenvectors are stretched or compressed.

The fun bit is thinking through what the space is representing in any particular case.

Don’t be scared of eigens, they’re really pretty.

dude, that’s all cool and all. But did you listen to my math background? I don’t know what the word “transform” means mathematically.b

Any rectangular matrix H of dimension (n,m), usually n>m, can be represented as follows:

H = U* D * V, where U is orthogonal (n,n), D is diagonal with decreasing values from left top to bottom right and V is orthogonal (m,m). The matrix terminology is that D are the eigenvalues, V the eigenvectors. "Eigen" means singular and so you sometimes see "eigenvalues" called singular values; the operation is usually called singular value decomposition (svd).

In principal components, you do a svd on the covariance matrix or correlation matrix of H. If H is centered on its columns (as required by Preisendorfer to be an analysis of

variance), then svd on centered H and svd on the covariance matrix will yield equivalent results. Svd on a centered and scaled version of H will yield equivalent results to svd on a correlation matrix.If the covariance matrix is Q, then svd on it yields:

Q=U*D* V (not necessarily the same U and V as for H unless H is centered). V is symmetric t(V)=V and t(V)=V^-1. If you project the original data matrix H on the eigenvector matrix V, you get the principal components F:

F= H*V,

Thus, the original data matrix is the product of the principal components and the eigenvectors, H=F*V, since

since F*V=F*t(V)= H*V*t(V)=H.

In this decomposition, the eigenvalues are related to the standard deviations of the left matrix series. THey are reported a little difffferently in princomp and prcomp – with one dividing by n and one dividing by (n-1). It ttook me a whole to figure this out.

The principal component programs in R (and S) have slightly different terminologies. The left matrix is called "scores" in princomp and "x" in prcomp; the right matrix is called "loadings" in princomp and "rotation" in prcomp and the weightings are called "sdev".

I first used princomp, but Im using prcomp now, since it ties directly to svd decompositions, while princomp needs to make a slight weight adjustment to recover corresponding eigenvalues.

Red flag…bull…horns lowered…Steve-gonna-get-a-gorin’ 😉

TCO, there’s still one thread on this site you haven’t made a comment on, but I’m not going to tell you, you’ll have to find it for yourself!

Interestingly, although I did four years of stats at university, I don’t recall that we did any PCA (or factor analysis, for that matter). I came across it in a commercial situation a few years later – I bought a couple of texts, and learned it from scratch.f

Re #3: Here’s another go at understanding eigenvalues/vectors. You have a square matrix A with k columns and k rows. And you have a vector c with k rows (and 1 column). The product Ac yields another vector d with k rows. If you graph a vector it’s just a point on a diagram with k axes, and there’s a line between it and the origin so we can think of a vector having a direction and a length. When you multiply A times c you’re creating a new vector d. It’s also just a point in k-space, with a length and a direction.

Now suppose you multiply a vector p by A and the resulting vector q (=Ap) has the same direction as the old vector. If you picture this in 2-d, what that means is q is a scalar multiple of p. It’s just stretched or shrunk along the same direction. So you didn’t even need to use the matrix A, just the scalar multiple, which we’ll call z. So q = zp, where z is just a number. Then we have 2 equations, q=Ap and q=zp, and that implies Ap=zp. With some algebraic rearranging that gives us (A-Iz)p=0 where I is an identity matrix with k rows/columns, all 0’s except 1’s down the diagonal.

Now suppose you try out every possible vector in k-space and note down all the pairs of vectors p and scalars z for which (A-Iz)p=0. There will be at most k of them, and they can be solved using a standard algorithm. These are the “eigenpairs”– eigenvalues (z’s) and eigenvectors (p’s). My German secretary told me that “eigen” means “essence” or “of itself”. The eigenvectors are the directions in which a matrix A transforms a vector “eigenly”–it doesn’t change the vector’s direction, it only dilates it (by the scalar amount z). In other words it’s the simplest transformation associated with the matrix A.

Principal component analysis starts out by describing a different problem. You have a data matrix M and you want to approximate it with a vector w. If you set it up as a sum of squares minimizing problem you end up with an algebraic expression that involves computing the eigenvectors/values for the matrix M’M. If the columns of M are centered to a zero mean this is the covariance matrix of M.