A few days ago, I mentioned that I thought that Bürger et al 2006, while recognizing the linear relationship between MBH proxies and their RPCs, had incorrectly formulated that relationship: its form was inconsistent with my own derivation, which I had cross-checked and verified against source code (not that the methodology is "correct", only that the representation of the methodology is accurate). This has led to an interesting exchange with Gerd Bürger, which I’m now in a position to summarize.
Gerd initially thought that my derivation must be incorrect. I asked him to show how he got Tellus equation 8. (I thought that the form of Tellus equation 8 was ugly and contained unnecessary complications but that was seemingly a side issue.) Gerd sent in a derivation which looked plausible; when I tried the calculations his way, you could get Tellus equation 8 so I figured that maybe it was OK after all. The calculations still didn’t seem to reconcile, so I figured that I’d gotten lost somewhere in the reconciliation – although I’d spent lots and lots of time on it.
Gerd then wrote in correcting his calculation and stated that Tellus equation 8 did contain an error after all, so that my original conclusion was correct. Gerd says that the error does not affect their simulations and I suspect that this is so. However, it directly affects the argument in Bürger et al 2006 about VZ variance attenuation. I’m not sure that this argument is all that valid anyway, but the argument doesn’t seem to apply with a corrected equation 8, or at least it will require a total re-working.
Bürger et al used Moore-Penrose pseudoinverse notation, which I had not used. The Moore-Penrose pseudoinverse is familiar to people who use linear regressions; for a matrix $A$ with full column rank it is:
$$A^{+} = (A^T A)^{-1} A^T$$
For rectangular matrices, you need to be careful as to whether you are taking a left pseudoinverse or a right pseudoinverse, as the following (for $A$ with full row rank) is also a (right) pseudoinverse:
$$A^{+} = A^T (A A^T)^{-1}$$
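As a quick numerical illustration of the left/right distinction, here is a minimal sketch on random matrices. It assumes full column rank for the tall matrix and full row rank for the wide one; the names `A_tall` and `A_wide` and the sizes are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tall matrix (more rows than columns), full column rank: left pseudoinverse.
A_tall = rng.normal(size=(8, 3))
left_pinv = np.linalg.inv(A_tall.T @ A_tall) @ A_tall.T    # (A'A)^{-1} A'

# Wide matrix (more columns than rows), full row rank: right pseudoinverse.
A_wide = rng.normal(size=(3, 8))
right_pinv = A_wide.T @ np.linalg.inv(A_wide @ A_wide.T)   # A'(AA')^{-1}

# Each explicit formula agrees with NumPy's general Moore-Penrose routine.
left_matches = np.allclose(left_pinv, np.linalg.pinv(A_tall))
right_matches = np.allclose(right_pinv, np.linalg.pinv(A_wide))

# The left form is an inverse from the left, the right form from the right.
left_identity = np.allclose(left_pinv @ A_tall, np.eye(3))
right_identity = np.allclose(A_wide @ right_pinv, np.eye(3))

print(left_matches, right_matches, left_identity, right_identity)
```

Swapping the formulas (applying the left form to the wide matrix, say) would require inverting a singular 8 x 8 matrix, which is why the distinction matters for rectangular matrices.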
The error in Bürger et al equation 8 appears to be traceable to the following property of Moore-Penrose pseudoinverses. Suppose that R is a rectangular matrix and S is an invertible square matrix. If you take a right Moore-Penrose pseudoinverse, then
$$(SR)^{+} = R^{+} S^{-1}$$
but the following does NOT hold
$$(RS)^{+} = S^{-1} R^{+}$$
You can easily verify this by expanding the original definition. But notation can deceive and, if the latter is (incorrectly) assumed to be correct, then Tellus equation 8 is obtained as I’ll show here. This or something formally equivalent is almost certainly what happened in Bürger et al 2006.
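For what it's worth, the expansion can be written out as follows. This is a sketch under assumptions not spelled out in the text: $R$ is $m \times n$ with full row rank, $S$ is invertible, and $^{+}$ denotes the right pseudoinverse $A^T(AA^T)^{-1}$. In the first line $S$ is $m \times m$ and multiplies from the left; in the second $S$ is $n \times n$ and multiplies from the right.

```latex
\begin{align*}
(SR)^{+} &= (SR)^T\left[(SR)(SR)^T\right]^{-1}
          = R^T S^T \left[S\,(RR^T)\,S^T\right]^{-1} \\
         &= R^T S^T (S^T)^{-1} (RR^T)^{-1} S^{-1}
          = R^T (RR^T)^{-1} S^{-1}
          = R^{+} S^{-1}, \\[4pt]
(RS)^{+} &= (RS)^T\left[(RS)(RS)^T\right]^{-1}
          = S^T R^T \left[R\,(SS^T)\,R^T\right]^{-1}.
\end{align*}
```

In the second line $SS^T$ is sandwiched between $R$ and $R^T$. Since $R$ is not square, the bracket cannot be split into separate inverses, so in general $S$ does not come out as $S^{-1}R^{+}$ (it does in special cases, for instance when $S$ is orthogonal).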
Let’s review the bidding. Tellus equation (8) states:
In my notation, I used columns and rows in the transpose arrangement, with time series arranged in columns, which seems to be a far more common practice. A second difference in notation is that I used a Y matrix for the proxies (which are X in Gerd’s notation). Given that temperature supposedly causes ring widths rather than ring widths causing temperature, I prefer not to use X for ring widths, to avoid any temptation to reify an inverse relationship. Thirdly, I use U for temperature PCs rather than X, to act as a reminder that we are dealing with temperature PCs in this situation. Now in MBH, the monthly PCs are orthonormal, while the U series entering into the calculations are annual averages and are no longer orthonormal. However, they are close to being orthogonal, and using the U nomenclature serves as a reminder of this. In this case, the PC decomposition of the temperature fields is
Nothing turns on the notational differences, except that it’s easier for me to keep track of my own notation and then re-substitute at the end. The calibration step in my derivation was:
Using Bürger’s Gamma nomenclature, this is equivalent to:
In one of his posts, Gerd wrote that
The estimate for the temperature TPCs is derived by the least-squares solution of:
In Gerd’s notation, this yields without any trouble (as I had equivalently done in my calculations):
There are available simplifications, which I pointed out in the one-dimensional case. Using Gerd’s Gamma terminology, the same equation would be:
In this case, you can extract the left matrix but not the right matrix from the expansion. If you incorrectly extracted both of them, you would get:
Allowing for rows and columns exchanging u for y and y for x, this converts to the following, which is their Tellus equation 8:
So it’s now pretty easy to see where they probably went wrong. I say this not as a gotcha, but simply because it’s always worth understanding where a calculation went wrong so that the error is not repeated (a trap that I temporarily fell into when I tried to start from Gerd’s equations rather than my own).
In passing, I wouldn’t say that I observed the error in Tellus equation 8 from attempting to "audit" Bürger et al 2006, so much as observing an inconsistency between it and my own derivation. However, I did spend a lot of time unsuccessfully trying to reconcile these results before making any comments on the matter online.
A Re-Statement of My Derivation
I thought that it would be useful to present my previous results from the same starting point as Gerd so that the reconciliation can be clearly seen and perhaps persuade Gerd and others as to the correctness of the PLS viewpoint. You will also see the simplifications and cancellations that result from how I did it and why this leads to the PLS derivation that had eluded previous commentators. First let’s get rid of the notation:
since the is a left-matrix under right inverse
This is simpler than Gerd’s equation, it’s accurate, and it reveals the structure. At this point, we can expand the Moore-Penrose (right) pseudoinverse:
In the one-dimensional case with standardized Y, being the vector of columnwise correlations of to , the first temperature PC, and ||.|| denoting the standard deviation (I use this notation from linear algebra since s.d. behaves rather like a linear algebra norm in this context),
this reduces to:
as I had previously derived. I think that this one-dimensional case is much the most important for several reasons: (1) the early portion of MBH is one-dimensional; (2) the NH temperature index is dominated by the PC1 and so the calculation of primary interest can be closely approximated by the one-dimensional case in subsequent periods; (3) arguably the multidimensional case can be either represented or closely approximated as piecewise one-dimensional calculations with some of the multivariate apparatus not actually contributing to the result.
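The one-dimensional result can be checked numerically. The following is only a sketch on synthetic data. It assumes that calibration is a least-squares regression of each standardized proxy on the temperature PC, and that estimation is the least-squares solution through the right pseudoinverse of the calibration coefficients. The names `Y`, `u`, `G`, `rho` and the random data are illustrative assumptions, and no MBH re-scaling step is included.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 80, 12                       # years and number of proxies (made up)

u = rng.normal(size=n)
u = u - u.mean()                    # centered synthetic "temperature PC1"

Y = 0.3 * u[:, None] + rng.normal(size=(n, p))
Y = (Y - Y.mean(axis=0)) / Y.std(axis=0)   # standardized proxies, one per column

# Calibration: regress each proxy on u, giving one coefficient per proxy.
G = (u @ Y) / (u @ u)               # shape (p,)

# Estimation: least-squares solution via the right pseudoinverse of G.
u_hat = Y @ G / (G @ G)

# Columnwise correlations of the proxies with u.
rho = np.array([np.corrcoef(Y[:, j], u)[0, 1] for j in range(p)])
sd_u = u.std()

# The calibration coefficients are just the correlations divided by sd(u) ...
coef_is_corr = np.allclose(G * sd_u, rho)

# ... so the estimate is a correlation-weighted sum of proxies, rescaled.
u_hat_weighted = sd_u * (Y @ rho) / (rho @ rho)
same_estimate = np.allclose(u_hat, u_hat_weighted)

print(coef_is_corr, same_estimate)
```

Under these assumptions, the weight on each proxy is proportional to its correlation with the target, which is the sense in which the one-dimensional calculation looks like a one-step partial least squares fit.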
Here’s a way of representing the multidimensional case in a similar format to how I did the one-dimensional case. Because is standardized to unit standard deviation and is nearly orthogonal (and exactly orthogonal if the MBH temperature PCs are calculated on annual data as is more logical – MBH98 used monthly because they incorrectly and amusingly thought that you needed more months than gridcells to calculate 16 PCs), the multivariate version can be simplified along the lines of the one-dimensional version. Let be the matrix of columnwise correlations. Then
– the definition of . Thus,
Since is about equal to as a result of near orthogonality, we have:
This is just a piecewise version of the one-dimensional calculation. If annual temperature PCs are used, as in Wahl and Ammann re-working of MBH, as opposed to the monthly temperature PCs of MBH, then is the identity matrix and the above equation is:
The "standardization" operation for Y assumed here is handled a little differently in Bürger et al 2006 in their discussion of canonical correlations. Using canonical correlations the normalized form of is while the corresponding matrix representation of simply dividing by the standard deviation is . Oddly enough – and this is one of the most ridiculous things about MBH proxies, which is totally misunderstood – the early MBH proxy network is close to being orthogonal (some "signal") and thus , but this is only contingently true and an undesirable aspect of data purporting to contain a "signal".
Bürger’s K Equation
In response to my earlier post, Gerd offered the following replacement of Tellus equation (8):
yielding revised equation (8):
In my notation, this converts to:
In the analysis of Bürger et al 2006, properties of the matrix were used to draw conclusions about variance attenuation. However, nothing is known about the properties of the matrix, which, at this point, doesn’t seem to be anything more than a notational artifice. I don’t see any advantage in continuing with this terminology as opposed to using the simple and direct expression resulting from my calculations.
I’m in the process of evaluating the canonical correlation arguments in Bürger et al 2006. While TCO and others have commented that they found Bürger et al easy to understand, I found much of it very difficult to understand (while liking other parts). I found equation 8 very hard to understand and it turned out that there was a reason for this. I find their analysis of canonical correlations and their impact on VZ variance attenuation very difficult to understand as well. I’m struggling to see the point of using canonical correlations and I’m dubious as to whether this analysis will survive the demise of equation 8. Maybe TCO understands some of this better than I do, but I’d be surprised.
Also the analysis in Bürger et al 2006 does not IMHO properly analyze the specific effect of re-scaling the RPCs. Yes, they consider this as a flavor in their simulations, but IMHO the re-scaling operation is intimately related to VZ variance attenuation, which cannot be properly understood without explicitly including the re-scaling operation in the linear algebra. I think that Bürger et al got distracted by the canonical correlation argument which doesn’t survive and missed an easier argument, which actually does the job. Once you do that – using the simple and direct representation of MBH methodology that I’ve presented here, you can show (IMO) that VZ variance attenuation is a simple application of orthogonality and the Pythagorean Theorem. From this vantage point, the huffing and puffing by the Hockey Team goon line against VZ is little more than a pointless diatribe against the Pythagorean Theorem.
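The orthogonality point can at least be illustrated with ordinary least squares, where the fitted values are orthogonal to the residuals and the variance of the fit therefore cannot exceed the variance of the target. This sketch uses synthetic data and a plain regression of a target on proxies. It is not the MBH procedure and does not include the re-scaling step, so take it as an illustration of the Pythagorean identity only, with all names and data being assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 100, 5                       # sample length and proxy count (made up)

u = rng.normal(size=n)
u -= u.mean()                       # centered synthetic target
Y = 0.5 * u[:, None] + rng.normal(size=(n, p))
Y -= Y.mean(axis=0)                 # centered synthetic proxies

# Ordinary least-squares fit of the target on the proxies.
beta, *_ = np.linalg.lstsq(Y, u, rcond=None)
u_hat = Y @ beta
resid = u - u_hat

# Fitted values and residuals are orthogonal ...
orthogonal = np.isclose(u_hat @ resid, 0.0, atol=1e-8)
# ... so squared lengths add up as in the Pythagorean Theorem ...
pythagoras = np.isclose(u @ u, u_hat @ u_hat + resid @ resid)
# ... and the fitted series cannot have more variance than the target.
attenuated = u_hat.var() <= u.var()

print(orthogonal, pythagoras, attenuated)
```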
I hadn’t previously thought about Tellus as a potential market for an article showing that MBH can be construed as a partial least squares method, but it’s probably logical. If they’ve published an incorrect formula, they can probably be persuaded to publish a correct formula, although you never know. I’ve got other material in hand advancing the attenuation issue, which I’ll write up for publication. (I will invite participation from others.) In order to protect myself against academic fussing about too much internet disclosure, you’ll have to be content for now with these teasers. With academic journal schedules, if I don’t get impatient in the meantime, you should learn about these other results no later than 2008 or 2009.