What I’m going to show here is that the MBH98 method can be reduced to a few lines of code and, in doing so, show some other interesting results as well. Today I’m just going to get to the reconstructed temperature PCs, but I’ll show that these are linear in the proxies and later show that the NH temperature index is, up to a very slight linearization in the RPC re-scaling step, also linear in the proxies. We reported this as long ago as MM03, but I’ve never shown this in detail. The result is important because it contradicts claims by MBH, von Storch et al  and Zorita et al  that the effect of individual proxies cannot be isolated. I’ve been massaging this material for a long time and plan to submit it for publication. Recently, Mann has argued that RegEM is a magic bullet and I’ve been re-visiting my methodological notes.
In the important case where only one temperature eigenvector is reconstructed (the AD1400 step in MBH98 and the 1000-1399 step in MBH99), I’ll show that the weights assigned to each proxy are proportional to the correlation of each proxy to the temperature PC1 – demonstrating the essential similarity of MBH98 methodology to “correlation weighted” proxies used in other studies [e.g. Mann and Jones, 2003].
The procedure described here (or close variations up to re-scaling) has been used since MM03. I have reconciled all results step by step to Ammann and Wahl source code (discussed last May) and have analysed MBH98 source code. I’ll discuss some nuances pertaining to weighted regression in the next instalment. (I apologize for transpose where you’ll have to read T as a transpose, as I can’t figure out how to do a superscript on "Wordpress").
MBH98 has 11 steps with the first step beginning in AD1400 with 22 proxies and 1 reconstructed temperature PC (RPC) and the last step beginning in AD1820 with 112 proxies and 11 RPCs. Note that the temperature PCs are different than the tree ring PCs and were obtained by a PC decomposition of 1082 gridcell series interpolated so that there were no missing values. For the analysis here, the steps don’t matter, although I will discuss separately a simplification resulting from use of only 1 RPC.
MBH98 stated that they “trained” the proxies by finding the “least-squares optimal combination” of (temperature) PCs to represent the proxies. Their procedure is described in typically rotund and inflated prose as follows:
These Neofs eigenvectors were trained against the Nproxy indicators, by finding the least-squares optimal combination of the Neofs PCs represented by each individual proxy indicator during the N=79 year training interval from 1902 to 1980 (the training interval is terminated at 1980 because many of the proxy series terminate at or shortly after 1980). The proxy series and PCs were formed into anomalies relative to the same 1902-80 reference period mean, and the proxy series were also normalized by their standard deviations during that period. This proxy-by-proxy calibration is well posed (that is, a unique optimal solution exists) as long as N> Neofs (a limit never approached in this study) and can be expressed as the least-squares solution to the overdetermined matrix equation, Ug = Y[,i] , where U is the matrix of annual PCs, and Y[,i] is the time series vector for proxy record i. The Neofs-length solution vector g = G[i,] is obtained by solving the above overdetermined optimization problem by singular value decomposition for each proxy record i = 1,…,Nproxy. This yields a matrix of coefficients relating the different proxies to their closest linear combination of the Neofs PCs .This set of coefficients will not provide a single consistent solution, but rather represents an overdetermined relationship between the optimal weights on each on the Neofs PCs and the multiproxy network. [ notation slightly re-stated for consistency with notation here.]
The most obvious implementation of the above procedure is qr-minimization, which for the i-th proxy yields a Neofs -length vector of coefficients G[,i] as follows:
1) , for i = 1,…, Nproxy
where G is the matrix of calibration coefficients, U is the matrix of (selected) temperature PCs and Y is the matrix of proxies, with both U and Y restricted to the calibration period cal. (The index cal restricting the matrices to the calibration period will be assumed but not shown in these equations, but is in all code where appropriate. The indexing in equation (1) is according to the conventions of S and R.) This is the approach used in Ammann and Wahl code (which like ours is in R).
The above procedure can be seen to be no more and no less than a multiple linear regression of each proxy in the calibration period against Neofs temperature PCs (TPCs) without a constant term:
2) ~ … for i=1,…, Nproxy
In the formalism of R,
3) G[,i] = lm( Y[,i] ~ U -1) )$coef
The coefficients from regression and from qr-minimization are identical. An important advantage of recognizing the inherent regression nature of the calculation is that there is considerable literature on statistical significance in regression. The regression formalism of equation (6) enables the calibration procedures to be viewed in more general statistical terms, enabling the consideration of the validity of individual statistical relationships through t-statistics, Durbin-Watson statistics. Merely expressing the problem in qr-minimization terms does not avoid the need to consider statistical significance. While it is beyond the scope of this note, many of the key individual calibrations (e.g. Gaspé tree rings) fail standard regression diagnostics testing for mis-specified or spurious relationships.
The matrices of regression coefficients G, which have dimension Neofs by Nproxy, can be rather large. In the 1820 step, there are 11×112 coefficients and 1×22 coefficients in the AD1400 step. Each linear regression can be expressed as:
4) , for i = 1,…, Nproxy,
the stacked version of which is:
Regression assumptions presume independent identically distributed (i..i.d) errors, which is not demonstrated in MBH98 and appears to be untrue, although this discussion is beyond the scope of this article. Standard linear algebra solves for the coefficients G as follows:
We use this formula in our code, which in one line yields all the calibration coefficients for a single step. Again we emphasize that results reconcile exactly from this step to both Ammann and Wahl code and MBH98 code mutatis mutandi.
Although MBH did not report many salient details of their calculations, here they reported that used singular value decomposition to obtain their calibration coefficients. Indeed, instead of calculating the matrix G directly as above, it is possible to derive identical results using principal components, which can be shown as follows. First, Mann carried out a PC decomposition [UU,SS,VV] of the matrix U of temperature PCs, yielding:
They then calculated the matrix of coefficients G through a series of operations in Fortran equivalent to the following formalism ( a transliteration of their Fortran steps is here):
This formalism yields identical results in the unweighted step as the usual regression linear algebra as shown here [...], but problems arise in the weighted case as discussed here .
In the AD1400 step (and the AD1000 step of MBH99), MBH only use one temperature PC (U[,1]) in the reconstruction. This changes the matrix formulations into vector formulations and enables some helpful simplifications. The covariance between a TPC and the i-th proxy (and mutatis mutandi, variances) can be expressed in inner product terms as follows:
Accordingly, a regression coefficient in equation (6) can be expressed as follows:
If the vector G[1,] is denoted by and applying cov and sd to the columns, then we get:
since the proxies Y are standardized in the calibration period. In other words, the calibration coefficients àÅ½àⱠfor the AD1400 step are proportional to the individual correlations of each proxy to the TPC1. (This would apply, mutatis mutandi, for the TPC2,…)
Estimation of RPCs
The estimation step in MBH98 consists of the estimation of the matrix of reconstructed TPCs, a step once again described in MBH98 in rotund prose as least-squares minimization.
Proxy-reconstructed patterns are thus obtained during the pre-calibration interval by the year-by-year solution of the overdetermined matrix equation, Gz = y [j,], where y[j,] is the predictor vector of values of each of the P proxy indicators during year j. The predictand solution vector z=[j,] contains the least-squares optimal values of each of the Neofs PCs for a given year. This optimization is overdetermined (and thus well constrained) as long as P >Neofs which is always realized in this study. It is noteworthy that, unlike conventional palaeoclimate transfer function approaches, there is no specific relationship between a given proxy indicator and a given predictand (that is, reconstructed PC). Instead, the best common choice of values for the small number of Neofs predictands is determined from the mutual information present in the multiproxy network during any given year. The reconstruction approach is thus relatively resistant to errors or biases specific to any small number of indicators during a given year. This yearly reconstruction process leads to annual sequences of the optimal reconstructions of the retained PCs, which we term the reconstructed principal components or RPCs and denote by [,k]. [notation changed slightly]
Again, the most obvious implementation of the stated procedure is through qr-minimization (this time on a year-by-year basis), which yields for the reconstruction of the k-th temperature PC in the j-th year:
Once again, this is the procedure applied by Wahl and Ammann. Once again, this is formally equivalent to a linear regression without an intercept – this time, the proxies are regressed against the calibration coefficients, being expressed as follows:
13) Y[j,] ~ G[k,1] + G[k,2]+ … +G[k,Nproxy], for j=1,…, Nyear; for k=1,…,Neofs
In the AD1820 step covering 161 years, there are 11×161 regressions yielding 11×161 individual RPC values. Equation (13) can be interpreted as:
14) , for j = 1,…, Nyear,
or stacked (this time estimating U instead of G):
The direct linear algebra solution of the qr-minimization (or the regression) is the following:
The stacking is very awkwardly handled in the Ammann and Wahl code, where the calibration and pre-calibration periods are handled separately, although the matrix algebra is identical and the stacking yields identical results.
Once again, Mann source code shows that they carried out operations which yield identical results to equation (16) in an unweighted case using PCA methods, similar to the corresponding operations used for calibration. Once again, temporarily leaving aside weighting issues, MBH98 first carried out PCA on the transpose of the matrix of calibration coefficients , yielding the triplet [UUU,SSS,VVV] such that:
They estimated the matrix of reconstructed temperature PCs as follows:
All derivations of in unweighted cases yield identical results. Using direct linear algebra, all the reconstructed PC series in MBH98 can therefore be derived in 2 lines:
In the case where only one TPC is used in the reconstruction, we have from (11): where cor (U[,1],Y) is the columnwise correlation of the PC1 U[,1] with the standardized proxies Y. Denote this vector by and let u = U[,1].
Simplifying equation (16) in this case:
That is, the reconstructed TPC1 in the AD1000 and AD1400 steps is simply a linear weighting of the proxies, with the weights proportional to their correlation to the TPC1.
The final RPC includes a re-scaling step described in the following post on this topic here.
Tomorrow I’ll talk about calculating the NH temperature reconstruction in one more line of code. See continuation here.