An Example of MBH "Robustness"

In the MBH source code, they apply steps that purport to weight the temperature PCs in the regression calculations in proportion to their eigenvalues. Comments in their code say:

c set specified weights on data

c weights on PCs are proportional to their singular values

This is one of two weighting procedures in MBH for the regression, the other being the weighting of the proxy series. Wahl and Ammann tout one of the key results of their study as showing that the MBH algorithm is “robust” against “important” simplifications and modifications. They claimed:

A second simplification eliminated the differential weights assigned to individual proxy series in MBH, after testing showed that the results are insensitive to such linear scale factors.

This claim is repeated in slightly different terms on several occasions:

Additionally, the reconstruction is robust in relation to two significant methodological simplifications-the calculation of instrumental PC series on the annual instrumental data, and the omission of weights imputed to the proxy series.

Our results show that the MBH climate reconstruction method applied to the original proxy data is not only reproducible, but also proves robust against important simplifications and modifications.

Indeed, our analyses act as an overall indication of the robustness of the MBH reconstruction to a variety of issues raised concerning its methods of assimilating proxy data, and also to two significant simplifications of the MBH method that we have introduced.

I’ll discuss this claim with respect to weights on proxies in detail on another occasion, but today I’ll note that it can trivially be seen to be untrue, merely by considering the acknowledged impact of differing weights on the NOAMER PC4. Obviously results are not “insensitive” to the “linear scale factors” (weights) on the PC4: low weights remove the HS-ness of the result and yield a low RE statistic. So this particular claim is patently false.

However, it is true that the MBH result is “robust” to the use or non-use of weight factors on the temperature PCs. This can be seen through some trivial (though long-winded) linear algebra, as follows.

Squared Weights in MBH98

A couple of weeks ago, I said that I would document (at least for Jean S and UC) an observation about the use of squared weights in MBH98. I realize that most readers won’t be fascinated with this particular exposition, but indulge us a little, since this sort of entry is actually a very useful way of diarizing results. It also shows the inaccuracy of verbal presentation of methods – which is hard enough to get right even for careful writers.

MBH98 did not mention any weighting of proxies anywhere in the description of methodology. Scott Rutherford sent me a list of 112 weights in the original email, so I’ve been aware of the use of weights for the proxies right from the start. Weights are shown in the proxy lists in the Corrigendum SI (see for example here for AD1400) and these match the weights provided by Rutherford in this period. While weights are indicated in these proxy lists, the Corrigendum itself did not mention the use of weights nor is their use mentioned in any methodological description in the Corrigendum SI.

In one place, Wahl and Ammann 2007 say that the weights don’t “matter”, but this is contradicted elsewhere. For example, all parties recognize that different results occur depending on whether 2 or 5 PCs from the NOAMER network are used together with the other 20 proxies in the AD1400 network (22 or 25 total series in the regression network). Leaving aside the issue of whether one choice or another is “right”, we note for now that, if the proxies were weighted uniformly, both alternatives can be represented merely through the use of weights of (1,1,0,0,0) in the one case and (1,1,1,1,1) in the other. If the PC proxies were instead weighted according to their eigenvalue proportion – a plausible alternative – then the weight on the 4th PC in a centered calculation would decline, assuming that the weight for the total network were held constant – again a plausible alternative.
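
As a small illustration of representing PC retention through weights (a sketch with simulated columns standing in for the PC network, not the actual MBH data):

# Sketch only: simulated stand-in for a 5-column PC network
set.seed(1)
pcs <- matrix(rnorm(100 * 5), 100, 5)

# Retaining 2 PCs versus keeping all 5 with weights (1,1,0,0,0)
two.pc   <- pcs[, 1:2]
weighted <- pcs %*% diag(c(1, 1, 0, 0, 0))

all.equal(two.pc, weighted[, 1:2])   # TRUE: the retained columns are unchanged
range(weighted[, 3:5])               # 0 0: the "omitted" PCs contribute nothing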

But before evaluating these issues, one needs to examine exactly how weights in MBH are assigned. Again Wahl and Ammann are no help, as they ignore the entire matter. At this point, I don’t know how the original weights were assigned. There appears to be some effort to downweight nearby and related series. For example, in the AD1400 list, the three nearby Southeast US precipitation reconstructions are assigned weights of 0.33, while Tornetrask and Polar Urals are assigned weights of 1. Each of the 4 series from Quelccaya is assigned a weight of 0.5, as is a Greenland dO18 series. The largest weight is assigned to Yakutia. We’ve discussed this interesting series elsewhere in connection with Juckes. It was updated under the alter ego “Indigirka River” and the update has a very pronounced MWP. Juckes had a very lame excuse for not using the updated version. Inclusion of the update would have a large impact on a re-stated MBH99 using the same proxies.

Aside from how the weights were assigned, the way the assigned weights enter the MBH formalism differs substantially from an intuitive implementation of the stated methodology. In our implementation of MBH (and Wahl and Ammann did it identically), following the MBH description, we calculated a matrix of calibration coefficients by a series of multiple regressions of the proxies Y against a network U of temperature PCs in the calibration period (in AD1400 and AD1000, this is just the temperature PC1). This can be represented as follows:

G=(U^TU)^{-1} U^TY

Then the unrescaled network of reconstructed RPCs \tilde{U} was calculated by a weighted regression (using a standard formula) as follows, denoting the diagonal of weights by P:

\tilde{U}= YPG^T (GPG^T)^{-1}
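
For concreteness, here is a minimal R sketch of these two formulas, using simulated stand-ins for U, Y and P (not the MBH data or code, just the calculation as written above):

# Sketch: nt calibration years, k temperature PCs, p proxies (all simulated)
set.seed(1)
nt <- 79; k <- 1; p <- 22
U <- matrix(rnorm(nt * k), nt, k)                                     # temperature PC(s) in the calibration period
Y <- U %*% matrix(rnorm(k * p), k, p) + matrix(rnorm(nt * p), nt, p)  # proxy network
P <- diag(runif(p, 0.3, 1))                                           # diagonal matrix of proxy weights

G <- solve(t(U) %*% U) %*% t(U) %*% Y                     # calibration: G = (U'U)^{-1} U'Y
U.tilde <- Y %*% P %*% t(G) %*% solve(G %*% P %*% t(G))   # weighted regression: U.tilde = YPG'(GPG')^{-1}
dim(U.tilde)                                              # nt x k matrix of unrescaled RPCs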

However, this is Mann, and things are never as “simple” as this. In the discussion below, I’ve transliterated the Fortran code provided in response to the House Energy and Commerce Committee (located here) into a sort of pseudo-code in which the blizzard of pointless subscript-managing Fortran is reduced to matrix operations, first using the native nomenclature and then showing the simplified matrix derivations.

You can find the relevant section of code by searching for the word “resolution”; the relevant section commences with:

c NOW WE TRAIN THE p PROXY RECORDS AGAINST THE FIRST
c neofs ANNUAL/SEASONAL RESOLUTION INSTRUMENTAL pcs

Calibration

Scrolling down a bit further, one finds that the calculation of the calibration coefficients is done through SVD operations that are actually algebraically identical to the pseudoinverse operations more familiar in regressions. Comments to these calculations mention weights several times:

c set specified weights on data

c downweight proxy weights to adjust for size
c of proxy vs. instrumental samples

c weights on PCs are proportional to their singular values

Here is my transliteration of the code for the calculation of the calibration coefficients:

S0 <- S[1:ipc]
# S is the diagonal of eigenvalues from the temperature SVD;
# ipc is the number of retained target PCs
weight0 <- S0/sum(S0)
B0 <- aprox %*% diag(weightprx)
# aprox is the matrix of proxies standardized on 1902-1980
# weightprx is the vector of weights assigned to each proxy
AA <- anew %*% diag(weight0)
# this step weights the temperature PCs by their normalized eigenvalues
svd.AA <- svd(AA); UU <- svd.AA$u; SS <- svd.AA$d; VV <- svd.AA$v
# SVD of the weighted temperature PCs: Mann's regression technique
work0 <- diag(1/SS) %*% t(UU) %*% B0[cal, ]
# this corresponds algebraically to part of the pseudoinverse used in regression
# cal denotes an index for the 1902-1980 calibration period
x0 <- VV %*% work0
# this finishes the calculation of the regression coefficients
beta <- x0
# beta is the matrix of regression coefficients, then used for estimation of the RPCs

Summarizing this verbose code:

svd.AA <- svd(anew[1:79, 1:ipc] %*% diag(weight0)); UU <- svd.AA$u; SS <- svd.AA$d; VV <- svd.AA$v
beta <- VV %*% diag(1/SS) %*% t(UU) %*% aprox[cal, ] %*% diag(weightprx)

Commentary: Mann uses SVD to carry out matrix inversions. There is an identity relating the pseudoinverse used in regression calculations to Mann’s SVD methods that is very useful in analyzing this code. If the SVD of a matrix is represented as X=USV^T, the pseudoinverse of X can be represented by the following:
(X^TX)^{-1}X^T = VS^{-1}U^T

This can be seen merely by substituting in the pseudoinverse and cancelling.
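
The identity is also easy to verify numerically; a quick sketch with an arbitrary full-column-rank matrix:

# Check (X'X)^{-1} X' = V S^{-1} U'  for X = U S V'
set.seed(1)
X <- matrix(rnorm(30 * 4), 30, 4)
s <- svd(X)
pinv.direct <- solve(t(X) %*% X) %*% t(X)
pinv.svd    <- s$v %*% diag(1/s$d) %*% t(s$u)
all.equal(pinv.direct, pinv.svd)     # TRUE (up to numerical precision)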

Note that UU, SS and VV as used here are merely local SVD decompositions and do not pertain to the “global” uses of U, S and V in the article, which I’ve reserved for the products of the original SVD decomposition of the gridcell temperature network T:

[U,S,V]= svd(T,nu=16,nv=16)

Defining L as the k=ipc truncated and normalized eigenvalue matrix and keeping in mind that U is the network of retained temperature PCs, we can collect the above pseudocode as:
[UU, SS, VV] = svd(UL, nu=ipc, nv=ipc)
\hat{\beta}_{MBH} = VV * diag(SS^{-1}) * UU^T * Y * P

Applying the pseudoinverse identity to UL in the above expression, we can convert this to more familiar nomenclature:
\hat{\beta}_{MBH}= ( (UL)^T(UL))^{-1}(UL)^T YP
\hat{\beta}_{MBH}=L^{-1}(U^TU)^{-1} L^{-1}LU^TYP
\hat{\beta}_{MBH}=L^{-1}C_{uu}^{-1} C_{uy}P

In our emulations in 2005 (and also in the Wahl and Ammann 2007 emulation which I reconciled to ours almost immediately in May 2005), the matrix of calibration coefficients was calculated without weighting the target PCs and without weighting the proxies (in this step) as follows:
\hat{\beta}_{WA}=(U^TU)^{-1}U^TY = C_{uu}^{-1}C_{uy}

The two coefficient matrices are connected easily as follows:
\hat{\beta}_{MBH}= L^{-1}\hat{\beta}_{WA}P
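
This relation is easy to check numerically; here is a sketch with simulated stand-ins for U, Y, L and P:

# Check beta_MBH = L^{-1} beta_WA P on simulated inputs
set.seed(2)
nt <- 79; k <- 3; p <- 22
U <- matrix(rnorm(nt * k), nt, k)            # retained temperature PCs
Y <- matrix(rnorm(nt * p), nt, p)            # proxy network
L <- diag(c(0.5, 0.3, 0.2))                  # normalized eigenvalue weights (illustrative values)
P <- diag(runif(p, 0.3, 1))                  # proxy weights

s <- svd(U %*% L)                            # Mannian calibration via SVD of the weighted PCs
beta.MBH <- s$v %*% diag(1/s$d) %*% t(s$u) %*% Y %*% P

beta.WA <- solve(t(U) %*% U) %*% t(U) %*% Y  # unweighted calibration
all.equal(beta.MBH, solve(L) %*% beta.WA %*% P)   # TRUE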

The two weights have quite different effects in the calculation. The weights L are ultimately cancelled out in a later re-scaling operation, but the weights P carry forward and can have a substantial impact on downstream results (e.g. the NOAMER PC controversies.)

Following what seemed to be the most plausible interpretation of the sketchy description, we weighted the proxies in the estimation step; Wahl and Ammann dispensed with this procedure altogether, arguing that the reconstruction was “robust” to an “important” methodological “simplification” – the deletion of any attempt at weighting the proxies (a point which disregards the issue of whether such weighting for geographical distribution or for over-representation of one proxy type or site has a logical purpose).

Reconstruction

Scrolling down a bit further, one finds the reconstruction step in the code described as:

“DETERMINE THE RECONSTRUCTED FIELDS BY INVERTING THE TRANSFER FUNCTION”

Once again, here is a transliteration of the Fortran blizzard into matrix notation first following the native nomenclature of the code:

B0 <- aprox %*% diag(weightprx)
# this repeats the previous calculation: aprox is the proxy matrix, weightprx the weights
AA <- beta
# beta is carried forward from the prior step
svd.AA <- svd(t(AA)); UU <- svd.AA$u; SS <- svd.AA$d; VV <- svd.AA$v
# again the regression is done by SVD, this time on the (transposed) matrix of calibration coefficients
work0 <- B0 %*% UU %*% diag(1/SS)
work0 <- work0 %*% t(VV)
# this is the regression carried out using the SVD equivalent of the pseudoinverse
x0 <- work0
# this is the matrix of reconstructed RPCs

Summarizing by collecting the terms:

svd.b <- svd(t(beta)); UU <- svd.b$u; SS <- svd.b$d; VV <- svd.b$v
x0 <- aprox %*% diag(weightprx) %*% UU %*% diag(1/SS) %*% t(VV)

Commentary: Using our notation, the unrescaled reconstructed RPCs denoted by \tilde{U} instead of x0 are obtained:

\tilde{U}=Y*P * UU * SS^{-1} VV^T

Once again the pseudoinverse identity can be applied, this time to L^{-1}GP (= \hat{\beta}_{MBH}) where G=C_{uu}^{-1}C_{uy}, yielding:

\tilde{U}=YP (L^{-1}GP)^T ((L^{-1}GP)(L^{-1}GP)^T)^{-1}

\tilde{U}=YP^2 G^T (GP^2G^T)^{-1}L

Expressed in terms of C matrices, this expression becomes:

\tilde{U}=Y*P^2 (C_{uu}^{-1}C_{uy})^T (C_{uu}^{-1}C_{uy} P^2 (C_{uu}^{-1}C_{uy})^T)^{-1}L
\tilde{U}=Y*P^2 C_{uy}^T   (C_{uy} P^2 C_{uy}^T)^{-1} C_{uu}L

The form of the above expression is precisely that of the conventional weighted-regression expression shown above (here additionally incorporating the L weighting, which is removed in a later step):

\tilde{U}=YP G^T (GPG^T)^{-1}L

However, there is one obvious difference. The Mannian implementation, which uses his ad hoc “proprietary” code rather than any form of conventional regression software, ends up with the proxy weights being squared.
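
The squaring can also be confirmed numerically; here is a sketch comparing the SVD-style reconstruction of the transliterated code with the closed form derived above, again with simulated stand-ins:

# Check that the code-style reconstruction equals Y P^2 G' (G P^2 G')^{-1} L
set.seed(3)
nt <- 79; k <- 3; p <- 22
U <- matrix(rnorm(nt * k), nt, k); Y <- matrix(rnorm(nt * p), nt, p)
L <- diag(c(0.5, 0.3, 0.2)); P <- diag(runif(p, 0.3, 1))

G <- solve(t(U) %*% U) %*% t(U) %*% Y        # conventional calibration coefficients
beta <- solve(L) %*% G %*% P                 # Mannian coefficients, as derived above

s <- svd(t(beta))                            # reconstruction step as in the transliterated code
U.tilde.code <- Y %*% P %*% s$u %*% diag(1/s$d) %*% t(s$v)

U.tilde.closed <- Y %*% P %*% P %*% t(G) %*% solve(G %*% P %*% P %*% t(G)) %*% L
all.equal(U.tilde.code, U.tilde.closed)      # TRUE: the proxy weights enter squared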

In a follow-up post, I’ll show how the L weighting (but not the P weighting) falls out in re-scaling.

The Wahl and Ammann implementation omitted the weighting of proxies – something that they proclaimed as a “simplification of the method”. If you have a highly uneven geographic distribution, it doesn’t seem like a bad idea to allow for that through some sort of stratification. For example, the MBH AD1400 network has 4 separate series from the Quelccaya glacier (out of only 22 in the regression network); the AD1000 network has all 4 in a network of only 14 proxies. These consist of dO18 values and accumulation values from 2 different cores. It doesn’t make any sense to think that all 4 proxies are separately recording information relevant to the reconstruction of multiple “climate fields” – so the averaging or weighting of series from the same site seems a prerequisite. Otherwise, why not have ring width measurements from individual trees? Some sort of averaging is implicitly done already. In another case, the AD1400 network has two ring width series from nearby French sites and two from nearby Morocco sites, which might easily have been averaged or even weighted through a PC network, as opposed to being used separately.

While some methodological interest attaches to these steps, in terms of actual impact on MBH the only thing that “matters” is the weight on the bristlecones – one can upweight or downweight the various “other” series, but the MBH network is functionally equivalent to bristlecones plus white noise, so upweighting or downweighting the white noise series doesn’t really “matter”.

Stockwell on March 2008

David Stockwell has an interesting post here on March 2008 temperatures, noting a divergence between NH and SH temperatures, with NH temperatures rebounding and SH temperatures continuing to cool. http://landshape.org/enm/march-2008-temperatures/

IPCC Review Editors Comments Online

IPCC Review Editors have an extremely important function under IPCC procedures. In prior discussion of the Replies by WG1 Chapter Authors to Review Comments, we noted their unresponsiveness on issues that we were familiar with, e.g. the deletion of the inconvenient post-1960 Briffa reconstruction results and the handling of the HS dispute. When the IPCC WG1 (grudgingly) placed the WG1 Review Comments and Replies online (url here), they did not place the Review Editor comments online, despite the importance of review editors. Through the diligent efforts of David Holland, the IPCC WG1 and WG2 Review Editor comments have now been obtained and are online for the first time here – at this point, another Climate Audit exclusive.

When you examine these review comments, as I urge you to do, please remember that this is supposed to be the most carefully reviewed document in human history, where entire stadiums of scientists have carefully weighed each word. Compare that impression to the actual review editor comments, which as you will see do not rise above a form letter for 64 of 69 Review Editor comments discussed here.

Like a Dog on a Bone

UC observed a couple of days ago that the Hadley Center, authors of the pre-eminent temperature series, have suddenly identified an “error” in how they presented temperature data. In a part-year situation, their methodology calculated the average of the months then available and used that as the estimate of the current year’s temperature for presentation purposes. For their influential graphic showing smoothed temperature series, they used a 21-point binomial filter (this is reported), extrapolating the latest value for 10 years. This obviously places a lot of leverage on January and February temperatures. (UC has replicated their smoothing method; he sent me code and I’ve confirmed that we can exactly replicate their smoothing methods.)
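
I haven’t posted UC’s code; the following is only a rough R sketch of the general procedure as I understand it: a 21-point binomial filter applied to the annual series, with the end values repeated for 10 years so that the filter can be evaluated at the last year. The filter length and 10-year extrapolation are as reported; the padding details are my assumption.

# Rough sketch (not UC's or CRU's code): 21-point binomial smooth with end padding
binom.wts <- choose(20, 0:20) / 2^20                    # 21-point binomial filter weights (sum to 1)

smooth.padded <- function(x, wts = binom.wts) {
  half <- (length(wts) - 1) / 2                         # 10 for a 21-point filter
  xp <- c(rep(x[1], half), x, rep(x[length(x)], half))  # repeat the end values ten times
  stats::filter(xp, wts, sides = 2)[(half + 1):(half + length(x))]
}

# Toy illustration of the leverage: a cold part-year estimate appended to the series
# pulls the smoothed endpoint down noticeably
set.seed(4)
annual <- cumsum(rnorm(50, 0, 0.1))
tail(smooth.padded(annual), 3)
tail(smooth.padded(c(annual, annual[50] - 0.4)), 3)     # hypothetical cold part-year value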

As has been widely reported, January and February 2008 temperatures are noticeably lower than last year’s. Here is a plot of Hadley Center GLB monthly temperatures, showing the two 2008 months as bold points. This ties into the monthly plot at the HadCRU site here.

[Figure hadcru32.gif: Hadley Center GLB monthly temperatures, with January and February 2008 shown as bold points]

These cold January and February 2008 temperatures have led to a noticeable downturn in the smoothed annual series. This has not escaped the notice of the Hadley Center, who were extremely quick off the mark to notice an “error” which resulted in graphical emphasis of a downturn (here):

We have recently corrected an error in the way that the smoothed time series of data were calculated. Data for 2008 were being used in the smoothing process as if they represented an accurate estimate of the year as a whole. This is not the case and owing to the unusually cool global average temperature in January 2008, the error made it look as though smoothed global average temperatures had dropped markedly in recent years, which is misleading.

Heading into the IPCC WG1 conference in Paris in February 2007, January 2007 was a very warm month. I thought that it would be interesting to plot the HadCRU-style result as of January 2007 and compare it to the January 2008 version (now excised from the website). The blue dots below show the effect of the CRU smoothing method used in 2007 incorporating Jan and Feb 2008 – showing the downturn which caused the Hadley Center to notice the “smoothing error”. The black shows the present annual series – not using 2008 data – which is what is currently displayed on the Hadley Center website (prettied up and with pseudo-“error” bars). The red dots show what their 2007 method would have yielded in February 2007, at the time of the IPCC WG1 conference.

[Figure hadcru28.gif: HadCRU smoothed series – blue: 2007 method including Jan–Feb 2008; black: current version without 2008 data; red: 2007 method as of February 2007]

They noticed the “smoothing error” like a dog on a bone when temperatures went down, but didn’t notice precisely the same “error” last year, when it yielded record high results. Looks like there are some pit bulls in England as well.

One minor curiosity, which some reader may be able to explain. I compared the HadCRU GLB annual series (column 2 in http://hadobs.metoffice.com/hadcrut3/diagnostics/global/nh+sh/annual) with the HadCRU GLB monthly series (column 2 in http://hadobs.metoffice.com/hadcrut3/diagnostics/global/nh+sh/monthly). I calculated annual averages from the monthly version and compared them to the archived annual version, with the result shown below. I then manually compared values for 1861: the monthly values are -0.811, -0.477, -0.491, -0.375, -0.765, -0.172, -0.308, -0.173, -0.379, -0.397, -0.410 and -0.191, with an average of -0.4124167, as compared to the archived annual value of -0.568. Perhaps this is explained somewhere; I didn’t see anything in the website explanations – if anyone sees an explanation, I’d be interested.
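
For anyone who wants to check, here is a sketch of the comparison in R; I’m assuming the first column of each file is the year (or year/month) label and the second column the anomaly, as described above, so treat the parsing as an assumption.

# Sketch: compare archived annual values with averages of the monthly values
ann.url <- "http://hadobs.metoffice.com/hadcrut3/diagnostics/global/nh+sh/annual"
mon.url <- "http://hadobs.metoffice.com/hadcrut3/diagnostics/global/nh+sh/monthly"
ann <- read.table(ann.url)
mon <- read.table(mon.url)

year <- as.numeric(substr(as.character(mon$V1), 1, 4))    # year part of the "YYYY/MM" label
ann.from.monthly <- tapply(mon$V2, year, mean)

comparison <- data.frame(year = ann$V1,
                         archived = ann$V2,
                         from.monthly = ann.from.monthly[as.character(ann$V1)])
comparison[comparison$year == 1861, ]     # archived -0.568 vs about -0.412 from the monthly average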

[Figure hadcru30.gif: archived HadCRU annual values compared with annual averages calculated from the monthly series]

Merely from looking at the monthly temperature histories, I urge readers not to draw any particular conclusions from a couple of cold months. The monthly history has many such cold downspikes and recoveries tend to be quite rapid. If the HadCRU results are an accurate history, one could just as easily look at this graph and argue that the most recent downspike was not as cold as corresponding downspikes in the 1980s. (Evaluating the HadCRU results is very difficult because their data as used is not disclosed.)

From my immediate view in Toronto, we still have banks of snow, which will still be here when April begins in two days. I certainly don’t recall such a situation during my adult life, so the present downspike seems a little unusual from a Toronto vantage point, though I recognize that this is only one perspective.

PC Weights on a Square Region

A few days ago, I showed some plots of the distribution of weights arising from principal components carried out on data from a region arranged as a line segment (think Chile). Today I’ve done a similar analysis for a square-shaped region, again assuming spatial autocorrelation governed by distance. In this case, I made a regular 10×10 grid (I was prepared to do a finer grid, but the results seem clear enough without going finer) and then did an SVD decomposition of the spatial autocorrelation matrix. I then made contour maps of the coefficients for each PC (the “eigenvectors”), which showed pleasing and interesting geometric symmetries, something that one doesn’t see discussed when PC analysis is applied abstractly to tree ring networks.
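
Here is a minimal R sketch of the calculation described above; the exponential decay rate in the correlation function is an arbitrary choice of mine, and the qualitative patterns don’t depend much on it.

# Sketch: SVD of a distance-based correlation matrix on a regular 10 x 10 grid
n <- 10
grid <- expand.grid(x = 1:n, y = 1:n)          # regular grid of sites
d <- as.matrix(dist(grid))                     # inter-site distances
C <- exp(-d / 3)                               # assumed exponential spatial autocorrelation
eig <- svd(C)

# Contour maps of the coefficients of the first four eigenvectors
par(mfrow = c(2, 2), mar = c(2, 2, 2, 1))
for (k in 1:4) contour(1:n, 1:n, matrix(eig$u[, k], n, n), main = paste("PC", k))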

Update: Hans Erren observed below that the patterns here are the well-known Chladni figures, known as eigenfunctions of the wave equation for a square plate. This seems to be a related but different derivation: since the wave equation and uniform spatial autocorrelation both generate Chladni figures, they must be doing the same thing at some level and references to such an explanation would be welcome. End update.


Tamino and the Magic Flute

Tamino has recently re-iterated the climate science incantation that Mann’s results have been “verified”. He has done so in the face of the fact that one MBH98 claim after another has been shown to be false. In some cases, the claim has not only been shown to be false, but there is convincing evidence that adverse results were known and not reported.

Today I’m going to look at what constitutes verification of a relationship between proxies and temperature, assessing MBH results in such a context, trying as much as possible to emphasize agreed facts.

L.A. Confidential

Minus Kim Basinger. http://wattsupwiththat.wordpress.com/2008/03/24/how-not-to-measure-temperature-part-54-los-angeles-the-city/

As a reader at Anthony’s blog notes, there’s an interesting history here.

Note that the temperature graphic linked at Anthony’s site is the version used by NOAA, rather than the version used by NASA. The differences are shown in the two graphs below. The NASA adjusted version increases past L.A. temperatures by an amount increasing linearly to about 1.2 deg C, while NOAA uses a version without NASA’s urban adjustment.

While NASA’s U.S. temperature history shows a warm 1930s, with 1934 vying with 1998 as the warmest year of the century, NOAA, under the direction of Tom Karl, makes no corresponding effort to adjust for urbanization.

Anthony, Atmoz and ourselves have recently discussed the ineffectiveness of the NOAA Filnet adjustment at detecting station changes (see Lampasas and other sites). Here we have another example of exactly how ineffective this algorithm is. Neither NOAA nor NASA picks up the re-location of the Los Angeles station (which in this case appears to result in a decrease in temperature). Both NOAA and NASA GISS failed to pick up and adjust for the re-location even though two NASA employees published an article on the relocation on the NASA website here: http://www.jpl.nasa.gov/news/features-print.cfm?feature=1273, stating:

The magnitude of change reflected in our study strongly suggests this relocation will bias long-term climatic studies.

So it’s not just Anthony Watts who’s observed that station re-locations can bias climatic studies. As we’ve observed previously, climate scientists seem much better at detecting discontinuities with downward biases (such as the most recent L.A. move) than ones with upward biases, such as Lampasas or the Y2K error.

[Figures losang72.gif, losang71.gif: the NOAA and NASA GISS versions of the Los Angeles temperature history]

PCs in a Linear Network with Homogeneous Spatial Autocorrelation

As I observed a couple of posts ago, the Stahle SWM network can be arranged so that its correlation matrix is closely approximated by a Toeplitz matrix, i.e. there is a “natural” linear order. I also noted results from matrix algebra proving that, under relevant conditions, all the coefficients in the PC1 are of the same sign; there is one sign change in the coefficients of the PC2, two in the PC3, and so on. These results yield contour maps for the eigenvector coefficients with very distinct geometries – here’s an example of the PC2 in the SWM network showing the strong N-S gradient in this PC.

As soon as one plots the site locations and contours the eigenvector coefficients, the raggedness of the original site geometry is demonstrated. But how much does this raggedness matter? This leads us away from matrix algebra into more complicated mathematics, as the eigenvectors which contour out to the pretty gradients in the earlier diagrams now become eigenfunctions with an integral operator replacing the simpler matrix multiplication.

As an exercise, I thought that it would be interesting to see what happened in an idealized situation in which the network was a line segment with spatial autocorrelation as a negative exponential function of distance between sites.

Because all the circumstances are so simple, one presumes that the resulting eigenfunctions are elementary, and I presume that there’s a simple derivation for them somewhere.

Since I don’t know these derivations, I went back to more or less brute force methods and did some calculations using N=51, N=101, …, assuming that the sites were uniformly spaced along the line segment, creating a correlation matrix for rho=0.8, carrying out SVD on the constructed correlation matrix, plotting the eigenvectors and experimenting with fitting elementary functions to them. As shown below, I got excellent fits (with some edge effect) for the following eigenfunctions:

V_k(t) = \sin(k \pi t), where k = 1, 2, … and t ranges over [0,1].

So it looks like the eigenfunctions are pretty simple. One can also see how the number of sign changes increases by 1 as k increases by 1.
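
For reference, here is a minimal sketch of the brute-force calculation just described, taking the correlation between sites i and j as rho^|i-j| with rho=0.8 (my reading of the construction):

# Sketch: eigenvectors of an AR(1)-type correlation matrix compared with sin(k*pi*t)
N <- 101
t.grid <- seq(0, 1, length = N)                # sites uniformly spaced on [0, 1]
C <- 0.8^abs(outer(1:N, 1:N, "-"))             # correlation as a negative exponential of distance
eig <- svd(C)

par(mfrow = c(2, 2), mar = c(3, 3, 2, 1))
for (k in 1:4) {
  plot(t.grid, eig$u[, k], type = "l", main = paste("eigenvector", k))
  s <- sin(k * pi * t.grid)                    # sign and scale of eigenvectors are arbitrary
  lines(t.grid, s * max(abs(eig$u[, k])), col = "red", lty = 2)
}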

[Figures stahle66.gif–stahle69.gif: eigenvectors of the constructed correlation matrix with fitted sine functions]

Even the plots in the network illustrated yesterday show elements of these idealizations. For example, here’s the PC1 from the combined network. I think that I can persuade myself that there are elements of the sin(k \pi t), k=1 shape from the idealized PC1 coefficient curve illustrated in the top right panel.

Next here’s the PC2 from the combined network. Again I think that I can persuade myself that there are elements of the sin(k \pi t), k=2 shape in this example.

Obviously the functions are elementary and I’m sure that there are any number of elegant derivations of the formulas. But I think that the results are at least a little bit pretty in the context of something as humdrum as the Stahle SWM network. As one goes from a 1-D to a 2-D situation, the geometry is somewhat more complicated, but we’re still going to see well-constrained relatively elementary functions making up the eigenfunctions for squarish and rectangular shaped regions.

Also, here’s a plot of the normalized eigenvalues for N=101. Again, I’m sure that there’s a known distribution for these eigenvalues somewhere in the mathematical literature and would welcome any references.

[Figure stahle70.gif: normalized eigenvalues for N=101]

More on Toeplitz Matrices and Tree Ring Networks

Yesterday’s results connecting eigenvector patterns in the Stahle SWM network to Toeplitz matrices and spatial autocorrelation were obviously pretty interesting. Needless to say, I was interested to test these ideas out on some other networks and see how they held up.

There is a large literature on spatial autocorrelation and there appear to be well-known mathematical approaches (which I’ll take a look at at some point). In a quick google, I didn’t notice any discussion that was exactly on point to the discussions here (other than my earlier post at CA, which ranked first in my search), but the need to discuss the relationship of tree ring chronologies and spatial autocorrelation seems pretty obvious, though the connection to eigenvectors may well not have been noticed in the tree ring literature.

Today I’ll report on applying this approach to the Stahle Texas-Oklahoma network also used in MBH98, the combination of the Stahle SWM earlywood network with the TX-OK network (for the purposes of a first-cut analysis, this was more tractable than carrying both earlywood and latewood series) and then the combination when 16 Stahle sites in the NOAMER network were added in. Exactly why the Stahle sites are strewn all over the place in MBH is a question you’d have to ask Mann. The pattern of spatial autocorrelation holds up well in the expanded networks. It’s not quite as strong as in the Stahle SWM series, but is statistically highly significant. As more networks are added in, the region becomes “squarer”: whereas there is a plausible “order” for the Stahle SWM sites, the larger networks are much more rectangular. But the patterns are highly geometric.