PLS does cherry-pick to some extent, by effectively flipping all signs as required to get a positive correlation, then forming an average of the predicted predictand across all predictors in the first stage, and then repeating this a few times on predictor and predictand residuals from each step.

PLS is not quite as cherry-pickish as it sounds, however, since the “weights” it uses just amount to univariate regression coefficients, and then it takes an unweighted average of these predictions to form a proxy summary at each stage. A further weighting of each proxy’s prediction by its simple regression R2 would give results similar to forward Recursive Least Squares, but this is not done.

I’ll take that back — after flipping signs so that all correlations are in effect positive, PLS will flatten the predictor series that don’t happen to correlate strongly with the predictand, and beef up those that do. Higher order terms will then perfect the shape.

This procedure could be ferreting out a subtle but valid relationship, or could just be generating spurious results with a flat handle. It would take an appropriate Monte Carlo simulation to determine whether the results are spurious or not.

]]>Given that there are many possible temperature histories consistent with a given borehole T profile, might it be possible to address the problem to some extent by re-measuring the T profile of the borehole (or, perhaps preferably, of another freshly drilled borehole nearby) some time later (say, 50 years later)?

Applying the inversion procedure to the two profiles yields two sets of possible temperature histories. Ignoring the last century (say), one can then exclude solutions not in the (approximate*) intersection of the two sets to obtain a smaller set of possible solutions.

*: approximate intersection: x in A and y in B belong to the approximate intersection of A and B if the difference between x and y, under some appropriate metric, is appropriately small.
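As a rough illustration of the approximate-intersection idea, here is a minimal sketch in plain Python. The RMS metric, the tolerance, the function names and the toy histories are all my own assumptions, and the candidate histories are assumed to be already truncated to exclude the last century.

```python
import math

def rms_distance(x, y):
    """Root-mean-square difference between two equally sampled histories."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

def approximate_intersection(set_a, set_b, tol, metric=rms_distance):
    """Keep members of each set that have a partner in the other set within tol."""
    keep_a = [x for x in set_a if any(metric(x, y) <= tol for y in set_b)]
    keep_b = [y for y in set_b if any(metric(x, y) <= tol for x in set_a)]
    return keep_a, keep_b

# Toy candidate histories: anomalies at four past epochs (made-up numbers).
A = [[0.0, 0.1, 0.2, 0.3], [0.5, 0.5, 0.5, 0.5], [-0.4, 0.0, 0.4, 0.8]]
B = [[0.0, 0.1, 0.25, 0.3], [1.0, 1.0, 1.0, 1.0]]
keep_a, keep_b = approximate_intersection(A, B, tol=0.05)
print(keep_a, keep_b)
```

Only the first history in each toy set survives, since those two differ by an RMS of 0.025. How much the solution set shrinks in practice would depend entirely on the metric and tolerance chosen.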

]]>It could be that there is something in the PLS methodology that generates Hockey Sticks. Or perhaps sea ice was really flat before 1900, and this is reflected in most of the proxies.

]]>I am stuck with the idea that the Kinnard ‘method’ was (at least partially) chosen to ensure the compelling ‘classes-withheld’ graphs in Figure S11 of the SI. Each withheld-class reconstruction seems to reproduce the hockey stick blade with very high fidelity, and misses elsewhere.

That seems highly improbable to me, that withholding each proxy class would have a negligible impact on the blade, unless they were initially weighted within class to ensure ‘separability’…

JMHO

RR

What would your “Comparison to Kinnard HS” look like if each d18O series were flipped before averaging so as to correlate positively with Sea Ice? This sign flipping is part of what PLS does. ]]>

I’ve been reading the 2001 article in Chemometrics and Intelligent Laboratory Systems (58:109-130) that Kinnard cites, and Kinnard’s summary of the method on p. 513 of his article seems to follow Wold et al.

PLS does cherry-pick to some extent, by effectively flipping all signs as required to get a positive correlation, then forming an average of the predicted predictand across all predictors in the first stage, and then repeating this a few times on predictor and predictand residuals from each step.

PLS is not quite as cherry-pickish as it sounds, however, since the “weights” it uses just amount to univariate regression coefficients, and then it takes an unweighted average of these predictions to form a proxy summary at each stage. A further weighting of each proxy’s prediction by its simple regression R2 would give results similar to forward Recursive Least Squares, but this is not done.
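To make the first stage concrete, here is a minimal sketch in plain Python of the description above: regress the predictand on each proxy separately, then take the unweighted average of the univariate predictions. It assumes every series is standardized first (my assumption, not something confirmed from Kinnard’s code), and the function names and toy proxies are made up for illustration. The later stages, which repeat this on residuals, are not shown.

```python
import random
import statistics as st

def standardize(series):
    """Center a series and scale it to unit sample standard deviation."""
    m, s = st.mean(series), st.stdev(series)
    return [(v - m) / s for v in series]

def slope(x, y):
    """Univariate least-squares slope of y on x (both assumed centered)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

def first_stage_summary(proxies, target):
    """Unweighted average of the univariate predictions of target from each proxy."""
    slopes = [slope(x, target) for x in proxies]
    summary = [sum(b * x[t] for b, x in zip(slopes, proxies)) / len(proxies)
               for t in range(len(target))]
    return slopes, summary

random.seed(0)
n = 50
signal = [random.gauss(0, 1) for _ in range(n)]
target = standardize(signal)
# Hypothetical proxies: positively related, negatively related, and pure noise.
proxies = [
    standardize([s + random.gauss(0, 0.5) for s in signal]),
    standardize([-s + random.gauss(0, 0.5) for s in signal]),
    standardize([random.gauss(0, 1) for _ in range(n)]),
]
slopes, summary = first_stage_summary(proxies, target)
print(slopes)
```

With standardized series each slope is just the sample correlation, so the negatively related proxy enters with a negative slope (the sign flip), and the noise proxy gets a slope near zero and contributes an almost flat series to the average.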

In order to evaluate the fit or “skill” of the model, it is necessary to do some sort of Monte Carlo tabulation of critical values for R2 (or Wold et al.’s Cross Validation Q2 statistic). I personally don’t like Fritts’s RE or CE statistics beloved by dendroclimatologists, and have more confidence in R2 (or Q2), but maybe that’s just my lack of understanding.

On p. 15 of the SI, Kinnard et al report that they do something along these lines, at least for the RE statistic:

Monte Carlo simulations were used to test the statistical significance of the reconstruction. The original proxies selected as predictors in each PLS model were replaced by nonsense predictors and the calibration and validation procedure repeated 300 times. The nonsense predictors were derived by randomizing the Fourier phases of the proxies, which resulted in surrogate proxies having equal mean, variance and power spectrum, but randomized phases, effectively destroying any phase relationships between the proxies and the ice extent predictand. The 95% and 99% significance thresholds on the RE statistic were derived from these Monte Carlo tests (Fig. S9).
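For readers unfamiliar with phase randomization, the following is a minimal sketch of how such a surrogate can be built. It is not Kinnard et al’s code: it uses a naive DFT in plain Python so that it runs without dependencies, and the function names and the toy proxy series are my own.

```python
import cmath
import math
import random

def dft(x):
    """Naive O(n^2) discrete Fourier transform (fine for short series)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def inverse_dft_real(coeffs):
    """Inverse DFT, returning the real part (imaginary part is ~0 by symmetry)."""
    n = len(coeffs)
    return [sum(coeffs[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]

def phase_randomized_surrogate(x, rng):
    """Surrogate with the same amplitude spectrum as x but random Fourier phases."""
    n = len(x)
    coeffs = dft(x)
    out = list(coeffs)                      # k = 0 (the mean) is left untouched
    for k in range(1, (n - 1) // 2 + 1):
        phase = rng.uniform(0.0, 2.0 * math.pi)
        out[k] = abs(coeffs[k]) * cmath.exp(1j * phase)
        out[n - k] = out[k].conjugate()     # conjugate symmetry keeps the series real
    return inverse_dft_real(out)            # for even n the Nyquist term is unchanged

rng = random.Random(1)
proxy = [math.sin(0.3 * t) + 0.1 * t for t in range(32)]   # made-up proxy series
surrogate = phase_randomized_surrogate(proxy, rng)
```

The surrogate keeps the mean and the amplitude of every Fourier component, hence the variance and power spectrum, while its timing relative to any other series is scrambled.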

Aside from the choice of RE over R2 or Q2, it would seem a lot more natural to me to hold the predictors (the proxies) fixed, and then to generate random predictand series (sea ice) under the null that the predictors have zero explanatory value. For this purpose it is merely necessary to model the predictand under the null, in terms of its autoregressive structure. What is relevant here is therefore not the behavior of the prediction residuals, but the behavior of the predictand itself without reference to the predictors. Then with each simulated predictand series, repeat their PLS procedure and compute the selected test statistic.
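A minimal sketch of the simulation I have in mind, in plain Python. Several things here are my own simplifying assumptions: the predictand is modeled as AR(1), the “procedure” is a one-stage stand-in (sum of univariate predictions) rather than their full PLS with cross-validation, the statistic is R2, and all names and toy series are hypothetical.

```python
import math
import random
import statistics as st

def lag1_autocorr(y):
    """Sample lag-1 autocorrelation, used as the AR(1) coefficient."""
    m = st.mean(y)
    d = [v - m for v in y]
    return sum(a * b for a, b in zip(d[:-1], d[1:])) / sum(a * a for a in d)

def simulate_ar1(n, phi, rng, burn=100):
    """Simulate n values of a zero-mean AR(1) process (unit innovations)."""
    v, out = 0.0, []
    for t in range(n + burn):
        v = phi * v + rng.gauss(0.0, 1.0)
        if t >= burn:
            out.append(v)
    return out

def corr(x, y):
    """Sample correlation between two series."""
    mx, my = st.mean(x), st.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fit_statistic(predictors, y):
    """Stand-in for the full procedure: R2 of y against a one-stage summary."""
    my = st.mean(y)
    yc = [v - my for v in y]
    summary = [0.0] * len(y)
    for x in predictors:
        mx = st.mean(x)
        xc = [v - mx for v in x]
        b = sum(a * c for a, c in zip(xc, yc)) / sum(a * a for a in xc)
        summary = [s + b * a for s, a in zip(summary, xc)]
    return corr(summary, y) ** 2

def null_critical_value(predictors, y, n_sim=300, level=0.95, seed=0):
    """Empirical critical value of the statistic with predictors held fixed."""
    rng = random.Random(seed)
    phi = lag1_autocorr(y)
    stats = sorted(fit_statistic(predictors, simulate_ar1(len(y), phi, rng))
                   for _ in range(n_sim))
    return stats[min(n_sim - 1, round(level * n_sim))]

rng = random.Random(42)
n = 60
predictors = [simulate_ar1(n, 0.7, rng) for _ in range(5)]   # made-up "proxies"
y = simulate_ar1(n, 0.7, rng)                                # made-up "sea ice"
crit = null_critical_value(predictors, y)
observed = fit_statistic(predictors, y)
print(observed, crit)
```

R2 does not depend on the scale of the simulated predictand, so unit-variance innovations suffice here; a statistic like RE would need the simulated series matched to the predictand’s mean and variance as well.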

Steve has already discussed PLS some at

https://climateaudit.org/2006/06/13/mbh-and-partial-least-squares/

and

https://climateaudit.org/2006/06/17/more-on-burger-et-al-2006/

Unfortunately, some symbols (notably u-umlaut in Burger’s name plus Burger’s coauthor’s name) got garbled in the conversion to WordPress, but it’s a start.

Steve, way off topic so clip if you want but Chapter 4 zero order draft is available here:

http://noconsensus.wordpress.com/2011/12/10/cracking-at-the-seams/

With regard to “all the other data we have”, reconstructions are not compared to “data” but to other reconstructions which can often utilize much of the same data and similar methodology so the comparisons are not quite independent. The quantification of the “meshing” is not always specified with the evaluation being a simple offhand remark and differences between the reconstructions rarely addressed in a scientific manner. Reconstructions which do not share the same features are often discounted as being “incorrect” without sufficient justification.

It’s not a simple case of just looking at the “observations”.

]]>While the article is paywalled, the supplementary materials are not. *Nature*

The supplementary materials indicate EOF, PCS, and PCR analysis was performed. Wouldn’t the reconstruction then be a linear combination of the resulting Empirical Orthogonal Functions – not a linear combination of the proxy data itself?

The last 140 years of data isn’t dependent on proxies – we have direct observations. Doesn’t this reconstruction (say from 500AD to 1870AD) mesh with all the other data we have? I.e., foraminifera from Arctic sea bed cores, bowhead whale fossils, plankton studies, etc.

What would be surprising is if it contradicted previous studies, rather than confirming them.

]]>