Juckes and the Euro Team spent a lot of time on the topic of MM normalization, stating as follows (continuing the academic check kiting initiated by claims made in Wahl and Ammann (Clim Chq 2006) using the rejected Ammann and Wahl (GRL 2006)):

Wahl and Ammann (2006) ascribe the difference between MM2005 and MBH1998 to another apparent error by McIntyre and McKitrick: the omission of the normalisation of proxies prior to the calculation of proxy principal components.

Juckes, in his usual style, raised this issue again on the blog as follows (in the same message as his claim about failing to “disclose” “machine-specific” code):

re #29: Using your code, I can show that the sensitivity you describe only exists when you use your own “arbitrary” normalisation for the calculation of the proxy PCs (ranging from a standard deviation of 0.0432 for wy023x to 0.581 for nm025). Why do you use this normalisation? Is the effective elimination of much of the data intentional?

Remarkably enough, this very issue with the specific series wy023x has already been discussed in print (Huybers Comment and Reply to Huybers). My published corpus of work is not large – indeed, I receive many criticisms for it not being larger than it is. Given that we’re talking about only 5 papers and given Juckes’ preoccupation with matters MM, you’d think that Juckes would have familiarized himself with prior discussion of this issue before making various attacks, both here and in his submission to Climate of the Past.

The discussion of the density series wy023x was initiated by Huybers as follows:

NOAMER records are standardized chronologies [Cook and Kairiukstis, 1990], reported as fractional changes from mean tree ring width or maximum ring density after correcting for the effects of increasing tree age. The variance of the chronology is a function of both environmental variability and the trees’ sensitivity to the environment. Sensitivity depends on factors such as species, soil, local topography, tree age, location within a forest, and what quantity is being measured [Fritts, 1976]. The most striking example of varying sensitivity is that the two NOAMER chronologies indicating changes in tree ring density (co509x and wy023x) have variances roughly thirty times smaller than the other chronologies indicating changes in tree ring width.

From this, Huybers concluded that the most informative method of applying PCs to tree ring networks would be dividing the chronologies as standardized in the ITRDB data bank by their standard deviations, while elsewhere noting that it might make the most sense simply to take the mean of the ring widths. Note Huybers’ recognition that ITRDB chronologies are already *standardized* – although not converted to unit standard deviation. ITRDB chronologies are in dimensionless units with a common mean of 1.

In our Reply to Huybers, we made it clear that we were not advocating covariance PCs (or PCs at all) as a method of extracting information from the ragbag North American tree ring chronologies, but had simply (and logically) applied covariance PCs as the most logical implementation of the stated MBH98 method (“conventional” principal components).

We re-emphasize that our comparison between the MBH98 method and a covariance PC1 was not presented as an attempt to “remove the bias in MBH98’s method”, and that we take no position on the relative merits of using a mean, a covariance PC1, or even using PC analysis at all, in paleoclimate work. The onus for demonstrating validity of a statistic as a temperature proxy rests entirely with its advocate. Any valid climate reconstruction should not depend on whether a correlation matrix or covariance matrix is used in tree ring PC analysis. No variation on PC methodology can overcome the problems of using bristlecones as a temperature proxy.

In our Reply to Huybers (and elsewhere), we took the view (see citations in Reply to Huybers: Rencher 1992, 1995; Overland and Preisendorfer 1982; also North et al 1982; quotations here) that tree ring networks were already in common dimensionless units and that, under such circumstances, statistical authorities recommended the use of principal components on a covariance matrix. Jean S, who is knowledgeable in such matters, has endorsed this. Neither Juckes nor any other Team member has ever given a single countervailing statistical authority; they merely assert that use of a correlation matrix is the “correct” methodology.

However, regardless of whether correlation or covariance PCs are the “correct” methodology, in our Reply to Huybers, we provided compelling reasons why the decision in the case of the AD1400 North American network should not be based on the standard deviation of wy023x. In this network, there were only 2 density series and 68 ring width series. Both density series were also represented by ring width series at the same site (and in the case of co509x, the site additionally had quadruplicate use in the SWM network). We stated:

One of Huybers’ [2005] principal justifications for proposing a correlation PC1 is his observation that the covariance PC1 underweights density series, which have lower variances. But in the MBH98 network, only 2 of 70 series are density series, and both are from sites also represented in the same network with a ring width series. Indeed, the Spruce Canyon site (density series co509x and ring width series co509w) also occurs in 4 series in the MBH98 Stahle/SWM network. Accommodating these 2 density series should not be at the expense of the most appropriate treatment for the other 68.

So I would submit that Juckes’ question is asked and answered and that this was an issue that Juckes should have been familiar with before submitting his article to Climate of the Past (which does not cite either the Huybers Comment or our Reply, although both papers deal at considerable length with issues of “normalization” of the North American network).

In addition, this very issue was specifically visited by the NAS panel (also not cited by Juckes et al), which stated, referring to Huybers 2005:

Huybers (2005), commenting on McIntyre and McKitrick (2005a), points out that normalization also affects results, a point that is reinforced by McIntyre and McKitrick (2005b) in their response to Huybers. Principal components calculations are often carried out on a correlation matrix obtained by normalizing each variable by its sample standard deviation. Variables in different physical units clearly require some kind of normalization to bring them to a common scale, but even variables that are physically equivalent or normalized to a common scale may have widely different variances. Huybers comments on tree ring densities, which have much lower variances than widths, even after conversion to dimensionless “standardized” form. In this case, an argument can be made for using the variables without further normalization. However, the higher-variance variables tend to make correspondingly higher contributions to the principal components, so the decision whether to equalize variances or not should be based on the scientific considerations of the climate information represented in each of the proxies.
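The panel's remark that higher-variance variables make correspondingly higher contributions can be illustrated with a small sketch. Everything here is an assumption for illustration: the data are synthetic (not the MBH98 North American network), `pc1_loadings` is a hypothetical helper, and the two low-variance columns merely mimic series whose variance is roughly thirty times smaller than the rest.

```python
import numpy as np

def pc1_loadings(data, use_correlation=False):
    """Unit-length PC1 loading vector of a (time x series) array (hypothetical helper)."""
    centered = data - data.mean(axis=0)
    if use_correlation:
        # Dividing by the sample standard deviation turns covariance PCA into correlation PCA.
        centered = centered / centered.std(axis=0, ddof=1)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    v = eigvecs[:, -1]                      # eigenvector of the largest eigenvalue
    return v * np.sign(v.sum())             # resolve the arbitrary sign

rng = np.random.default_rng(0)
n_years, n_series = 581, 10

# Synthetic network: every series carries the same shared signal plus its own noise.
signal = rng.standard_normal(n_years)
base = signal[:, None] + rng.standard_normal((n_years, n_series))

# Eight "width-like" series and two "density-like" series with ~30x smaller variance.
scales = np.full(n_series, 0.3)
scales[:2] = 0.3 / np.sqrt(30)
network = 1.0 + base * scales               # common mean of 1, as for standardized chronologies

cov_load = pc1_loadings(network)
cor_load = pc1_loadings(network, use_correlation=True)

print("covariance PC1 loadings: ", np.round(cov_load, 3))
print("correlation PC1 loadings:", np.round(cor_load, 3))
```

In this toy network the two low-variance columns receive much smaller weights in the covariance PC1 than the other eight, while the correlation PC1 weights all ten about equally. The sketch says nothing about which weighting is climatologically appropriate; that is the scientific judgment the panel refers to.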

In our Reply to Huybers, we had stated that the “onus for demonstrating validity of a statistic as a temperature proxy rests entirely with its advocate” – a position that is surely reflected in the conclusion of the NAS panel. For Juckes et al to assert that a calculation demonstrating the impact of a covariance matrix is an “error” is pretty insolent.

Another point – it’s pretty annoying to listen to Juckes (following Wahl and Ammann) falsely accuse us of “omitting” consideration of the impact of dividing North American chronologies by their standard deviation. In MM05(EE) we stated:

If the data are transformed as in MBH98, but the principal components are calculated on the covariance matrix, rather than directly on the de-centered data, the results move about halfway from MBH to MM. If the data are not transformed (MM), but the principal components are calculated on the correlation matrix rather than the covariance matrix, the results move part way from MM to MBH, with bristlecone pine data moving up from the PC4 to influence the PC2. In no case other than MBH98 do the bristlecone series influence PC1, ruling out their interpretation as the “dominant component of variance” [Mann et al, 2004b]

Now these sentences are expressed in terms of correlation and covariance matrices, but PC with a correlation matrix is identical to PC using a covariance matrix on series standardized by dividing by their standard deviation. PC using the covariance matrix of the Mannian network is relatively close to PC on the correlation matrix (up to the difference between the standard deviation over 1400-1980 and 1902-1980). Huybers acknowledged the first point. Juckes et al (and Wahl and Ammann) should have been familiar with both points. In retrospect, I’d rephrase the last sentence quoted above a little – the bristlecones do not “dominate” the correlation PC1, but they do “influence” it, a point that was discussed in the Reply to Huybers. In our Reply to Huybers, we illustrated the correlation PC1 and various other permutations and combinations as follows:

This figure clearly shows not just the covariance PC1, but the correlation PC1 – indeed, even a correlation PC1 standardized with an autocorrelation-consistent standard deviation. Our Reply to Huybers contains a detailed analysis of issues pertaining to correlation and covariance PCs. For Huybers (or Wahl and Ammann) to assert that we “omitted” a discussion of “normalized” chronologies can only mean that they are somehow unaware that correlation PCs are the same as PCs from chronologies divided by their standard deviation. Having failed to cite our prior discussion of correlation PCs, they then have the gall to allege that we committed an “error” by omitting a discussion equivalent to discussing correlation PCs. They make it worse by omitting to consider the NAS panel’s specific consideration of the topic.
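The equivalence between correlation PCs and covariance PCs of chronologies divided by their standard deviation is easy to check numerically. The following is a minimal sketch on synthetic data (the array `x` and its column scales are made up for illustration and are not any archived network):

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic "network": 200 time steps x 6 series with unequal variances and a mean near 1.
x = rng.standard_normal((200, 6)) @ rng.standard_normal((6, 6))
x = x * np.array([0.05, 0.1, 0.2, 0.3, 0.4, 0.6]) + 1.0

# Route 1: correlation matrix of the raw series.
corr = np.corrcoef(x, rowvar=False)

# Route 2: divide each series by its sample standard deviation, then take the covariance.
scaled = x / x.std(axis=0, ddof=1)
cov_of_scaled = np.cov(scaled, rowvar=False)

same_matrix = np.allclose(corr, cov_of_scaled)
same_spectrum = np.allclose(np.linalg.eigvalsh(corr), np.linalg.eigvalsh(cov_of_scaled))

print("matrices identical:   ", same_matrix)
print("eigenvalues identical:", same_spectrum)
```

Because the two matrices are the same, their eigenvectors and eigenvalues are the same, so any discussion of correlation PCs is also a discussion of PCs from "normalized" chronologies.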

Now going back to Juckes’ question – the way that he framed the question is very odd. Juckes accuses me of using

your own “arbitrary” normalisation for the calculation of the proxy PCs (ranging from a standard deviation of 0.0432 for wy023x to 0.581 for nm025). Why do you use this normalisation? Is the effective elimination of much of the data intentional?

The way that this question is expressed suggests, among other things, that Juckes is not using a principal components algorithm to do principal components, but uses svd on data matrices (an observation confirmed by inspecting the archived series). As we stated clearly and illustrated, the results follow directly from a conventional principal components analysis on a covariance matrix.
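To make the distinction concrete: an SVD of a data matrix reproduces a covariance PCA only if the columns are centered and left unscaled, and it reproduces a correlation PCA if the columns are also divided by their standard deviations. So with an SVD, the "normalisation" is whatever is done to the data matrix beforehand. The sketch below uses synthetic data and makes no claim about what any particular archived script does.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic (time x series) matrix with unequal column variances and a common mean of 1.
data = 1.0 + rng.standard_normal((300, 5)) * np.array([0.05, 0.2, 0.3, 0.4, 0.6])
n = data.shape[0]
centered = data - data.mean(axis=0)

# Conventional PCA: eigenvalues of the covariance matrix, largest first.
cov_eigvals = np.linalg.eigvalsh(np.cov(data, rowvar=False))[::-1]

# SVD of the centered (unscaled) data: squared singular values / (n - 1) give the same variances.
sv_centered = np.linalg.svd(centered, compute_uv=False)
var_from_svd = sv_centered**2 / (n - 1)

# SVD of centered data divided by column standard deviations: equivalent to correlation PCA.
scaled = centered / centered.std(axis=0, ddof=1)
sv_scaled = np.linalg.svd(scaled, compute_uv=False)
var_from_scaled_svd = sv_scaled**2 / (n - 1)
corr_eigvals = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]

print("covariance PCA  == SVD of centered data:       ", np.allclose(cov_eigvals, var_from_svd))
print("correlation PCA == SVD of centered+scaled data:", np.allclose(corr_eigvals, var_from_scaled_svd))
```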

Juckes’ use of the terminology “effective elimination of much of the data” is simply realclimate rhetoric. All that happens is downweighting of bristlecones – for which realclimate and Juckes use the code word “much of the data” – they never use the word “bristlecone” in this context. In an unguarded moment, Wahl and Ammann 2006 admitted that the downweighting from use of covariance PCs is precisely equivalent to bristlecone downweighting – I’ve got a pretty graphic in my Stockholm/Holland presentation illustrating this.

At the time of our 2003 paper where the differences first emerged, we had no idea that it was the bristlecones that caused the problem. We used covariance PCs because that is “conventional” PC methodology in a network denominated in common units and that’s what Mann said that he did. Obviously he did something different than what he said he did, but that’s a long and different story. We only became aware of the role of bristlecones by tracking what Mann’s biased methodology did – gradually realizing that all the series in what Mann described as the “dominant component of variance” came from Graybill and Idso’s strip-bark bristlecone and foxtail network. Later we realized that Mann’s CENSORED directory contained a sensitivity study of what happened without the bristlecones and foxtails – so he realized what was going on long before we did.

A covariance PC emphasizes different data than the bristlecones emphasized by Mann’s data mining method. We do not argue that this index is a plausible temperature indicator – the obligation to do so rests with the proponents of PC analyses as a means of extracting temperature from tree ring networks. We can categorically say that we did not select a PC methodology with a view to downweighting bristlecones; it was only through patient detective work that we learned of the effect of bristlecones.

**Update:** For convenience, here are relevant quotations from statistical authorities on the selection of PC methodologies (keep in mind that tree ring networks are already in common dimensionless units through standardization):

Preisendorfer is cited in MBH98 as an authority for principal components. In Overland and Preisendorfer [1982], Preisendorfer stated:

“In representing the variance of large data sets, the covariance matrix is preferred” (p.4)

I have not found a comparably explicit statement in Preisendorfer [1988]. However, this text nearly always talks in terms of “covariance matrices”, rather than “correlation matrices”, such as the following quote:

The first step in the PCA of [data set] Z is to center the values z[ t,x] on their averages over the t series… If Z… is not rendered into t-centered form, then the result is analogous to non-centered covariance matrices and is denoted by S’. The statistical, physical and geometric properties of S’ and S [the covariance matrix] are quite distinct. PCA, by definition, works with variances i.e. squared anomalies about a mean. (p. 26)
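The difference between the centered covariance matrix S and its non-centered analogue S’ can be sketched numerically. The data are synthetic, with a mean of 1 chosen to resemble standardized chronologies; the variable names are illustrative assumptions, not Preisendorfer's notation beyond S and S’.

```python
import numpy as np

rng = np.random.default_rng(3)
n_years, n_series = 400, 4

# Synthetic series fluctuating around a common mean of 1.
z = 1.0 + 0.2 * rng.standard_normal((n_years, n_series))
mean = z.mean(axis=0)

# S: covariance matrix of the t-centered data (squared anomalies about the mean).
s_centered = (z - mean).T @ (z - mean) / (n_years - 1)

# S': the same product formed without centering.
s_noncentered = z.T @ z / (n_years - 1)

# The two differ by an outer product of the column means.
mean_term = n_years / (n_years - 1) * np.outer(mean, mean)
identity_holds = np.allclose(s_noncentered, s_centered + mean_term)

top_centered = np.linalg.eigvalsh(s_centered)[-1]
top_noncentered = np.linalg.eigvalsh(s_noncentered)[-1]

print("S' = S + n/(n-1) * mean mean^T:", identity_holds)
print("largest eigenvalue of S: ", round(top_centered, 4))
print("largest eigenvalue of S':", round(top_noncentered, 4))
```

In this example the leading eigenvalue of S’ is dominated by the mean term rather than by variance about the mean, which is one way to see why the properties of S’ and S are "quite distinct".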

Preisendorfer [1988] never suggested a change from the position of Overland and Preisendorfer [1982], which is cited approvingly on several occasions in Preisendorfer [1988]. The standard examples in Preisendorfer [1988] are data sets (e.g. sea surface temperature, sea level pressure defined over regions), which are **not** scaled to unit variance. While Preisendorfer explicitly calls for centering over the time average, he NEVER calls for scaling to unit variance on these data sets which are in common units but have differing variances. Preisendorfer discusses the covariance matrix on virtually every page, but only mentions a correlation matrix on a couple of occasions. The only occasion in which Preisendorfer calls for scaling to unit variance is when a data set is a composite of two data sets denominated in different units, e.g. sea surface temperature in one region and sea level pressure in another – which is not the case in the dataset under discussion here as discussed above.

Rencher [1995] is another prominent statistical authority (cited by Huybers), who also stands as authority for use of covariance matrices as follows:

Generally extracting principal components from S [covariance] rather than R [correlation] remains closer to the spirit and intent of principal component analysis, especially if the components are to be used in further computations. (p. 430)

This follows almost verbatim a similar comment in Rencher [American Statistician 1992, 46, p 221], which said:

“For many applications, it is more in keeping with the spirit and intent of this procedure to extract principal components from the covariance matrix S rather than the correlation matrix R, especially if they are destined for use as input to other analyses. However, we may wish to use R in cases where the measurement units are not commensurate or the variances otherwise differ widely.”

The only exception contemplated by Rencher [1992] is a circumstance in which a few series measured in different units have much larger variance than the bulk of the data, because it would dominate the analysis:

“When one variable has a much larger variance than the other variables, this variable will dominate the first component”.

This is the exact opposite situation to MBH98, where the exception is two density series that have much smaller variances, and moreover are from sites already represented by width series in the major partition of the dataset.

I have been unable to locate any third-party statistical authority that recommends use of correlation matrices in networks denominated in common units. I’m not saying that such a reference doesn’t exist anywhere, but the challenge to produce one has been outstanding for some time and so far no one has located one.

## 53 Comments

Steve, the last half of the last paragraph repeats the last indented quote from MM05(EE). Also, this question from Martin Juckes, “Why do you use this normalisation?” is repeated in the paragraph under your Figure 1, and the enquote from MJ isn’t closed. The second “Reply to Huybers” link is missing the “/pdf/” affix. The link to Huybers in, “wy023x was initiated by Huybers as follows” doesn’t work. (You must have typed all that while being called to dinner). Also, I’m suddenly not seeing a text preview (using Firefox 1.5.08).

More to the point, though, for Martin Juckes to have missed your analysis of normalization methods, especially in the reply to Huybers, while at the same time making a strong criticism of them requires one of the following to be true:

1. He didn’t know about the analyses and criticised the method from ignorance, evidencing carelessness.

2. He knew about the analyses but didn’t read them or cite them, yet still felt competent to criticise, evidencing foolishness.

3. He read them but didn’t understand them or cite them, yet still felt competent to criticise, evidencing incompetence.

4. He read them and understood them but didn’t cite them and criticised them knowing the critique was virtually meritless, evidencing malignancy.

Had you read the Huybers analyses, Prof. Juckes?

Over and over again, this pattern emerges, of criticisms by the HT that do not come to grips with the actual analytical thrust of the M&M work. The description of pathological science seems more and more to fit. E.g., “statistically marginal phenomena … with a low signal-to-noise ratio, are easy to misinterpret..” One might add, given the analyses in ClimateAudit, ‘especially when they are an autocorrelated time-series.’

Here’s an even more telling symptom of pathology from Turro’s article: “The red flag of pathology should thus appear any time a researcher offers resistance to the challenge of reproducibility, claiming that only a certain special system (or even certain investigators) can generate the anomalous result.” Doesn’t *that* resonate! Only HT members can generate hockey sticks because only they insist on back-dooring bcps and substituted Yamals. And they withhold both data and methods so as to actively obstruct attempts at reproduction. And if Hughes’ 2002 Sheep Mountain update really does invalidate the previous (and continued?) uses of the older series, as you surmise, Steve (see post 67 here), then they are suppressing new data so as to avoid public (but not private) falsification.

This last sort of behavior, if truly present, goes beyond the description of pathology that includes, “These are cases where there is no dishonesty involved, but where people are tricked into false results by the lack of understanding about what human beings can do to themselves in the way of being led astray by subjective effects, wishful thinking, or threshold interactions.” Where the suppression and obstruction is active and conscious, there *is* dishonesty involved. The line beyond pathology is reached, and we are in the arena of fraud, which in science is the equivalent of criminal behavior.

Here’s a dumb question which may or may not have been answered before somewhere on this blog: my impression is that when the Hockey Team temperature calibrated the proxy data, they were using global or NH temperature data. Is this true? If so, wouldn’t it have been more natural to use regional temperature data fitted individually to each proxy? I mean, one of the major claims of the HT is that despite great regional temperature variation over the 1000 year period, it all magically cancels out. For this to be true one should think that a similar regional variation should be observed in the temperature data, and if so this mandates an individual, regional correlation.

Professor Juckes’ behavior seems to me an indication of just how hard the shattering of the Hockey Stick has hit some quarters. Despite denial that it happened or claims that it doesn’t matter anyway, word has leaked out to mainstream climate scientists. Most of the mainstream group are not, as far as I can reckon, Global Warming Fundamentalists and the Warmers are worried.

Pat, for some reason, my final edits on the post didn’t take and some quotes that I had on hand got stuck at the end leading to a rather disjointed conclusion. Thanks for drawing this to my attention, I’ve edited this and tidied it up.

Steve, I’m curious if rotation was done at any time by MBH, as this is common in PCA and FA.

Dear Onar,

Incredibly they went for the magical solution. From Wahl and Ammann (2006), page 36 which Steve cited here:

Yes, my jaw dropped when I read that as well.

John,

one should think bristlecones are a special radio receiver that picks up a global climate signal. But to be on the safe side: has anyone actually looked at the regional temperature (and possibly precipitation) data for the bristlecones to see if they match what the bristlecones are doing? I mean, there must be a reason why Hughes refers to them as a “mystery.” To me that sounds like an admission of voodoo science.

Steve #6, that is unbelievable. It is so unscientific and so illogical, it makes you wonder how a person could actually put that on paper without some part of their brain going “wait, you can’t say that.”

So it is clear that bristlecones gave the biggest best signals that eliminated the 1400 – 1499 part of the MWP and showed the blade the best on the hockey stick.

In the charts above, Figure a) includes bristlecones and Figure b) excludes them. What do the bristlecones look like on their own? Shouldn’t there be a Figure b1) bristlecones only? And then it would be good to compare that to actual temperature measurements of 1880 onward. I’m sure it has been done before but I haven’t seen it and it would sure add to points being made in this commentary.

IOW, this way we don’t have to care about spatial sampling error. And we can quite freely overfit using CVM, INVR and other related methods (those that are keen to underestimate past temperature variations).

Onar, yes, I looked at the correlations of both foxtails and bristlecones to local temperature – it is negligible as Graybill observed years ago. They also have negligible relation to local precipitation – see our Reply to VZ. Christy recently produced detailed station data for this data and the result holds even more strongly at a detail level.

Osborn and Briffa 2006 claimed a correlation of Esper’s foxtails to gridcell temperature of 0.18 or so, using HadCRU data. I was unable to verify this – I got 0.04 or something like that. See correspondence with Science. I asked Science to make Briffa produce the temperature data that he used. He used a different temperature data set than the one he said that he used – one whose values commenced in 1888 rather than 1870. He said that the HadCRU gridcells were no good before 1888. I asked what station data were used in the CRUtem version and which in the HadCRU version, but Briffa refused to provide the information.

Lamarche and Fritts in their earlier works posited that bristlecones responded to N-S movements in the weather systems. There’s a pretty good discussion in Fritts’ text. But they were puzzled by the growth pulse and were the originators of the CO2 fertilization hypothesis in Lamarche, Fritts et al 1984, the precursor to Graybill and Idso 1993. Hughes was a coauthor of Biondi et al 1999 which is worth re-reading. Biondi was a NAS panelist.

Onar:

I’ve tried, but they still look like trees to me. If there was ever an expression of “voodoo science” then the mysterious abilities of bristlecone pines to respond to global temperature without any response to local temperature or precipitation would be it.

Except that Bob Park, who wrote a book on stuff like this, thinks this is all perfectly fine for some bizarre reason.

Remember Juckes’ claim that the code for our GRL article was not available. In a series of ungracious retractions, Juckes said that he meant to say our EE article. This claim was still false. Juckes then blamed a stale webpage for his supposed inability to locate the code.

Ironically, when I was looking up what Wahl and Ammann said about PC standardization – upon which Juckes relies even though it is check kiting – I found a quotation from Wahl and Ammann 2006 which specifically refers to the MM05 (EE) SI as follows:

Obviously I don’t agree with these particular comments, but they stand as evidence that the North American Team had no difficulty locating our code.

Now here’s something else. I just took a look at the original Wahl and Ammann version submitted in May 2005 to Climatic Change. It contains the following identical sentence:

So despite my apologies on the matter, it looks like I actually did archive MM05b source code on a timely basis since Wahl and Ammann refer to the SI in May 2005. As to my response to Juckes’ inquiry, it looks like I wanted to annotate the archive to comment on the further information on MBH methodology that had emerged – partly through the Wahl and Ammann code and partly through the archiving in response to the House Energy and Commerce Committee. Unfortunately, I updated the script in March 2006 without leaving the original version in place. Maybe Wahl and Ammann have a copy of what I originally archived. In any event, it looks like I actually did archive the code promptly in a place where Wahl and Ammann were able to locate and cite it in May 2005.

Ok, so this post is not really on topic, but might elevate the quality of the posting. Firefox 2.0 is available and it has a built in spell checker that works when posting to a site like this.

Maybe Steve wants to attend this workshop?


Many of the comments in this thread put forward and well articulate the lack of the rather simple explanations and evidence that are rather obviously required to bolster the claims made of the temperature proxy reconstructions. I think about these items each time there is a rather comprehensive discussion of the HS and progeny reconstructions and then proceed to go through in my mind the litany of motivations and explanations for defending the participating scientists’ actions.

I suppose the motivations evolve from the original intent to put together a statistically consistent and verifiable proxy temperature reconstruction that was sufficiently comprehensive to make conclusions about past global temperatures that could be useful in predicting future global temperatures and used to calibrate and/or verify climate models. The motivations here would be strictly based upon a better understanding of the involved science. There had to be some major disappointments in the initial analysis of the data and the limits of the methodology in obtaining any clear cut conclusions. I think, for a more complete understanding of what transpired next, one almost has to admit to a mix of these scientific motivations with those personal motivations of many of the participating scientists to be able to make definite conclusions for advocating policies with regards to AGW. I am not informed of the Mannian approach prior to the hockey stick, but I find the published articles of modern Mann entirely void of any second thoughts, precautionary statements or countervailing arguments about his “scientific” conclusions that are invariably reinforcing the theory of AGW.

Under the pressures of publishing their work (proxy temperature reconstruction workers) and being able to do it in an environment with a large AGW consensus, I would guess that data snooping was the order of the day and rationalized by way of AGW being already established and that new evidence would/should support and reinforce previous evidence.

After MM, NAS and Wegman, one has to seriously question much of the remaining field of climatology for not articulating what legitimately needs to be done to support what would appear to skeptics and neutrals to be premature conclusions. I keep looking and failing to find a more nuanced explanation for this lack of effort other than that most scientists in the field consider AGW as settled as the theory of relativity and that, while the first attempts at temperature reconstructions may have failed the statistical acid tests, they evidently feel a need to doggedly defend the results for policy reasons until better data and valid statistical reconstructions can “scientifically” prove AGW.

There are a number of major and obvious weaknesses in these reconstructions that have been pointed to by MM, NAS and Wegman and on a continuing basis at CA. Have all these weaknesses ever been listed in one place here at CA with constructive suggestions on how they could be strengthened? How can one, and particularly one in the field of climatology, for example, seriously accept the concept of temperature tele-connections without some attempt at explaining how it might operate? Combine the foregoing problem with the diminishing values of R^2 in the verification process of temperature reconstructions or not even bothering with the process and how can serious climate scientists *not* consider these problems reconstruction killers?

Re #1: Very interesting, but how about answering the question posed? McIntyre and McKitrick (2005) commented, as you quote, on the impact of this choice on the first PC of the North American tree ring network. They do not comment on the impact on the reconstruction. In particular, they do not point out that the main cause of difference between their reconstruction and that of Mann et al. (1998) is due to the change in normalisation. This issue is not addressed in the GRL reply to Huybers either.

The reply to Huybers also fails to address the effective elimination of most of the data by this change in normalisation. It is implied that it only affects 2 series (“Accommodating these 2 density series should not be at the expense of the most appropriate treatment for the other 68”), but this is untrue. The two series highlighted by Huybers are extreme examples, but many more are affected.

In the words of the NAS panel, could you say what “the scientific considerations of the climate information represented in each of the proxies” are for your choice of normalisation of the proxies going into the North American tree ring network?

The other as yet unanswered question concerns why the importance of this choice of normalisation was not discussed in McIntyre and McKitrick (2005, Energy and Environment), given the dependence of the results on this choice.

The reason I did not use the word Bristlecone when asking about your “effective elimination of much of the data” in McIntyre and McKitrick (2005, Energy and Environment) is because I was not talking about the Bristlecones. The sensitivity of your results to exclusion of the Bristlecones depends on the prior elimination of other data through an arbitrary (or, possibly, not arbitrary but yet to be disclosed?) choice of normalisation.

Finally, the GRL reply to Huybers says “Bristlecone sites are a well-known examples of CO2 fertilization”. This is inaccurate: there has been much speculation, but no demonstration of CO2 fertilization at these sites.

RE #16: Dr Juckes. I have been hanging here, waiting for you to respond to all of the destructive attacks on your integrity. We concerned folk NEED you to rise to the game, respond to the scurrilous attacks, and SHOW the world why there is a serious problem that we MUST get to grips with.

RE #17: Forget it, Concerned of Berkely. The good doctor is not about to expose his brilliance to the unwashed deniers. It’s all about head games – not science.

In the words of the NAS panel, could you say what the “the scientific considerations of the climate information represented in each of the proxies” for your choice of normalisation of the proxies going into the North American tree ring network?

Surely that’s your task? Steven M. has already referred to the issue of whether you choose one procedure over another; how do you justify your procedure for normalisation?

Re: #15

The current discussion on the thread “Juckes and 99.98% Significance” would appear to me to be covering these weaknesses (and by statistics heavyweights) and providing some suggestions for future work. As an interested layperson I appreciate these discussions, but judge that I and others in my position would benefit from periodic summaries of the points being made. I get a distinct impression that many of the statistical experts regularly posting at the CA blog judge that much of the methodology (and physical understandings) used in temperature reconstructions is nearly totally insufficient to make any conclusions about past millennial temperature variations. While this is the view that I have carried away from these discussions, I acknowledge a need to be cautious, as a skeptic, of reaching for any hasty conclusions.

I guess what surprises me and where I need a better understanding is the valiant defense waged by many climate scientists of these reconstructions and their apparent unwillingness to get out of the Mannian rut.

#16. Once again, Juckes, instead of answering any of the questions posed by Jean S or myself or others about the methodology in Juckes et al, is throwing around more accusations.

Juckes says:

Martin, Martin. another false accusation – this is rather a specialty of yours. Let me repeat some points posted up here http://www.climateaudit.org/?p=929. Results from PC calculations using a correlation matrix are precisely equivalent to results using networks in which the chronologies have been divided by their standard deviation. In addition, the covariance matrix of the network after dividing by the short-segment standard deviation is not hugely different from the correlation matrix. In MM05(EE), we specifically stated that results using PCs from a correlation matrix were about halfway in between the two cases as follows:

In a recent discussion here, I provided the following graphic reconciling this comment to Wahl and Ammann Scenario 5, as below.

I don’t think that anyone can seriously argue that the verbal description in MM05(EE) is not a reasonable description of the situation in this graphic. Now in retrospect, an additional illustration might have helped, but the description in MM05(EE) reports the situation accurately.
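The equivalence stated above, that PCs from a correlation matrix match PCs from a covariance matrix once each chronology has been divided by its standard deviation, can be checked numerically. What follows is a minimal sketch on synthetic data, not the MM05(EE) code: the network `X`, the `scales` vector and the helper `leading_pc` are all hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical network: 200 "years" by 6 "chronologies" with very different spreads.
scales = np.array([0.04, 0.1, 0.2, 0.3, 0.45, 0.58])
common = rng.standard_normal((200, 1))
X = (0.6 * common + rng.standard_normal((200, 6))) * scales


def leading_pc(M):
    """Return the leading eigenvalue's share of the trace and its eigenvector."""
    vals, vecs = np.linalg.eigh(M)             # eigenvalues in ascending order
    v = vecs[:, -1]
    v = v * np.sign(v[np.argmax(np.abs(v))])   # remove the arbitrary sign
    return vals[-1] / vals.sum(), v


# (a) PCA on the correlation matrix of the raw series.
share_corr, v_corr = leading_pc(np.corrcoef(X, rowvar=False))

# (b) PCA on the covariance matrix after dividing each series by its own std.
Z = X / X.std(axis=0, ddof=1)
share_std, v_std = leading_pc(np.cov(Z, rowvar=False))

# (c) PCA on the covariance matrix of the raw series, for contrast.
share_cov, v_cov = leading_pc(np.cov(X, rowvar=False))

print("(a) and (b) agree:", np.allclose(v_corr, v_std), np.isclose(share_corr, share_std))
print("PC1 loadings, raw covariance:", np.round(v_cov, 3))
```

Cases (a) and (b) should agree to rounding error, while the raw covariance loadings in (c) lean toward the columns with the larger spread. The sketch says nothing about which choice is appropriate for the North American network.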

Now, Martin, you go on to say:

Martin, Martin. We’re used to realclimate assiduously avoiding the term “bristlecone” and using code words like “much of the data”. But even Wahl and Ammann agree that the PC permutations boil down to bristlecone weighting.

Now, Martin, are you making another implied accusation here? An “arbitrary but yet to be disclosed? choice of normalization”? This is becoming very tiresome. The use of a covariance matrix has already been reported and backup code was provided. ITRDB chronologies are already “standardized” – you might consult your coauthors Briffa or Esper on that topic. However, they do not have uniform standard deviations. As I’ve mentioned on many occasions, any statistical references that I’ve seen recommend principal components using a covariance matrix if the network is denominated in common units (as is the case here.) I’ve challenged you and others to provide a third-party statistical reference by a statistical authority requiring or even recommending division by the standard deviation in such circumstances. I’ve given several citations the other way. BTW you say that “other data” besides bristlecones is “eliminated” by the use of covariance PCs? Pray tell – what is it?
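One way to see why the choice matters is to write the covariance matrix in terms of the correlation matrix. The following is a sketch in notation assumed here for illustration (it does not appear in MM05(EE) or in Juckes et al): $s_i$ is the standard deviation of chronology $i$, $R$ is the correlation matrix and $S$ is the covariance matrix of the network.

```latex
S = D R D, \qquad D = \operatorname{diag}(s_1, \dots, s_p), \qquad S_{ij} = s_i s_j R_{ij}

\text{PC1 (covariance):}\quad \max_{\|w\| = 1} \; w^{\top} S w \;=\; \max_{\|w\| = 1} \; (D w)^{\top} R \,(D w)

\text{PC1 (correlation):}\quad \max_{\|u\| = 1} \; u^{\top} R u
```

In the covariance case each weight $w_i$ enters only through the product $s_i w_i$, so a chronology with a small standard deviation contributes little to the retained variance. In the correlation case every chronology is first put on the same footing. The identity describes the mechanics only; it does not say which weighting is appropriate for a network in common units.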

In MM05 (EE), we gave a detailed discussion of potential problems with bristlecones and specifically did not link our caveats about the bristlecone growth pulse to CO2 fertilization – citing Hughes and Funkhouser’s statement that it was a “mystery”. In this article and in the Reply to Huybers, we cited the IPCC warnings about potential compromise of tree ring chronologies by CO2 fertilization. In our GRL article, we stated:

IPCC SAR had stated:

Now, Martin, I don’t think that anyone reading both MM05 (EE) and Reply to Huybers would misunderstand the point in Reply to Huybers, but it would have been better to say “potential CO2 fertilization”, as it might be nitrate fertilization, phosphate fertilization, nonlinear interaction of precipitation and temperature or nonlinear interaction of P+T+ fertilization that caused the “mystery” growth pulse. The salient point is very clear – that for someone like Mann whose method requires that there be a linear response, it is necessary to demonstrate this and he failed to do so.

You state:

Martin, Martin – we are not recommending any PC method for the North American network. We stated:

In MM05 (EE), we specifically stated that in the AD1000 network, bristlecones were the issue – that they dominated the PC1 through their longevity and not through a mathematical artifice. As I pointed out elsewhere, your demonstration that the AD1000 PCs are relatively insensitive to different standardizations is reasonable enough – we’d already directed attention to this point in MM05 (EE).

Notice how Dr Juckes is trying to pigeon-hole Steve M as a backer of the “CO2 hypothesis”. I wonder for what purpose?

The problem is that Steve M has moved on to a more sophisticated alternative hypothesis involving nonlinear interactions among limiting factors (T,P,N, and, yes, C). I note that Dr Juckes has been unwilling to discuss this topic in a substantive way. If his reconstruction is strongly dependent on bcps it is hard to justify this omission. We agree that “trees are not thermometers”. Given that, it is important, I think, to discuss limitations in the proxies used to feed these reconstructive models.

Dr Juckes, what are the consequences of choosing a linear-additive response model given a growth process that is nonlinear-multiplicative? Dr Juckes, who are the tree physiologists and biometricians on your team?

I’ve asked these questions before. I won’t ask again. The next time you see them will be CoPD.

If the results depend entirely on what type of renormalization process is used, then is a scientific result obtained? It sounds like more data and statistical process selection again.

The scientific process demands that the results be replicated. If they cannot be replicated without using some unrecognizable and unaccepted data transformation, then you do not have a scientific result.

You have an accident, you have a spurious conclusion, you have to discard your result and start over with something else. You don’t keep trying to prove the “accident” over and over again using the same flawed process.

#22. bender, actually, I wouldn’t say that I’m a “backer” of any particular hypothesis – though, as you say, if I were to guess, I would say that there are multiple causes for the “mystery” – there usually are. As we’ve said over and over, it’s the job of Mann and others to demonstrate the validity of bristlecones as a linear temperature proxy in the face of one specialist study after another raising caveats, most recently the NAS panel, but previously IPCC 2AR. bender, I know that this is not news to you, but I just want to re-iterate this.

Re #24 That was part of my point, really. Juckes wants to paint Steve M into a CO2 corner. But Steve M need not choose any corner. The spiked bcp response is truly “mysterious”. The safest bet is to do what NAS said: avoid the bcps. (Which only begs the question: what about cedars, junipers, foxtails, larch, etc?)

But this is not about teams. It is about proxies. Trees as thermometers. If it were me being backed into a corner, I would declare myself a proponent (not a believer, not a backer) of the nonlinear-nonadditive model. The question is: what do the teams make of that model. Seems to me the linear-additive model might be too crude an approximation to resolve the trends and historical differences that they are seeking to resolve.

Re: 16

Concerned:

The reason why Dr. Juckes is unable to refute Steve is because Dr. Juckes is wrong.

I’ve followed the AGW hypothesis for 30 years. I’ve been developing energy efficiency and renewable projects for 20 years, and thus have a direct economic incentive to see AGW proven correct. I’ve been following Steve’s work since before Climate Audit existed. And as incredible as it sounds, Steve is right and the paleoclimate community following in the wake of MBH 98 is wrong. Steve has demonstrated this very convincingly. It’s that simple.

This fact in and of itself does not invalidate the AGW hypothesis. For one thing, it is possible that modern warming is anomalous; Steve has merely demonstrated that the bulk of the recent work supporting that idea is insufficiently robust to support that argument; in other words, we don’t really know. And there are other lines of evidence supporting the theory. But given how resistant this clearly flawed corpus of work has been to correction, or even criticism, it certainly causes me to wonder about the other pillars of the AGW hypothesis and how well they would stand up to similar scrutiny.

Regards;

re 16:

Through extensive web research I think I found Dr. Juckes’ audio response to Steve’s questions here

Re 21: Sorry I missed the fact that you had given an answer to some points on a later page. I guess your question about which trees were effectively eliminated means it was unintentional?

The ITRDB chronologies are standardised by a range of different techniques. The arbitrariness comes from the lack of a well defined methodology.

You say that “we are not recommending any PC method for the North American network” — would you accept that your assertions about sensitivity to Bristlecone pines, which are based on a particular PC method which nobody recommends, are on a dodgy foundation?

According to Vitousek et al. (1997) the impact of nitrate fertilisation has been growing rapidly since the mid 1970s, so it is not a signal we would expect to strongly influence the 1856 to 1980 calibration period. The particular reason for concern about CO2 is because of the lack of independence between CO2 and temperature. Do you really believe that there is a significant correlation between, say, phosphate fertilization, and the temperature variability from 1856 to 1980? Once there is a reasonable suspicion, then it would have to be taken into account.

What is this “mystery” growth pulse? Are you referring to the fact that trees don’t behave like thermometers? Not exactly a mystery.

Re 23: Yes, the choice of normalisation is important, which is why I believe that McIntyre has no justification in simply taking what comes out of the archive. Ideally, we would use prior information about the sites to estimate the information content in each chronology. Such prior information is not archived, so the next best option is to take uniform weighting.

Re 26: The work does not “follow in the wake of MBH 98”, nobody else has adopted the techniques of MBH98 to produce estimates of past climate, apart from McIntyre and McKitrick trying to demonstrate their unreliability. The technique we used is much closer to that of Jones et al. (1998). McIntyre’s position is entirely circular: the proxies can’t be trusted –> statistical significance can only be achieved by cheating –> all scientists are cheats –> the proxies can’t be trusted. It is certainly a self-consistent position.

Martin, this is really pathetic. You throw out accusations and now say:

Our published corpus is not particularly large and the point was a contentious one, so that this is a pretty lame excuse. Similar false claims are in your manuscript. You now know these claims are incorrect. In carrying out a principal components on the North American tree ring network using a standard algorithm, we neither “intentionally” nor “unintentionally” downweighted bristlecones. We simply did the calculation and saw what happened and reported the results. The various methodological permutations and combinations equate to upweighting and downweighting of bristlecones but that was something that we learned in the course of the replication effort.

The NAS panel has stated that strip-bark examples should not be used (which excludes all the Graybill sites in question) so that any method that downweighted bristlecones would be more consistent with NAS panel recommendations.

First, you are in no position to speak for scientists in general. Second, Steve has never suggested that scientists in general are cheats. He may have hinted that certain scientists are cheats, and from what I’m seeing you are very quickly becoming part of that group. If I were a co-author of yours, I’d make pretty sure that all the concerns raised here (and elsewhere) were properly treated and that there would be no more undisclosed “standardisations” etc. greatly affecting your results.

Finally, it’s time to wake up, Martin. In your paper you are essentially claiming that by averaging 18 (magic) proxies you get a NH temperature reconstruction with a standard error of 0.15K. C’mon, why not do the same with real measurements: choose 18 instrumental (temperature, maybe a few precipitation too) measurement series and put Jones, Hansen, etc. out of work! Even better, do the same thing but use the satellite temperature series as the reference: then in the future there would be no need for any satellite measurements. Think how much we could save!

#28. Juckes said:

This is a ridiculous statement. Has the level of reasoning in Team-world really sunk to this? Shameful.

Lawyers with courtroom experience will have seen more than a few “expert” witnesses come to court armed with a Ph. D., a pet theory and a marked degree of hubris, only to be demolished in competent cross examination by well-prepared counsel. It’s invariably not a pretty sight. Dr. Juckes’ performance on this blog constantly reminds me of the “caught in the headlights” behaviour of such witnesses.

My father used to rail against the attitudes of some (but not all) of the PhDs he worked with. I did not completely understand the sort of hubris he was highlighting until I personally experienced things similar to what he had experienced in my own career.

This thread is actually quite embarrassing to read.

Part of me, even with my own personal experience in science and engineering, does not want to believe that a learned man such as Dr. Juckes could really be behaving like such a demagogue. Dr. Juckes, may we never meet, and may neither of us ever have to work for the other or with each other. It would be intolerable.

Juckes stated:

I’ve observed on the blog that there are massive cancellations in the MBH98-99 linear algebra and the two-stage maximization procedure described there reduces to Partial Least Squares regression in the one-dimensional case (a technique called “inverse regression” by the Team). Mann et al did not understand the equivalence nor have subsequent Team authors. Partial Least Squares regression with one factor is equivalent to weighting by the partial correlation coefficient – a technique used in Hegerl et al 2006, where it is described as a new technique – and obviously used in Juckes.

So when Juckes says that “nobody else has adopted” MBH98 regression methodology, this is not correct. Hegerl et al 2006 used a form of “inverse regression”, which is equivalent to Mannian methods net of the linear algebra. Mann and Jones 2003 appear to have used a form of inverse regression although available information does not permit replication of their results. Juckes et al illustrate many cases using “inverse regression”, which is equivalent to Mannian regression (up to a mysterious re-weighting of partial correlation coefficients in MBH99.) This is easier to demonstrate for MBH99 than for MBH98 – as Jean S drew to my attention. The MBH99 reconstruction can be shown to be exactly linear in the proxies, but the tweaking of the weights is still a mystery. I’ll do a post up on this. Also the method that replicates MBH99 does not replicate MBH98 AD1400. The MBH98 step remains a mystery. (Wahl and Ammann haven’t replicated Mann’s results – they get the same results, apples and apples, as we did.)
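The claim that one-factor Partial Least Squares amounts to a correlation-weighted composite can be illustrated on toy data. This is a sketch under stated assumptions, not a replication of MBH98, Hegerl et al or Juckes et al: the data are synthetic, the proxies are standardized over the calibration period, and the weights that appear are ordinary correlations between each proxy and the target. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical calibration data: 120 "years", 10 "proxies", one target series.
n, p = 120, 10
temp = rng.standard_normal(n)
proxies = 0.5 * temp[:, None] * rng.uniform(0, 1, p) + rng.standard_normal((n, p))

# Centre the target and standardize the proxies over the calibration period.
y = temp - temp.mean()
X = (proxies - proxies.mean(axis=0)) / proxies.std(axis=0, ddof=1)

# One-component PLS: the weight vector is proportional to X'y.
w = X.T @ y
w /= np.linalg.norm(w)
t = X @ w                           # the single PLS score
pls_fit = t * (t @ y) / (t @ t)     # regress the target on that score

# Correlation-weighted composite, rescaled by regressing the target on it.
r = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(p)])
composite = X @ r
comp_fit = composite * (composite @ y) / (composite @ composite)

print("same fitted series:", np.allclose(pls_fit, comp_fit))
```

Because the columns of `X` have unit variance, `X.T @ y` is a constant multiple of the correlation vector `r`, so the PLS score and the composite differ only by a scale factor that the final regression step removes.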

Re:#34

Your observation in the first paragraph (along with links to the math) is worth posting to the CoP article discussion website.

Actually the entire topic linking Partial Least Squares regression and inverse regression deserves an article by itself.

Re: #36

Yes — I didn’t want to start sounding like TCO. :)

PhDs vary all over the map in their ability to reason. It is a mark of effort in specialization, not of ability in argumentation.

In my professional experience, the brightest PhDs with whom I have worked played down the fact that they had a doctorate.

Mine too.

Dr. Juckes,

I understand that it is difficult to face harsh criticism of your work. It is a real test of one’s character. It is tempting to just lash out and belittle your critics instead of answering their questions one point at a time and patiently explaining things until they understand. You failed this test Dr. Juckes.

Re 34: Yes, McIntyre, as explained in the manuscript. It is evaluated and we reject it on the basis that it produces more erratic results than the composite. This is consistent with your work. Our estimate of past climate does not use inverse regression. I’m not sure where the idea that we are trying to defend their method came from.

Re 41: Perhaps you should read the blog before lashing out? (Just a suggestion, entirely up to you of course).

#28. Juckes asks:

Martin, Martin – the term “mystery” was used to describe 20th century bristlecone pine growth rates by Hughes and Funkhouser 2003. We’ve cited this frequently. In MM05(EE), we cited them as follows:

Here is a more extended quote from Hughes and Funkhouser 2003:

You will surely agree that it is decidedly odd for MBH to be asserting that bristlecone growth is essential for temperature reconstruction to a general scientific audience, while Hughes, speaking to specialists, says that it is a “mystery”.

In response to your question, the “mystery” is not that trees don’t behave like thermometers; the “mystery”, according to MBH coauthor Hughes, is what caused increased growth in bristlecones in the 20th century.

In passing, I note the following comments from MM05(EE) on CO2 fertilization which seem worth repeating:

39:

It is because any PhD with a modicum of intelligence realizes that he is an expert on only that tiny, tiny part of science which he has studied in depth. A PhD generally indicates a person who has a lot of persistence, but not necessarily anything else. Anyone who wants to test their general intelligence should try to outwit a Southern tree-farmer on a log purchase.

He hasn’t read the literature! That explains a few things.

Re 43: OK. So the fact that some trees have positive, unresolved growth anomalies is not particularly surprising. Would you agree that it is also not particularly surprising that some trees have negative, unresolved growth anomalies?

Re #46

But which is which and which is when?

Assume E(epsilon)=0 for all t? Proof by assumption!

Dr. Juckes,

I read this blog extensively, and in fact have read all the threads you are involved in completely from top to bottom. I did not make a snap judgement, but observed over a long time period before coming to a conclusion. I then stated my opinion clearly, and without malice. I did not “lash out” as you claim. In fact, your one-sentence response, if anything, proves my point.

Fortunately for you and everyone else on this blog, I do not beat dead horses. There will be no further direct communications from me to you, as that would be pointless. I may from time to time refer to you in the third person, however I will continue to use the honorific ‘Dr.’ instead of the inappropriately (to me) familiar ‘Martin’ or the brusque (to me) ‘Juckes’.

I think this may be an example of what can happen when a specialist in an area, X, drifts a little too far afield, thinking he’s got the tools needed to publish in a different area, Y. Dr Juckes may be a very good scientist, but that doesn’t mean he’s got what it takes to publish whatever he likes wherever he likes. Steve M is very hard to take down because he knows what he’s doing. I sense Dr Juckes, from the start, underestimated Steve M’s aptitude & experience.

#36

Not a bad idea. Key words: Multivariate calibration, Overfits and Spurious correlations.

Downloaded some of PJ Brown’s papers on multivariate calibration (1,2). If I understood it right, the INVR (Juckes et al) is a solution to the controlled calibration problem (but Brown adds sample residual covariance weighting). For the natural calibration solution, Brown uses the term ‘inverse regression’, and it seems that it is the method we (at least I) have here referred to as ‘direct regression’. Some random thoughts:

1) Temperature vs. proxy calibration is not controlled calibration, in the sense that we can’t make sure that calibration temperatures include the whole range of temperatures. However, I think Brown’s ‘inverse regression’ would be more problematic in the proxy case. (I did some simulations.)

2) Should we use multivariate calibration or univariate calibration? i.e. Juckes et al A1 or A2 (before the CVM-scaling!!). Not much difference, except that in multivariate case there is a clear danger of overfitting. (This result is based on my simulations, not a proven fact)

3) There seems to be a general assumption that calibration data is very accurate. For example, in (2)

GLS=Generalized Least Squares

BLP=Best Linear Predictor

REFS:

1) P.J. Brown (1982) Multivariate Calibration. Journal of the Royal Statistical Society, Series B, Vol. 44, No. 3, pp. 287-321

2) R. Sundberg and P.J. Brown (1989) Multivariate Calibration With More Variables Than Observations. Technometrics, August 1989, Vol. 31, No. 3, pp. 365-371
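The two univariate calibration estimators being contrasted in this comment can be put side by side. The sketch below uses synthetic data and hypothetical names. "Classical" here means regressing the proxy on temperature and inverting the fitted line; "inverse" means regressing temperature on the proxy directly. Mapping these onto INVR and onto Brown's terminology follows the reading in the comment above and has not been checked against the papers.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical calibration period: one proxy responding linearly to temperature plus noise.
n = 125
temp = 0.25 * rng.standard_normal(n)
proxy = 1.5 * temp + 0.5 * rng.standard_normal(n)

t_c = temp - temp.mean()
p_c = proxy - proxy.mean()
sxx, syy, sxy = t_c @ t_c, p_c @ p_c, t_c @ p_c


def classical(new_proxy):
    """Regress proxy on temperature, then invert the fitted line."""
    slope = sxy / sxx
    return temp.mean() + (new_proxy - proxy.mean()) / slope


def inverse(new_proxy):
    """Regress temperature on proxy and predict directly."""
    slope = sxy / syy
    return temp.mean() + slope * (new_proxy - proxy.mean())


# The inverse estimate is the classical estimate shrunk toward the
# calibration mean by the squared calibration correlation.
r2 = sxy**2 / (sxx * syy)
new = np.linspace(-2.0, 2.0, 5)
ratio_ok = np.allclose(inverse(new) - temp.mean(), r2 * (classical(new) - temp.mean()))
print("r^2 =", round(float(r2), 3), "| shrinkage identity holds:", ratio_ok)
```

With a weak calibration correlation the two estimators diverge sharply: the classical one inflates the variance of the reconstruction, the inverse one damps it. This is one reason the choice between them is worth stating explicitly.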

UC – once one starts looking at the topic from the point of view of multivariate analysis, you get a cleaner perspective and much less sympathy for claims that any one method is the “right” method. For example, there’s quite a bit known about Partial Least Squares regression, especially in chemometrics, and various interesting relationships to other methods. Stone and Brooks link Partial Least Squares, Ordinary Least Squares and Ridge Regression by varying one parameter. By varying another parameter, you can incorporate canonical correspondence analysis.

Steve, did you use ‘less than’ symbol? Something is missing from your post.

Proper statistical model would help in selecting the method.

I think that overfitting in multivariate calibration would be a good topic for a paper, and could be submitted to non-climate science journal as well (if someone is afraid of publication biases in climate science).

Re #49 **I think this may be an example of what can happen when a specialist in an area, X, drifts a little too far afield, thinking he’s got the tools needed to publish in a different area, Y. Dr Juckes may be a very good scientist, but that doesn’t mean he’s got what it takes to publish whatever he likes wherever he likes. Steve M is very hard to take down because he knows what he’s doing. I sense Dr Juckes, from the start, underestimated Steve M’s aptitude & experience.**

What is happening in this thread and by many Hockey Team members on other websites is that they are using their “paid” positions and titles of their workplace as an authority. Misquoting and making false claims are like going for coffee. Steve has to rely on his expertise. And he has demonstrated it here over and over again demonstrating that there are many who do not have enough statistics to do the papers.