Really wish that Gavin or the like would come over here more. I'd like to learn from him, and I think he would push his own thinking. Sometimes I worry that those guys let their politics get in the way of being curious, Feynman-type scientists. (I occasionally gig Steve for that too…)

He should have just been truthful re R2 from the beginning.

The logical conclusion seems to be that Mann’s reconstruction should be withdrawn and ignored, and any other study that used bristlecone pines or other dubious time series should also be withdrawn. As far as I can tell this would leave only Moberg’s study and an earlier one by Hu (1997) using borehole data as potentially valid reconstructions. Both of these reconstructions show the MWP and LIA, great natural variability of climate, and a warming commencing around 1600 that has continued into the present.

**Steve:** I do get a little weary of them describing our emulation of MBH98 results without bristlecones as “our” reconstruction. However, re-stated in this way (which is useful), they have in effect said (and all parties agree) that an MBH98 reconstruction **without** bristlecones is “without statistical and climatological merit”. They claim that a reconstruction **with** bristlecones, on the other hand, has unique statistical and climatological merit, even if the bristlecones are contaminated with CO2 fertilization. The short discussion in MBH99 is bait-and-switch. They raise the issue of CO2 fertilization but do not adjust any MBH98 results, yet they now suggest to readers that they did. The CO2 adjustment in MBH99 is bogus: it adjusts 19th century values and argues that the CO2 fertilization effect was “saturated” in the 20th century.

There’s lots of hair on Moberg, which I hope to get to.

In answer to your question:

7c. Did you calculate the R2 statistic for the temperature reconstruction, particularly for the 15th Century proxy record calculations, and what were the results?

My answer is:

~~A(7C): The Committee inquires about the calculation of the R2 statistic for temperature reconstruction, especially for the 15th Century proxy calculations. In order to answer this question it is important to clarify that I assume that what is meant by the “R2” statistic is the squared Pearson dot-moment correlation, or r2 (i.e., the square of the simple linear correlation coefficient between two time series) over the 1856-1901 “verification” interval for our reconstruction. My colleagues and I did not rely on this statistic in our assessments of “skill” (i.e., the reliability of a statistical model, based on the ability of a statistical model to match data not used in constructing the model) because, in our view, and in the view of other reputable scientists in the field, it is not an~~ adequate measure of “skill.” The statistic used by Mann et al. 1998, the reduction of error, or “RE” statistic, is generally favored by scientists in the field. See, e.g., Luterbacher, J.D., et al., European Seasonal and Annual Temperature Variability, Trends and Extremes Since 1500, Science 303, 1499-1503 (2004).

RE is the preferred measure of statistical skill because it takes into account not only whether a reconstruction is “correlated” with the actual test data, but also whether it can closely reproduce the mean and standard deviation of the test data. If a reconstruction cannot do that, it cannot be considered statistically valid (i.e., useful or meaningful). The linear correlation coefficient (r) is not a sufficient diagnostic of skill, precisely because it cannot measure the ability of a reconstruction to capture changes that occur in either the standard deviation or mean of the series outside the calibration interval. This is well known. See Wilks, D.S., STATISTICAL METHODS IN ATMOSPHERIC SCIENCE, chap. 7 (Academic Press 1995); Cook, et al., Spatial Regression Methods in Dendroclimatology: A Review and Comparison of Two Techniques, International Journal of Climatology, 14, 379-402 (1994). The highest possible attainable value of r2 (i.e., r2 = 1) may result even from a reconstruction that has no statistical skill at all. See, e.g., Rutherford, et al., Proxy-based Northern Hemisphere Surface Temperature Reconstructions: Sensitivity to Methodology, Predictor Network, Target Season and Target Domain, Journal of Climate (2005) (in press, to appear in July issue) (available at: ftp://holocene.evsc.virginia.edu/pub/mann/RuthetalJClimate-inpress05.pdf). For all of these reasons, we, and other researchers in our field, employ RE and not r2 as the primary measure of reconstructive skill.

As noted above, in contrast to the work of Mann et al. 1998, the results of the McIntyre and McKitrick analyses fail verification tests using the accepted metric RE. This is a key finding of the Wahl and Ammann study cited above. This means that the reconstructions McIntyre and McKitrick produced are statistically inferior to the simplest possible statistical reconstruction: one that simply assigns the mean over the calibration period to all previous reconstructed values. It is for these reasons that Wahl and Ammann have concluded that McIntyre and McKitrick’s results are “without statistical and climatological merit.”

YES AND THE RESULT WAS A BIG FAT NOTHING AS YOU CAN SEE FROM THE SOURCE CODE

Sincerely

Dr Michael Mann

University of Virginia
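
For anyone who wants to see how the two statistics argued over above behave, here is a minimal sketch in Python. It assumes the usual textbook definitions: r2 is the squared Pearson correlation over the verification interval, and RE is one minus the ratio of the reconstruction's squared error to the squared error of a baseline that assigns the calibration-period mean to every year. The numbers are made up for illustration and come from no actual reconstruction, and the function names are mine, not anything from the MBH98 source code.

```python
from math import fsum


def reduction_of_error(observed, reconstructed, calibration_mean):
    """RE = 1 - SSE(reconstruction) / SSE(calibration-mean baseline)."""
    sse_recon = fsum((o - r) ** 2 for o, r in zip(observed, reconstructed))
    sse_base = fsum((o - calibration_mean) ** 2 for o in observed)
    return 1.0 - sse_recon / sse_base


def r_squared(observed, reconstructed):
    """Squared Pearson correlation between two equal-length series."""
    n = len(observed)
    mean_o = fsum(observed) / n
    mean_r = fsum(reconstructed) / n
    cov = fsum((o - mean_o) * (r - mean_r)
               for o, r in zip(observed, reconstructed))
    var_o = fsum((o - mean_o) ** 2 for o in observed)
    var_r = fsum((r - mean_r) ** 2 for r in reconstructed)
    return cov * cov / (var_o * var_r)


# Hypothetical verification-period values (illustrative only).
observed = [-0.30, -0.10, -0.25, 0.05, -0.15, 0.10]
calibration_mean = 0.0  # assumed mean of the calibration period

# Perfectly correlated with the observations, but with the wrong
# mean and the wrong amplitude.
shifted = [2.0 * o + 1.0 for o in observed]

# Baseline: the calibration mean assigned to every verification year.
baseline = [calibration_mean] * len(observed)

r2_shifted = r_squared(observed, shifted)
re_shifted = reduction_of_error(observed, shifted, calibration_mean)
re_baseline = reduction_of_error(observed, baseline, calibration_mean)

print("shifted series:  r2 =", r2_shifted, " RE =", re_shifted)
print("baseline series: RE =", re_baseline)
```

The shifted series gets an r2 of 1 (up to rounding) and a strongly negative RE, which is the sense in which a high r2 alone does not establish skill. The baseline scores an RE of exactly zero, so a negative RE means doing worse than the calibration mean. None of this settles whether a reconstruction should also pass r2, which is the point in dispute in this thread.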