Cross-Validation R2 Source Code Reference

Just for reference: here’s the code excerpt where Mann calculates the cross-validation R2 statistics and then writes them to file. You can see the original code at ftp://holocene.evsc.virginia.edu/pub/MANNETAL98/METHODS/multiproxy.f [Update: http://www.meteo.psu.edu/holocene/public_html/shared/research/MANNETAL98/METHODS/multiproxy.f]. Search down using corrnhem or verif1.out. There is no "if" as to whether he calculated the cross-validation R2 statistic.

The following Fortran code calculates the cross-validation R2 statistics for global (corrglob**2) and NH (corrnhem**2) and then saves these results to the file verif1.out. The Fortran code is very pedantic for a simple operation, but you should be able to see what’s going on. (I’m old enough that I actually learned Fortran when I was young; I never imagined that anyone would still be using it.)
      amean1 = zero
      amean2 = zero
      amean3 = zero
      amean4 = zero
      varverglob = zero
      varvernhem = zero
      varcalglob = zero
      varcalnhem = zero
      corrglob = zero
      corrnhem = zero
      do iy=iymin,iymax
         amean1 = amean1+globv(iy)
         amean2 = amean2+nhemv(iy)
         amean3 = amean3+globc(iy)
         amean4 = amean4+nhemc(iy)
      end do

      amean1 = amean1/float(iymax-iymin+1)
      amean2 = amean2/float(iymax-iymin+1)
      amean3 = amean3/float(iymax-iymin+1)
      amean4 = amean4/float(iymax-iymin+1)

      do iy=iymin,iymax
         varcalglob = varcalglob + (globc(iy)-amean3)**2
         varcalnhem = varcalnhem + (nhemc(iy)-amean4)**2
         varverglob = varverglob + (globv(iy)-amean1)**2
         varvernhem = varvernhem + (nhemv(iy)-amean2)**2
         corrglob = corrglob + (globv(iy)-amean1)
     $      *(globc(iy)-amean3)
         corrnhem = corrnhem + (nhemv(iy)-amean2)
     $      *(nhemc(iy)-amean4)
      end do

      corrglob = corrglob/sqrt(varverglob*varcalglob)
      corrnhem = corrnhem/sqrt(varvernhem*varcalnhem)

Later on…
      open (unit=9,file='corrs-verif1.out',status='unknown')
      write (9,*) 'globe: ',corrglob,corrglob**2
      write (9,*) 'nhem: ',corrnhem,corrnhem**2
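
For readers who don't read Fortran, the arithmetic in the excerpt is an ordinary Pearson correlation between two series over the same years, followed by squaring. Here is a minimal sketch of the same steps in Python. The function name pearson_r and the sample numbers are made up for illustration; they are not taken from multiproxy.f, and the sketch makes no claim about what the globv/globc arrays actually hold.

```python
import math


def pearson_r(x, y):
    """Pearson correlation, following the same steps as the Fortran excerpt."""
    n = len(x)
    # Means of each series (amean1..amean4 in the excerpt).
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squared deviations (the var* accumulators).
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    # Sum of cross-products of deviations (the corr* accumulators).
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Normalise, as in corrglob/sqrt(varverglob*varcalglob).
    return cross / math.sqrt(var_x * var_y)


# Hypothetical sample data, purely for illustration.
series_a = [0.1, -0.2, 0.3, 0.0, -0.1]
series_b = [0.2, -0.1, 0.1, 0.1, -0.3]

r = pearson_r(series_a, series_b)
print("r =", r, " r**2 =", r ** 2)
```

The last line mirrors the write statements: it reports the correlation and its square side by side.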

This entry was written by Stephen McIntyre, posted on Jul 23, 2005 at 10:37 AM, filed under MBH98, Source Code and tagged verification-r2.

8 Comments

John A

Posted Jul 23, 2005 at 11:35 AM

Dear Senator Barton,

In answer to your question:

7 c. Did you calculate the R2 statistic for the temperature reconstruction, particularly for the 15th Century proxy record calculations and what were the results?

My answer is:

A(7C): The Committee inquires about the calculation of the R2 statistic for temperature reconstruction, especially for the 15th Century proxy calculations. In order to answer this question it is important to clarify that I assume that what is meant by the “R2” statistic is the squared Pearson dot-moment correlation, or r2 (i.e., the square of the simple linear correlation coefficient between two time series) over the 1856-1901 “verification” interval for our reconstruction. My colleagues and I did not rely on this statistic in our assessments of “skill” (i.e., the reliability of a statistical model, based on the ability of a statistical model to match data not used in constructing the model) because, in our view, and in the view of other reputable scientists in the field, it is not an adequate measure of “skill.” The statistic used by Mann et al. 1998, the reduction of error, or “RE” statistic, is generally favored by scientists in the field. See, e.g., Luterbacher, J.D., et al., European Seasonal and Annual Temperature Variability, Trends and Extremes Since 1500, Science 303, 1499-1503 (2004).
RE is the preferred measure of statistical skill because it takes into account not only whether a reconstruction is “correlated” with the actual test data, but also whether it can closely reproduce the mean and standard deviation of the test data. If a reconstruction cannot do that, it cannot be considered statistically valid (i.e., useful or meaningful). The linear correlation coefficient (r) is not a sufficient diagnostic of skill, precisely because it cannot measure the ability of a reconstruction to capture changes that occur in either the standard deviation or mean of the series outside the calibration interval. This is well known. See Wilks, D.S., STATISTICAL METHODS IN ATMOSPHERIC SCIENCE, chap. 7 (Academic Press 1995); Cook, et al., Spatial Regression Methods in Dendroclimatology: A Review and Comparison of Two Techniques, International Journal of Climatology, 14, 379-402 (1994). The highest possible attainable value of r2 (i.e., r2 = 1) may result even from a reconstruction that has no statistical skill at all. See, e.g., Rutherford, et al., Proxy-based Northern Hemisphere Surface Temperature Reconstructions: Sensitivity to Methodology, Predictor Network, Target Season and Target Domain, Journal of Climate (2005) (in press, to appear in July issue)(available at: ftp://holocene.evsc.virginia.edu/pub/mann/RuthetalJClimate-inpress05.pdf). For all of these reasons, we, and other researchers in our field, employ RE and not r2 as the primary measure of reconstructive skill.
As noted above, in contrast to the work of Mann et al. 1998, the results of the McIntyre and McKitrick analyses fail verification tests using the accepted metric RE. This is a key finding of the Wahl and Ammann study cited above. This means that the reconstructions McIntyre and McKitrick produced are statistically inferior to the simplest possible statistical reconstruction: one that simply assigns the mean over the calibration period to all previous reconstructed values. It is for these reasons that Wahl and Ammann have concluded that McIntyre and McKitrick’s results are “without statistical and climatological merit.”
YES AND THE RESULT WAS A BIG FAT NOTHING AS YOU CAN SEE FROM THE SOURCE CODE

Sincerely

Dr Michael Mann
University of Virginia

Douglas Hoyt

Posted Jul 23, 2005 at 11:55 AM

If it is true that McIntyre and McKitrick’s results are “without statistical and climatological merit” as Mann argues, and it is also true that McIntyre and McKitrick’s results have duplicated Mann’s results using the same data sets and same methods, then it logically follows that Mann et al.’s results are “without statistical and climatological merit.”

The logical conclusion seems to be that Mann’s reconstruction should be withdrawn and ignored and any other study that used bristlecone pines or other dubious time series should also be withdrawn. As far as I can tell this would leave only Moberg’s study and an earlier one by Hu (1997) using borehole data as potentially valid reconstructions. Both of these reconstructions show the MWP and LIA, great natural variability of climate, and a warming commencing around 1600 that has continued into the present.

Steve: I do get a little weary of them describing our emulation of MBH98 results without bristlecones as “our” reconstruction. However, re-stated in this way (which is useful), they have in effect said (and all parties agree) that an MBH98 reconstruction without bristlecones is “without statistical and climatological merit”. They claim that a reconstruction with bristlecones, on the other hand, has unique statistical and climatological merit, even if the bristlecones are contaminated with CO2 fertilization. The short discussion in MBH99 is bait-and-switch. They raise the issue of CO2 fertilization but do not adjust any MBH98 results, yet suggest to readers now that they did. The CO2 adjustment in MBH99 is bogus: it adjusts 19th century values and argues that the CO2 fertilization effect was “saturated” in the 20th century.

There’s lots of hair on Moberg, which I hope to get to.

Chuck Noblett

Posted Jul 23, 2005 at 10:05 PM

In 1998, Mann could never have dreamed this would be examined as it has been.

He should have just been truthful re R2 from the beginning.

TCO

Posted Sep 21, 2005 at 11:20 AM

The answer is tendentious and non-responsive. Mann decides to answer a different question from the one that was asked. What was asked is “did you do this”, not “should one do this”. He is so…ducking and weaving. Someone who behaves like this is likely tendentious in science work as well. Very troublesome. I think that is why some of the “team” are distancing themselves from Mann.

Dave Dardinger

Posted Sep 21, 2005 at 11:31 AM

And that perhaps explains somewhat why RealClimate hasn’t attracted any sort of active audience such as this site has. The Hockey Team has to be so careful about what they will or won’t talk about that what’s left is a discussion of Hurricane Katrina and other pop-culture subjects.

TCO

Posted Sep 21, 2005 at 11:37 AM

This site is looser. I respect the intellect and am interested in the comments of the moderators there. Some of the audience is pretty simpleminded fanboy (we have some on our side too).

Really wish that Gavin or the like would come over here more. Like to learn from him and think he would push his own thinking. Sometimes, I worry that those guys let their politics get in the way of being curious Feynman type scientists. (I occasionally gig Steve for that too…)

John G. Bell

Posted Sep 21, 2005 at 12:14 PM

People may fail you. Devote yourself to truth and beauty or some like ideals and support people to the extent they share your respect for these concepts. I always thought an oath to the queen lame. An oath to preserve and protect the constitution, rational. To the extent I do the fanboy, it feels wrong. Like making a chess move that I know is inferior. It is to fail oneself.

TCO

Posted Sep 21, 2005 at 12:21 PM

That’s why we got rid of the monarchy and hereditary nobility.