If anyone feels like sticking needles in their eyes, I’d appreciate assistance in trying to figure out Mannian verification statistics. Even when Mann posts up his code, replication is never easy since they never bothered to ensure that the frigging code works. Or maybe they checked to see that it didn’t work. UC’s first post on the matter wondered where the file c:\scozztemann3\newtemp\nhhinfxxxhad was. We still have no idea. This file is referred to in the horrendously written verification stats program and it may be relevant.
With UC’s help, I’ve been able to replicate quite a bit of the CPS program (the EIV module remains a mystery).
I’ve been testing verification stats with the SH iHAD reconstruction. I mentioned previously that Mannian splicing does not always use larger proxy networks if they get “better” RE stats with fewer proxies. This Mannian piece of cherry picking is justified in the name of avoiding “overfitting” although it is actually just the opposite. It reminds me of the wonderful quote from Esper 2003 (discussed here):
“this does not mean that one could not improve a chronology by reducing the number of series used if the purpose of removing samples is to enhance a desired signal. The ability to pick and choose which samples to use is an advantage unique to dendroclimatology.”
Mining promoters would like a similar advantage, but, for some reason, securities commissions require mining promoters to disclose all their results.
Mann’s reconstruction archive in 2008, as with MBH98, only shows spliced versions – some habits never change, I guess. But in the SH iHAD run, the AD1000 network remains in use right through to the 20th century, with all proxies starting later than AD1000 being ignored – all in the name of not “overfitting”. But the long run of values from a consistent network is very handy for benchmarking and, with much help from UC’s Matlab runs, I’ve managed to very closely replicate the SH iHAD reconstruction from first principles, as shown below – this graphic compares a version archived at Mann’s FTP site with my emulation.
You can download an original digital version of this reconstruction (1000-1995) as follows:
A digital version of the “target” instrumental is also at Mann’s website and can be downloaded as follows:
The reported verification statistics for the SH iHAD reconstruction are also archived and can be downloaded as follows (load the package indicated). BTW this is a nice package for reading Excel sheets into R.
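test=read.xls("temp.xls",colNames=TRUE,sheet=14,type="data.frame",colClasses="numeric")
name1=c("century", c(t( outer(c("early","late","average","adjusted"),c("RE","CE","r2"), function(x,y) paste(x,y,sep="_") ) )) )
names(test)=name1; stat=test # assumed step: the listing below is headed by name1 and later code calls this table stat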
# century early_RE early_CE early_r2 late_RE late_CE late_r2 average_RE average_CE average_r2 adjusted_RE adjusted_CE adjusted_r2
# 1000 0.0746 -1.663 0.3552 0.7194 0.1475 0.303 0.397 -0.758 0.3291 0.397 -0.758 0.3291
Given digital versions of the reconstruction and the “target”, it should be simplicity itself to obtain standard dendro verification statistics. But, hey, this is hardcore Team. First, Mann does some Mannian smoothing of the instrumental target. Well, we’ve managed to replicate Mannian smoothing and can follow him through this briar patch.
library(signal) # used for smoothing and must be installed
cutfreq=.1;ipts=10 #ipts set as 10 in Mann lowpass
smooth=ts( mannsmooth(target,M=npad,bwf=bf ) ,start=1850)
Now the “early-miss” verification stats, using a simple (and well-tested) program to do the calculations:
rbind( unlist(stat[ stat$century==1000, grep("early",names(stat)) ]), unlist(verification.stats(estimator=estimate,observed=smooth,calibration=c(1896,1995),verification=c(1850,1895))[c(2,5,4)]) )
# early_RE early_CE early_r2
#[1,] 0.0745930 -1.6633600 0.3551960
#[2,] 0.2883958 -0.8940804 0.1432888
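For anyone who wants to check the arithmetic, here is a minimal sketch of the three statistics under their usual definitions: RE benchmarks the squared error against the calibration-period mean of the observations, CE against the verification-period mean, and r2 is the squared Pearson correlation over the verification period. The function name verif_sketch and the toy series are hypothetical. This is not the verification.stats program called above (its source isn't shown here), and it is not Mann's Matlab code.

```r
# Sketch of RE, CE and verification r2 under the usual textbook definitions.
# verif_sketch and the toy data are hypothetical illustrations only.
verif_sketch <- function(estimator, observed, calibration, verification) {
  # both inputs are annual ts objects; window() picks out the two periods
  obs_cal <- as.numeric(window(observed, start = calibration[1], end = calibration[2]))
  obs_ver <- as.numeric(window(observed, start = verification[1], end = verification[2]))
  est_ver <- as.numeric(window(estimator, start = verification[1], end = verification[2]))
  sse <- sum((obs_ver - est_ver)^2)   # squared error over the verification period
  c(RE = 1 - sse / sum((obs_ver - mean(obs_cal))^2),  # benchmark: calibration mean
    CE = 1 - sse / sum((obs_ver - mean(obs_ver))^2),  # benchmark: verification mean
    r2 = cor(est_ver, obs_ver)^2)
}

# toy data: a trending "target" and a noisy "estimate" of it, 1850-1995
set.seed(1)
observed <- ts(0.005 * (0:145) + rnorm(146, sd = 0.1), start = 1850)
estimate <- ts(c(observed) + rnorm(146, sd = 0.1), start = 1850)
out <- verif_sketch(estimate, observed,
                    calibration = c(1896, 1995), verification = c(1850, 1895))
print(round(out, 3))
```

Because the verification-period mean minimizes the benchmark sum of squares, CE can never exceed RE under these definitions, which is a handy sanity check on any reported pair.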
And for the “late-miss” stats:
rbind( unlist(stat[ stat$century==1000, grep("late",names(stat)) ]),
# late_RE late_CE late_r2
#[1,] 0.719441 0.1474790 0.3030280
#[2,] 0.804556 0.4111129 0.4549566
In each case the second row (my calculation) should match the first row (the archived early_ and late_ values), but it doesn’t. The inability to replicate the r2 values is particularly troubling, since r2 is not affected by the various scaling transformations. I simply haven’t been able to get the reported verification r2 values, despite trying many permutations.
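The point about r2 is easy to check directly: the squared Pearson correlation is unchanged by any re-centering or rescaling of the estimate (with a nonzero scale factor), whereas RE moves with it. A small sketch on synthetic data; the series, the helper re_stat and the value of cal_mean are all made up for illustration.

```r
# Synthetic illustration: r2 is invariant to affine rescaling, RE is not.
set.seed(42)
obs <- cumsum(rnorm(46, sd = 0.1))   # stand-in "instrumental" verification values
est <- obs + rnorm(46, sd = 0.15)    # stand-in "reconstruction"
cal_mean <- 0.3                      # hypothetical calibration-period mean

# RE benchmarked against the calibration mean (hypothetical helper)
re_stat <- function(est, obs, cal_mean) {
  1 - sum((obs - est)^2) / sum((obs - cal_mean)^2)
}

est2 <- 0.5 * est + 0.2              # rescaled and shifted copy of est

r2_a <- cor(est, obs)^2
r2_b <- cor(est2, obs)^2
re_a <- re_stat(est, obs, cal_mean)
re_b <- re_stat(est2, obs, cal_mean)
print(c(r2_a = r2_a, r2_b = r2_b, RE_a = re_a, RE_b = re_b))
```

So a different centering or scaling convention can't account for an r2 mismatch; r2 depends only on the two series over the years compared.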
Since the reconstruction matches both the digital and the graphic versions, perhaps the archived instrumental version http://www.meteo.psu.edu/~mann/supplements/MultiproxyMeans07/data/instrument/iHAD_SH_reform is not the same as c:\scozztemann3\newtemp\shhinfxxxhad.
The code for the verification stats is at
http://www.meteo.psu.edu/~mann/supplements/MultiproxyMeans07/code/codeveri/veri1950_1995sm.m and http://www.meteo.psu.edu/~mann/supplements/MultiproxyMeans07/code/codeveri/veri1850_1895sm.m. They seem to have learned their programming style from Hansen, as the code is replete with steps that don’t seem to have any function, unhelpful comments made less helpful in places by inaccuracy and, most of all, an almost total lack of mathematical understanding and organization in the implementation.