Gavin Schmidt and others have claimed that the M08 usage of the Tiljander sediments didn’t “matter”, because they could “get” a series that looked somewhat similar without the sediments. They’ve usually talked around the impact of the Tiljander series on the no-dendro reconstruction. But there are two pieces of information on this. A figure added to the SI of Mann et al 2008 showed a series said to be a no-tilj no-dendro version, about which Gavin said that it was similar to the original no-dendro version, thereby showing the incorrect M08 use of the Tiljander series didn’t “matter”. However, Gavin elsewhere observed that the SI to Mann et al 2009 reported that withdrawing the Tiljander series from the no-dendro network resulted in the loss of 800 years of validation – something that is obviously relevant to the original M08 claim to have made a “significant” advance through their no-dendro network.
To better understand Gavin’s seemingly inconsistent claims, I re-examined my M08 CPS emulation – I had previously replicated much of this, but this time managed to get further, even decoding most of their (strange) splicing procedures. As I’ve done for MB98, I was able to keep track of the weights for individual proxies in the reconstruction – something not done in Mann’s original code, though obviously relevant to the reconstruction. This was not a small project, since you have to keep track of the weights through the various screening, rescaling, gridding, re-gridding steps – something that can only be done by re-doing the methodology pretty much from the foundations. However, I’m confident in my methods and the results are very interesting.
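To make the idea of weight tracking concrete, here is a minimal sketch in Python. It is not Mann's code and not my emulation; it assumes a much-simplified CPS-like chain in which screened proxies are oriented by a sign, averaged within a gridcell, area-weighted across gridcells and then rescaled by a single factor. The function name `proxy_weights` and all of its arguments are hypothetical. Because every step in such a chain is linear, the effective weight of each proxy in the final composite is just the product of the factors applied along the way.

```python
import numpy as np

def proxy_weights(cell_of_proxy, sign, passed, cell_area, scale):
    """Effective weight of each standardized proxy in a simplified CPS-like composite.

    Assumed (hypothetical) chain: screening -> sign orientation ->
    gridcell average -> area-weighted average of gridcells -> rescaling.
    """
    cell = np.asarray(cell_of_proxy)
    sign = np.asarray(sign, dtype=float)
    passed = np.asarray(passed, dtype=bool)

    weights = np.zeros(cell.size)          # screened-out proxies keep weight 0
    used_cells = np.unique(cell[passed])   # gridcells with at least one surviving proxy
    total_area = sum(cell_area[int(c)] for c in used_cells)

    for c in used_cells:
        members = passed & (cell == c)
        # equal share within the gridcell, times the gridcell's share of total area
        weights[members] = sign[members] / members.sum() * cell_area[int(c)] / total_area

    return scale * weights

# Hypothetical example: four proxies in two gridcells, one fails screening.
w = proxy_weights(
    cell_of_proxy=[0, 0, 1, 1],
    sign=[1, -1, 1, 1],
    passed=[True, True, True, False],
    cell_area={0: 1.0, 1: 1.0},
    scale=2.0,
)
```

In this toy case the lone surviving proxy in the second gridcell carries twice the absolute weight of either proxy in the first gridcell, which is the sort of effect that only becomes visible once the weights are carried through every step.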
The first graphic below shows the NH and SH reconstructions on the left for the AD1000 network for the two calibration steps considered in M08: latem (calibration 1850-1949) and earlym (calibration 1896-1995) for the “standard” M08 setup. On the right are “weight maps” for the latem and earlym networks. (The weight map here is a bit muddy – I’ve placed a somewhat better rendering of the 4 weight maps online here.) The +-signs show the locations of proxies which are not used in the reconstruction. Take a quick look and I’ll comment below.
First, there are obviously a lot of unused series in the M08 with-dendro reconstruction. Remarkably, the exclusions are nearly all dendro series. Out of 19 NH dendro chronologies, 16 dendro chronologies are not used; only three NH dendro chronologies are used: one Graybill bristlecone chronology (nv512) in SW USA; Briffa’s Tornetrask, Sweden and Jacoby-D’Arrigo’s Mongolia, all three of which are staples of the AR4 spaghetti graphs. Only one of 10 Graybill bristlecone chronologies “passes” screening.
In other words, nearly all of the proxies in the AD1000 network are “no-dendro” proxies. I.e., the supposedly improved “validation” of the with-dendro network arises not because of a general contribution of dendro chronologies to recovery of a climate signal, but because of the individual contribution of three dendro series with the other 16 series screened out.
Secondly, the reconstructions are weighted averages of the individual reconstructions. The latem and earlym reconstructions don’t appear at first glance to have remarkably different weights, but have noticeably different appearances. In the NH, the earlym 20th century is at levels that were precedented in the MWP, while the latem reconstruction has higher values in the 20th century than the MWP – BUT a marked divergence problem. This divergence problem results in a very low RE for the latem version (about 0), while the earlym version has an RE of 0.84. The earlym SH reconstruction has MWP values that are much higher than late 20th century values, while the latem SH reconstruction has MWP values that are lower than late 20th century values.
Values of the latem RE statistic appear to be an important determinant of Mannian-style “validation” – more on this later.
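For readers unfamiliar with the statistic, the following is a short Python sketch of the standard textbook definitions of RE and CE. It is not taken from the M08 code, and the function names are my own. RE compares the squared error of the reconstruction in the verification period against a "no-skill" estimate equal to the calibration-period mean; CE uses the verification-period mean as the reference instead.

```python
import numpy as np

def reduction_of_error(obs_ver, est_ver, obs_cal_mean):
    """RE = 1 - SSE / sum((obs - calibration mean)^2), over the verification period."""
    obs_ver = np.asarray(obs_ver, dtype=float)
    est_ver = np.asarray(est_ver, dtype=float)
    sse = np.sum((obs_ver - est_ver) ** 2)
    ref = np.sum((obs_ver - obs_cal_mean) ** 2)
    return 1.0 - sse / ref

def coefficient_of_efficiency(obs_ver, est_ver):
    """CE: same form as RE, but referenced to the verification-period mean."""
    obs_ver = np.asarray(obs_ver, dtype=float)
    return reduction_of_error(obs_ver, est_ver, obs_ver.mean())

# Made-up numbers: an estimate that simply repeats the calibration mean
# scores RE = 0, i.e. no skill relative to the reference.
obs = [0.0, 1.0, 2.0]
re_no_skill = reduction_of_error(obs, [0.0, 0.0, 0.0], obs_cal_mean=0.0)
```

An RE near 0, as in the latem case above, therefore means that the reconstruction does about as well in the verification period as simply carrying forward the calibration mean.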
As a fourth point – back in 2008, I’d noted that the M08 algorithm permitted the same proxy to have opposite orientation depending on calibration period and that at least one proxy did this. Note the Socotra (Yemen) speleothem in the weight map. This has opposite orientations in the two reconstructions – something that seems hard to justify on a priori reasoning and which appears to have a noticeable impact on the differing appearance of the two reconstructions.
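How a flip of this kind can arise is easy to illustrate with synthetic data. The Python sketch below assumes, purely for illustration, that a proxy's orientation is set by the sign of its correlation with temperature over the calibration window; the actual M08 screening involves gridcell temperatures and significance thresholds that are not modelled here. The series, the function name `calibration_sign` and the numbers are all made up.

```python
import numpy as np

def calibration_sign(years, proxy, temp, start, end):
    """Sign of the proxy-temperature correlation within [start, end] (inclusive)."""
    mask = (years >= start) & (years <= end)
    r = np.corrcoef(proxy[mask], temp[mask])[0, 1]
    return 1 if r >= 0 else -1

years = np.arange(1850, 1996)

# Synthetic inputs (not real data): a steadily warming temperature series
# and a proxy that rises until the 1920s and declines afterwards.
temp = 0.005 * (years - 1850)
proxy = -np.abs(years - 1922.5)

latem_sign = calibration_sign(years, proxy, temp, 1850, 1949)   # mostly the rising limb
earlym_sign = calibration_sign(years, proxy, temp, 1896, 1995)  # mostly the falling limb
```

With these synthetic inputs the proxy is oriented positively under the 1850-1949 window and negatively under the 1896-1995 window, even though nothing about the proxy itself has changed.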
In the SH, there are obviously only a few relevant proxies. The big weight comes from Lonnie Thompson’s Quelccaya (Peru) data, with other contributions from Cook’s dendro series in Tasmania and New Zealand (the Tasmania series being an AR4 staple) and from a South African speleothem.
Other NH series include Baker’s Scottish speleothem (used upside down from the orientation in Baker’s article), Crete (Greenland) ice cores – an AR4 staple, the old Fisher Agassiz (Ellesmere Island) melt series (used in Bradley and Jones 1993), the Dongge (China) speleothem, the Tan (China) speleothem. A number of these proxies have been discussed at CA.
One of the large issues in respect to MBH98-99 was the impact of bristlecones. Eventually, even Wahl and Ammann conceded that an MBH-style reconstruction did not “verify” prior to 1450 at the earliest without Graybill bristlecones. However, for the most part, the Team avoided talking about bristlecones, most often trying to equate no-bristle (or even no-Graybill) sensitivities with no-dendro sensitivity, thereby over-generalizing criticisms of bristlecone chronologies to criticism of all dendro chronologies. M08 adopted the same tactic – discussing no-dendro, rather than no-bristle (which was the actual point at issue.)
I’ve done experiments calculating M08 style CPS reconstructions with first no-tilj and then with no-tilj no-bristle. At this point, no-tilj should be the base case for an M08 style result – as there is no scientific justification for including this data in an M08 style algorithm: it doesn’t meet any plausible criteria for inclusion.
Below are results for the no-tilj no-bristle case. At first glance, the shape of the recons looks fairly similar to the M08 case. In detail, there are some important differences: for example, the divergence problem in the no-tilj no-dendro latem reconstruction is much more pronounced than in the M08 reconstruction where the huge ramp of the Tilj sediments and the bristlecones mitigates the divergence problem considerably.
These differences arise with relatively little difference in the relative weights of the other proxies.
M08 has a very unique methodology for splicing reconstruction steps – one which you definitely can’t read about in Draper and Smith. First they calculate RE statistics for latem and earlym reconstructions. In the figure below, I’ve plotted latem and earlym RE statistics for the different steps under three cases:
(1) M08 from their archive, shown as a line
(2) no-tilj (using my emulation of M08) shown as + signs.
(3) no-tilj no-bristle shown as “o”. As noted above, “no-bristle” in this context only involves one series (nv512.)
The latem RE stat decreases quite dramatically without the Tilj and nv512 data sets.
This sharp decline in latem RE statistic ends up affecting the rather weird M08 “validation” method. From the earlym and latem RE statistics, Mann calculated an “average” RE statistic – this is another ad hoc and unheard of method. If the “average” RE statistic is above a benchmark that looks like it’s about 0.35 (note that this benchmark is a far cry from the benchmark of 0 used in MBH98 and Wahl and Ammann 2007 – one that we criticized in our 2005 articles – more on this on another occasion), the series is said to “validate”. If the addition of more data in a step fails to increase the average RE (and the average CE), then the earlier version with fewer data is used. This is “justified” in the name of avoiding overfitting, but this is actually an extra fitting step based on RE statistics.
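The logic of this step, as I read it, can be sketched in a few lines of Python. This is a simplification under stated assumptions, not a transcription of Mann's code: it uses only the average RE (the CE condition is omitted), it takes 0.35 as the benchmark because that is what the benchmark looks like to me, and the class `Step`, the function `select_networks` and the RE values in the example are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Step:
    start_year: int    # first year covered by this step's proxy network
    re_latem: float    # RE from the latem (1850-1949) calibration
    re_earlym: float   # RE from the earlym (1896-1995) calibration

    @property
    def re_avg(self) -> float:
        return 0.5 * (self.re_latem + self.re_earlym)

def select_networks(steps, benchmark=0.35):
    """For each step, return (start year of the network actually used, validated?).

    Steps are visited from the longest/sparsest network to the shortest/densest.
    A denser network replaces the retained one only if it raises the average RE.
    """
    chosen = {}
    best = None
    for step in sorted(steps, key=lambda s: s.start_year):
        if best is None or step.re_avg > best.re_avg:
            best = step
        chosen[step.start_year] = (best.start_year, best.re_avg > benchmark)
    return chosen

# Made-up RE values, for illustration only.
result = select_networks([
    Step(1000, re_latem=0.0, re_earlym=0.84),
    Step(1100, re_latem=0.1, re_earlym=0.5),
    Step(1200, re_latem=0.5, re_earlym=0.8),
])
```

In this made-up example the AD1100 step has a lower average RE than the AD1000 step, so the AD1000 network is carried forward for it. The sketch also shows why the choice is itself a fitting step: which network is used for a given period depends on the RE statistics.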
In any event, the reason why the no-tilj no-bristle (and a fortiori, no-tilj no-dendro) fails to “validate” prior to AD1500 or so is simply that the latem RE statistic becomes negative due to the divergence problem – without the Tilj series and nv512. (I haven’t studied the EIV/RegEM setup, but I suspect that the same sort of thing is what’s causing its failure as well.)
In a way, the situation is remarkably similar to the MBH98 situation and bristlecone sensitivity. One point on which Mann et al and ourselves were in agreement was that the AD1400 MBH98 reconstruction failed their RE test without bristlecones. (And also earlier steps.) In the terminology of M08, without bristlecones, they did not have a “validated” reconstruction as at AD1400 and thus could not make a modern-medieval comparison with the claimed statistical confidence.
Ironically, the situation in M08 appears to be almost identical. Once the Tilj proxies are unpeeled, Mann once again doesn’t have a “validated” reconstruction prior to AD1500 or so, and thus cannot make a modern-medieval comparison with the claimed statistical confidence. (By saying this, I do not agree that his later comparisons mean anything; however, they don’t “matter” for the modern-medieval comparison.)