NASA blogger Gavin Schmidt, as part of his ongoing attempt to rehabilitate Mannian paleoclimate reconstructions (characterized here as dendro-phrenology), has drawn attention to a graphic posted at Mann’s website in November 2009. In this graphic, Mann responded to criticisms that his “no-dendro” stick had been contaminated by bridge-building sediments despite warnings from the author (warnings noted by Mann himself, though the contaminated data was used anyway). I’ll show this figure at the end of the post, but first I’m going to show the “raw materials” for this “reconstruction” and my results from the same data.
I’m going to show a lot of plots of “proxies” today. The intuitive idea of a proxy is that the thing being measured (tree ring width, sediment thickness, ice core O18, etc.) has a linear relationship with a temperature “signal” plus low-order red noise. Therefore, if the temperature “signal” is a hockey stick, the various proxy plots should look like a hockey stick plus low-order red noise. I encourage readers to look at the no-dendro no-Tilj data for Mann’s November 2009 example with that in mind. If the topics were being discussed by proper statisticians, the properties of the “noise” would be discussed, rather than ignored.
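To make the “signal plus red noise” idea concrete, here is a minimal Python sketch of a synthetic proxy. Everything in it is an illustrative assumption rather than anything taken from Mann et al 2008: the function names, the shape of the “hockey stick” signal, the slope and intercept, and the AR1 coefficient and noise level.

```python
import random

def ar1_noise(n, rho, sigma, rng):
    """n values of AR(1) ("red") noise: e[t] = rho * e[t-1] + w[t]."""
    e, out = 0.0, []
    for _ in range(n):
        e = rho * e + rng.gauss(0.0, sigma)
        out.append(e)
    return out

def synthetic_proxy(signal, slope, intercept, rho, sigma, rng):
    """Proxy = linear transformation of the signal plus AR(1) noise."""
    noise = ar1_noise(len(signal), rho, sigma, rng)
    return [intercept + slope * s + e for s, e in zip(signal, noise)]

rng = random.Random(0)

# Hypothetical "hockey stick" signal: flat for 900 steps, then rising.
signal = [0.0] * 900 + [0.01 * i for i in range(100)]

# A noisy proxy, and the same transformation with the noise switched off.
proxy = synthetic_proxy(signal, slope=2.0, intercept=1.0, rho=0.3, sigma=0.5, rng=rng)
noiseless = synthetic_proxy(signal, slope=2.0, intercept=1.0, rho=0.3, sigma=0.0, rng=rng)
```

Under this model, every proxy responding to the same signal should look like a rescaled copy of `noiseless` with modest red noise on top, which is the yardstick to keep in mind when looking at the plots.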
To illustrate the calculation, I’ve picked the AD1000 Mann 2008 data set as an example since it covers the MWP. I’ve used the late-miss version (calibration 1859-1949) to work through, since it will give a look at any potential “divergence problems” in non-dendro data.
There were 29 “proxies” in the data set: 11 sediments, 2 “documentary” (both Chinese), 9 speleo and 7 ice core. Eleven of these were annually resolved; the other 18 were “decadal” resolution. 22 were NH; 7 SH.
The first step in Mann’s algorithm is determining the orientation of speleo and documentary proxies through their after-the-fact correlation to instrumental data. (The orientation of other proxies is presumed to be known a priori). In this network, there were 11 speleo+documentary proxies and 5 of 11 were flipped. (Interestingly, it is possible in Mann’s algorithm for the same proxy to have opposite “significant” orientations depending on the calibration period.)
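The orientation step can be sketched as follows. This is a toy illustration of the idea as described above (flip a series when its calibration-period correlation is negative), not Mann’s code; the function names and the made-up six-point series are assumptions. The example also shows how one series can come out with opposite orientations under two different calibration windows.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def orient(proxy, instrumental, calib):
    """Flip the whole proxy if it correlates negatively in the calibration window."""
    r = pearson(proxy[calib], instrumental[calib])
    flipped = r < 0
    oriented = [-v for v in proxy] if flipped else list(proxy)
    return oriented, flipped

# Made-up data: steadily rising temperature, proxy that rises and then falls.
temperature = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
proxy = [1.0, 2.0, 3.0, 3.0, 2.0, 1.0]

_, flipped_early = orient(proxy, temperature, slice(0, 3))  # early calibration window
_, flipped_late = orient(proxy, temperature, slice(3, 6))   # late calibration window
```

Here the early window leaves the series as it is and the late window flips it, so the sign of the “signal” depends on which calibration period happens to be chosen.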
The next step is to screen out proxies that do not have a “significant” correlation to gridcell temperature. Although we’ve heard much invective against the meaningfulness of r^2 statistics from Mann, Schmidt and others in the context of MBH98, Mann then uses correlation (r) to screen series in Mann et al 2008. (Perhaps it is the squaring of the correlation statistic that Schmidt takes exception to.)
There were 16 proxies that “passed” Mannian significance: 3 of 11 sediments, both “documentary” (Chinese), 7 of 9 speleo and 4 of 7 ice cores. Seven of 11 annually resolved passed; nine of 18 decadally resolved passed. 12 of 22 NH passed; 4 of 7 SH passed.
In the figure below, I’ve plotted all 22 NH “proxies” (standardized), coloring the “rejected” proxies in green. I don’t think that anyone can reasonably look at these 22 series and say that the individual “proxies” can be interpreted as different linear transformations of a Hockey Stick plus low-order AR1 red noise or that the individual proxies look much like one another. They are a hodge-podge to say the least. This is the problem of proxy inconsistency that I’ve talked about frequently and that Ross and I reported in our comment at PNAS on Mann 2008. Mann either didn’t understand or pretended not to understand the problem, which is fundamental to the entire enterprise of proxy reconstructions and readily apparent merely by plotting the “proxies”.
While “ex post screening” by correlation is accepted as a given by realclimatescientists, it is not a statistical procedure that is recommended or discussed in Draper and Smith or standard statistical texts. The tendency of this procedure to produce sticks from red noise is well known in the technical blogosphere (Jeff Id, David Stockwell, Lubos Motl and myself have all more or less independently noticed and reported the phenomenon, with David publishing a short note in an Australian mining newsletter that Ross and I cited in our PNAS comment). However, professional climate scientists appear unaware of the effect and it remains unreported in the PeerReviewedLiterature.
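The effect is easy to try for yourself. The toy simulation below generates pure AR1 red noise with no signal whatever, keeps only the series that correlate with a rising “instrumental” target in the calibration window, and averages the survivors. All settings are illustrative assumptions (500 series, AR1 coefficient 0.9, a linear target, a one-sided threshold of 0.3); none is taken from Mann et al 2008 or from the blog posts mentioned above.

```python
import random
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def ar1(n, rho, rng):
    """n values of AR(1) red noise with unit-variance innovations."""
    e, out = 0.0, []
    for _ in range(n):
        e = rho * e + rng.gauss(0.0, 1.0)
        out.append(e)
    return out

def screen_and_average(series, target, calib, r_min):
    """Keep series whose calibration correlation exceeds r_min; average them."""
    passed = [s for s in series if pearson(s[calib], target) > r_min]
    if not passed:
        raise ValueError("no series passed screening")
    n_time = len(series[0])
    composite = [sum(s[t] for s in passed) / len(passed) for t in range(n_time)]
    return passed, composite

rng = random.Random(42)
n_years, n_calib = 1000, 100
calib = slice(n_years - n_calib, n_years)       # last 100 "years"
target = [0.01 * i for i in range(n_calib)]     # hypothetical rising temperature
series = [ar1(n_years, 0.9, rng) for _ in range(500)]

passed, composite = screen_and_average(series, target, calib, r_min=0.3)
r_composite = pearson(composite[calib], target)
```

Since every surviving series has positive covariance with the target in the calibration window, so does their average: `r_composite` is positive by construction. Outside the calibration window the series are unrelated and tend to cancel. Plotting `composite` is the quickest way to see what shape that leaves.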
The top left proxy (192) is an interesting one. It is Baker’s speleothem record from Scotland that was discussed at CA in early 2009 and here as an interesting example of Upside-Down Mann. In the orientation applied in Mann’s no-dendro no-Tiljander reconstruction endorsed by Gavin Schmidt, Scotland is shown as having experienced the unique phenomena of the Medieval Cold Period and Little Warm Age – bizarro Hubert Lamb, as it were.
The “proxies” show little evidence of an overall pattern, let alone a Stick.
Next, here is a summary plot of the 12 NH “proxies” that “pass” Mannian screening, this time with flipped proxies shown in red. The top left proxy is still the speleothem with the Scottish Medieval Cold Period and Little Warm Age. These are the same series as in the above graphic, restricted to the proxies that are accepted. The proxy with the hockey stick shape here is Fisher’s Agassiz, Ellesmere Island melt series, a proxy which has been around for a long time, used in Bradley and Jones 1993, for example.
In the next step in Mann 2008 CPS, the series are Mann-smoothed (Butterworth filter plus Mann endpoints). The smoothed series are then re-standardized on the (short) calibration period. The smoothing of the ternary series in the third column (a Chinese documentary series) has an interesting effect.
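The re-standardization half of this step can be sketched as below. The Butterworth smoothing and Mann’s endpoint treatment are not reproduced; the input is simply assumed to be already smoothed. The function name, the use of the sample standard deviation and the made-up numbers are all assumptions for illustration.

```python
from statistics import mean, stdev

def restandardize(series, calib):
    """Center and scale the whole series using calibration-window statistics only."""
    window = series[calib]
    m, s = mean(window), stdev(window)  # sample SD; an assumption
    return [(v - m) / s for v in series]

# Made-up "smoothed" series; the calibration window is the last three values.
smoothed = [1.0, 2.0, 3.0, 4.0, 10.0, 12.0, 14.0]
calib = slice(4, 7)
restd = restandardize(smoothed, calib)
```

Because the mean and standard deviation come only from the short calibration window, values outside that window can land many standard deviations from zero (here the first value becomes -5.5).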
The proxy series are then averaged within a gridcell. You’ll notice that some gridcells are identical. This results because the Mann algorithm contains what I called (in 2008) a “stupid pet trick” – if Mann transcribed the location of a proxy as being exactly on the border of a gridcell (e.g. 25E), the proxy is allocated to both gridcells, in effect doubling the weight of the proxy. In the case of the Socotra stalagmite, the stalagmite is not actually located at 25E and the doubling occurs only because of a transcription error – not that the doubling makes any sense in the first place.
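The border behaviour can be illustrated with a few lines of Python. This is a sketch of the behaviour described above, not a transcription of Mann’s code; the 5-degree cell size, the function names and the example coordinates are assumptions, and longitude wrap-around is ignored.

```python
def edges(coord, size):
    """Lower edge(s) of the cell(s) containing coord along one axis.

    A coordinate lying exactly on a border is assigned to both neighbours.
    """
    lower = (coord // size) * size
    if coord == lower:
        return [lower - size, lower]
    return [lower]

def allocate(lat, lon, size=5.0):
    """All gridcells (identified by their lower-left corner) a proxy falls into."""
    return [(la, lo) for la in edges(lat, size) for lo in edges(lon, size)]

on_border = allocate(12.5, 25.0)   # longitude exactly on a cell border
off_border = allocate(12.5, 24.0)  # same latitude, one degree to the west
```

With these inputs `on_border` holds two cells and `off_border` one, so a proxy recorded on the border enters the later averaging twice.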
The gridded data are then re-centered and re-scaled to match the mean and standard deviations of the corresponding gridcell instrumental data – thereby yielding an estimate of the gridcell temperature. The 12 NH gridcells are shown below.
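The re-centering and re-scaling amounts to matching two moments over the calibration window. A minimal sketch, assuming sample standard deviations and using made-up numbers (the function name and data are illustrative, not from the script):

```python
from statistics import mean, stdev

def scale_to_instrumental(proxy, calib, instrumental):
    """Rescale a gridded proxy so its calibration-window mean and SD
    match those of the instrumental series for the same gridcell."""
    window = proxy[calib]
    p_mean, p_sd = mean(window), stdev(window)
    i_mean, i_sd = mean(instrumental), stdev(instrumental)
    return [(v - p_mean) / p_sd * i_sd + i_mean for v in proxy]

# Made-up gridded proxy; the last three values overlap the instrumental record.
gridded = [0.0, 2.0, 4.0, 6.0, 8.0]
calib = slice(2, 5)
instrumental = [0.1, 0.2, 0.3]  # gridcell temperature anomalies (deg C)

estimate = scale_to_instrumental(gridded, calib, instrumental)
```

In this toy case the proxy is perfectly linear, so the calibration portion of `estimate` reproduces the instrumental values and the earlier values are extrapolated along the same line. Note that the scaling forces a match in mean and variance whether or not the proxy actually tracks temperature.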
Of the resulting 14 gridcells, 8 are north of 30N. Mann attempts to balance the weights through an odd Mannian mechanism of re-gridding the north of 30N cells into 10×10 cells, averaging the data within each gridcell. This reduces the number of gridcells from 14 (NH – 12) to 10 (NH – 8). The series with the Scottish Little Warm Age survives these various operations pretty much unscathed. These again are a sort of temperature estimate.
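A sketch of the regridding idea: cells at or north of 30N are merged into coarser 10×10 cells by averaging, while the remaining cells are left alone. The cell keys (lower-left corners), the handling of the 30N boundary, the function name and the toy values are assumptions made for illustration; the real bookkeeping in the script may differ.

```python
from collections import defaultdict

def regrid_north(cells, lat_min=30.0, coarse=10.0):
    """cells maps (lat_edge, lon_edge) -> list of values by time step.

    Cells with lat_edge >= lat_min are grouped into coarse x coarse cells
    and averaged; other cells pass through unchanged.
    """
    groups = defaultdict(list)
    for (lat, lon), series in cells.items():
        if lat >= lat_min:
            key = ((lat // coarse) * coarse, (lon // coarse) * coarse)
        else:
            key = (lat, lon)
        groups[key].append(series)
    return {
        key: [sum(vals) / len(vals) for vals in zip(*members)]
        for key, members in groups.items()
    }

# Toy input: two adjacent northern cells, one isolated northern cell, one tropical cell.
cells = {
    (30.0, 0.0): [1.0, 2.0],
    (35.0, 5.0): [3.0, 4.0],
    (55.0, -5.0): [0.0, 0.0],
    (10.0, 50.0): [5.0, 5.0],
}
regridded = regrid_north(cells)
```

Here four cells become three: the two adjacent northern cells are averaged into one, which is the mechanism by which the count drops in the network above.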
Mann then does a weighted average of the gridcells – weighting each by the cos (latitude) – to yield a NH (and SH) estimate.
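The weighting is an ordinary area-weighted mean. A minimal sketch with two made-up gridcells (the function name, the choice of which latitude represents a cell, and the values are assumptions):

```python
from math import cos, radians

def coslat_mean(cells):
    """cells is a list of (latitude_in_degrees, series) pairs.

    Returns the cos(latitude)-weighted average at each time step.
    """
    weights = [cos(radians(lat)) for lat, _ in cells]
    total = sum(weights)
    n_time = len(cells[0][1])
    return [
        sum(w * series[t] for w, (_, series) in zip(weights, cells)) / total
        for t in range(n_time)
    ]

# One equatorial cell (weight 1) and one cell at 60N (weight 0.5).
cells = [(0.0, [1.0, 1.0]), (60.0, [4.0, 4.0])]
hemispheric = coslat_mean(cells)
```

With weights of 1 and 0.5, the average works out to (1×1 + 0.5×4)/1.5 = 2 at each step, closer to the equatorial cell than a plain average of 2.5 would be.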
The figure below shows the No-dendro No-Tilj for the AD1000 network, using Mannian methods endorsed by Gavin Schmidt.
Now here is the version at Mann’s website, which looks nothing like my emulation with the 29 proxies (16 screened) from the AD1000 no-Tilj no-dendro network.
What accounts for the difference? I’m fairly sure that this calculation is close to the M08 calculation for the corresponding step. I’ve groundtruthed my R-emulation against Matlab intermediates calculated by UC and Jean S in 2008. Because the CPS calculations are, at the end of the day, weighted averages, the composite is going to bear some relationship to the proxies, hence the methodical plotting of intermediates at each step to benchmark the calculation. So while there’s always the possibility of a misstep in emulating Mannian calculations, I don’t see how such a misstep would alter the general shape of the AD1000 CPS calculation (since the general shape can be discerned in the average at each stage).
Here’s where I think the difference lies. Mann’s graphics all show the results of spliced reconstructions rather than what you get with proxies going back to AD1000. The provenance of the network used in Mann’s November 2009 revision of a figure in his SI isn’t described as clearly as it might be. My interpretation of the figure is that the network includes 71 Luterbacher gridded European series which use instrumental temperature data.
It is my surmise that in its latter portion, the stick-ness of the “new” no_tilj no-dendro reconstruction derives from splicing the Luterbacher gridcell data (using instrumental data) onto the horrible no-dendro reconstruction. I’m not 100% sure of this, but that’s my surmise. I’ll experiment with the splicing steps on another occasion.
Make a stick, make a stick, Michael Mann
Make us a stick as only you can
Flip it and smooth it and pick it to be
In the report for IPCC.
Update Aug 1, 2010: Script is http://www.climateaudit.info/scripts/mann.2008/benchmark_manniancps_blog_20100730.txt . I added a couple of operations at the end to calculate the CPS from the composite shown in the post and to calculate verification stats. The script here is used to step through; it is wrapped in a function manniancps that reconciles perfectly through the regrid and very closely to the composite.