As it happens, the image Gore presented actually was Mann’s Hockey Stick, spliced together with the post-1850 instrumental record as if they were a single series, and not Thompson’s ice core data at all. See Al Gore and “Dr. Thompson’s Thermometer”.
In fact, Thompson and co-authors really had published an admittedly similar-looking series in Climatic Change in 2003, which was supposedly the source of the Gore image. (See Gore Science “Advisor” Says That He Has No “Responsibility” for AIT Errors.) But despite Gore’s claim that it measured temperature, Thompson had made no attempt to calibrate it to temperature. Instead, it was an uncalibrated composite Z-score series based on oxygen isotope ratios from 6 ice cores, going back to 1000 AD.
In 2006, Thompson and co-authors published what should have been an even better series, based on an additional seventh core and going back 2000 years, in the prestigious Proceedings of the National Academy of Sciences. This “Tropical Composite” Z-score series (TCZ) is shown below. Although again it is not actually calibrated to temperature, it does have the distinctive “Hockey Stick” shape, with only a faint Little Ice Age, and only a trace of a Medieval Warm Period. (Click on images for larger view.)
Limited data for Thompson’s article was posted at the time of its publication on the PNAS website. Although TCZ and two regional sub-indices go back 2000 years as decadal averages, data for the component ice cores only goes back to 1600 AD, as 5-year averages. Nevertheless, this data should be adequate to determine how TCZ was constructed from the underlying series, and then to determine whether it can be validly calibrated to temperature and, if so, how significant the relationship is and what the confidence interval of the reconstruction would be, if only back to 1600 AD.
Figure 2 below shows Thompson’s posted data for the 4 Himalayan cores that were used, and Figure 3 shows his data for the 3 Andean cores. In both figures, the posted 5-year averages of δ18O isotope depletions have been aggregated to decadal averages for comparison to TCZ. Surprisingly, the raw data does not suggest the distinctive “Hockey Stick” shape of TCZ in Figure 1, with its record high in the decade of the 1990s. In particular, only two of the series extend completely into this decade, and of these only Quelccaya ends on a record high.
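The aggregation from 5-year to decadal averages is straightforward. A minimal sketch (hypothetical values; this assumes each decade spans two consecutive 5-year means):

```python
import numpy as np

# Hypothetical 5-year averages of d18O for one core, oldest first.
five_year = np.array([-9.8, -9.6, -9.7, -9.5, -9.9, -9.4])  # 6 pentads = 3 decades

# Each decadal average is the mean of two consecutive 5-year averages.
decadal = five_year.reshape(-1, 2).mean(axis=1)
print(decadal)  # [-9.7  -9.6  -9.65]
```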
Thompson’s data on the PNAS website does include decadal data for the past 2000 years on two regional sub-indices, a Himalayan Composite Z-score index (HCZ) presumably based on the 4 Himalayan cores, and an Andean Composite Z-score index (ACZ) presumably based on the 3 Andean cores. These are shown in Figure 4 below, for 1600-2000 for comparison to Figures 2 and 3.
Thompson et al. (2006) do not indicate how TCZ was derived from the underlying data, but it should be possible to back out whatever formula was used from the published data and sub-series.
In my EE paper, I first regress TCZ on a constant, HCZ and ACZ, and find that the constant is essentially zero, while the weights on HCZ and ACZ are both essentially 0.5. The t-statistics on the HCZ and ACZ coefficients were an off-the-charts 567.49 and 805.64, respectively. The R2 is 0.999985, or essentially unity, while the standard deviation of the regression errors is 0.0028. Since the three indices are only tabulated to 2 decimal places, rounding error alone could account for a standard deviation of s = 0.01/sqrt(12) = 0.0029, so the regression is essentially a perfect fit to within rounding error.
The formula for TCZ was therefore simply TCZ = (HCZ + ACZ)/2. It is not clear why one would want to average Z-scores in this manner, but at least it is a well-defined and replicable formula.
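The back-out regression can be sketched as follows. This uses synthetic data (the PNAS tables are not reproduced here): generate HCZ and ACZ, form TCZ = (HCZ + ACZ)/2 rounded to two decimals as in the posted tables, and verify that OLS recovers the 0.5 weights with a residual standard deviation at the rounding-error level of 0.01/sqrt(12) ≈ 0.0029.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200  # number of decades, purely illustrative
hcz = rng.standard_normal(n)
acz = rng.standard_normal(n)
tcz = np.round((hcz + acz) / 2, 2)  # tabulated to 2 decimal places

# OLS of TCZ on a constant, HCZ, and ACZ.
X = np.column_stack([np.ones(n), hcz, acz])
beta, *_ = np.linalg.lstsq(X, tcz, rcond=None)
resid = tcz - X @ beta
s = np.sqrt(resid @ resid / (n - 3))  # regression standard error

print(beta)  # constant near 0, both slopes near 0.5
print(s)     # near 0.01/sqrt(12), i.e. rounding error alone
```

The same machinery, applied to the HCZ and ACZ regressions on the component cores, is what reveals the far larger residuals discussed below.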
If HCZ and ACZ were any linear function of their component series, it should similarly be possible to use a linear regression to back these relationships out of the data with comparable R2 and s.
In a regression of HCZ on Guliya, Puruogangri, Dunde and Dasuopu, R2 was only 0.7673, far less than expected, and s was 0.2897, or 100-fold larger than it could have been due to rounding error alone. Two of the series, Guliya and Dunde, were insignificantly different from zero (t = 0.41 and 1.14, resp.), and so might as well not have been used at all. The other two, Puruogangri and Dasuopu, were significant (t = 3.29 and 6.90, resp.), but their t-stats fall far short of those that should have been obtained with any exact linear relationship.
If the individual isotope ratio series were calibrated to temperature before aggregation into the regional indices, one would expect unequal, and possibly even zero, coefficients on the series, so that there is nothing per se wrong with not using Guliya and Dunde. Nevertheless, if the index really was based on these series, whatever coefficients were used should have come through in this regression.
In order to check if time-varying coefficients were used, I split the full sample of 37 decades for which all series were available into four subperiods of size 10, 9, 9, and 9. I found that even Puruogangri was not significant at any level worth mentioning, except in the last subperiod (1890-1979). In the third subperiod (1800-1879), not a single one of the four slope coefficients had a t-statistic greater than 1 in absolute value, and the hypothesis that all four were zero could not be rejected with an F-test (p = 0.399).
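The subperiod joint test is a standard F-test comparing the restricted (intercept-only) and unrestricted fits. A hedged sketch, assuming the decadal series are available as numpy arrays (random placeholder data is used here):

```python
import numpy as np
from scipy import stats

def joint_slopes_f_test(y, X_slopes):
    """F-test of H0: all slope coefficients are zero (intercept retained)."""
    n, k = X_slopes.shape
    X = np.column_stack([np.ones(n), X_slopes])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    ssr_u = np.sum((y - X @ beta) ** 2)   # unrestricted sum of squared residuals
    ssr_r = np.sum((y - y.mean()) ** 2)   # restricted: intercept only
    F = ((ssr_r - ssr_u) / k) / (ssr_u / (n - k - 1))
    p = stats.f.sf(F, k, n - k - 1)
    return F, p

# Illustrative call on a 9-decade subperiod with 4 cores (random data here,
# standing in for HCZ and the four Himalayan series):
rng = np.random.default_rng(1)
y = rng.standard_normal(9)
X = rng.standard_normal((9, 4))
F, p = joint_slopes_f_test(y, X)
```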
The results for the Andean composite ACZ were less incoherent, but still unacceptable. R2 was 0.9871, which is big for an ordinary regression, but far less than expected for what should have been an exact fit. The error standard deviation s was 0.1032, which is still 30-fold larger than can be explained by rounding error alone. Although the t-statistics were not close to those in the first regression, a clear pattern emerged that Quelccaya and Huascaran received approximately equal weights, while Sajama received about half their common value.
I naturally first attempted to clear up this apparent discrepancy informally, by e-mailing Lonnie Thompson and most of his coauthors on Jan. 23, Jan 26, and again on Feb. 6, 2008, but received no reply from any of them. An abstract of my note, with details in an online Supplementary Information, was then submitted as a letter to PNAS, but was rejected on the grounds that PNAS does not publish corrections to articles more than three months old! Energy and Environment kindly agreed to publish my comment instead.
It is conceivable that TCZ in the posted data was constructed from an already obsolete or less reliable version of the ice core data. If so, Thompson et al. should provide PNAS a corrected version of TCZ, HCZ and ACZ that is actually based on the posted ice core data. Or, if the posted ice core data was already obsolete or considered less reliable when TCZ was constructed from better data, they should instead provide a corrected version of it, so that the relationship of TCZ to its actual component series can be confirmed. In either case, the ice core data should be extended back to 0 AD to include all the data that was used in constructing TCZ, in order to permit replication of the pre-1600 portion of this now-questionable series, as well as its calibration to instrumental global temperature.
It should be noted that already in 2007 Steve McIntyre appealed to PNAS to ask Thompson to post complete data for this paper, but with no success. See “More Evasion by Thompson” and preceding threads. Thompson’s ice core data is a potentially valuable source of information about pre-instrumental global climate. It is most unfortunate for science that he has not posted complete and consistent versions of all his data. [See, in particular, “More on Guliya” and “IPCC and the Dunde Variations” and preceding posts. — HM]
Until Thompson either corrects the TCZ series or provides corrected ice core data that is consistent with it, no credence can be placed in this supposedly improved version of “Dr. Thompson’s Thermometer”. And until PNAS insists that he post consistent and complete data, it can have no more credibility than that famous science humor publication, The Journal of Irreproducible Results.
Update based on May 1, 2009 Comment #123 below:
It turns out that the root of the problem is that the decadal average Z-scores shown in Thompson's Figure 6 and my Figure 1 above, and tabulated back to year 0 in Data Set 3, are not the decadal averages of the 5-year average Z-scores shown back to 1600 in Thompson's Figure 5 and in his Data Set 2:
Evidently different data was used for Figure 6 (which Thompson compares approvingly to the MBH99 HS) than for Figure 5, even though the article and SI give no indication that this was the case.
The problem is not isolated to the data for just one region, since the two regional indices have the same problem:
It turns out that the Data Set 2 composite scores really can be constructed as linear combinations of the Data Set 2 core data, with all coefficients for the appropriate regions highly significant as if from an exact fit to within rounding error. The included coefficients are similar but unequal, as if z-scores were computed from the d18O data and then averaged for each region.
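That construction — z-scoring each core over a common period and then averaging within a region — can be sketched as follows (hypothetical arrays; the actual standardization period Thompson used is not stated):

```python
import numpy as np

def regional_composite(cores):
    """Z-score each core over its full length, then average across cores.
    cores: 2-D array, rows = cores, columns = 5-year periods."""
    z = (cores - cores.mean(axis=1, keepdims=True)) / cores.std(axis=1, ddof=1, keepdims=True)
    return z.mean(axis=0)

# Three hypothetical Andean d18O series (d18O values are negative per mil):
rng = np.random.default_rng(2)
cores = rng.standard_normal((3, 80)) - 17.0
composite = regional_composite(cores)
```

Because each core has its own standard deviation, the implied weight on each raw d18O series is 1/(m·σ_i) for m cores: similar but unequal coefficients, consistent with what the regression on the Data Set 2 core data finds.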
However, the problem cannot just be that Figure 6 was computed from the same data but using standard errors for the 2000-year period instead of the 400-year period of Figure 5, since then there would still be an exact (though different) linear relation between the Figure 6 composites and decadal averages of the Figure 5 core data. As I show in my EE paper, no such linear relation exists.
So the problem still persists — which is the right data to use to construct an ice core temperature proxy? The data of Figure 5 (which at least is identified by core number in DS2), or the (unspecified) data of Figure 6?
Since Figure 5 is not representative of the data that was used to construct Figure 6, Thompson has failed to provide, as required by the PNAS data policy, sufficient data to replicate his results. At a minimum he should identify the archived cores that were used to construct Figure 6, and provide detailed data for any unarchived cores that were used.