Jean S pointed out the following quote from the NAS Report and suggested that this be discussed:
"The first systematic, statistically based synthesis of multiple climate proxies was carried out in 1998 by M.E. Mann, R.S. Bradley and M.K. Hughes (Mann et al. 1998); their study focused on temperature for the last 600 years in the Northern Hemisphere. The analysis was later extended to cover the last 1,000 years (Mann et al. 1999), and the results were incorporated into the 2001 report of the Intergovernmental Panel on Climate Change (IPCC 2001)."
Jean S drew attention to the following comment from my report on Hughes at NAS:
"Hughes said that the calculation of a global or hemispheric mean was a ‘somewhat tedious afterthought’ to these spatiotemporal maps. He cited Groveman 1979; and Bradley and Jones 1993 as originating the practice."
Here’s a discussion of Groveman and Landsberg 1979; I’ll discuss Bradley and Jones 1993 on another occasion. The conclusion is that MBH is NOT the "first systematic, statistically based synthesis of multiple climate proxies", as both the studies mentioned by Hughes would fit that description. I’ll discuss what the distinctive "contribution" of MBH actually was on another occasion.
MBH98 itself did not claim to be the "first" multiproxy study; they described themselves only as providing a “new statistical approach” and actually cited Bradley and Jones and several other studies (but not Groveman and Landsberg 1979):
"A variety of studies [Barnett et al 1995; Mann et al 1995; Bradley and Jones 1993; Hughes and Diaz 1994; Diaz and Pulwarty 1994] have sought to use a ‘multiproxy’ approach to understand long-term climate variations, by analysing a widely distributed set of proxy and instrumental climate indicators to yield insights into long term global climate variations. Building on such past studies, we take a new statistical approach to reconstructing global patterns of annual temperature back through the 15th century, based on the calibration of multiproxy data networks by the dominant patterns of temperature variability in the instrumental record."
Jean S observed that Groveman and Landsberg 1979 is not cited in Bradley and Jones (1993), but is cited in Bradley and Jones (1995) as follows:
"The resulting time series [Bradley and Jones 1993] back to 1400 provides the best reconstruction of Northern Hemisphere "summer" conditions currently available. Many other review papers have intercompared reconstructions in the past (e.g. Williams and Wigley, 1983), but only one other composite series for the last few centuries had been published (Groveman and Landsberg, 1979). However, this series combines a number of records that are poorly calibrated in terms of climate response, leading to a composite series that is difficult to interpret."
Crowley, who, as one of the peer reviewers for NAS, presumably passed on the above statement by the NAS Panel, considered both the Groveman and Landsberg 1979 and Bradley-Jones 1993 NH reconstructions in Crowley and Kim (GRL 1996).
Groveman and Landsberg 1979 was recently considered in Thejll and Schmith 2005 in the context of distinguishing OLS and Cochrane-Orcutt methods.
The Groveman and Landsberg NH reconstruction has also been used in some solar correlation studies.
Jean S concludes that Groveman and Landsberg is "pretty much similar to HT methods, so this should get the ‘pioneering status’ out from the hockey team."
Groveman and Landsberg 1979
So what did Groveman and Landsberg 1979 do? They asked whether NH temperature could be reconstructed from a small subset of series. They used 20 series, of which all but 3 were instrumental series. (The only other proxy study to use a lot of instrumental temperature series as “proxies” is, ahem, MBH98, which used 12 instrumental series.) Most of the G-L instrumental series are attributed to Borzenkova et al (1976, Meteorologiya i Gidrologiya 7, 27-35). The three G-L proxy series were Alaska tree ring widths (Karlstrom 1961, Ann NY Acad Sci 95, 290), Finnish tree ring index (Siren 1961, Comm Inst Forestalis Fenniae 54) and Tokyo winter temperatures (Gray 1974, Weather 29, 103).
Their multivariate method can hardly be recommended. They do a separate calibration for each step in which the network changes. For each step, they do a multiple linear regression of the NH temperature index against the proxies, retaining “significant” values. In the 1872-1880 step, 8 “predictors” are selected for the reconstruction. In the earliest step (1579-1658), 3 predictors are available and used. Thejll and Schmith say that 28 such intervals occur. Groveman and Landsberg describe the procedure as follows:
A number of long time series showing the highest correlations with the NH temperature for the simultaneous periods of record 1881-1960 were selected…Selections from these series were combined in multiple linear regressions maximizing the explained variance. These regression equations were then used to reconstruct earlier hemispheric temperatures. For the first interval prior to 1881 (1872-1880) eight were chosen. The multiple correlation coefficient of this regression was r=0.882, corresponding to an explained variance of nearly 78%, each independent variable contributing a significant portion of the variance…
The choice of independent variables is not only dictated by their statistical significance but also by the length of record prior to 1881, to permit as much reconstruction backward as possible. This procedure resulted in other combinations of variables to reduce the error of estimate and enhance the correlation in various time segments of the reconstruction…With a shrinking data base from 8 independent variables (1872-1880) to three (1579-1658) the standard error of estimate rose from 0.110 to 0.162. The presently available data prior to 1579 would explain less than 50% of the variance and hence no further reconstruction was attempted.
Whether one puts any credence in the standard error of their estimates (I don’t), they did this 20 years before MBH98 put standard errors to their reconstruction. They concluded:
This reconstruction has demonstrated feasibility of estimating a mean hemispheric temperatures by using a statistical approach.
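The step-by-step calibration quoted above can be sketched roughly as follows. This is only an illustration under stated assumptions: the data are synthetic sinusoids, not the G-L series; the function names (`fit_ols`, `stepwise_reconstruction`) and the two-step layout are hypothetical; and the sketch skips G-L's screening for "significant" predictors. Each step refits the regression on the calibration period using only the series that exist in that step, then applies the fitted equation to the earlier years.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares fit of y on the columns of X plus an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef  # coef[0] is the intercept

def stepwise_reconstruction(target_cal, pred_cal, pred_pre, steps):
    """Refit on the calibration period for each step, then predict that step.

    steps is a list of (row_slice, column_indices): the pre-calibration rows
    covered by the step and the predictor columns available in it.
    """
    recon = np.full(pred_pre.shape[0], np.nan)
    for rows, cols in steps:
        coef = fit_ols(pred_cal[:, cols], target_cal)
        recon[rows] = coef[0] + pred_pre[rows][:, cols] @ coef[1:]
    return recon

# Synthetic stand-in data (assumption): 100 "years", 3 predictor series.
years = np.arange(100)
predictors = np.column_stack([np.sin(2 * np.pi * years / p) for p in (11, 23, 37)])
truth = 0.1 + predictors @ np.array([0.5, 0.3, 0.2])

pre, cal = slice(0, 20), slice(20, 100)   # reconstruction rows, calibration rows
pred_pre = predictors[pre].copy()
pred_pre[:10, 2] = np.nan                 # third series is missing in the earliest step

steps = [
    (slice(10, 20), [0, 1, 2]),           # later step: all three series available
    (slice(0, 10), [0, 1]),               # earliest step: only two series left
]
recon = stepwise_reconstruction(truth[cal], predictors[cal], pred_pre, steps)
truth_pre = truth[pre]

print("max error, later step:   ", np.abs(recon[10:] - truth_pre[10:]).max())
print("max error, earliest step:", np.abs(recon[:10] - truth_pre[:10]).max())
```

In this toy setup the later step reproduces the target essentially exactly, because the target was built from the three series, while the earliest step cannot, since one contributing series has dropped out of the network. Real data have noise in every step, so the calibration fit says much less about reconstruction skill than it does here.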
I’ve only had access to Groveman and Landsberg 1979, which reports coefficients from only one step (1872-1880). This is interesting because it illustrates an important point — the lack of regularity in the coefficients:
Table 1. Groveman and Landsberg Regression Coefficients 1872-1880 Step
|Finland tree rings||0.0003|
I don’t know what units the Finnish tree rings are in. But note that two instrumental series (Vienna, De Bilt) have negative coefficients. It’s obviously unreasonable in some sense that Vienna temperatures should be believed to have a negative correlation with NH temperature, while Innsbruck temperatures have a positive correlation. This shows that we’re simply getting spurious fits and shows why it’s not a good idea to do multiple inverse regressions. Please note the behavior of these coefficients as I’m going to discuss this in the context of VZ pseudoproxies.
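One mechanism that produces sign flips of this kind can be shown with a toy example. It is a sketch under assumptions, not an analysis of the G-L data: two hypothetical "stations" share a regional component, each is positively correlated with the target on its own, yet a joint regression puts a negative weight on one of them. The function name `pair_weights` and the sinusoidal stand-ins for the hemispheric signal and the shared regional weather are invented for illustration.

```python
import numpy as np

def pair_weights(share, n=80):
    """Regress a target on two near-duplicate station series.

    share is the fraction of the target signal present in station B
    (station A carries all of it). Both stations carry the same regional
    component, which has nothing to do with the target.
    """
    t = np.arange(n)
    signal = np.sin(2 * np.pi * 3 * t / n)     # stand-in for the hemispheric signal
    regional = np.sin(2 * np.pi * 7 * t / n)   # stand-in for shared regional weather
    station_a = signal + regional
    station_b = share * signal + regional
    A = np.column_stack([np.ones(n), station_a, station_b])
    coef, *_ = np.linalg.lstsq(A, signal, rcond=None)
    corr_a = np.corrcoef(station_a, signal)[0, 1]
    corr_b = np.corrcoef(station_b, signal)[0, 1]
    return coef, corr_a, corr_b

for share in (0.5, 0.9):
    coef, corr_a, corr_b = pair_weights(share)
    print(f"share={share}: corr A={corr_a:.2f}, corr B={corr_b:.2f}, "
          f"weights A={coef[1]:.1f}, B={coef[2]:.1f}")
```

The regression recovers the target from the difference between the two stations, so the weights come out as roughly +1/(1-share) and -1/(1-share): about +2 and -2 when share is 0.5, about +10 and -10 when share is 0.9. The more alike the two series are, the larger and more fragile the opposing weights become, which is why the sign of an individual coefficient in a regression on many similar series says little about that series' relationship to the target.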
It’s hard to figure out why Thejll and Schmith re-visited Groveman and Landsberg 1979, other than perhaps because it was non-controversial. They re-examined the GL regressions and observed that the residuals from the regression were autocorrelated. Their illustration of the GL reconstruction and their version using Cochrane-Orcutt methods are shown below.
From Thejll and Schmith, 2005.
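For readers unfamiliar with the Cochrane-Orcutt approach, here is a minimal sketch of the textbook iteration for a regression with AR(1) residuals. It is not Thejll and Schmith's code and makes no claim about their exact procedure; the function names and the simulated data are assumptions for illustration. The idea is to estimate the lag-one autocorrelation rho of the OLS residuals, quasi-difference both sides of the regression by rho, refit, and repeat until rho settles.

```python
import numpy as np

def ols(X, y):
    """Least-squares fit of y on the columns of X plus an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def cochrane_orcutt(X, y, n_iter=20, tol=1e-8):
    """Iterative Cochrane-Orcutt estimate for y = a + X b + u, u AR(1)."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float).reshape(len(y), -1)
    coef = ols(X, y)                       # start from plain OLS
    rho = 0.0
    for _ in range(n_iter):
        resid = y - (coef[0] + X @ coef[1:])
        new_rho = (resid[1:] @ resid[:-1]) / (resid[:-1] @ resid[:-1])
        # Quasi-difference both sides, which removes the AR(1) part of the errors.
        y_star = y[1:] - new_rho * y[:-1]
        X_star = X[1:] - new_rho * X[:-1]
        coef_star = ols(X_star, y_star)
        # The transformed intercept equals a * (1 - rho); undo that scaling.
        coef = np.concatenate([[coef_star[0] / (1 - new_rho)], coef_star[1:]])
        converged = abs(new_rho - rho) < tol
        rho = new_rho
        if converged:
            break
    return coef, rho

def simulate_ar1_regression(n=500, rho=0.7, seed=0):
    """Simulated data (assumption): y = 0.5 + 1.0 * x + AR(1) noise."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    e = rng.normal(scale=0.3, size=n)
    u = np.zeros(n)
    for i in range(1, n):
        u[i] = rho * u[i - 1] + e[i]
    return x, 0.5 + 1.0 * x + u

x, y = simulate_ar1_regression()
coef, rho = cochrane_orcutt(x, y)
print("estimated slope:", coef[1], "estimated rho:", rho)
```

With autocorrelated residuals the OLS coefficients are still unbiased in a setup like this, but the usual standard errors are too optimistic; the quasi-differenced fit is what gives more honest uncertainty estimates.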
In my opinion, the inconsistent OLS regression coefficients are conclusive evidence of overfitting. (MBH also has some negative weights for instrumental series!) I don’t know what the Thejll and Schmith coefficients are and don’t plan to do any more work on this data set. I’ll be posting on OLS methods as applied to a VZ pseudoproxy network, where inconsistent coefficients are also obtained. Keep the inconsistent G-L coefficients in mind. I’m surprised that this study has been used in solar correlation studies. To the extent that studies of forcing factors have relied on Groveman and Landsberg 1979, they need to be re-visited, and the series should no longer be quoted, as it has poor statistical properties.