Osborn and Briffa [2006], published today in Science, cannot be considered an “independent” validation of Hockey Stick climate theories, because it simply recycles 14 proxies, some of them very questionable, which have been repeatedly used in other “Hockey Team” studies, including, remarkably, 2 separate uses of the controversial bristlecone/foxtail tree ring data.
Even more remarkably, the authors have perpetuated the use of Mann’s erroneous principal components method in one of their key proxies.
Peer reviewers and editors at Science have failed to ensure that Osborn and Briffa complied with the journal’s data archiving policies, a frequent failing in paleoclimate reviews for Science: data for the study is not archived, nor is much of the source data.
Of the source data which is archived, some is password protected, presumably for international security. Within the available record, many peculiar inconsistencies can be observed affecting both this study and Esper et al [2002], a study previously published in Science, also with a non-existent data archive.
The 14 proxies used in O&B are not merely non-independent of prior studies; they are composed entirely of proxies repeatedly used in previous studies. Astonishingly, 2 of the 14 proxies (2 of only 10 in the Medieval Warm Period) are bristlecone/foxtail pines, despite the fact that these are precisely the proxies that have been most called into question in connection with the work of Mann et al.
Nor are Osborn and Briffa independent authors. Both are members of a group of scientists, self-identified as the Hockey Team. They have both recently co-authored a reconstruction with Mann, Bradley and Hughes [Rutherford et al, 2005], and their close associate, Philip Jones, has co-authored still other studies with Mann. Rutherford et al. [2005] is cited, but footnote (13) fails to disclose their co-authorship.
Briffa is lead author on millennial reconstructions for the IPCC 4th Assessment Report, presently under review. Von Storch (see http://sciencepolicy.colorado.edu/prometheus/archives/climate_change/000486hans_von_storch_on_b.html ) has queried the propriety and wisdom of IPCC using review authors who are engaged in controversy in the literature on a personal basis and end up reviewing their own work (as happened with Mann in IPCC TAR).
The IPCC has failed to ensure that the assessment reports, which shall review the existing published knowledge and knowledge claims, should have been prepared by scientists not significantly involved in the research themselves. Instead, the IPCC has chosen to invite scientists, who dominate the debate about the considered issues, to participate in the assessment. This was already in the Second Assessment Report a contested problem, and the IPCC would have done better in inviting other, considerably more independent scientists for this task. Instead, the IPCC has asked scientists like Professor Mann to review his own work. This does not represent an “independent” review.
Here we have another instance — this time with Briffa. The IPCC practice seems particularly unwise in this case, since the offering from Osborn and Briffa is weakly argued.
The O&B article vividly illustrates the weakness both of peer reviewing and editorial decision-making in the paleoclimate area at Science, the prominent journal presently reeling from the Hwang stem cell scandal. It highlights a failure to implement their own policies on data archiving and failures to verify claims in the article itself.
Data Archiving and Versions
On paper, Science has exemplary data archiving policies (see http://www.sciencemag.org/feature/contribinfo/prep/gen_info.dtl#datadep), which seem to require paleoclimate authors to provide an archive sufficient to replicate their results:
Science supports the efforts of databases that aggregate published data for the use of the scientific community. Therefore, before publication, large data sets … must be deposited in an approved database and an accession number provided for inclusion in the published paper.
In multiproxy paleoclimate studies, it is essential to archive the data as used even if the data appears to be in the public domain (since versions can vary), but this is not done here. This failure is exacerbated because O&B rely on 5 series from a study previously published in Science [Esper et al, 2002], where Science also failed to require data archiving, and on 1 proxy [Yang et al., 2002] which relies on 2 ice cores by L. Thompson also published in Science (Dunde, Guliya), on which no information was archived prior to my requests to another journal (Climatic Change). I have been trying for a considerable period of time to get Science to require these authors to archive the data from the earlier studies, but these efforts have so far proved unsuccessful. The difficulties of trying to track down “grey” versions are vividly illustrated just with reference to Science publications.
Yang et al. [2002] relied on a “grey” 50-year smoothed version of the Dunde and Guliya series. Unfortunately, these versions are dramatically inconsistent with the 10-year smoothed versions archived at my request last year (see http://www.climateaudit.org/?p=327). In order to reconcile the versions, one needs to examine sample information, which Science has thus far been unable to obtain. Because of various problems with the Yang et al [2002] composite, Jones and Mann [2004] decided not to use it. It is rather a surprise, to say the least, to see it re-surface here in O&B.
The versions of the Esper series are also impossible to sort out. Esper et al [2002] did not provide an ITRDB identification for the Tyrol series; the identification number provided by O&B has data extending only to 1827, which is inconsistent with Figure 2 and which would make validation of this series impossible. For the Quebec tree ring series (series 4), O&B cite Schweingruber (cana169) as a primary source for Esper et al [2002], while Esper et al [2002] acknowledge Payette and Ilion for data (who studied other sites). O&B say that the Quebec series ends in 1947, while the data in cana169 goes to 1989. There are many other similar problems.
Much of the underlying tree ring width measurement data is unarchived, affecting the following series: Yamal, Tornetrask, Taimyr, Icefields, Boreal, Upper Wright. Some of this information has been generated by the European Union “SOAP” project, financial support from which is acknowledged in the article. Osborn and Briffa head up this project. Civilians in the climate wars will undoubtedly be astonished to think that password security would be applied to tree ring data. However, this is the case (see http://www.cru.uea.ac.uk/cru/projects/soap/ ). Briffa has refused my requests for access to the password protected tree ring data.
Proxy Quality Control
O&B assert that they carried out quality control on the proxy records to ensure that each proxy was correlated to gridcell temperature. They singled out Soon and Baliunas [2003] for allegedly failing to carry out such quality control procedures, although the same criticism should equally be brought against Mann et al, who not only used precipitation series as temperature proxies, but even used French precipitation series for American gridcells (“The rain in Maine falls mainly in the Seine”). One can only imagine the vituperation that would have issued had Soon and Baliunas committed a similar blunder, and one likewise wonders how the supposed quality control procedures in place for Mann et al. failed to identify such an obvious blunder.
In McIntyre and McKitrick [2005a, 2005b], Ross McKitrick and I pointed out the extraordinary dependence of the MBH98 (and MBH99) reconstruction on bristlecones/foxtails and pointed out many reasons why world temperature history should not be based on their ring widths. It is astonishing, therefore, to see the bristlecone/foxtails dominating not just one, but two proxies in O&B. Since 4 proxies do not extend back to the MWP, they make up 2 of 10 in the MWP.
It is beyond astonishing that O&B series 1 uses the discredited MBH principal components methodology [see McIntyre and McKitrick, 2005a, 2005b and endorsements of this aspect of our criticism in von Storch and Zorita, 2005 and Huybers 2005]. Jones and Mann [2004] used the MBH98 principal components methodology, together with a curious and undocumented splice.
Although O&B reported that they carried out quality control on the Esper sites (resulting in the rejection of 4 sites for not having correlation to temperature: Mackenzie, Gotland, Jaemtland and Zhaschiviersk), they do not report similar quality control being carried out on the PC1 from Mann and Jones [2003]/Jones and Mann [2004], and simply repeat the claim of a decadal temperature correlation of 0.52. Jones and Mann [2004] also claimed an annual correlation of 0.20 for this proxy.
This claim is obviously at odds with LaMarche et al. [1984] and Graybill and Idso [1993], who stated that the post-1900 pulse in bristlecone growth was uncorrelated to temperature. In this case, the quality control for the 6 sites can be readily checked as the 6 chronologies are all publicly archived, as is the HadCRU2 temperature dataset. (Collations are provided at http://www.climateaudit.org/data/osborn06/). Our check yielded very different results.
For the period 1870-1980, only one of the six sites had even a slight positive correlation to gridcell temperature on an annual comparison (Sheep Mountain: 0.03), while the other sites all had negative correlations ranging from –0.14 to –0.36. The correlation of the MBH98-type PC1 to a weighted average of gridcell temperatures (weighted identically as the PC1) was –0.12. The undocumented “adjustment” of the PC1 in Jones and Mann [2004] increased the correlation somewhat, but only to 0.03. These results are obviously at odds with the quality control claims.
The results, decadally smoothed, were little better: only one positive relationship (Sheep Mountain: 0.06) while the others had negative correlations ranging from –0.11 to –0.32. For the MBH98-type PC1 against the weighted gridcell average similarly smoothed, the correlation was –0.35 (“adjusted”: –0.26).
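The form of this check can be sketched in a few lines of Python. The sketch is illustrative only: the two series are synthetic stand-ins rather than the archived chronologies or the HadCRU2 gridcell data, the helper names pearson and gaussian_smooth are hypothetical, and the sigma of the Gaussian window is an assumption, since only its 13-year width is stated here.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def gaussian_smooth(series, width=13, sigma=3.0):
    """Smooth with a normalized Gaussian window of odd width.

    Only points fully covered by the window are returned, so the
    output is (width - 1) points shorter than the input.
    """
    half = width // 2
    offsets = range(-half, half + 1)
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in offsets]
    total = sum(weights)
    weights = [w / total for w in weights]
    return [
        sum(w * series[i + k] for w, k in zip(weights, offsets))
        for i in range(half, len(series) - half)
    ]

# Synthetic stand-ins for the 1870-1980 period (assumption: NOT real data).
rng = random.Random(0)
years = list(range(1870, 1981))
temperature = [0.005 * (y - 1870) + rng.gauss(0, 0.3) for y in years]
chronology = [rng.gauss(0, 1.0) for _ in years]

# Annual comparison, then the same comparison after smoothing both series.
r_annual = pearson(chronology, temperature)
r_smoothed = pearson(gaussian_smooth(chronology), gaussian_smooth(temperature))
print(f"annual r = {r_annual:.2f}, smoothed r = {r_smoothed:.2f}")
```

With the archived chronologies and gridcell temperatures substituted for the synthetic lists, the same two calls would give the annual and smoothed correlations for a site.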
So that there is no misunderstanding of the mismatch, the “fixed” PC1 and the gridcell temperatures are illustrated in Figure 1 below.
Figure 1. Red: Mann and Jones “fixed” PC1 (O&B series 1); black — weighted average of gridcell temperatures. Top — annual data; bottom — decadally smoothed with 13-year Gaussian filter.
It is difficult to contemplate how these particular sites could have survived the quality control procedures supposedly employed by O&B.
Figure 2 illustrates another ironic aspect of O&B proxy #1. The top panel shows the archived PC1 from Jones and Mann [2004] (used by O&B), while the next two panels show the PC1s as calculated using the two possible options of the covariance matrix and the correlation matrix. For networks already expressed in common units, standard references [Rencher 2002; Overland and Preisendorfer 1982] recommend the covariance matrix method, but Huybers [2005] argued for the correlation matrix. Under both options, the influence of Sheep Mountain is reduced and neither PC1 has a strong hockey stick shape. Early 20th century levels are as elevated as later ones.
Figure 2. PC1s for the AD200 North American network. Top: Jones and Mann [2004]; middle: covariance matrix; bottom: correlation matrix.
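For readers unfamiliar with the two options, the difference between a covariance-matrix PC1 and a correlation-matrix PC1 can be sketched as follows. This is a toy illustration under stated assumptions, not a reproduction of Figure 2: the six series are synthetic, the function names are hypothetical, and power iteration is used only to keep the example free of external libraries. The covariance option centers each series; the correlation option also divides each series by its standard deviation, so a high-variance series carries less weight.

```python
import math
import random

def center(col):
    """Subtract the full-period mean (conventional centering)."""
    m = sum(col) / len(col)
    return [v - m for v in col]

def standardize(col):
    """Center, then divide by the sample standard deviation."""
    c = center(col)
    sd = math.sqrt(sum(v * v for v in c) / (len(c) - 1))
    return [v / sd for v in c]

def first_pc_loadings(columns, iterations=500):
    """Leading eigenvector of the cross-product matrix of the columns.

    Centered columns give the covariance-matrix PC1; standardized
    columns give the correlation-matrix PC1. Found by power iteration.
    """
    p, n = len(columns), len(columns[0])
    mat = [[sum(columns[i][t] * columns[j][t] for t in range(n)) / (n - 1)
            for j in range(p)] for i in range(p)]
    vec = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        nxt = [sum(mat[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    return vec

# Synthetic network (assumption): series 0 is high-variance noise with no
# shared signal; series 1-5 share a common signal plus independent noise.
rng = random.Random(1)
n = 200
common = [rng.gauss(0, 1) for _ in range(n)]
series = [[5.0 * rng.gauss(0, 1) for _ in range(n)]]
series += [[c + rng.gauss(0, 1) for c in common] for _ in range(5)]

cov_pc1 = first_pc_loadings([center(s) for s in series])
cor_pc1 = first_pc_loadings([standardize(s) for s in series])
print("covariance PC1 loadings: ", [round(v, 2) for v in cov_pc1])
print("correlation PC1 loadings:", [round(v, 2) for v in cor_pc1])
```

In this synthetic case the covariance PC1 loads almost entirely on the high-variance series, while the correlation PC1 loads on the five series sharing a common signal. Which behaviour is appropriate depends on whether the series are already in common units, which is the point at issue between the references cited above.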
However, the most fundamental problem with studies of this type is their failure to define proxy selection procedures on an ex ante basis. If white spruce or larch ring widths for treeline sites are believed to be a valid temperature proxy, then these chronologies should be collected and collated and reported. Authors should not ex post check on their correlation to local temperature and cherry pick sites with a “favourable” response, while failing to report or de-selecting sites with a seemingly “unfavourable” response. Why are the 4 rejected Esper sites no good?
The reason for not relying on correlation-based tests for proxy selection is that it easily fools the eye. The stock market is notoriously hard to forecast. Stock-pickers long dreamed of finding reliable ways of forecasting the stock market by finding data series that are easy to forecast, but which reliably correlate to the stock market, therefore yielding a reliable forecasting tool. But whenever they select series based on correlation statistics, they turn out to have lousy out-of-sample forecast properties for the stock market. Econometricians realize that autocorrelated series frequently yield what are called “spurious regression” results. The same problem bedevils paleoclimatology. There are hundreds, if not thousands, of tree ring sites. Briffa  has reported that the vast majority of these sites have failed to record increases after 1960, contrary to the hypothesis that there is a linear relationship between temperature and ring width.
In the O&B study, which builds on prior studies with similar selection procedures, a few sites are selected from this large population in which ring widths and density (mostly) decline after 1960, and nearly all of the selected sites have strong (and in some cases very strong) post-1960 growth: the bristlecones, Yamal, Sol Dav (Mongolia). The selections are hardly random. In determinations of statistical significance, the selection procedure needs to be modeled, an effect familiar to econometricians [Ferson et al 2003].
While O&B purport to test for statistical significance using a Monte Carlo analysis, they fail to model the selection process. Their failure to do so makes their estimation of statistical significance totally worthless.
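The effect of leaving selection out of a significance test can be illustrated with a toy Monte Carlo experiment. This is a sketch under stated assumptions, not a reproduction of the O&B test: the candidate "proxies" are AR(1) red noise containing no temperature signal, the target is a synthetic trend, and the function names, threshold and parameter values are all hypothetical.

```python
import math
import random
import statistics

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ar1(n, phi, rng):
    """Red noise x[t] = phi * x[t-1] + e[t]; it contains no signal."""
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0, 1)
        out.append(x)
    return out

def composite_r(rng, target, threshold=None, n_candidates=60, phi=0.7):
    """Correlation between the target and an average of noise series.

    If threshold is given, only candidates whose correlation with the
    target exceeds it are averaged (the ex post screening step).
    """
    n = len(target)
    pool = [ar1(n, phi, rng) for _ in range(n_candidates)]
    if threshold is not None:
        kept = [s for s in pool if pearson(s, target) > threshold]
        pool = kept if kept else pool  # fall back if nothing passes
    composite = [sum(s[t] for s in pool) / len(pool) for t in range(n)]
    return pearson(composite, target)

rng = random.Random(42)
target = [0.02 * t + rng.gauss(0, 0.2) for t in range(50)]  # synthetic trend
threshold = 0.2
reps = 200

# Null distribution ignoring selection vs. null that repeats the screening.
unscreened = sorted(composite_r(rng, target) for _ in range(reps))
screened = sorted(composite_r(rng, target, threshold) for _ in range(reps))

p95 = int(0.95 * reps)
print("median r, no screening:  ", round(statistics.median(unscreened), 2))
print("median r, with screening:", round(statistics.median(screened), 2))
print("95th percentile, no screening:  ", round(unscreened[p95], 2))
print("95th percentile, with screening:", round(screened[p95], 2))
```

Every screened composite correlates with the target even though no series contains any signal. A benchmark drawn from the unscreened null therefore overstates the significance of a composite that was built by screening; the benchmark has to repeat the screening step.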
There are many other questions as to the validity of the O&B proxies. Their foxtail series (#3), one of the key contributors to their exceedance statistics, is attributed to Lloyd and Graumlich, who state, contrary to O&B, that:
[the period] 950 to 550 years BP [...was] a period of warm temperatures (relative to the present) in which at least two severe, multidecadal droughts occurred
Lloyd and Graumlich discuss declines in treeline during the past millennium. Evidence of this decline can be seen in this recent photo of a subfossil medieval tree in alpine tundra well above the present treeline. Whatever the cause of the changes in treeline may be, O&B have presented no evidence that the foxtail ring width chronology (which cannot be identified in the citation anyway) is a valid proxy for long-term temperature changes.
Figure 3 (Original Caption): A dead trunk above current treeline from a foxtail pine that lived about 1000 years ago near Bighorn Plateau in Sequoia National Park
Similar questions arise in respect to their Siberian proxies. O&B have selected the Yamal series, which has strong 20th century growth contributing markedly to their exceedance statistics. In contrast to the selective picking of sites employed in O&B, Naurzbaev et al. [2004] (which included MBH co-author Hughes) carried out a comprehensive study of 34 larch sites along a latitudinal transect and 22 larch sites along an altitudinal transect. They concluded:
Trees that lived at the upper (elevational) tree limit during the so-called Medieval Warm Epoch (from A.D. 900 to 1200) show annual and summer temperature warmer by 1.5 and 2.3 deg C, respectively, approximately one standard deviation of modern temperature. Note that these trees grew 150–200 m higher (1–1.2 deg C cooler) than those at low elevation but the same latitude, implying that this may be an underestimate of the actual temperature difference.
That such different conclusions can seemingly arise from consideration of Siberian tree ring chronologies points to the need to avoid ex post selection criteria and the need to develop and consistently apply ex ante selection criteria.
Undoubtedly, more issues will arise with further study and if and when the various data sets are archived. But some points on peer review are obvious.
First, it is clear that Science peer reviewers simply do not check any of the calculations in an article. Errors can arise in many ways without any fraud being involved and checking is always worthwhile. If Science peer reviewers are not going to check calculations themselves, then they (or Science editors) should ensure that authors comply with Science’s archiving obligations. Better yet, Science should adopt procedures that are best practices at economics journals, requiring authors to archive code as a pre-condition to publication.
In a recent publication in which both Briffa and Mann were coauthors (Rutherford et al. 2005), the authors have, to their credit, archived code for their calculations. Unfortunately for them, the code shows that they incorrectly collated the instrumental records into their calculations using MBH98 proxies, a simple error which renders all their subsequent calculations in respect to MBH98 proxies incorrect (see http://www.climateaudit.org/?p=519 ). However, such incidents should not discourage authors from archiving their code. (NB: the code for this review will be archived; see http://www.climateaudit.org/scripts/osborn06.replication.txt ).
In the course of recently acting as a reviewer for the IPCC 4th Assessment Report, I asked the IPCC secretariat and the authors of certain proxy studies for data. The IPCC refused to provide this data to me on the grounds that the function of a reviewer was simply to ensure that IPCC properly recorded the findings as published in journals and not to carry out independent quality control or checking of papers relied upon by the IPCC, which they said was the province of the journals. Given that the journals are not discharging this function, the IPCC policy of denying data is simply breathtaking in its insouciance and a recipe for another fiasco like Mann et al.
Esper, Jan, Edward R. Cook, Fritz H. Schweingruber, 2002, Low-Frequency Signals in Long Tree-Ring Chronologies for Reconstructing Past Temperature Variability, Science 295, 2250–2253.
Ferson, W., S. Sarkissian and T. Simin, 2003. Spurious regressions in financial economics, Journal of Finance, 58(4), 1393-1413.
Graybill, D.A., and S.B. Idso. 1993. Detecting the aerial fertilization effect of atmospheric CO2 enrichment in tree-ring chronologies. Global Biogeochemical Cycles 7:81-95.
Grissino-Mayer, Henri D., 1996. A 2129 year annual reconstruction of precipitation for northwestern New Mexico, USA. In Dean, J.S., Meko, D.M., and Swetnam, T.W., eds., Tree Rings, Environment, and Humanity. Radiocarbon 1996, The University of Arizona, Tucson: 191-204.
Huybers, P. (2005), Comment on “Hockey sticks, principal components and spurious significance” by McIntyre and McKitrick, GRL, 32, L20705, doi:10.1029/2005GL023395.
Jones, P. D., and M. E. Mann (2004), Climate over past millennia, Rev. Geophys., 42, RG2002, doi:10.1029/2003RG000143.
LaMarche, V.C., D.A. Graybill, H.C. Fritts, and M.R. Rose, 1984. Increasing atmospheric carbon dioxide: tree ring evidence for growth enhancement in natural vegetation. Science 225:1019-1021.
Lloyd, A and L. Graumlich, Ecology 78, 1199.
Mann, M.E., Jones, P.D., Global surface temperature over the past two millennia, Geophysical Research Letters, 30 (15), 1820, doi: 10.1029/2003GL017814, 2003.
Mann, M.E., R.S. Bradley and M.K. Hughes (1998), Global-scale temperature patterns and climate forcing over the past six centuries, Nature, 392, 779-787.
McIntyre, S. and R. McKitrick (2005b), The M&M Critique of the MBH98 Northern Hemisphere Climate Index: Update and Implications, Energy and Environment, 16, 69-99.
McIntyre, S. and R. McKitrick, (2005a), Hockey Sticks, Principal Components and Spurious Significance, GRL, 32, L03710, doi:10.1029/2004GL021750.
McIntyre, S., and R. McKitrick (2005c), Reply to comment by Huybers, GRL, 32, L20713, doi:10.1029/2005GL023586.
Naurzbaev, Mukhtar M., Malcolm K. Hughes, Eugene A. Vaganov, 2004. Tree-ring growth curves as sources of climatic information, Quaternary Research 62, 126–133.
Osborn, Timothy J. and Keith R. Briffa, 2006, The Spatial Extent of 20th-Century Warmth in the Context of the Past 1200 Years, Science 311, 831-834.
Rutherford, S., Mann, M.E., Osborn, T.J., Bradley, R.S., Briffa, K.R., Hughes, M.K., Jones, P.D., Proxy-based Northern Hemisphere Surface Temperature Reconstructions: Sensitivity to Methodology, Predictor Network, Target Season and Target Domain, Journal of Climate, 18, 2308-2329, 2005.
Soon, W. and Baliunas, S., 2003: Proxy climatic and environmental changes of the past 1000 years. Climate Research, 23, 89-110.
von Storch, H., and E. Zorita (2005), Comment on “Hockey sticks, principal components, and spurious significance” by S. McIntyre and R. McKitrick, Geophys. Res. Lett., 32, L20701, doi:10.1029/2005GL022753.
Yang Bao, Achim Braeuning, Kathleen R. Johnson and Yafeng Shi, 2002, General characteristics of temperature variation in China during the last two millennia. GRL 10.1029/2001GL014485.