I was one of the more industrious reviewers for IPCC AR4. In my Review Comments, I made frequent reference to Yamal versus the Polar Urals Update, expressing concern about the rationale for using Yamal rather than Polar Urals, an issue that is once again in play. Keith Briffa was the section author and can be deemed to be the author of the responses.
As context, Review Comments for the 2nd Draft were submitted around April 2006. In late 2005, I became aware that there was additional information for the medieval period for the Polar Urals site used in MBH and Jones et al 1998 (see here (Aug 2005) for my first mention of the Polar Urals update). In Feb 2006, D’Arrigo (Wilson) et al (2006) and Osborn and Briffa (2006) were published. Both studies used Yamal and their use of Yamal was very much on my radar. See here for my first mention of the “Yamal Substitution” (plus other Feb 2006 archived posts). While Osborn and Briffa 2006 attracted more attention, D’Arrigo et al was the more professional report. But it handled the Polar Urals-Yamal dichotomy in a very strange way – something that I’ll return to in another post.
I made a diligent effort at the time to get Science to require Briffa to disgorge his Yamal measurement data, but they refused. They argued that Osborn and Briffa 2006 did not use the Yamal measurement data, but only the chronology, and that I should contact the “original authors” for the measurement data. The source of the chronology was, of course, Briffa 2000. I wrote Tim Osborn and asked him for the data and he said that he didn’t have it. So I wrote Keith Briffa and he stonewalled me. I wrote back to Science rather crossly about the nonsense. Concurrently, Science was working on getting Esper’s data for Polar Urals and, after about 25 emails, it drifted in around April 2006 when the AR4 Review Comments were due.
I was also well aware that merely using the Polar Urals update rather than the Yamal series affected the Briffa 2000 composite – a point illustrated here in early June 2006, a post including the following illustration of the impact. Because the medieval networks of other studies overlap almost entirely with the Briffa 2000 network, similar results hold for (say) D’Arrigo et al 2006.
Figure 1. Briffa 2000 reconstruction together with sensitivity showing same data with Polar Urals Update. Black – Briffa 2000 reconstruction; red – using Polar Urals Update instead of Yamal. See http://www.climateaudit.org/?p=693
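The kind of sensitivity test illustrated in the figure can be sketched in a few lines. This is only a minimal illustration, not the procedure used in Briffa 2000 or in the linked post: it assumes an unweighted average of standardized series, and every series here is SYNTHETIC (random noise plus an imposed step), with the names `shared_sites`, `series_a` and `series_b` made up for the example. The point is only that swapping one series in a small network can move the medieval-modern difference of the composite.

```python
import numpy as np

def standardize(x):
    """Scale a series to zero mean and unit standard deviation."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

def composite(series_list):
    """Unweighted mean of standardized series (a simplifying assumption)."""
    return np.mean([standardize(s) for s in series_list], axis=0)

def medieval_minus_modern(recon, years, medieval=(1000, 1099), modern=(1900, 1990)):
    """Mean of the composite over a medieval window minus its mean over a modern window."""
    med = recon[(years >= medieval[0]) & (years <= medieval[1])].mean()
    mod = recon[(years >= modern[0]) & (years <= modern[1])].mean()
    return med - mod

# SYNTHETIC data only: none of these are real tree-ring chronologies.
years = np.arange(1000, 1991)
rng = np.random.default_rng(0)
shared_sites = [rng.normal(size=years.size) for _ in range(5)]   # sites common to both variants
series_a = rng.normal(size=years.size) + 2.0 * (years >= 1900)  # imposed modern uptick
series_b = rng.normal(size=years.size) + 2.0 * (years <= 1099)  # imposed medieval elevation

# Same network, differing only in the one substituted series.
gap_with_a = medieval_minus_modern(composite(shared_sites + [series_a]), years)
gap_with_b = medieval_minus_modern(composite(shared_sites + [series_b]), years)
print(f"medieval minus modern, variant A: {gap_with_a:.2f}")
print(f"medieval minus modern, variant B: {gap_with_b:.2f}")
```

With only six series in the network, the one substituted series carries a sixth of the weight, which is why the choice matters so much for a medieval network of this size.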
Some of this background is familiar to readers, some of it I’ll return to in another post – today I’m reviewing it because it is context for the April 2006 Review Comments.
AR4 Review Comments
Chapter 6 Second Draft Review Comments are online here.
One of the issues that we’ve discussed here on many occasions is the lack of independence between the various multiproxy studies, both in terms of authorship and in terms of proxy selection. The latter lack of independence means that, if there should be some disqualifying issue with proxies used in multiple studies e.g. strip bark bristlecones and/or Yamal, numerous studies will be jointly affected. This point was endorsed by Wegman. In the 1st Draft, there was a coy mention of the non-independence – that the spaghetti graphs were “not entirely independent”:
As with the original TAR series, these new records are not entirely independent reconstructions inasmuch as there are some predictors (most often tree-ring data) that are common between them, but, in general, they represent some expansion in the length and geographical coverage of the previously available data. (1st Draft)
This seemed far too coy to me. In the 1st Draft Comment, I stated (response in italics):
6-1351 A 29:20 29:22 The non-independence should be discussed. This includes non-independence of authors and more detailed discussion of non-independence on proxy series. [Stephen McIntyre]
Rejected – the data series are discussed and point on authors not valid.
I can’t find any place where the “data series are discussed”, and the point on authors is “valid”, as there is substantial overlap in author groups (Jones, Bradley, Mann, Briffa, …). The sentence was unchanged in the 2nd Draft:
As with the original TAR series, these new records are not entirely independent reconstructions inasmuch as there are some predictors (most often tree-ring data) that are common between them,…
This time I expanded on my remark on the 1st Draft, observing that it was misleading to use the term “not entirely independent”, when the overlap was actually massive. I drew particular attention to the difference between the Polar Urals and Yamal series.
6-1167 B 30:1 30:1 You allude to the fact that these reconstructions are “not entirely independent inasmuch as there are some predictors that are common”. This is a very misleading description. For the medieval period, there is massive overlap in all the cited studies. The six series of Briffa (2000) together with bristlecones/foxtails are used in only slightly varying combinations in all of the cited studies. If there are problems with only a few canonical series (as arguably has already been demonstrated with the birstlecones/foxtails) then the entire corpus of studies may fall. Problems can be observed elsewhere e.g. the Yamal series and the Polar Urals Update have very different properties with the Yamal series being a big contributor to HS-ness while the Polar URals series has a strong MWP. The Polar Urals Update correlates better to gridcell temperature than the Yamal series and one cannot help but suspect that the decision to use the Yamal series in all studies except Esper has been done with one eye on the MWP-modern relationship.
[Stephen McIntyre (Reviewer's comment ID #: 309-62)]
The Author Response stated that they “accepted” the comment, while denying that there was anything untoward about the repetitive use of Yamal rather than Polar Urals:
Accepted – text revised to stress overlap in early centuries of the last millennium. However, please note that the reviewer’s “suspicion” is unfounded.
In fact, they didn’t “accept” the comment at all. The coy language remained unchanged; they merely qualified it as particularly applying to the “early centuries” – a change that was totally unresponsive to the concern that overlap was massive and that it was not fairly described by the AR4 language:
As with the original TAR series, these new records are not entirely independent reconstructions inasmuch as there are some predictors (most often tree ring data and particularly in the early centuries) that are common between them,
I repeated my concern about the impact of Polar Urals versus Yamal in my comment about the AR4 characterization of Briffa (2000). The text read:
Briffa (2000) produced an extended history of interannual tree-ring growth incorporating records from sites across northern Fennoscandia and northern Siberia, using a statistical technique to construct the tree-ring chronologies that is capable of preserving multi-centennial timescale variability. Although ostensibly representative of northern Eurasian summer conditions, these data were later scaled using simple linear regression against a mean Northern Hemisphere land series to provide estimates of summer temperature over the past 2000 years (Briffa et al., 2004).
Once again, I observed that the Briffa 2000 reconstruction was not robust to the use of Polar Urals as opposed to Yamal and asked that this be disclosed. This was rejected.
6-1172 B 30:7 30:7 Briffa (2000) used seven sites which recur repeatedly in the other studies. Briffa substituted a ring width series from Yamal for the updated Polar Urals series (later used in Esper et al 2002). If the Polar Urals Update from Esper is used in Briffa instead of Yamal, then the MWP in the reconstruction is higher than shown. The Polar Urals Update has a better correlation to gridcell temperature than the Yamal series (I have so far been unable to confirm the correlation to gridcell temperature of this series reported in Osborn and Briffa 2006 and suspect that it is wrong.) You need to disclose that this result is sensitive to the choice between using Yamal or the Polar Urals update. [Stephen McIntyre (Reviewer's comment ID #: 309-67)]
Rejected – these are speculative remarks by the reviewer who incorrectly assumes that there has been a biased ‘selection’ of data and processing by Briffa versus Esper.
I raised a similar concern in connection with Esper et al (2002):
6-1173 B 30:14 30:14 The medieval network of Esper et al 2002 is closely related to that of Briffa 2000. Esper took tree-ring data from 14 sites, but only 7 extended to the medieval period. These 7 sites included 5 of 7 sites from Briffa (2000), plus 2 foxtail sites in California. Foxtails interbreed with bristlecones and may be subject to the same problems as the controversial bristlecone sites of Mann et al 1999. There is no legitimate basis for using TWO nearby foxtail sites, and probably not even one. Their relative MWP-modern level in their reconstruction does not appear to be robust to the presence/absence of these two foxtail sites.
[Stephen McIntyre (Reviewer's comment ID #: 309-68)]
Rejected – these remarks are speculative and the current text reviews published papers and is not intended to ‘second guess’ their content.
I raised the Yamal substitution again in connection with Mann and Jones (2003). It was again rejected.
6-1179 B 30:27 30:27 Mann and Jones use a 3-series average of Tornetrask, Yamal and Taimyr, which is not robust to the Yamal substitution. [Stephen McIntyre (Reviewer's comment ID #: 309-74)]
Noted – no revision to text necessary.
I note that the next comment following mine was accepted:
6-759 A 30:28 30:28 Missing full stop at end of line
[James Crampton (Reviewer's comment ID #: 50-27)]
I raised the same point in connection with D’Arrigo et al 2006:
6-1184 B 30:41 30:41 D’Arrigo et al in their medieval portion use almost exactly the same network as Briffa 2000 and the other studies: Tornetrask, Yamal, Taimyr, Jasper, Mongolia (plus in their case Coastal Alaska – only one “new” series”) [Stephen McIntyre (Reviewer's comment ID #: 309-78)]
Noted – no revision to text necessary.
6-1196 B 30:48 30:48 D’Arrigo et al (2006) used only 6 sites in the medieval period, of which all but one overlap the sites of Briffa (2000) used in the other studies. They use the Yamal substitution and their conclusions of relative modern-medieval warmth may not be robust to that. [Stephen McIntyre (Reviewer's comment ID #: 309-91)]
Noted – no revision to text necessary.
Later on, AR4 contains a summary of uncertainties:
Changes in proxy records, either physical (such as the isotopic composition of various elements in ice) or biological (such as the width of a tree ring or the chemical composition of a growth band in coral), do not respond precisely or solely to changes in any specific climate parameter (such as mean temperature or total rainfall), or to the changes in that parameter as measured over a specific “season” (such as June-August or January-December). For this reason, the proxies must be ‘calibrated’ empirically, by comparing their measured variability over a number of years with available instrumental records to identify some optimal climate association, and to quantify the statistical uncertainty associated with scaling proxies to represent this specific climate parameter. All reconstructions, therefore, involve a degree of compromise with regard to the specific choice of ‘target’ or dependent variable. Differences between the temperature reconstructions shown in Figure 6.10b are to some extent related to this, as well as to the choice of different predictor series (including differences in the way these have been processed). The use of different statistical scaling approaches (including whether the data are smoothed prior to scaling, and differences in the period over which this scaling is carried out) also influences the apparent spread between the various reconstructions
This summary failed to mention what I felt were among the most relevant uncertainties in the proxy reconstruction project – the fact that similar proxies in the same region could give very different results – a fact that I thought should be clearly disclosed. They refused:
6-1205 B 31:41 31:41 You need to state clearly that proxy series from nearby sites may give very different results e.g. Yamal and the Polar Urals update. [Stephen McIntyre (Reviewer's comment ID #: 309-100)]
Rejected – this would imply a greater instability than current evidence supports.
And I also stated that they should clearly disclose the lack of robustness. Again they refused.
6-1202 B 31:28 31:28 You should add that these reconstructions are all based on a few selected proxies and that results would be different if other plausible selections were made, such as the updated Polar Urals series being used instead of Yamal or if bristlecones are not used. [Stephen McIntyre (Reviewer's comment ID #: 309-97)]
Rejected – these are the currently available reconstructions – the reviewer’s remark is a moot point.
I also raised the overall problems with proxies, a comment that was not rejected in print, although nothing was done about it either.
6-1113 B 0:0 0:0 You should have a clear description of the potential problems with millennial proxy reconstructions: tree rings are well dated but may not be accurate thermometers; reconstructions from nearby sites may differ dramatically and overall results may be undul[y affected.][Stephen McIntyre (Reviewer's comment ID #: 309-10)]
Noted, issue dealt with with respect to specific comments on this section and the methods chapter
Finally, there was a considerable discussion about Box 6.4 Figure 1, which was a spaghetti graph of proxies (taken from Osborn and Briffa data.) The original 2nd Draft version is shown below. This graphic is exceptionally hard to read even by spaghetti graph standards. I’ve done a plot underneath showing the spaghetti strands with the biggest blades.
Figure 2. IPCC AR4 2nd Draft Box 6.4 Figure 1 Original Caption: (a) The heterogeneous nature of climate during the MWP is illustrated by the wide spread of values exhibited by the individual records that have been used to reconstruct NH-mean temperature. Individual, or small regional averages of, proxy records used in various studies (see Osborn and Briffa, 2006), (collated from those used by Mann and Jones (2003), Esper et al. (2002) and Luckman and Wilson (2005) but excluding shorter series or those with an ambiguous relationship to local temperature). These records have not been calibrated (though all show positive correlations with local temperature observations), but have been smoothed with a 20-year filter and scaled to have zero mean and unit standard deviation over the period 800–1995.
The biggest blades in this graphic are Yamal, Mann’s PC1, foxtails, a non-full-length Van Engeln instrumental/documentary series and Yang’s China composite (which is driven by Lonnie Thompson).
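The processing described in the IPCC caption (scaling to zero mean and unit standard deviation over 800–1995, then smoothing with a 20-year filter) can be sketched as follows. The caption does not say which filter was used, so the plain moving average here is an assumption, as are the reflected edge padding and the function names `scale_over_period` and `smooth_moving_average`. The input series in the usage lines is a made-up trend, not proxy data.

```python
import numpy as np

def scale_over_period(values, years, start=800, end=1995):
    """Scale a series to zero mean and unit SD over the reference period [start, end]."""
    values = np.asarray(values, dtype=float)
    ref = values[(years >= start) & (years <= end)]
    return (values - ref.mean()) / ref.std()

def smooth_moving_average(values, width=20):
    """Approximately centered moving average of `width` years.

    Assumption: the filter type is not given in the caption; a simple
    moving average with reflected padding at the ends is used here.
    """
    values = np.asarray(values, dtype=float)
    kernel = np.ones(width) / width
    padded = np.pad(values, width, mode="reflect")
    # Convolve the padded series, then trim the padding so the length is unchanged.
    return np.convolve(padded, kernel, mode="same")[width:-width]

# Usage with a made-up series (a slow trend plus a cycle), not real proxy data.
years = np.arange(800, 1996)
raw = 0.001 * (years - 800) + np.sin(years / 15.0)
scaled = scale_over_period(raw, years)
smoothed = smooth_moving_average(scaled, width=20)
```

Because every strand is forced to unit standard deviation over the full period, a series with a sharp closing uptick and little earlier variance will show an especially large blade after scaling.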
Once again, I discussed problems with the Yamal series, criticizing its use here as follows:
6-1145 B 29:14 29:14 The beige series which has the strongest closing uptick in Box 6.4 Figure 1 is the Yamal series. When I plotted this series smoothing with a 30-year gaussian filter, I was unable to exactly replicate the uptick shown in this version. I checked the relationship of this series to gridcell temperature and was completely unable to replicate the claimed (0.49) correlation to temperature, obtaining only a correlation of 0.12. The authors here have used data from Yamal, while they used gridcell data from Polar Urals. There is an updated version of the Polar Urals series, used in Esper et al 2002, which has elevated MWP values and which has better correlations to gridcell temperature than the Yamal series. since very different results are obtained from the Yamal and Polar Urals Updated, again the relationship of the Yamal series to local temperature is “ambiguous”. [Stephen McIntyre (Reviewer's comment ID #: 309-41)]
See response to comment 6-1143 and note that the Polar Urals and Yamal series do exhibit a significant relationship with local summer temperature.
On to 6-1143, where I had objected to the continued use of Mann’s PC1 (criticized in the Peer Reviewed Literature.)
6-1143 B 29:14 29:14 One of the most prominent series on the right hand side of Box 6.4 Figure 1 is Mann’s PC1, which uses his biased PC methodology. It is so weighted that the series is virtually indistinguishable from the Sheep Mountain bristlecone series discussed in Lamarche, Fritts, Graybill and Rose (1984). These authors compared growth to gridcell temperature and concluded that the bristlecone growth pulse could not be accounted for by temperature, hypothesizing CO2 fertilization. Graybill and Idso (1993) also stated this. One of the MBH coauthors Hughes in Biondi et al 1999 said that bristlecones were not a reliable temperature proxy in the 20th century. IPCC Second Assessment Report expressed cautions about the effect of CO2 fertilization on tree ring proxies, which were not over-ruled in IPCc Third Assessment Report. At a minimum, the relationship is “ambiguous”. In addition, I tested the correlation of this series with HadCRU2 gridcell temperature and obtained a correlation of 0.0. Osborn and Briffa say that they themselves did not verify the temperature relationship for this data. Why not? At any rate, in this example, the authors have not excluded an important series with a well-known “ambiguous” relation to temperature. [Stephen McIntyre (Reviewer's comment ID #: 309-39)]
These comments were rejected for very weak reasons to say the least:
Rejected – the purpose of this Figure is to illustrate in a simple fashion, the variability of numerous records that have been used in published reconstructions of large-scale temperature changes. The text is not intended to give a very detailed account of the specific limitations in data or interpretation for each. Furthermore, though there is an ambiguity in the time-dependent strength of the response of Bristlecone Pine trees to temperature variability, there is other evidence that these trees do display a temperature response. Right or wrong, Mann and colleagues do apply an adjustment to the western trees PC1 in their (1999) analysis to account for possible CO2 fertilization. Other authors (Graumlich et al., 1991) assert that the recent rise in some high elevation conifers in the western U.S. could be explained as a temperature response (she can not confirm the LaMarche et al findings). The issue is clearly complex, as will be noted in a new paragraph on tree-ring problems that will be added to the text.
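For readers who want to try the gridcell correlation checks described in comments 6-1145 and 6-1143, the calculation can be sketched as below. This is a minimal sketch under stated assumptions: a plain Pearson correlation over the years common to both series, with no allowance for seasonal windows, autocorrelation or smoothing, any of which could explain differences between one calculation and another. The function name `overlap_correlation` and the demo arrays are hypothetical, and the demo values are made up rather than taken from any chronology or from HadCRU2.

```python
import numpy as np

def overlap_correlation(proxy_years, proxy, temp_years, temp):
    """Pearson correlation of two annual series over their common years.

    Returns (r, n) where n is the number of overlapping years.
    """
    proxy = np.asarray(proxy, dtype=float)
    temp = np.asarray(temp, dtype=float)
    # Indices of the years present in both series.
    common, i_proxy, i_temp = np.intersect1d(
        proxy_years, temp_years, return_indices=True
    )
    r = np.corrcoef(proxy[i_proxy], temp[i_temp])[0, 1]
    return r, common.size

# Made-up demo: the "proxy" is an exact linear function of the "temperature",
# so the correlation over the overlap is 1 by construction.
demo_temp_years = np.arange(1880, 2001)
demo_temp = np.sin(demo_temp_years / 7.0)
demo_proxy_years = np.arange(1850, 1991)
demo_proxy = 2.0 * np.sin(demo_proxy_years / 7.0) + 1.0

r, n = overlap_correlation(demo_proxy_years, demo_proxy, demo_temp_years, demo_temp)
print(f"r = {r:.2f} over {n} common years")
```

Reporting the number of overlapping years alongside the correlation makes it easier to see whether two people are actually comparing the same period.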