I’ve been re-visiting some proxy data. I noted last summer that Rob Wilson had archived a considerable amount of B.C. data in Aug 2007, and noticed that he subsequently archived the data versions used in Wilson et al 2007 at NCDC here in Sept 2007. (Not all of Rob’s data is archived, as he isn’t in control of all the data that he’s been involved with; for example, Brian Luckman of the University of Western Ontario is holding out on archiving data, notwithstanding commitments of the IAI program discussed here.)
Wilson et al 2007 attempts to argue that the Divergence Problem is not necessarily as bad as it seems. They calculate a composite over the period 1750-2000 (so it’s relatively up-to-date) and, perhaps responding somewhat to criticisms from CA about the overuse of stereotypes, set as one of their criteria that the proxy not have been included in prior compilations.
There are many aspects of the methodological description that are better than has been traditional in the field, though the descriptions are far from perfect. Unfortunately, despite many promising features, efforts to replicate results using published descriptions and data quickly foundered.
Here are their criteria for selecting the 15 sites:
1. To ensure the independence of this study, only TR proxy series that had not been used in previous reconstructions of ENH temperatures would be considered.
2. As undertaken by D’Arrigo et al.  each TR series must have acceptable replication (~10 series within each site chronology) from 1750 to present.
3. Nonpublished TR data must extend to 1995 or beyond.
4. The TR proxy series must correlate at >0.40 against an optimal seasonal parameter of ‘‘local’’ gridded mean temperature data from the CRU3 [Brohan et al., 2006] land only data set. Even if a series correlates at <0.40, but the inferred association is significant at the 95% confidence limit, it was still rejected from further analysis.
5. No significant autocorrelation (as measured using the Durbin-Watson statistic) must be observed in the model residuals from regressing the TR proxy time series against the local seasonal temperature data. If a significant divergence exists between the TR data and local temperatures, this would be expressed as trend in the model residuals and hence they would be autocorrelated.
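Criteria 4 and 5 can be sketched in a few lines of R. This is only an illustration under stated assumptions, not the procedure actually used by Wilson et al: the helper name `screen_series` is hypothetical, the data are synthetic, and the Durbin-Watson cut-off of 1.5 is the rule-of-thumb range mentioned later in this post rather than a formal significance test. The DW statistic is computed directly from the residuals so that no extra package is needed.

```r
# Hypothetical helper: screen one proxy series against a local temperature series.
# `proxy` and `temp` are numeric vectors covering the same years.
screen_series <- function(proxy, temp, r_min = 0.40, dw_min = 1.5) {
  r  <- cor(proxy, temp)               # criterion 4: correlation with local temperature
  e  <- residuals(lm(proxy ~ temp))    # residuals from regressing proxy on temperature
  dw <- sum(diff(e)^2) / sum(e^2)      # Durbin-Watson statistic (near 2 = little lag-1 autocorrelation)
  list(r = r, dw = dw, pass = (r > r_min) && (dw > dw_min))
}

# Synthetic illustration only (not the Wilson et al data)
set.seed(1)
temp  <- rnorm(100)
good  <- 0.8 * temp + rnorm(100, sd = 0.5)     # tracks temperature throughout
drift <- good - seq(0, 3, length.out = 100)    # same series plus a divergence-like trend

screen_series(good, temp)
screen_series(drift, temp)
```

In this toy case the drifting series still correlates with temperature well above 0.40, but the trend left in its residuals pulls the DW statistic far below 2, which is the behaviour that criterion 5 is meant to catch.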
Unfortunately, they didn’t say how many series were examined to whittle down to the final 15: did they examine all the series at ITRDB? If not, how did they create a short list? In some cases, it sounds as though there was some sort of pre-screening. For example, they say of the Mongolia data used:
These two series were chosen from data presented by D’Arrigo et al.  as they expressed the strongest response to summer temperatures.
Without knowing how the population was originally established, it’s hard to test the selection from first principles.
I thought that it would be interesting to look at the 20th century (1920-2000) trends for all 15 series in the network, which could be done easily using the data archived at NCDC here. This resulted in the following barplot of trends for each of the 15 sites. 7 of 15 sites showed a decreasing trend in the 1920-2000 period; there was certainly not a consistent pattern of 20th century increase.
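A trend comparison of this kind can be sketched as follows. The helper name `trend_per_decade` and the three site series are made up for illustration; the sketch assumes each site is held as an annual `ts` object and that "trend" means the ordinary least squares slope over 1920-2000.

```r
# Hypothetical helper: OLS trend (per decade) of an annual ts over a window.
trend_per_decade <- function(x, start = 1920, end = 2000) {
  w  <- window(x, start = start, end = end)   # restrict to the analysis period
  yr <- as.numeric(time(w))                   # calendar years as the regressor
  10 * unname(coef(lm(as.numeric(w) ~ yr))[2])  # slope per year, scaled to per decade
}

# Synthetic stand-ins for site series (names and values are made up)
set.seed(2)
sites <- list(
  A = ts(cumsum(rnorm(251, 0.01, 0.1)), start = 1750),
  B = ts(rnorm(251), start = 1750),
  C = ts(seq(1, -1, length.out = 251) + rnorm(251, sd = 0.2), start = 1750)
)
trends <- sapply(sites, trend_per_decade)
trends
```

Passing the resulting named vector to `barplot(trends)` gives one bar per site, and `sum(trends < 0)` counts the sites with a decreasing trend.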
To the extent that there was an overall increase, this was obviously strongly impacted by the two series marked here as “COL” and “MON”. “COL” is a bristlecone series from San Francisco Peaks, Arizona collected by students of Malcolm Hughes and “MON” is a composite of two Mongolian series collected by Jacoby (but not from the Sol Dav site used over and over again.)
The “COL” site (San Francisco Peaks AZ) was based on data from Salzer and Kipfmueller, discussed passim at CA in the past; it had also been collected by Graybill and is included in Mann’s PC1, though it was not one of the stronger contributors to it. Salzer and Kipfmueller’s article is new, but my guess is that they incorporated the earlier Graybill data, so strictly speaking, it may not be completely “independent” of the Graybill data. I’ll return to this on another occasion after I find out a little more about this site. Examination of this site is also hampered by the fact that Salzer and Kipfmueller have not archived measurement data, though they have archived their temperature reconstruction (which matches the Wilson et al 2007 version for the period after 1750 up to re-scaling and re-centering.)
Today I’ll discuss the Mongolian data, the site which showed the second-largest increase. Again, Mongolian data has been discussed here in the past, but not the two series in this study. Wilson et al describe their handling of this data:
This record was derived from combining two temperature-sensitive elevational tree line RW records from Mongolia–Khalzan Khamar (Larix sibirica; ITRDB code: MONG009, 49.55N, 91.34E [2500 m]) and Horin Bugatyin Davaa (Pinus sibirica; MONG009 [MONG008], 49.22N, 94.53E [2229 m]). These two series were chosen from data presented by D’Arrigo et al.  as they expressed the strongest response to summer temperatures. The chronologies were detrended using standard methods, and after normalizing to the common period, averaged to derive a mean series. The period replicated with >10 series is 1636–1998. This composite record is independent of the Solongotyin Davaa record used in D’Arrigo et al.  and correlates well (r = 0.70) with gridded June–July temperatures (Table 1).
There is a typo here as Horin Bugatyin Davaa is MONG008, rather than MONG009, but this is easily spotted. Site chronologies are archived at NCDC. I had previously downloaded these series and made time series from them. Following the above recipe, I determined the “common period” (1641-1998), calculated the mean and standard deviation for each series over the “common period” and used these to standardize the series, then averaged them using the following garden-variety script:
#this is the composite for use in calculations
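The recipe described above (standardize each chronology over the common period, then average) can be sketched as below. This is not the original script: the helper name `make_composite` is hypothetical, the two input series are synthetic, and the sketch assumes that the "common period" is simply the span of years where both chronologies have values.

```r
# Hypothetical helper: average two chronologies after standardizing each
# over their common period (assumed to be the overlap of the two series).
make_composite <- function(a, b) {
  both <- ts.intersect(a, b)                              # keep only overlapping years
  za <- (both[, 1] - mean(both[, 1])) / sd(both[, 1])     # z-score of first chronology
  zb <- (both[, 2] - mean(both[, 2])) / sd(both[, 2])     # z-score of second chronology
  (za + zb) / 2                                           # annual mean of the two z-scores
}

# Synthetic chronologies with different spans (values are made up)
a <- ts(1 + 0.2 * sin(1:120), start = 1900)   # 1900-2019
b <- ts(1 + 0.3 * cos(1:100), start = 1910)   # 1910-2009
comp <- make_composite(a, b)
tsp(comp)   # start year, end year, frequency of the composite
```

Because both inputs are standardized over the same years, the composite has mean zero over the common period by construction, which is a quick sanity check on any implementation.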
I also excerpted the HadCRU3 gridcell for 47.5N, 92.5E, calculating the Jun-Jul average according to the information in Table 1 (using the NetCDF version of HadCRU3 that I downloaded last summer) as follows:
tmong=read.hadcru3(lat=49, long= 92)
seasavg=function(x,index) round(ts(apply( array(x,dim=c(12,length(x)/12))[index,],2,mean),start=floor(tsp(x)[1]) ),2) #annual mean over the months in index, e.g. 6:7 for Jun-Jul
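For readers who find the one-liner dense, here is a more verbose sketch of the same idea with a toy check. The name `season_mean` is hypothetical, and the sketch assumes a monthly `ts` that starts in January and contains only complete years.

```r
# Hypothetical, more verbose seasonal-average helper.
# `x` is a monthly ts starting in January with complete years;
# `months` are month numbers (e.g. 6:7 for June-July).
season_mean <- function(x, months) {
  m <- matrix(as.numeric(x), nrow = 12)                # one column per year, one row per month
  ts(colMeans(m[months, , drop = FALSE]),              # average the selected months in each year
     start = floor(tsp(x)[1]))                         # annual series starting in the first year
}

# Toy check: two years of monthly values 1..24
x <- ts(1:24, start = c(1990, 1), frequency = 12)
season_mean(x, 6:7)   # June-July means for 1990 and 1991: 6.5 and 18.5
```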
[Note: This paragraph has been revised to reflect comments by Rob Wilson below as my original calculations were done on July-August instead of June-July.] I calculated the correlation between the Mongolia tree ring composite and the gridcell series, hoping to replicate the reported value of r= 0.70. I obtained a value of 0.47, above the cut-off of 0.40 but much lower than the reported 0.70. Using July-August temperatures, I obtained a correlation of only 0.20. Here is the re-plotted graphic:
Here is my simple calculation:
# 0.467527 # was 0.1995141 for Jul-Aug
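The alignment step matters in a calculation like this, since the proxy and temperature series cover different years. A minimal sketch with made-up numbers, assuming both series are annual `ts` objects (the names `proxy` and `temp` are placeholders, while `mong` and `tmong` follow the column names used in this post):

```r
# Synthetic stand-ins: a proxy composite and a seasonal temperature series
# covering different spans of years (values are made up).
proxy <- ts(c(1, 2, 3, 4, 5, 6), start = 1990)   # 1990-1995
temp  <- ts(c(1, 3, 2, 4), start = 1992)         # 1992-1995

# Align on the overlapping years; named arguments become column names
Z <- ts.intersect(mong = proxy, tmong = temp)

# Correlation over the overlap only
r <- cor(Z[, "mong"], Z[, "tmong"])
r
```

Swapping the seasonal window used to build the temperature series (for example July-August instead of June-July) changes only the `temp` input, which makes it easy to compare the resulting correlations side by side.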
Here is a comparison of the plot in Wilson et al 2007 and my replication of their figure, plotted as follows:
Obviously, the versions plotted in Wilson et al 2007 differ from the versions that I calculated, especially for the tree ring data, but even for the temperature data. I did a regression attempting to relate the Mongolian version used in Wilson against the two archived chronologies; this should yield a perfect match, but didn’t. The 20th century (1920-2000) trend in the RW data in this calculation is negligible – it’s about the same as YUS.
Figure 1. The right panel shows my replication using June-July temperatures. My first effort used July-August and is shown at the foot of the post.
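The logic of the regression check mentioned above is that a composite built as a linear combination of two chronologies must be reproduced exactly by regressing it on those two chronologies. A sketch with synthetic data (all names and values here are illustrative, not the archived series):

```r
# Synthetic chronologies (values are made up)
set.seed(3)
crn_a <- rnorm(200)
crn_b <- rnorm(200)

# A series that really is the average of the two standardized chronologies
true_comp <- (scale(crn_a)[, 1] + scale(crn_b)[, 1]) / 2

# A series that is related to them but was processed differently
other <- true_comp + rnorm(200, sd = 0.5)

# Largest absolute residual from regressing each candidate on the two chronologies
max_resid <- function(y) max(abs(residuals(lm(y ~ crn_a + crn_b))))
max_resid(true_comp)   # effectively zero: perfect reconstruction
max_resid(other)       # clearly non-zero: not a simple composite of the inputs
```

A non-trivial residual in such a regression indicates that the target series was not constructed from the supplied inputs by scaling and averaging alone.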
It’s possible that Wilson did not actually use the chronologies archived by Jacoby as MONG008 and MONG009, but instead re-processed the measurement data to produce his own chronologies. [Update: Wilson says below that he used chronologies supplied by D'Arrigo and he is in the process of investigating.] The methodology only says:
The chronologies were detrended using standard methods…
Presumably Jacoby’s archived chronology was also produced using “standard methods” – however, the results are obviously different. If, as seems to be the case, there are a number of different “standard methods”, referees in this field should not accept this sort of uninformative statement and should instead require authors to state what they did in terms that enable replication.
But regardless of the replication issue, the procedure that I used – which seems like a garden variety implementation of the described methodology – ended up with a different looking series, so the methodology is obviously not very “robust”.
In the selection protocols, they report a Durbin-Watson statistic of 1.69, within the safe range of 1.5-2. In my calculation using a standard package as shown below, the DW statistic was 1.26, below the minimum of 1.5, well into the danger zone and outside their acceptance criteria. The r2 statistic for this fit was 0.22, with an adjusted r2 of 0.21 (the adjusted r2 was 0.024 relative to the Jul-Aug average).
library(lmtest) #provides dwtest
fm=lm(mong~tmong,data=Z);summary(fm) #adj r2 #0.2055 #was 0.024 for Jul-Aug
dwtest(fm) #DW =1.26 #was 1.3181 for Jul-Aug
I’ve only done this gridcell so far. I’ll spot check a few others, including the Colorado one and see what happens.
Update: Here is a figure from D’Arrigo et al 2000 illustrating one version of the Horin Bugatyin Davaa series.
Here is a plot showing (top panel) the archived Horin Bugatyin Davaa chronology at ITRDB (mong008.crn). To double-check this, I calculated the chronology using a Jacoby-style method from the measurement data archived for HBD (mong008.rwl) and obtained a very close match to the archived crn chronology. So whatever the screw-up is, it is not just in the crn version, but in the measurement data as well. Rob notes that the ITRDB version is shorter than the version illustrated in D’Arrigo et al 2000. The problem here is that the ITRDB version was archived in April 2004, some time after the original publication.
For comparison, I also examined the Mongolian sites at ITRDB for any other sites higher than the 2240 m of the HBD site. The mong001 site (Hovsgol Nuur) did not have any trend. For some reason, it wasn’t mentioned in the D’Arrigo et al 2000 study or Wilson et al 2007 study.
Wilson, R., D’Arrigo, R., Buckley, B., Büntgen, U., Esper, J., Frank, D., Luckman, B., Payette, S., Vose, R. and Youngblut, D. 2007. A matter of divergence – tracking recent warming at hemispheric scales using tree-ring data. JGR – Atmospheres, Vol. 112, D17103, doi:10.1029/2006JD008318