Wilson et al 2007 (previously discussed here) considers a Kyrgyzstan series that has numerous issues, and the usual provenance problems unfortunately occur once again. Over and above that, it uses multiple **inverse** regression, a procedure used all too casually by dendros. In this case, the procedure flips over one of the ring width series and results in the reconstruction having a substantially higher 20th century trend than any of the constituent series. This form of multiple inverse regression is a little different from Mannian inverse regression and arguably even worse. Also, when I replicated the recon using ITRDB chronologies, I got quite different (higher) results in the 18th century and no 20th century trend.

Here is the methodological description in Wilson et al 2007. (In fairness to the authors, they’ve at least given readers a fighting chance by giving ITRDB identification numbers. I’ve been trying for nearly 3 years now to get Hegerl, Crowley et al to identify the series that they used in the Urals and Mongolia without any success.) Wilson et al:

A search through the ITRDB, found surprisingly few temperature sensitive TR data sets that came up to at least 1995. Kyrgyzstan was one region where such chronologies were found (sampled and measured by the Swiss Federal Institute for Forest, Snow and Landscape Research). In this region, both RW and MXD data were obtained from two spruce (Picea shrenkiana) sites Sarejmek (ITRDB code: RUSS152, 41.36N 75.09E) and Tschongkys (RUSS164 42.11N 78.11E). Chronologies were computed using standard techniques. The period covered by at least 10 series in each chronology is 1689–1995. To account for the varying coherence between each chronology, the chronologies were not averaged to derive site mean series, but rather the RW and MXD chronologies were utilized separately as potential predictor series in a stepwise multiple regression against gridded temperatures. The final optimal model was calibrated against June–July mean temperatures, with the final series being a linearly weighted combination of the Sarejmek MXD and RW data as well as the RW data from Tschongkys. The final Kirgistan reconstruction explains 36% (r = 0.61. Table 1) of the gridded temperature variance.

I downloaded the russ152w, russ152x, russ164w and russ164x chronologies from ITRDB. [UPDATE: When I originally downloaded the chronologies in 2004, there was an error in the russ152w chronology, which purported to show values well into the 21st century. In my original collation, I picked out the STD version from the Schweingruber data set (which has three versions), but when I re-visited this matter a few days ago, I inadvertently picked out the first version, which, as pointed out by Craig Loehle and Rob Wilson below, was not the STD version. I’ve re-done the collation, this time picking out the STD version of the russ152w chronology, and re-done the calculations with it. Rob Wilson kindly sent me his versions, which enabled a prompt reconciliation: his measurement data set did not use 4 cores in the russ152w.rwl, but the match is now close. Subsequent paragraphs have been re-stated to reflect this reconciliation.]

I did the usual reverse-engineering regression, in which I regressed the Wilson KYR reconstruction against the constituent chronologies, using both the 4-series and 3-series combinations. Given that the final series is supposedly a “linearly weighted combination” of the russ152w, russ152x and russ164w series, the R^2 from this regression should be 1 or very close to it. The adjusted R^2 for the 4-series regression is 0.92 (improved from the 0.78 previously reported). Note the negative coefficient on the russ164w series: this series has been flipped over in the reconstruction.

| | Estimate | Std. Error | t value | Pr(>\|t\|) | Signif. |
|---|---|---|---|---|---|
| (Intercept) | 0.04224 | 0.01783 | 2.369 | 0.0186 | `*` |
| russ152w | 0.61866 | 0.02265 | 27.310 | <2e-16 | `***` |
| russ152x | 0.66404 | 0.02672 | 24.854 | <2e-16 | `***` |
| russ164w | -0.54274 | 0.02154 | -25.202 | <2e-16 | `***` |
| russ164x | -0.03597 | 0.02724 | -1.321 | 0.1879 | |

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.2789 on 241 degrees of freedom (135 observations deleted due to missingness)

Multiple R-squared: 0.9254, Adjusted R-squared: 0.9242

F-statistic: 747.5 on 4 and 241 DF, p-value: < 2.2e-16
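For readers who want to try this sort of check themselves, here is a minimal sketch of the reverse-engineering regression on synthetic data. The four series are random stand-ins rather than the ITRDB chronologies, and the weights and the small noise term are my own illustrative assumptions. The point is only that a series built as a linear combination of its inputs should be recovered with an R^2 near 1, and that a flipped series shows up as a negative coefficient.

```r
# Synthetic stand-ins for the four chronologies (hypothetical data, not ITRDB values)
set.seed(1)
n <- 200
chron <- data.frame(
  russ152w = rnorm(n), russ152x = rnorm(n),
  russ164w = rnorm(n), russ164x = rnorm(n)
)

# A "reconstruction" built from three of the series, with a negative weight
# on russ164w; weights and noise level are illustrative assumptions
recon <- 0.60 * chron$russ152w + 0.65 * chron$russ152x -
  0.55 * chron$russ164w + rnorm(n, sd = 0.05)

# Regress the reconstruction on all four candidate series
d <- cbind(recon = recon, chron)
fit <- lm(recon ~ ., data = d)

coefs <- coef(fit)
r2 <- summary(fit)$r.squared
print(round(coefs, 3))
print(r2)
```

If the R^2 from the real data falls well short of 1, as it did above, then the archived reconstruction is not simply a linear combination of the candidate series as downloaded.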

Using the ITRDB versions available to me, I attempted to replicate Wilson’s stepwise methodology as follows. First I did a multiple inverse regression of CRU gridcell temperature (42N, 77E) against all 4 chronologies:

fm=lm(cru~.,data=Z[,1:5])

summary(fm)

This yielded an adjusted r2 of 0.27 using the russ152w STD chronology (versus 0.36 with the russ152w RAW), with three “significant” coefficients, including a flipped russ164w.

Coefficients:

| | Estimate | Std. Error | t value | Pr(>\|t\|) | Signif. |
|---|---|---|---|---|---|
| (Intercept) | -0.48103 | 0.08636 | -5.570 | 1.85e-07 | `***` |
| russ152w | 0.33073 | 0.11410 | 2.899 | 0.004533 | `**` |
| russ152x | 0.44390 | 0.13098 | 3.389 | 0.000977 | `***` |
| russ164w | -0.38271 | 0.11170 | -3.426 | 0.000864 | `***` |
| russ164x | 0.03474 | 0.12550 | 0.277 | 0.782466 | |

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.9197 on 109 degrees of freedom (267 observations deleted due to missingness)

Multiple R-squared: 0.298, Adjusted R-squared: 0.2722

F-statistic: 11.57 on 4 and 109 DF, p-value: 7.3e-08

I then used the stepwise regression function in R (*step*):

fm0=step(fm)

This yielded a model selecting the three series noted by Wilson, again with a flipped orientation for the russ164w series.

| (Intercept) | russ152w | russ152x | russ164w |
|---|---|---|---|
| -0.4800288 | 0.3244294 | 0.4696676 | -0.3770913 |
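The two steps above (full multiple regression, then *step*) can be sketched end to end on synthetic data. Everything in the block is an assumption for illustration: the data frame `Z` here is random noise standing in for the chronologies, the "temperature" column `cru` is constructed from three of them with made-up weights, and the noise level is chosen so that the example behaves predictably. It is not a replication of the Wilson calculation.

```r
# Hypothetical stand-ins: four random "chronologies" and a synthetic target
set.seed(42)
n <- 150
Z <- data.frame(
  russ152w = rnorm(n), russ152x = rnorm(n),
  russ164w = rnorm(n), russ164x = rnorm(n)
)
# Target depends on three series only; russ164w enters with a negative sign
Z$cru <- 0.33 * Z$russ152w + 0.45 * Z$russ152x -
  0.38 * Z$russ164w + rnorm(n, sd = 0.5)

# Multiple inverse regression of the target on all candidate series
fm <- lm(cru ~ ., data = Z)

# Backward stepwise selection by AIC (trace = 0 suppresses the step log)
fm0 <- step(fm, trace = 0)
kept <- names(coef(fm0))
print(coef(fm0))

# The "reconstruction" is the fitted combination of the retained series
recon <- fitted(fm0)
r <- cor(recon, Z$cru)
print(r)
```

Note that *step* has no opinion on signs: it keeps whichever weights reduce AIC, so a ring width series can enter with a negative coefficient without any warning.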

Using the archived series, I ended up with a “reconstruction” that has a correlation of 0.54 (0.62 with russ152w RAW) to the gridded series, somewhat lower than the reported correlation of 0.61 (although I got 0.62 for the Wilson version against present CRU3). The emulation has a fairly similar appearance to the Wilson version, both being shown below, but there are also some notable differences.

Top – Excerpt from Wilson et al 2007 Figure 3; bottom – emulated from ITRDB data.

The next graphic shows a plot of the residuals between the re-stated Wilson emulation and the archived Wilson version, which remain very noticeable even after the reconciliation accomplished so far.

Here is a re-stated plot of the 4 candidate constituent series, plotted from data downloaded from ITRDB with the russ152w STD now shown.

I calculated the 1920-2000 trends for scaled versions of each of the 4 constituent series (scaling in order to make comparison to CRU easier). Three of the 4 series had **negative** trends in the 1920-2004 period; only russ152w had a positive trend in this period (0.006366476 SD units/year). My emulation of the Wilson method yielded a reconstruction with virtually no trend (0.002264408 SD units/year), but the Wilson reconstruction had an observable positive trend in the 1920-2004 period (0.008109656 SD units/year), which was greater than the trend of any of the constituent series.
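The arithmetic behind a combination out-trending its constituents is simple, and a small sketch may help. The two series below are hypothetical (a drifting-up series and a drifting-down series with made-up slopes and noise), each scaled to SD units before the trend is fitted. Because an OLS trend is linear in the data, the trend of a weighted combination is the same weighted combination of the individual trends, so a negative weight on a declining series adds to the trend. For simplicity the combination uses weights of +1 and -1 and is left in the units of its scaled inputs.

```r
# Hypothetical series: one drifting up, one drifting down (illustrative slopes)
set.seed(2)
year <- 1920:1995
up   <-  0.01 * (year - 1920) + rnorm(length(year), sd = 0.3)
down <- -0.01 * (year - 1920) + rnorm(length(year), sd = 0.3)

# Scale to SD units, then fit a linear trend (SD units per year)
std   <- function(x) as.numeric(scale(x))
trend <- function(x) unname(coef(lm(x ~ year))[2])

up_s   <- std(up)
down_s <- std(down)

# Flipping the declining series: weight +1 on up_s, weight -1 on down_s
combo <- up_s - down_s

t_up    <- trend(up_s)
t_down  <- trend(down_s)
t_combo <- trend(combo)
print(c(up = t_up, down = t_down, combo = t_combo))
```

In this toy case the combined trend equals `t_up - t_down`, which exceeds `t_up` whenever the flipped series is declining.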

Finally, as an exercise, I calculated ARMA(1,1) coefficients for the constituent series. For the russ152w STD series, the ARMA(1,1) coefficients were much reduced from the russ152w RAW version (which was close to a random walk), but the coefficients were both highly significant. For many series, I’ve found that an ARMA(1,1) model is highly significant relative to an AR1 model and always results in a much higher AR1 coefficient combined with a negative MA1 coefficient. This is characteristic of a wide variety of climate series.

| | ar1 | ma1 | intercept |
|---|---|---|---|
| estimate | 0.6708 | -0.4867 | 0.0019 |
| s.e. | 0.1298 | 0.1505 | 0.0784 |
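For anyone wanting to see the AR1 versus ARMA(1,1) pattern on a series where the answer is known, here is a minimal sketch using `arima.sim` and `arima` from base R. The simulated process and its parameters (AR 0.9, MA -0.6) are my own assumptions chosen to mimic the pattern described above; they are not estimates from any of the chronologies.

```r
# Simulate an ARMA(1,1) process with a high AR and a negative MA coefficient
# (parameter values are illustrative assumptions)
set.seed(3)
x <- arima.sim(model = list(ar = 0.9, ma = -0.6), n = 1000)

# Fit an AR(1) model and an ARMA(1,1) model to the same series
fit_ar1  <- arima(x, order = c(1, 0, 0))
fit_arma <- arima(x, order = c(1, 0, 1))

print(coef(fit_ar1))
print(coef(fit_arma))

# Lower AIC indicates the better trade-off between fit and parameter count
print(c(AR1 = fit_ar1$aic, ARMA11 = fit_arma$aic))
```

On a series like this the AR1 fit understates the persistence, because the negative MA term pulls down the lag-one autocorrelation that an AR1 model relies on.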

**UPDATE:**

Rob Wilson sent me the chronologies that he used for Kyrgyzstan. The discrepancies arise almost entirely from differences between Rob’s re-calculated chronologies and the ITRDB chronologies, which end up imparting very substantial differences to the overall result.

First here is a graphic comparing the 1750-1995 trends for Wilson’s Kyrgyzstan composite and 10 Schweingruber Kyrgyzstan chronologies, of which Wilson used 4 (3 after stepwise).

Plots of all 10 Kyrgyzstan STD chronologies as archived at ITRDB are shown below, and the lack of trend is obvious.

Rob sent me the chronologies as he calculated them for the sites that he used. Below are plots comparing Rob’s chronologies to ITRDB chronologies – in both cases, Rob’s chronologies have a greater 20th century increase than the ones at ITRDB.

Rob said that he had “no idea what detrending methods were used to derive the ITRDB version – hence I would never use it”. Fair enough. But this is the same problem that I’ve complained about over and over again in this trade. Appendix A6 says only:

Chronologies were computed using standard techniques.

As Rob observes below, Appendix A3 discussing Scandinavia states:

These chronologies were detrended using so-called standard techniques (either negative exponential or regression functions of negative or zero slope).

and in an email, Rob stated to me that this applied to Kyrgyz as well. I’m sure that ITRDB would also say that they used so-called “standard techniques”, but their definition obviously differs somewhere from Rob’s. They have a relatively recent report (June 2005) on the chronology here. If Rob is unable to determine from the June 2005 metadata how the chronology was done, thereby making it unusable, then ITRDB needs to do a better job; this is a facility for dendros, so shouldn’t they take some responsibility for the defective metadata?

## 10 Comments

One of the dendro analysis steps is to detrend each series for the nonlinear effect of tree aging. It looks like russ152w needs that step, but the others are flat and don’t need this detrending. Was detrending done before archiving the data you plot here? Did Wilson do detrending? What happens if you detrend a flat series, does it make it go up at the end (recent) period?

There is a report on the russ152w chronology here ftp://ftp.ncdc.noaa.gov/pub/data/paleo/treering/measurements/correlation-stats/russ152w_gap.txt, which states that the chronology passes their QC tests. The russ152w version has a similar shape to Schweingruber’s own chronology at ftp://ftp.ncdc.noaa.gov/pub/data/paleo/treering/updates/wsl/raw-data/chronos/jmekpcsh_hcr.txt.

Having said that, I did my own chronology calculation using a Jacoby style and got a very different looking result, one without the high early values – more like what must underpin the Wilson recon. So the measurement data can support a different chronology than the one actually archived.

This poses a couple of questions – if there is a defect in the archived Schweingruber and ITRDB chronologies, what caused the problem, how prevalent is the problem and why did QC procedures fail to pick it up?

Second, who actually calculated the chronology version used by Wilson (leaving aside the question of whether the recalculation has merit relative to the ITRDB version)? Why was this particular chronology re-calculated, while the Salzer and Kipfmueller chronology wasn’t? When they noticed the problem with the ITRDB chronologies, did they notify ITRDB? If not, why not?

Steve quoting Wilson et al:

Isn’t this the same kind of cherry-picking Jacoby did with the cedar data, except on a bigger scale? See “A few good series” threads: http://www.climateaudit.org/?p=29, http://www.climateaudit.org/?p=570. Out of 1000 series unrelated to temperature, we would expect to be able to find about 10 that are individually “highly significant” at the 1% level.

Or is this your point?

Dear Steve,

As a quick answer, I would agree with Craig L that your version of russ152w is a simple mean of the non-detrended raw data. I am sure that if you used your so-called Jacobian version, you would get results much closer to mine.

As a rule, if I can get my hands on raw data, I will process my own chronology. That was not always possible for the JGR paper. As for published reconstructions (e.g. Salzer and Kipfmueller), I would use the archived reconstruction.

Standard methods – as you know fine well – are the use of negative exponential functions or linear regression functions of negative or zero slope.

and yes – my paper was the ultimate cherry-pick and it was a joy to try and make cherry pie. I usually am only good at cooking curries.

The paper was trying to address divergence issues at large scales and I needed to work with proxies that showed no divergence at the local scale.

Rob

Why not a collaborative effort using the recent Almagre data?

I must have misunderstood Rob Wilson’s last sentence.

Was he being petulant or just making a joke (albeit a somewhat lame one)?

Rob emailed me data versions of the series that he used (uploaded to http://www.climateaudit.org/data/wilson) and I re-did the calculations, improving the reconciliation but leaving a number of questions unresolved. Schweingruber archives 3 different chronologies together. I originally collated ITRDB crn data in 2004 and picked off the most appropriate version at that time for chronologies that were then archived correctly. Some series were not archived correctly, something that I pointed out at the time to Schweingruber and to Bruce Bauer, and russ152w was one of them. (They were shown as going out into the 21st century, so it wasn’t too hard to tell that there was a problem.) When I re-visited this, I re-collated the russ152w series, which had been corrected in the meantime, but this time I picked up the first chronology instead of the second chronology. I’ve re-done the post using the russ152w STD chronology as the most appropriate; apologies for the error. I’ve noted the update in the post and noted prior results, but have re-stated the post considerably. I’m doing this in the light of valid criticisms.

This doesn’t entirely resolve things, however, as I still get noticeable differences. I’ll take a more detailed look at Rob’s chronologies and try to resolve these. I notice that his russ152w measurement data set excludes 4 trees relative to the russ152w.rwl data set at ITRDB; I’m not quite sure why or what impact this has.

In addition, Rob has not commented on the flipping of the russ164w series in the course of the linear regression. If there is a relationship between temperature and ring widths, then there should be a consistent relationship between the coefficients. This sort of sign flipping is overfitting pure and simple and creates a false fit between the chronologies and the temperature data, which, in this case, enhances the trend.

I’ve updated this post based on information that Rob Wilson sent me offline. There are material differences between Rob’s RW chronologies and the ITRDB chronologies, both presumably done “using standard techniques”, but Rob’s versions have a much more pronounced 20th century increase. Although the ITRDB chronologies were finalized as recently as June 2005, Rob says that the information is insufficient for him to know how they did their chronologies.

Steve,

shame on you for not passing on ALL the info I sent you.

As stated in Appendix A3 for Northern Scandinavia

This approach was used for all data I had raw data for.

Very easy to replicate if you took the trouble to do so.

Rob

Rob, I’ll clarify the point; I wasn’t trying to pick a scab on this particular issue and apologize for any slight. In the paper, Appendix A3 says, as Rob points out:

The handling of each of the 15 sites is quite different. Data is archived for some of the sites (N Scandinavia, Kyrgyzstan, one Mongolia site, Nepal, Idaho, B.C., Yukon North, Wrangells), but is not archived for others (Alps, Tatra, W Siberia, one Mongolia, Tien Shan, Colorado, Yukon South, N Quebec). Rob’s descriptions are relatively good by dendro standards.

In some cases, Rob re-calculated chronologies; in other cases, he did not. For example, he didn’t re-calculate the Salzer and Kipfmueller chronology. In Appendix A7 (Kyrgyz) he said:

I think that the paper makes it sound like ITRDB/Schweingruber used standard techniques; in his email to me, Rob said that:

So readers can be assured that Rob’s email said that these methods were used for Kyrgyz as well. I more or less know what Rob’s doing and I can get results that look sort of like his, but so far I can’t replicate his results. His Sarejmek data set excludes 4 cores that were included in the ITRDB data set, but differences between his chronology and the ITRDB chronology cannot be attributed to this difference, and the Tschongkys datasets match. As to methodology, I’d like to see methodologies that say that he used COFECHA Option 2 (or whatever); that’s all that I’m saying here. But I’ll clarify the point in the post for future readers.