The relationship of bristlecone/foxtails to gridcell temperature is something that I’ve discussed at length, but, surprisingly, I’ve never illustrated it at the blog. This is a type of relationship that, in some ways, is well suited to blogs. It’s simple to discuss; it’s important. It would be amply illustrated and discussed in business feasibility studies for equivalent issues, but one tends to simply cite the statistics rather than illustrate them in detail in an academic article.
I was reminded of this relationship both by re-visiting Osborn and Briffa and by an interesting recent presentation by John Christy. Despite the claims of Osborn and Briffa, I am simply unable to verify their claims of a significant relationship. Their statistical presentation appears to have neglected the most elementary considerations, such as calculating a t-statistic as a method of assessing significance in light of degrees of freedom.
Read and weep.
In Christy’s recent presentation sponsored by the George Marshall Institute, he happened to discuss temperature changes in the California Sierra Nevadas, well known to readers of this blog, as the home of the foxtails (the bristlecone cousins) and just across the Owens valley from the most important bristlecone site (Sheep Mountain). As I’ve mentioned on many occasions, the foxtails and bristlecones are not just relevant to MBH, but, together with the Yamal substitution, are the important proxies for establishing HS-ness in 7 main studies (in addition to MBH, also Crowley and Lowery, Esper et al 2002, Mann and Jones 2003, Rutherford et al 2005, Osborn and Briffa 2006 and, almost certainly, Hegerl et al 2006).
Wouldn’t it be remarkable if all of these studies had neglected the simple precaution of verifying that there was a statistically significant relationship of these essential proxies to gridcell temperature?
Here is a location map from Christy et al 2006. In this case, the mountain sites appear to be on the west side of the Sierra Nevadas, while our interest is on the east side. I’ll try to see if there is some east-side information, but for now, I’ll illustrate the west-side information.
Christy compares valley temperatures with mountain site temperatures. To put this into context, I’m first going to show the HadCRU gridcell series.
Figure 2. HadCRU gridcell annual temperatures for gridcell 37N, 117W.
If you compare this to Christy’s "valley sites" in the top panel of the figure below, then you see that some main features are common to both: for example, the high values starting in the late 1920s; the sequence of declining values in the early 1960s; the secular upward drift. Christy contrasted this upward trend in valley temperatures with no change in the Sierra sites in the graph below, which he attributed to a land use effect: irrigation.
Figure 3. From Christy et al 2006 illustrating difference between valley and mountain temperatures. Time period differs from other graphics.
I’m going to compare this graphic to gridcell temperature series and proxy series over an 1870-2000 period. When I was doing this, I was able to re-size the various figures for comparison. This hasn’t come out very successfully on the blog (I may re-size this later), but you’ll have to be content with a little squinting back and forth for now. Next is a plot of the foxtail data over the same period. If you compare it to the gridcell data, you will notice some obviously different patterns.
Both series have some locally high values in the 1930s, but the foxtail highs appear to come a little later. In the early 1960s, temperatures decline and stay at fairly low levels through the 1970s; in contrast, there is a huge pulse in foxtail growth with no analogue in the temperature record. In the late 19th century, there is also a pulse in foxtail growth, while the gridcell temperature shows no trend whatever.
This is reflected in the basic statistics. Using the gridcell series (and these failures would only be exacerbated by the mountain data), the gridcell correlation for the foxtails is 0.048. Quoting simple correlations is a pernicious habit of the Hockey Team: for statistical significance, it’s far more relevant to look at t-statistics (especially when we discuss decadal correlations as we will in a minute or two.) Here the r2 is 0.002; the adjusted r2 is -0.006; the t-statistic for proxy on temperature is an insignificant 0.52 (without even worrying about autocorrelation) and the DW is 0.986.
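For readers who want to check this kind of thing themselves, here is a minimal sketch of the diagnostics being quoted (correlation, r2, adjusted r2, t-statistic and Durbin-Watson). The function name and the data are hypothetical; the original calculations were presumably done on the actual HadCRU and foxtail series.

```python
import numpy as np

def fit_stats(proxy, temp):
    """Regress a proxy series on a temperature series and report the usual
    diagnostics. `proxy` and `temp` are equal-length annual series
    (synthetic here; substitute the real gridcell and ring-width data)."""
    n = len(proxy)
    X = np.column_stack([np.ones(n), temp])           # intercept + temperature
    beta, *_ = np.linalg.lstsq(X, proxy, rcond=None)  # OLS fit
    resid = proxy - X @ beta

    r = np.corrcoef(proxy, temp)[0, 1]                # simple correlation
    r2 = r ** 2
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - 2)         # one regressor
    t_stat = r * np.sqrt((n - 2) / (1 - r2))          # t-statistic for the slope
    dw = np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)  # Durbin-Watson
    return r, r2, adj_r2, t_stat, dw
```

With n around 131 annual values, a t-statistic anywhere near 0.5 is nowhere close to the roughly 2.0 needed for significance, even before any autocorrelation adjustment.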
Below the foxtails, I’ve shown the same plot for the MBH PC1 (used in Mann and Jones 2003, Rutherford et al 2005, Osborn and Briffa 2006 and, I’ll bet dollars to doughnuts, in Hegerl et al 2006). In the long version, this is little more than Sheep Mountain, which has about 80% of the weight. This is just across the valley from the foxtails.
Here the correlation is -0.005; r2 is 0.000; adjusted r2 is -0.052; t-statistic is -0.009 and DW is 0.869.
These results obviously re-state the position of Lamarche et al 1984 and Graybill and Idso 1993 (and even Biondi et al 1999) that this growth is not related to gridcell temperature.
While I’ve discussed bristlecones primarily in the context of MBH, as noted above, they continue to be used, most recently in Osborn and Briffa 2006, which purports to verify a relationship to gridcell temperature for each proxy. Let’s look at their values and statistical analysis and see how they stack up. Osborn and Briffa report a correlation of foxtails to gridcell temperature of 0.18, whereas I was only able to replicate 0.045. Osborn and Briffa explained the difference as being due to their use of CRUTEM data rather than HadCRU data (although HadCRU was cited in the article). CRUTEM data began only in 1888 while HadCRU data started in 1870. Osborn and Briffa explained by email that the earlier values in HadCRU2 were "spurious" as there were no land stations available (although GHCN has land stations commencing by 1870.) It appears likely that neither data set has incorporated all available data. Obviously this is a very unsatisfactory "explanation" and raises as many questions as it purports to explain.
Osborn and Briffa did not even do a simple t-test although this is available merely by obtaining the correlation as the regression coefficient between two scaled series, yielding the t-statistic in the process. By at least doing a regression, one can also get the DW statistic which shows extreme autocorrelation in the residuals. These methods have been available for nearly a century.
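The point about scaled series is worth making concrete. If you standardize both series first, the regression slope is exactly the Pearson correlation, and the same fit hands you the t-statistic and the residuals (and hence a Durbin-Watson statistic) at no extra cost. A minimal sketch, with hypothetical names:

```python
import numpy as np

def scaled_regression(x, y):
    """Regress standardized y on standardized x (no intercept needed after
    centering). The slope equals the Pearson correlation of x and y, and the
    fit yields a t-statistic and residuals as a by-product."""
    xs = (x - x.mean()) / x.std(ddof=1)
    ys = (y - y.mean()) / y.std(ddof=1)
    slope = np.sum(xs * ys) / np.sum(xs ** 2)   # identical to corr(x, y)
    resid = ys - slope * xs
    n = len(x)
    se = np.sqrt(np.sum(resid ** 2) / (n - 2)) / np.sqrt(np.sum(xs ** 2))
    return slope, slope / se, resid
```

In other words, anyone who computed the correlation this way would have had the significance test and the autocorrelation diagnostic sitting in front of them.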
The situation with the MBH PC1, astonishingly still in use, is even more peculiar. Osborn and Briffa said that they did not test this series, relying instead on previous results from Jones and Mann. How hard would it have been to do this? It took me about 10 seconds. Or maybe they didn’t like the result of 0.00. (Not reporting adverse statistics is again an all too frequent Hockey Team practice.) Osborn and Briffa quote Jones and Mann as having a "decadal" correlation of 0.52 for the PC1.
As a first test, I calculated decadal averages in decades starting in year 6 and repeated the above calculation. Because there are only 9-10 degrees of freedom, one expects that some apparently high correlation might not be accompanied by statistical significance. In this case, I got a correlation of 0.38 (as compared to 0.56 cited in Jones and Mann 2004, quoted in Osborn and Briffa 2006). Looking back, they did their calculation on a 1901-1980 period, while I did it for the full period. I’ve not bothered trying to see what I get for 1901-1980 as I don’t think that it matters anyway. The adjusted r-squared is only 0.056; the t-statistic is a mere 1.29 (not significant) and DW is a ghastly 0.36.
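The decadal version of the test is equally simple. The sketch below averages both annual series into non-overlapping decadal blocks and recomputes the correlation and t-statistic; it simplifies the block alignment (the post uses decades starting in year 6), so it is an illustration of the procedure, not a replication of the exact numbers.

```python
import numpy as np

def decadal_correlation(proxy, temp, block=10):
    """Average two annual series into non-overlapping decadal blocks and
    recompute correlation and t-statistic. Over 1870-2000 there are only
    ~13 blocks, so even a correlation near 0.4 is not significant.
    (Block alignment simplified relative to the decades-starting-in-year-6
    convention used in the post.)"""
    n_blocks = len(proxy) // block
    p = proxy[: n_blocks * block].reshape(n_blocks, block).mean(axis=1)
    t = temp[: n_blocks * block].reshape(n_blocks, block).mean(axis=1)
    r = np.corrcoef(p, t)[0, 1]
    t_stat = r * np.sqrt((n_blocks - 2) / (1 - r ** 2))
    return r, t_stat
```

With about 11 degrees of freedom, the two-sided 5% critical value for the t-statistic is roughly 2.2, so a t-statistic of 1.29 falls well short.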
Jones and Mann obtained their results using a gaussian filter of length 13 years. I know that the justification is an attempt to get "low-frequency" results, but then you need to allow for the reduced degrees of freedom somehow. As a first cut, decadal averaging seemed a more sensible way of keeping track of degrees of freedom. Anyway, I then did the same thing using their filtering and still got awful results. The adjusted r-squared was -0.002; the t-statistic was only 0.85 and the DW an unbelievable 0.06. The price of smoothing is that you lose degrees of freedom: the chances of reportable statistical significance vanish quickly and the results become ambiguous.
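To see why smoothing destroys degrees of freedom, here is a sketch of a 13-point gaussian filter together with a Quenouille-style effective sample size based on the lag-1 autocorrelation. The sigma choice and the effective-n formula are my assumptions for illustration, not necessarily the exact filter or adjustment anyone used.

```python
import numpy as np

def gaussian_smooth(x, length=13):
    """Smooth with a gaussian kernel of the stated length.
    (Assumption: sigma = length/6, so nearly all the kernel mass
    lies within the 13-point window.)"""
    sigma = length / 6.0
    k = np.arange(length) - (length - 1) / 2
    w = np.exp(-0.5 * (k / sigma) ** 2)
    w /= w.sum()
    return np.convolve(x, w, mode="valid")

def effective_n(series):
    """Quenouille-style effective sample size n*(1-r1)/(1+r1): smoothing
    inflates the lag-1 autocorrelation, shrinking the usable degrees of
    freedom far below the nominal count."""
    r1 = np.corrcoef(series[:-1], series[1:])[0, 1]
    n = len(series)
    return n * (1 - r1) / (1 + r1)
```

Smoothing white noise with this filter pushes the lag-1 autocorrelation above 0.9, so a nominal century of data collapses to a handful of effective observations; a DW of 0.06 is what that kind of residual autocorrelation looks like.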
This is not high or fancy statistics. This is freshman statistics, probably high school statistics. People criticize me for being sarcastic, but here I’m just trying to make a statement of fact. These are about the most elementary statistics that it is possible to conceive. And yet here we have an article prominently featured in Science that has neglected these elementary precautions. Given that Briffa is IPCC lead author for the millennial paleoclimate section, one might hypothetically wonder whether Briffa, like Mann, might take the opportunity to publicize and feature his own results. It’s fun to speculate about these things.
All of these statistics were calculated with the HadCRU data. It looks like Christy’s data would yield similar or lower results, although it’s hard to imagine worse results.