Juckes and the Esper and Jones CVMs

The Euro Hockey Team (EHT) present a variety of results using CVM methods on proxies from 5 canonical Team studies: Jones Briffa 1998; MBH99; Esper 2002; Moberg 2005 and Hegerl 2006. CVM (take the average of scaled proxies and re-scale the average to match the instrumental record) seems to me to be a less dangerous method than inverse regression. However, the supposedly "robust" results are all from very small subsets, which have been "data snooped" in advance. I’ve replicated 4 of the CVM results so far – the non-MBH ones.
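As a rough illustration of the CVM recipe described above (average the scaled proxies, then re-scale the average to match the instrumental record), here is a minimal Python sketch. It is not the EHT code. The function names, the choice to standardize each proxy over its full length, and the use of a slice for the calibration window are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def standardize(series):
    """Scale a series to zero mean and unit standard deviation."""
    m, s = mean(series), pstdev(series)
    return [(x - m) / s for x in series]


def cvm(proxies, instrumental, calib):
    """Composite-plus-variance-matching sketch (illustrative only).

    proxies      -- list of equal-length proxy series
    instrumental -- temperature series, same length as the calibration window
    calib        -- slice selecting the calibration window within each proxy
    """
    # Step 1: put every proxy on a common scale and average across proxies.
    # Standardizing over the full period is an assumption of this sketch.
    scaled = [standardize(p) for p in proxies]
    composite = [mean(values) for values in zip(*scaled)]

    # Step 2: re-scale the composite so that its mean and standard deviation
    # over the calibration window match those of the instrumental record.
    window = composite[calib]
    c_mean, c_sd = mean(window), pstdev(window)
    t_mean, t_sd = mean(instrumental), pstdev(instrumental)
    return [(x - c_mean) / c_sd * t_sd + t_mean for x in composite]
```

With annual series starting in AD 1000, an 1856 to 1980 calibration window would be `slice(856, 981)` under this indexing. Because step 2 is only a shift and a positive stretch, it does not change the shape of the composite, so everything depends on which proxies go into step 1.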

The EHT purport to test for robustness – but their idea of robustness is to test the impact of removing one proxy after they’ve already ensured that there are, say, two foxtails plus Yamal. On numerous occasions, I’ve stated that medieval-modern levels in Team studies are highly dependent on the presence/absence of a few stereotyped series: bristlecones/foxtails, the Yamal substitution, Thompson’s Dunde ice core. The EHT present 5 CVM results and these provide a nice template for showing this one more time. Today I’ll do quick robustness analyses of the CVM-versions of Esper 2002 and Jones 1998 and do the others later.

Here is their key spaghetti graph of CVM reconstructions for the 5 studies plus the Union (All-Star?) reconstruction.


Euro Team Fig. 4. Reconstruction back to AD 1000, calibrated on 1856 to 1980 northern hemisphere temperature, using CVM, for a variety of different data collections. The MBH1999 and HPS2000 NH reconstructions and the Jones et al. (1998) instrumental data are shown for comparison. Graphs have been smoothed with a 21-year running mean and centred on 1866 to 1970. The maximum of the “Union” reconstruction in the pre-industrial period (0.25 K, 1091 AD) is shown by a cyan cross, the maximum of the instrumental record (0.841 K, 1998 AD) is shown as a red cross.

Esper Version
The figure below isolates the Esper composite, which consists of the 5 Esper series available to 1000 – two foxtail series, a Tornetrask version, the Polar Urals Update (only used by Esper) and Taymir. It’s surprising that they didn’t include Mongolia in their Esper composite as it theoretically goes back to 1000 and has strong 20th century growth. There is something weird going on with Esper’s Mongolia version. This is the only one (out of 14 chronologies) that Esper didn’t produce when requested by Science. Hegerl et al, who rely on Esper versions, mention in passing that Mongolia was unavailable and performed a re-collation of 9 sites – Sol Dav/Tarvagatny Pass (mong003) will be one of them, but the others are unknown at present. It looks like Esper might have misplaced his Mongolia chronology as used. If not, the fact that it continues to be Missing In Action is really quite remarkable.

First, I’d like to demonstrate that I’ve replicated their result sufficiently to permit a sensitivity analysis. In the figure below, the archived EHT composite is plotted in black, with my emulation in red overlying it. The difference is plotted in blue. Because the emulation is very close, you can barely see the black. The correlation between the two versions is good to more than three 9’s. An obvious observation when this series is disentangled from the spaghetti is that, by itself, I don’t think that anyone would call it a "Hockey Stick". 20th century values are somewhat elevated, but not in an exceptional range – and this is with 2 foxtail series.


Figure 2. Replication of EHT Esper composite. Black – Archived by EHT; red – emulation; blue – difference.

The obvious question is what happens if you don’t use the foxtails – after all, these results are supposed to be fantastically "robust". The figure below shows the EHT Esper composite on the left and an Esper variation on the right without the two foxtail series. As you see, without the foxtails, 11th century values are higher than modern values and, indeed, the series ends up at about its long-term average. So the amount of 20th century elevation is very small even in the EHT Esper version and even this limited 20th century "warmth" vanishes without the foxtails. The 1856-1980 correlation of the EHT Esper composite to their instrumental temperature record is 0.59 – higher than the level of 0.43 for the non-foxtail composite. However both results would be "significant" under the statistical methods presented by the EHT.


Figure 2. Left – EHT Esper composite (same as previous figure); Right – variation without foxtails.
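The with-and-without comparison above can be sketched in a few lines of Python. This is only an illustration of the kind of sensitivity test described, not my R script or the EHT code. The function names and the dictionary layout are hypothetical. Since the CVM re-scaling does not affect a correlation coefficient, the sketch works directly with the average of standardized proxies.

```python
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation of two equal-length series."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def composite(proxies):
    """Average of standardized proxies (the composite before re-scaling)."""
    scaled = []
    for p in proxies:
        m, s = mean(p), pstdev(p)
        scaled.append([(x - m) / s for x in p])
    return [mean(values) for values in zip(*scaled)]


def sensitivity(proxies, excluded, instrumental, calib):
    """Composite and calibration correlation, with and without some series.

    proxies  -- dict mapping a series name to its values
    excluded -- names to leave out of the reduced composite
    calib    -- slice selecting the calibration window
    """
    cases = {
        "all": list(proxies),
        "reduced": [name for name in proxies if name not in excluded],
    }
    results = {}
    for label, names in cases.items():
        comp = composite([proxies[name] for name in names])
        results[label] = (comp, pearson(comp[calib], instrumental))
    return results
```

A robustness test in this spirit removes the stereotyped series as a group (for example, both foxtail series at once) rather than one proxy at a time, and then compares both the calibration correlation and the medieval-modern relationship of the two composites.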

Jones et al 1998 Version

Jones et al 1998 is shown in red in the EHT spaghetti graph (their Figure 4 excerpted above.) The EHT Jones composite uses 6 series – 3 NH and 3 SH: Polar Urals (the Briffa 1995 version); Tornetrask (the Briffa 1992 version – which has an amusing "adjustment" discussed last year – see the Jones et al category); West Greenland ice core (a slightly different version than MBH); Cook’s Tasmania tree ring chronology and the Lara-Villalba tree ring chronology from Rio Alerce, Argentina.

Once again, I was able to make a nearly exact replication of the EHT Jones composite as shown below (using the same color scheme as above). The appearance of this composite is obviously significantly different than the Esper composite – with more high-frequency. Again one would not be inclined to label this particular series as a Hockey Stick.

EHT Jones composite. Black – archived version; red – emulation; blue – difference.

In this case, I’ve done a routine sensitivity analysis in which I’ve replaced the West Greenland stack used in Jones et al 1998 with the slightly later version used in MBH99 (I’ve verified that the Jones version is an earlier version with the originator of the data.) In the Euro All-Star Team, in their unaccountable policy of using older data, they use the Jones version. There’s not much difference between the two series, but I’ve used the later series. I’ve also used the updated Polar Urals version (from Esper) instead of the Briffa Polar Urals series (there are other problems with this series in the 11th century which I wrote about last year. I’m convinced that some of the tree cores in the Briffa 1995 study have been misdated.) Anyway, merely by using the updated Polar Urals version, the relative medieval-modern levels are changed. In this case, the 1856-1980 correlations for the variation (0.37) are a very slight improvement on the correlations for the Juckes version (0.36) – so any "significance" attributed to the Juckes version is shared equally by the version with a higher MWP.


EHT Jones composite – Left – As archived; Right – variant in which Polar Urals update substituted for 1995 Briffa version.

Team Euro Code

Team Euro’s CPD submission (online here) has done a commendable job of archiving data in their supplementary information (online here). It’s by no means perfect, but it’s a big improvement in practices in the field. So while they’ve chosen to unfairly disparage my own efforts in respect to source code, perhaps one can construe the marked improvement in their own practices as a type of compliment.

They state that they will provide source code for the calculations after the review period as follows:

The software (in the `python’ language) used to generate the reconstructions presented in the manuscript will be made available on http://mitrie.badc.rl.ac.uk when the manuscript is published.

Now this is obviously an improvement, but, to the extent that they are trying to adhere to improved practices in the paleoclimate field – something that I obviously endorse – it would be much better if they archived their code at the time of submission, as required in econometrics journals.

Notwithstanding the above disclaimer, some of their Python functions are archived at the MITRIE website here, although there does not appear to be any unifying script to date. To give a flavor of Juckes’ programming style, I’ve attached his code for the do_regress.py function below. I suspect that their job would have been more cost-effective if they had used R, but people use what they’re used to and I’ve got no complaint about that.

At my present rate, I’ll have much of the article replicated in R in a day or two.

Keenan’s Comment on Chuine

Douglas Keenan has published an excellent comment on his attempt to replicate the prominent study by Chuine et al [Nature 2004] of harvest dates in Burgundy. The article being criticized is all too typical of a climate article in Nature – lurid headlines, lamentable statistics, ineffective peer review and obstructive authors.

It’s really nice to see someone else wade into this sort of study.

Keenan’s forthcoming article is here and two insightful additional comments are here and here.

You might want to take the opportunity to browse his website while you’re at it.

UPDATE: Keenan has added an Addendum to the page describing more technical issues with Chuine et al. This tells how the model performs no better than linear regression, and thanks Willis for pointing that out.

Re-Fried Greenland Ice Cores

This is not going to be an especially interesting post for most (probably any) of you, but I want to document a couple of points mostly as an aide-memoire for myself so I don’t forget.

I actually learned something new about the Jones et al reconstruction from the Euro Team study. I’ve been able to sort of replicate the Jones study, but not to anything like adequate precision. Jones did not archive data as used, but did publish diagrams of the data and I used digital versions cobbled together elsewhere – mostly from MBH. Of the 17 proxies in the "independent" Jones study, 14 were used in MBH. After trying diligently, I asked Jones to provide me with the data as used and code. He refused. The Euro Team has posted up digital versions of 5 Jones series, 4 of which were identical to the Mann versions. However, their Greenland version was somewhat different. It had a high correlation (0.75) but differed in detail. I’ve gone back and traced Mann’s version to Fisher’s series DELNORM6.CVG, which is his "super-stack", including sites: Crete, A, B, C, D, H, Milcent, GISP2 and 4 GRID series, which I haven’t been able to identify.

The Jones version has a high correlation to a scaled composite of Crete, A,B, C, D, H and Milcent, but I couldn’t locate a precise equivalent in a detailed data set sent to me by Fisher, the author of the series.

Hegerl also uses a "decadally smoothed" version of this series, which is archived by the Euro Team (but not by herself.) She doesn’t state what the decadal smoothing is. I experimented with various Gaussian filters, which are common in Team studies, and got close but not exact. Eventually I determined that she used an 11-year running average. This filter applied to the Mann version yielded the Hegerl version – which was cut off in 1960 (though the original goes to 1983). Here is a plot of the 3 versions in the 20th century.
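For anyone wanting to repeat this sort of filter detective work, here is a hedged Python sketch of the comparison: apply candidate filters to the unsmoothed series and see which one reproduces the archived smoothed version. The function names, the Gaussian width and the treatment of end points (left undefined here) are assumptions for illustration, not a description of what Hegerl did.

```python
import math


def running_mean(series, window=11):
    """Centred running mean; points without a full window are left as None."""
    half = window // 2
    out = [None] * len(series)
    for i in range(half, len(series) - half):
        out[i] = sum(series[i - half:i + half + 1]) / window
    return out


def gaussian_filter(series, sigma=3.0, half_width=10):
    """Centred Gaussian-weighted mean with truncated, normalized weights."""
    offsets = range(-half_width, half_width + 1)
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in offsets]
    total = sum(weights)
    out = [None] * len(series)
    for i in range(half_width, len(series) - half_width):
        out[i] = sum(w * series[i + k] for w, k in zip(weights, offsets)) / total
    return out


def max_abs_diff(a, b):
    """Largest absolute difference over points defined in both series."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    return max(abs(x - y) for x, y in pairs)
```

A candidate that is "close but not exact" leaves a small non-zero `max_abs_diff` against the archived series; the right filter brings it down to rounding error.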


Figure 1. Three versions of West Greenland dO18.

As an experiment, I plotted up the data from the 7 cores which are said to be in the Crete stack, shown below. A couple of cores have values in 1984, which was not included in the composite. However the 1984 dO18 values appear to be the lowest values in the duration of the record (which starts in 553).


Figure 2. Twentieth Century Values of 7 Ice Cores in Crete, Greenland Area

As a reminder, here is a plot of the values over the entire series. This is remarkable for the uniformity of values. Fisher’s thoughts on this were that the sites were on a high plateau and were relatively unaffected by centennial changes in temperature.


Figure 3. Three Versions of West Greenland dO18

Stern Review

The Stern Review has been published today here. I don’t intend to spend much time on it, but others may wish to comment. I noticed that he cited Hansen’s “Warmest in a Milllll-yun Years” article as authority for the claim that it was the warmest in 12,000 years. It’s frustrating when policy recommendations are based in part on sophomoric splices like Hansen’s.

Road Map #3

Opinions expressed on Climate Audit, other than those expressed by Stephen McIntyre personally, are those of the individual posters and do not necessarily reflect the opinions of Climate Audit or myself.

Goosse et al 2006 and the MWP

Kevn UK writes in: "I’ve just been to the Climate in the Past and have just spotted the following report

“The origin of the European “Medieval Warm Period”
H. Goosse, O. Arzel, J. Luterbacher, M. E. Mann, H. Renssen, N. Riedwyl, A. Timmermann, E. Xoplaki, and H. Wanner”

“Abstract. Proxy records and results of a three dimensional climate model show that European summer temperatures roughly a millennium ago were comparable to those of the last 25 years of the 20th century, supporting the existence of a summer “Medieval Warm Period” in Europe. Those two relatively mild periods were separated by a rather cold era, often referred to as the “Little Ice Age”. Our modelling results suggest that the warm summer conditions during the early second millennium compared to the climate background state of the 13th–18th century are due to a large extent to the long term cooling induced by changes in land-use in Europe. During the last 200 years, the effect of increasing greenhouse gas concentrations, which was partly levelled off by that of sulphate aerosols, has dominated the climate history over Europe in summer. This induces a clear warming during the last 200 years, allowing summer temperature during the last 25 years to reach back the values simulated for the early second millennium. Volcanic and solar forcing plays a weaker role in this comparison between the last 25 years of the 20th century and the early second millennium. Our hypothesis appears consistent with proxy records but modelling results have to be weighted against the existing uncertainties in the external forcing factors, in particular related to land-use changes, and against the uncertainty of the regional climate sensitivity. Evidence for winter is more equivocal than for summer. The forced response in the model displays a clear temperature maximum at the end of the 20th century. However, the uncertainties are too large to state that this period is the warmest of the past millennium in Europe during winter.”

The whole report in PDF format is given here

Potential Academic Misconduct by the Euro Team

I raise here a troubling issue pertaining to potential academic misconduct by the Euro Hockey Team. They stated:

The code used by MM2005 [the GRL article] is not, at the time of writing, available, ….

First, this statement is obviously untrue. The code used in McIntyre and McKitrick 2005 is available at the SI to the article which is cited in the publication. It is at ftp://ftp.agu.org/apend/gl/2004GL021750. (The code for MM05 (EE) is at http://www.climate2003.com/scripts/MM05_EE/ee2005.backup.txt as is the code for MM03). The Wegman Report specifically notes that they examined availability of our source code and confirmed that it was functional. Previously, Huybers examined our code as did Wahl and Ammann, neither of whom required any assistance from me. Huybers annotated the code in his Supplementary Information. So there is no possible question that this particular statement is false.

Given the fact that I’ve emphasized the importance of source code to determine what authors actually do, this untrue observation about supposed unavailability of code disparages Ross and myself.

We are the only authors that they criticize on this count. They didn’t comment on the unavailability of data or code from co-authors Briffa, Esper and Moberg or on Mann’s code and data. Esper never responded to any requests for data. It practically took a federal case with Science to get any data from him and it is still incomplete. Esper failed to provide any coherent explanation of his methodology, let alone source code. Briffa has never provided source code. Briffa has even failed to identify the sites in Briffa et al 2001 (which have been used in 6 other studies.) At this point, despite all the publicity, Mann has still not archived the actual results for his AD1400 step through to 1980 and the Euro Team didn’t collect it.

Standard definitions of academic misconduct include the following definitions:

1. Fabrication is making up data or results and recording or reporting them.
2. Falsification is manipulating research materials, equipment, or processes, or changing or omitting data or results such that the research is not accurately represented in the research record.

It appears possible to me that (a) their false allegation that the source code for McIntyre and McKitrick was unavailable at the time of writing; and/or (b) their failure to report that data and code from Briffa, Esper, Moberg, etc was unavailable at the time of writing, are, either jointly or severally, fabrication and/or falsification within standard definitions of academic misconduct.

"Honest error" or "difference of opinion" are defences to a charge of academic misconduct. However, the availability of source code is not a "difference of opinion" and, given their isolation of us for criticism, it is hard for me to picture how they could put forward a defence that the error was an "honest error" or one that could have been made with even the most cursory due diligence at the SI.

Is it plausible that none of the Euro Hockey Team co-authors were aware of this "inaccurate representation of the research record"? I list them below with their affiliations:

M. N. Juckes, British Atmospheric Data Centre, Rutherford Appleton Laboratory, UK
M. R. Allen, Atmospheric, Oceanic and Planetary Physics, University of Oxford, UK
K. R. Briffa, Climatic Research Unit, University of East Anglia, UK
J. Esper, Swiss Federal Research Institute for Forest, Snow and Landscape, Bern University, Switzerland
G. C. Hegerl, Dept. Earth and Ocean Sciences, Duke University, NC, USA
A. Moberg, Department of Meteorology, Stockholm University, Sweden
T. J. Osborn, Climatic Research Unit, University of East Anglia, UK
S. L. Weber, Royal Netherlands Meteorological Institute (KNMI), De Bilt, The Netherlands
E. Zorita, GKSS Research Centre, Geesthacht, Germany

I sent the following email today to Martin Juckes, copy to the other coauthors:

Dear Dr Juckes,

In your recent submission to Climate of the Past, now available for comment on the Internet, you make the following untrue and defamatory statement:

"The code used by MM2005 [the GRL article] is not, at the time of writing, available, …. "

As you either know or should know, the code used in McIntyre and McKitrick 2005 is available at the Supplementary Information to the article at ftp://ftp.agu.org/apend/gl/2004GL021750, as is made clear in the article itself. (The code for MM05 (EE) and MM03 (EE) are at http://www.climate2003.com/scripts/MM05_EE and http://www.climate2003.com/scripts/MM03 respectively). The Wegman Report specifically noted that they verified availability of our source code at the time of their report last summer. Previously, both Huybers and Wahl and Ammann had examined the source code, neither of whom required any assistance from me. Huybers annotated the code in his Supplementary Information.

Given our emphasis on making source code available, your purpose in making this untrue statement was to disparage us, especially when you isolated us for criticism while omitting to report the actual unavailability of data and code from authors in the field, including your coauthors.

I request that you withdraw this allegation from your submission to Climate of the Past on or before November 2, 2006 or I will take such other steps as I see fit.

Yours truly,

Stephen McIntyre

The Euro Team and the SWM Network

Principal components do not necessarily have an orientation. However, when you are making principal components from networks of tree ring widths, it’s a good idea to try to think about physical interpretations. Here’s a funny example where the Euro Hockey Team has lost its way. They observe the following:

for the proxy principal components in the MBH collection the sign is arbitrary: these series have, where necessary, had the sign reversed so that they have a positive correlation with the northern hemisphere temperature record).

Now there’s an interesting way to illustrate the potential pitfalls of this assumption in the Stahle/Southwestern US-Mexico network where the Euro Hockey Team has unwisely waded in. The EHT has made duplicate calculations in most AD1400 situations of PCs without using series extended to 1980. (As I noted in a post, in MM03, we noted that many of the series had been extended to 1980, but it’s not a point that we dwelled on and I’m pretty sure that we subsequently said that this was not an issue that we were particularly fussed about.) Nonetheless, the EHT have re-done all these calculations. Maybe they were checking to see, if by reducing the number of series being used, they could enhance the “desired signal” – this is perhaps the case in the NOAMER network. Trying to enhance the “desired signal” is an established dendrochronological procedure, as stated elsewhere by coauthor Esper as follows:

this does not mean that one could not improve a chronology by reducing the number of series used if the purpose of removing samples is to enhance a desired signal. The ability to pick and choose which samples to use is an advantage unique to dendroclimatology.

In doing so, they might not have noticed that the SWM network declines to only 2 series in the AD1400 network in their 00 network option. While they don’t say that they only used 2 series in this network, I’ve been able to replicate series #11 in archived mitrie_new_proxy_pcs_1400_v01.nc using two series obtained through this culling and am confident that this is what they did. Now the PC1 of two series is the average – which has a fairly obvious interpretation. In this particular case, I’m satisfied that the two series in the Euro Team “network” are the earlywood and latewood widths of Cerro Durango and Los Angeles Sawmill, WDCP/ITRDB codes mexi023e and mexi023l (which I’ve specifically compared and matched to Mann series swmxdfew09.dat and swmxdflw09.dat).

Figure 1 below shows the plot of the Euro SWM PC1 and the two contributing series. As you see, the underlying series have greater ring widths in the 15th century than in the 20th century. The site in question is at 3170 meters. We’re told that ring widths of these high-altitude sites are supposed to be linearly correlated to temperature so the ring widths themselves are supposed to have some physical meaning. However, in establishing an orientation for the SWM PC1, the EHT have flipped them over “so that they have a positive correlation with the northern hemisphere temperature record”, disregarding the presumed physical interpretation of the ring widths. While this, in some small way, aids the project of enhancing 20th century values relative to 15th century values, the flipping of the PC1, in this particular case, seems a little, shall we say, opportunistic on the part of the Team.


Figure 1. PC1 is downloaded from mitrie_new_proxy_pcs_1400_v01.nc. The EW and LW series are Mann’s versions, which can be traced to mexi023e and mexi023l, Cerro Durango and Los Angeles Sawmill, PSME, 3170 m.
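Both points above (the PC1 of two series is their average, and the sign gets chosen afterwards by correlation with temperature) can be shown with a short Python sketch. It relies on a standard fact: for two standardized series the correlation matrix is [[1, r], [r, 1]], whose leading eigenvector is (1, 1)/sqrt(2) when r > 0 and (1, -1)/sqrt(2) when r < 0. The function names are hypothetical, and the orientation rule is only my reading of the convention the EHT describe, not their code.

```python
from statistics import mean, pstdev


def standardize(series):
    """Scale a series to zero mean and unit standard deviation."""
    m, s = mean(series), pstdev(series)
    return [(x - m) / s for x in series]


def pc1_of_two(a, b):
    """PC1 scores of two standardized series, using the 2x2 closed form.

    With r > 0 the scores are a scaled average of the two series; with
    r < 0 they are a scaled difference. The decomposition itself does not
    fix the overall sign; this function returns the unflipped combination.
    """
    za, zb = standardize(a), standardize(b)
    r = mean([x * y for x, y in zip(za, zb)])
    sign = 1.0 if r >= 0 else -1.0
    scores = [(x + sign * y) / 2 ** 0.5 for x, y in zip(za, zb)]
    return scores, r


def orient_to_target(scores, target):
    """Flip the scores if they correlate negatively with the target series."""
    ms, mt = mean(scores), mean(target)
    cov = sum((s - ms) * (t - mt) for s, t in zip(scores, target))
    return scores if cov >= 0 else [-s for s in scores]
```

If both ring width series decline over time while the target temperature series rises, `orient_to_target` returns the average turned upside down. That is the situation described above: the flipped PC1 no longer points the same way as the ring widths it was built from.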

Does it “matter”? I don’t think that it “matters” very much to the final reconstruction. But, at this point, people should be trying for a little craftsmanship.

This little fiasco also nicely points out one of the problems with using PCs at all. PCs discard information on the orientation of a series – whether it points up or down, requiring later potentially ad hoc interpretations of the results using expedients like correlation with NH temperature to determine whether the PC series is pointing up or down. It would be better not to discard the information on the orientation. If the Euro Team were actually trying to advance the state of the art, that’s what they’d be thinking about. As it is, I think that someone on the Euro Team deserves a few minutes in the penalty box, don’t you?

The Euro Hockey Team and Yamal

Readers of this blog are familiar with the Yamal substitution. Briefly, Briffa et al 1995 reported in Nature that 1032 was the coldest year of the millennium based on no more than 3 poorly dated and short cores in the 11th century.

Subsequently new cores were dated to the 11th century by Schweingruber, resulting in the opposite situation – a very warm 11th century. Instead of reporting the new information, Keith Briffa in Briffa 2000 seamlessly inserted another site – Yamal – over 100 km away – indeed this site is sometimes denoted “Polar Urals” – without reporting the “bad” results from the Polar Urals.

However, Esper didn’t get the memo and used the Polar Urals update in his 2002 reconstruction. This resulted in a rather elevated MWP in Esper, which he attempted to “solve” by using not one but two foxtail sites to lower the MWP.

By itself, using the Polar Urals Update, instead of the Yamal substitution, gives a high MWP to (say) the Briffa 2000 reconstruction. So how did the Euro Hockey Team grasp this particular nettle? Here’s what they say:

the Polar Urals data of ECS2002 [Esper], MBH1999 and the Tornetraesk data of MSH2005 [Moberg] have been omitted in favour of data from the same sites used by JBB1998 and ECS2002, respectively (i.e. taking the first used series in each case).

In what other field would people use the older data instead of the newer data? The audacity of the Team in moving the pea under the thimble is sometimes breathtaking.