Jan and Ulf’s Nature Trick: The Hottest Summer in 2000 Years

A couple of weeks ago, the New York Times and other institutional media proclaimed that “tree rings” (sometimes “ancient tree rings”) had “shown” that 2023 was the warmest summer in 2000 years (link, link), almost 25 years to the day since they had similarly proclaimed that 1998 was the warmest year in 1000 years. The recent proclamation was based on an article in Nature (Esper et al, 2024); the proclamation in 1998 was similarly based on an article in Nature (Mann et al, 1998), together with its companion article in Geophysical Research Letters (link).

A “Confidence Interval” Trick

In addition to the similarity of the conclusions of the two articles, the structure of the money diagram in both cases is almost identical, as shown in the comparison below.

Figure 1. Left – the money diagram from Mann et al (1999), showing confidence intervals from the Mann reconstruction in light grey and the reconstruction in black. The reconstruction itself only went to 1980. The instrumental temperature (as an “anomaly”) for 1998 was shown as a point. A horizontal line was then drawn across the diagram to show that the 1998 instrumental temperature exceeded the confidence interval for all prior dates of the reconstruction – hence the “warmest year in 1000 years”. Right – the money diagram from Esper et al 2024, in which the reconstruction only went to 2010. The point estimate for 2023 exceeds the confidence intervals for all prior years.

Each diagram shows purported confidence intervals around the reconstruction in light/medium gray. Each diagram also denotes the recent instrumental estimate featured in the headline as a highlighted point. In each case, the highlighted instrumental point is more than 10 years after the end of the reconstruction. In each case, the conclusion is obtained by observing that the instrumental point is higher than any of the upper confidence limits of the corresponding reconstruction. In other words, both Esper et al (2024) and Mann et al (1998-99) used the same technique.

In the Climategate controversy, there was voluminous discussion of the term “Mike’s Nature trick”, but, to my recollection, there was little to no discussion of the term in relation to the use of confidence intervals to arrive at a “warmest year” conclusion.

In his notorious Climategate email, Phil Jones described “Mike’s Nature trick” as the splicing of proxy and instrumental temperatures that he had carried out for the World Meteorological Organization.  Mann had indeed spliced proxy and instrumental temperatures for the calculation of the smoothed reconstruction illustrated in the article, but had cut the smoothed version back to 1980 (the end of the proxy data.)  Mann vehemently denied that the splicing of proxy and instrumental data was “Mike’s Nature trick” and instead claimed that Mike’s Nature trick was nothing more than showing an estimate (reconstruction) and actual (observed temperature) on the same figure, clearly marked.  But this is such a benign and commonplace statistical practice that it cannot be reasonably described – even by the statistically and mathematically challenged Jones – as a “trick” in the mathematical sense.  A mathematical “trick” implies ingenuity or novelty, but showing estimate vs actual is trivial and commonplace.

However, the technique shown above – comparing a point estimate in a year without proxy data to the confidence envelope of the proxy reconstruction – was a novelty introduced by Mann and, in that sense, an actual candidate for the term “trick” (sensu mathematics) as opposed to the commonplace and trivial comparison of estimate and observed on the same figure.

The validity of this comparison (for Mann et al 1998-99 and Esper et al 2024) depends not just on the validity of the reconstruction, but also on the validity of the confidence intervals.
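The logic of the comparison can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `exceeds_envelope` and every number below are hypothetical and are not taken from either paper. It shows that the headline conclusion turns on the width of the envelope: the same instrumental point exceeds a narrow envelope but not a wider one.

```python
import numpy as np

def exceeds_envelope(recon, half_width, point):
    """Return True if `point` lies above every upper confidence limit."""
    recon = np.asarray(recon, dtype=float)
    half_width = np.asarray(half_width, dtype=float)
    upper = recon + half_width  # upper confidence limit for each year
    return bool(point > upper.max())

# Hypothetical values, purely for illustration (not from MBH or Esper et al).
recon = np.array([-0.3, 0.1, -0.2, 0.0])  # reconstructed anomalies
narrow = 0.2                               # narrow interval half-width
wide = 0.5                                 # wider interval half-width
point = 0.45                               # later instrumental value

print(exceeds_envelope(recon, narrow, point))  # point is above the narrow envelope
print(exceeds_envelope(recon, wide, point))    # point is inside the wide envelope
```

Nothing about the instrumental point changes between the two calls; only the assumed interval width does.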

Some Comments on Mann et al Confidence Intervals

Mann et al went to considerable lengths to obfuscate analysis of their confidence intervals and, indeed, key information related to that calculation was discovered within the last year – see link. Mann’s reconstruction was done in 11 steps, with each step having a different reconstruction. The purported confidence interval for each step was twice the standard error in the calibration period for that step. However, to this day, Mann has never archived the results of each step, and, to its shame, Nature refused a request that Mann be required to archive the results for each step. Mann had also refused to provide us with the residuals that had been used to calculate the standard errors. (However, he did provide residuals to Tim Osborn in a Climategate email and this data became available as a result of Climategate.) Further complicating analysis, while Wahl and Ammann and we could replicate one another’s results, neither group could replicate Mann’s results. In the last couple of years, more than 20 years after the original study, Swedish engineer Hampus Soderqvist, by reverse engineering, finally accomplished an exact replication of Mann’s reconstruction – it turned out that Mann’s list of proxies used in early steps was inaccurate: several listed proxies were not actually used, and several unlisted proxies were used.
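As a rough sketch of the stepwise calculation described above (not Mann’s actual code), the envelope can be built by assigning each step a half-width of twice its calibration standard error. The assumptions here are mine: the standard error is taken as the root-mean-square of the calibration residuals with no degrees-of-freedom adjustment, and the step boundaries and residual values are invented for illustration.

```python
import numpy as np

def half_width(residuals, k=2.0):
    """k times the RMS of the residuals (assumed form of the standard error)."""
    r = np.asarray(residuals, dtype=float)
    return k * np.sqrt(np.mean(r ** 2))

# Hypothetical steps: (first year, last year, calibration residuals).
steps = [
    (1400, 1449, [0.3, -0.3, 0.3, -0.3]),
    (1450, 1499, [0.2, -0.2, 0.2, -0.2]),
]

# Every year covered by a step gets that step's half-width.
envelope = {}
for start, end, resid in steps:
    hw = half_width(resid)
    for year in range(start, end + 1):
        envelope[year] = hw

print(envelope[1400], envelope[1450])
```

Without the archived step results and residuals, none of the inputs to this calculation could be checked, which is why their absence mattered.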

One of the long standing (and, unfortunately, under-appreciated) issues with the Mann et al 1998-99 reconstruction is overfitting in the calibration, due to a sort of inverse regression of instrumental temperature on large networks of proxies (a variation of Partial Least Squares). If there is overfitting in the calibration period, the standard error in the calibration period will be artificially small. To get a more realistic estimate of confidence intervals, one needs to use the standard error in the verification period. There is a direct relationship between the calibration period standard error and calibration period r2: a high r2 statistic necessarily means a small calibration standard error and vice versa.
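The relationship between calibration standard error and calibration r2 can be checked numerically. The sketch below assumes an ordinary least squares fit with an intercept (a stand-in, not the MBH method) on synthetic data; in that setting the RMS residual equals the standard deviation of the target times sqrt(1 - r2).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 80
proxy = rng.normal(size=n)                          # synthetic proxy series
temp = 0.5 * proxy + rng.normal(scale=0.3, size=n)  # synthetic "temperature"

# Ordinary least squares with an intercept.
X = np.column_stack([np.ones(n), proxy])
beta, *_ = np.linalg.lstsq(X, temp, rcond=None)
fitted = X @ beta
resid = temp - fitted

r2 = np.corrcoef(fitted, temp)[0, 1] ** 2   # calibration r2
se = np.sqrt(np.mean(resid ** 2))           # calibration RMS residual
implied = np.std(temp) * np.sqrt(1.0 - r2)  # value implied by r2 alone

print(np.isclose(se, implied))
```

Since the two quantities are tied together in this way, reporting one without the other conceals nothing in the calibration period; the information that matters is in the verification period.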

If one is concerned about calibration period overfitting, the verification period r2 statistic is a simple and effective test: if there is a high calibration r2 and dismal verification r2, then there has almost certainly been statistical overfitting and a flawed model.  In addition to withholding the results of the individual steps, Mann et al 1998-99 concealed extremely low verification r2 statistics.
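A small simulation illustrates the test. It is a sketch under loud assumptions: the “proxies” are pure random noise with no relationship to the “temperature” series, the fit is plain least squares rather than the MBH method, and the sample sizes are arbitrary. With nearly as many predictors as calibration years, the calibration r2 is high even though there is no signal, while the verification r2 collapses.

```python
import numpy as np

def r2(pred, obs):
    """Squared correlation between predicted and observed values."""
    return float(np.corrcoef(pred, obs)[0, 1] ** 2)

rng = np.random.default_rng(42)
n_cal, n_ver, n_proxies = 40, 40, 35
n = n_cal + n_ver

proxies = rng.normal(size=(n, n_proxies))  # noise "proxies"
temp = rng.normal(size=n)                  # unrelated to the proxies by construction

X = np.column_stack([np.ones(n), proxies])
X_cal, X_ver = X[:n_cal], X[n_cal:]
t_cal, t_ver = temp[:n_cal], temp[n_cal:]

# Fit on the calibration period only, then predict the verification period.
beta, *_ = np.linalg.lstsq(X_cal, t_cal, rcond=None)

r2_cal = r2(X_cal @ beta, t_cal)
r2_ver = r2(X_ver @ beta, t_ver)
print(f"calibration r2 = {r2_cal:.2f}, verification r2 = {r2_ver:.2f}")
```

A high calibration r2 on its own therefore says very little; the pairing with the verification r2 is what exposes the overfit.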

In the recent Mann-Steyn libel trial, the DC judge blocked both McKitrick and myself from presenting any technical evidence on Mann’s failed verification statistics or Mike’s Nature trick, even refusing to allow the defense to show a table of failed verification statistics that had been published in Geophysical Research Letters.

These controversies have been known for some time, but they connect directly to the confidence interval trick. Had the “money diagram” been calculated with standard errors from verification period residuals, the confidence interval envelope would almost certainly have been at least double the size of the confidence envelope shown in MBH98 and MBH99.  (Late last year, Soderqvist made exact stepwise MBH results available for the first time. Re-doing the diagram with verification period standard errors would be a useful exercise.)
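The re-calculation would amount to swapping one set of residuals for another. The sketch below uses made-up residual values (they are not MBH residuals and are chosen only to show the mechanics) and assumes the same “twice the RMS residual” form of the interval as the original.

```python
import numpy as np

def rms(residuals):
    """Root-mean-square of a residual series."""
    r = np.asarray(residuals, dtype=float)
    return float(np.sqrt(np.mean(r ** 2)))

# Hypothetical residuals, for illustration only.
cal_resid = np.array([0.10, -0.12, 0.08, -0.09, 0.11])  # calibration period
ver_resid = np.array([0.25, -0.30, 0.22, -0.28, 0.27])  # verification period

half_cal = 2.0 * rms(cal_resid)  # half-width as originally constructed
half_ver = 2.0 * rms(ver_resid)  # half-width from verification residuals
ratio = half_ver / half_cal

print(f"envelope widens by a factor of {ratio:.2f}")
```

With the actual stepwise results now available, the same two lines of arithmetic could be applied step by step to the real residuals.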

Some Comments on Esper et al Confidence Intervals

Needless to say, there are many issues with both the Esper et al 2024 reconstruction and its associated “confidence” intervals. But before getting into the details, I recommend that interested readers re-examine a truly excellent 2012 article by Esper, Buentgen and coauthors, also in Nature (link), an article on which I favorably commented many years ago (link). Esper et al (2012) observed that tree ring chronologies (black below) dismally failed to record huge millennial-scale change in high-latitude Northern Hemisphere summer insolation (and temperature), showing the following diagram in their SI.

Figure S1. Temperature trends recorded over the past 4000-7000 years in high latitude proxy and CGCM data. Multi-millennial TRW records from Sweden [1], Finland [2], and Russia [3] (all in grey) together with reconstructions of the glacier equilibrium line in Norway [4,5] (blue), northern treeline in Russia [3,6] (green), and JJA temperatures in the 60-70°N European/Siberian sector from orbitally forced ECHO-G [7,8] (red) and ECHAM5/MPIOM [9] (orange) CGCM runs [10]. All records, except for the treeline data (in km), were normalized relative to the AD 1500-2000 period. Resolution of model and TRW data was reduced (to ~30 years) to match the glacier data.

Esper et al (2012) observed that summer insolation at 50N decreased by more than 35 W m⁻² (!!!) since the early Holocene and 6 W m⁻² since Roman times – vastly more than the forcing of ~1.5 W m⁻² associated with increased CO2 since pre-industrial times. They identified changes in the northern treeline at Yamal (green) and changes in the equilibrium altitude of a small ice cap in Norway (blue) as two proxies that were responsive to large-scale millennial-scale changes in insolation and high-latitude summer temperatures.

Conclusion

Whether or not the comparison of an observed temperature point to the confidence envelope of a reconstruction to draw conclusions about “warmest year in 1000 years” was precisely what either Mann or Jones defined as “Mike’s Nature trick”, it can be fairly described as a trick (sensu mathematics), whereas plotting an estimate and observed on the same figure is so commonplace and trivial that it cannot reasonably be described as a trick (sensu mathematics).

In that spirit, I think that it is fair to describe “Mike’s Nature trick” (and the similar trick employed by Esper et al 2024) as a confidence trick.  In the mathematical sense, of course.

As a caveat, readers should note that the question of whether tree rings (ancient or otherwise) show that 2023 (1998) was the warmest summer (year) in 1000 or 2000 years is a different question than whether 2023 was the warmest summer in 1000 years. My elevator take is:

  1. that 20th and 21st century warming are both very real, but that the 19th century was probably the coldest century since the Last Glacial Maximum and that the warming since the 19th century has been highly beneficial for our societies – a view that was postulated in the 1930s by Guy Callendar, one of the canonical climate heroes;
  2. per Esper et al 2012, given the failure of tree ring chronologies to reflect major millennial-scale changes in summer insolation and temperature, what possible reliance can be attached to pseudo-confidence intervals attached to 2000-year tree ring chronologies in Esper et al 2024 (or any other tree ring chronologies)?
  3. in addition, we know that there is global-scale “greening” of the planet over the past 30-40 years that has been convincingly attributed to enhanced growth due to fertilization by higher CO2 levels. So, in addition to all other issues related to tree ring chronologies, it is necessary to disaggregate the contribution of CO2 fertilization from the contribution of increased warming – an effort not made by Esper et al 2024 (or its references.)

In a follow-up article, I will examine details of the Esper et al 2024 reconstruction, which, among other interesting features, connect back to Graybill bristlecone sites and the Briffa sites under discussion in the period leading up to the Climategate emails.

16 Comments

  1. joethenonclimatescientist
    Posted May 24, 2024 at 3:17 PM

    As SM notes, the judge barred SM & RM from presenting any technical evidence of Mann’s statistical errors. I don’t recall any testimony from Wyner as to the statistical errors by Mann, other than a broad discussion of how using better methods would have yielded much lower confidence ranges, i.e. I don’t recall any detailed mention of any of the specific problems.

    My question for SM or RM – Was Wyner also barred from testifying on the specific errors?

    In general SM was barred as an expert, which has a legal basis under Daubert (though I have to agree that SM certainly qualified as an expert under Daubert for the issue at hand).

    My second question: even though you did qualify as a “fact witness”, what else did the judge bar you from testifying about that would have been of a fact nature and not of an expert opinion?

    My last comment – the judge and the defense completely botched the Harte Hanks standard in the case, though that jury was going to find Steyn liable no matter how solid the facts & defense in his favor.

    Thanks for any insight to my two questions

    • Stephen McIntyre
      Posted May 24, 2024 at 8:33 PM

      Wyner wasn’t barred from testifying on the issues that were relevant to misconduct and even fraud. However, he failed to address those issues in his “expert” report. Making matters worse, he had an incorrect understanding of Mann’s methodology. The defense lawyers should have set out terms of reference for Wyner’s report that listed the key issues – all familiar to Climate Audit readers – and sent his report to me or McKitrick for review. Instead they seem to have left it up to Wyner, who didn’t know anything about bristlecones, Gaspe etc etc. Wyner’s report was academic and unhelpful. Wyner himself was engaging and charming, but his testimony was a waste of time. Mostly the fault of the defense lawyers who didn’t understand this aspect of the case.

      In fairness to Baker Hostetler lawyers (who are competent), their strategy was entirely directed towards showing that there was no damage. And the result for their client was very acceptable to the insurance company: $1 damages, $1000 punitive.

      Steyn was far too sick and weak to offer a proper defense. A healthy Steyn could have defended himself, but he needed to be briefed. This never happened.

      • joethenonclimatescientist
        Posted May 25, 2024 at 3:08 PM

        Thanks for the insight –
        Though I doubt any of the jurors would have been inclined or even influenced by a better expert report. (I doubt any of them read the expert report).

        I was very disappointed by the defense doing such an extremely poor job on the Harte Hanks standard – i.e. did the defendants have reason to believe their statements were true. The jury instructions were very weak (pathetically wrong) on Harte Hanks – almost to the point that I would have had to find for the plaintiff.

      • jddohio
        Posted May 31, 2024 at 11:20 AM

        “In fairness to Baker Hostetler lawyers (who are competent), their strategy was entirely directed towards showing that there was no damage. And the result for their client was very acceptable to the insurance company: $1 damages, $1000 punitive.” I think they did a very fine job fighting upstream against a bad jury pool. Unfortunately, the scientific issues would have gone way over the jury’s head unless they were presented in a very basic simple way — similar to Feynman explaining the Challenger crash. To be really effective, I would guess the defense lawyers would have had to spend 3 weeks learning the intricacies of what Mann did incorrectly and then finding a way to present it in basic terms to the jury. Might have cost the defendant an extra $1,000,000 in attorney fees.

        I would also add that many climate realists seem to have an obsession with explaining everything in a completely accurate manner (making understanding very complex) instead of explaining what is going on in basic terms at the beginning subject to the subtleties being explained accurately in longer written portion of article. One of the worst examples of this is Roy Spencer’s website which explains global temperatures only in the framework of anomalies instead of having the actual month to month and year to year temperatures listed in addition to the anomalies. In so doing, he reduces his audience by about 95% because most people don’t understand anomalies and simply would like to take a quick look at whether temperatures went up or down for the month or year. I commented on this once and of course, nothing was done. I don’t bother to go to the website because he makes me work to find what should be very simple information.

        • Stephen McIntyre
          Posted Jun 3, 2024 at 1:43 PM

          The defense spent a day with their “expert” who, unfortunately, didn’t understand Mann’s methodology and who didn’t address any of the fraud/misconduct issues that had been identified at Climate Audit over the years. Maybe it wouldn’t have mattered given the jury pool, but his testimony and report were a waste of time. I hadn’t even been sent a copy of the “expert” report until a couple of weeks before trial. But by then, it didn’t matter.

          Mann’s concealment of adverse verification results is something that ordinary Climate Audit readers understood. It could have been explained to a DC jury. But defense lawyers failed to ask either Mann or Bradley about verification results in cross-examination. The Baker Hostetler lawyer who did the cross-examination had never talked to me prior to trial (except a brief hello) and didn’t ask some obvious questions. The judge then blocked both McKitrick and me from answering questions about verification statistics. Whether this was due to judicial error or a defect in filing by defense lawyers isn’t clear to me, but, either way, Steyn got sandbagged. But he was so sick and weak that he was unable to carry out a proper defense.

          Whether Spencer’s website is unclear on the issues that you mention is a different issue than Mann’s concealment of verification statistics.

          Also, the defense didn’t even challenge Mann on the deletion of the adverse portion of Briffa series in IPCC diagram. The diagram wasn’t even an exhibit.

        • joethenonclimatescientist
          Posted Jun 3, 2024 at 6:48 PM

          SM comments on the defense attorneys failing to ask Mann about the verification results.
          My recollection is that Mann’s attorney asked Mann about the residuals (verification stats?) and his response was that they were meaningless numbers or that the residuals were throwaway numbers – or something to that effect. Very short one-sentence question and very short response.

  2. MikeN
    Posted May 24, 2024 at 6:52 PM

    Does Phil Jones’s WMO chart have separate temperature and tree-ring lines? If the answer is no, then Mann’s explanation is proven false.

    • Stephen McIntyre
      Posted May 24, 2024 at 8:36 PM

      Mann used multiple “tricks” in his article. Jones’ understanding of Mann’s techniques was essentially non-existent. So one can’t read too much into Jones’ characterizations.

      My point in respect to Mann’s “explanation” is as I said in article: showing an estimate together with the actual is NOT some sort of trick (in the mathematical sense). It’s trivial.

  3. Curious George
    Posted May 24, 2024 at 8:59 PM

    “‘tree rings’ (sometimes ‘ancient tree rings’) had ‘shown’ that 2023 was the warmest summer in 2000 years.”
    Where can I find 2023 tree ring data?

  4. Jeff Alberts
    Posted May 24, 2024 at 10:21 PM

    I still object to the use of single lines representing global temperature, anomaly, or whatever.

    How many people live in the “average global climate”? It’s nonsense. It gives, to the layperson, the incorrect view that temperatures rise and fall all over the globe at the same time.

  5. David Brewer
    Posted May 25, 2024 at 2:12 AM

    How did they justify no confidence intervals around the 1998 and 2023 point estimates?

  6. A
    Posted May 25, 2024 at 6:13 PM

    I was waiting for this. Bravo!

    You really should write a book.

    • Andrew Russell
      Posted May 26, 2024 at 5:14 PM

      A book by Steve on these issues would be a master class in statistics!

  7. joethenonclimatescientist
    Posted May 25, 2024 at 7:46 PM

    One more question – any insight on the confidence intervals or the verification stats for the PAGES 2k or other paleo reconstructions? Thanks

  8. ccscientist
    Posted May 26, 2024 at 2:48 PM

    In the following paper, I showed using several types of data (not tree rings per se) that forests in the eastern US have shown 30% increased growth over the past 100 years:
    Loehle, C. 2020. Forest Growth Trends in the Eastern United States. Forestry Chronicle 96:121-129.
    Hardly treemometers. Probably mostly CO2 fertilization effects.
