A Quote from Esper et al [2003]

You’ve all seen my frustration with Jacoby and his doctrine of a "few good men". I haven’t posted on this, but one thing that puzzled me was some missing inventory numbers at Polar Urals, just before the critical trees 45, 46 and 47 (upon which the "coldest year of the millennium" depends).

Here’s a comment from Esper et al. (Cook, Krusic, Peters, Schweingruber) from Tree Ring Res. 2003, p.92, which may shed some light on this. Tree Ring Research, unlike Science, is intended for narrow circulation.

Before venturing into the subject of sample depth and chronology quality, we state from the beginning, "more is always better". However as we mentioned earlier on the subject of biological growth populations, this does not mean that one could not improve a chronology by reducing the number of series used if the purpose of removing samples is to enhance a desired signal. The ability to pick and choose which samples to use is an advantage unique to dendroclimatology. That said, it begs the question: how low can we go?

It is taking all my will power to avoid making a sarcastic comment or making a slightly out-of-context rhetorical answer to the last question.


  1. Dave Dardinger
    Posted Sep 14, 2005 at 11:57 AM | Permalink

    Actually you could have left a blank space and we all could have filled it in with no trouble. So all that left unsaid, what exactly is the logic claimed behind enhancing signals by selective reduction of the data used? I can only imagine it’s the tree-ring equivalent of someone having a whole bunch of art-prints from a master lithograph and looking at them and saying, “This one is blurry, that one is slightly skewed. This one over here, however, is nice and sharp.” Can there be such a self-verification in tree-ring samples? Where’s the theory being used?

  2. ET Sid Viscous
    Posted Sep 14, 2005 at 12:09 PM | Permalink

    But that (lithograph choosing) is an aesthetic call by an individual. Another individual might prefer the blurred lithograph, another might appreciate the skewed one. That is an artistic call.

    Scientific data is held to a different standard. Its value is not based on personal preference. 2+2=4. You might prefer 2+2=2 because it looks better, but sadly it’s not true.

  3. Steve McIntyre
    Posted Sep 14, 2005 at 12:31 PM | Permalink

    Mining promoters aren’t allowed to “pick and choose” which drill holes to report. Neither are drug companies. I hope that he meant something different than he said.

  4. TCO
    Posted Sep 14, 2005 at 12:44 PM | Permalink

    This behaviour by Esper, Jacoby (or whoever is advocating it, doing it) makes me a bit leery. Even if there is a GOOD reason to do it, one ought to keep track and explain what data was excluded and why. I guess there might be some rationale for doing this if for instance the variability was much higher or if one could somehow prove confounding factors in some samples but was at the same time not able to deconvolute with a multiple correlation (multiple regression). In any case, you ought to save the info and explain exactly what was done…just in case, later the method used is shown to be biased (allows others to debias and still get value out of the work).

  5. Chas
    Posted Sep 14, 2005 at 3:13 PM | Permalink

    The Esper article is pretty thought provoking. But taking one step back: Do the dendro drillers aim to consider only isolated trees and remains or are they quite happy to include cores from woodland sites?
    One would imagine that the long term growth behaviour of individual trees in woodland is more influenced by the fate/dominance of their nearest neighbours than any long term climatology.
    It seems odd that forest dynamics models (some of which seem to do 1000 year simulations) aren’t called in to produce, for example, some indications of the number of tree cores that might be required to reconstruct a reasonable climate history, or perhaps to test various RCS variants.

  6. Jeff Norman
    Posted Sep 14, 2005 at 7:46 PM | Permalink

    What I don’t understand is that there should be thousands of tree ring samples from thousands of locations in the Northern Hemisphere. Is this preconception of mine at all true?

    If yes, why are we seeing so few locations reported/“picked”?

  7. Steve McIntyre
    Posted Sep 14, 2005 at 10:22 PM | Permalink

    There’s the question… maybe because most of them don’t have hockey stick shapes. So the ones that do – have to do extra duty. Now if they turn back down, like TTHH, then there’s a problem.

    As I’ve mentioned before, if Hughes’ studies at Sheep Mountain in 2002 showed a big hockey stick in the 1990s, we’d have heard about it by now. As Sherlock Holmes said, the clue is the dog that didn’t bark.

  8. TCO
    Posted Sep 15, 2005 at 10:55 AM | Permalink

    Bla bla bla. If you want to take this any further than internet sniping, you’re going to need to do (or sponsor or motivate someone else to do) the experiments that show bias in selection of samples. You can get a little ways with the after the fact audits and come up with suspicious snippets. But it won’t be conclusive (or have traction in terms of the debate) unless you have alternate studies done that show different tendencies. (And if you do end up getting that type of experimental work done…and it supports the Manniacs…you owe it to science, to bring that forward and to tell the world that now the Manniacs or Jacobists or whatever are proven out with higher quality results/analysis.)

  9. Steve McIntyre
    Posted Sep 15, 2005 at 12:30 PM | Permalink

    In a sense, I’m just studying Esper et al [2002] and posting up study notes. The Hockey Team claims that these various studies MBH99, Jones et al 1998; Esper et al 2002 are all independent. Now they are obviously not “independent” in the sense that the coauthors overlap. You see Briffa, Jones, Bradley, Mann, Schweingruber… in the author rosters. They say that the proxies are independent and the methods are independent.

    I’m just working through Esper et al 2002 as I go. My view on Jones et al 1998 and Crowley and Lowery is that the “hockey stick ness” of the results is derived from bristlecones and the incorrectly cold MWP Polar Urals and Dunde series. The other series are just white/red noise and are meaningless to the results.

    In Moberg, there are a couple of new series that contribute the hockeystickness: (bizarrely) the Oman coldwater diatoms. I’m halfway through that.

    I got a bit of a foothold on Esper from being able to benchmark the Gotland series and, despite not being able to get all the data, I’m glimpsing its structure now. My hypothesis is that any hockey stickness in Esper will be driven by the foxtails, the incorrectly cold Briffa Polar Urals series and the Jacoby Sol Dav, Mongolia series.

    The Sol Dav series has a hockey stick shape; but most tree ring series in the 20th century do not have a closing uptick. Briffa complains about this. The Hockey Team naturally selects series with closing upticks and these get incorporated. Thus Sol Dav is in Esper; Mann and Jones 2003; Jones and Mann 2004.

    Where am I going with this: non-robustness. The “active ingredient” in these is a few series. Cherry-picking: hockey stick shaped series are data mined and then repetitively used.

    In Esper’s case, this is complicated because the foxtail series are not available, so I’m just guessing at them. I plan to look at each site of Esper’s. There are some interesting tidbits in each one. Not all of them will have points of interest. I like to do some systematic plots for each site to see what they look like. If I post a note up on the blog while I’m doing it, it keeps me a little more organized about it. Each one isn’t necessarily very fascinating, but I like to see what turns up.

  10. TCO
    Posted Sep 15, 2005 at 1:08 PM | Permalink

    I think you do a great job, Steve. Thanks for the expanded explanation. My earlier comment was specifically related to the “dog that didn’t bark” comment about the Sheep Mountain 1990s not being reported. To get traction on that sort of stuff, one needs to show what was ignored (not just the suspicion).

    I see the issue of independence of the multi-proxy studies as a separate (relevant) issue from what I was addressing in (8). And easier to address, with the approach of parsing others’ work vice doing new work.

  11. Steve McIntyre
    Posted Sep 15, 2005 at 2:34 PM | Permalink

    I appreciate the encouragement.

    Back to a personal question you raised a month ago, about whether I wonder what would have happened if my career had been different: sure, I wonder. A guy that was 3 grades ahead of me at my high school won a Nobel Prize in economics about 2 years ago. When he was a high school senior and I had just turned 14, I beat him in the big international math contest. The story of Galois is probably a very bad one to tell young math students, as I got discouraged at 20-21 because I didn’t have any brilliant ideas and wasn’t Galois. While I’ve had many interesting adventures in business and have enjoyed the raffish characters of micro-cap business, I’ve always been disappointed not to have had any academic accomplishments. But you get busy with kids and making a living. A few years ago, my kids were mostly grown up, business was slow and I got interested in this. If I’d followed a more traditional route, I would never have stumbled upon this interesting little niche.

    I really have to finish the set of critical papers on Moberg, Jones, Esper etc., simply because these things continue to be relied on and I’m most of the way through them. After that, I’ll try to do something more constructive. If the Hockey Team doesn’t like me now, they will like me less after these critical papers are all done. But there’s a generational thing. When I was at the AGU last year, I don’t think that any 40-45 year olds had anything but hate, while quite a few 25 year olds were intrigued and thought that what I was doing was pretty good.

  12. John Hekman
    Posted Sep 15, 2005 at 3:07 PM | Permalink

    Re #10: Cherry-picking the tree-ring series is only half of the problem. The use of PC analysis is advanced cherry-picking. I mentioned to an academic friend who teaches econometrics that MBH was based on PC analysis, and he said it was equivalent to stepwise regression. Stepwise regression has not been acceptable in econometrics since maybe the 1950s. From an econometrics textbook: “While stepwise regression can be useful in looking at data when there are a large number of possible explanatory variables [but tree ring width is only a single explanatory variable], it is of little value when one is attempting to analyze a model statistically. The reason is that ‘t’ and ‘F’ tests consider the test of a null hypothesis under the assumption that the model is correctly specified. If we have searched over a large set of variables, selecting the ones that fit well, we are likely to get significant ‘t’ tests with great frequency [tell them about it, Steve]. As a result, large ‘t’ statistics do not allow us to reject the null hypothesis at a given level of significance.”

  13. Steve McIntyre
    Posted Sep 15, 2005 at 4:12 PM | Permalink

    It’s actually even worse than this, as there are two layers of data mining – one in the PC stage and one in the regression-inversion stage, which I’ve not talked about much, but which I’ve spent a lot of time on. I’ve been criticised for calling it a “laboratory of horrors”, but you really could do an interesting course simply on the defects in MBH, as it pretty much finds every statistical defect ever discovered.

  14. TCO
    Posted Sep 15, 2005 at 7:51 PM | Permalink

    Thanks for the personal update. At least you have the family. That is a real accomplishment in life.

  15. bender
    Posted Oct 19, 2009 at 1:04 PM | Permalink

    When I recounted the Esper quote to “luminous beauty”, I was challenged in this way:

    “Show me you can think or STFU.”

    That’s fair. “luminous beauty” was “hinting” that the quote had something to do with a priori site selection, not a posteriori data-sifting. Myself, I can’t see how the obligation to pick the right sites a priori could be an “advantage” let alone one “unique” to dendroclimatology. [kim agreed, but hey, that's kim, right?] So I gave the authors the benefit of the doubt – that they’re not illogical people – and tried to figure what they might possibly mean by this puzzling remark.

    Here was my response to “luminous beauty”:

    It behooves me to mention that I don’t see as an “advantage” the *burden* of having to choose sites in particular, hard-to-find circumstances, where, moreover, no experiments have ever been done to calibrate ACTUAL responses to temperature in a controlled experiment.

    To further suggest that this burden of selecting the correct sampling method (site and species combination) is “unique” to one domain is nonsensical. The only sense in which it is unique is that you must GUESS at the species-site combination that you PRESUME will give a strong response.

    This is not an “advantage”. It is a *disadvantage*.

    So, naturally, the reader is led to contemplate: what advantage could he possibly be thinking of? The advantage of substituting “unresponsive” chronologies for “responsive” ones? Hmmm. That would be statistical suicide; yet it seems this is precisely what Briffa has done. Did Briffa incorrectly invoke LB’s interpretation of the Esper principle? Did he invoke my interpretation instead?

    I’ve had no reply. But a reply would be welcome in this thread. By anyone. Keith Briffa. Rob Wilson. Jan Esper, or any of his co-authors. Even Gandalf himself.

  16. bender
    Posted Oct 19, 2009 at 1:06 PM | Permalink

    How low? Pretty low.

  17. bender
    Posted Oct 19, 2009 at 1:08 PM | Permalink

    Can we fix the link to the paper above? I don’t want to be accused of hiding context.

  18. bender
    Posted Oct 19, 2009 at 1:12 PM | Permalink

    From the “laboratory of horrors” comes a substance so addictive …

  19. bender
    Posted Oct 19, 2009 at 1:19 PM | Permalink

    Esper et al (2003) Tests of the RCS method

  20. bender
    Posted Oct 19, 2009 at 1:39 PM | Permalink

    Chronology “stripping” can increase the statistical quality of tree ring chronologies … although Briffa claims he never does this in practice.
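
A note on the textbook passage quoted in comment 12: the selection effect it describes is easy to reproduce on pure noise. What follows is only a minimal Python sketch under stated assumptions, not anything taken from MBH, Esper et al. or any other study discussed here. The function name `selection_rates`, the sample sizes (50 observations, 50 candidate series, 1000 trials) and the approximate 5% two-sided critical value of 2.01 for 48 degrees of freedom are all illustrative choices. Both the target and the candidate series are independent white noise, so any "significant" t statistic is spurious by construction.

```python
import numpy as np

def selection_rates(n_obs=50, n_candidates=50, n_trials=1000, t_crit=2.01, seed=0):
    """Return (single_rate, best_rate): how often |t| exceeds t_crit for one
    pre-chosen noise series versus the best-fitting of n_candidates series.
    All names and default values here are hypothetical, chosen for illustration."""
    rng = np.random.default_rng(seed)

    # Target and candidate "proxies" are independent white noise: no true signal.
    y = rng.standard_normal((n_trials, n_obs))
    x = rng.standard_normal((n_trials, n_candidates, n_obs))

    # Standardize each series so the mean cross-product is the Pearson correlation.
    yz = (y - y.mean(axis=1, keepdims=True)) / y.std(axis=1, keepdims=True)
    xz = (x - x.mean(axis=2, keepdims=True)) / x.std(axis=2, keepdims=True)
    r = np.einsum("tcn,tn->tc", xz, yz) / n_obs

    # t statistic for a simple regression slope, written in terms of r.
    t = r * np.sqrt((n_obs - 2) / (1.0 - r**2))

    # Rate for a series fixed in advance (candidate 0) vs. the best of the search.
    single_rate = np.mean(np.abs(t[:, 0]) > t_crit)
    best_rate = np.mean(np.abs(t).max(axis=1) > t_crit)
    return float(single_rate), float(best_rate)

if __name__ == "__main__":
    single_rate, best_rate = selection_rates()
    print(f"pre-chosen series significant: {single_rate:.1%}")
    print(f"best of 50 series significant: {best_rate:.1%}")
```

With these settings the pre-chosen series should pass the test about 5% of the time, as the nominal level promises, while the best-of-50 series should pass in the large majority of trials (in theory about 1 - 0.95^50, or roughly 92%). That is the textbook's point: once the series has been selected for fit, a large t statistic no longer supports rejecting the null at the stated level. The sketch uses white noise only; autocorrelated (red) noise would need a separate treatment.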


