Verifying RCS Methodology

One reader asked whether my RCS results held up using “standard” software. There is no “standard” software for RCS. It is different than ARSTAN. Further, despite the use of Briffa’s RCS chronologies in many multiproxy studies, until the present data sets were archived, to my knowledge, there were no public examples where both a measurement data set and an RCS chronology had been archived. The following post is technical.

Having said that, as I’ve observed before, the method as described is actually simpler than “conventional” standardization. On its face, one negative exponential curve for aging is fitted to the entire population; the observed ring widths are divided by the corresponding smooth, and the chronology value for a year is the average of the resulting ratios in that year. This can be implemented in R with the nls function in a few lines of code. (My RCS.chronology function on file provides some variations developed in experimentation that I probably won’t carry in my code on an ongoing basis.) The formula is:

rw= A+B*exp(-C*age)
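
A minimal sketch of the recipe just described, in Python rather than R; it is not the RCS.chronology function mentioned above. The names `fit_negexp` and `rcs_chronology` are hypothetical, the input layout (core id mapped to first year and ring widths) is an assumption, and age is counted from the first measured ring, i.e. pith offset is taken as zero. Instead of nls, it uses a grid search over C with ordinary least squares for A and B at each trial value, which is a stand-in technique chosen to avoid dependencies.

```python
import math
from collections import defaultdict

def fit_negexp(ages, widths, c_grid=None):
    """Fit rw = A + B*exp(-C*age) to pooled data.

    For a fixed C the model is linear in A and B, so each trial C gets
    a closed-form least-squares fit; the C with the smallest SSE wins.
    """
    if c_grid is None:
        c_grid = [k / 1000.0 for k in range(1, 201)]  # trial C: 0.001 .. 0.200
    n = len(ages)
    my = sum(widths) / n
    best = None
    for c in c_grid:
        x = [math.exp(-c * a) for a in ages]
        mx = sum(x) / n
        sxx = sum((xi - mx) ** 2 for xi in x)
        if sxx == 0:
            continue  # all ages identical: slope undefined
        b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, widths)) / sxx
        a = my - b * mx
        sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, widths))
        if best is None or sse < best[0]:
            best = (sse, a, b, c)
    return best[1], best[2], best[3]

def rcs_chronology(cores):
    """cores: dict core_id -> (first_year, [ring widths, innermost first]).

    Returns ({year: index}, (A, B, C)). One curve is fitted to all cores;
    each ring is divided by the curve at its age; ratios are averaged by year.
    Assumes the fitted curve stays positive over the observed ages.
    """
    ages, widths = [], []
    for first_year, rw in cores.values():
        for age, w in enumerate(rw, start=1):
            ages.append(age)
            widths.append(w)
    a, b, c = fit_negexp(ages, widths)
    by_year = defaultdict(list)
    for first_year, rw in cores.values():
        for age, w in enumerate(rw, start=1):
            by_year[first_year + age - 1].append(w / (a + b * math.exp(-c * age)))
    chron = {yr: sum(v) / len(v) for yr, v in sorted(by_year.items())}
    return chron, (a, b, c)
```

If every core followed the aging curve exactly, the resulting chronology would be flat at 1; departures from 1 are what the method attributes to the year.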

I told the reader that I’d gotten very close results, but the sudden availability of a data set for benchmarking is an opportunity that I can hardly pass by. So I’ve benchmarked my algorithm against the three Briffa data sets, with some interesting results.

TornFin
The first graphic compares the archived Briffa TornFin chronology with my RCS emulation calculated from the archived TornFin data set. The top panel shows the archived chronology; the second panel my emulation in R; the third panel the difference. After the first couple of hundred years, the calculations are indistinguishable: the differences are negligible, less than 0.01 and often 0.001. This shows convincingly that, in this network, one curve is fitted to the entire network. Otherwise, differences would be greater. Differences arise in the first part of the series: we know that the underlying data set covers a longer period. It seems pretty clear to me that the chronology in the first few hundred years uses cores that are not included in the measurement archive, which, to that extent, is incomplete (but is complete for this calculation after the teething period). By 0 AD, the start of the Briffa analysis period, the two versions are in synch.
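
The third-panel comparison can be summarized numerically along the following lines. This is only a sketch: `compare_chronologies` is a hypothetical helper, both chronologies are assumed to be held as year-to-index dictionaries, and the values in the usage example are made up for illustration rather than taken from the archive.

```python
def compare_chronologies(archived, emulated, start_year=None):
    """Return (max_abs_diff, mean_abs_diff, n_years) over the years
    present in both chronologies, optionally from start_year onward."""
    years = sorted(set(archived) & set(emulated))
    if start_year is not None:
        years = [y for y in years if y >= start_year]
    if not years:
        raise ValueError("no overlapping years to compare")
    diffs = [abs(archived[y] - emulated[y]) for y in years]
    return max(diffs), sum(diffs) / len(diffs), len(years)

# Made-up values: large early differences, small ones from year 0 on.
archived = {-2: 1.30, -1: 1.10, 0: 1.000, 1: 0.950, 2: 1.050}
emulated = {-2: 1.00, -1: 1.00, 0: 1.002, 1: 0.951, 2: 1.049}
whole = compare_chronologies(archived, emulated)
late = compare_chronologies(archived, emulated, start_year=0)
```

Restricting the comparison with `start_year` separates the early teething period from the analysis period, which is the distinction the text relies on.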


Figure 1. TornFin Emulation.

Avam-Taimyr
The next figure shows the same thing for Avam-Taimyr. Here the emulation is very close, but not exact. There’s a little low-frequency variability that doesn’t occur in the Tornetrask case. Given the virtually exact replication at Tornetrask, this one should replicate exactly as well. Maybe there are a few cores missing here and there. Dunno. The discrepancy at the start of the series is less than at Tornetrask, suggesting that there is less contribution from early cores not included in the measurement archive.


Figure 2. As above for Avam-Taimyr.

Yamal
Next, here is the same graphic for Yamal. As in both previous graphics, the emulation is very good throughout most of the series, though not as exact as for TornFin. Like TornFin, there is a discrepancy at the beginning, presumably due to missing cores used in the early portion of the chronology; this discrepancy doesn’t have a noticeable effect after 0 AD, the start of the period under study. However, there is also a discrepancy at the end of the period, with my emulation actually being a little more HS-ish than the archived version.

This looks to me like there is a discrepancy of 1-3 cores at the end between the archived measurement data and the measurement data set used in the calculation of the archived CRU chronology. Does this matter? Yes and no. It’s something that one has to watch out for in sensitivity studies.

34 Comments

  1. Fred Harwood
    Posted Sep 29, 2009 at 4:26 PM | Permalink

    Nice addition, Steve.

  2. bender_team_member
    Posted Sep 29, 2009 at 5:41 PM | Permalink

    So what you’re saying is that your method doesn’t really work that well? Figured as much.
    🙂

  3. giano
    Posted Sep 29, 2009 at 5:43 PM | Permalink

    Many thanks Steve for taking the time to perform these analyses. This is not exactly what I suggested but it’s close, i.e. checking how different your results were vs. the results obtained using some of the “standard” (“official”, or whatever you want to call them) standardization routines usually employed in dendro studies. Looking at the graphs, the differences are really small and I would say you are doing a pretty good job of emulating the RCS chronologies. However let me clarify one point. The RCS is just one standardization method among a range of options for detrending tree-ring series available in the software ARSTAN. RCS is a method of detrending tree-ring series, and ARSTAN is a program used to standardize series and create mean tree-ring chronologies.

    • Steve McIntyre
      Posted Sep 29, 2009 at 9:35 PM | Permalink

      Re: giano (#3),

      I’m familiar with ARSTAN and have emulated it as well. ARSTAN fits each tree, defaulting to a linear fit. In a population of relatively short-lived trees, an ARSTAN chronology won’t recover centennial variability. Thus ARSTAN chronologies often look like low-order red noise.

      I think that I have a post somewhere showing my tree ring methods, which place ARSTAN into a mixed effects context. It’s quite a pretty post and a much more sensible derivation than the arbitrary dendro recipe.

      • giano
        Posted Sep 30, 2009 at 1:35 AM | Permalink

        Re: Steve McIntyre (#8),
        I know Steve you have emulated most analyses presented so far in dendro studies. What I’m just saying is that ARSTAN is actually a software program, not a standardization technique (OK, ARSTAN is also a type of tree-ring chronology, together with other types called the STANDARD and the RESIDUAL chronologies that you usually get when you run those “official” programs like ARSTAN I was referring to before).

        Using your own phrase here, RCS and ARSTAN are not “apples and apples”. The program ARSTAN has many standardization options (RCS one of them). Most options are for fitting different curve types (negative exponential, linear, different splines, etc) to each sample and then averaging all the series to create a mean chronology. Depending on the stiffness of those curves, you can keep or lose the “low frequency” variability present in the original series. RCS is different in that all series are aligned by age first and then a single curve is fit and used to detrend all series. RCS requires a pith offset estimation for each series (see papers by J Esper) and you can also play with the stiffness of the curve fit to all series. That might be creating the differences you are seeing in the diagrams above (but I doubt it).
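
The age-alignment step described in this comment can be sketched as follows. This is an illustration under stated assumptions, not any program's actual routine: `regional_curve` is a hypothetical name, cores are assumed to be lists of ring widths with the innermost measured ring first, and a pith offset is taken to mean the estimated number of rings missing before that first ring (zero where none is supplied). The by-age means it returns are the raw curve that a negative exponential or spline would then be fitted to.

```python
from collections import defaultdict

def regional_curve(cores, pith_offsets=None):
    """Mean ring width at each cambial age, pooling all cores.

    cores: dict core_id -> [ring widths, innermost measured ring first]
    pith_offsets: dict core_id -> rings assumed missing before the first
                  measured ring (defaults to 0 for every core)
    """
    pith_offsets = pith_offsets or {}
    by_age = defaultdict(list)
    for core_id, widths in cores.items():
        offset = pith_offsets.get(core_id, 0)
        for i, w in enumerate(widths, start=1):
            by_age[i + offset].append(w)  # shift ring index to cambial age
    return {age: sum(v) / len(v) for age, v in sorted(by_age.items())}

cores = {"a": [2.0, 1.5, 1.0], "b": [1.5, 1.0]}
no_offset = regional_curve(cores)
with_offset = regional_curve(cores, pith_offsets={"b": 1})
```

In this toy case, treating core "b" as starting one ring out from the pith changes the pooled curve at ages 1 and 2, which is how differing offsets between trees can alter the detrending.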

        • Alexander Harvey
          Posted Sep 30, 2009 at 8:48 AM | Permalink

          Re: giano (#11),

          RCS requires a pith offset estimation for each series (see papers by J Esper)

          In the 2003 paper at least, the importance of PO is highlighted and then downplayed to quite a degree. Anyway as far as I know, if the PO is not archived with the RW it cannot be inferred afterwards.

          That said I am a bit concerned about the exponential fit, Esper talks about splines, but anyway the emulations seem close enough for the current purpose. And in the first example I would suggest that the method is very close to the author’s indeed.

          I cannot see how without sight of the author’s RC one could do more.

          On another tack, I would have thought that where there were missing/additional cores this would show up throughout the chronology via their impact on their common RC. Perhaps in the first example they did not share a common RC.

          I should declare almost complete ignorance of this subject, please bear this in mind if you are going to jump on me.

          Alex

  4. Geoff Sherrington
    Posted Sep 29, 2009 at 7:29 PM | Permalink

    On detrending and divergence and reliability.

    Copy of email Rob Wilson to Hal Fritts and Frank Berninger in exchange starting Tue, 10 Apr 2007

    Rob Wilson wrote:

    Hi

    I think we (the dendroclimatologists at least) have to be careful
    here. With regards to the ‘divergence problem’, I actually think we
    are talking the same thing.
    Some sites show it, some sites don’t.
    Some trees show it, some trees don’t.

    It essentially all comes down to ecology and limiting factors and how
    they may both change through time.

    It seems dangerous to me to ‘pick’ only those trees that fit the
    instrumental record from a site.

    The BIG issue here is, if the trees fail to model modern warming,
    then can we trust them in the past during periods that COULD also
    have been similarly as warm (e.g. the MWP). Although we can ‘pick’ in
    the present, we cannot ‘pick’ in the past.

    My personal opinion is that it will be difficult (but maybe not
    impossible) to address this issue using current standard approaches
    (i.e. use of RW and MXD which need to be detrended in some way).

    However, long isotopic records are now being produced from tree-ring
    samples and if indeed these data sources do not need to be detrended
    (which is still under debate), then such time-series could help
    verify long term trends in more traditional chronologies from RW and
    MXD.

    There are of course other proxy types which might not be so strong at
    interannual time-scales (except documentary sources), but provide
    invaluable information at decadal and longer time-scales.

    Phil Jones, Mike Mann and Anders Moberg have tried to combine
    different proxy types. It is easy to criticise these past studies,
    but we must face the fact that solely sticking to just tree-ring data
    might not provide us with a final robust answer.

    Rob

    A final answer? I’m still wondering if they are asking the right first question.

    • bender
      Posted Sep 29, 2009 at 7:48 PM | Permalink

      Re: Geoff Sherrington (#4),

      I’m still wondering if they are asking the right first question.

      What exactly do you mean? Rob Wilson’s points are right on the money. That at least one dendro understands the limitations of the science is, at this point, a victory for all.

      • Geoff Sherrington
        Posted Sep 30, 2009 at 12:50 AM | Permalink

        Re: bender (#5),

        By “they” I did not mean Rob Wilson, whose questions are obviously right on the money, more those who must be aware of the scientific uncertainty, but publish despite it. Some are named in the last para with the statement on final robust answer.

        IMO, the first question should be along the lines of “Can we show a reproducible relationship between tree properties and a relevant temperature series, first for samples from the instrumented era, then for samples before that”.

        The BIG issue he notes, the inability to “pick” in the distant past, worries me greatly, as it does our host.

    • Skiphil
      Posted Dec 29, 2012 at 6:52 AM | Permalink

      Thinking about Rob Wilson’s comments over the years brought me back to Jim Bouldin’s excellent recent series on “Severe Analytical Problems in Dendroclimatology” — Bouldin notes that his manuscript was blown off by PNAS reviewers. Still seems exceedingly difficult to have critical perspectives discussed within the hermetic circle of Hockey Team Science.

  5. Jeff Id
    Posted Sep 29, 2009 at 8:27 PM | Permalink

    I’ve got this figured out now. It’s interesting that all trees receive the same corrections.

    • bender
      Posted Sep 29, 2009 at 8:56 PM | Permalink

      Re: Jeff Id (#6),
      Are you referring to the fitting of parameters A, B, C? That is done at the level of individual ring width chronologies. They are free to vary.

      • Jeff Id
        Posted Sep 29, 2009 at 11:59 PM | Permalink

        Re: bender (#7),

        In Steve’s Yamal replication, A,B,C are the same for each individual series. The fit happens as a group and the replication is good. I can’t claim to understand the implications of that yet but it seems that some records will suffer a bias.

  6. J. Bob
    Posted Sep 30, 2009 at 8:23 AM | Permalink

    Is there a site where one can get the Yamal data in a .txt or EXCEL format?

    • bender
      Posted Sep 30, 2009 at 8:34 AM | Permalink

      Re: J. Bob (#12),
      Read the blog. The url is given in the thread named “Fresh Data on Briffa’s Yamal #1”. Have fun.

  7. Posted Sep 30, 2009 at 8:28 AM | Permalink

    There are a lot of problems with the statistical analysis of any proxy variable and trying to relate it to measured climate change. What is it actually measuring? How accurate are the measurements? I have discussed some of these problems in my presentation. I’ve been trying to get some kind of peer review by inserting a URL in blog comments. “Open Mind” and “Real Climate” are closed to me and I have very little response from several others. I wish I could find someone who could take some time to do a critical review. Go to kidswincom.net/climate.pdf or Google “Fred H.Haynie”+climate.

    • RunngMoose
      Posted Sep 30, 2009 at 10:23 AM | Permalink

      Re: Fred H. Haynie (#13),

      Although this post was somewhat OT, as a long-time lurker here I was intrigued. I read Fred’s referenced presentation “Future of Global Climate Change”. I highly encourage others more qualified than myself to check it out. He makes a very compelling case, IMHO.

  8. Ira
    Posted Sep 30, 2009 at 9:09 AM | Permalink

    congrats, in case no one has told you this, you made the Register:

    http://www.theregister.co.uk/2009/09/29/yamal_scandal/

    but not till page 3:

    “The scandal has only come to light because of the dogged persistence of a Canadian mathematician who attempted to reproduce the results. Steve McIntyre has written dozens of letters requesting the data and methodology, and over 7,000 blog posts. Yet Yamal has remained elusive for almost a decade.”

  9. Kenneth Fritsch
    Posted Sep 30, 2009 at 10:00 AM | Permalink

    What exactly do you mean? Rob Wilson’s points are right on the money. That at least one dendro understands the limitations of the science is, at this point, a victory for all.

    Rob Wilson’s comments quoted above show that he understands at least some of the important limitations of the way reconstructions with tree proxies are performed, but I agree with Geoff S that one would like to hear a more detailed and concise methodology for an a priori selection process of trees that is reasonable and has physical connections – as I assume that is what Geoff means by asking the correct initial questions.

    I see nothing of this sort in dendro papers and not in Rob Wilson’s either – only rather vague outlines and references. I do stand ready to be informed otherwise – with detailed and published recitations.

    • bender
      Posted Sep 30, 2009 at 10:05 AM | Permalink

      Re: Kenneth Fritsch (#17),
      I’m afraid you are correct, Ken, if that is what is meant by “asking the right initial questions”. If dendros knew exactly how trees respond to climate under all circumstances, then there wouldn’t be a between-tree or between-site “divergence problem”, would there? All you are going to get as far as site selection criteria are qualitative lists of site attributes, i.e. “vague outlines”.

  10. Robinedwards
    Posted Sep 30, 2009 at 2:58 PM | Permalink

    Fred Haynie #13 has indeed written a most interesting paper/presentation. /WELL/ worth reading and studying. He throws a new (to me) light on the climate change situation, which may well be a key to further progress in persuading influential people that the HS is nothing more than a red herring in the overall scheme of things. I would really like to be able to contact him but can find no hint of an address.

  11. Don B
    Posted Sep 30, 2009 at 4:27 PM | Permalink

    Roger Pielke, Jr. has commented on Steve’s Yamal revelations.

    http://rogerpielkejr.blogspot.com/2009/09/has-steve-mcintyre-found-something.html

  12. scietific method
    Posted Sep 30, 2009 at 5:43 PM | Permalink

    Steve, now that you are armed with data used to re-construct the “sons of the hockey stick” graphs, you should write a paper to uncover the truth and submit it to a well recognized journal. I know you have tried to have papers published before with not much success but this time it’s vastly different. It will be a major test in the way so called peer-reviewed scientific journals behave. Regardless of whether any accepts your paper or not, it will mark a major event. For the sake of supporting truth in the scientific method, please seriously consider this. If none accept your paper, then the whole scientific community will have to seriously doubt whether scientific journals have any merit anymore.

  13. JT
    Posted Sep 30, 2009 at 6:41 PM | Permalink

    I second the motion for publication but I suggest that an approach be made to the Economist magazine. This work deeply undermines the whole “recent warm temperatures are unprecedented” meme that has been implanted in the mainstream media, and its implications come with massive economic consequences. It’s time to seek out journalists who understand the uses of mathematics, particularly statistics, and the larger economic and political context in which these flawed studies are being made to function.

  14. Alexander Harvey
    Posted Oct 1, 2009 at 11:39 PM | Permalink

    As I see it, the problem of extracting both the age related growth function and the date related growth function is symmetric.

    It seems that with RCS the initial estimate of the date function is unity and this is used to give an estimate of the age function which in turn yields an estimate of the date function.

    Why stop there. You now have a better estimate of the date function to use to get a better estimate of the age function.

    Try forgetting everything you know about the way trees grow (not difficult in my case) and just let the data pick both functions for you.

    This would be indicated whenever you have “sensitive” trees as the initial assumption that the date function is unity is going to be more troublesome than for non “sensitive” trees.

    I have given it a go for about the first twenty trees in the Yamal archive and the iteration is stable and converging and gives results similar but not quite the same as Steve’s results above.
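
One possible reading of the iteration described in this comment, sketched under assumptions that are not the commenter's stated method: ring width is modelled as a product g[age] * f[year], both functions are left non-parametric (plain group means rather than a fitted exponential), and f is rescaled to mean 1 on each pass to pin down the otherwise arbitrary scale. The function names are hypothetical.

```python
from collections import defaultdict

def _group_mean(pairs):
    """Average the values that share a key."""
    sums, counts = defaultdict(float), defaultdict(int)
    for key, val in pairs:
        sums[key] += val
        counts[key] += 1
    return {k: sums[k] / counts[k] for k in sums}

def iterate_age_date(obs, n_iter=50):
    """obs: list of (age, year, ring_width), all widths > 0.

    Start with the date function f = 1, estimate the age function g from
    widths / f, re-estimate f from widths / g, and repeat.
    n_iter=1 is a single RCS-style pass with a by-age mean curve.
    """
    f = {yr: 1.0 for _, yr, _ in obs}
    g = {}
    for _ in range(n_iter):
        g = _group_mean((age, w / f[yr]) for age, yr, w in obs)
        f = _group_mean((yr, w / g[age]) for age, yr, w in obs)
        m = sum(f.values()) / len(f)          # fix the scale: mean(f) = 1
        f = {yr: v / m for yr, v in f.items()}
        g = {age: v * m for age, v in g.items()}
    return g, f
```

On synthetic data built as an exact product with overlapping cores, the repeated passes reduce the residual left by the single pass; whether that carries over to real measurements is not something this sketch can show.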

    Alex

  15. Chas
    Posted Oct 3, 2009 at 2:15 PM | Permalink

    Re J.Bob (#12) There is an Excel add-in called OpenRWL that reads in decadal files (Tucson format?)
    I haven’t had much success with it, but maybe you will have more luck!
    It can be found at: http://web.utk.edu/~grissino/software.htm

  16. Steve McIntyre
    Posted Oct 5, 2009 at 12:30 AM | Permalink

    This thread is currently getting more hits than any other thread on the blog, even threads that have attracted hundreds of comments.

    Can someone who is hitting this thread say hello and comment on their interest?

  17. Alan S. Blue
    Posted Oct 5, 2009 at 12:54 AM | Permalink

    Well, I’ve peeked a couple times. I’m just wondering about the details of how RCS actually merges the data.
    .
    With PCA, you can determine an effective weight of an individual input pretty directly.
    .
    Can something similar be done with RCS? Or is it meaningless?
    .
    Reconstructing PCA was outlined in detail, and I was hoping to find a similar exposition for RCS in R.

  18. Alan S. Blue
    Posted Oct 5, 2009 at 8:23 AM | Permalink

    Thank you Jean.

  19. Posted Oct 12, 2009 at 11:22 AM | Permalink

    RE Giano #11, Alexander Harvey #15,
    On looking back at this thread, it appears that the “pith offset” mentioned by Giano on Sept. 30 is the issue I raised on Oct. 8 over on the “Yamal and the divergence Problem” thread, here:

    I’m afraid I haven’t kept up with all the comments on every Yamal thread, so please forgive me if this question has already been answered:

    How do we know how old the trees are in Briffa’s newly released Yamal data file at http://www.cru.uea.ac.uk/cru/people/melvin/PhilTrans2008/YamalADring.raw?

    Each core in the file has a start date and an end date, but do we know the core reached the oldest part of the tree? The person taking the core presumably aimed for the center, but often trees grow lopsided. Even the relatively symmetrical tree round held by Michael Mann in the photo on his webpage has off-center heartwood.

    Evidently age is the critical factor in the RCS standardization that is central to much of this discussion. For example, our friend YAD06 had an admittedly hunking 28.70 mm ring in 1993, which is astonishing for any species that is not bamboo! But skimming through the Yamal file, lots of trees had similar rings throughout the past 2000 years. In 1611, during the LIA no less, tree L15581 actually had a 41.30 mm ring!

    YAD06’s record went back to 1803, or 190 years before its big ring, while L15581’s went back to 1574, only 37 years before its big ring, so maybe there was an age difference that means we should interpret these growth spurts differently. But how do we know how old L15581 was in 1574 when its record started?

    Even if dendros can measure the increasing curvature of the rings as the bottom of the core is approached, and can extrapolate to where the true center would be, where is this estimate recorded in the Briffa file?

    Of course, to the extent that these “day in the sun” growth spurts are just due to competing neighbors being taken down by old age or tornadoes, the median age-adjusted ring size must be a more representative indicator of local climate than the mean, provided the sample size is large enough to be representative.

    So do we know the “pith offset” for the new Briffa Yamal data? Is it safe to assume that Briffa would have just set these to 0 for each tree? Any common pith offset would just change the coefficients, but leave the anomalies the same, so this only makes a difference if different trees might have different offsets.
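
The last point can be checked directly for the negative exponential form given in the post. Suppose every core has the same pith offset p (a symbol introduced here for illustration, not a quantity in the archive), so that true age is age + p:

```latex
\begin{aligned}
rw &= A + B\,e^{-C(\mathrm{age} + p)} \\
   &= A + \left(B\,e^{-Cp}\right) e^{-C\,\mathrm{age}} \\
   &= A + B'\,e^{-C\,\mathrm{age}}, \qquad B' = B\,e^{-Cp}.
\end{aligned}
```

The curve takes the same value at every ring, so the ratios, and hence the chronology, are unchanged; only B is rescaled. The argument depends on this functional form and on the offset being common to all cores.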

    If Briffa did use non-zero offsets, but didn’t include these in his file, he still hasn’t complied with TRSB’s requirement that he make his data available.

One Trackback

  1. By Yamal Emulation I « delayed.oscillator on Oct 5, 2009 at 6:38 AM

    […] from a data point-of-view. McIntyre has rolled his own Regional Curve Standardization code in R, strangely eschewing the freely available software used by dendrochronologists, so I wanted first to ensure there was no […]