McCullough and McKitrick on Due Diligence

Bruce McCullough and Ross McKitrick today published an interesting article under the auspices of the Fraser Institute entitled Check the Numbers: The Case for Due Diligence in Policy Formation.

Their abstract states:

Empirical research in academic journals is often cited as the basis for public policy decisions, in part because people think that the journals have checked the accuracy of the research. Yet such work is rarely subjected to independent checks for accuracy during the peer review process, and the data and computational methods are so seldom disclosed that post-publication verification is equally rare. This study argues that researchers and journals have allowed habits of secrecy to persist that severely inhibit independent replication. Non-disclosure of essential research materials may have deleterious scientific consequences, but our concern herein is something different: the possible negative effects on public policy formation. When a piece of academic research takes on a public role, such as becoming the basis for public policy decisions, practices that obstruct independent replication, such as refusal to disclose data, or the concealment of details about computational methods, prevent the proper functioning of the scientific process and can lead to poor public decision making. This study shows that such practices are surprisingly common, and that researchers, users of research, and the public need to consider ways to address the situation. We offer suggestions that journals, funding agencies, and policy makers can implement to improve the transparency of the publication process and enhance the replicability of the research that is published.

They canvass an interesting selection of cases from different fields (and I alert readers that I don’t have the faintest interest in debating the pros or cons of the issues in these other studies at this blog and do not wish readers to debate these issues here.) They report quantitative results from McCullough’s replication work in economics, but most readers will probably be most interested in their accounts of several high-profile studies – the Boston Fed study, the Bellesiles affair and the Hockey Stick.

The “Boston Fed” study was apparently related to policy changes on subprime mortgages. (I don’t want people to debate the rights or wrongs of the policy, only the replicability issue.) It appears that data requests were refused and, reminiscent of our dealings with Santer, Jones, etc., authors interested in testing the results finally resorted to FOI requests. (Climate scientists resent this, but there are precedents.) McCullough and McKitrick report the denouement as follows:

Day and Liebowitz (1998) filed a Freedom of Information Act request to obtain identifiers for these observations so they could re-run the analysis without them. They also noted that the Boston Fed authors (Munnell et al., 1992) did not use the applicant’s credit score as generated by the bank, but had replaced it with three alternate indicators they themselves constructed, which Day and Liebowitz found had omitted many standard indicators of creditworthiness. Day and Liebowitz showed that simply reverting to the bank’s own credit score and correcting the 26 misclassified observations caused the discrimination coefficient to drop to zero.

Harrison (1998) noted that the Boston Fed data set included many more variables than the authors had actually used. These included measures such as marital status, age, and whether the application contained information the bank was unable to verify. These variables were significant when added back in, and their inclusion caused the discrimination effects to drop to zero even without correcting the data errors noted by Day and Liebowitz.

Thus, the original Boston Fed conclusions were eventually shown to be wholly insupportable. But due to various delays these studies were not published until 1998 in Economic Inquiry, six years after the original study’s release …

The Bellesiles story is also very interesting for blog readers. Clayton Cramer, Bellesiles’ nemesis, was a software engineer – he profiles very much like a typical Climate Audit reader. Cramer eventually published in the journal Shotgun News, which, according to recent statistics, has an impact factor lower than either Science or Nature.

Despite the political importance of the topic, professional historians did not actively scrutinize Bellesiles’ thesis. Instead it was non-historians who began the process of due diligence. Stephen Halbrook, a lawyer, checked the probate records for Thomas Jefferson’s three estates (Halbrook, 2000). He found no record of any firearm, despite the fact that Jefferson is known to have been a lifelong owner of firearms, putting into question the usefulness of probate records for the purpose. Soon after, a software engineer named Clayton Cramer began checking Bellesiles’ sources. Cramer, who has a master’s degree in history, found dates changed and quotations substantively altered. However, Cramer was unable to get academic journals to publish his findings. Instead he began sending articles to magazines such as the National Review Online and Shotgun News. He compiled an extensive list of errors, numbering in the hundreds, and went so far as to scan original documents and post them on his website so historians would check the original documents against the text of Bellesiles’ book (Cramer, 2006).

Bellesiles claimed to have examined hundreds of San Francisco probate records from the 1850s. When confronted with the fact that all the San Francisco probate records had been destroyed in the 1906 earthquake, Bellesiles claimed that he obtained them from the Contra Costa County Historical Society. But the Society stated that it did not possess the requisite records. Bellesiles soon resorted to ad hominem, claiming that the amateur critics could not be trusted because they lack credentials. Referring to Clayton Cramer, Bellesiles said, “It is not my intention to give an introductory history lesson, but as a non-historian, Mr. Cramer may not appreciate that historians do not just chronicle the past, but attempt to analyze events and ideas while providing contexts for documents” (Bellesiles, 2001). Note that Bellesiles could have, at any time, ended the controversy by simply supplying his data to his critics, something he refused to do.

Ultimately Bellesiles appears to have been brought down by the black and white fact that it was impossible for him to have consulted the records, said to have been consulted, because they didn’t exist. Anyone remember the claims in Jones et al 1990 to have consulted Chinese station histories that don’t exist, and the absurd claims of the coauthors to have lost the records that had supposedly been faithfully preserved through World War II and the Cultural Revolution? But it’s climate, and Doug Keenan’s effort to pursue the matter got nowhere.

I raised one beef today with coauthor McK. The term “due diligence” is used to frame the discussion – as it usefully puts “(journal) peer review” in a more general context. However, Mc and Mc do not identify the first academic article to use this term in this context (though the article was cited in passing on another matter.) The first such usage, to my knowledge, was, of course, McIntyre and McKitrick (2005 EE), which ended as follows:

We are also struck by the extremely limited extent of due diligence involved in peer review as carried out by paleoclimate journals, as compared with the level of due diligence involved in auditing financial statements or carrying out a feasibility study in mineral development. For example, “peer review” in even the most eminent paleoclimate publications, as presently practiced, does not typically involve any examination of data, replication of calculations or ensuring that data and computational procedures are archived. We are not suggesting peer reviewers should be auditors. Referees are not compensated for their efforts and journals would not be able to get unpaid peer reviewers to carry out thorough audits. We ourselves do not have explicit recommendations on resolving this problem, although ensuring the archiving of code and data as used is an obvious and inexpensive way of mitigating the problem.

But it seems self-evident to us that, recognizing the limited due diligence of paleoclimate journal peer review, it would have been prudent for someone to have actually checked MBH98 data and methods against original data before adopting MBH98 results in the main IPCC promotional graphics.

The issues raised in McCullough and McKitrick are important ones and presented in an engaging fashion (though I’m obviously a fellow traveller on these matters.) Ross was on one radio show already and the host, like any member of the public (as I once was), was dumbfounded at the lack of due diligence in the chain.

A simple and virtually zero-cost improvement in the system would be one that we’ve long supported: require the archiving of data and code. The purpose of this requirement has been totally misrepresented by Gavin Schmidt – it’s not to parse for code errors, but to put yourself in a position where you can quickly analyse sensitivities or the impact of new data, without having to run the gauntlet of doing everything from scratch.
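
By way of a toy illustration (everything below is invented: a synthetic 12-series “network” and a trivial stand-in for a reconstruction method, not anyone’s actual data or code), here is the kind of sensitivity check that archived data and code make nearly free:

    import numpy as np

    rng = np.random.default_rng(0)
    proxies = rng.standard_normal((500, 12))      # invented 12-series network
    proxies[:, 0] += np.linspace(0.0, 5.0, 500)   # series 0 carries a strong trend

    def recon(p):
        # stand-in for whatever method the archived code actually implements
        return p.mean(axis=1)

    def trend(series):
        # least-squares slope of the reconstruction
        return np.polyfit(np.arange(series.size), series, 1)[0]

    print(f"full network trend: {trend(recon(proxies)):.5f}")
    for k in range(proxies.shape[1]):
        loo = recon(np.delete(proxies, k, axis=1))   # leave-one-out sensitivity
        print(f"drop series {k:2d}: trend {trend(loo):.5f}")

Only dropping the trending series should move the trend materially; the rest is jitter. The point is not the toy numbers but that, with the archive in hand, such checks take minutes instead of requiring everything to be rebuilt from scratch.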

As I’ve said on numerous occasions, I do not think that the issue is primarily inadequate peer review, though, in my opinion, journal peer review all too easily lapses into POV gatekeeping in the style of Burger and Cubasch Referee #2, and academicians are far too quick to shrug that off. Journal peer review is what it is – a cursory form of due diligence. The issue is that “buyers” assume that it’s something that it isn’t and fail to exercise caveat emptor.


133 Comments

  1. Dave Dardinger
    Posted Feb 18, 2009 at 3:30 PM | Permalink

    Interesting enough. But since we can’t talk about the particular examples in the article,

    I alert readers that I don’t have the faintest interest in debating the pros or cons of the issues in these other studies at this blog

    and we can’t discuss public policy per site policy, what sort of replies are we supposed to make? Surely not just RCish accolades?

  2. Ross McKitrick
    Posted Feb 18, 2009 at 3:44 PM | Permalink

    Surely not just RCish accolades?

    Sounds fine to me.

    • Willis Eschenbach
      Posted Feb 19, 2009 at 1:48 PM | Permalink

      Re: Ross McKitrick (#2), I busted out laughing – lucky I wasn’t drinking. Made my night; the interchange below was just too good. My best to both you and Dave, and congratulations to you and Mc on the paper. It is clear and consistently interesting. The most valuable part to me was this:

      a. The data have been published in a form that permits other researchers to check it;
      b. The data described in the article were actually used for the analysis;
      c. Computer code used for numerical calculations has been published or is made available for other researchers to examine;
      d. The calculations described in the paper correspond to the code actually used;
      e. The results listed in the paper can be independently reproduced using the published data and methods;
      f. If empirical findings arise during the analysis that are materially adverse to the stated conclusions of the paper, this has been acknowledged in the article and an explanation is offered to reconcile them to the conclusions.

      Regarding the journals, would you agree that they need to be responsible for verifying a and c above, and the rest are not their responsibility?
      .
      Given that, where do the peer reviewers fit in this process? I don’t see that they can verify b, d, e, or f, given the time constraints and the fact they are usually unpaid. But given they’re not verifying that stuff … what exactly are they verifying?
      .
      It seems to me that about all they can do is take any small part of the whole study and see if they can verify just that bit. However, in many studies, that’s not possible. Hmmm …
      .
      w.

      … what made me laugh was

      =====================================================

      Dave Dardinger:
      February 18th, 2009 at 3:30 pm

      Interesting enough. But since we can’t talk about the particular examples in the article,

      I alert readers that I don’t have the faintest interest in debating the pros or cons of the issues in these other studies at this blog

      and we can’t discuss public policy per site policy, what sort of replies are we supposed to make? Surely not just RCish accolades?
      .
      Ross McKitrick:

      February 18th, 2009 at 3:44 pm

      Surely not just RCish accolades?

      Sounds fine to me.

  3. AJ Abrams
    Posted Feb 18, 2009 at 4:03 PM | Permalink

    Small correction for you – In the paragraph “Ultimately Bellesiles appears to have been brought down by the black and white fact that it was impossible for him to have consulted the records, said to have been consulted, because they didn’t existed” The last word should be exist, obviously.

    Good stuff and thanks for posting it.

  4. MikeU
    Posted Feb 18, 2009 at 4:07 PM | Permalink

    Archiving data sets, code, and documentation may be “virtually zero cost” if all you’re doing is copying snapshots to new directories. However, it’s far more robust to use even the most rudimentary version control system: it’s much more difficult to tamper with, you get an audit trail of every change (what, when, who, and often a comment on why), and you have data duplication in case something is accidentally “lost” or deleted. None of that comes for free in terms of tools and time (configuration, maintenance, training, structuring branches, etc.), but it’s a complete no-brainer for anyone relying on the integrity of their data and methods. Like software developers. Or (hopefully) climate scientists.

    Good revision control has been a staple of software development for 20+ years now… we simply cannot function properly without it. Hopefully climate science will come to that same conclusion moving forward. Whether they’ll make those archives available to other scientists or interested laypersons is another question entirely.
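
    Even the rudimentary end of that spectrum is only a few lines of standard-library Python. A minimal sketch (file name and log fields are invented for illustration) of the what/when/who/why trail described above:

        import hashlib, json, getpass, shutil, time
        from pathlib import Path

        def archive(path, comment, store=Path("archive")):
            """Snapshot a file and append a tamper-evident entry to an audit log."""
            data = Path(path).read_bytes()
            digest = hashlib.sha256(data).hexdigest()
            store.mkdir(exist_ok=True)
            # content-addressed copy: editing the snapshot afterwards breaks the hash
            shutil.copy2(path, store / f"{digest[:12]}_{Path(path).name}")
            entry = {"file": str(path), "sha256": digest,
                     "who": getpass.getuser(),
                     "when": time.strftime("%Y-%m-%dT%H:%M:%S"),
                     "why": comment}
            with open(store / "log.jsonl", "a") as log:   # what, when, who, why
                log.write(json.dumps(entry) + "\n")

        archive("station_data.csv", "data as used for Table 2")   # hypothetical file

    A real version control system adds diffs, branching, and distribution on top of this, but even a log this crude gives an audit trail and tamper-evidence that overwriting files in place never will.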

  5. Gerald Machnee
    Posted Feb 18, 2009 at 4:11 PM | Permalink

    I now get a bit of a chuckle when I see some new study in the media immediately followed by “peer-reviewed”. We now have a case of a paper by two U of Manitoba scientists that was withdrawn in 2008 from the prestigious Science. They are sort of trying to figure out how they (the peer-reviewers and the supervisor) missed the bad data.
    Of course, then there is the other problem, where you cannot get the data to replicate a study.

  6. kim
    Posted Feb 18, 2009 at 4:26 PM | Permalink

    Skating on thin ice here, but there are people who blame that Boston Fed paper for the housing policies which have led to our present disaster. A cautionary tale for policy based on opaque studies.
    ===============================================================

  7. jae
    Posted Feb 18, 2009 at 4:33 PM | Permalink

    Thanks for posting something I can understand for a change 🙂

    • MJT
      Posted Feb 18, 2009 at 4:39 PM | Permalink

      Re: jae (#8),
      Funny I was thinking the same thing…one of the few posts and links that didn’t require me to use a combination of google/dictionary.com/wikipedia on every other word 🙂

  8. Jason
    Posted Feb 18, 2009 at 4:46 PM | Permalink

    This is the core issue raised by Climate Audit.

    Nothing has been (or likely ever will be) published in this blog that materially refutes the IPCC consensus.

    What this blog has shown, fairly conclusively, is that the scientific methods used to derive that consensus fail to pass due diligence.

    Steve: no, it only shows things about the papers discussed here, which are only a subset of all the papers referred to in IPCC. Please do not over-generalize.

    • Ivan
      Posted Feb 19, 2009 at 3:48 AM | Permalink

      Re: Jason (#9),

      This is one of the strange things at Climate Audit. A couple of times, some posters (including me once) urged Steve to “audit” peer reviewed studies by scientists such as Jaworowski, Segalstad, etc., who claim to refute the basic IPCC dogma of pre-Industrial CO2 concentrations below 280 ppmv, or Beck’s claims that accurate chemical CO2 measurements in the first part of the 20th century often showed higher CO2 concentrations than now.

      Steve’s answer to such suggestions was that he wanted to concentrate on mainstream IPCC studies upon which the climate policy is based, and not to be distracted by these fringe theories, because his time is precious.

      Ok, that’s fine. But, apart from the obvious objection that a finding that perhaps the most basic assumptions of mainstream IPCC science concerning CO2 concentrations could be incorrect (which would have immense implications for that science and for climate policy as well), one additional strange thing has now occurred. It turned out that posts about squash and now about auditing studies in economics are not “distractions”, and have much more to do with “mainstream IPCC climate science” than Segalstad’s or Jaworowski’s claims about ice core reconstructions. Strange indeed.

      Such things in my eyes somewhat diminish the credibility of Climate Audit, and raise legitimate concerns that Steve’s refusal to audit controversial and potentially explosive things like the J-S interpretation of the ice core record is motivated by “political” rather than opportunity-cost reasons. And they limit the relevance of the blog at the same time. Folks at Real Climate and other alarmists will not be impressed by such kindness on Steve’s part, and will continue to call him a denier, while his basic auditing “calling” will be compromised.

      Steve: I’m not personally all that interested in “fringe” theories. I’m really pretty conventional and like to focus on mainstream things. And I really don’t want the blog to turn into a debate on fringe theories. I want it to focus on mainstream theory. Further, there are critical analyses of fringe theories available elsewhere – realclimate, for example. If you want to hold squash against me, too bad. The McC-McK was not just a study about “auditing in economics” but a study about “due diligence” as applied to academic journal studies. As I observed, the term “due diligence” in this context was first used in an academic article in McIntyre and McKitrick 2005b and so the issue is well within the purview of my academic interests.

  9. Ron Cram
    Posted Feb 18, 2009 at 4:50 PM | Permalink

    Congrats to Ross and Hu! Sometimes RCish accolades can be both sincere and well-earned.

    Steve: Bruce McC – yet another Mc.

  10. Steve Geiger
    Posted Feb 18, 2009 at 4:56 PM | Permalink

    Great. Now waiting for the first Mc3 article!

  11. IanRae
    Posted Feb 18, 2009 at 4:58 PM | Permalink

    Science changes slowly. Bill Bryson’s “A Short History of Nearly Everything” mentions a Geological Society of America meeting where more than 50% of attendees disbelieved the notion that dinosaurs were wiped out by an asteroid. The year: 1988!

    • Armstrong
      Posted Feb 19, 2009 at 8:09 AM | Permalink

      Re: IanRae (#12),

      Science changes slowly. Bill Bryson’s “A Short History of Nearly Everything” mentions a Geological Society of America meeting where more than 50% of attendees disbelieved the notion that dinosaurs were wiped out by an asteroid. The year: 1988!

      Sorry to get off-topic, but the thread doesn’t seem to have any particular structure anyways. I remember reading this passage of Bryson’s book, and feel the need to point out that the figure is not as surprising as he tells it. The idea of an asteroid causing the Cretaceous extinction was first proposed in 1980, and the Chicxulub crater had not even been identified publicly (it had been found by petroleum engineers, who were not allowed to announce or discuss its existence). The news about the crater really didn’t proliferate until 1990. For geologists, that’s actually a very fast turnaround considering the prevailing view in 1980!

    • Peter D. Tillman
      Posted Feb 19, 2009 at 1:43 PM | Permalink

      Re: IanRae (#12), Dino-Killer
      Re: Armstrong (#37),

      Actually, this explanation is still under debate, and it’s not at all clear that the Chicxulub bolide was the sole cause.

      Well, dammit, I thought I had a ref handy, but don’t. IIRC Wikipedia is fairly up-to-date on this. OT anyway!

      Cheers — Pete Tillman

  12. jack mosevich
    Posted Feb 18, 2009 at 5:19 PM | Permalink

    Science is the belief in the ignorance of experts.
    —— Richard Feynman

  13. Tom C
    Posted Feb 18, 2009 at 5:28 PM | Permalink

    So does Ross only publish with other Mc’s?

    • Bernie
      Posted Feb 18, 2009 at 7:30 PM | Permalink

      Re: Tom C (#14), It’s the Scottish mafia, don’t y’know.

  14. Kenneth Fritsch
    Posted Feb 18, 2009 at 5:51 PM | Permalink

    Unfortunately the attention that these revelations from out of the area of “expertise” receive is very dependent on whose ox is being gored. Nonetheless the revelations are interesting and informative to read and perhaps with the advent of the internet are changing what can be ignored.

  15. PaulC
    Posted Feb 18, 2009 at 6:04 PM | Permalink

    As someone who deals with due diligence professionally, both in my own actions and assessing the actions of others, I would like to comment that the concept of “due diligence” has two components, more or less of equal weight: the explicit recognition by a party that there is an expected course of “diligent” action to be taken as dictated by the circumstances of the matter, and, the actual tangible performance of the action that is expected.

  16. Mark O
    Posted Feb 18, 2009 at 6:33 PM | Permalink

    I work in Aerospace at a test lab where we do testing on rocket and spacecraft components and systems. We not only archive all the data we produce, but we document everything that was involved in the test: time of day, atmospheric pressure, temperature, who was involved in the test, instrumentation used, calibration sheets for all instrumentation, drawings of all equipment tested and test equipment used, functional failure modes and effects analysis (FFMEA), stress calculations on test equipment, who attended design reviews, any code used – the list goes on and on. All this is documented in case something goes wrong. In most cases it is never used again.

    Requiring “scientists” in other fields to document their code and data is trivial compared to what we “rocket scientists” have to do.

    • Ron Cram
      Posted Feb 18, 2009 at 7:21 PM | Permalink

      Re: Mark O (#17),

      Yeah, that is what climate science needs – the contribution of rocket scientists! Maybe we can get NASA interested. Oops. Never mind.

      Just kidding. It’s actually a good idea. If only GISS was run by a rocket scientist, things would be different I’m sure.

      • Mark O
        Posted Feb 18, 2009 at 9:16 PM | Permalink

        Re: Ron Cram (#19), Yea, it always puzzles me why NASA requires so much data and documentation from their contractors, yet they let Hansen get away without properly documenting what he does with GISS. Any contractor working for NASA that treated their data as Hansen does would probably be fined, and would definitely not get any more NASA contracts.

        And just because somebody works at NASA does not make them a “Rocket Scientist.” You actually have to work on Rockets to be a “Rocket Scientist” and Hansen doesn’t.

  17. Chris Ferrall
    Posted Feb 18, 2009 at 7:16 PM | Permalink

    Ross,

    I skimmed the references and did not notice this infamous episode:
    The Hoxby-Rothstein exchange

    But maybe there is some reason it did not fit into the scope of your study or was covered by some other reference.

    Rather than “My Fair Lady” references (“The Rain in Spain …”) this controversy may bring to mind “The Harder They Come” as in “Many Rivers to Cross.”

    –Chris

    • Ross McKitrick
      Posted Feb 18, 2009 at 7:35 PM | Permalink

      Re: Chris Ferrall (#18), It’s a great story and there was no particular reason not to include it except that we had lots of econ stories already. I expect people in every discipline have stories they could add to the list.
      Re: Tom C (#14), And this time it was a Bruce too!

  18. pouncer
    Posted Feb 18, 2009 at 8:39 PM | Permalink

    McCullough, McGeary and Harrison (2006, hereafter “MMH”) attempted to replicate every empirical article published in the JMCB since 1996. Of 186 empirical articles, only 69 had archive entries. Of these, replication could not be attempted for seven due to lack of software or the use of proprietary data. Of the remaining 62, the results of 14 articles could be replicated. This is better than the 2 of 54 that Dewald, Thursby and Anderson (1986) could replicate, but hardly cause for enthusiasm. The primary reason so few empirical articles had archive entries is that nobody was checking – it was entirely up to the author to submit his data and code.

  19. Allen63
    Posted Feb 18, 2009 at 8:44 PM | Permalink

    Interesting read. Based on my professional experiences the article rings true — in particular the part about the near worthlessness of peer review as a form of assurance to a reader. But, the numbers of authors refusing to meet journal standards for archiving are worse than I would have guessed.

    I think it’s [misrepresentation] when the journals “advertise” that data & code will be available and they are not. Even I would have assumed that if data and code have been archived, I need be less suspicious about a result (even though I would not be checking personally). No longer. Apparently, I cannot trust the word of the typical journal regarding archiving.

    Of course, it occurs to me that the subject article itself needs to meet the standards it has set for others. Wonder if it has.

  20. stan
    Posted Feb 18, 2009 at 9:00 PM | Permalink

    If policymakers got serious about the process of science and required (and funded) the replication of studies before they could be used as the basis of policy, most of these problems would be taken care of.

  21. theduke
    Posted Feb 18, 2009 at 10:11 PM | Permalink

    A very fine paper. Kudos to Ross and Hu. Any scientific paper that gives one side of a heated public policy dispute a distinct advantage and is widely publicized (often because it gives one side an advantage) needs to be viewed with suspicion until it can be independently replicated/verified.

  22. Dishman
    Posted Feb 18, 2009 at 11:15 PM | Permalink

    “Peer Review” is not a Quality Process by today’s standards.

    It may be suitable for use by people who are aware of its limitations, but it is not sufficient for applications where there is a risk of injury or death.

  23. Geoff Sherrington
    Posted Feb 19, 2009 at 2:54 AM | Permalink

    Nobody expects the Spanish McInquisition.

  24. Adam Gallon
    Posted Feb 19, 2009 at 2:57 AM | Permalink

    Bellesiles soon resorted to ad hominem, claiming that the amateur critics could not be trusted because they lack credentials.

    Hmm, reminds me of something, can’t quite remember what.

  25. Alan Bates
    Posted Feb 19, 2009 at 3:04 AM | Permalink

    If a paper is intended to contribute towards decision making it should be verified at a level that depends on the degree of risk and/or cost involved.

    I was a chemist in the nuclear power industry and the output of my group was technical advice on station chemistry issues. Often the work we did fed into the preparation of Nuclear Safety Case papers and regularly involved chemical safety issues and Statutory obligations (would a wrong piece of advice kill someone or put us in Court for polluting the coast, etc.?). We had a system where the level of authorisation and extent of review depended on the implications of the work if we got it wrong. It seemed rather obvious …

    Maybe this is where technology wins over academia.

  26. Dishman
    Posted Feb 19, 2009 at 5:39 AM | Permalink

    Ivan,

    Steve does this on his own time. If, like the RealClimate people, he was being paid to blog, your challenge would have some merit.

    If some paper is of interest to you, dig in to it. Steve or Anthony Watts might be willing to post your analysis and host a discussion.

  27. Ian Blanchard
    Posted Feb 19, 2009 at 6:00 AM | Permalink

    Ivan
    As a long-time lurker I would disagree strongly with your characterisation of Steve’s opinion of the works of Beck and similar. He has consistently stated that there are flaws in these, which is why they are not accepted by the mainstream journals, and he has significantly curtailed discussion of them unless someone offers an audit-style critique.

  28. Gary P
    Posted Feb 19, 2009 at 7:00 AM | Permalink

    I am audited randomly on small purchases I make at work with a credit card. Since I was told I would be audited and what would be required, I knew to keep the records and the whole process is painless and takes very little time. It would be a major pain if I was not expecting an audit.

    By this time the climatologists should be expecting requests for data and methods and they have no excuse for not being prepared. It should be done for their own internal needs.

    They seem not to have caught up with the computer age enough to understand that the classical methods of statistical analysis for estimating error are inadequate for complex computer processing. It is much faster and more definitive to feed the entire process random data and see what happens. One can assume a model such as rising temperatures, add random noise to perfect data, and then see if the same slope comes out of the proposed method. Their made-up statistical methods are totally unconvincing.
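
    A minimal sketch of that kind of synthetic-data check (all numbers invented; np.polyfit stands in for whatever estimation method is actually under test):

        import numpy as np

        rng = np.random.default_rng(1)
        true_slope = 0.02                  # invented trend, units per year
        years = np.arange(100.0)

        recovered = []
        for _ in range(1000):
            # perfect data with a known slope, plus random noise
            series = true_slope * years + 0.5 * rng.standard_normal(years.size)
            recovered.append(np.polyfit(years, series, 1)[0])   # method under test

        recovered = np.array(recovered)
        print(f"true slope {true_slope}, recovered {recovered.mean():.4f}"
              f" +/- {recovered.std():.4f}")

    If the recovered slope is biased, or its spread is wider than the method’s claimed error bars, the method rather than the data is the problem.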

  29. Jason
    Posted Feb 19, 2009 at 7:17 AM | Permalink

    snip – no fringe theories please

  30. Fred Harwood
    Posted Feb 19, 2009 at 7:43 AM | Permalink

    When I attempted to forward the pdf to interested parties, Verizon.net blocked the attachment as outgoing spam. I have sent it to spamdetector.update@verizon.net for their intervention.

  31. Posted Feb 19, 2009 at 8:25 AM | Permalink

    Steve, thanks, IMHO this post is very relevant to your work… helping the push for transparency and “due diligence” where huge public policy is at stake…

    So I agree with Ivan (#31) et al that this blog would lose its particular strength and direction if it were to focus on Jaworowski, Segalstad, and Beck, whose scientific pronouncements have not been accepted for public policy and who, though far more appealing to the skeptics, may be as flawed as Mann – for who can be sure, without a decent open audit?

    Nevertheless, I feel that the work of these, and others like Landscheidt, could be important to real transparent Climate Science, and this is worthy of discussion. I’ve recently opened a thread on our forum on “doing Jaworowski justice”. It’s hard work. On that count alone, Steve is wise not to even try to begin here. But others who miss such discussion here might like to join our Forum. It has other imperfections but might suit some.

  32. Posted Feb 19, 2009 at 9:07 AM | Permalink

    Steve, what I and I’m sure many others would like to see here is some kind of summary or diary, for non-statistically-literates, on where the story is at, where everyone is at, regarding deconstructing Steig’s figures and hence his conclusions. I can sense the excitement and the progress by leaps and bounds (eg, great to have RegEM now ported into R, great that Tapio Schneider made this possible, one more tick for openness and transparency). But this story can only be written accurately enough by a statistician!
    .
    Writing the story is also a form of good record-keeping, to be available for others to enjoy and check, h’mmmmm…
    .
    Thanks for all the good work.

    Steve: the Tapio Schneider code has been available for quite a while and theoretically it could have been ported a while ago. The present spurt of work is related more to the fact that the Antarctic data set is an easy one to work on and benchmark; plus Steig said that he used the unadorned code and so, at face value, one doesn’t have the extra layer of Mann-Rutherford “adaptations”. Also Jeff Id got the Matlab version working so I had intermediates to benchmark.

  33. Dean P
    Posted Feb 19, 2009 at 9:33 AM | Permalink

    This reminds me of a story from a few years ago:

    http://www.newscientist.com/article/dn7915

    I especially like the line

    “We should accept that most research findings will be refuted. Some will be replicated and validated. The replication process is more important than the first discovery,” Ioannidis says.

    Or, in the immortal words of Jimmy Buffett: “Don’t ever forget that you just wind up being wrong!”

  34. Clark
    Posted Feb 19, 2009 at 9:58 AM | Permalink

    The new online series of journals from BMC has the question below for peer reviewers as part of a checklist for reviewing manuscripts. It would be nice if this sort of check spread to other journals:

    Statistical review

    Is it essential that this manuscript be seen by an expert statistician?
    If you feel that the manuscript needs to be seen by a statistician, but are unable to assess it yourself then please could you suggest alternative experts in your confidential comments to the editors.

    *Yes, and I have assessed the statistics in my report.

    *Yes, but I do not feel adequately qualified to assess the statistics.

    *No, the manuscript does not need to be seen by a statistician.

    • bernie
      Posted Feb 19, 2009 at 2:47 PM | Permalink

      Re: Clark (#41), That is interesting and surely a sign of some progress?

  35. david_a
    Posted Feb 19, 2009 at 12:26 PM | Permalink

    From someone who has only recently begun delving into the climate prediction world let me just say thanks to all the people here who do what they do and share their expertise. This is the first place I have found where the attitude is not condescending and there is an easy openness to what is being done.

    I’ve spent many years building prediction systems in the financial world, sometimes successfully, sometimes not. While I am in no position to publish either results or data or algorithms because of obvious intellectual property issues, the things that we do in-house to document our research and results are substantial yet trivial to implement. Perhaps it is just second nature because I have been programming for a very long time, but it is beyond belief that climate modelers do not use simple source code control systems and could not easily release data and code so others could examine it and, more importantly, ‘play’ with it.

    Beyond the whole point of whether or not ‘science’ would be advanced by publication of the underlying data and codes (which it obviously would) there is a much bigger point and that is that what is being contemplated politically are fairly large scale, policy driven, economic dislocations. Given that the stakes are very high, the process needs to be as transparent as possible. I would suspect that the majority of research being done in the area is funded with public monies and the public certainly has a right to see the nuts and bolts.

    A long time ago I had the pleasure of working as a research assistant in a solid state physics lab. While my job revolved around the automation of experiments and data collection (moving out of the dark ages of strip chart recorders), it fell upon me to do a little bit of data reduction since I was somewhat facile at programming where the experimentalist driving the project was not. I was so fearful of screwing up the data that I went to enormous lengths to document, check and recheck everything I touched and was only too pleased to show it to anyone that would look. And since what we were doing was measuring charge carrier concentrations across p-n junctions for some folks trying to grow better transistors, it wasn’t as if some large fraction of the world’s GDP hinged upon me getting it right. I cannot at all understand the attitude of ‘scientists’ who wish only to release their methods, data and results to their ‘peers’.

    On a completely separate topic is there a thread or a place here to ask a few newbie questions?

    thx
    d

  36. Harry Eagar
    Posted Feb 19, 2009 at 12:52 PM | Permalink

    Mc and Mc and Mc. I think we have a ‘Scotch verdict’ on quality control.

  37. Simon Evans
    Posted Feb 19, 2009 at 1:43 PM | Permalink

    I have not followed every episode of the H-S soap opera, so would appreciate it if someone could explain to me the following:

    This Mc&Mc paper states as follows:

    …files found on Mann’s FTP site showed that he had re-run his analysis specifically excluding the bristlecone pine data, which yielded the result that the hockey sick shape disappeared, but this result was also not reported.
    In the context of private sector due diligence, a failure to disclose adverse performance is a misrepresentation. It is no less serious in scientific contexts.

    However, MBH99 reads as follows, my bold:

    In using the sparser dataset available over the entire millennium (Table 1), only a relatively small number of indicators are available in regions (e.g., western North America) where the primary pattern of hemispheric mean temperature variation has significant amplitude (see Fig. 2 in MBH98), and where regional variations appear to be closely tied to global-scale temperature variations in model-based experiments [Bradley, 1996]. These few indicators thus take on a particularly important role (in fact, as discussed below, one such indicator – PC #1 of the ITRDB data – is found to be essential), in contrast with the post AD 1400 reconstructions of MBH98 for which indicators are available in several key regions [e.g., the North American northern treeline (“NT”) dendroclimatic chronologies of Jacoby and D’Arrigo, 1989].

    and

    It is furthermore found that only one of these series – PC #1 of the ITRDB data – exhibits a significant correlation with the time history of the dominant temperature pattern of the 1902-1980 calibration period. Positive calibration/variance scores for the NH series cannot be obtained if this indicator is removed from the network of 12 (in contrast with post-AD 1400 reconstructions for which a variety of indicators are available which correlate against the instrumental record). Though, as discussed earlier, ITRDB PC#1 represents a vital region for resolving hemispheric temperature trends, the assumption that this relationship holds up over time nonetheless demands circumspection. Clearly, a more widespread network of quality millennial proxy climate indicators will be required for more confident inferences.

    How is Mc & Mc’s very serious charge of misrepresentation tenable given these explicit statements of the study’s dependence upon PC1?

    • Steve McIntyre
      Posted Feb 19, 2009 at 3:12 PM | Permalink

      Re: Simon Evans (#46),

      I am not a coauthor of McCullough and McKitrick. Given that Mc-Mc at this site typically involves me, I would appreciate it if you would avoid short forms where they may be misleading. I suggest that you also consult our NAS presentation here in which the representations as to “robustness” were examined. The following claims on this matter were noted:

      MBH98:

      the long-term trend in NH is relatively robust to the inclusion of dendroclimatic indicators in the network, suggesting that potential tree growth trend biases are not influential in the multiproxy climate reconstructions. (p. 783, emphasis added.)

      This is obviously untrue as the bristlecones affect the result.

      Mann et al., 2000 stated:

      We have also verified that possible low-frequency bias due to non-climatic influences on dendroclimatic (tree-ring) indicators is not problematic in our temperature reconstructions… Whether we use all data, exclude tree rings, or base a reconstruction only on tree rings, has no significant effect on the form of the reconstruction for the period in question. … These comparisons show no evidence that the possible biases inherent to tree-ring (alone) based studies impair in any significant way the multiproxy-based temperature pattern reconstructions discussed here.

      This latter assertion has been parsed from time to time, with Mann’s defenders arguing that the claim is cunning: it is true for the AD1730 step shown – and the fact that it is untrue for the AD1400 period is neither here nor there. You’d be torched by a securities commission if you tried that line of argument. Failure to disclose relevant information is held to be a misrepresentation in such circumstances.

      Note that, examined closely, the MBH99 language pointedly avoided conceding an inch on MBH98 (“in contrast with the post AD 1400 reconstructions”), limiting their caveat to the AD1000 step. This limitation is not correct. As we observed at the time, they had done calculations in their CENSORED directory for the AD1400 step as well and knew that there was non-robustness to bristlecones in the AD1400 step then in dispute. (MM2005a,b were primarily about MBH98.)

      There’s also an issue about whether the language in MBH99 is even consistent with the earlier and later claims. If your view is that the MBH99 language somehow undoes the MBH98 representation (and I don’t agree that it does), then they should have issued a corrigendum in 1999 at Nature, the more prominent journal. It also doesn’t explain the subsequent representation in Mann et al 2000, which is an even more extreme statement of MBH98 language.

      Also, the McCullough and McKitrick language stated that the disappearance of the distinctive hockey stick shape was not reported in MBH99:

      bristlecone pine data, which yielded the result that the hockey s[t]ick shape disappeared,

      If you look closely at the actual language of MBH99, this is so. What MBH99 actually report is only that the bristlecone-free recon has failing calibration-verification statistics, which is not the same thing, though it is sometimes passed off as being equivalent.

      And don’t forget to look at the verification r2 issue while you’re at this.

      • Mark T
        Posted Feb 19, 2009 at 3:18 PM | Permalink

        Re: Steve McIntyre (#53),

        There’s also an issue about whether the language in MBH99 is even consistent with the earlier and later claims.

        If your view is that the MBH99 language somehow undoes the MBH98 representation (and I don’t agree that it does), then they should have issued a corrigendum in 1999 at Nature, the more prominent journal.

        The excerpts that Simon quoted seem to be referring to the PCs after including the BCP series, not the BCP series themselves, which is what Mc-Mc argue, i.e., he’s offering up a strawman.

        Mark

        Steve: Again, I’m not a coauthor of the study in discussion and ask that Mc-Mc not be used on this site for studies not involving me.

        • Mark T
          Posted Feb 19, 2009 at 3:58 PM | Permalink

          Re: Mark T (#54),

          Steve: Again, I’m not a coauthor of the study in discussion and ask that Mc-Mc not be used on this site for studies not involving me.

          Ooops, sorry, I only skimmed the first part of your post since it did not contain any meat and didn’t notice your request.

          Mark

      • Simon Evans
        Posted Feb 19, 2009 at 3:41 PM | Permalink

        Re: Steve McIntyre (#53),

        Steve,

        My apologies for the Mc&Mc ambiguity – it was not intentional.

        I am not questioning the finer points of detail. I’m sure it is very possible that MBH have said things elsewhere that you consider to be ambiguous, misleading, or whatever. But the fact is that in the fifth paragraph of MBH99 they make explicitly clear that PC #1 is “essential”, do they not? And later they make absolutely clear that “positive calibration/variance scores for the NH series cannot be obtained if this indicator is removed from the network”, do they not? Therefore McCullough’s and McKitrick’s assertion that this fact is “not reported” is unfounded, is it not? We don’t have to examine the language closely and we don’t have to look at verification r2 issues. McCullough’s and McKitrick’s assertion is simply false.

        • Jean S
          Posted Feb 19, 2009 at 3:52 PM | Permalink

          Re: Simon Evans (#56),
          You are mixing two things: the MBH99 assertion, which refers to the AD1000-AD1399 part of the reconstruction, and McCullough’s and McKitrick’s assertion, which refers to the AD1400-AD1449 part of the reconstruction.

        • Mark T
          Posted Feb 19, 2009 at 3:59 PM | Permalink

          Re: Simon Evans (#56),

          But the fact is that in the fifth paragraph of MBH99 they make explicitly clear that PC #1 is “essential”, do they not? And later they make absolutely clear that “positive calibration/variance scores for the NH series cannot be obtained if this indicator is removed from the network”, do they not? Therefore McCullough’s and McKitrick’s assertion that this fact is “not reported” is unfounded, is it not?

          Both Ross and I have pointed out the flaw in your assertion, drop the strawman.

          Mark

    • Ross McKitrick
      Posted Feb 19, 2009 at 3:30 PM | Permalink

      Re: Simon Evans (#46), From the cryptic quotations you don’t get the actual weakness of the MBH results. Read my APEC paper for the background details. By calling it the PC1 of the NOAMER network Mann implied that it was the dominant pattern of variance in the whole North American sample. But it wasn’t the dominant pattern in the NA sample, it was a pattern associated only with a small group of bristlecones sampled in a single western US region, which were long suspected of being invalid for use as climate proxies (a point agreed to by the NAS panel).

      The only reason the bristlecones forced the shape of the Mann PC1 was the incorrect PC method applied by Mann. Had he used a conventional method the hockey stick shape would drop to PC4 (depending on whether you homogenize the variance or not, discussed in our 2005 GRL exchange with Huybers). The associated eigenvalue on the PC4 was less than 8% indicating the bristlecones contribute a minor regional signal, not the dominant pattern of variance. Had the quoted paragraphs said that the Graybill bristlecones were essential to the results, and had he shown the graphs with and without them, the hockey stick would have been ignored, since he would have had to argue that the bristlecone pines should not only be assumed to proxy the entire Northern Hemisphere climatic reconstruction, but to do so even though they negate the pattern arising from the entire rest of the dataset.

      The passage you quote also seems to suggest that only the pre-1400 portion loses significance, and only when the PC1 is dropped. But the loss of significance extends further forward in time into the 18th century when the unreported CE and r2 stats are consulted even when the PC1 is retained, and as Wahl and Ammann showed, when the PC1 is dropped the whole reconstruction becomes, in their words, “without merit.” The same effect arises by retaining everything else in the MBH reconstruction but merely dropping the bristlecones. That was what the “Censored” folder showed. The passage you quote did not disclose this. Had it done so, nobody would have been shocked when Steve and I demonstrated the influence of the bristlecones many years later.
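
      To see the centring mechanism in isolation, here is a toy numpy sketch (network size, amplitudes, and the 79-year “calibration period” are all invented; this illustrates the effect of the centring convention, not a re-run of the MBH network):

          import numpy as np

          rng = np.random.default_rng(0)
          n_years, n_noise, n_hs = 581, 65, 5          # invented network dimensions
          shape = np.zeros(n_years)
          shape[-80:] = np.linspace(0.0, 1.5, 80)      # modest 20th-century uptick

          X = rng.standard_normal((n_years, n_noise + n_hs))
          X[:, :n_hs] += shape[:, None]                # only five series carry the uptick

          def pc1(data, rows):
              # leading principal component after centring each series on `rows`
              centred = data - data[rows].mean(axis=0)
              u, s, _ = np.linalg.svd(centred, full_matrices=False)
              return u[:, 0], s[0] ** 2 / (s ** 2).sum()

          for label, rows in [("full-record centring", slice(None)),
                              ("calibration centring", slice(-79, None))]:
              pc, share = pc1(X, rows)
              r = abs(np.corrcoef(pc, shape)[0, 1])
              print(f"{label}: PC1 share {share:.2f}, |corr with uptick| {r:.2f}")

      Under full-record centring the uptick should stay buried among the noise PCs; under calibration-period centring the same five series should dominate PC1. None of the numbers come from the MBH network – the sketch only shows that the centring convention by itself can decide whether a small sub-group’s shape appears as “the dominant pattern of variance”.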

      • Simon Evans
        Posted Feb 19, 2009 at 4:11 PM | Permalink

        Re: Ross McKitrick (#55),

        Ross, thank you for the link to your APEC paper. It’s a diversion, but may I ask you a question about it? Were you aware when you wrote it that your figure 3 (taken from the FAR) is actually a working of Lamb’s reconstruction of the Central England Temperature record, and not a global representation at all? I ask because you introduce this graph of Central England temperatures in the following way –

        “The Medieval Warm Period (MWP) is an interval from approximately AD1000 to AD1300 during which many places around the world exhibited conditions that seem warm compared to today. In the 1990 First Assessment Report of the IPCC, there was no hockey stick. Instead the millennial climate history contained a MWP and a subsequent Little Ice Age, as shown as in Figure 3.”

        It seems to me that you either were not aware that this was based on Lamb’s reconstruction of CET or else you were not being suitably clear as to what the graph showed. Can you clarify that for me?

        Moving back to your response:

        “By calling it the PC1 of the NOAMER network Mann implied that it was the dominant pattern of variance in the whole North American sample. But it wasn’t the dominant pattern in the NA sample, it was a pattern associated only with a small group of bristlecones sampled in a single western US region, which were long suspected of being invalid for use as climate proxies (a point agreed to by the NAS panel).”

        That is irrelevant. You have stated that they did not report the matter. They did, in their fourth paragraph of MBH99.

        “The only reason the bristlecones forced the shape of the Mann PC1 was the incorrect PC method applied by Mann. Had he used a conventional method the hockey stick shape would drop to PC4 (depending on whether you homogenize the variance or not, discussed in our 2005 GRL exchange with Huybers). The associated eigenvalue on the PC4 was less than 8% indicating the bristlecones contribute a minor regional signal, not the dominant pattern of variance. Had the quoted paragraphs said that the Graybill bristlecones were essential to the results, and had he shown the graphs with and without them, the hockey stick would have been ignored, since he would have had to argue that the bristlecone pines should not only be assumed to proxy the entire Northern Hemisphere climatic reconstruction, but to do so even though they negate the pattern arising from the entire rest of the dataset.”

        That is also irrelevant to the point under discussion. You have stated in the paper this thread refers to that they did not report the dependency. They did.

        The passage you quote also seems to suggest that only the pre-1400 portion loses significance, and only when the PC1 is dropped. But the loss of significance extends further forward in time into the 18th century when the unreported CE and r2 stats are consulted even when the PC1 is retained, and as Wahl and Ammann showed, when the PC1 is dropped the whole reconstruction becomes, in their words, “without merit.” The same effect arises by retaining everything else in the MBH reconstruction but merely dropping the bristlecones. That was what the “Censored” folder showed. The passage you quote did not disclose this. Had it done so, nobody would have been shocked when Steve and I demonstrated the influence of the bristlecones many years later.

        That is not what you have stated in your paper under discussion. If you wished to do so, then you could have done so. As it is you stated that “he had re-run his analysis specifically excluding the bristlecone pine data, which yielded the result that the hockey sick shape disappeared, but this result was also not reported.” That does not appear to me to be true, since MBH99 makes absolutely clear that “positive calibration/variance scores for the NH series cannot be obtained if this indicator is removed from the network of 12”.

        • Ross McKitrick
          Posted Feb 19, 2009 at 4:45 PM | Permalink

          Re: Simon Evans (#62), On Lamb: the caption in the IPCC Report refers to global temperature variations (extracted here). The fact that they took the graph from Lamb’s CET doesn’t change the point that it was the prevailing view of the global picture.

          Now as to your other point, you’re being obtuse. Bruce and I asserted that Mann did not report that the characteristic hockey stick shape disappears from the reconstruction when the bristlecones are removed. Your quotes from MBH99 confirm this. If they had disclosed the effect nobody would have been startled by it when Steve and I pointed it out many years later. Moreover the MBH98 paper gave not a hint of the sensitivity. The MBH99 paper only suggested that significance might be lost in the pre-1400 portion upon removal of the PC1. This is not disclosure in any scientific sense. Removal of the bristlecones destroys the shape of the reconstruction across its length: this has enormous implications for the robustness of the result. This was not reported, and as Steve has noted, Mann’s comments in other publications claimed otherwise.

        • Simon Evans
          Posted Feb 19, 2009 at 5:10 PM | Permalink

          Re: Ross McKitrick (#67),

          On Lamb: the caption in the IPCC Report refers to global temperature variations (extracted here). The fact that they took the graph from Lamb’s CEt doesn’t change the point that it was the prevailing view of the global picture.

          I am very well aware of that. I was asking whether you were aware that it was only a reconstruction of Central England when you wrote the paper you linked to. Is it your view that the temperature record of Central England is a good representation of the global picture? If not, then why did you present it as such? (I’m not interested in what the IPCC FAR suggested, but in what you wished to suggest).

          Now as to your other point, you’re being obtuse. Bruce and I asserted that Mann did not report that the characteristic hockey stick shape disappears from the reconstruction when the bristlecones are removed. Your quotes from MBH99 confirm this. If they had disclosed the effect nobody would have been startled by it when Steve and I pointed it out many years later. Moreover the MBH98 paper gave not a hint of the sensitivity.

          So your point is that they didn’t say “the characteristic hockey shape disappears” even though they did say “positive calibration/variance scores for the NH series cannot be obtained if this indicator is removed from the network”? Right!

          The MBH99 paper only suggested that significance might be lost in the pre-1400 portion upon removal of the PC1. This is not disclosure in any scientific sense. Removal of the bristlecones destroys the shape of the reconstruction across its length: this has enormous implications for the robustness of the result.

          What it suggested is that pre-1400 correlation depended upon the PC1 but that post-1400 correlation was supported by other indicators, and it said “the assumption that this relationship [PC1] holds up over time nonetheless demands circumspection.” Is that not a circumspect enough disclosure for you?

          “This is not disclosure in any scientific sense”

          – and was your answer to my question to you as to your use of Lamb’s CET reconstruction “disclosure in any scientific sense”? Either you think that represents the global picture or you don’t. And if you don’t, then why did you use it?

          Steve: as you observed above, the Lamb issue is a diversion. Please take it to Unthreaded.

  38. Ivan
    Posted Feb 19, 2009 at 2:36 PM | Permalink

    snip – please discuss Beck and such things elsewhere.

  39. G Alston
    Posted Feb 19, 2009 at 3:04 PM | Permalink

    Steve, you said — “The purpose of this requirement has been totally misrepresented by Gavin Schmidt – it’s not to parse for code errors… [SNIP]”

    And what, precisely, is wrong with looking for code errors? Isn’t being able to reproduce things at the core of the scientific method? If I were to use a given work as a starting point for something, having it combed for errors and pronounced clean would be a great deal warmer and fuzzier than some flavour of hopeful trust.

  40. Mark T
    Posted Feb 19, 2009 at 3:09 PM | Permalink

    I think Steve views finding code errors as incidental to his stated goals, i.e., he’s not saying there’s anything wrong with doing so, just that it should be a side effect of larger goals. Keep in mind, methodological errors in the code are not the kinds of errors I think he’s referring to.

    Mark

  41. MrPete
    Posted Feb 19, 2009 at 3:52 PM | Permalink

Simon (#56), your post appears to have overlapped with the response from Ross McKitrick (#55). I recommend you consider his response, then update your assertions.

  42. Mark T
    Posted Feb 19, 2009 at 3:59 PM | Permalink

    And Jean S.

    Mark

  43. Steve McIntyre
    Posted Feb 19, 2009 at 4:13 PM | Permalink

    #62. Simon, would you do me the courtesy of replying to my comment prior to moving on?

    • Simon Evans
      Posted Feb 19, 2009 at 4:34 PM | Permalink

      Re: Steve McIntyre (#63),

      “Simon, would you do me the courtesy of replying to my comment prior to moving on?”

      I thought that I had somewhat, but evidently not to your satisfaction. You state:

“If you look closely at the actual language of MBH99, this is so. What MBH99 actually report is only that the bristlecone-free recon has failing calibration-verification statistics, which is not the same thing though it is sometimes passed off as being equivalent.”

      Well, I do consider it to be equivalent. They say that it is “essential” and that “positive calibration/variance scores for the NH series cannot be obtained if this indicator is removed from the network”. I read that, and still read it, as meaning clearly that the findings are without substance if it is excluded.

      I am bewildered now by some posters suggesting I am presenting a strawman. I am not. I have simply read McCullough’s and McKitrick’s statements and find them to be in conflict with the text of MBH99. Perhaps you might consider the issue of the language in McCullough and McKitrick as well as the language in MBH?

      • jim edwards
        Posted Feb 19, 2009 at 6:02 PM | Permalink

        Re: Simon Evans (#64),

“If you look closely at the actual language of MBH99, this is so. What MBH99 actually report is only that the bristlecone-free recon has failing calibration-verification statistics, which is not the same thing though it is sometimes passed off as being equivalent.” (SM)

        Well, I do consider it to be equivalent. They say that it is “essential” and that “positive calibration/variance scores for the NH series cannot be obtained if this indicator is removed from the network”. I read that, and still read it, as meaning clearly that the findings are without substance if it is excluded.

I am bewildered now by some posters suggesting I am presenting a strawman. I am not. I have simply read McCullough’s and McKitrick’s statements and find them to be in conflict with the text of MBH99.

I may be in the minority here, Simon, but I don’t think your interpretation of Mann’s comment is bad – IF you ignore the public comments of Mann and his supporters.

Your construction seems a fair reconciliation of two statements. But I believe that your construction was rejected early on by Mann and company when the question was directly asked whether these HS reconstructions were “robust” to the removal of the bristlecone pines.

I seem to remember there was a big fight earlier over whether PC1 [the dominant signal in the proxies] was emergent in the whole group of proxies or an artifact of the Graybill bristlecone pines only. Your argument was rejected by Mann and co. early on, I believe.

Even if it became begrudgingly accepted later, that would only serve to reinforce Ross’s fundamental point that the facts in these cases only become clear well after the policy train has left the station.

        • Simon Evans
          Posted Feb 19, 2009 at 6:13 PM | Permalink

          Re: jim edwards (#75),

          Jim,

          Thanks for your comment. You may have helped to answer my original question, or at least to explain how we’ve got from there to here, so thanks 🙂

  44. Mark T
    Posted Feb 19, 2009 at 4:40 PM | Permalink

    Not some, just one.

    Mark

  45. Steve McIntyre
    Posted Feb 19, 2009 at 4:43 PM | Permalink

    #64. Please comment on the following. Do you believe the following to be true statements:

    MBH98:

    the long-term trend in NH is relatively robust to the inclusion of dendroclimatic indicators in the network, suggesting that potential tree growth trend biases are not influential in the multiproxy climate reconstructions. (p. 783, emphasis added.)

    Mann et al., 2000:

    We have also verified that possible low-frequency bias due to non-climatic influences on dendroclimatic (tree-ring) indicators is not problematic in our temperature reconstructions… Whether we use all data, exclude tree rings, or base a reconstruction only on tree rings, has no significant effect on the form of the reconstruction for the period in question. … These comparisons show no evidence that the possible biases inherent to tree-ring (alone) based studies impair in any significant way the multiproxy-based temperature pattern reconstructions discussed here.

If I can understand your position on these points, I’ll be in a better position to appraise your other comments.

  46. Colin Davidson
    Posted Feb 19, 2009 at 4:50 PM | Permalink

    To return to the subject, the issue of the fallibility and corruption of the peer review system is of long standing. There are many examples of the system being used to censor original work, rather than addressing the quality of the data collection, method and analysis.

(Further examples: Fred Hoyle’s and Chandra Wickramasinghe’s work on panspermia, and Louis Frank’s work on small comets. I don’t want to debate the correctness of these, just note the difficulty in getting the original papers published, due to censorship.)

    I think journal editors are in the frame here: they should be publishing and policing a robust set of rules for authors (data, method and analysis archiving so that replication/falsification can occur) and also for reviewers (are comments pertinent to the quality of the data, the method and the analysis, or do they just disagree with the conclusions?).

    I look forward to a robust discussion of the issue.

Steve: there have been discussions in various venues over peer review warts and, with respect, I don’t want to re-hash them here. The issue at hand is what happens between peer review and policy.

    • davidc
      Posted Feb 20, 2009 at 4:44 PM | Permalink

      Re: Colin Davidson (#68),

      Steve’s comment:

The issue at hand is what happens between peer review and policy.

That’s the important point. One of the disservices of (some) climate scientists to the public perception of science is this issue of peer review. A regular response to criticism has been that it hasn’t been peer reviewed (by them) so it’s inconsequential. The implication (of course this is never stated explicitly) is that peer review somehow rejects the flawed papers and admits only those that reveal the truth. In fact, the kinds of questions reviewers are asked are: Is the material of interest to the readers of this journal? Are the methods adequately described? Is there adequate citation of relevant literature? Are there major blunders which are obvious to anyone with a knowledge of this field?

      The real business of science gets underway after publication. Most papers are simply ignored, which is why reviewers are unwise to spend too much time on them. The ones that aren’t ignored are subject to public scrutiny (generally through peer reviewed papers) including attempts to replicate the data if that’s important (remember cold fusion). This is the process which has been missing in climate science. Straight from peer review to policy (including a modified hockey stick on the Australian govt website on carbon trading).

  47. mondo
    Posted Feb 19, 2009 at 5:05 PM | Permalink

    Following the above exchange between Simon Evans, Steve McIntyre and Ross McKitrick with great interest.

It is interesting to consider the situation if it is allowed that Simon’s interpretation of the Mann statements is correct, and that Mann DID in fact disclose that the Hockey Stick graph was highly dependent on the bristlecone pines for its shape, and that the shape would disappear if the bristlecone pines were excluded (acknowledging that whether he clearly disclosed this is disputed). That would mean that the IPCC had mistakenly used a dubious graph in its reports. Further, it would mean that Al Gore had mistakenly used a dubious graph as a centrepiece in his film An Inconvenient Truth.

    If Mann did know about the bristlecone problem (and the NAS panel made it very clear that bristlecone pines should not be used), should he not then have come out publicly and said that the Hockeystick graph doesn’t actually represent the true position, and it would be wrong to base policy decisions on it?

I realise that Mann et al’s position is that even if you leave the dendroclimatological papers out of the corpus, the evidence is still strong for AGW. However, that is all the more reason that he perhaps should have spoken up, and not allowed his graph to be used to create a wrong impression.

    Of course, had he spoken up, then we would not still be going over MBH98 and MBH99 as we are.

FWIW, my lay perspective, having followed the discussion pretty much since the beginning, is that McCullough and McKitrick have it about right in their just-released paper.

    • jae
      Posted Feb 20, 2009 at 5:39 PM | Permalink

      Re: mondo (#69),

      FWIW, this is the way I understand it, also. All the parsing by Simon Evans does not change how the study and data were actually interpreted and used.

  48. Steve McIntyre
    Posted Feb 19, 2009 at 5:14 PM | Permalink

    #70. One more time. Please comment on the following before proceeding to discuss later characterizations of the Mann corpus. Do you believe the following to be true statements:

    MBH98:

    the long-term trend in NH is relatively robust to the inclusion of dendroclimatic indicators in the network, suggesting that potential tree growth trend biases are not influential in the multiproxy climate reconstructions. (p. 783, emphasis added.)

    Mann et al., 2000:

    We have also verified that possible low-frequency bias due to non-climatic influences on dendroclimatic (tree-ring) indicators is not problematic in our temperature reconstructions… Whether we use all data, exclude tree rings, or base a reconstruction only on tree rings, has no significant effect on the form of the reconstruction for the period in question. … These comparisons show no evidence that the possible biases inherent to tree-ring (alone) based studies impair in any significant way the multiproxy-based temperature pattern reconstructions discussed here.

    • Simon Evans
      Posted Feb 19, 2009 at 5:34 PM | Permalink

      Re: Steve McIntyre (#71),

      #70. One more time. Please comment on the following before proceeding to discuss later characterizations of the Mann corpus. Do you believe the following to be true statements:

      MBH98:

      the long-term trend in NH is relatively robust to the inclusion of dendroclimatic indicators in the network, suggesting that potential tree growth trend biases are not influential in the multiproxy climate reconstructions. (p. 783, emphasis added.)

      Mann et al., 2000:

We have also verified that possible low-frequency bias due to non-climatic influences on dendroclimatic (tree-ring) indicators is not problematic in our temperature reconstructions… Whether we use all data, exclude tree rings, or base a reconstruction only on tree rings, has no significant effect on the form of the reconstruction for the period in question. … These comparisons show no evidence that the possible biases inherent to tree-ring (alone) based studies impair in any significant way the multiproxy-based temperature pattern reconstructions discussed here.

      I don’t know.

      I was discussing the fact of disclosure in MBH99. Your references to MBH98 and 2000 are actually irrelevant to the fact of whether they did or did not make clear the dependence upon PC1 in MBH99. The paper we are discussing suggests that they concealed that. They did not.

  49. Steve McIntyre
    Posted Feb 19, 2009 at 5:35 PM | Permalink

Good editorial on this topic at a blog on administrative law that I’ve mentioned in the past. The IPCC’s status as an exempt international organization makes it immune from FOI requests.

    The term “due diligence” used here is drawn directly from administrative law and while the blogger takes pains to disassociate himself from any technical arguments made here, he is supportive of some of the administrative positions taken here and in Ross’ article.

  50. carl g
    Posted Feb 19, 2009 at 5:50 PM | Permalink

    #70, #72:

    Ross:
    The MBH99 paper only suggested that significance might be lost in the pre-1400 portion upon removal of the PC1. This is not disclosure in any scientific sense. Removal of the bristlecones destroys the shape of the reconstruction across its length: this has enormous implications for the robustness of the result.

    Simon:

    What it suggested is that pre-1400 correlation depended upon the PC1 but that post-1400 correlation was supported by other indicators, and it said “the assumption that this relationship [PC1] holds up over time nonetheless demands circumspection.” Is that not a circumspect enough disclosure for you?

You are arguing that if you remove PC1, the post-1400 recon is still significant. Does that make it OK not to report that the shape of the post-1400 recon is no longer a hockey stick? That was the entire premise of their paper: this was the hottest decade/century in the last 1000/600 years.

  51. Posted Feb 19, 2009 at 6:02 PM | Permalink

    RE: #73

    Thanks for the link, Steve; and I never did thank you for the earlier mention, which was remiss of me. In that, you nailed the point I was trying to make – that thinking about these things in terms of administrative law (in particular robust transparency, participation and review mechanisms) might be one fruitful way of making sure that public policy is based upon sound science. Your efforts have meant that the IPCC now presents an interesting case-study in this regard.

    In terms of global governance bodies, this is still pretty speculative, however – the immunities issue that you note being one key reason (although there are some indications of national courts being prepared to set these aside – to date only in staffing issues – where no equivalent procedure is established within the organization itself).

I should stress that my desire to disassociate myself from the technical stuff is the product of my own total ignorance; nothing more. Like all good lawyers – and bad scientists – I look for the weight of authority in debates like this. I appreciate the work that you’re doing here, however, and the tone with which it’s done.

    • Steve McIntyre
      Posted Feb 19, 2009 at 6:11 PM | Permalink

      Re: Euan MacDonald (#76),

thanks for the kind words. I’m probably somewhat unusual in this debate in that I have considerable hands-on experience with administrative law, and this perspective definitely informs much of the commentary here.

Characterizing “journal peer review” as a form of “due diligence” places this activity in a broader sociological context that academics don’t seem to have. I sometimes feel like an anthropologist stumbling across a tribe that has no concept of how things are done in the wider world.

  52. Steve McIntyre
    Posted Feb 19, 2009 at 6:03 PM | Permalink

    I do not regard the following statement as being, in any sense, “full, true and plain disclosure” of the disappearance of the HS shape without the bristlecones.

    Positive calibration/variance scores for the NH series cannot be obtained if this indicator is removed from the network of 12 (in contrast with post-AD 1400 reconstructions for which a variety of indicators are available which correlate against the instrumental record).

    First, the statement is untrue about the AD1400 “contrast”. The problem also occurs with the AD1400 network, as we observed in our articles.

    Second, erosion of calibration/verification scores can occur without the HS disappearing. They also erode if there is, for example, a super HS that overshoots 19th century temperatures. Thus the MBH99 language does not undo the prior assertion:

    the long-term trend in NH is relatively robust to the inclusion of dendroclimatic indicators in the network, suggesting that potential tree growth trend biases are not influential in the multiproxy climate reconstructions.

    Nor does it plainly disclose the disappearance of the HS. To undo that assertion and plainly report that the HS disappeared with the removal of bristlecones, they would have had to make a much more straightforward statement of their results, something along the lines of the following.

    Our previous claim that “the long-term trend in NH is relatively robust to the inclusion of dendroclimatic indicators in the network” was incorrect. No long-term trend was observed in a reconstruction carried out with bristlecones removed. Accordingly, the issue of whether “potential tree growth trend biases” are or are not influential in the multiproxy climate reconstructions remains outstanding.

That’s the sort of thing that you would have to have in a mining speculation, and I see no reason why climate scientists should be permitted more cunning disclosure standards than Vancouver stock promoters. If they had made that sort of statement, everyone would have understood it. With hindsight, one can look at the MBH99 statement and see that it provides evidence that they knew about the problem, but, in my opinion, it falls short of disclosing the disappearance of the HS, the point at issue in the new article (which, by the way, I did not see prior to publication).

    I would probably have suggested a little more precise language on this point to pre-empt the sort of issue that you raise here, but I think that this language is much more defensible than the assertions in MBH98 and Mann et al 2000 and elsewhere that don’t seem to bother you.
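    As a concrete gloss on the verification-score point above: the following minimal sketch (in Python, with invented numbers; the re_score function is an invented illustration, not MBH code) shows how a Reduction of Error statistic can fail both for a trendless reconstruction and for a “super HS” that overshoots the observed warming, so a failing score by itself does not disclose which way the shape went.

      # Illustrative sketch only: RE = 1 - SSE(recon) / SSE(reference mean).
      import numpy as np

      def re_score(obs, recon):
          # Reduction of Error against the verification-period mean benchmark
          return 1.0 - np.sum((obs - recon) ** 2) / np.sum((obs - obs.mean()) ** 2)

      obs = np.linspace(0.0, 0.5, 50)        # hypothetical observed warming
      flat = np.zeros(50)                    # reconstruction with no long-term trend
      overshoot = np.linspace(0.0, 1.5, 50)  # "super HS" overshooting the warming

      print(re_score(obs, flat))       # strongly negative: fails verification
      print(re_score(obs, overshoot))  # also strongly negative: fails verification

    Both reconstructions fail verification for opposite shapes, which is the point: eroded calibration/verification scores alone do not reveal that the HS disappeared.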

    • Simon Evans
      Posted Feb 19, 2009 at 6:45 PM | Permalink

      Re: Steve McIntyre (#77),

      Steve,

      Thank you for your further comments. I didn’t say that assertions in MBH98 & Mann et al 2000 don’t bother me – I seem to be bothered by everything, including the wording of papers written ten years ago. I am also bothered by published assertions of misrepresentation, which I personally feel should be reserved for judicious consideration with proper representation. That’s my view – perhaps it is naive. I wish that I could feel that the discussion of science that is relevant to all of us was less muddied by personal accusations, regardless of whichever ‘side’ is right or wrong. But I have said my piece and will leave you to it.

      Good night 🙂

    • Posted Feb 20, 2009 at 1:25 PM | Permalink

Re: Steve McIntyre (#77), The struggle with these guys often comes down to this. If the average of test drill holes shows a significant indication of ore, then it is strictly accurate to say so, even if only one hole had the indication. (Trying to parallel bristlecones here.) And as my experiences with reviewers show, they are inclined to let it by, as after all it is not *wrong*.

It is also robust to a specific procedure of averaging after removing one hole. But you have to specify the procedure for estimating robustness more precisely to show it is not robust. Surely there is a standard methodology that captures the misleading nature of such claims.
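      A minimal sketch of this drill-hole point, with made-up grades (nothing here comes from real assay data): a leave-one-out pass is one simple, standard way to expose the kind of non-robust average being described.

        # Hypothetical ore grades; the last hole is the lone rich one.
        import numpy as np

        grades = np.array([0.1, 0.2, 0.15, 0.1, 12.0])
        print("average of all holes:", grades.mean())  # looks like ore

        # Leave-one-out: drop each hole in turn and recompute the average.
        for i in range(len(grades)):
            print(f"without hole {i}: {np.delete(grades, i).mean():.3f}")
        # Dropping the one rich hole collapses the average -- the analogue
        # of removing the bristlecones from the proxy network.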

      • Geoff Sherrington
        Posted Feb 20, 2009 at 5:44 PM | Permalink

        Re: David Stockwell (#82),

Very appropriate analogy, David, especially the “robust” residual. In mining here, at least, a deterrent to this type of manipulation is that part or all of the drill core/chips, after analysis, is required by law to be kept for possible further examination.

        Which brings us precisely back to Steve’s early point about archiving of authors’ data and its availability. Here, for mining, statutory bodies can seize and examine the kept material and make criminal charges as appropriate. Perhaps the publishing sector needs an overseer who can seize data and expose manipulation.

  53. Geoff Sherrington
    Posted Feb 20, 2009 at 5:45 AM | Permalink

    Slight divergence, but the Chairman of the Australian Conservation Foundation and vocal green activist has been appointed to chair an advisory committee on radiation safety to the Australian peak body on nuclear, ARPANSA.

    Part of his c.v. from http://www.icmi.com.au/Speaker/Technology_Future/Professor_Ian_Lowe/Biography

    Professor Lowe has been a referee for the Inter-Governmental Panel on Climate Change, and he personally attended the groundbreaking Geneva and Kyoto conferences on Climate Change, which resulted in the ratification of the Kyoto Protocol. Ian Lowe was also a member of the Australian delegation to the 1999 UNESCO World Conference on Science. He was on the steering group for the UNEP project Global Environmental Outlook, an invited participant in the 2000 workshop on Sustainability Science and a referee for both the International Geosphere-Biosphere Program’s 2004 book on planetary science and the UN’s Millennium Assessment Report.

This appointment raises a conflict of interest of the type discussed when appointing reviewers of papers. If such a person were a Judge, he might have the training and prudence to recuse himself.

    The lack of good manners is not confined to the Northern Hemisphere.

  54. jae
    Posted Feb 20, 2009 at 5:43 PM | Permalink

    Nuts, substitute “relate to” for “change.”

  55. Curt Covey
    Posted Feb 20, 2009 at 10:55 PM | Permalink

    I believe there’s a difference between sharing data and methods, and subjecting every noteworthy publication in climate science to an audit in the style of the US Internal Revenue Service. For example, suppose you want to investigate in depth the way that the leading US climate model predicts global warming. You can download the source code from ccsm.ucar.edu. You can read it to see what assumptions the model makes. You can change the assumptions and run the model on just about any computer. But none of this is easy! The model developers will provide documentation and answer occasional questions, but they will not devote themselves to the task of debugging and running your version of the model on your computer.

    I think this is the way most cutting edge science works — indeed the way it needs to work. Engineering is apparently a different story.

    • Craig Loehle
      Posted Feb 24, 2009 at 3:29 PM | Permalink

      Re: Curt Covey (#87), No one is necessarily asking the authors of the papers in question to spend much time on this. In many cases the data being audited at CA is simply flat files (think tree ring data) easily listed in Text or CSV format. Overly much is made of the difficulty of making code available. There are a few cases where this may be true, but if your code is a mishmash of Fortran, Python, and scripts that is compiler dependent, you need help anyway (and you know who you are). Any “major” scientific discovery (cloning, a cancer treatment, a genetic discovery, a new atomic particle) sets off a race among competitors to replicate the work so they can make use of it and be at the forefront–and to comment on failed replication. It escapes me why there should be an exception for climate science.

      • Simon Evans
        Posted Feb 24, 2009 at 5:13 PM | Permalink

        Re: Craig Loehle (#92),

Do you think that Wegman should have made fully available the work upon which he based his criticisms of MBH 98/99? I do – why didn’t he? Are people here as concerned about unsatisfactory practice on either ‘side’, or is it a matter of double standards?

        • bender
          Posted Feb 24, 2009 at 5:59 PM | Permalink

          Re: Simon Evans (#93),
          Why don’t you write to Wegman and ask for his code?

        • Simon Evans
          Posted Feb 24, 2009 at 6:18 PM | Permalink

          Re: bender (#94),

          Why don’t you write to Wegman and ask for his code?

David Ritson did write to Wegman and received no response, AFAIAA. See here:

(PDF: RitsonWegmanRequests.pdf)

          You want me to repeat the exercise, after all these years? I don’t think so.

          So what do you think? Was it ok for Wegman not to have responded? Are you an objective commentator who would like to see standards applied evenly or are you simply interested in referencing such ‘standards’ when it suits you?

        • Jeff Alberts
          Posted Feb 24, 2009 at 8:52 PM | Permalink

          Re: Simon Evans (#96),

          So what do you think? Was it ok for Wegman not to have responded? Are you an objective commentator who would like to see standards applied evenly or are you simply interested in referencing such ‘standards’ when it suits you?

Yes, it was wrong for Wegman not to have responded.

          So, was it wrong for Mann, and many others, not to have responded?

          For sure, I agree. I just wonder why we’re still talking about ten-year-old papers, as if that were a guide to what we should be talking about now. I don’t think science is served by the soap opera of who has said what and so on before. I think we should be looking at current work now, on its own merits, and without prejudice. What do you think?

Because said papers are still widely quoted as “independently” validated, when in fact no such thing has occurred. So since we should be “looking at current work now”, what’s your cutoff? 1 month? 6 months? A year? 5 years? Pretty arbitrary, isn’t it? And does that mean we can safely ignore non-current papers?

        • mondo
          Posted Feb 24, 2009 at 6:12 PM | Permalink

          Re: Simon Evans (#93), Simon,

          Are people as concerned here about unsatisfactory practice on either ‘side’, or is it a matter of double standards?

It isn’t really a matter of sides. The simple fact is that if a party chooses to make a public statement, especially one that could have massive policy and economic implications, they should be able to demonstrate the truth of that statement by providing whatever backup is appropriate; only then should the statement be taken seriously.

          This applies to anyone who chooses to weigh into the discussion. It applies to Steve McIntyre, Ross McKitrick, Roy Spencer, Jeff ID, Ryan O, Wegman just as much as it does to Al Gore, James Hansen, Michael Mann et al.

What happens, though, is that certain parties have chosen, on the one hand, to make substantive public statements while on the other hand acknowledging that, in their view, the ‘problem’ is so serious that they feel justified in exaggerating to make their point (Gore and Schneider quotes are readily available on this). It is hardly surprising that when they are challenged to verify the truth of their statements by providing backup data, methods, code, or whatever else would verify those statements, and the response is – let’s say – uncooperative, others might be motivated to dig into the work underlying those statements.

          You would have to acknowledge that many of the statements made haven’t stood up too well to examination. I think that you would find that anybody who made statements that could not then withstand scrutiny would soon be subject to intense questioning. You might ask, why is it that Steve McIntyre doesn’t seem to be challenged too often with clear and direct questions regarding his work? Perusal of this site makes the answer to that pretty obvious.

          As to Wegman, he should be as prepared as anybody else to demonstrate the truth of his statements, especially to people capable of understanding and challenging his work. So the right answer is, as bender suggests, to ask Wegman yourself.

        • Simon Evans
          Posted Feb 24, 2009 at 6:30 PM | Permalink

          Re: mondo (#95),

          As to Wegman, he should be as prepared as anybody else to demonstrate the truth of his statements, especially to people capable of understanding and challenging his work. So the right answer is, as bender suggests, to ask Wegman yourself.

What is the point, mondo? The history is that he didn’t answer basic questions about his analysis already. This has been well known for a long time but, of course, those who wish to reference Wegman’s judgments against Mann have seemingly conveniently forgotten it. Wegman has answered to nothing. If you were truly interested in ‘auditing’ then you should be jumping on that, rather than suggesting that I pursue a lost cause. But then, perhaps, you are not truly interested in that, but only in attacking those whom you wish to attack, rather than having the objectivity also to criticise those who support that attack.

        • Steve McIntyre
          Posted Feb 24, 2009 at 6:25 PM | Permalink

          Re: Simon Evans (#93),

          I commented on this matter in August 2006 here.

          I agree 100% that standards should be consistent. In Wegman’s place, I think that he should have buttoned up his code so that it could be examined. I would have supplied the relevant data or code and I think that he should have done so as well. However, Wegman’s regrettable failure to do so does not justify Team obstruction from now until the end of time.

          At the time, given Mann’s then track record of refusals and obstruction – not to speak of Lonnie Thompson, Phil Jones, … – in some cases stretching then over years, it seemed cheeky, to say the least, for realclimate to be dumping on Wegman for not immediately responding to Ritson. These were the people who had had a long history of obstruction and now they were complaining.

          I know that his group actually ran the code that I had archived. At the time, I was placing the code online as a reference but hadn’t developed the turnkey approach that I try to use now. They got stuck at a couple of points where I referred to internal directories; the points were small and I clarified them, as I’ve done online here from time to time. In contrast, the Mann code supplied to the Committee was incomplete and inoperable, as we discussed here in 2005.

        • Simon Evans
          Posted Feb 24, 2009 at 6:37 PM | Permalink

          Re: Steve McIntyre (#97),

          Steve,

          Are you aware of Wegman having responded since? I am not. Three weeks is now several years!

          I couldn’t care less about ‘track record’. The fact is, to me, that people should not be referencing Wegman’s ‘unaudited’ criticisms of Mann et al, whilst at the same time continuing to criticise Mann et al for inadequate disclosure, without showing the intellectual honesty to recognise that the lack of disclosure has cut both ways.

        • Steve McIntyre
          Posted Feb 24, 2009 at 6:54 PM | Permalink

          Re: Simon Evans (#99),

          I’m not sure what you’re disagreeing with in my comment. I said:

          I agree 100% that standards should be consistent. In Wegman’s place, I think that he should have buttoned up his code so that it could be examined. I would have supplied the relevant data or code and I think that he should have done so as well.

          What part of that do you disagree with? I think that my position is entirely consistent.

          As to prohibiting people from referencing reports that do not archive their data and code, as you suggest, I’m OK with that. If people argue that the Wegman Report doesn’t meet those standards and therefore should not be cited, so be it. But they have to apply the same standards consistently. It’s not as though the NAS panel met these standards either or the studies used in IPCC.

        • Simon Evans
          Posted Feb 24, 2009 at 6:59 PM | Permalink

          Re: Steve McIntyre (#101),

          Ok, I agree with that.

        • Steve McIntyre
          Posted Feb 24, 2009 at 6:41 PM | Permalink

          Re: Steve McIntyre (#97),

          further to this post, in my August comment on the Ritson request then outstanding for 3 weeks, I observed:

          I guess it’s time to re-submit a request to Mann for the actual stepwise results from MBH, how he calculated the confidence intervals, how he retained principal components…

          These three requests still remain outstanding. No one knows how Mann calculated the confidence intervals. Jerry North said that he would ask, but later said it was time to “move on”. Again, I think that Wegman should have buttoned everything up at the time, but, for Mann to be complaining… puh-leeze.

        • Simon Evans
          Posted Feb 24, 2009 at 6:57 PM | Permalink

          Re: Steve McIntyre (#100),

But Steve, this is not Mann complaining, this is ‘little me’, pointing out that Wegman, whilst criticising Mann for not being open, was not open himself. Ok, I’ve picked that up via Mann – so what? The facts remain – and no amount of complaining about Mann not being open changes the fact of Wegman apparently being hypocritical. If you discount Mann’s work because of your view that he has not been open then you should discount Wegman’s judgment and, frankly, I think you should do it with just as much vigour and regularity. That is, if you wish to be convincingly ‘objective’. I recognise that you have criticised Wegman in your responses. However, I do not recognise that you have been as quick to do that, nor as ready, as you have been, over many years now, to criticise the very same limitations (in your view) regarding Mann’s preparedness to disclose.

        • Steve McIntyre
          Posted Feb 24, 2009 at 7:14 PM | Permalink

          Re: Simon Evans (#102),

          If you discount Mann’s work because of your view that he has not been open then you should discount Wegman’s judgment

          I don’t “discount” Mann’s work because he has not been “open” though I object to the lack of openness.

          My issues are on the merits. The obstruction is a nuisance and a defect, but I’ve never argued that the work is without merit on that account. That doesn’t mean that I can’t simultaneously argue that scientists should do a bang-up job of archiving data and code.

And FWIW I’ve also made a point of saying that Mann’s record is not bad relative to some others in the field. I’ve made a point of saying that Mann et al 2008 made an honest effort to show their work – it wasn’t A-grade, but they tried, and I’ve clearly stated that.

        • Simon Evans
          Posted Feb 24, 2009 at 7:35 PM | Permalink

          Re: Steve McIntyre (#105),

          Steve,

And FWIW I’ve also made a point of saying that Mann’s record is not bad relative to some others in the field. I’ve made a point of saying that Mann et al 2008 made an honest effort to show their work – it wasn’t A-grade, but they tried, and I’ve clearly stated that.

          Well, good. I guess that your pressure has added to that improvement, so that’s good.

          I’d be more persuaded of your own objectivity if I felt aware of your being equivalently critical (in terms of the space you devote) of what I will term ‘anti-AGW’ presentations. I’m sure you can reference critical comments on Loehle, for example. But it is entirely obvious that your energies are focused negatively upon presentations which are supportive of the ‘AGW’ case. That’s ok in itself, if you wish to be a partisan (which is fine – I don’t object to people being honest partisans for one side or the other). Reviewing your blog, however, it’s an impossible stretch to think that you’re ‘auditing’ the matter with disinterest.

        • Craig Loehle
          Posted Feb 24, 2009 at 8:02 PM | Permalink

Re: Simon Evans (#107), Simon, we have been here before (see Jon and Walt et al recently). Everyone wants Steve M. to be a saint if he is going to critique anything. I’m not aware of any rule about that anywhere. Whether he is auditing with “disinterest” or not, everything he does is out in the open, and he fixes any mistakes people point out. Someone who has an “interest” in finding a cure for cancer (or whatever) can nevertheless be objective if he puts the data first (as Feynman said). Pure disinterestedness does not exist. Every time someone does a study they hope to find something, good, bad, or important. BUT a good scientist doesn’t let himself be fooled by what he wishes to see.

By the way, much of Wegman’s critique was about statistical method and the lack of statisticians on climate papers, and was not an original study. The code for the network of who co-authors with whom was an ancillary point, IMO.

        • Simon Evans
          Posted Feb 24, 2009 at 8:14 PM | Permalink

          Re: Craig Loehle (#108),

          Craig,

          Every time someone does a study they hope to find something, good, bad, or important. BUT a good scientist doesn’t let himself be fooled by what he wishes to see.

          For sure, I agree. I just wonder why we’re still talking about ten-year-old papers, as if that were a guide to what we should be talking about now. I don’t think science is served by the soap opera of who has said what and so on before. I think we should be looking at current work now, on its own merits, and without prejudice. What do you think?

        • Dave Dardinger
          Posted Feb 24, 2009 at 9:33 PM | Permalink

          Re: Simon Evans (#109),

So tell me, Simon, what exactly in the Wegman Report requires back-up? As I read it, he basically said, “I tried replicating Mann’s work and couldn’t because the code and data weren’t there. I tried replicating M&M’s work and could easily show that the Mannomatic created hockey sticks from (white or red, whichever) noise.”

Now the methods M&M used are right here and can be replicated by anyone who wants. While having the exact code Wegman used might be nice, it’s not necessary, since the code he was replicating is available and does indeed replicate M&M’s work. Now, if you doubt this, feel free to try replicating M&M’s work, or even ask someone here to help you do it. Unlike at Mann-world, you won’t lack for help to get you over the rough spots. Once you’ve done so, you merely need to learn what the finding that hockey sticks are created from noise means. In essence it means that the Mannomatic is worthless for creating multi-century temperature proxies. Both Wegman and the NAS panel agreed on this. The only difference is that the NAS panel thought there had been “independent” verifications of Mann that didn’t use the results of the Mannomatic. This is not the case, though as was pointed out here at the time, it was probably as much as could be expected given the political realities.

Now it’s nice to audit the auditor, but the situation isn’t quite the same as an original audit. An auditor presents findings in a signed document bearing the auditor’s guarantee. If a problem arises with something an auditor signs off on, then it should be taken to the responsible party (a regulatory body, etc.) or a court of law. AFAIK, nobody has found that what Wegman said is incorrect.
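           As a gloss on the “hockey sticks from noise” claim: the sketch below is a rough reimplementation for illustration only (not Wegman’s or M&M’s actual code; all parameters are invented). “Short-centred” PCA subtracts the mean of only the late calibration segment, so series whose late values happen to drift away from their long-term mean get heavy weight, and the first PC of plain red noise tends toward a hockey-stick shape.

             # Toy demonstration of short-centred PCA on AR(1) "red noise".
             import numpy as np

             rng = np.random.default_rng(0)
             n_series, n_years, cal = 70, 600, 100

             def ar1(n, rho=0.9):
                 # simple AR(1) red-noise series
                 x = np.zeros(n)
                 for t in range(1, n):
                     x[t] = rho * x[t - 1] + rng.normal()
                 return x

             proxies = np.column_stack([ar1(n_years) for _ in range(n_series)])

             def first_pc(data, short=False):
                 # full centring subtracts each series' long-term mean;
                 # short centring subtracts only the mean of the last `cal` years
                 mean = data[-cal:].mean(axis=0) if short else data.mean(axis=0)
                 u, s, _ = np.linalg.svd(data - mean, full_matrices=False)
                 return u[:, 0] * s[0]

             def hs_index(pc):
                 # crude "blade" measure: late-segment departure relative to spread
                 return abs(pc[-cal:].mean() - pc.mean()) / pc.std()

             print("full centring :", hs_index(first_pc(proxies)))
             print("short centring:", hs_index(first_pc(proxies, short=True)))

           On pure noise the short-centred index typically comes out several times larger than the fully centred one, which is the effect being described.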

        • Steve McIntyre
          Posted Feb 24, 2009 at 10:51 PM | Permalink

          Re: Dave Dardinger (#113),

          Wegman said:

          We have been to Michael Mann’s University of Virginia website and downloaded the materials there. Unfortunately, we did not find adequate material to reproduce the MBH98 materials.

          We have been able to reproduce the results of McIntyre and McKitrick (2005b). While at first the McIntyre code was specific to the file structure of his computer, with his assistance we were able to run the code on our own machines and reproduce and extend some of his results.

        • Steve McIntyre
          Posted Feb 24, 2009 at 9:57 PM | Permalink

          Re: Simon Evans (#107),

          Simon, as I’ve said on many occasions, I’m interested in mainstream science used by IPCC.

I think that it should be evident that I like a crossfire. I would be quite pleased if Gavin Schmidt or whoever posted critical analyses of the type you desire here. I’ve provided passwords to some people who are severely critical of me, but thus far this hasn’t resulted in takers. But it hasn’t been because I didn’t offer.

The question is not why I’m not doing more than I already am – I’m already doing an unreasonable amount of work.

        • theduke
          Posted Feb 24, 2009 at 10:41 PM | Permalink

          Re: Simon Evans (#102),

          Bogus analogy. snip – policy If Mann was so incensed by Wegman’s findings, he could have done what Steve does so well: he could have AUDITED them, without the benefit of codes and data.

        • Gerald Machnee
          Posted Feb 25, 2009 at 8:34 AM | Permalink

          Re: Simon Evans (#102),
snip Wegman did not produce a new study. He was trying to replicate the work of others. You could have tried the same with appropriate references. All you are doing here is attacking the messenger, because there is a very low probability that you are able to match his work. It is not surprising that Wegman did not respond. If everyone in the past had archived their code, it is likely Wegman would have done so as well. However, he did what he was asked to do – review Mann.

      • Curt Covey
        Posted Feb 25, 2009 at 1:49 PM | Permalink

        Re: Craig Loehle (#92) and also Comment 124 above, I think we all agree that peer-reviewed publication is not enough; “due diligence” and “replication” are essential. The issue is how precise and complete the replication must be. I do not believe cutting-edge science allows the same ease and precision of replication as engineering work involving well-practiced techniques. This has long been the case with laboratory experimentation (Rutherford’s lab was infamous for its jury-rigged apparatus) and is equally true of computer modeling. Codes are indeed “a mishmash of Fortran, Python, and scripts.” Nevertheless the basic results of climate models, such as a noticeable increase in surface temperature from human-produced carbon dioxide, have been replicated dozens of times. The different codes written by different groups may all be wrong but they are fundamentally consistent.

        • Craig Loehle
          Posted Feb 25, 2009 at 3:13 PM | Permalink

Re: Curt Covey (#125), It hardly seems “fundamentally consistent” when the trajectories of global temperature for the models make a spaghetti graph and the models disagree violently about arctic weather (for example). BUT it is one thing to say they agree (qualitatively, let us say); it is quite another to base policy on that agreement. The preliminary nature of most discoveries is certainly true, but you don’t build an airplane with preliminary discoveries.

        • stan
          Posted Feb 25, 2009 at 3:14 PM | Permalink

          Re: Curt Covey (#125),

          But the assumption that they can use their hindcasts to make accurate forecasts is not supported by anything other than their own faith. That ain’t science, no matter how many of them partake of the cup.

        • Mark T
          Posted Feb 25, 2009 at 3:50 PM | Permalink

          Re: Curt Covey (#125),

          The different codes written by different groups may all be wrong but they are fundamentally consistent.

The ones they use are (seemingly) fundamentally consistent, but that doesn’t mean they don’t intentionally throw out the ones that aren’t. Furthermore, if you listen to Gavin much, just about any result in the actual climate is “consistent” with the model runs, so it stands to reason that just about any outcome can be construed as consistent with another; i.e., in this context, the phrase “fundamentally consistent” when applied to various models could simply mean “didn’t crash.”

          Mark

        • Craig Loehle
          Posted Feb 25, 2009 at 5:16 PM | Permalink

          Re: Mark T (#128), Any altitude between 0 and 60,000 feet will be “consistent with” our planned flight today. Would you like some peanuts?

        • Mark T
          Posted Feb 25, 2009 at 5:32 PM | Permalink

          Re: Craig Loehle (#129), Yes, yes, they can actually claim my flight made it to Orlando in spite of taking an early dirt nap somewhere near Atlanta. I should add that this is also within the margin of error of the projection.

          Mark

  56. mondo
    Posted Feb 23, 2009 at 10:47 PM | Permalink

    Curt,

    Sorry to bang on about this, but to me at least, the real point about replication is that the truth of statements made can be verified and confirmed.

    The parallels in the engineering world and the commercial world, while different in detail, are designed so that interested parties can verify that what is claimed is in fact true.

I had thought that similar logic underlies the requirement for replication in science. However, for some reason, some climate scientists do not seem to consider it necessary to provide any confirmation that what they are saying is in fact true, and can be verified as true.

    Given the massive implications for policy and the world economy, surely the least we can ask is that those making public statements on these important matters can in fact verify that the statements are true.

This approach also highlights the failure of much of the MSM (with the honourable exceptions of Christopher Booker, Lawrence Solomon, George Will and Andrew Bolt – and maybe a few others) to ask the hard questions: “Is this statement actually true?” and “How can I establish whether or not it is true?”

The great value of the work being done by Steve McIntyre and many participants at CA (and other sites like WUWT, Air Vent, Blackboard) is that when people have sought to establish whether statements are true, they have been able to demonstrate, in an embarrassingly large number of instances now, that the statements are NOT true.

    Given that, it is perhaps understandable that certain climate scientists are not all that willing to expose their work to scrutiny.

    • stan
      Posted Feb 24, 2009 at 8:16 AM | Permalink

      Re: mondo (#88),

      Agreed with one slight modification. The point about policy is not that the studies “can” be replicated (as in “are capable of replication if some scientist chooses to do so”). Policy decisions need to be based on studies that actually have been replicated often enough that we have a solid foundation for being convinced that they represent the best information available.

    • Mark T
      Posted Feb 24, 2009 at 10:08 AM | Permalink

      Re: mondo (#88),

      The parallels in the engineering world and the commercial world, while different in detail, are designed so that interested parties can verify that what is claimed is in fact true.

      This is why perpetual motion machines, and their inventors, never get any traction nor do they ever sell any products in the real world. Numberwatch has a bit on one such outfit this month.

      Mark

  57. bernie
    Posted Feb 24, 2009 at 7:57 AM | Permalink

Slightly off thread, but sometime poster paddikj mentioned on another pro-AGW site a book by Josephine Tey, The Daughter of Time. I mention it here because it traces the deconstructing of a grand historical “fact” – Richard III’s murdering of the two young princes in the Tower. It is a great, easy read and a great introduction, IMHO, to what this site and this paper are all about. Tey’s book and Aaron Wildavsky’s But Is It True? are great handbooks and justifications for checking the facts before accepting things.

  58. mondo
    Posted Feb 24, 2009 at 7:11 PM | Permalink

    Simon,

    Just to be clear. Do you agree that people making public pronouncements on anything at all should be prepared to provide detailed back up of whatever kind is appropriate to demonstrate the truth of the statement made?

    • Simon Evans
      Posted Feb 24, 2009 at 7:20 PM | Permalink

Re: mondo (#104),

      mondo,

      Yes, I do agree that should be the way. I don’t think it has been so clearly the way in the past, but I think that is changing for the better. I would be satisfied to see the criticism applied even-handedly, and not to think that it was just being applied in a partisan manner.

  59. stan
    Posted Feb 24, 2009 at 9:02 PM | Permalink

    Wow Simon!

    If you beat that false equivalence drum any harder you’re liable to get a sore arm.

  60. curious
    Posted Feb 24, 2009 at 9:04 PM | Permalink

    Hi Simon – I think the current lead post on CA is on a paper published this January? As far as discussion of 10 year old papers goes I think you came in above at comment 46 asking after the treatment of MBH99?

FWIW, my tuppence – I’ve seen some of the points you’ve made and you’ve highlighted new info to me, so thank you. To answer your question above: from the perspective of a lay person trying to get a view of where things are at, it is good to have a source of info on the work up to this point in time. There have been significant past errors, and it is the work of critics to expose these.

In the present paper, at the most fundamental level, it turns out there were errors in the base data, and whichever “independent” source found them, I think it is good that they did so. To repeat a comment made elsewhere: I cannot understand how, with the global significance attached to this issue, a paper can go to press with world PR coverage and within days be shown not to have had the most basic data checks done.

    So, to ask you a question – as the post is on due diligence – do you think that is good enough?

  61. bender
    Posted Feb 24, 2009 at 10:39 PM | Permalink

    Simon Evans,
Fact is, I’ve no reason to doubt anything Wegman does or says – and I challenge you to give me one. I would welcome it. Unfortunately, in the case of Dr. Mann there is no shortage of examples leading me to doubt most everything he does and says. Where do I start … ?

  62. Simon Evans
    Posted Feb 25, 2009 at 9:46 AM | Permalink

    I realise I’ve posted quite enough on this thread, so these responses are meant as courtesies and then I’ll rest my views –

    Re: curious (#112),

    curious,

    Yes, I realise that I’m discussing ten-year-old stuff, but that’s because this thread started on the McCullough and McKitrick paper, and my entry to the discussion was concerning its statements as to MBH 98/99. To answer your question on due diligence, AFAIAA the errors in raw data were not particular to the Steig paper. I presume they’d been in the system and had been part and parcel of whatever analyses had supported the generalised ‘Antarctica cooling’ perception (if I’m wrong then I’m sure I’ll be corrected). I agree it would have been better had Steig et al found the errors before publication, but they’ve been spotted now and accepted, so that’s proper process at work, IMV.

    Re: Jeff Alberts (#110),

Yes, it was wrong for Wegman not to have responded.
    So, was it wrong for Mann, and many others, not to have responded?

    I think so, yes, though I’m not entirely sure in either case.

    Re: Steve McIntyre (#114),

    Steve,

    I would be quite pleased if Gavin Schmidt or whoever posted critical analyses of the type you desire here. I’ve provided passwords to some people who are severely critical of me, but thus far this hasn’t resulted in takers.

    Ok, I didn’t know that. I think that could be a good thing. I wish there were a ‘climate site’ where the balance of debate was rather more even (btw, I’ve only posted a couple of comments on RC, partly since I’m not sure I’m comfortable with their moderation approach), though I’m not saying it’s your responsibility to ensure that!

    Re: Gerald Machnee (#118),

    All you are doing here is attacking the messenger because there is a very low probability that you are able to match his work.

    There’s no probability at all – it’s not my field. I don’t think that all I’m doing is making an attack, though evidently I’ve given you that impression, which I regret.

    Anyway, moving on…….

  63. Craig Loehle
    Posted Feb 25, 2009 at 11:08 AM | Permalink

In defence of Ross “dwelling on 10 year old studies” instead of moving on: for the paper discussed on this thread, enough time needed to elapse for things to play out. It would not be possible, for example, to discuss due diligence yet for the Steig paper. Correct me if I’m wrong here, Ross.

  64. Ross McKitrick
    Posted Feb 25, 2009 at 11:43 AM | Permalink

    Craig, The hockey stick is still current as long as governments continue to promote it and the IPCC continues to defend it (i.e. by their petty, grudging and argumentative battles over the Ch 6 text). By these actions they say it’s important to them, so the story of its downfall must continue to be told. As a scientific matter I don’t think it’s important–and I never have–because I don’t like the calibration algorithm and I think properly-estimated error bars would be infinitely large. Which reminds me, Simon, in answer to your question about the IPCC 1990 graph, I don’t advocate that one either. What I have said is that prior to the hockey stick, that was the conventional view. It said so in the IPCC 1990 report. When David Deming got that email telling him “We have to get rid of the medieval warm period” they didn’t mean in Central England! They were talking about the global reconstructions, or at least the NH.

For the Steig paper, the issue for me is that if this paper matters to a policymaker (and please understand that McCullough and McKitrick is concerned with policy-relevant research, not ordinary run-of-the-mill academic doodling), then the replication process needs to operate on a timetable relevant to the policymaking process. Whether replication is successful or not, it’s useless if it only happens 5 years after the laws were voted on.

    If Steig, or any other paper by any other author, matters to someone who is going to be voting on some important policy, I’d like there to be a process whereby the policymaker can hand the paper to an office, something like the CBO in the US, and say: Here, see if this paper can pass all the items on the replication checklist. Then the replicator sends a letter to the author asking for the data and the code, and cooperation in reproducing the results, while pointing out that the author is not obliged to disclose data or code, but a report will be sent to Congress anyway and if one or more items on the checklist fails it may carry a recommendation that the study should not be relied on for policy purposes. Then the onus goes on the scientist: you want to influence policy, fine, but first there’s some due diligence. If you don’t want to go through it you don’t have to. But then we won’t be using your results for guiding policy.
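    To picture the kind of office Ross proposes, here is a toy sketch (the checklist items and the due_diligence_report function are invented purely for illustration; nothing here comes from the paper):

      # Hypothetical replication checklist of the kind a CBO-style office
      # might run before a study is relied on for policy.
      CHECKLIST = {
          "data archived":      lambda p: p.get("data_url") is not None,
          "code archived":      lambda p: p.get("code_url") is not None,
          "results reproduced": lambda p: p.get("reproduced", False),
      }

      def due_diligence_report(paper):
          results = {item: check(paper) for item, check in CHECKLIST.items()}
          verdict = "may be relied on" if all(results.values()) else \
                    "not recommended for policy use"
          return results, verdict

      # Example: the author declined to archive data; the report goes to
      # the policymaker regardless, with a recommendation attached.
      paper = {"title": "Example study", "code_url": "http://example.org/code"}
      print(due_diligence_report(paper))

    The onus then sits where Ross puts it: disclosure stays optional, but so does being relied on for policy.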

    • Jeff Alberts
      Posted Feb 25, 2009 at 12:39 PM | Permalink

      Re: Ross McKitrick (#121),

      By these actions they say it’s important to them, so the story of its downfall must continue to be told.

      IMHO, climate scientists don’t seem to consider it downfallen, but instead continue to publish work based directly upon it. This is a problem.

  65. curious
    Posted Feb 25, 2009 at 12:26 PM | Permalink

Hi Ross above – all sounds good, and the point about timescales is IMO spot on. But don’t you think that if it has been publicly funded research there should be an obligation /full stop/ to provide code and data to agreed standards? I can’t see a counter argument, given that the work has had to be done anyway and the public sector is not for profit – or am I missing some IPR-type argument?

  66. James Chamberlain
    Posted Feb 25, 2009 at 1:16 PM | Permalink

Unfortunately, peer review is currently treated as the due diligence, which is a big mistake.

  67. mondo
    Posted Feb 25, 2009 at 5:21 PM | Permalink

    Curt,

I think that you are getting into too much detail. The question we need to have answered by climate scientists is, “How can we be sure that what you are saying can in fact be demonstrated to be true?”

    Whatever it takes.

  68. Curt Covey
    Posted Feb 25, 2009 at 7:57 PM | Permalink

    Finishing my comments on this thread with a summing-up, I agree that there is plenty of room for improvement in climate models, but I disagree that significant improvements can or should come easily.

I will close on a note of optimism. The level of model “transparency” described in my initial comments is not all one could wish for, but it is a lot better than it was 25 years ago when I started in the business. Back then, the primordial ancestor of the CCSM could only be run on one computer (at the US National Center for Atmospheric Research). It also had problems that today would be described as unacceptable: Antarctica was so dry that its humidity was below zero! Improvements do happen, albeit at a frustratingly slow pace.

    • Craig Loehle
      Posted Feb 26, 2009 at 7:16 AM | Permalink

      Re: Curt Covey (#132), I think everyone here admits the models have gotten better. The question, as noted above, is how good are they? How accurate? What are the error bars on forecasts and can we even put error bars? “Pretty good” just isn’t a very satisfying answer.

One Trackback

  1. By Correlation | alazycowboy.com on Mar 6, 2009 at 5:53 AM

    […] got a good chuckle out of this. But then I find the blog, Climate Audit, and the article by Bruce McCullough and Ross McKitrick entitled, Check the numbers; From the U. S. […]