Gergis et al “Put on Hold”

A few days ago, Joelle Gergis closed her letter refusing data stating:

We will not be entertaining any further correspondence on the matter.

Gergis’ statement seems to have been premature.  David Karoly, the senior author, who had been copied on Gergis’ surly email and who is also known as one of the originators of the “death threat” story, wrote today:

Dear Stephen,

I am contacting you on behalf of all the authors of the Gergis et al (2012) study ‘Evidence of unusual late 20th century warming from an Australasian temperature reconstruction spanning the last millennium’

An issue has been identified in the processing of the data used in the study, which may affect the results. While the paper states that “both proxy climate and instrumental data were linearly detrended over the 1921–1990 period”, we discovered on Tuesday 5 June that the records used in the final analysis were not detrended for proxy selection, making this statement incorrect. Although this is an unfortunate data processing issue, it is likely to have implications for the results reported in the study. The journal has been contacted and the publication of the study has been put on hold.

This is a normal part of science. The testing of scientific studies through independent analysis of data and methods strengthens the conclusions. In this study, an issue has been identified and the results are being re-checked.

We would be grateful if you would post the notice below on your ClimateAudit web site.

We would like to thank you and the participants at the ClimateAudit blog for your scrutiny of our study, which also identified this data processing issue.

Thanks, David Karoly

Print publication of scientific study put on hold

An issue has been identified in the processing of the data used in the study, “Evidence of unusual late 20th century warming from an Australasian temperature reconstruction spanning the last millennium” by Joelle Gergis, Raphael Neukom, Stephen Phipps, Ailie Gallant and David Karoly, accepted for publication in the Journal of Climate.

We are currently reviewing the data and results.

The inconsistency between the replicated correlations and Gergis' claims was first pointed out by Jean S here on June 5 at 4:42 pm blog time. As readers have noted in comments, it's interesting that Karoly says that they had independently discovered this issue on June 5 – a claim that is distinctly, shall we say, Gavinesque (see the Feb 2009 posts on the Mystery Man).

I urge readers not to get too wound up about this, as there are a couple of potential fallback positions. They might still claim to “get” a Stick using the reduced population of proxies that pass their professed test. Alternatively, they might now say that the “right” way of screening is to do so without detrending and “get” a Stick that way. However, they then have to face up to the “Screening Fallacy”. As noted in my earlier post, while this fallacy is understood on critical blogs, it is not understood by real_climate_scientists and I would not be surprised if Gergis et al attempt to revive their article on that basis.

One thing we do know. In my first post on Gergis et al on May 31, I had referred to the Screening Fallacy. The following day (June 1), the issue of screening on de-trended series was discussed in comments. I added the following comment to the main post (responding to comments by Jim Bouldin and others):

Gergis et al 2012 say that their screening is done on de-trended series. This measure might mitigate the screening fallacy – but this is something that would need to be checked carefully. I haven’t yet checked on the other papers in this series.

There was a similar discussion at Bishop Hill. What the present concession means is that my own concession was premature and that the screening actually done by Gergis et al was within the four corners of the Screening Fallacy. However, no concessions have been made on this point.
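The mechanics of the Screening Fallacy can be illustrated with a toy simulation. This is a sketch only, not the Gergis et al data or code: the "instrumental" series and "proxies" below are entirely synthetic, and the random walks carry no temperature signal by construction. Screening by correlation over a trending calibration window nevertheless passes a large fraction of them unless the series are first detrended, as the paper stated was done:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
years = np.arange(1921, 1991)  # the 1921-1990 calibration window

# Synthetic "instrumental" temperature: a linear trend plus white noise.
temp = 0.01 * (years - 1921) + rng.normal(0.0, 0.1, years.size)

def detrend(x):
    # Remove the least-squares linear trend over the window.
    slope, intercept = np.polyfit(years, x, 1)
    return x - (slope * years + intercept)

n_proxies = 1000
raw_pass = detrended_pass = 0
for _ in range(n_proxies):
    # A pure random walk: by construction it contains no temperature signal.
    proxy = np.cumsum(rng.normal(0.0, 0.1, years.size))
    _, p_raw = stats.pearsonr(proxy, temp)
    _, p_det = stats.pearsonr(detrend(proxy), detrend(temp))
    raw_pass += p_raw < 0.05
    detrended_pass += p_det < 0.05

print(f"passed screening without detrending: {raw_pass}/{n_proxies}")
print(f"passed screening after detrending:   {detrended_pass}/{n_proxies}")
```

Far more signal-free random walks survive the undetrended screen than the detrended one, which is why the difference between the stated and actual procedure matters.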

440 Comments

  1. Posted Jun 8, 2012 at 4:05 PM | Permalink

    Thanks, Steve, for your efforts to restore integrity to science.

    With kind regards,
    Oliver K. Manuel
    Former NASA Principal
    Investigator for Apollo
    Emeritus Professor of
    Nuclear/Space Science

    http://www.omatumr.com

  2. Gerald Machnee
    Posted Jun 8, 2012 at 4:09 PM | Permalink

    OK. You have now been named, thanked, and not denigrated. “Progress in Climate”.

    • TerryS
      Posted Jun 8, 2012 at 4:26 PM | Permalink

      Not quite.

      we discovered on Tuesday 5 June

      After months of preparation and going through peer review they found the same error one day before it was published here on Climate Audit.
      Since they independently found this error before Steve or Jean made any mention of it here there is no need to cite Climate Audit in any corrections.

      • James Evans
        Posted Jun 8, 2012 at 4:30 PM | Permalink

        Seems to me that Jean first flagged up the problem on Jun 5, 2012 at 4:42 PM.

        • TerryS
          Posted Jun 8, 2012 at 4:36 PM | Permalink

          The comment of Jean’s that Steve links to in the ‘Gergis “Significance”’ article is dated Jun 6, 2012 at 1:42 AM so I assumed that this was the first substantive comment about the problem.

          Perhaps others made the same assumption…

          Steve: there was an earlier comment from Jean S on June 5 as well.

        • James Evans
          Posted Jun 8, 2012 at 4:37 PM | Permalink

          Though I guess that would be the morning of the 6th in Oz.

      • Steve McIntyre
        Posted Jun 8, 2012 at 4:55 PM | Permalink

        Re: TerryS (Jun 8 16:26),

        This problem was pointed out by Jean S here on June 5.

        As readers have observed above, their claim to have independently discovered the error on June 5 is very reminiscent of the Mystery Man who “independently” discovered an issue with the Harry station the same day that it had been pointed out at Climate Audit. And moved with lightning speed to expunge the error at the British Antarctic Survey (which would have made it that much harder to trace the matter.)

        I asked the British Antarctic Survey to thank the person who had made this “independent” discovery and it turned out to be Gavin Schmidt, who had plagiarized Climate Audit (plagiarism in the sense of failing to give credit, as opposed to copying words.)

        It would be interesting to know who, beside Climate Audit, “also” discovered this issue.

        • theduke
          Posted Jun 8, 2012 at 5:06 PM | Permalink

          Not quite sure it relates, but here is Jean S on June 4th:

          http://climateaudit.org/2012/05/31/myles-allen-calls-for-name-and-shame/#comment-336304

        • theduke
          Posted Jun 8, 2012 at 5:15 PM | Permalink

          That comment by Jean referenced Gergis’s previous paper so never mind.

        • Steven Mosher
          Posted Jun 8, 2012 at 5:36 PM | Permalink

          haha. so funny. I was going to predict this was going to happen on the other thread
          and then thought my comment was too cynical.

        • TerryMn
          Posted Jun 8, 2012 at 5:46 PM | Permalink

          Same as it ever was. In the grand scheme of things, it’s they I don’t envy having to look in the mirror every morning.

        • Posted Jun 8, 2012 at 6:43 PM | Permalink

          Mosh:

          I was going to predict this was going to happen on the other thread
          and then thought my comment was too cynical.

          Being too cynical about climate science. Now that takes some doing.

        • Don McIlvin
          Posted Jun 8, 2012 at 7:57 PM | Permalink

          Re: Steve McIntyre (Jun 8 16:55),

          I do have a quibble..

          You said..

          It would be interesting to know who, beside Climate Audit, “also” discovered this issue.

          I agree totally with this.

          Clearly they have said “we”, i.e. the authors (or at least one of them).

          But the innuendo with the mention of the plagiarism regarding GS is a step too far.

          Especially when they did give at minimum co-credit.

          We would like to thank you and the participants at the ClimateAudit blog for your scrutiny of our study, which also identified this data processing issue.

          Thanks, David Karoly

          It seems quite plausible that the May 31st post bringing their paper into view of CA scrutiny, and particularly your June 1st note where you bolded De-Trended, followed directly by what would easily be interpreted as intent to check (as Jean S readily did), would prompt one of the authors to look closely at the detrending aspect in detail.

          At least some recognition of plausibility to this is in order. They have otherwise acted responsibly.

          Putting their own article on hold is a horse of a different color than the typical “Team” response.

          don

        • Steve McIntyre
          Posted Jun 8, 2012 at 10:11 PM | Permalink

          the innuendo with mention of the plagiarism regarding GS is a step to far.

          This reference is in connection with the Mystery Man incident, not the present incident. I’m not making an innuendo in respect to the prior incident. I’m asserting it as a fact. It was over a relatively trivial issue, but there was no doubt that Schmidt plagiarized Climate Audit.

          See http://climateaudit.org/2009/02/03/gavins-mystery-man/ and http://climateaudit.org/2009/02/04/gavins-mystery-man-revealed/

          http://climateaudit.org/2009/02/06/gavins-slander-complaint/

          http://web.archive.org/web/20090208100837/http://sciencepolicy.colorado.edu/prometheus/alls-fair-in-love-war-and-science-4929

          http://web.archive.org/web/20090208100227/http://sciencepolicy.colorado.edu/prometheus/gavin-schmidts-demands-4931

          http://web.archive.org/web/20100328162619/http://sciencepolicy.colorado.edu/prometheus/a-formal-response-to-gavin-schmidt-4936

        • Don McIlvin
          Posted Jun 8, 2012 at 9:12 PM | Permalink

          Re: Steve McIntyre (Jun 8 16:55),

          With regard to Eric’s “impression” on RealClimate.

          [Response:My impression is actually that the particular error was not first identified at Climate Audit. Karoly's letter says "also". But whatever,.... --eric]

          Wrong. It is certainly valid to state in no uncertain terms that the problem was first identified on CA. Karoly’s letter is clearly dated well after Jean S’s post, and Steve’s prominent confirmation post (of Jean S) the next day.

          And no doubt, Climate Audit discussion prompted the authors to re-check the data leading to “discovering” the problem.

          They literally announced the paper was on-hold on Climate Audit, since they asked him to post the letter.

          All RealClimate got was a broken link to the article. That is so satisfying given the smugness that prevails there.

          don

        • Robert
          Posted Jun 8, 2012 at 9:45 PM | Permalink

          I’m certain it was Karoly and company who discovered this on June 5th 4:41 blog time just before Jean!

        • Don McIlvin
          Posted Jun 8, 2012 at 10:50 PM | Permalink

          No doubt what Gavin did is as you state – plagiarism.

          But, injecting what Gavin did alongside what Karoly has claimed implies by innuendo that Karoly's claim is one of plagiarism too, though you did not state it directly.

          It is this innuendo of plagiarism against Karoly that I am quibbling about as going a bit too far (at least without a stated caveat) – hence my point of the plausible scenario that the May 31st/June 1st posts may have prompted Karoly or a co-author to re-visit their data if only to know what serious scrutiny would reveal. And in doing so they found the data used was not a detrended data set as they claimed in the paper, hence their claim of independent discovery.

          They have not claimed that they got there first. So comparing this situation with what Gavin did seems quite the stretch when the subject of plagiarism is considered.

          There is no question, the public notice of selection issues came out on CA days before Karoly’s letter to you.

          What ironically brought this notion to mind, of not being so quick to jump to the negative conclusion, was some chiding you gave me a while back when I said the botched review (by PNAS?) of the Penn State investigation of Mann “may” well have been one of willful blindness as opposed to just incompetence which you were pointing to.

          Steve: I forget the exchange. But “wilful blindness” now seems more appropriate.

        • John Norris
          Posted Jun 8, 2012 at 11:52 PM | Permalink

          re: “We will not be entertaining any further correspondence on the matter.”

          When Gergis delivered that familiar gem, you have to think she was concerned their paper wouldn’t stand up to scrutiny. I am sure they doubled down on re-checking their paper. Perhaps they did find the problem then independently. Perhaps.

          But you can be absolutely sure they would not have found it this week if Steve and others at CA weren’t digging in and discussing.

          Regardless of who beat whom, they look just as unprofessional as Gavin by not embracing the fact that CA scrutiny of these papers is ‘forcing’ better quality. And it is their credibility that suffers when their “we found it anyway” pride preempts professionalism.

        • Punksta
          Posted Jun 9, 2012 at 12:59 AM | Permalink

          It would be interesting to know who, beside Climate Audit, “also” discovered this issue.

          And what line of thinking, other than that on Climate Audit, led them to reexamine the data, only (soon) after publication?

  3. None
    Posted Jun 8, 2012 at 4:10 PM | Permalink

    Nice of him to contact you and give your blog credit Steve.
    Wasn’t it Gavin ?

  4. geo
    Posted Jun 8, 2012 at 4:11 PM | Permalink

    Isn’t it lovely when adults act like adults?

  5. matthu
    Posted Jun 8, 2012 at 4:13 PM | Permalink

    I’m afraid I can’t attach any credibility to this development until I read it in The Guardian … /sarc off

    Well done Steve, Jean S and all other contributors to this blog!

  6. Jerold
    Posted Jun 8, 2012 at 4:16 PM | Permalink

    One wonders if over at RealClimate they'll put their “fresh hockey sticks from Australia” article similarly “on hold”?

  7. iceman
    Posted Jun 8, 2012 at 4:19 PM | Permalink

    Gee maybe they will learn to consult statisticians in the future. Or maybe not.

  8. Posted Jun 8, 2012 at 4:20 PM | Permalink

    Oh Eee Ohh Eee oh wee ice ice ice

    Raisin’ sea levels twice by twice

    We’re scientists, what we speak is True.

    Unlike Andrew Bolt our work is Peer Reviewed… ooohhh

    Who’s a climate scientist..

    I’m a climate scientist..

  9. Adrian
    Posted Jun 8, 2012 at 4:23 PM | Permalink

    > Dear Stephen

    He who should not be named?

    > We would like to thank you and the participants at the
    > ClimateAudit blog for your scrutiny of our study, which also
    > identified this data processing issue.

    The “also” is probably Gavinesque?

  10. bernie1815
    Posted Jun 8, 2012 at 4:23 PM | Permalink

    Well done guys. However, my read of the above comment is a little different. David Karoly wrote:
    We would like to thank you and the participants at the ClimateAudit blog for your scrutiny of our study, which also identified this data processing issue. (emphasis added) So who really gave them the heads up? I suspect some RC friends noticed the work here and warned them.
    This reminds me of earlier similar events where Steve and folks here identified significant discrepancies and received less than their due for uncovering the errors.
    Am I being too cynical?

    • theduke
      Posted Jun 8, 2012 at 4:29 PM | Permalink

      No, unless I am also. I also picked up that word you’ve put in bold. Notice how he was also very precise on the date (June 5th) it was discovered by them?

      Wonder why? If we review posts here, will his date pre-date the discovery here?

  11. HaroldW
    Posted Jun 8, 2012 at 4:27 PM | Permalink

    Steve –
    Can you ask whether the selection step used Spearman's correlation coefficient (rather than Pearson's) to determine the p<0.05 significance level? That was my guess.
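HaroldW's question matters because the two statistics can hand down different p < 0.05 verdicts on the same series. A toy illustration with invented numbers (nothing here is from the Gergis et al data): Pearson works on the raw values and is dragged toward zero by a single outlier, while rank-based Spearman is not, so a correlation screen could pass a series under one statistic and fail it under the other:

```python
import numpy as np
from scipy import stats

# Hypothetical data: a clean monotone relationship with one wild outlier.
x = np.arange(20, dtype=float)
y = x.copy()
y[-1] = -100.0  # single corrupted value

# Pearson uses the raw values; Spearman uses only the ranks.
pearson_r, pearson_p = stats.pearsonr(x, y)
spearman_r, spearman_p = stats.spearmanr(x, y)

print(f"Pearson:  r = {pearson_r:+.2f}, p = {pearson_p:.3f}")
print(f"Spearman: r = {spearman_r:+.2f}, p = {spearman_p:.5f}")
```

On this example the Pearson test is nowhere near significant while the Spearman test is, so which coefficient a screening procedure uses is a substantive methodological choice, not a detail.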

  12. lapogus
    Posted Jun 8, 2012 at 4:27 PM | Permalink

    Peer Review 0 – Blogosphere 1

  13. Steven Mosher
    Posted Jun 8, 2012 at 4:28 PM | Permalink

    No gloating.

    The lesson to be learned should be learned by the JOURNAL in question. Authors will make mistakes. Ideally they catch their own mistakes. If reviewers don't demand the code and check

    1. that it runs
    2. that it produces the graphs and tables published
    3. that it actually does what the text describes

    Then they cannot approve the paper for publication. It's so simple that I'm stunned this has to be explained to anyone. Apparently it does.

    I would suggest that people write to the Journal of Climate and suggest that their editorial practices need some improvement.
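Mosher's first two checks could in principle be partly mechanized by a journal: re-run the archived analysis script and verify that it regenerates the published results file. The sketch below is hypothetical; the script name, results filename, and checksum arguments are placeholders for illustration, not artifacts of any real paper or journal workflow:

```python
import hashlib
import subprocess
from pathlib import Path

def sha256_of(path: Path) -> str:
    # Checksum used to compare a regenerated results file
    # against the file as published.
    return hashlib.sha256(path.read_bytes()).hexdigest()

def check_reproduces(script: str, result: Path, published_sha256: str) -> bool:
    """Re-run an archived analysis script and verify that it regenerates
    the published results file byte-for-byte (checks 1 and 2 above)."""
    subprocess.run(["python", script], check=True)  # 1. the code runs
    return sha256_of(result) == published_sha256    # 2. output matches publication
```

Check 3 (that the code does what the text describes) still needs a human reader, but byte-level reproduction of published tables is something an editor could demand at submission time.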

    • RobP
      Posted Jun 8, 2012 at 4:37 PM | Permalink

      While I agree with not gloating and the focus on the Journal to improve review practices, the fact that David Karoly wrote to inform and express gratitude to Steve et al is significant. One can argue with his “also”, but this is a great deal better than pretending that the error was noticed internally or by someone else and that Climate Audit played no role. Whether we refer to this as ‘growing up’ or not depends on your point of view, but it should be seen as a significant milestone. AMS 0 – CA 1 indeed!

      • Neil Fisher
        Posted Jun 8, 2012 at 10:35 PM | Permalink

        the fact that David Karoly wrote to inform and express gratitude to Steve et al is significant

        They are creeping up on acknowledging that Steve actually knows what he is talking about, IMO. I guess we shall see, but the proof of the pudding will be whether or not CA gets acknowledged in any forthcoming corrections to the paper. I suspect that if, say, Gavin had picked up this sort of error, he would be acknowledged, but SM? We'll see… For me, I suspect that is one or two more “finds” away – perhaps I am just cynical.

    • Adrian
      Posted Jun 8, 2012 at 4:47 PM | Permalink

      @Steven Mosher

      > No gloating.

      > The lesson to be learned should be learned by the JOURNAL in question. Author’s will
      > make mistakes. Ideally they catch their own mistakes. If reviewers dont demand the
      > code and check

      > 1. that it runs
      > 2. that it produces the graphs and tables published

      Even Steve has difficulty with making things “turnkey”.
      Perhaps Scientists should consult with computer professionals (as well as the already suggested statisticians) before publishing?

      > 3. that it actually does what the text describes

      This seems unlikely to happen within current peer review process, since most scientists would have little interest in digging through code to see if the written English matches the computer program.

      If it were to happen it would probably be delegated to a grad student.

      But given it is unpaid work, it is unlikely to be thorough?

      There is no financial incentive for it to happen and the anonymity means there are no reprisals when mistakes are overlooked.

      • Steven Mosher
        Posted Jun 8, 2012 at 9:00 PM | Permalink

        Making things turnkey is a big challenge. Personally, I’ve learned a lot from steve on this. Some journals come close to requiring it
        (Plos, for example ) and we can view it as an ideal.

        • clt510
          Posted Jun 9, 2012 at 10:45 PM | Permalink

          Steve, any code or data that I release is turn-key. It requires a learning curve up front, and it requires using command line tools instead of GUI based software, but I don't find it particularly challenging or time consuming… and anyway it allows me to replicate what I've actually done and tweak it if necessary (if it's a checkbox in a GUI, it's a bit hard to document that in a way that's human readable).

          It benefits me as well as the end-user to produce a product in this fashion. I can “self-audit” if you will.

      • Joe Blogs
        Posted Jun 9, 2012 at 3:33 AM | Permalink

        So what Adrian and Mosher seem to be saying, is that proper auditing is too stringent and beyond the budget of ordinary peer review, ie just will not happen.

        So it’s the Blog Way or No Way?

    • Green Sand
      Posted Jun 8, 2012 at 4:56 PM | Permalink

      Re: Steven Mosher (Jun 8 16:28),

      Well said Steven

      1. No gloating
      2. We all make mistakes
      3. Aim at those responsible – “Responsibility” means the ability to respond – The Journals

      4. There is a lot to learn from statements like – “This is commonly referred to as ‘research’.”

    • MrPete
      Posted Jun 8, 2012 at 6:27 PM | Permalink

      Re: Steven Mosher (Jun 8 16:28),
      Mosh has it right. His description of what is needed is not all that hard.

      Some departments at some universities are already doing this. It’s called Reproducible Research and has been mentioned here many times before.

    • DocMartyn
      Posted Jun 8, 2012 at 7:23 PM | Permalink

      Got to agree with Mosher, my student is writing his first paper and one whole figure (six panels) had different color balance on each of the six.
      I caught it because I went back to the original data and checked (I always check).
      He screwed up because he was unfamiliar with the software and used three different scaling standards.
      It is not a question of intelligence, the guy is a brain surgeon finishing his residency.
      He just picked the wrong files and is too inexperienced to recheck.

      • Steven Mosher
        Posted Jun 9, 2012 at 12:32 AM | Permalink

        lucky student to have DocM checking his work.

    • Anteros
      Posted Jun 9, 2012 at 1:24 AM | Permalink

      Surely the people who should be embarrassed about this are the reviewers? The journal sends the paper out to people with relevant expertise to do a specific job. Not noticing that none of the proxies had been de-trended prior to comparison is unbelievably poor reviewing.

      • DocMartyn
        Posted Jun 9, 2012 at 6:49 AM | Permalink

        “Surely the people who should be embarrassed about this are the reviewers?”

        Not really. You are not really expected to check the maths for trivial things. If the authors state ‘protein concentration was established using the Bradford method with BSA used as the protein standard’, you do not ask to see the plot. A referee has to take this on trust; if the authors state they did something, you have to trust them.

        • BatedBreath
          Posted Jun 9, 2012 at 7:13 AM | Permalink

          A referee has to take this on trust, if the authors state they did something, you have to trust them.

          So the real reviewing is done not by the peer-reviewers and referees, but by the auditors and reproducers.

          Best have the song changed then :

          We are Climate Scientists
          We are Climate Audited

        • Posted Jun 9, 2012 at 7:32 AM | Permalink

          Doc Martyn is right: given the time devoted to a typical review it would be very rare to recheck many calculations etc.

          I have, however, started being more cautious in the reviews I write: I am now quite explicit about what parts of a paper I have and haven’t checked, so that the editor can know (rather than just guess) how thorough the review actually was. This seems to me a sensible middle way.

        • Steve McIntyre
          Posted Jun 9, 2012 at 7:39 AM | Permalink

          Jonathan, what do you think about asking reviewers to check on compliance with journal policies on data archiving? (Of course, this presumes that the journal has a policy requiring that relevant data be archived.)

          At present, a number of journals e.g. AGU journals have excellent data archiving policies on paper, but the editors and reviewers do not require paleoclimate scientists to comply with them. And the editors not only ignore complaints, but, behind the scenes (as we’ve seen from Climategate correspondence) sneer at efforts to require authors to comply with journal policies.

          In the Gergis case, I wrote to the editors of four journals (GRL, Climate Dynamics, Holocene and J Climate) asking that data be archived. J Climate editor Broccoli was bureaucratically unresponsive, obtusely referring me back to the authors who had already refused. Holocene editor Matthews said that the journal did not require authors to archive data and that the peer reviewers were content. No answer from the other two journals, even GRL, an AGU journal, which has an excellent data policy.

        • Steve McIntyre
          Posted Jun 9, 2012 at 7:59 AM | Permalink

          One interesting topic that I’ve considered from time to time. As someone who’s spent his life outside academia, I view “journal peer review” sociologically only as a (fairly limited) form of due diligence. There are other forms of due diligence e.g. financial audits, engineering reports, legal due diligence.

          When I was first asked to do a journal peer review, my first instinct was to look for objective standards for this form of due diligence. Instead, the standards seemed very vague and arbitrary to me – thus permitting on the one hand, pal review, and on the other hand, gatekeeping – without either necessarily being in breach of any policy.

          Despite the perceived importance of “peer review” as a sociological component of “peer reviewed literature”, because the peer reviews are confidential (other than a few idiosyncratic and recent journals), the peer reviews themselves are not available as a topic of empirical study.

          Thus discussions of journal peer review are never empirical, but tend to be self-serving and programmatic. Oddly, the Climategate documents, especially CG2, offer an interesting empirical data set of “peer reviews” that would be an excellent and unique topic of empirical study.

          In programmatic statements about peer review, it seems to me that academics seem far too quick to attribute modern scientific advances to peer review as practiced at journals (as opposed to, say, generous funding of scientific activity by modern society). In addition, academics seem too quick to bundle all components of peer review together as a take-it-or-leave-it package. Pal review seems pernicious to me and potentially more corrosive of the knowledge base than adversary review. I can see benefit to comments from adversaries, but it seems to me that such comments should not be conflated with independent peer review and that the authors should be entitled to treat such comments as hostile – worth paying very close attention to but not obliged to treat them as independent.


        • Posted Jun 9, 2012 at 8:07 AM | Permalink

          Steve:

          In programmatic statements about peer review, it seems to me that academics seem far too quick to attribute modern scientific advances to peer review as practiced at journals (as opposed to, say, generous funding of scientific activity by modern society).

          Badly read that I am, I’ve never seen this point made anywhere else. Much is justified on the back of ‘modern scientific advances’ that has nothing to do with them. This should be promoted to the subject of a separate CA post, IMHO.

        • Craig Loehle
          Posted Jun 9, 2012 at 8:45 AM | Permalink

          Re: peer review: as someone who has reviewed hundreds of papers, usually on modeling in ecology, here is what I look for:
          1) Do they relate their work fully to existing literature, including work that disagrees with their POV or results? Sometimes scientists from emerging nations are unaware of much of the literature. I will supply a bibliography for them if they missed this.
          2) Can I tell exactly what they did?
          3) Are there confounding factors they have missed? Often there are simplifying assumptions that will affect the results more than what they think they are studying.
          4) Can I relate their results to real world data or systems?

          On the other side of the process, as a recipient of hundreds of reviews, I can say that most of my response to comments is to clarify or respond to reviewers who completely did not understand what I was doing but would not admit it. Sometimes it was my fault for lack of clarity, but often I have been baffled by how obtuse reviewers can be. I think it is because people are reviewing topics rather far from their specialty. It is a distinctly small minority of comments which found an actual error in my analysis, for which I am always grateful (better before than after publication!).

        • theduke
          Posted Jun 9, 2012 at 9:30 AM | Permalink

          Steve: might I suggest you take the ideas in your posts at 7:39 and 7:59 and after expanding and polishing, send them to all the journals that publish climate science papers?

          Richard Drake wrote: “This should be promoted to the subject of a separate CA post, IMHO.”

          Agreed. And to synthesize, after blog review here, Steve could send out his conclusions to the journals in question.

        • Posted Jun 9, 2012 at 9:51 AM | Permalink

          Thanks duke. In fact since writing that I've been wondering if there aren't two quite separate threads/topics in embryonic form here. The feedback from Jonathan Jones and Craig Loehle on their experience of peer review in the real world I find extremely valuable – and I'm sure others like Pat Frank could contribute a great deal on that, in the light of the Gergis story. And then there's the wider point about how peer review advocates (and, I think, all sorts of other malarkey) hide behind ‘modern scientific advances’ as if they on their own produce them. But the story of science doesn't suggest such a thing at all. I'd be inclined to make these two threads, but Steve may be inclined differently :) Certainly the second thread would end up more ‘political’ than Steve normally permits. It's the host's call, as ever. And it's his words that caused the lights to go on in my head (a rare event, as many will testify!)

      • HR
        Posted Jun 10, 2012 at 11:34 PM | Permalink

        I’d have to agree with DocM here on what peer-review is and isn’t. I suppose though that the real issue for people on this blog is that these scientifically reviewed papers make it into documents (IPCC) that are intended to inform policy.

        Rather than try to change the whole peer-review edifice, it wouldn't seem unreasonable that papers that inform the IPCC should be auditable. If the data and code can't be checked through public disclosure then the paper should be rejected for use in the IPCC. Scientists who get protective of their own research can do so knowing the consequence: their work will be ignored by the wider community in its decision making. That way Gergis can continue playing her silly little games while the rest of us continue to have a well-informed, adult debate.

        I know this is asking a lot in the present climate but nothing wrong with expecting certain standards. Having said that the Karoly email does seem to suggest that some are prepared to go the extra yard.

        Steve: that’s the sort of simple thing that I suggested years ago. If articles are to be cited for public policy, then the scientists have to open the kimono. They can’t insist on intellectual property rights in their data and methods and also expect to be cited by IPCC.

        • Erica
          Posted Jun 11, 2012 at 2:24 AM | Permalink

          > If the data and code can’t be checked through public disclosure then the paper should be rejected for use in the IPCC.

          But if this does not happen, the IPCC itself must be rejected for use in public policy.

  14. maxberan
    Posted Jun 8, 2012 at 4:32 PM | Permalink

    No suggestion though that they will take on board the other key issue that was identified, namely that selection of a subset is not a statistically neutral act so needs to be allowed for in assessing significance. Ditto the other procedural selections.

    • Steve McIntyre
      Posted Jun 8, 2012 at 5:22 PM | Permalink

      Re: maxberan (Jun 8 16:32),

      max wrote:

      No suggestion though that they will take on board the other key issue that was identified, namely that selection of a subset is not a statistically neutral act so needs to be allowed for in assessing significance. Ditto the other procedural selections.

      Quite so. Many reader comments are unduly triumphalistic and I urge readers here and at Anthony’s to dial it back.

    • HAS
      Posted Jun 8, 2012 at 6:20 PM | Permalink

      Will also be curious to see if they revisit the Oroko proxy in light of the amendment of the official instrumental temperature records on which it depends.

  15. Anthony Watts
    Posted Jun 8, 2012 at 4:33 PM | Permalink

    “Progress in Climate”

    distinctly different from “Climate Progress”

    Hearty congratulations to Steve! And, a hat tip to Nick Stokes too for letting the data speak for him.

    Steve: Jean S spotted this particular discrepancy.

  16. Salamano
    Posted Jun 8, 2012 at 4:33 PM | Permalink

    I wouldn’t worry about the ‘also’ … let them have the ‘also’ … It’s clear that this paper passed peer-review, and was sufficiently lauded and proclaimed all over the place (THE contribution from Australia in AR5, etc.)

    Why would something like this in the methodology be randomly re-reviewed after all the original reviewing and lauding, were it not for an alternative disturbance in the force – separate from the normal process?

    You would think that all that could be done in the ‘normal process’ would be complete; otherwise ALL future published/praised/AR5’d papers would have to carry some sort of implicit, tentative, subject-to-further-review moniker.

    It is now more certain that “scientists” review the sort of auditing that goes on here and, more importantly, consider it. This is something that is good for science overall. Good on these scientists too. They took a higher road. Other scientists in the past would not have acknowledged any sort of issue, would have hidden a corrigendum somewhere, and would have published a new paper that ‘confirms’ the old one, making like no issue ever existed.

    Who knows how the results/conclusions will change, or if they’ll change, but crowd-review seems to be gaining a bit of credibility here.

    • JimBoMo
      Posted Jun 8, 2012 at 4:48 PM | Permalink

      Well said. This sort of auditing is good, good for the scientists for considering and taking the high road.

    • Steven Mosher
      Posted Jun 8, 2012 at 9:10 PM | Permalink

      Agreed.

      Let them have their “also”
      They took a higher road ( not perhaps the highest.. but higher is good!)

      good for science and the field.

      more please.

    • Joe Blogs
      Posted Jun 9, 2012 at 3:47 AM | Permalink

      Other scientists in the past would have … published a new paper that ‘confirms’ the old one and make like no issue ever existed.

      Folks, Eric Stick at RC guesses the correction will still confirm a HS, so the results may yet be reGergitated.

  17. Posted Jun 8, 2012 at 4:36 PM | Permalink

    Ammann and Wahl, Climatic Change 2007, to the rescue. Section 3 of their paper is a long argument against detrending.

    In short, the detrending procedure used in von Storch et al. (2004) and the numerous Bürger and Cubasch (2005) and Bürger et al. (2006) variants systematically removes a good deal of what, in the end, is expected to dominate the mean hemispheric temperature variation back in time (Crowley 2000; Bürger et al. 2006).

    Roman’s comment brings up the Frisch-Waugh-Lovell theorem in econometrics. My interpretation of the whole debate over detrending is that it only matters if proxies are badly selected. A well-selected proxy P is supposed to be mainly sensitive to temperature T. Suppose we think there is a low-frequency component not measured by temperature that also affects the proxy P and can be represented by a linear trend L. Then the calibration regression equation should be

    (1) P = a1 + a2×T + a3×L.

    If testing yields a3=0 then the proxy is well-chosen for temperature calibrations. If a3 is not zero then its low frequency behaviour is not solely driven by temperature and it’s not a good proxy for long term reconstructions. In that case a2 from the above regression will differ from a2 in

    (2) P = a1 + a2×T

    If we do the de-trending on (2), detrended P is denoted P/L and detrended temperature is denoted T/L. Then estimate

    (3) P/L = b1 + b2×T/L.

    The FWL theorem states that b2 from (3) = a2 from (1) and the residuals etc. from (1) and (3) are identical. So if b2 from (3) is not the same as a2 from (2) then the proxy is sensitive to L and is not a good choice for long-term temperature reconstructions. That’s why I take the view that detrending only matters when you have bad proxies.
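    The FWL claim is easy to verify numerically. A minimal sketch with synthetic series (all numbers hypothetical, not the Gergis data): the slope on T from the full regression (1) equals the slope from the detrended regression (3) to machine precision, while the short regression (2) differs whenever a3 is nonzero.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 70                                   # e.g. a 1921-1990 calibration window
L = np.arange(n, dtype=float)            # linear trend regressor
T = 0.01 * L + rng.normal(0, 0.5, n)     # "temperature", itself trending
P = 1.0 + 2.0 * T + 0.05 * L + rng.normal(0, 0.3, n)  # proxy with a3 != 0

def resid(y, X):
    """Residuals of y after least-squares regression on the columns of X."""
    return y - X @ np.linalg.lstsq(X, y, rcond=None)[0]

ones = np.ones(n)
# (1) full regression P = a1 + a2*T + a3*L
a = np.linalg.lstsq(np.column_stack([ones, T, L]), P, rcond=None)[0]
# (2) short regression P = a1 + a2*T, omitting L
a2_short = np.linalg.lstsq(np.column_stack([ones, T]), P, rcond=None)[0][1]
# (3) detrend both P and T on (const, L), then regress residual on residual
Z = np.column_stack([ones, L])
b2 = np.polyfit(resid(T, Z), resid(P, Z), 1)[0]

print(a[1], b2, a2_short)  # a2 from (1) == b2 from (3); a2 from (2) differs
```

    The equality of a2 from (1) and b2 from (3) holds exactly; the gap between them and a2 from (2) is the omitted-variable bias that signals a bad proxy.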

    • Craig Loehle
      Posted Jun 8, 2012 at 5:02 PM | Permalink

      Let us say that we are studying factors affecting college grades. It is sensible to subtract a linear increase in grades due to grade inflation before doing the rest of the analysis. But suppose one wants to do a study of the effect of age on something – subtracting a linear function of age would not be good. In the proxy case, if over the calibration period trees are growing better due to temperature, you really should not detrend if temperature is what you want to study IMHO. Only when there is an extraneous factor.

    • Kenneth Fritsch
      Posted Jun 8, 2012 at 5:08 PM | Permalink

      “So if b2 from (3) is not the same as a2 from (2) then the proxy is sensitive to L and is not a good choice for long-term temperature reconstructions. That’s why I take the view that detrending only matters when you have bad proxies.”

      I consider this point being made here by Ross critical to this whole discussion, and one that I have attempted to make to Jim Bouldin, but nowhere near as convincingly as Ross does here.

    • Posted Jun 8, 2012 at 5:45 PM | Permalink

      I think one hazard of detrending is the possibility that the wiggles of a proxy agree, but there is a different or opposite trend in the instrumental period. That wouldn’t be taken into account, and the proxy could pass.

      Detrending has the potential to “hide the decline”.

      • Kenneth Fritsch
        Posted Jun 9, 2012 at 8:13 AM | Permalink

        I have been saying this for some time: the divergence can pass the detrended correlation test.
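        A minimal numerical sketch of the hazard (synthetic series, not any actual proxy): a proxy sharing the instrumental wiggles but trending the opposite way correlates poorly with temperature raw, yet sails through a detrended correlation screen.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 70
t = np.arange(n, dtype=float)
wiggle = np.sin(2 * np.pi * t / 7) + 0.1 * rng.normal(size=n)

temp  =  0.05 * t + wiggle   # instrumental series: warming trend + wiggles
proxy = -0.05 * t + wiggle   # diverging proxy: same wiggles, opposite trend

def detrend(y):
    """Remove the least-squares linear trend from y."""
    return y - np.polyval(np.polyfit(t, y, 1), t)

r_raw = np.corrcoef(temp, proxy)[0, 1]
r_det = np.corrcoef(detrend(temp), detrend(proxy))[0, 1]
print(r_raw, r_det)  # raw correlation is much weaker than the detrended one
```

        The detrended correlation is essentially 1 even though the proxy diverges from the instrumental trend – the decline is hidden by construction.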

    • Posted Jun 8, 2012 at 7:24 PM | Permalink

      Dear Professor McK;

      Thanks for the lovely clear elucidation of the math of ‘detrending’.

      I hope you might address a question I can’t answer;

      Why bother to screen out any proxies at all,
      if at the end of the day,
      they are only going to make a weighted sum of proxies?

      or asked again differently;

      If their process of choosing the weights is good enough to ‘pick the winners’, why does it also not ‘ditch the losers’ without manual intervention ?

      Thanks,
      RR

      • Posted Jun 9, 2012 at 6:08 AM | Permalink

        @almostcertainly

        My understanding is that it’s perfectly acceptable to manually “ditch the losers” provided this is done using a method which is independent of the method used to weight the pool of potential proxies. And in fact manual ditching may be the only option to maintain rigor.

        To take a trivial example; I have tree ring proxy data to consider. The metadata for these proxies shows that the equipment used to initially measure those widths was found to be poorly calibrated when it was serviced, just after those proxies were processed.

        We may then take a decision that these data should be taken out of the pool of potential matches and – crucially – this decision must be taken *before* they are used in an analysis that assigns weights to them. We want to remove any possibility of deliberate or inadvertent bias.

        Just letting the weighting algorithm alone decide whether this is a winner or a loser isn’t enough – there is a real exterior physical effect that is unknowable by the algorithm.

        Now we may then revisit the errors introduced by the faulty equipment, run a series of diagnostic tests and collect empirical data that we believe will accurately allow the bias introduced by the equipment to be accounted for. Then we could reintroduce the previously rejected proxies to the weighting process. But again, we have to decide whether or not they should be included before any such weighting has taken place.

        Real life examples will be more subtle than this, but the same principle applies; there is information in the real world that can legitimately bear on the choice we make about which proxies are submitted for weighting that cannot be accounted for in any automatic process.

        • Craig Loehle
          Posted Jun 9, 2012 at 8:48 AM | Permalink

          You are absolutely right, but this is rarely done in the reconstruction world. They only decide post hoc that something is wrong with a data set when it diverges…

        • maxberan
          Posted Jun 9, 2012 at 12:20 PM | Permalink

          It may be “perfectly acceptable to manually “ditch the losers” but in my opinion not with the proviso that you outline. The proviso for making such a pre-selection acceptable relates to how one measures the confidence with which any inference drawn from such an “optimised” sample is made.

          Think of it as a numerical experiment in which you set up a target series with 62 predictor series with a range of correlations with the target (based on the real-world numbers). This is the generating population. From it one makes repeated random drawings. Due to sampling variation each sample will differ from every other.

          A conventional test would have set up the generating population with 27 predictor series and used whatever model for the historical reconstruction. Repeated sampling from this population would generate an ensemble of historical reconstructions whose (relatively confined) span of variations are the basis for measuring confidence.

          But this confidence will be overstated because it is not based on the actual sampling procedure. To replicate the procedure better one would start from the generating population of two paragraphs back and then pluck out the best 27 from the 62. Build the model accordingly and reconstruct the history as above. Such are the vagaries of sampling that different predictors would pop in and out of different realisations, generating a wider span of ensemble historic traces and thus lowering the confidence.

          While this is a mere toy version of the actual process with its various other pre-selections and targets it hopefully serves to illustrate the bias arising from applying statistics based on one sampling scheme to a very different one, especially if the procedure involves parameter optimisation.
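          The selection effect at the heart of this can be sketched with a toy Monte Carlo (pure-noise candidates, not the actual proxy network): screening 27 of 62 noise series by correlation with a target manufactures a composite with sizeable in-sample “skill”, which is why significance must be assessed against the screened procedure, not the final 27.

```python
import numpy as np

rng = np.random.default_rng(42)
n_cal, n_pool, n_keep = 70, 62, 27   # toy numbers: pick the best 27 of 62

target = rng.normal(size=n_cal)            # "instrumental" target series
pool = rng.normal(size=(n_pool, n_cal))    # candidate proxies: pure noise

# screen: keep the candidates best correlated (in absolute value) with target
r = np.array([np.corrcoef(p, target)[0, 1] for p in pool])
keep = np.argsort(np.abs(r))[-n_keep:]

# composite of the screened subset, each series oriented to match the target
comp = np.mean([np.sign(r[i]) * pool[i] for i in keep], axis=0)
print(np.corrcoef(comp, target)[0, 1])  # sizeable "skill" from pure noise
```

          Repeating the whole draw-screen-composite loop many times is what generates the honest, wider ensemble of reconstructions described above.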

        • Posted Jun 10, 2012 at 6:51 AM | Permalink

          @maxberan

          (sorry hit the nesting limit apparently)

          On reflection, I don’t think I’ve answered @almostcertainly’s question at all (not even incorrectly :-))

          The example I give doesn’t really speak to the issue of “losers” specifically. My intention was to describe a trivial example where it is known that there is an additional confounding factor that could materially affect how a proxy is interpreted.

          Using that proxy without taking account of this may result in one of two broad error conditions:

          1) The proxy has not actually responded to temperature, but the systematic bias means it is incorrectly identified as a winner and strongly weighted

          2) The proxy has responded to temperature, but the systematic bias means it is incorrectly identified as a loser and weakly weighted

          And many points in between, where the systematic bias has skewed the measurement and nudges the apparent strength of correlation in one direction or another.

          Is it not legitimate, when a prior systematic error of some magnitude in whatever direction is known to be in play, to discard those proxies from consideration before any analysis?

          Wouldn’t it be cleaner (and legitimate) to study the effects of this bias as a separate and entirely discrete exercise and, once that effect is identified, to adjust for that effect and then return the data to the pool?

          I’ve pored over your answer a few times and think I understand what you’re describing, but I still don’t see how you’d be guaranteed to identify a skewed proxy. By ignoring an objective fact about the methodology – “these measurements were taken with faulty equipment and may be unreliable” – aren’t you ignoring legitimate information and placing too much trust in your algorithm?

        • maxberan
          Posted Jun 10, 2012 at 7:37 AM | Permalink

          We are possibly talking past each other. The issue I’m addressing is the misplaced high confidence in the outcome of the historical reconstruction due to the pre-screening of 27 from a “universe” of 62. It does not (and cannot) change the numbers that emerge from the procedure, but it does surround the values with a more realistic estimate of the fuzz of grey representing the confidence interval. If that uncertainty band that surrounded the Hockey Stick were drawn twice as wide as published, would it have been given such prominence?

          Such an exercise will not change the reconstruction itself; the latter is fixed by the procedure.

          You are right that it attaches 100% trust in the algorithm by accepting the model and randomizing from the recorded sample in the overlap period as its starting point. But then so implicitly do conventional statistical significance tables. However it is a necessary step in the direction of a more honest appraisal of the accuracy of reconstruction.

          You may postulate a new procedure for predicting annual temperatures as a function of the proxies and doubtless it will have its own uncertainties which would similarly be incorporated by randomising in a simulation to assess their impact on the confidence of the historical reconstruction. I was addressing what I took to be the primary focus of your piece – the specific impact of pre-selection, not the impact of missing factors in the model or an inadequate model.

        • Posted Jun 10, 2012 at 8:38 AM | Permalink

          @maxberan

          “We are possibly talking past each other”

          Yes, I think so. @almostcertainly asked whether algorithmic screening by weight wasn’t effectively enough to make screening before use of the algorithm unnecessary. I tried, with limited success, to think of an example where there is objective data that would lead to screening that wouldn’t be caught by the algorithm itself. And I suppose to see if my own thinking on how and when these choices are justified was reasonable.

          But following the angle explored in your replies has been educational, thanks.

      • Posted Jun 9, 2012 at 6:13 AM | Permalink

        OT, but I hadn’t looked at this sort of equipment before now:

        http://web.utk.edu/~grissino/ltrs/equipment.htm

    • HaroldW
      Posted Jun 10, 2012 at 9:51 AM | Permalink

      Ross –
      It requires little work to make those calculations for the “Gergis 27” proxies. I’ve grouped them here by type. The first column is the regression slope (vs. temp) of the non-detrended data (your a2). The second column is the regression slope of the detrended data (your b2).

      Tree ring width proxies
      3.08 1.94 Mt Read
      1.46 0.26 Oroko
      2.13 1.31 Buckley’s Chance
      2.70 1.30 Celery Top Pine East
      2.95 2.05 Kauri
      2.19 1.02 Pink Pine South Island composite
      2.14 0.59 Mangawhero
      -1.14 -0.76 Urewera
      1.65 1.35 North Island_LIBI_Composite_1
      4.13 1.60 Takapari
      1.54 0.71 Stewart_Island_HABI_composite
      2.29 0.91 NI_LIBI_Composite_2

      Coral δ18O proxies
      -1.90 -1.05 Palmyra
      -2.31 -1.94 Fiji_AB
      -0.81 0.12 New_Caledonia
      -1.77 -1.15 Rarotonga
      -1.25 -0.80 Fiji_1F
      -1.57 -0.31 Bali
      -1.01 -0.37 Abrolhos
      -1.43 -0.21 Maiana
      -1.76 -2.40 Bunaken
      -2.20 -0.57 Rarotonga.3R
      -1.87 -1.57 Ningaloo
      -1.04 -0.58 Madang
      -1.78 -2.25 Laing

      Ice δ18O
      2.40 2.18 Vostok_d18O

      Ice accumulation
      1.35 1.11 Vostok_Accumulation

      I have a fundamental question. The sensitivity of tree ring width proxies (to temperature) is variable because growth is not always temperature limited. But for isotopic proxies, isn’t the sensitivity establishable a priori from physics considerations? In which case, regression is going to give you a noisy estimate for a known [or knowable] parameter, increasing the reconstruction error.

  18. Steven Mosher
    Posted Jun 8, 2012 at 4:36 PM | Permalink

    http://www.ametsoc.org/pubs/journals/jcli/jcli_eds.html

    http://www.ametsoc.org/stacpges/CommitteeDisplay/CommitteeDisplay.aspx?CC=PUBSCOM

    If you decide to write mails make them polite and factual.

    • Paul Matthews
      Posted Jun 8, 2012 at 5:00 PM | Permalink

      I suggest that people do not write to the editors – there is little point. Tony Broccoli tells me (see previous thread) that AMS policy does not permit him to disclose any information about the paper.

    • HAS
      Posted Jun 8, 2012 at 5:09 PM | Permalink

      As I noted in an earlier comment this study is part of PAGES funded by the U.S. and Swiss National Science Foundations, and the National Oceanic and Atmospheric Administration (NOAA). It is overseen by a Scientific Steering Committee (SSC) “comprised of members chosen to be representative of the major techniques, disciplines and geographic regions that contribute to paleoscience” (http://www.pages-igbp.org/about/general-overview). The SSC membership is http://www.pages-igbp.org/people/scientific-steering-committee-ssc and “[t]he authority for all PAGES policy and activities resides with the PAGES SSC”.

      Perhaps also give Prof. Alan C. Mix, Oregon State University, or Prof. Hubertus Fischer, Bern, the SSC co-chairs a yell.

  19. JimBoMo
    Posted Jun 8, 2012 at 4:38 PM | Permalink

    Yes, indeed, “independent analysis of data and methods strengthens conclusions” and is (should be) “a normal part of science.”

    Kudos to the technical analysts @ ClimateAudit for demonstrating the value of open review of literature using archived data accessible to the public.

    My first hope is that Karoly’s letter and the publishing hold will call attention to the necessity of archiving data and publishing code with papers. Perhaps more insight on the screening selection will follow.

    My second hope is that comments here are as decent as Karoly’s thanks “to you and the participants at the ClimateAudit blog.”

    Perhaps it is time for all participants to leave surliness behind. I understand there has been much misbehavior and poor manners in the past. However, if science is going to move towards open review, then we will all need to bring forward our best manners.

    • Neil Fisher
      Posted Jun 8, 2012 at 10:46 PM | Permalink

      Don’t know how long you’ve been reading CA Jim, but you must be joking if you think Steve has EVER been anything but polite, precise and willing to give the “other side” the benefit of the doubt. Perhaps this is finally the payback – too many people are aware of this, and it’s all well documented here at CA. So much so, that it simply cannot be ignored any more.

      • Geoff Sherrington
        Posted Jun 9, 2012 at 12:46 AM | Permalink

        OTOH, you might like to revisit CA 2008 on the subject of allowing papers to pass by without critical comment:
        ……………………………..
        Geoff Sherrington Posted Apr 3, 2008 at 3:46 AM | Permalink | Reply

        Because we have a Prof at Melbourne Uni now who filed a brief IPCC form starting-
        “As review editor of Chapter 9 Understanding and Attributing Climate Change of the Working Group 1 contribution to the IPCC 4th Assessment Report…..

        I sent an email starting –
        “A dreary subject, but in your participation with the IPCC did you submit any working papers, clarifications, work that was to be completed then forwarded, etc, or is the attached the sole document and commentary that you used to attest to your satisfaction with Chapter 9 of the 4th Assessment Report?

        “This seems a rote report, essentially similar to some 60 others I have seen, but markedly different to a handful where substantial comments were appended for action. Is that how consensus was achieved? … etc
        …………………………….
        The 2007-8 rote paper in question had the signature of one D Karoly, but not a trace of the other papers the IPCC required. Perhaps there was 100% consensus then. At least it can be said that Prof Karoly’s one word reply was polite and willing.

  20. theduke
    Posted Jun 8, 2012 at 4:40 PM | Permalink

    Mosh says not to gloat, but you gotta love this:

    http://www.realclimate.org/index.php/archives/2012/05/fresh-hockey-sticks-from-the-southern-hemisphere/comment-page-1/#comment-237312

  21. Jeremy
    Posted Jun 8, 2012 at 4:49 PM | Permalink

    The journal would be well served to re-evaluate its peer-review process in light of this discovery post-peer-review. I suspect they’ve found at least 3 people they shouldn’t want doing any more reviews for their journal.

  22. Steve McIntyre
    Posted Jun 8, 2012 at 5:01 PM | Permalink

    Before readers get too wound up about this, I’d like to point out that they have a couple of potential fallback positions.

    First, it’s entirely possible that they can re-do the analysis using the professed methodology and still get a “stick”. The number of contributing proxies will be smaller, of course.

    Alternatively, they may well argue that the “right” screening method is to calculate correlations without detrending. Although the Screening Fallacy has been argued here and at other blogs, it is not admitted as a fallacy by realclimate_scientists.

    Please do not oversell this.

    • Roger
      Posted Jun 8, 2012 at 5:34 PM | Permalink

      The significance of this event is not only in the effect the error could have on the final result. For me the key thing is that it further refutes the bogus argument that scientific discourse must take place in the peer-reviewed literature and that blogs are always to be ignored.

      Regarding the final result my suspicion is that the authors already know how the results will look after appropriate detrending (they’ve had all week). If the shift was minor then they would probably have not even contacted the journal. One could argue, wrongly IMO, that if the final result doesn’t change much then the journal version anyway gives a good approximation.

      On another note, its been interesting to watch the evolution of arguments used when you + bloggers examine temperature reconstructions:

      Starting with your criticism of Mann’s hockey stick
      -> the criticism is just wrong -Steve M. is an amateur, and a Canadian one at that.
      -> the criticism is scientifically correct but has no impact on the result.
      -> the criticism has an impact on the result but is unimportant since other hockey sticks can be obtained.

      Following auditing of other hockey sticks the argument is now :
      -> Historical temp reconstructions are controversial.
      -> (Begrudging?) thanks for pointing out mistakes

      Certainly the die-hards will never admit that their work could be problematic even in principle. However, there seems to be a shift amongst others in the climate community.

      • pouncer
        Posted Jun 8, 2012 at 6:40 PM | Permalink

        David Karoly’s comments do not sound “grudging” to me.

        I also agree that the reaction, overall, indicates progress.

        SLOW progress. I wish the hockey team would take on Wegman’s suggestion about including statisticians on all such work.

        As Mosher offered to help the BEST team with UHI analysis I would hope JeanS or Nick Stokes would work with our new friends down under to review the current work.

        Maybe, in the process, some of the “research” necessary to consider how selected proxies compared to rejected proxies can be reported back here.

        • Roger
          Posted Jun 8, 2012 at 11:29 PM | Permalink

          I used begrudging to refer to the attribution issue, which suffered from a little weasel wording. If the problem was discovered as a result of Steve’s posts + comments then it should be clearly stated. Similarly, if it was just a coincidence and had absolutely nothing at all to do with this blog then this should also be explained. As it is, they leave a deliberate ambiguity which is more than a little disrespectful to those who’ve pondered this issue here. Climatologists may well be hostile towards sceptics and question their motives. However, serious calculations have been done here and it has unquestionably been to the good of this work.

        • Steve McIntyre
          Posted Jun 8, 2012 at 11:40 PM | Permalink

          As it is, they leave a deliberate ambiguity which is more than a little disrespectful to those who’ve pondered this issue here.

          I agree. I also regard Karoly’s letter as somewhat disrespectful. If I had been in Karoly’s shoes and writing the letter under identical circumstances, I would have offered a straightforward acknowledgement to Climate Audit without pretending to have “independently” turned their minds to checking a result that was under scrutiny at Climate Audit. Their present claim to have “independently” found the result on the same day as Climate Audit defies credulity. Even if they did Jean S’s calculation slightly before him, they did so only because it was put at issue at Climate Audit.

        • Steve McIntyre
          Posted Jun 8, 2012 at 11:55 PM | Permalink

          Gavin Schmidt received the following from “one of the authors”:

          Print publication of scientific study put on hold

          An issue has been identified in the processing of the data used in the study, “Evidence of unusual late 20th century warming from an Australasian temperature reconstruction spanning the last millennium” by Joelle Gergis, Raphael Neukom, Stephen Phipps, Ailie Gallant and David Karoly, accepted for publication in the Journal of Climate.

          We are currently reviewing the data and results.

          Again, no credit or acknowledgement to Climate Audit.

        • theduke
          Posted Jun 9, 2012 at 12:07 AM | Permalink

          Pouncer wrote: “I wish the hockey team would take on Wegman’s suggestion about including statisticians on all such work.”

          But they have. They’ve included Steve and Jean S and are using their work without giving them proper credit.

          I agree with Steve:

          http://climateaudit.org/2012/06/08/gergis-et-al-put-on-hold/#comment-337024

        • Joe Blogs
          Posted Jun 9, 2012 at 1:30 AM | Permalink

          “…..they leave a deliberate ambiguity…..”

          i.e. they know full well the credit belongs to the resented blog auditors, but are loath to publicly admit it.

      • Roger
        Posted Jun 9, 2012 at 12:15 AM | Permalink

        I don’t think they realise just how tribal and downright unscientific this must appear to a neutral outsider. Real work has been done here and real knowledge gained. As a “real” scientist, I’ve been impressed with what I’ve seen.

        For me science is about being dispassionate – I’m paid to report the facts. Once one adopts that attitude it becomes much easier to do uncomfortable things such as acknowledge the efforts of people that I don’t like etc. The behaviour of these people towards critics reminds me of how I used to think when I was a young postdoc i.e. seeing critics as the enemy and internally dismissing anything they would say as being wrong because such people were objectionable to me.

    • Tom Anderson
      Posted Jun 12, 2012 at 10:06 AM | Permalink

      A possible complication for the Authors, in determining how to revise/correct the paper, would be their correspondence with the peer reviewers and editor during the peer review process. While it is entirely possible the paper received a cursory “pal review”, it is plausible that they would have needed to respond to certain questions and/or made changes to satisfy the reviewers and editor. If they responded strongly to questions regarding the detrending issue (i.e., it was an important part of the paper) during the peer review process, it may be difficult to just sweep those arguments under the rug and move on.

    • Jimmy Haigh
      Posted Oct 18, 2012 at 11:08 AM | Permalink

      Proxies? The fewer the better.

  23. Posted Jun 8, 2012 at 5:01 PM | Permalink

    I am quite certain that they ‘discovered’ the problem after someone warned them of the appearance of this post:

    http://climateaudit.org/2012/05/31/myles-allen-calls-for-name-and-shame/#comment-336480

    The timing of their ‘peer review’ and other events makes me quite certain.

  24. stan
    Posted Jun 8, 2012 at 5:08 PM | Permalink

    Isn’t it a shame that the reputation of climate scientists is so poor that we naturally wonder about his implication that his team had caught the problem before anything was posted at Climate Audit? Since we know that Climate Audit is monitored constantly by the hockey team, claims such as Karoly’s would be viewed with less suspicion if, as soon as Steve had posted, the authors had posted a comment explaining that they were aware of a problem, explained their understanding of the problem, and discussed what they actually did.

    I simply find it very difficult to believe that, if they were aware of the problem on Tuesday, they never came here to mention it. Not just because it would have saved a lot of folks a lot of time, but because there is a very significant potential benefit to them to get the input of a lot of very bright folks. The silence doesn’t just cast doubt about the accuracy of their story. It seems so strange under the circumstances.

    • stan
      Posted Jun 8, 2012 at 5:20 PM | Permalink

      I wonder if this would qualify as one of those “Communicating the Science Better” issues that seem to be so high on the agenda of so many scientists lately. If you want to communicate ‘better’, try communicating ‘period’.

  25. ThomasL
    Posted Jun 8, 2012 at 5:11 PM | Permalink

    Overall the dialogue seems a little better. Not all the way there; but some signs of progress.

  26. Kenneth Fritsch
    Posted Jun 8, 2012 at 5:13 PM | Permalink

    “Alternatively, they may well argue that the “right” screening method is to calculate correlations without detrending. Although the Screening Fallacy has been argued here and at other blogs, it is not admitted as a fallacy by realclimate_scientists.”

    Which to me says that an independent analysis should continue. There is much we still do not know about these proxies and how Gergis (2012) put them together.

  27. Brent Buckner
    Posted Jun 8, 2012 at 5:14 PM | Permalink

    David Karoly wrote: “The testing of scientific studies through independent analysis of data and methods strengthens the conclusions”

    Sure would be nice if they made checking of the methods by which records were excluded from the final analysis easier (i.e. archiving the data for independent scrutiny of the screening method, as opposed to calling for “research”). Given that they made a mistake in the final analysis, confidence in their preliminary screening takes a further hit.

  28. Green Sand
    Posted Jun 8, 2012 at 5:29 PM | Permalink

    We now know when they found the problem – 5th June, but where?

    It would have to be a product of “years of our research effort based on the development of our professional networks.”

    Mr McIntyre I reckon you might just have become a member of their “professional networks”

    Good interesting work from all (and I mean all) involved. If somebody does not originally “put up” we go nowhere.

  29. Tony Mach
    Posted Jun 8, 2012 at 5:53 PM | Permalink

    Wasn’t detrending the point of the study? If this had happened to me after being so loudmouthed, I would feel like a complete and utter idiot – but then what do I know about what is and isn’t a “normal part of science”.

    And yes, it does smell kind of funny when proud loudmouthed authors shred their own paper right at the same time skeptics are posting the exact same issues.

  30. Ben
    Posted Jun 8, 2012 at 5:59 PM | Permalink

    I think the main point to take from this is that without ClimateAudit and its contributors, that paper would have been central to the paleoclimate chapter of AR5:

    Mann (NH) + Gergis (SH) = global hockey stick, let’s roll.

    Well, it all doesn’t look that simple now…

  31. Ben
    Posted Jun 8, 2012 at 6:08 PM | Permalink

    “My impression is actually that the particular error was not first identified at Climate Audit. Karoly’s letter says “also”. -eric ”

    It’s quite impressive the number of errors that get “independently identified” the same day they get published on ClimateAudit.

    Either ClimateAudit has a BigBrother program and can hear what is said in the labs…. or the people in the labs are reading ClimateAudit. You pick what seems the most likely.

  32. eyesonu
    Posted Jun 8, 2012 at 6:23 PM | Permalink

    With regard to their claim that “they” discovered an issue with the paper on June 5: the paper had been peer reviewed at AMS and was due for publication. Why would they be reviewing it at this late date? Phase I of damage control?

  33. Nicolas Nierenberg
    Posted Jun 8, 2012 at 6:34 PM | Permalink

    I think you all (not Steve so much) are being kind of rude about this. David Karoly wrote Steve directly which I think was a nice thing to do. Communication is a two way street and crowing and venting won’t improve the atmosphere. It is obviously very embarrassing for them to discover such a fundamental problem with their paper. You all need to give them a little space.

    • Green Sand
      Posted Jun 8, 2012 at 7:02 PM | Permalink

      Re: Nicolas Nierenberg (Jun 8 18:34),

      “You all need to give them a little space.”

      But surely the basis of the initial request for data and code was to give “space”?

      And what did this attempt to allow “space” receive?

      “This is commonly referred to as ‘research’.

      We will not be entertaining any further correspondence on the matter.”

      Arrival of “homo superbus”?

    • Steve McIntyre
      Posted Jun 8, 2012 at 7:18 PM | Permalink

      Re: Nicolas Nierenberg (Jun 8 18:34),
      Nic,
      it seems a little “grudging” for Karoly to claim that they had discovered the problem prior to Climate Audit drawing their attention to it. Too much like Gavin Schmidt here:

      [Response: People will generally credit the person who tells them something. BAS were notified by people Sunday night who independently found the Gill/Harry mismatch. SM could have notified them but he didn’t. My ethical position is that it is far better to fix errors that are found than play around thinking about cute names for follow-on blog posts. That might just be me though. – gavin]

      The person who “independently” found the mismatch was, of course, Gavin Schmidt, who had learned of the problem at Climate Audit.

      In a bizarre bit of plagiarism, Schmidt notified the BAS without acknowledging or citing Climate Audit. A peculiar plagiarism in that the purpose was not for Schmidt to claim credit for this small point, but to deny credit to Climate Audit for even the smallest point.

      Schmidt tried to conceal this stunt. With some amusement, I encouraged BAS to acknowledge the Mystery Man – which they did, outing Schmidt.

      It’s not like the Harry/Gill mismatch proved to be a large point – although I didn’t know this at the outset. It was strange to watch Gavin’s plagiarism and subsequent contortions.

      • Brent Buckner
        Posted Jun 8, 2012 at 7:49 PM | Permalink

        I didn’t see a claim to prior discovery, for me the “also” merely represented a claim to independent discovery.

        • theduke
          Posted Jun 8, 2012 at 8:10 PM | Permalink

          I think he was being purposefully ambiguous. He gives a specific date of June 5, which is the same day as the discovery was made here. But if you are being specific, why not include the time of day?

          Another possibility could be that by making it ambiguous, he’s offering a draw. It allows them to say privately that yes, they actually figured it out before CA did. And it allows some of us here to say the folks here figured it out first.

          Of course, they had all the data readily available and extensive experience with the data so when Jean and Steve and Roman et al were giving progress reports, they might very easily have been able to determine the problem before CA did, but only because CA people were openly discussing the problem.

        • stan
          Posted Jun 8, 2012 at 8:39 PM | Permalink

          And if so, it would have been courteous to tell all these people laboring on the paper that they had identified the issue. I can’t believe they weren’t aware of what was going on at Climate Audit. Apparently they didn’t even bother to give Nick Stokes a heads up and he’s a team defender.

        • Bob Koss
          Posted Jun 8, 2012 at 8:50 PM | Permalink

          Brisbane time is GMT +10:00 while this blog I believe is GMT -6:00. That would make Jean’s June 5th post early in the day on June 6th in Australia. It seems Karoly is definitely claiming precedence on the discovery.
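          [Editor’s note: Bob Koss’s timezone arithmetic is easy to check. A quick sketch using fixed UTC offsets as he stated them – the posting time below is illustrative, not taken from the blog’s actual logs:]

          ```python
          from datetime import datetime, timedelta, timezone

          blog_tz = timezone(timedelta(hours=-6))    # blog server, GMT-6 (as Bob Koss assumes)
          brisbane = timezone(timedelta(hours=10))   # Brisbane, GMT+10

          # Hypothetical posting time: mid-afternoon June 5, blog time
          post = datetime(2012, 6, 5, 15, 0, tzinfo=blog_tz)

          # Convert to Brisbane time: 16 hours ahead, so already June 6 there
          print(post.astimezone(brisbane))   # 2012-06-06 07:00:00+10:00
          ```

          Any blog post from the afternoon of June 5 (GMT-6) falls in the morning of June 6 in Brisbane, which is the basis for reading Karoly’s “Tuesday 5 June” as a claim of precedence.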

        • Don McIlvin
          Posted Jun 8, 2012 at 10:13 PM | Permalink

          There may be a difference in what Karoly is claiming they discovered. On June 5th and 6th, posts and comments on CA drew attention to the fact that the selection of proxies based on temperature correlation was not sound.

          I don’t see anything saying that the claim that they correlated against detrended data was false – not until Steve stated it at about 9 am on June 8th, in comments on the “Significance” post. Jean S seemed to know it somewhat before that in back-and-forth with Nick Stokes, but didn’t quite make the statement in no uncertain terms.

          So Karoly, perhaps hearing that there were issues identified on CA, looked at the data, and “discovered” that the data set used was not the detrended data set – causing the problems of “significance” discovered on CA. There is a small distinction here that Karoly can make his “independent” claim on.

        • Don McIlvin
          Posted Jun 9, 2012 at 7:08 PM | Permalink

          I have tried to put forth a plausible argument that could give Karoly the benefit of the doubt on his claims. But in the end his claim of “independent” discovery and the points I have tried to make on his behalf have all been shredded by the counter points made here at CA. He’ll need a defender that can make something from the straws remaining.

          I’m left wondering; Why do these guys (on the “team”) pull crap like that?

      • bernie1815
        Posted Jun 8, 2012 at 8:13 PM | Permalink

        The prior history certainly was what came to mind when I saw the unneeded “also”. But there is, I believe, a way to test the authenticity of Karoly’s note: Steve, Jean S, Hu, etc. could ask him to identify who else identified the problem(s), and when. It might also be the time to ask for the missing proxies. If his response to those requests is constructive, then I think a corner has indeed been turned.

        • Posted Jun 9, 2012 at 12:02 PM | Permalink

          I take no credit here — All is due to Jean and Steve!

    • clt510
      Posted Jun 8, 2012 at 8:58 PM | Permalink

      Nicolas, speaking as a scientist, there’s a cure here…

      If you don’t like being criticized for making major mistakes, take care not to make so many of them. (Your kerfuffle with Naomi Oreskes should tell you something about the mentality of the people involved on “that side” of the debate.)

      There’s a term I use for people who can’t admit to error and acknowledge the people who pointed it out to them. There’s a scene from the original Ghostbusters movie that summarizes it. Can anybody guess what scene that is?

      Gavin Schmidt among others is 0 for 2 now in that department. IMHO.

      • Nicolas Nierenberg
        Posted Jun 9, 2012 at 4:27 PM | Permalink

        I can assure you that if anywhere along the way I had gotten even a semi-nice email from Naomi Oreskes indicating that they had discovered an error in their paper related to my work, I would not have responded discourteously. I would instead have posted the letter (with her permission) and thanked her.

        • clt510
          Posted Jun 9, 2012 at 10:40 PM | Permalink

          Nicolas, I would have too.

          Unfortunately the arrogance of that group knows no bounds, and that leads to natural hostility on the part of people that bump heads with them.

          If you don’t want people to react hostilely towards you, there’s a cure for that too. Don’t behave with such hostility towards reasonable requests to start with. And learn to admit publicly when you’ve made mistakes.

          Naomi as far as I know has never publicly retracted any of the outlandish claims in that paper (including the gaffe about Reagan being president in 1980), but that is unfortunately par for the course with these people.

  34. neill
    Posted Jun 8, 2012 at 6:40 PM | Permalink

    snip – excessive

  35. Peter
    Posted Jun 8, 2012 at 6:45 PM | Permalink

    Well played, Sir!! I just like that phrase. “Well-done analysis” would be more appropriate.

    Regardless of the ultimate tech posturing and outcome, it still is what it is.

    Know what I mean?

  36. Posted Jun 8, 2012 at 7:03 PM | Permalink

    Reblogged this on Climate Ponderings.

  37. theduke
    Posted Jun 8, 2012 at 7:15 PM | Permalink

    Steig weighed in with a response to Roger at Real Climate. What you’d expect:

    Roger says:
    8 Jun 2012 at 5:41 PM

    dboston

    The authors publicly thanked them. Why shouldn’t that information be disseminated ? The fact you don’t like them is irrelevant. There are many people I dislike and of whom I disapprove. However, if such people have done something worthy I’m not going to pretend that it didn’t happen and not acknowledge this.

    The fact is that the error was reported at Climate Audit. In amongst the supposed innuendo etc., there was very useful quantitative work going on in reproducing results.

    [Response:My impression is actually that the particular error was not first identified at Climate Audit. Karoly's letter says "also". But whatever, this is a perfectly good example of science working properly. My initial impression is that Gergis et al.s' results will not wind up changing much, if at all.--eric]

    • Posted Jun 8, 2012 at 7:26 PM | Permalink

      eric:

      My initial impression is that Gergis et al.s’ results will not wind up changing much, if at all.

      That’s my impression as well. The result looked like a Southern Hemisphere Hockey Stick, and that, we all know, is the result being desperately sought. As Steve has already said, there are multiple fallback positions. This comment speaks of Eric’s long experience as a climate scientist. We do well to listen to such impressions, to listen and to learn.

    • johnl
      Posted Jun 9, 2012 at 12:46 AM | Permalink

      The results will obviously not change much. Because if they deviate from the predetermined right answer, they will simply be withdrawn or rejected.

  38. Mpaul
    Posted Jun 8, 2012 at 7:38 PM | Permalink

    I suspect that they will argue that they used the “modern de-trending convention”.

  39. MarkB
    Posted Jun 8, 2012 at 7:51 PM | Permalink

    It would be nice if we could step back and reflect on the nature of scientific publication. The publication of a peer reviewed paper does not validate the contents. Publication simply submits the work to the criticism of the relevant community. No scientific paper ever ‘proved’ anything. After publication, if the paper holds up to scrutiny, its contents may be considered provisionally correct.

    What happened in this case is the system working. If papers in field ecology were subject to the same level of crowd-source criticism, similar results could be expected. This is what graduate students do once a week in lab meeting – go over selected papers and find the warts.

    Needless to say, tribalism in paleoclimatology has prevented this sort of ‘outsider’ criticism from being respected and integrated. So much the worse for them. But the process itself is science at its best.

  40. timetochooseagain
    Posted Jun 8, 2012 at 7:54 PM | Permalink

    The suggestion that this shows science working is a little strange: should not errors be identified, ideally, at the review stage? Should not reviewers demand code and data, rather than it being necessary for bloggers to find errors after that stage?

    Just asking, really.

  41. michael hart
    Posted Jun 8, 2012 at 8:17 PM | Permalink

    I am sure I do not merit being described as one of the ‘scrutineers’ that David Karoly thanks at the Climate Audit website for reviewing the paper:

    Gergis et al (2012) study ‘Evidence of unusual late 20th century warming from an Australasian temperature reconstruction spanning the last millennium’

    But if I was, I’m also sure that I would say “You’re Welcome” to all of the authors of the paper.

  42. Gerald Machnee
    Posted Jun 8, 2012 at 9:02 PM | Permalink

    In my first comment at the top I was not looking at who found it first, but merely commenting on the fact that Karoly thanked Climateaudit without any sarcastic remarks. So some progress was made.

    • eyesonu
      Posted Jun 8, 2012 at 9:35 PM | Permalink

      Some progress was made. Phase I damage control?

      Where did I read this; “We will not be entertaining any further correspondence on the matter.”

      Their hand was forced. The evidence was overwhelming that the paper was flawed. I will end with that.

  43. unscientific lawyer
    Posted Jun 8, 2012 at 9:02 PM | Permalink

    “Research is what I’m doing when I don’t know what I’m doing.” – Werner von Braun

    • unscientific lawyer
      Posted Jun 10, 2012 at 12:58 PM | Permalink

      “Research before you besmirch.” – Kowolski, The Penguins of Madagascar

  44. Posted Jun 8, 2012 at 9:53 PM | Permalink

    So there were five authors on this paper [J. Gergis, R. Neukom, S.J. Phipps, A.J.E. Gallant, and D.J. Karoly] and it would appear that not one of them thought it necessary to verify that to which they were appending their respective names and seals of approval, prior to submission to the journal.

    Is this common practice – or more to the point, I suppose, non-practice – in this brave new field, I wonder? Or do co-authors use “intuition”, in much the same way as Phil Jones deploys his when conducting “peer review”?

    Notwithstanding Karoly’s claim that putting a publication (already cited in AR5) on hold is “a normal part of science” … one would have thought that for such a high-profile paper someone, somewhere along the way, would have exercised some … uh …precautionary (!) measures.

    Steve, I also wonder what – if any – was the time-lapse between Karoly’s E-mail and the first announced spotting of the missing paper and/or press release, and whether his E-mail preceded (or followed) the sighting(s)?

    Not that I have suspicious mind, of course; but – in the absence of any communication from the famous five during the course of discussions and discoveries here – I thought it might be useful to have the perspective of the timelines involved … just for the record.

  45. MikeN
    Posted Jun 8, 2012 at 10:08 PM | Permalink

    It’s possible they found more than one problem in their analysis, and one of these was that identified by ClimateAudit.

    • Posted Jun 9, 2012 at 2:17 AM | Permalink

      I would find it fascinating to know what problems were found. A full and frank write up published by Steve would do a lot to reassure us about the quality of their science.

      Indeed, I’d suggest the greatest scientists are the ones who are most open and honest about their mistakes … those who are their own best critics … because the more critical you are of your own work, the less anyone else has to criticise!

      • HAS
        Posted Jun 9, 2012 at 2:43 AM | Permalink

        In some ways I wonder if the particular methodological error that is the source of the back slapping here might be less important in the scheme of things than other issues with the paper.

        There is IMHO a sloppy approach to methodology that one sees exhibited here, that is common in the more “social” sciences, but isn’t normally seen in the more physical sciences. From where I sit I see climate scientists deep in detail playing with data sets without the wider view of what they are about. That wider view and the ability to do a good old reality check is what is lacking.

        By all means dine out on uncovering a mistake, but there is a wider problem here – the failure to put particular studies in a wider context – in this case just how much real extra information (in a somewhat technical sense of the word) are we introducing to the problem at hand?

      • pouncer
        Posted Jun 9, 2012 at 3:27 PM | Permalink

        Across the several sites I follow there have been claims (I neither attribute nor endorse, nor even offer as exhaustive)

        The proxies selected fall outside the geographical bounds of the area defined for the SH analysis.

        The dates selected for the temperature record do not correspond to the “growing season” of some proxies.

        The two-tailed test used for selection results in inclusion of “upside down” proxies.

        Some long duration proxies were truncated to match each other rather than calibration periods.

        (what I remember off the top of my head…)

  46. daved46
    Posted Jun 8, 2012 at 10:39 PM | Permalink

    We would like to thank you and the participants at the ClimateAudit blog for your scrutiny of our study, which also identified this data processing issue.

    Thanks, David Karoly

    Well, if we just went by this sentence, you don’t have to assume anyone else found the problem. The question is what the word “which” refers back to. If it’s “your scrutiny”, then the sentence can be taken to mean that the scrutiny “identified this data processing issue”. So he could mean that he was glad for the scrutiny per se and that it identified the issue. I don’t think that’s what he meant, but if there were some reason he wanted to retreat and give CA total credit, he might say that’s what he meant. And rereading the message, I don’t see anything in it saying explicitly that the first identification of the issue didn’t come from CA.

  47. MrPete
    Posted Jun 8, 2012 at 11:11 PM | Permalink

    I agree with a note buried above, and would amplify it further:

    I would like to encourage everyone to lay off the snark, and to take a gracious perspective on this. What was written to Steve was a gracious thank you. And it is eminently reasonable to read the “key” sentence as a further very gracious statement. Steve might even want to highlight this in the head post:

    We would like to thank you and the participants at the ClimateAudit blog for your scrutiny of our study, which also identified this data processing issue.

    The most straightforward parsing of this is:
    * We thank you
    * for your scrutiny of our study
    * [and for the fact that your scrutiny] also identified this data processing issue.

    It is far more reasonable (and better grammatically) to read “which” as referring to the scrutiny rather than referring to the blog.

    Blessings,
    Pete

    • JonasM
      Posted Jun 8, 2012 at 11:21 PM | Permalink

      I would agree – I read the sentence as defining ‘scrutiny’ as ‘discussing the selection/exclusion of proxies’. This discussion ALSO discovered this new problem. The significance question was not part of the initial discussion. Thus the word ALSO.

      Regardless, this is a point for science and the blogosphere (in whatever order you prefer).

      • MikeN
        Posted Jun 9, 2012 at 1:43 AM | Permalink

        So the scrutiny has done two things, one unidentified, and also found the error?

    • Steve McIntyre
      Posted Jun 8, 2012 at 11:25 PM | Permalink

      A question.

      Karoly appears to have notified the Journal of Climate about problems with the data. I wonder if he credited Climate Audit in his letter to the Journal of Climate. If he didn’t (and my guess is that he didn’t), then they would have failed to give proper credit to Climate Audit. As Eric Steig pointed out, a private email does not provide credit equivalent to a public acknowledgement.

    • Eric
      Posted Jun 8, 2012 at 11:32 PM | Permalink

      completely agree but I’ll go further and say that the level of snark and gloating from the peanut gallery here is unbecoming and reflects poorly. Piling on is not what this site is about.

      That was a classy, gracious and bravely public letter from Dr. Karoly.

      • Tony Hansen
        Posted Jun 9, 2012 at 1:20 AM | Permalink

        Why is it brave?

      • Eric
        Posted Jun 9, 2012 at 11:50 AM | Permalink

        Eric here, I am a long time lurker. My rare comments are normally posted as EricH. This was not Eric S of RC. I just forgot the H.

        Karoly’s letter is a breakthrough and brave because he publicly commends the blog review process here at CA. This cannot have made other Team members happy, as it is an explicit acknowledgement that CA criticism should be taken seriously.

        I expect he will face repercussions.

      • Steven Mosher
        Posted Jun 9, 2012 at 5:31 PM | Permalink

        err, the peanut gallery actually found the problem. Oops, I just called Jean S a peanut.

        that could give Josh an idea or two. So: peanuts and jesters and gadflies.

        while gloating is out of line, I think Jean S deserves a victory lap – or maybe a free class in Matlab.

    • theduke
      Posted Jun 8, 2012 at 11:41 PM | Permalink

      You are very gracious, Mr. Pete, and you may be right. But Karoly’s letter could also be nothing more than a tactical, temporary retreat. They are obviously on the defensive here and a rude email of the type initially sent by Gergis was clearly out of the question. Posts above by Hillary, Stan, Bernie and others suggest other possible motives and strategies. And why couldn’t they have sent notice of the problem when they found it instead of keeping people in the dark?

      Let’s see what they do in the future. I’m not expecting any real improvement in the Team in either their papers or their tactics. AR5 is going to be a real battle. This is just an early skirmish.

      The following was posted today by Chris Y over at Bishop Hill:

      Regarding AR5, due 2013/2014, “the next Intergovernmental Panel on Climate Change report on global warming will be much worse than the last one.”
      Robert Orr, UN under secretary general for planning, November, 2010

    • Joe Blogs
      Posted Jun 9, 2012 at 1:58 AM | Permalink

      Eric (of RC?)

      the level of snark and gloating from the peanut gallery here is unbecoming and reflects poorly

      Yes…but it’ll likely be a while before folks stop responding to the years of unbecoming snark and dismissal from the peanut Establishment.

      That was a classy, gracious and bravely public letter from Dr. Karoly.

      By the undemanding standards of climate scientists, perhaps. For anyone else it’d just be business as usual.

      • Posted Jun 9, 2012 at 3:20 AM | Permalink

        No, it’s not as good as business as usual. Here’s Margaret Thatcher:

        In war there is much to be said for magnanimity in victory. But not before victory.

        Steve’s right to be cautious, given past experience. If Climate Audit is fully acknowledged in all communications about the problem, then that will be a first. There are plenty of signs that that’s not what is happening. Karoly was forced into a public letter to Steve because of the worldwide attention this blog now gets when such problems arise – attention it has had since the machinations revealed by Climategate, not least from the policy makers and others of influence who are very aware of this place.

        And it’s clear that Karoly’s letter was Plan B because of the public ridicule Plan A quickly attracted – to disappear the paper silently. Thanks Paul Matthews et al. This SH Team should have publicly acknowledged CA the moment they read about the problem here, rather than trying to slink off the scene without being noticed. (The Shh Team we should perhaps dub them.)

        There are big positives in this debacle – not least, as someone said on Bishop Hill, that the half-life of errors in paleoclimate studies comes down and down, compared to the lengths Steve and Ross were made to go to with the original IPCC hockey stick from 2001, dating back to Mann and co. in 1998, through to the judgments of Wegman.

        I, like everyone else, congratulate Jean S and his co-workers. In the end it’s not a matter of taking sides but insisting that proper standards are maintained. But there should also be something more: a genuine joy that outsiders are helping out, for free, through the wonders of the internet, in joint pursuit of the truth. When that delight is plain for all to see then I too may find it in my heart to be magnanimous.

    • Ben
      Posted Jun 9, 2012 at 2:22 AM | Permalink

      MrPete :

      They state in their e-mail :

      “we discovered on Tuesday 5 June”

      We discovered, not “it has been discovered at CA …”
      We is them :).

      • Tony Mach
        Posted Jun 10, 2012 at 6:10 AM | Permalink

        @Ben: We discovered, not “it has been discovered at CA …”
        We is them :) .

        Well, if I understand the normal ways of science as explained by Gergis et al., then it is their study, their work and their data – therefore logically they are the only ones who can discover and disclose any errors. How can you say the secret sauce is wrong, when you are not the cook?

        • Posted Jun 10, 2012 at 6:18 AM | Permalink

          To parody Tony Blair: tough on openness, tough on the consequences of openness.

    • bernie1815
      Posted Jun 9, 2012 at 12:07 PM | Permalink

      MrPete:
      You may or may not be correct. I tend to side with Ben given the tone of Gergis’s first response. But, as I noted before, the real test of their sincerity is whether Karoly, Gergis, etc., now release the unused proxies.

  48. Posted Jun 8, 2012 at 11:16 PM | Permalink

    Ningaloo data set appears to be from one drill sample.

    The .readme says:

    DESCRIPTION:
    The data set contains d18O and d13C in bimonthly temporal resolution for the time period 1878-1994. The material consists of a vertical core from a colony of Porites lutea. Samples for isotopic analyses were taken using a low-speed dental drill. Untreated samples were measured on a Finnigan MAT 251 mass spectrometer. The age model in the data set is based on the density banding of the skeleton and the seasonality in d18O.

    Also, the sensitivity to temperature appears to be quite poor, perhaps more so for temperatures in the distant past.

    Interannual variability of sea surface temperatures (SST) inferred from skeletal d18O is dominated by a 9.5-year period, and may constitute a characteristic signal of the Leeuwin Current. On long-terms coral skeletal d18O indicates a near-continuous increase of sea surface temperatures at Ningaloo Reef over one century. The skeletal d18O time series was checked for the presence of seasonal cooling events resulting from major volcanic eruptions. A ~1° C cooling is evident following the eruption of Pinatubo in 1991, which reproduces the results of earlier investigations. However, only weak or no signals can be related to the eruptions of Krakatau (1883) and Agung (1963).

    (bold mine)

    The BoM data from nearby Onslow Airport, which has temperature data from 1943 to 1973, shows a slight dip in monthly max/min means for 1963, but I’ve not had a chance to do more than plot the points and fit some curves. Way short of proper stats to say whether Agung produced an anomaly that should have affected the coral.

    There are also the usual uncertainties about SST not being air temperature, especially if there are substantial currents transporting heat along coastlines. Although the two are somewhat related, the differences in thermal mass are substantial. Warm air currents from over the continent are unlikely to warm the tropical waters measurably, noting the short period(s) over which there is a favourable temperature gradient. It is somewhat more plausible that heat driving continental circulation produces stronger winds and more surface evaporation, reducing SST. A negative “feedback”.

  49. ROB MW
    Posted Jun 9, 2012 at 12:18 AM | Permalink

    Exactly Eric. I’m sure that Ryan O’Donnell can relate to your new luv-in/sarc.

  50. johnl
    Posted Jun 9, 2012 at 12:32 AM | Permalink

    The big loser is the Journal of Climate and their pal-review process.

  51. Phil
    Posted Jun 9, 2012 at 1:44 AM | Permalink

    I think Dr. Karoly’s email is being over analyzed.

    Congratulations to Jean S, Steve and everyone else for some great science.

    • dougieh
      Posted Jun 9, 2012 at 4:10 PM | Permalink

      agree & add my thanks to the CA team for the “free” work done.

      @Eric – Jun 8, 2012 at 11:32 PM

      but I’ll go further and say that the level of snark and gloating from the peanut gallery here is unbecoming and reflects poorly. Piling on is not what this site is about.

      SM tries to limit this as much as possible (1 guy working for nothing), are you having a go at RC?

      • Steve McIntyre
        Posted Jun 9, 2012 at 4:14 PM | Permalink

        I discourage piling on comments and tend to delete them. I didn’t notice all that much piling on at Climate Audit, but perhaps I’ve gotten insensitive. It would be more helpful if you’d identify offending comments so that I can delete or snip them. Without such assistance, editing is too time consuming.

        It seems to me that any piling on has mostly taken place at other blogs.

        • Posted Jun 9, 2012 at 4:39 PM | Permalink

          ‘Eric’ went on to say, in the comment quoted by dougieh:

          That was a classy, gracious and bravely public letter from Dr. Karoly.

          I and some others here simply don’t agree with that evaluation. One has learned about the dangers of being too credulous as well as piling on in this area.

          I’ve had three small comments snipped in the last two days, all attempts at humour. I’m delighted as always that Steve just takes out whatever he doesn’t think enhances the thread, but there is one instance I want to highlight. I agonised for about half an hour about what, if anything, I could add in appreciation and response to kim at 5:54 AM. I agree with how Steve left it, but that one-liner amused me more than anything on CA for many months – because as well as being very witty it captured the emptiness of ‘we call it research’ so well. Some things deserve the derision. I’m very grateful that Steve has always been so robust in his editing here.

  52. Posted Jun 9, 2012 at 2:04 AM | Permalink

    Steve well done.

    This really shows the need to have people of differing views peer review papers. We all make mistakes, it is only human that the authors may make mistakes. But the system is supposed to pick up these mistakes by providing proper scrutiny through peer review.

    IT DOES NOT!

    And there are a lot of people here who are not paid and get no credit, and then, when we do a service to humanity, we have insults hurled at us.

    The very least they can do is to acknowledge our help and then perhaps the tone of some comments would be a lot friendlier.

  53. Geoff Cruickshank
    Posted Jun 9, 2012 at 2:20 AM | Permalink

    I believe a FOI request to establish what time of day the error was identified would be in order.

    Seriously folks, nobody believes they were sitting around saying “Let’s just check we selected those proxies correctly” after the claimed years of strenuous research effort, strenuous networking to get the data, 3000 spaghetti permutations, peer review and publication and triumphant press release. It is not remotely plausible that anyone would re-check their very basic selection process after all that unless some problem had been raised elsewhere.

    Prof Karoly’s ‘also’ is completely implausible.
    We should be polite about it, as if someone audibly farted at a cocktail evening and looked around as though trying to identify the offender. Not convincing, but understandable.

    • kim
      Posted Jun 9, 2012 at 5:54 AM | Permalink

      That looking around, we call it research.
      ============

    • Steve McIntyre
      Posted Jun 9, 2012 at 7:01 AM | Permalink

      Prof Karoly’s ‘also’ is completely implausible.
      We should be polite about it, as if someone audibly farted at a cocktail evening and looked around as though trying to identify the offender. Not convincing, but understandable.

      Steve: :).

      • bernie1815
        Posted Jun 9, 2012 at 7:11 AM | Permalink

        Josh, did you catch that one?!!!

        • Posted Jun 9, 2012 at 10:29 AM | Permalink

          Yes ;-)

        • Jean S
          Posted Jun 9, 2012 at 10:48 AM | Permalink

          Josh, can’t wait to see that!

          This made me also laugh:

          The veracity of the proxy methods used is not universally accepted. Professor Graham Farquhar, a biophysicist at the ANU’s Research School of Biology, says the use of surrogates is problematic. For example, he says trees are likely to have grown faster between 1921 and 1990 due to the increased atmospheric concentrations of carbon dioxide, not just the rise in temperature. ”It is obviously very useful to have such data, but I can’t see it as being definitive,” he says.

          The authors dismiss this concern. Karoly says that nothing is certain in science, but the results draw from a range of sites and using state-of-the-art statistical methods can be accepted with high confidence: ”It is reinforcing that barrage of scientific information that confirms that the climate is warming and increasing greenhouse gases are the major cause.”

        • theduke
          Posted Jun 9, 2012 at 10:59 AM | Permalink

          From Jean’s link:

          Co-author and University of Melbourne climate science professor David Karoly says the study for the first time establishes that claims there was a substantial mediaeval warm period hotter than today had no basis in Australasia. The study uses climate proxies – surrogates for the record of observed temperatures that date back to only the early 20th century.

          Which brings to mind this:

  54. Posted Jun 9, 2012 at 2:25 AM | Permalink

    Steve, you’ve said “finding it on the same day defies credibility”.

    To be quite fair, as soon as you know you are under scrutiny you tend to start looking to check. Perhaps it went something like this. Head: “is there anything in this criticism?” … “Not sure, I’m not sure about the stats” … “well get ??? to take a look” … ???: “oops, there are quite a few problems and you certainly can’t publish until we look at them more carefully”.

    Steve: finding it “independently” is impossible to believe. Once I had asked for data to verify the screening calculation and the issue of the relevance of detrending to the Screening Fallacy had been joined, they might have checked the calculations on the 27 proxies shortly before Jean S did. But they did not do so “independently” of Climate Audit, any more than Gavin Schmidt “independently” found the splicing of the Harry station.
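    For readers unfamiliar with why the detrending point matters to proxy screening, here is a minimal sketch (all data synthetic and purely illustrative, not the Gergis et al network or method): two series that share nothing but a common trend can pass a raw-correlation screen easily, yet fail once both are detrended.

```python
import numpy as np

rng = np.random.default_rng(0)

def detrend(x):
    """Remove the least-squares linear trend from a series."""
    t = np.arange(len(x))
    slope, intercept = np.polyfit(t, x, 1)
    return x - (slope * t + intercept)

# Synthetic "instrumental" target for 1921-1990: a warming trend plus noise.
years = np.arange(1921, 1991)
temp = 0.01 * (years - 1921) + rng.normal(0.0, 0.15, len(years))

# A "proxy" that shares the trend but none of the year-to-year variability,
# i.e. it carries no temperature signal at interannual scales.
proxy = 0.01 * (years - 1921) + rng.normal(0.0, 0.15, len(years))

r_raw = np.corrcoef(proxy, temp)[0, 1]
r_det = np.corrcoef(detrend(proxy), detrend(temp))[0, 1]

print(f"raw correlation:       {r_raw:.2f}")  # inflated by the shared trend
print(f"detrended correlation: {r_det:.2f}")  # much lower: trend removed
```

    Screening on the raw correlation admits this proxy on the strength of the shared trend alone; the detrended test that the paper described is precisely what guards against that.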

    • Eddy
      Posted Jun 9, 2012 at 3:31 AM | Permalink

      I agree Mike. I think it was Steve who suggested that the proxies may have been chosen early in the investigation and that when they came to use them it was assumed that they had been selected using a detrended test. If it was down to a misunderstanding then that could have come out independently once someone was prompted to ask the right question. That would have been an ouch moment.

  55. Annabelle
    Posted Jun 9, 2012 at 3:18 AM | Permalink

    I thought Joelle Gergis was very rude, arrogant and patronising in her reply to you, so I am rubbing my hands in glee to see her with egg on her face.

    Congratulations to you and Jean S. Millions of “little people” like me all over the world are grateful to you for your voice of reason and sanity in the climate madness.

  56. Stephen Richards
    Posted Jun 9, 2012 at 3:45 AM | Permalink

    Steven Mosher

    Posted Jun 8, 2012 at 4:28 PM | Permalink | Reply

    No gloating.

    The lesson to be learned should be learned by the JOURNAL in question. Authors will make mistakes. Ideally they catch their own mistakes. If reviewers don’t demand the code and check

    1. that it runs
    2. that it produces the graphs and tables published
    3. that it actually does what the text describes

    Then they cannot approve the paper for publication. It’s so simple that I’m stunned this has to be explained to anyone. Apparently it does.

    I would suggest that people write to the Journal of Climate and suggest that their editorial practices need some improvement.

    Steven

    Your best post by far for a very long time. Absolutely by your side. Gloating is not required here. We should allow Karoly his dignity even though we know it was the genius that is CA that found the errors.
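
    Mosher’s first two checks are mechanical and could in principle be scripted by a journal; the third still needs a human reader. A hypothetical sketch (the run command and file names below are invented for illustration, not any journal’s actual process):

```python
import filecmp
import subprocess

def check_submission(run_cmd, produced, published):
    """Check 1: does the submitted code run?  Check 2: does it regenerate
    the published table byte-for-byte?  Check 3 (does the code do what the
    paper's text describes) is left to the human reviewer."""
    result = subprocess.run(run_cmd, shell=True)
    if result.returncode != 0:
        return False  # the code does not even run
    return filecmp.cmp(produced, published, shallow=False)
```

    A False here would give a reviewer grounds to send the paper back before publication rather than leaving the problem to be found afterwards.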

    • Joe Blogs
      Posted Jun 9, 2012 at 4:28 AM | Permalink

      The basic rule of diplomacy is to always offer your opponent an easy out. Don’t trap him, or a messy fight to the death will ensue. What you most want is quiet surrender.

      So Yes, we should officially allow Karoly his dignity, as bogus as it obviously is.

    • maxberan
      Posted Jun 9, 2012 at 5:36 AM | Permalink

      It would obviously lower the half-life of scientific error if it were so, but who would agree to become a peer reviewer if they were obliged to devote time, and divert it from their own interests, in order to repeat large parts of the reviewee’s research? Peer reviewers help the journal editor decide whether a contributed manuscript is acceptable for publication, not whether it is correct (though superficially obvious mistakes or inconsistencies picked up during the read-through will be pointed out). Basically it boils down to answering the question, “Will the journal reader understand what is written?”, not “Am I able to underwrite what is written?”. Correctness or otherwise is left to the judgment of history, which in science manifests itself most often by work just fading into oblivion, less often by direct contradiction.

      • Joe Blogs
        Posted Jun 9, 2012 at 5:47 AM | Permalink

        So basically, the boast that some science is “peer reviewed” is just as worthless as that jejune rap number that touted it. What really counts is history, now fast-forwarded by the internet.

        • maxberan
          Posted Jun 9, 2012 at 10:12 AM | Permalink

          When you say “What really counts is history”, it depends on who is doing the counting. In my days as a working scientist (hydrology) I certainly valued journals as a window on who was doing what and staying informed on new methods and data. I’m sure the filter of the review process must have saved me from a lot of wasted time and effort. I also played my part by reviewing the work of others though it would not have occurred to me that I should repeat the research as part of my review. Users of published research who are fellow scientists are well aware of the limitations.

          But of course the current issue is different because hugely expensive policies rest on journal articles and a new politicised population of users are doing the counting. The additional layers of scrutiny appropriate to science in the service of policy have been fully rehearsed in this thread and others, and I don’t depart from them at all.

          So I want there to be science journals and believe peer review within its present boundaries is worth preserving.

      • Steve McIntyre
        Posted Jun 9, 2012 at 7:31 AM | Permalink

        Max, I had this discussion with Stephen Schneider in 2004.

        I think that journals should have policies requiring the archiving of data and, in this day and age, source code, as it makes the process of replication far more efficient. I think that authors should be required to affirm that they have complied with journal policies on archiving and that reviewers should check that the data is archived as promised. I don’t think that reviewers should be obliged to check all the work. However, if a reviewer wants to check the work, then I think that the journal should ask the author to provide him all the work.

        • Posted Jun 9, 2012 at 7:59 AM | Permalink

          However, if a reviewer wants to check the work, then I think that the journal should ask the author to provide him all the work.

          What you said to Schneider in 2004 is of historical importance in showing how the IPCC and all the journals from which it selects papers could have prevented such future messes long ago. But wouldn’t you say today that data and code should be provided with any published paper in this area, for anyone to access, in the open source, open content fashion? Mosh makes key points about how turnkey what is provided should be, based largely on your own example. But even if things are not perfect as a turnkey package (that for me is stage 2) surely today complete openness of data and code should be mandated by the IPCC and all the journals on which it depends?

          I realise publication might be considered different from peer review but as an outsider I can’t see why there should be a different standard of openness at that point. In other words, reviewers shouldn’t have to ask, they should receive paper, data and code as a matter of course.

        • maxberan
          Posted Jun 9, 2012 at 11:11 AM | Permalink

          Sorry, I missed your comment before responding to Joe Blogs’ but I think it mostly answers the case. As usual, horses for courses: the scrutiny appropriate to bog-standard science is not appropriate to high impact policy relevant science.

          I wonder what my reaction would have been if a reviewer asked to see my working. A lot of what one does in environmental “science” comes under the heading of “skill and judgment”. So numerical differences between what the paper author wrote and a reviewer armed with your data concludes could rapidly deteriorate into an “I’m right, you’re wrong” slanging match even though an arbitrator would dismiss each as expressing fair comment. I feel that pointing out methodological shortcomings (including its reporting) is a more fruitful use of peer review than checking the sums.

          Is there also a distinction between types of research? Seeing if SH proxies yield a hockey stick is much more circumscribed in its intent, data and methods than the open-ended stuff that populates most journals most of the time. The former is perhaps more given to what you advocate. A mining analogy might be research to establish the best place to drill as opposed to research to discover new drilling procedures. Needs more thinking about!

        • dougieh
          Posted Jun 9, 2012 at 5:01 PM | Permalink

          In ENGINEERING, when we check a part/product dataset (an aircraft wing rib, for example), we have stress docs, manufacturing & other info which should all agree before we sign them off as fit for purpose (if the model/software is absent, I would not sign it off; how could I?)
          old argument I know, but back to academia :-)

      • Steven Mosher
        Posted Jun 9, 2012 at 3:40 PM | Permalink

        I prefer the PLos approach

        By submitting to PLoS ONE, authors agree to make freely available any materials and information described in their publication that may be reasonably requested for the purpose of academic, non-commercial research. Authors should make every effort to allow immediate and unrestricted access to all data and replaceable materials that are relevant to an article, except where this would breach confidentiality rules related to human-subject research.

        If restrictions on access to data or materials come to light after publication, PLoS ONE reserves the right to post a correction, contact the authors’ institutions and funders, or retract the publication.

        PLoS ONE will not consider a study if the conclusions depend solely on the analysis of proprietary data. If proprietary data were used, and the authors are unwilling or unable to make these data public, then the paper must include an analysis of public data that validates the conclusions so others can reproduce the analysis and build on the findings. These policies have been developed in accordance with the principles established in Sharing Publication-Related Data and Materials (National Academies Press, 2003).

        Data for which public repositories have been established should be deposited before publication, and the appropriate accession numbers or digital object identifiers (DOIs) published in the “Methods” section of the paper. Please contact plosone [at] plos.org with any questions about appropriate discipline-specific repositories.

        If an appropriate repository does not exist, data should be provided in an open access institutional repository, a general data repository such as Dryad, or as Supporting Information files with the published paper. If none of these options are practical, data should be made freely available upon request.

        For submissions that describe new software, authors should adhere to appropriate open source standards and deposit source code in an acceptable archive. The manuscript should also provide full details of any algorithms used. See the PLoS ONE Editorial Policies for Methods, Software, and Database, and Tools Papers and the Manuscript Guidelines for further information.

        The PLoS policy for sharing materials, methods, and data can be found here.

        • Posted Jun 9, 2012 at 4:50 PM | Permalink

          Very helpful Steve. The openness train is gaining momentum. Climate is the hard case.

    • Dave L.
      Posted Jun 9, 2012 at 7:26 AM | Permalink

      One would think that, when Gergis et al resubmit the paper to the JOURNAL, Steve would be asked to serve as one of the reviewers — if the Editor of the JOURNAL has learned anything from this encounter. Karoly, if he is truly being sincere (and he has spoken for the Team), ought to make this request as well.

  57. Stephen Richards
    Posted Jun 9, 2012 at 3:49 AM | Permalink

    Eddy

    Posted Jun 9, 2012 at 3:31 AM | Permalink | Reply

    I agree Mike. I think it was Steve who suggested that the proxies may have been chosen early in the investigation and that when they came to use them it was assumed that they had been selected using a detrended test. If it was down to a misunderstanding then that could have come out independently once someone was prompted to ask the right question. That would have been an ouch moment.

    Eddy

    I fear you are wrong here. I worked for many years in a research environment. What you suggest doesn’t happen that easily. I favour the Gavin route, and a conversation that went something like this:

    They are turning your paper inside out on CA and have found problems with it. I think it best you follow their site and be ready to act.

    xxxxx Gavin

    MOSHER

    Some really good comments from you today. Just like the old days.

  58. KnR
    Posted Jun 9, 2012 at 3:50 AM | Permalink

    If you’re looking for honesty, you’d be a damn fool to look for it in Climate Science, as it only gets in the way of the ‘results’ they need.

    Gergis’s words show us what happens when you swap doing science for doing advocacy: you end up with an egg-covered face.

    I wonder if anyone on ‘the Team’ was a reviewer of this paper?

  59. Posted Jun 9, 2012 at 4:30 AM | Permalink

    I wonder whether the editor will now require the authors to make their whole data set available – in the interests of transparency and given that the issue is going to end up as a debate about selection. In his shoes I would do this, if only because otherwise doubts about the paper will persist indefinitely – and that’s bad news for the journal.

  60. DR_UK
    Posted Jun 9, 2012 at 4:41 AM | Permalink

    I wonder if any of the unused proxies would have got through the detrended screening.

  61. Ben
    Posted Jun 9, 2012 at 4:47 AM | Permalink

    Does anyone know if the corrected version will make it into AR5?

    The absence or the reality of a hockey stick in the SH being a major point of the AR5 paleoclimate chapter….

  62. Rick Bradford
    Posted Jun 9, 2012 at 5:23 AM | Permalink

    From an Australian layman’s perspective, the astonishing thing about this sequence of events is David Karoly admitting any kind of an error in work he is associated with.

    Usually, when Karoly and God have disputes about the nature of reality, it is God that has to back down….

  63. Adam Gallon
    Posted Jun 9, 2012 at 6:40 AM | Permalink

    It must really annoy the denizens of Realclimate.
    Up pops another “Hockey Stick” & it gets chopped up by Steve & his cohorts, in even less time than the previous ones!

  64. Posted Jun 9, 2012 at 8:21 AM | Permalink

    Steve,

    I don’t see why reviewers shouldn’t check archiving, but I doubt there is much mileage in requiring reviewers to do it. Firstly, most academics are sick and tired of reviewing things, and it is increasingly difficult to persuade them to review at all. Secondly, most reviewers would see this sort of check as being primarily the editor’s job. Thirdly, as you say, the real problem lies with the editors: if editors are not prepared to enforce their own journal’s policies then there is little point in reviewers pointing out failures.

    On the more general subject of peer review, I have rarely heard practising scientists ascribe any particular importance to peer review in the scientific enterprise; they are well aware of the many deficiencies of peer review because they see it regularly from both ends. The obsession with peer review as being somehow central to science is largely confined to scientists who feel their discipline is under siege from outsiders, and so want some excuse to be able to ignore those outsiders, and to heads of scientific societies, who want to create and maintain a privileged inner group, justifying funding and prestige.

    • Posted Jun 9, 2012 at 8:31 AM | Permalink

      In my fantasy world editors would ensure punctiliously that reviewers get paper, data and code, lessening the burdens on reviewers. They would then be free to make whatever use they want of such a package, given limited time – which should also usefully increase the ‘peer pressure’ for Mosher-endorsed turnkeyness. It should be de rigueur for reviewers to disclose which areas they have and haven’t checked, as you’ve suggested. With a process like that, we could value peer review but not overrate it. The full openness of data and code on publication is a much more important step.

      • Posted Jun 9, 2012 at 3:53 PM | Permalink

        Richard Drake Jun 9, 2012 at 8:31 AM

        With a process like that, we could value peer review but not overrate it. The full openness of data and code on publication is a much more important step.

        Richard, I agree … except that, IMHO, it should be noted that it is not “we” who have overrated the “value” of peer review. Rather it is the likes of Pachauri and The Team (and their MSM parrot friends) who have cavalierly dressed it up so that it has acquired a level of authority far greater than is warranted [cf Horton's contribution to Muir Russell].

        Consider also Jonathan Jones’ observation that “most academics are sick and tired of reviewing things”. YMMV, but it has been my experience that those who lack enthusiasm for the task at hand are unlikely to do a thorough – or reliable – job!

        And as we have repeatedly seen, it is “they” who are resisting “full openness of data and code on publication”.

        So the view from here, so to speak, is that they want it both ways! And all we’re left with is, in effect, “trust us, we’re climate scientists and we’re right because … well … because we (and our friends) said so!”

        • Posted Jun 9, 2012 at 4:16 PM | Permalink

          Hilary, my thinking on peer review has been influenced by conversations with Andrew Montford and by what Pat Frank has written in defence of it here on CA. Andrew left me with the impression that he doesn’t think it ever does much good (he can correct me if that’s off beam) but Pat thinks, from his experience as a chemist, that it’s worth preserving. I was conscious of both men when I wrote the ‘we’ – and Jonathan’s very interesting testimony above. I agree with everything you say about Pachauri and co but I do think some journalists can gradually be re-educated in this area. In an ideal world they’d be included in the ‘we’ too.

        • Posted Jun 9, 2012 at 5:24 PM | Permalink

          Richard, I think you may be more of an optimist than I :-) You suggest that “some journalists can gradually be re-educated in this area”. However, it strikes me that they may well be the slowest learners of the lot – not to mention being at the forefront of the enviro-activist-advocacy line!

          I would hold out more hope for these journalists if they had taken on board – and publicized – Donna’s TDT and Andrew’s THSI. But, for the most part, their silence on both has been deafening, has it not?

          And while I’m here (and since we have been speaking of Karoly), I believe it is worth noting that Karoly is one of the designated Review Editors for AR5-WG2-Ch25 – “Australasia”.

          His evident failure to check, prior to submission for publication, the work to which he had appended his name on Gergis et al does not leave me with a particularly high level of confidence in the diligence and/or scrutiny that he is likely to apply to the task of “Review Editor”.

          And just for the record [h/t Peter Bobroff, wizard behind AccessIPCC], Karoly was … uh …”also” extensively involved in the production of AR4:

          Core Writing Team AR4-WG0-frontmatter

          Lead Author AR4-WG2-ts [Technical Summary]

          Drafting Author AR4-WG2-spm [Summary for Policymakers]

          Review Editor AR4-WG1-Ch09 ["Understanding and Attributing Climate Change" in which there were references to 8 papers that he led or co-authored]

          Lead Author AR4-WG2-Ch01 ["Assessment of Observed Changes and Responses in Natural and Managed Systems" which contained references to 2 of his own papers]

          Rounding out this portrait of Karoly, as Donna had discovered, WWF lists him as a member of their “Climate Witness” program’s “Scientific Advisory Panel”.

          Steve: in the Climategate emails, Karoly is on numerous emails with Myles Allen in connection with “detection/attribution” papers.

        • theduke
          Posted Jun 9, 2012 at 6:18 PM | Permalink

          Hilary: re Karoly’s role at the IPCC, Tim Ball linked earlier to a post by Donna in which she exposes the connections between the Journal of Climate and the IPCC:

          http://nofrakkingconsensus.com/2011/08/23/the-journal-of-climate-the-ipcc/

          It’s worse than we thought.

          Regardless, it’s going to be …uh…”entertaining” watching them try to shoehorn this paper into AR5 after it is taken off(?) “hold.”

        • Posted Jun 9, 2012 at 7:00 PM | Permalink

          Steve: in [t]he Climategate emails, Karoly is on numerous emails with Myles Allen in connection with “detection/attribution” papers.

          Quite so! In Peter’s preliminary work, he identifies 4 Karoly-Allen emails in the period 1998-07-08 to 2008-10-14, in which “uncertainty” is a word of interest.

          And speaking of the Climategate emails … here’s another pixel from the Karoly portrait. Remember that Feb. 17/12 “An Open Letter to the Heartland Institute”, so kindly hosted by the Guardian?

          This pre-“Gleick confession” plea – generated and published in almost record time, considering the Feb. 14 “launch” of Gleick’s notorious package – was, to my mind at least, notable for the somewhat conspicuous absence of Gleick in the list of signatories.

          For the record, luminaries who did sign (or at least agreed to have their names appended) were:

          Ray Bradley
          David Karoly
          Michael Mann
          Jonathan Overpeck
          Ben Santer
          Gavin Schmidt
          Kevin Trenberth

          http://www.ucsusa.org/news/press_release/scientists-emails-stolen-heartland-institute-1372.html

        • Posted Jun 9, 2012 at 11:56 PM | Permalink

          Hilary, thanks for the info on Karoly. As I’ve said above, Steve is right to be cautious.

          Journalists have improved since Climategate but starting from a very low base – mostly the non-science ones like London’s parliamentary sketchwriters. I think there’s something about this particular story that is both hilarious and self-contained (unlike TDT and THSI, though they would both be excellent background or further reading) which makes it ideal for decent in-depth journalistic exposition. And the story is certainly far from over – so it’s worth seeing if others other than Andrew Bolt pick up on it. We must travel in hope and we must have fun in travelling. :)

        • Posted Jun 11, 2012 at 12:09 AM | Permalink

          We must travel in hope and we must have fun in travelling.

          Agreed … so while we’re waiting for these decent in-depth journalistic expositions which will take a leaf from Bolt’s book … I decided to have some “fun” with this :-)

          Climate science … sows’ ears and silk purses.

    • Steve McIntyre
      Posted Jun 9, 2012 at 10:03 AM | Permalink

      Jonathan, excellent points. The situation in climate science is sui generis. It would be hard to contemplate a group of molecular biologists, even as a joke, rapping “We are molecular biologists. What we say is true. Unlike Andrew Bolt our work is peer reviewed.”

      • Posted Jun 9, 2012 at 11:26 AM | Permalink

        Strange in multiple dimensions. But just to pick one: a group of molecular biologists would be delighted that Andrew Bolt (or anyone like him, of any political stripe) had even heard of them, and took their work seriously enough to be interested. Climate is made weird by the existential threat it is assumed to be research into, by those in power. Having got the attention of the top guys, smaller fry like Bolt (or McIntyre) just don’t hack it and their interest can only do harm.

      • Posted Jun 9, 2012 at 6:34 PM | Permalink

        “It would be hard to contemplate a group of molecular biologists, even as a joke…”

        Then you don’t know molecular biologists very well, and I’m not sure why you would compare the two groups in the first place. If they had been attacked like the paleo-climatologists have been, they might well do something like that.

        And no, I neither like that video nor think it accomplishes anything.

        • owqeiurowqeiuroqwieuro
          Posted Jun 9, 2012 at 10:36 PM | Permalink

          Re: Jim Bouldin (Jun 9 18:34),

          You don’t believe molecular biologists and their field of science have been attacked? Think about that for a second.

        • Posted Jun 10, 2012 at 12:18 AM | Permalink

          Jim Bouldin:

          If they had been attacked like the paleo-climatologists have been, they might well do something like that.

          And some of the attacks have been unfair. If they had all been unfair who knows what we might expect. But there have been many criticisms that are entirely justified, that should have caused any real scientist to hang his or her head in shame, not least revelations of gaming of the peer review process in the Climategate emails. Coming out with an attack video on a sceptical journalist, before putting one’s own house in order, and specifically justifying yourself because you’re peer reviewed and he’s not, is both very weird and laughably wrong.

          But I don’t see a similar attack video from Gergis and Karoly in the works at the moment. Something has gone wrong, again, with some peer reviewed paleo-climatology and this time it’s been exposed before the IPCC report expected to cite it is finalised. These people should have listened to the detailed criticisms of past hockey sticks, rather than swallowed the Michael Mann fantasy version of the history, they should have opened up their data and code (not because Steve McIntyre asked but because it should be standard practice) and they should acknowledge Jean Sibelius and Climate Audit in every communication they now have to make. Without taking false comfort in the persecution complex you convey so well.

  65. John A
    Posted Jun 9, 2012 at 8:33 AM | Permalink

    Steve Mc:

    One interesting topic that I’ve considered from time to time. As someone who’s spent his life outside academia, I view “journal peer review” sociologically only as a (fairly limited) form of due diligence.

    I’m going to differ with you Steve. I regard journal peer review as not simply limited, but as scientifically meaningless and sociologically misleading.

    All peer review does is determine what does and does not pass for publication in that particular journal; it says nothing about whether the work is correct.

    Due diligence means checking the numbers, sampling the calculations and reproducing the results – at the very least. Yet due diligence is not done in most cases because a) there’s no money in it and b) academics have a clear bias towards proving original hypotheses rather than disproving someone else’s.

    Peer review in journals as a filter to publication is no guarantee of quality and might well be worse than no peer review at all. I think that science would be invigorated by having fewer papers published and a vastly more open due diligence that actually ticks the box marked: “This study has been independently replicated”

    Witness what happened to the result that appeared to show that neutrinos can travel slightly faster than the speed of light (my emphasis):

    “Although this result isn’t as exciting as some would have liked,” said Bertolucci, “it is what we all expected deep down. The story captured the public imagination, and has given people the opportunity to see the scientific method in action – an unexpected result was put up for scrutiny, thoroughly investigated and resolved in part thanks to collaboration between normally competing experiments. That’s how science moves forward.”

    But not in climate science – there’s too much money and too many inflated egos at stake.

    It’s the difference between trusting the statements made by mining companies and independent examination of the split core.

  66. theduke
    Posted Jun 9, 2012 at 9:12 AM | Permalink

    If anyone wants to see and hear David Karoly, here is a two part interview (40+ minutes) he did in February 2009 (pre-Climategate). The interview was conducted just after all the celebration and fuss about AR4 and, yes, there is an air of “triumphalism” (to borrow a word from Steve) about it:

    http://www.themonthly.com.au/climate-series-david-karoly-latest-climate-science-1492

  67. Posted Jun 9, 2012 at 9:24 AM | Permalink

    Writing about the Journal of Climate, Donna Laframboise, author of “The Delinquent Teenager” said,

    “It’s chief editor, Anthony J. Broccoli, was a contributing author and expert reviewer for the IPCC’s 2007 report (known as AR4).”

    http://nofrakkingconsensus.com/2011/08/23/the-journal-of-climate-the-ipcc/

    http://wattsupwiththat.com/2011/02/17/peer-review-pal-review-and-broccoli/

  68. DocMartyn
    Posted Jun 9, 2012 at 9:58 AM | Permalink

    Re: Peer review up thread. What would it take for me, as a reviewer, to ‘believe’ that a temperature reconstruction from a proxy was a valid representation of temperature?
    The essence of science is reproducibility on all levels. So let’s use tree cores as an example.
    Here is a start.
    Question 1) How reproducible is a profile of tree ring width and tree ring density in an individual tree?
    I would demand that the authors take 10 cores from a single tree. They would then use identical physical and mathematical treatments of all 10 cores and establish the statistical distribution of ring width/density.
    Question 2) How reproducible is a profile of tree ring width and tree ring density in a group of trees in the same location?
    Repeat the 10-core process, but in a population of trees drawn from a very small area. The aim would be to know the variance in a tree population.
    Question 3) What is the impact of non-temperature environmental changes on tree widths/densities? Humans have undertaken large scale projects which have altered the ecology in a multitude of ways. There are the known knowns, such as what happens to trees upstream/downstream of large dams, and the unknown unknowns: what happens when human management is completely removed from an environment? Measure the trees in the Chernobyl exclusion zone.
    Question 4) What are the oxygen/deuterium and carbon isotope levels throughout a tree’s history and how do these levels correlate with ring width and density?
    Cut down a few big trees and slice them into squat cylinders. Read the widths/densities and use spectroscopy to establish the O/D/C ratios. Do this with the different sections and see whether the vertical height of the wood changes these parameters.
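    The sampling design in Questions 1 and 2 amounts to a simple variance decomposition. A minimal sketch with entirely invented numbers (the tree-level and core-level standard deviations are assumptions for illustration, not field values):

```python
import random
import statistics

random.seed(0)

# Hypothetical setup: 10 cores from each of 5 trees at one site.
# Each tree has its own mean ring width; each core adds within-tree noise.
# tree_sd and core_sd are invented for illustration.
def simulate_cores(n_trees=5, n_cores=10, tree_sd=0.30, core_sd=0.10):
    site = []
    for _ in range(n_trees):
        tree_mean = 1.0 + random.gauss(0, tree_sd)  # between-tree variation
        site.append([tree_mean + random.gauss(0, core_sd)
                     for _ in range(n_cores)])
    return site

site = simulate_cores()

# Question 1: how well does one core reproduce another from the same tree?
within_sd = statistics.mean(statistics.stdev(cores) for cores in site)

# Question 2: how much do tree-level means vary across the site?
between_sd = statistics.stdev(statistics.mean(cores) for cores in site)

print(f"within-tree SD:  {within_sd:.3f}")
print(f"between-tree SD: {between_sd:.3f}")
```

    Comparing the two spreads is the control DocMartyn asks for: if within-tree scatter rivals between-tree scatter, a single core tells you little about the tree, let alone the site.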

    • Posted Jun 9, 2012 at 6:55 PM | Permalink

      1 and 2) There is no need for such a large number. Two cores from at least 10 trees (which is common but on the low side for ITRDB data) is plenty enough to discern whether there is a relationship with the environment or not. You have to remember that it’s a tradeoff in information versus time spent. Every core has to be extracted and prepared, and every ring has to be measured, sometimes with multiple measurements on each.

      3) Biogeochemical influences (CO2, N, O3) are probably the biggest issue there, as they are potentially the hardest to track and disentangle from temperature, and they are not all obvious in their amounts or effects.

      4) Isotopic analysis is so different in nature from ring size and density measures that these are typically conducted by entirely different research groups. Indeed, even density measures require completely different lab set ups (and hence groups), although this may be getting more within the reach of more groups because of the similarity of results from certain staining techniques. But I agree that getting more types of data is definitely a good thing.

      And the stem analysis (vertical profile) of any TR measure would be nice, but it’s a huge amount of work and it’s destructive sampling at any rate. That’s a forestry thing, not dendroclimatology.

      • MrPete
        Posted Jun 9, 2012 at 10:02 PM | Permalink

        Re: Jim Bouldin (Jun 9 18:55),
        Jim, I assume you’ve done field tree coring work. Have you seen strip-bark BCP’s up close? What’s your take on Steve McIntyre’s observation of a 300% “growth pulse” (more discussion here on a multiply sampled tree), which appears not all that uncommon for strip-bark BCP’s that are part of the data collections used by many dendroclimatologists?

        We’ve received a number of assertions that scientists would not use such highly irregular data for published analyses… yet that’s exactly what has been happening. Meanwhile, the neighboring “complacent” whole bark trees tend to be avoided. Same site, same climate, but vastly different results.

        So here we have a situation where field workers avoided sampling the complacent/boring whole bark trees, while collecting lots of data from strip-bark trees that appear likely to be affected by a common event in the mid-1800’s. Later on, scientists working on data processing without direct reference to the field situation do analyses on the data and apparently conclude that the site is a valid representation of climate conditions.

        Thank you for taking valuable time to consider this. I do hope you can take a moment to examine the links and reflect on Steve’s long term assertions that some data sets really ought to be excluded from climate analysis… and our confidence in the analyses perhaps ought to be scaled back a bit.

        • Posted Jun 9, 2012 at 10:25 PM | Permalink

          I’ve done a fair bit MrPete, but not nearly as much as others have.

          Yes I’ve seen strip barked bristlecones up close. I’ll try to read those if I get time.

          A couple of things to keep in mind. First, strip barked trees are sometimes about all you’ve got when you’re trying to find old trees, and that’s the main goal usually – finding the real old trees, so as to take the record back in time as far as possible. The reason for this is clear: these trees are constantly blasted by ice and soil particles on their windward sides, which is why the bark is usually leeward.

          In some locations, the only trees with a full barked circumference are the younger ones, sometimes a little older if protected by micro-topography or stand conditions.

          A single common event in the mid 1800s isn’t something I’d worry about. I’d worry about a bias in the long term trend if you’ve got roughly the same amount of foliage being supported by some fraction of the ordinary amount of xylem and phloem you’d have on a full cambium tree. That would be my worry. But then, maybe the leaf area declines in concert with the conducting tissues, negating this worry–I think it may well but I don’t know.

          The other option would be to just core the non-cambial part of the tree and sacrifice the more recent years since the cambium died. Then you are avoiding the strip bark issue, and are getting the old rings that you are after. I think that’s what I would do if it were me.

        • Posted Jun 9, 2012 at 10:38 PM | Permalink

          In case that wasn’t clear, the cambium is the tissue layer that produces the rings. When it dies, no more rings are produced at that spot. If you core at such locations you are sampling rings that by definition, were not the result of a strip bark phenomenon at the time they were produced.

          And the oldest trees tend to be the most strip barked because they’ve been exposed to the abrasion for the longest time.

          The point with the foliage amount is that, to the degree that there is a systematic trend in the foliage to new wood (and phloem) ratio, you could get a strong biological effect in the ring trend, not related to climate, because the +/- same amount of photosynthate produced is getting allocated to a smaller area of the tree.

        • MrPete
          Posted Jun 10, 2012 at 5:20 PM | Permalink

          Re: MrPete (Jun 9 22:02),
          Thanks for the reply, Jim.

          A couple of links to facilitate your rapid uptake of prior “art” here at CA :)

          Steve Mc comments on reaction wood in SB-BCP cores.

          One of Geoff Sherrington’s first posts that explains the complexity of tree/ring growth. One of the more fascinating factors he mentioned is auxins. Look them up to discover some amazing complexities; e.g. this may speak to your questions about leaf/conducting tissue, etc. Apparently, auxins (hormones) regulate growth via a type of “inhibitor” scheme: when bark is stripped / branches are torn off in a storm, less auxin reaches the roots, releasing more nutrients to flow up to the remaining live cambium. Elsewhere, G. Sherrington described how even major wounds can disappear over time as the cambial layer regrows.

          I like your suggestion of limiting SB-BCP sampling to the stripped areas. Combine that with same-site sampling of whole-bark trees (which of course have a complete set of recent rings) and you might get something interesting. Whether or not the data is temp-dependent is of course an entirely different matter :)

          Unfortunately, we have little if any evidence of BCP data collected on the stripped areas. People seem to typically focus on live-tree data.

          With all of this taken into account, I’m still a bit confused by your lack of concern about common weather events, such as major storms. If a mid-1850’s ice storm strips branches from most of the trees in an area, and data is preferentially collected from those same trees, wouldn’t that likely have an impact on our data?

      • DocMartyn
        Posted Jun 9, 2012 at 11:07 PM | Permalink

        “Jim Bouldin
        1 and 2) There is no need for such a large number. Two cores from at least 10 trees (which is common but on the low side for ITRDB data) is plenty enough to discern whether there is a relationship with the environment or not.”
        I don’t, a priori, give a damn if the ring widths/densities correlate with temperature or the level of elephant flatulence.
        I want to know 1) how reproducible a single core taken from a tree is compared to an average of 10 cores from a second tree.
        Believe it or not, this is a rather important piece of information.
        I do, a priori, give a damn whether a group of closely spaced trees have the same line shape with respect to ring width/density.
        These things are called controls, and controls are the fundamental building blocks of science.
        “3) Biogeochemical influences (CO2, N, O3) are probably the biggest issue there”
        Nice to know you have such certainty.
        “4) Isotopic analysis is so different in nature from ring size and density measures that these are typically conducted by entirely different research groups”
        You would be surprised at how few biochemistry departments have large synchrotrons capable of generating coherent X-rays, but biochemists still get X-ray crystal structures by working with physicists. We also don’t have to develop a completely new branch of mathematics to test the significance of our data; we go to these special people called ‘statisticians’ and have them aid us. We are rather promiscuous: we work with medics, vets, physicists, chemists, spectroscopists, statisticians, electrical engineers, software engineers; really anyone who can help us in whatever we are doing.
        “vertical profile) of any TR measure would be nice, but it’s a huge amount of work and it’s destructive sampling at any rate. That’s a forestry thing, not dendroclimatology”
        So you have no idea what the vertical profile is, but think it doesn’t matter for dendroclimatology. Speaking as a biologist here, if I were going to examine a tall thin conical living thing I would not assume that the bit at the bottom was going to be the same as the bit at the top.

      • Armand MacMurray
        Posted Jun 10, 2012 at 12:59 PM | Permalink

        Jim, I believe DM is referring to foundational papers that would establish the validity and properties of the technique/species/site and would then be referred to in future papers with lighter sampling such as you suggest. Have any such papers been published in your field?

  69. Matt Skaggs
    Posted Jun 9, 2012 at 10:13 AM | Permalink

    Be careful when using a hockey stick as a club; the peculiar shape may cause it to deflect around and strike you in the back of the head. There must be half a dozen parameters or procedures in paleoclimate that have some degree of subjectivity. Which one will they exploit to fix this? Perhaps Mosh is uniquely qualified to predict. Like any good engineer, Steve Mc. did a quick outcome analysis and came up with a few possibilities. It is a rather delicate situation because now they have to argue against one of their own choices, or go forward with an embarrassingly small number of proxies.

  70. RomanM
    Posted Jun 9, 2012 at 10:37 AM | Permalink

    I think the admission that there is an error with the correlations is just a ruse to derail the audit of the rest of their work. ;)

    Despite this, I have been examining the provenance of the summer HadCrut3v temperature series as calculated by Gergis et al from the 10 by 14 (0S to 50S, 110E to 180E) grid of temperature data.

    There appears to be a possible problem because of the amount of missing monthly temperatures in the study area:

    Missing HadCrut3 Patterns

    The upper plot shows the percent missing in each of the 140 grid cells. The grid cells are ordered by latitude groups: (0S to 5S,110E to 115E),(0S to 5S,115E to 120E), …, (45S to 50S,175E to 180E). The actual percentages are:


    [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14]
    [1,] 17.1 5.3 66.2 16.8 39.9 44.7 34.4 28.5 16.7 45.6 46.8 19.4 52.0 44.3
    [2,] 4.7 3.2 8.6 7.1 7.8 8.6 31.5 3.5 19.2 36.9 32.3 37.7 44.1 23.4
    [3,] 29.6 25.8 27.6 23.7 0.2 13.8 5.6 10.7 16.1 31.2 30.0 36.8 39.0 22.4
    [4,] 24.8 35.4 0.0 43.1 46.8 11.3 10.4 2.3 17.3 24.2 29.3 27.3 22.5 12.9
    [5,] 10.2 0.0 100.0 84.1 0.8 7.4 7.2 4.2 1.5 27.6 21.3 5.7 16.8 17.4
    [6,] 3.3 23.7 3.3 100.0 64.0 36.3 14.1 0.2 0.0 13.2 11.3 7.4 19.4 17.6
    [7,] 3.0 0.0 2.1 15.9 8.1 0.0 0.0 0.2 0.0 3.2 12.5 14.1 9.2 17.3
    [8,] 23.3 3.3 4.8 4.5 5.1 0.0 0.0 0.0 0.3 11.1 17.3 19.2 0.0 3.0
    [9,] 32.4 36.2 37.2 38.4 35.6 37.8 39.3 0.0 26.3 25.2 23.6 31.8 0.0 9.3
    [10,] 55.3 59.3 61.0 65.8 64.3 63.1 58.1 57.8 58.3 55.4 32.4 1.4 0.2 38.9

    The lower plot shows the frequency for each of the summer months.

    What method did they use to deal with the enormous number of missing values in calculating the HadCrut 3v summer temperature series?

    It also puts into question the viability of the temperature data as part of their principal component regression methodology since the amount of infilling required for the individual grid cells would be substantial.
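    For concreteness, a minimal sketch of the percent-missing bookkeeping behind the table above, on an invented toy series (None marks a missing month; the values are not HadCrut data):

```python
# Percent of missing monthly values in one grid-cell series, with None
# marking a missing month. The toy series below is invented for illustration.
def percent_missing(series):
    missing = sum(1 for v in series if v is None)
    return 100.0 * missing / len(series)

# toy cell: 10 months, 3 of them missing
cell = [21.3, None, 19.8, None, 22.1, 20.4, None, 21.0, 19.5, 20.7]
print(percent_missing(cell))  # 30.0
```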

    • Jean S
      Posted Jun 9, 2012 at 10:55 AM | Permalink

      Re: RomanM (Jun 9 10:37),
      old news Roman … obviously the authors have been independently aware of that since … uuh .. at least a few hours ago! ;)

    • Kenneth Fritsch
      Posted Jun 9, 2012 at 11:16 AM | Permalink

      I am with RomanM on the fact that there well could be more problems with the Gergis (2012) paper than not detrending. For one, was the non detrended calculation adjusted for df by AR1 considerations? Also, how does the PCR method derive a reconstruction correlation of r=0.86 with temperature from these individual proxies, which have much lower correlations with temperature?

      I am currently looking at correlations of the proxy series with the local grid temperature and the preliminary result is that the correlations are not better than those with the regional Australasia temperatures. The non detrended proxy series have better correlations with local temperatures but a number of the proxies fail under those conditions also. I was able to find enough complete data from GHCN at KNMI to do correlations for 21 proxies.

      David Karoly appears to have sent a friendly email to SteveM and without snark I think we can all derive our own individual views on the state of affairs in climate science from it. The question in my mind about this email is simply whether members of the community, and particularly those who are scientists/advocates, would have sufficient confidence in their arguments and evidence to have admitted to missing a rather obvious error in a paper and directly credited its discovery to a critical blog.

      • Posted Jun 9, 2012 at 1:49 PM | Permalink

        Also how does the PCR method derive a reconstruction correlation of r=0.86 with temperature from these individual proxies which have much lower correlations with temperature.

        This is possible, since if each proxy has a true (or spuriously cherry-picked) correlation with temperature, the noise will tend to cancel out when they are averaged, resulting in perhaps an even higher correlation than any of the individual proxies.

        This is just the Law of Large Numbers — when a bunch of iid random variables are averaged together, their average has less variance about the mean than do the individual draws.
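        The averaging effect is easy to show numerically. A hedged illustration with synthetic data (the signal and noise levels, and the 27-proxy count, are arbitrary choices echoing the discussion, not the Gergis proxies):

```python
import math
import random

random.seed(1)

# 27 hypothetical "proxies": a common signal plus independent noise twice
# as large. Individually they correlate weakly with the signal; their
# average correlates far better, as the noise partly cancels.
n_years, n_proxies = 80, 27

def corr(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return sxy / (sx * sy)

temp = [random.gauss(0, 1) for _ in range(n_years)]          # the "signal"
proxies = [[t + random.gauss(0, 2) for t in temp]            # signal + noise
           for _ in range(n_proxies)]

indiv = [corr(p, temp) for p in proxies]
composite = [sum(p[i] for p in proxies) / n_proxies for i in range(n_years)]

print(f"mean individual r: {sum(indiv) / len(indiv):.2f}")
print(f"composite r:       {corr(composite, temp):.2f}")
```

        Of course, the same mechanism works on spuriously screened noise, which is the sting in the "cherry-picked" caveat above.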

        But as you say, Ken, the paper has a lot of remaining problems even aside from the one that caused withdrawal.

        I concur with Paul Matthews below that since the paper was published online and now is no more, it has been withdrawn. It will be interesting to see if it is automatically reaccepted after revision or if it will have to be resubmitted for new review.

        • Kenneth Fritsch
          Posted Jun 9, 2012 at 2:53 PM | Permalink

          “This is possible, since if each proxy has a true (or spuriously cherry-picked) correlation with temperature, the noise will tend to cancel out when they are averaged, resulting in perhaps an even higher correlation than any of the individual proxies.”

          Hu, the PCR does more than just average the proxy series. My question has been whether that method forces a higher correlation than an averaging process would. I can easily average the standardized proxies and compare that result to what I see in the Gergis graph showing the reconstruction and/or correlate that result with the HadCRUT Australasia temperature used in Gergis 2012. Any bets on what that will reveal?

        • Kenneth Fritsch
          Posted Jun 9, 2012 at 3:39 PM | Permalink

          Hu, I just did a quick correlation of the mean of the Gergis 27 proxy series with the HadCRUT Australasia temperature used in Gergis (2012) over the period 1920-2001 and obtained r=0.14 and p value=0.10. I rechecked that result and will do so again, as anyone reading here can. If that result holds up, the PCR is doing something more than averaging the proxy series.

        • RomanM
          Posted Jun 9, 2012 at 4:33 PM | Permalink

          Kenneth, has it occurred to you that in order to apply PCR as described in Luterbacher (2002), you also have to be able to calculate proxy “PCs” for the reconstruction period using the coefficients calculated from the SVD in calibration. The only way you can possibly do this correctly is to have values for each of the same proxies available for every year you are reconstructing. Since the number of proxies varies over the reconstruction period, I don’t see how this can be done without further manipulation such as “infilling” each proxy for the complete time period.

          How do they do it? It’s a mystery.
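          To make the completeness requirement concrete, here is a rough PCR sketch on synthetic data (not the authors' code; the proxy counts, noise levels, and single retained PC are invented simplifications). The final projection applies the calibration-period loadings to every proxy in every reconstructed year, which is exactly the step that fails if any proxy is missing:

```python
import math
import random

random.seed(2)

# Synthetic calibration period (with "temperature") and reconstruction
# period (proxies only). All numbers are invented for illustration.
n_cal, n_rec, n_prox = 60, 40, 5
temp = [random.gauss(0, 1) for _ in range(n_cal)]
cal = [[t + random.gauss(0, 1) for t in temp] for _ in range(n_prox)]
rec = [[random.gauss(0, 1.4) for _ in range(n_rec)] for _ in range(n_prox)]

def standardize(x):
    m = sum(x) / len(x)
    s = math.sqrt(sum((v - m) ** 2 for v in x) / len(x))
    return [(v - m) / s for v in x], m, s

zcal, params = [], []
for p in cal:                      # standardize over the calibration period
    z, m, s = standardize(p)
    zcal.append(z)
    params.append((m, s))

# leading principal component via power iteration on the proxy covariance
cov = [[sum(a * b for a, b in zip(zcal[i], zcal[j])) / n_cal
        for j in range(n_prox)] for i in range(n_prox)]
w = [1.0] * n_prox
for _ in range(100):
    w = [sum(cov[i][j] * w[j] for j in range(n_prox)) for i in range(n_prox)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]

pc1 = [sum(w[i] * zcal[i][y] for i in range(n_prox)) for y in range(n_cal)]
beta = sum(p * t for p, t in zip(pc1, temp)) / sum(p * p for p in pc1)

# Reconstruction: the SAME loadings w hit every proxy in every year.
# A single missing proxy value breaks this sum, hence the need for
# infilling when the proxy roster varies over time.
zrec = [[(v - m) / s for v in rec[i]] for i, (m, s) in enumerate(params)]
recon = [beta * sum(w[i] * zrec[i][y] for i in range(n_prox))
         for y in range(n_rec)]
print(len(recon), "reconstructed years")
```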

        • Kenneth Fritsch
          Posted Jun 9, 2012 at 7:26 PM | Permalink

          Sorry but I should have rechecked again before posting. I used the wrong HadCrut temperature series. Using the correct series I obtained r=0.72 correlation (with a very low p.value) between the mean of Gergis 27 proxies and the Australasia temperature used in Gergis (2012). That compares to an average correlation of the individual series of 0.48. These correlations were calculated for non detrended series. Next I want to do the same correlations with detrended series.

        • Kenneth Fritsch
          Posted Jun 9, 2012 at 8:26 PM | Permalink

          The mean from the 27 Gergis detrended proxies correlation to the detrended Australasia temperature gave an r=0.38 and a p value=0.0002. No adjustments were made to the dfs for AR1. The average correlation of the individual detrended proxies to the detrended Australasia temperature was r=0.33.

          It appears that the detrending makes the correlation depend on the higher frequency part of the proxy and temperature series, and that correlation is degraded from the non detrended case. The higher frequency wiggle matching as a pre-selection test (granting, for argument’s sake, that a pre-selection test were an acceptable practice) does not show the impact of proxy trends matching the temperature trend.

          If, however, a pre-selection process were aimed at matching proxy trends to instrumental temperature trends (by not detrending or some other means), then are we not right back to the original complaint at these blogs: that ARIMA/ARFIMA models can readily produce series-ending trends where no deterministic trend has been included, and that selecting for those trends in proxies invalidates the statistics required to show the proxies are reasonable thermometers?
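          That screening complaint is easy to demonstrate with synthetic red noise. A hedged sketch (the AR(1) coefficient, series length and cutoff are illustrative assumptions, not parameters from Gergis (2012)):

```python
import math
import random

random.seed(3)

# AR(1) red-noise "proxies" with NO temperature signal, screened by
# correlation against a rising series. At a nominal 5% cutoff, far more
# than 5% of pure-noise series pass, because persistent noise readily
# produces series-ending trends.
n_years, n_series, phi = 70, 500, 0.8
trend = [i / (n_years - 1) for i in range(n_years)]  # stand-in for warming

def corr(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return sxy / (sx * sy)

def ar1(n, phi):
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + random.gauss(0, 1)
        out.append(x)
    return out

# |r| > ~0.235 is roughly the two-sided p<0.05 cutoff for n=70 iid data
passed = sum(abs(corr(ar1(n_years, phi), trend)) > 0.235
             for _ in range(n_series))
print(f"{100 * passed / n_series:.0f}% of pure-noise series pass the screen")
```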

        • Kenneth Fritsch
          Posted Jun 10, 2012 at 8:55 AM | Permalink

          As noted above in my previous post, the correlation of the mean of the non detrended Gergis 27 (G27) proxy series with the non detrended Australasia temperature (ATemp) used in Gergis (2012) was a very significant r=0.72. While not as high as the correlation of the G27 PCR to ATemp listed in Gergis (2012) of 0.86, it would appear that the high correlation shows good agreement among the proxies that Gergis (2012) selected, by mistake, without detrending.

          In order to determine how good the G27 mean was as a thermometer in the 1920-2001 period I compared the linear trends of the non detrended G27 means and the non detrended ATemp. The results listed below show that both the ATemp and G27 series produce a good fit to a straight line trend. The trends are, however, very different with the G27 mean producing a trend with more than twice the slope of that of the ATemp and further the trend slopes are significantly different after adjusting the SEs for dfs based on AR1.

          Even when selecting proxies on trend, as was done in Gergis (2012) by mistake, the proxies together show a very different response as a thermometer when compared to the instrumental measurements. Of course, it could be argued that the G27 are not representative of the Australasia region – but that is another issue.

          ATemp: Trend slope= 0.0075; SE=0.00073; t value=9.8; Adj R^2=0.54; AR1=0.51

          G27 Mean: Trend slope= 0.0177; SE=0.0016; t value=11.0; Adj R^2=0.60; AR1=0.27

        • Kenneth Fritsch
          Posted Jun 10, 2012 at 1:03 PM | Permalink

          I have made another sloppy real time error in calculating the trends above. My only hope is that I found it before anyone else did. Even so, I have been thinking about the possible error for some time now. I needed to standardize both the temperature and mean proxy series before comparing trends. The corrections are below and show that the slopes are nearly identical.

          Based on this finding I expect to hear the Gergis authors make a strong case for their previous mistake being the correct method for selecting proxies.

          ATemp: Trend slope= 0.0310; SE=0.00316; t value=9.8; Adj R^2=0.54; AR1=0.51

          G27 Mean: Trend slope= 0.0321; SE=0.00290; t value=11.0; Adj R^2=0.60; AR1=0.27
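          The standardize-then-fit comparison above, with an AR1 adjustment to the slope SE, can be sketched as follows on made-up data (the n_eff = n*(1-r1)/(1+r1) formula is the common approximate adjustment, not necessarily the exact one used in these comments, and the toy series is invented):

```python
import math
import random

random.seed(4)

# Standardize a series, fit an OLS trend, and widen the slope SE using the
# effective sample size implied by the lag-1 autocorrelation r1 of the
# residuals: n_eff = n*(1 - r1)/(1 + r1).
def trend_with_ar1_se(y):
    n = len(y)
    m = sum(y) / n
    s = math.sqrt(sum((v - m) ** 2 for v in y) / n)
    z = [(v - m) / s for v in y]                      # standardize first
    tm = (n - 1) / 2
    stt = sum((t - tm) ** 2 for t in range(n))
    slope = sum((t - tm) * z[t] for t in range(n)) / stt
    resid = [z[t] - slope * (t - tm) for t in range(n)]
    r1 = (sum(a * b for a, b in zip(resid[:-1], resid[1:]))
          / sum(a * a for a in resid))
    n_eff = max(3.0, n * (1 - r1) / (1 + r1))         # AR1 df adjustment
    sigma2 = sum(a * a for a in resid) / (n_eff - 2)
    return slope, math.sqrt(sigma2 / stt)

# toy 82-year series: linear trend plus AR(1) noise (invented numbers)
x, series = 0.0, []
for t in range(82):
    x = 0.5 * x + random.gauss(0, 1)
    series.append(0.02 * t + x)

slope, se = trend_with_ar1_se(series)
print(f"slope = {slope:.4f} +/- {se:.4f} (per year, standardized units)")
```

          Two series can then be compared by asking whether their slopes differ by more than a couple of the (widened) standard errors.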

        • Layman Lurker
          Posted Jun 10, 2012 at 1:45 PM | Permalink

          Kenneth, can you post the pdf or histogram of the G27 trends?

        • Layman Lurker
          Posted Jun 10, 2012 at 2:31 PM | Permalink

          Sorry for the room service request Kenneth. I have now downloaded the proxy data and I can do this myself.

        • Kenneth Fritsch
          Posted Jun 10, 2012 at 3:29 PM | Permalink

          LL, I think this is what you had in mind- at least in the way of raw data.

          The G27 series and Australasia temperature detrended and not detrended correlations are listed in the link below.

          The average correlation for not detrended is r=0.37 and for detrended is r=0.17. These values are lower than those I reported above but it does not change the results for correlation between the mean G27 detrended and not detrended versus Australasia temperature detrended and not detrended. The values in the link should be close to those shown by JeanS and SteveM previously although I have not yet compared the data.

          It is rather obvious that the only way Gergis (2012) could have selected the proxies would be based on not detrending the proxy series or the temperature when doing correlations. They would have had to flip the series with a negative correlation. Using a two sided correlation test would give at least 2 proxies that would not have passed p<=0.05 even without an AR1 adjustment. From these results I also doubt that Gergis (2012) adjusted the df for AR1.

        • Layman Lurker
          Posted Jun 11, 2012 at 1:02 AM | Permalink

          Kenneth, here is what my room service request was about. I calculated the calibration period trends of the G27 proxies (without flipping the negative signs) and found a bimodal distribution. So I divided the G27 into positive and negative trend sub groups and plotted the separate pdf’s here. Care to hazard a guess as to where a pdf of the screened/excluded proxy trends might have fit on this graph?

        • Kenneth Fritsch
          Posted Jun 12, 2012 at 9:18 AM | Permalink

          LL, I should have noted in my previous post that your graphs of the pdf of proxy trends could be a good way of looking at the population of proxies used and not used in a reconstruction. I am currently putting together the unused proxies in the Australasia region – regardless of whether the proxies were part of the 62-27 proxies mentioned in Gergis. I also am going to look at using the entire annual mean temperature and coral proxy series to determine what that does to the derived trends in comparison with the Gergis use of SONDJF. I am currently guessing that the temperature series will show a bigger difference than the coral series.

        • Layman Lurker
          Posted Jun 12, 2012 at 10:18 AM | Permalink

          Kenneth you old dog you. A “divergence” problem in the southern hemisphere perhaps? Am I reading your comment correctly?

        • Kenneth Fritsch
          Posted Jun 12, 2012 at 12:18 PM | Permalink

          LL, I am not making any predictions about divergence here. I am always suspicious when authors of a paper decide to use certain months in their reconstructions. Obviously the TRW proxies are reported on an annual basis, but coral O18 proxies are often reported on a monthly or seasonal basis. When the authors are known to have really few limits on what goes into the pre-selection criteria, I would not doubt that selecting the months of the year could be a criterion that makes the reconstruction “better”. The TRW proxy will not change, and I have my doubts about the coral changing based on the months selected or using annual data, and that leaves the temperature to change. The coral O18 does change by month, but then why not use the O18 annual average against the annual mean instrumental temperature? TRW growth is probably more sensitive to some months of the year, but with the TRW proxies in Gergis being more tropical, I wonder how well that growth would be concentrated in a few months. Further, if you do such a reconstruction, you had better limit your claims of warming to the months used.

          In attempting to find the last coral proxy in Gergis (Fiji IF) that has eluded me so far, I ran across the following linked paper which discusses some of the coral proxies used in Gergis. Interestingly, it finds a very good high frequency response from Sr/Ca ratios to instrumental temperature, and differences in modern trends for Sr/Ca ratios versus O18 ratios in some of these coral proxies. Question: Why did Gergis not use coral Sr/Ca proxies?

          http://www.atmos.albany.edu/facstaff/blinsley/home_page/Linsley_et_al_IPO_2004.pdf

      • Kenneth Fritsch
        Posted Jun 9, 2012 at 3:11 PM | Permalink

        In the link below I show two tables with the correlation statistics of 21 of the 27 Gergis proxies with the local GHCN grid for the non detrended and detrended cases. I inverted the coral proxies so I could do a one sided test (greater) on all 21 proxies. I used an AR1 correction to estimate a pass/fail at p<=0.05. My estimation is not exact but suits the purposes here. The r values and test success rates are little different than what is obtained correlating the proxy series versus the Australasia regional temperatures.

        • bernie1815
          Posted Jun 9, 2012 at 10:54 PM | Permalink

          Kenneth: What does p.value refer to in your table?

        • DocMartyn
          Posted Jun 10, 2012 at 1:07 PM | Permalink

          Kenneth, I have just reread this statement

          ““While the paper states that “both proxy climate and instrumental data were linearly detrended over the 1921–1990 period”, we discovered on Tuesday 5 June that the records used in the final analysis were not detrended for proxy selection, making this statement incorrect.””

          Do you think that it is possible that they ran the proxies, undetrended, against the detrended temperature data?
          If so, this might resolve the discrepancy between your two tables and their publication. You have run detrended vs. detrended and undetrended vs. undetrended.
          Could you indulge me and run the undetrended proxies vs. the detrended temperature?
          I would do it myself but am a bit of a dummy.

        • Kenneth Fritsch
          Posted Jun 10, 2012 at 1:08 PM | Permalink

          “Kenneth: What does p.value refer to in your table?”

          The p.value is the probability that the estimated correlation would be this large by pure chance if the true correlation were zero. The lower that value, the more likely one can reject the null hypothesis that the correlation is no different from zero.
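          As a hedged illustration of that definition (using the standard t transform for a correlation with a normal tail approximation, which slightly understates the t tail at these sample sizes, and not necessarily the exact test used in these comments):

```python
import math

# One-sided p-value for a sample correlation r from n pairs,
# via t = r*sqrt((n-2)/(1-r^2)) and a normal approximation to the tail.
def corr_pvalue_greater(r, n):
    t = r * math.sqrt((n - 2) / (1 - r * r))
    return 0.5 * math.erfc(t / math.sqrt(2))  # upper-tail normal prob

# a modest r with many pairs can still be "significant"...
print(corr_pvalue_greater(0.38, 82))   # on the order of 1e-4
# ...while the same r with few pairs is not
print(corr_pvalue_greater(0.38, 12))
```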

        • Kenneth Fritsch
          Posted Jun 10, 2012 at 1:22 PM | Permalink

          “Could you indulge me and run the undetrended proxies vs. the detrended temperature?
          I would do it myself but am a bit of a dummy.”

          I can, but is it not apparent that running one series detrended against another not detrended is going to give a very low correlation? My calculations show that it is the trend matching here that leads to the higher correlations.

          “If so, this might resolve the discrepancy between your two tables and their publication.”

          I do not understand what discrepancy you are referring to here, since I have not been able to locate any of these correlation values in Gergis (2012) or its SI.

          I do not interpret the reply you excerpted to mean that the proxy series were detrended – I take “detrended for proxy selection” to refer to the selection process as a whole.

        • Kenneth Fritsch
          Posted Jun 10, 2012 at 1:57 PM | Permalink

          Doc, if this table reproduces here you will see that it lists in column 1 the correlation coefficient r and in column 2 the p value. This is the correlation of the detrended Australasia temperature used in Gergis (2012) versus the not-detrended 27 proxy series. The order of proxies corresponds to what I used in the tables posted above. The coral proxies were inverted before doing the correlation.

          [1,] 0.206381804 0.0314245786
          [2,] -0.036812582 0.6286749027
          [3,] 0.151336139 0.1006114995
          [4,] 0.121053544 0.1423919115
          [5,] 0.062061110 0.2996916128
          [6,] 0.285854789 0.0048418056
          [7,] 0.190991602 0.0438210925
          [8,] -0.036828456 0.6239414955
          [9,] -0.128995790 0.8581695550
          [10,] 0.162615898 0.0831343986
          [11,] 0.099520179 0.2027764083
          [12,] 0.073649466 0.2678783898
          [13,] 0.287326596 0.0046498427
          [14,] -0.073715592 0.7308509258
          [15,] 0.065346004 0.2941087095
          [16,] 0.120510199 0.1498779725
          [17,] 0.108259924 0.1680181260
          [18,] 0.160536934 0.0761177403
          [19,] 0.057777085 0.3088486274
          [20,] -0.004070514 0.5134378578
          [21,] 0.049133463 0.3398764422
          [22,] -0.092632135 0.7869484790
          [23,] 0.358572421 0.0010682992
          [24,] 0.079474909 0.2417241679
          [25,] 0.246416008 0.0165378461
          [26,] 0.085710779 0.2338895865
          [27,] 0.379301877 0.0004674494

        • DocMartyn
          Posted Jun 10, 2012 at 2:31 PM | Permalink

          Thank you Ken. I was just wondering how they picked the series they did, if they didn’t use the methodology they describe in the text.

        • HaroldW
          Posted Jun 10, 2012 at 4:45 PM | Permalink

          Doc –
          It is more than likely that the actual processing sequence in Gergis was:
          1. Compute the Spearman correlation coefficient for each proxy series vs. regional average temperature. [Not detrended – Karoly says this. Not the “regular” (Pearson) correlation – Nick showed in a comment on the previous post that Manang is not significant using the Pearson correlation coefficient.]

          2. Convert the Spearman coefficient r to a t-value via the equation t = r*sqrt( (N-2)/(1-r*r) ), where N-2 = 68 is the number of degrees of freedom. [No correction à la Quenouille for auto-correlation – Nick showed that this factor would render a number of the proxies not significant to p ≤ 0.05.]

          3. A proxy is considered significant if |t| ≥ 2, a two-sided test. [Urewara, a tree-ring-width proxy with a negative correlation to regional temperature, passed the significance test, so it couldn’t have been a one-sided test. The smallest |t| of the Gergis 27 is just a hair greater than 2.0, hinting that this is the threshold.]
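
          If HaroldW’s conjectured sequence is right, the screen is purely mechanical. Below is a minimal sketch of that three-step test in Python (stdlib only); the Spearman correlation, the N = 70 calibration window, and the |t| ≥ 2 threshold are all HaroldW’s inferences rather than anything the authors have confirmed, and no Quenouille-style autocorrelation adjustment is applied:

```python
import math

def rankdata(x):
    """Ranks (1-based), with tied values given their average rank."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    ranks = [0.0] * len(x)
    i = 0
    while i < len(x):
        j = i
        while j + 1 < len(x) and x[order[j + 1]] == x[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average 1-based rank of the tied block
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return sxy / (sx * sy)

def passes_screen(proxy, temp, t_threshold=2.0):
    """HaroldW's conjectured screen: Spearman r -> t = r*sqrt((N-2)/(1-r^2)),
    keep the proxy if |t| >= threshold (two-sided). Returns (pass, r, t)."""
    r = pearson(rankdata(proxy), rankdata(temp))  # Spearman = Pearson on ranks
    if abs(r) >= 1.0:
        return True, r, float("inf")
    n = len(proxy)
    t = r * math.sqrt((n - 2) / (1 - r * r))
    return abs(t) >= t_threshold, r, t
```

          Any proxy whose rank correlation satisfies |r| ≥ roughly 0.236 (the r at which |t| = 2 with 68 degrees of freedom) would pass this screen, consistent with the smallest |t| being just above 2.0.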

      • Kenneth Fritsch
        Posted Jun 11, 2012 at 12:18 PM | Permalink

        LL, I need to dig further into these proxies, and at a more leisurely pace to avoid errors. I want to look at the coral proxies separately from the TRW proxies, as I believe your graph does. What bothers me most about observing the individual proxy series is how different the low-frequency structure is from 1800 forward. In some coral proxies there appears to be a more or less monotonic upward trend starting somewhere between 1800 and 1900.

    • clt510
      Posted Jun 10, 2012 at 6:28 PM | Permalink

      RomanM, Wasn’t the irregular spacing/missing samples part of the basis of using the Loess method?

      • RomanM
        Posted Jun 11, 2012 at 6:26 AM | Permalink

        It appears to me that the only place that loess was used as a smoother for the mean temperature was in Figure S5, where it was applied to a regularly spaced mean temperature series compared to a similar smoothing of the mean of their over-hyped 3000-member reconstruction ensemble.

        My point in raising the issue of the extremely large number of missing measured temperature values was twofold. First, there would be issues in calculating the mean HadCRU annual series. Secondly, calculating the PCs for the gridded temperatures to be used in a PCR would be impossible without infilling an enormous number of values thereby introducing patterns into the PCs themselves.

        How this would have been done needs to be explained clearly in the Supplement.

    • HaroldW
      Posted Jun 10, 2012 at 10:37 PM | Permalink

      I’d like to pose a different question about the Gergis paper.

      Figure 4 shows the temperature reconstruction. It depicts uncertainty regions as “ensemble uncertainty” and “ensemble and calibration uncertainty”. Before 1430, there are often only two proxies which contribute. For those time periods, the “ensemble uncertainty” is merely proportional to the difference between the two proxies, because all members of the ensemble can only vary the relative weights of the two proxies. For years when those proxy temperatures are close, the “ensemble uncertainty” is essentially nil. There remains calibration uncertainty, which appears to be constant during the two-proxy era, slightly less than 0.4K (2SE). But where is the uncertainty due to spatial sampling? Even if we had exact temperatures for the two sites (which are in Tasmania & New Zealand), that is, if the calibration uncertainty were zero, we wouldn’t be able to estimate the regional temperature exactly.

      By not including the sampling uncertainty, Figure 4 underestimates the uncertainty of the reconstruction.

      Further evidence that the uncertainty is underestimated lies in the out-of-sample reconstruction accuracy. In 1900–1920, two of the 21 years show a difference between reconstructed and instrumental temperatures which exceeds the “2SError” value. For 1991–2001, two of the 11 years show an excessive difference. Thus, of the 32 years for which out-of-sample accuracy can be assessed, four exceed the “2SError” amount, which should be exceeded only 5% of the time.
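
      As a rough plausibility check on that last count: if each of the 32 out-of-sample years independently had the nominal 5% chance of falling outside the 2SE band, the probability of seeing 4 or more exceedances works out to about 7% – suggestive of understated uncertainty, though not decisive on its own. The independence assumption (no year-to-year autocorrelation) is mine, not the paper’s. A short stdlib sketch:

```python
from math import comb

def binom_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): the chance of at least k exceedances
    in n independent years, each with exceedance probability p."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))

# 4 or more of 32 out-of-sample years outside a band meant to fail 5% of the time
p_four_or_more = binom_tail(4, 32, 0.05)  # about 0.074
```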

  71. Paul Matthews
    Posted Jun 9, 2012 at 12:19 PM | Permalink

    “put on hold” definitions:
    to postpone something; to stop the progress of something.
    to decide that you will leave an activity until a later time.

    The paper has not been “put on hold”.
    It was published online by J Clim weeks ago, and has now been withdrawn.

    • John M
      Posted Jun 9, 2012 at 1:07 PM | Permalink

      Paul, now that you mention it, I’ve never heard of an “on hold” category for the scientific publishing process. I was inclined to let them have their “on hold”, but your comment prompted me to do some googling. AMS does not seem to have any published guidelines with regard to “in press” withdrawals/holds, but Elsevier has an extensive set of policies:

      •Article Withdrawal: Only used for Articles in Press which represent early versions of articles and sometimes contain errors, or may have been accidentally submitted twice. Occasionally, but less frequently, the articles may represent infringements of professional ethical codes, such as multiple submission, bogus claims of authorship, plagiarism, fraudulent use of data or the like.
      •Article Retraction: Infringements of professional ethical codes, such as multiple submission, bogus claims of authorship, plagiarism, fraudulent use of data or the like.
      •Article Removal: Legal limitations upon the publisher, copyright holder or author(s).
      •Article Replacement: Identification of false or inaccurate data that, if acted upon, would pose a serious health risk.

      http://www.elsevier.com/wps/find/intro.cws_home/article_withdrawal

      Which of those would apply I guess would depend on the ultimate outcome of this thing.

      AGU has a retraction/withdrawal policy, but nothing about “on hold”.

      http://www.agu.org/pubs/authors/policies/retraction_policy.shtml

      Maybe “on hold” will have to be created as a special category exclusive to climate papers. I wonder if the IPCC can alter their policy in time…

    • Skiphil
      Posted Jun 9, 2012 at 2:07 PM | Permalink

      On 17 May 2012, with much fanfare in a variety of media outlets, the Gergis et al team announced that this study had been “published TODAY” in the Journal of Climate. That is still the language in the original PR of 17/05/2012 on the University of Melbourne website (available online as of this moment):

      “The study published today in the Journal of Climate will form the Australasian region’s contribution to the 5th IPCC climate change assessment report chapter on past climate.”

      I’m not familiar with a term “on hold” to refer to an already published study. Can a journal “publish” a study online but not issue a formal withdrawal or retraction if serious problems arise?

      Can an already published study simply be revised without a formal retraction first??

      • Skiphil
        Posted Jun 9, 2012 at 2:53 PM | Permalink

        Well, of course, their notice refers to the “print publication” which is put on hold… this raises interesting issues in the new world of online journal publication.

        Are there previous cases of flaws being discovered between the time of “online” publication and print publication of journal articles?

        Anyone know how other journals have treated this kind of problem?

    • theduke
      Posted Jun 9, 2012 at 2:16 PM | Permalink

      Looks like special IPCC rules are now in effect. Karoly has been a lead IPCC author since 1998 and Broccoli was an author for AR4. Here’s some background on Broccoli and his highly unusual choice of a reviewer for the O’Donnell paper:

      http://wattsupwiththat.com/2011/02/17/peer-review-pal-review-and-broccoli/

    • John M
      Posted Jun 9, 2012 at 2:45 PM | Permalink

      OK, I did find some that kinda sorta relate to “on hold” in this context.

      The first one refers to manuscripts during the review process, and not to “in press” papers:

      Manuscript on Hold Pending Further Investigation (when authors have not responded satisfactorily to the above letters)

      http://www.councilscienceeditors.org/i4a/pages/index.cfm?pageid=3571

      Probably not an example anyone would want to be associated with.

      Medical journals seem to have adopted “on hold” for cases where image manipulation is suspected. This is just one example:

      All images in Figures and Supplemental information from manuscripts accepted for publication are examined for any indication of improper manipulation or editing. Questions raised by Blood staff will be referred to the Editors, who may then request the original data from the authors for comparison with the submitted figures. Such manuscripts will be put on hold and will not be prepublished in Blood First Edition until the matter is satisfactorily resolved. If the original data cannot be produced, the acceptance of the manuscript may be revoked.

      http://bloodjournal.hematologylibrary.org/site/authors/authorguide.xhtml#image

      Another one that I doubt any author would want to be associated with.

      One final example deals with concerns over terrorism. You may remember the bird flu papers that were “put on hold”.

      http://real-agenda.com/2012/05/03/fear-sells-studies-on-mutant-h5n1-virus-finally-published/

      Obviously represents a special case, and the “on hold” is a term used by the blog and not necessarily the journal.

      • DocMartyn
        Posted Jun 9, 2012 at 6:50 PM | Permalink

        This image-file issue is actually a big problem; we have no idea how widespread ‘unreal’ images are. We all know cowboys (very few cowgirls) who get images that are too crisp and have no background distortion.
        Most of us who take images keep a virgin original, with all the embedded data, then do all our background subtraction and digital manipulation on a copy.
        There are ways to alter the original image, in some file formats, without the changes being logged. There is no audit trail for JPEG or for TIFF files. The Nikon JPEG200 files do have a log, so manipulations are recorded (I keep these).
        When you see what people can do with Photoshop it’s rather frightening.
        You are not allowed to electronically remove blemishes from images; that’s a no-no, so I always stick blemishes under the image labels, which is allowed.
        I looked at an electronic file of a gel in a paper. In half of the lanes the pixels were aligned to the vertical; in the other half they were at 30 degrees. The image was a montage of two images. The data might have been quite kosher, but although you are allowed to crop an image, you are not allowed to combine two images.
        All derived X-ray crystal structures now have to have the original X-ray electron density data submitted, so one can download the electron density and build one’s own crystal structure. This was done because creating the structure is a very human process and it is actual people who supply many of the inputs. It is quite possible for two people to get different final crystal structures from the same data set. You can ‘force’ the electron density in one area to be something, and the rest of the structure deforms around it; after 3 or 4 iterations there is no anomaly, and what you forced is ‘real’.

  72. ferd berple
    Posted Jun 9, 2012 at 12:24 PM | Permalink

    It would appear that the Screening Fallacy may be an example of “selecting on the dependent variable”.

    The assumption is that trees are a proxy for temperature. By selecting only those trees that correlate with temperature, they are overestimating the confidence that trees are in fact good temperature proxies.

    By excluding those trees that do not correlate with temperature, they are hiding the data that shows that trees may not be very good proxies for temperature.

    The problem is that the underlying assumption – that trees are good temperature proxies – has not been established. Thus, selecting based on the underlying assumption can lead to bias and faulty conclusions.
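
    ferd’s point is easy to demonstrate with a toy simulation. Everything below is hypothetical – a 70-year linear “temperature” trend, AR(1) “proxies” that contain no temperature signal at all, and an |r| cutoff of 0.235 (roughly the two-sided 5% critical value for n = 70). The point is that persistent red noise passes an undetrended correlation screen far more often than the nominal 5%:

```python
import random

def ar1_series(n, phi, rng):
    """AR(1) 'red noise': x[t] = phi*x[t-1] + white noise."""
    x = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, 1.0))
    return x

def corr(x, y):
    """Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def screened_fraction(phi, n_years=70, n_proxies=500, r_crit=0.235, seed=42):
    """Fraction of pure-noise 'proxies' passing an undetrended |r| screen
    against a linearly trending 'temperature' series."""
    rng = random.Random(seed)
    temp = [i / n_years for i in range(n_years)]  # pure linear trend, no noise
    hits = sum(abs(corr(ar1_series(n_years, phi, rng), temp)) > r_crit
               for _ in range(n_proxies))
    return hits / n_proxies
```

    With white noise (phi = 0) the pass rate stays near the nominal 5%; with strongly autocorrelated red noise (phi = 0.9) it is several times higher, even though no series carries any temperature information – which is the Screening Fallacy in miniature.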

  73. Posted Jun 9, 2012 at 12:41 PM | Permalink

    Andrew Bolt takes a well-earned victory lap:

    http://blogs.news.com.au/heraldsun/andrewbolt/index.php/heraldsun/comments/another_hockey_stick_snaps/

    • Posted Jun 9, 2012 at 1:21 PM | Permalink

      Well-earned but rather restrained I thought – high on information, low on snark. Well done Mr Bolt.

  74. Stephen Richards
    Posted Jun 9, 2012 at 1:01 PM | Permalink

    In all the papers I ever wrote, the peer review process, by senior scientists, was the opportunity for them to stamp their personality (and name) on my work. I always asked my closest colleagues to do my reviews before the document went upwards.

  75. William Larson
    Posted Jun 9, 2012 at 4:36 PM | Permalink

    It’s interesting (to me, at least) what Mr. Karoly does NOT say in his letter to Mr. McIntyre:

    1. He does not say that he (and/or any of his co-authors) has been following this discussion on CA. Yet obviously he knows about it, otherwise he could not have written to SM in the first place.

    2. He does not acknowledge that it looks strange, to say the least, that Jean S. noticed and posted about this problem AT PRECISELY THE SAME TIME as Mr. Karoly et al. “independently” discovered it. That apparent coincidence is so striking that it demands to be addressed, but he doesn’t do it.

    3. He does not state that his notification to the journal included an attribution to CA, which a simultaneous coincident discovery would require.

    So it does indeed bear the marks of being “Gavinesque”: “You found out about it; I take the credit for it.” Reminds me of an old Bill Cosby skit on some PBS children’s program from long ago, where he is an actor entering the scene and trying to say “All for one, and one for all!”, Musketeer-wise, but he can never get this simple line right. My favorite of his attempts, which applies here, is, “All for me, and nothing for you!”

    • Posted Jun 9, 2012 at 4:46 PM | Permalink

      He does not say that he (and/or any of his co-authors) has been following this discussion on CA.

      And he doesn’t post his comment in the box down here, like the rest of us hoi polloi. Respect to Myles Allen for at least breaking that taboo.

    • Posted Jun 9, 2012 at 9:11 PM | Permalink

      “we discovered on Tuesday 5 June that the records used in the final analysis were not detrended for proxy selection”

      He may be referring to the time when they located the computer code error. It sounds like he’s reading from a change log.

      • Posted Jun 10, 2012 at 4:25 AM | Permalink

        It sounds like you’re reading from some tea leaves Nick. If you peer hard enough you can see the face of Michael Mann.

        • Posted Jun 10, 2012 at 12:30 PM | Permalink

          The inclusion of a specific date is to publicly set up the timeline in a favourable manner. That’s the only conclusion to draw from it.

          But while changelogs have a mention, and bearing in mind your comment on unit testing, Richard, I’d settle for less than that; evidence of some rigor in version control and labeling would be a good start.

        • theduke
          Posted Jun 10, 2012 at 12:57 PM | Permalink

          mrsean2k said: “The inclusion of a specific date is to publicly set up the timeline in a favourable manner. That’s the only conclusion to draw from it.”

          As I commented earlier, I see intentional ambiguity in it. He could easily have said, “Tuesday 5 June, 1 PM Melbourne time” and laid unambiguous claim to having found it first, but by not doing that he allows everyone to claim a measure of victory. If you go to RC you will see they are interpreting it as Gergis et al found their error. Here most of us are not.

          That said, everyone knows WHY they found it, regardless of WHEN they found it. It was the work that was being done here.

  76. Lars P.
    Posted Jun 9, 2012 at 5:55 PM | Permalink

    Steven Mosher says (Posted Jun 8, 2012 at 5:36 PM):
    “haha. so funny. I was going to predict this was going to happen on the other thread
    and then thought my comment was too cynical.”

    Well, to continue being cynical: they might have discovered “on Tuesday 5 June that the records used in the final analysis were not detrended for proxy selection, making this statement incorrect” by reading the post at Climate Audit – the statement does not explain how it was discovered.

    Now am I too cynical saying this?

  77. Posted Jun 9, 2012 at 6:29 PM | Permalink

    “In my first post on Gergis et al on May 31, I had referred to the Screening Fallacy. The following day (June 1), the issue of screening on de-trended series was discussed in comment…What the present concession means – is that my concession was premature and that the screening actually done by Gergis et al was within the four corners of the Screening Fallacy.”

    Yes, you “referred” to it, via an accusation that Gergis et al had violated it. An issue which I then took up, but you never actually *explained* it. I asked you twice what the null model for your red-noise process was and never got an answer. Sometime later that day you said that I was mistaking a site-level process for a network-level process; that is neither an explanation of what your null model is, nor is it likely correct no matter what the specifics of that model are.

    So, once again, if you are proposing that a random, red-noise process with no actual relationship to the environmental variable of interest (seasonal temperature) causes a spurious correlation with that variable over the instrumental period, then I defy you to show how such a process, with ANY level of lag-1 auto-correlation, operating on individual trees, will lead to what you claim. And if it won’t produce spurious correlations at individual sites, then it won’t produce spurious correlations with larger networks of sites either.

    Furthermore, whatever extremely low probability of such a result might hold for a “typical” site having 20 to 30 cores is rendered impossible, in any practical sense of the word, by the much higher numbers of cores collected at each of the 11 tree-ring sites they used. So your contention that this study falls squarely within this so-called “Screening Fallacy” is just plain wrong until you demonstrate conclusively otherwise. Instead of addressing this issue – which is the crux of your argument – you just go on to one new post after another.

    I don’t know why they calibrated on detrended data (if in fact they did, which is uncertain now). There are arguments to be made both for and against such a practice, regardless of how common it is. But I do know that if they actually calibrated without detrending, then (1) my arguments above apply and (2) both Nick Stokes and Kenneth Fritsch have provided calculations indicating that a high percentage of their 27 sites would have passed a p = .05 screening criterion.

    • DocMartyn
      Posted Jun 9, 2012 at 7:36 PM | Permalink

      “Nick Stokes and Kenneth Fritsch have provided their calculations indicating that a high percentage of their 27 sites would have passed a p = .05 screening criterion”

      There were 62 sites examined. 8 sites with detrended data appear to pass a p = .05 screening criterion; call it 13%. We would, a priori, expect at least 3 sites to be included just because of random noise (0.05 × 62).
      So 8 sites pass the test and approximately 3 are bogus noise with the right line shape at the end of the series, but you don’t know which 3 (or 2 or 4).

      • Posted Jun 9, 2012 at 7:48 PM | Permalink

        “There were 62 sites examined. 8 sites with detrended data appear to pass a p = .05 screening criterion; call it 13%. We would, a priori, expect at least 3 sites to be included just because of random noise (0.05 × 62).
        So 8 sites pass the test and approximately 3 are bogus noise with the right line shape at the end of the series, but you don’t know which 3 (or 2 or 4).”

        That’s *detrended* data you’re talking about. My comment that you quoted was w.r.t. *trended* data.

        • DocMartyn
          Posted Jun 9, 2012 at 9:57 PM | Permalink

          Upthread you can see that 13 of the 21 proxies pass at p=0.05, without detrending, but we don’t know how many of the 41 eliminated proxies would also have passed. We don’t know because the data are not publicly archived. If the data were publicly available then we could make the determination.

          However, I am still not sure what your point is, if you indeed have a point.
          In the publication under discussion the authors stated the methodology they used for selection, and stated why they used this methodology rather than some other method. The methodology used to select a sub-set of data for analysis is bound to be biased; that is the whole point of the exercise.
          The authors were quite confident that the best way to select temperature-responsive series was to use detrended data. If that is the selection criterion, then all the series they use must be examined in this way. One cannot, post hoc, suddenly declare ‘no, we will use some other criterion because it gives a result more to our liking’.

          Imagine if a pharmacology company was running a drug trial and found at the end of a year that six of the 24 patients taking the drug had died. Later in the post-hoc analysis phase they found that of the six dead there were three of Polish ancestry, two Hungarian and one Austrian. You think the company could just remove all Eastern Europeans from the drug and placebo groups and report a 100% success rate for their drug?

          That is an example of post-hoc selection bias. Selection bias is an intrinsic human trait. Our brains have evolved for pattern recognition; the seeing of ghosts and demons in the shadows cast by the camp fire is the price we pay to be able to spot a carnivorous lion in the grass. You live a longer life if you have a huge number of false positives than you do by having a single false negative.
          Because of our innate human bias to find patterns, scientists must design procedures to remove themselves from the selection of data, or from its interpretation. Drug companies have to run double blind trials, so no one knows who has been given a drug and who has got the sugar pill until the very end.
          Selecting temperature proxies should work the same way. You accept all data sets that pass certain tests, and, most importantly, the testing criteria MUST be decided before the data are analyzed.
          The authors of Gergis et al. themselves set the criteria that the proxies needed to meet before they were analyzed for a temperature reconstruction. Gergis et al. stated that the data had to be detrended. Therefore, the only selection criterion that matters is the use of detrended data; we know this to be the case because this is what Gergis et al. stated in their methods section.
          There is nothing more to be said.

        • Posted Jun 9, 2012 at 10:09 PM | Permalink

          Doc, sorry but you have no idea what you’re talking about. You’ve put your conclusions before your analysis. There ain’t much I can say that would help you.

        • Steven Mosher
          Posted Jun 9, 2012 at 10:23 PM | Permalink

          Jim,

          what Doc is arguing is not about putting a conclusion before the analysis. quite the opposite.
          he is arguing this.

          1. The authors decided ahead of their test that they should detrend. fair enough.
          a) mosher sidebar: this decision carries an unstated uncertainty.
          2. They performed their screening thinking that they used that test, and reported results.
          3. Now, they discover that they did not apply the test.
          4. They cannot get a “do over” by arguing NOW that they should not detrend.

          basically, the only high-ground option they have is to re-run everything using detrended
          data and report the results. If somebody else wants to publish a paper using non-detrended
          data, they can of course. But then comes the question: to detrend or not to detrend.
          That question cannot be answered by merely looking at differences between the two results.
          That’s a methods question and methodological uncertainty question.

          So, your observation about the difference between detrending and not detrending does one thing:
          it reinforces the point I make in sidebar (a).

        • Posted Jun 9, 2012 at 11:01 PM | Permalink

          Steven,

          Most calibrations are done using the annual (non-detrended) data–that’s very standard practice.

          Before any comments were made here – and before Steve even posted his later note in the first post saying that they had calibrated with detrended data – I had read in their paper that they had done so, and wondered why, because it’s not very common. Do I think they should have explained why better than they did – yes I do. There are arguments to be made for doing so (tougher ones to make, IMO, for sure), and others for not doing so, and if I had been a reviewer I would definitely have required them to explain exactly why they did that, beyond what they stated about how the global warming signal over the last 100 years could inflate the significance of the calibration relationship. I didn’t understand that at all.

          If it turns out that they used the trended data instead of the detrended for determining the calib. relationships, then I’m fine with that, because there’s nothing obviously wrong with that. If it turns out the problem is deeper, then they will have to re-do their analysis. It happens.

        • Posted Jun 9, 2012 at 11:11 PM | Permalink

          “They cannot get a “do over” by arguing NOW that they should not detrend.”
          I don’t agree. There’s clearly an embarrassment in backing down from the argument that they made for detrending. But in fact it isn’t a good idea, isn’t customary, and furthermore won’t work here. It’s not a novel situation.

          The data exists and is (mostly) public. As you say, someone else could come and do the analysis without detrending. But it’s just arithmetic, and I don’t see how it matters who does it. In fact, they would be doing the arithmetic that’s already been done.

        • DocMartyn
          Posted Jun 9, 2012 at 11:26 PM | Permalink

          Jim, may I draw your attention to the ‘Monty Hall paradox’. This is probably the best example of the statistical effect of discarding samples.

          ‘Suppose you’re on a game show, and you’re given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1 [but the door is not opened], and the host, who knows what’s behind the doors, opens another door, say No. 3, which has a goat. He then says to you, “Do you want to pick door No. 2?” Is it to your advantage to switch your choice?’

          The contestant should always switch to the other door. If the car is initially equally likely to be behind each door, a player who picks Door 1 and doesn’t switch has a 1 in 3 chance of winning the car while a player who picks Door 1 and does switch has a 2 in 3 chance. The host has removed an incorrect option from the unchosen doors, so contestants who switch double their chances of winning the car.

          http://en.wikipedia.org/wiki/Monty_Hall_problem
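
          The paradox is easy to verify numerically. A minimal simulation (door numbering, trial count, and seed are arbitrary):

```python
import random

def monty_hall(trials=100_000, switch=True, seed=1):
    """Simulate the Monty Hall game: the host always opens a goat door
    that the contestant did not pick; return the win frequency."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        car = rng.randrange(3)
        pick = rng.randrange(3)
        # host opens a door that is neither the contestant's pick nor the car
        opened = next(d for d in (0, 1, 2) if d != pick and d != car)
        if switch:
            # switch to the one remaining unopened door
            pick = next(d for d in (0, 1, 2) if d != pick and d != opened)
        wins += (pick == car)
    return wins / trials
```

          Switching wins about 2/3 of the time; staying wins about 1/3 – exactly the effect of the host discarding a known-bad sample.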

        • Freddy
          Posted Jun 9, 2012 at 11:57 PM | Permalink

          Everything is obvious with hindsight. Ex post arguments for which technique, trended or detrended, is correct have little credibility. If the correct method is to use trended data, then this needs to be applied to new out-of-sample data, not data for which the answer is already known.

        • HAS
          Posted Jun 9, 2012 at 11:57 PM | Permalink

          Nick Stokes Jun 9, 2012 at 11:11 PM

          It sort of depends what error they are putting their hand up for.

          If they are saying “whoops, we made a mistake, we accidentally used the wrong data”, then they can’t really run the experiment any differently; they should report the results and draw whatever conclusions follow.

          If on the other hand they want to rethink their methodology i.e. they now believe that de-trending was the wrong thing for methodological reasons, I guess they need to argue that case to the community, just re-doing the sums won’t be enough.

      • Posted Jun 10, 2012 at 12:41 AM | Permalink

        HAS,
        “I guess they need to argue that case to the community”

        It’s not a novel situation. Screening proxies is a basic part of the analysis, and the community has been doing it for twenty years or more, mostly, as Jim Bouldin says, without de-trending. Ross quotes Ammann and Wahl saying why de-trending is a bad idea. G et al have inadvertently used the customary method.

        Since they have promoted their alternative, which is more demanding, there’s plenty of egg on face in returning to the main path. But it’s an option.

        And Freddy, I think you have a different situation in mind. There is a limited set of published proxies available – G identified 62. All that anyone is likely to do at present is take those 62 and do a screening with or without detrending. It’s not as if they are selecting from a vast range of ways to proceed.

        • Freddy
          Posted Jun 10, 2012 at 1:47 AM | Permalink

          If there is such a limited set of data available then researchers need to be extremely careful not to make mistakes in methodology. Lack of appropriate out of sample data does not strengthen the validity of results obtained from ex-post determined methodology.

        • HAS
          Posted Jun 10, 2012 at 2:27 AM | Permalink

          Nick Stokes

          “Screening proxies is a basic part of the analysis.”

          I must say that in trying to come to grips with all this I find this is one of the things I have difficulty getting my head around. If I have it right we need to reduce the dimensionality of the model to prevent over fitting, and hence proxies get screened (correct?).

          Why not introduce more independent variables to better model the proxy performance? It might make hindcasting a bit more difficult (but perhaps more honest). Is the problem that proxies don’t come with enough other environmental information, even over their recent history? Even given this, would we not be better off looking to models of species growth derived under controlled environments as a basis for fitting, rather than mining the data? If we do need to mine the data, why not reduce the dimensionality by random selection of proxies rather than non-random?

          Just out of interest I must have a look at what other signals have been got out of these proxies.

          So much to learn so little time.

        • Posted Jun 10, 2012 at 3:16 AM | Permalink

          HAS,
          I don’t think it’s to do with dimensionality or overfitting. It’s much simpler. Some time series have been observed. There’s a hope that those that correlate with instrumental temperature in the overlap period will maintain the same relationship in earlier times. So you screen to see which ones do correlate to some specified degree, and select them. It’s just searching for measurable properties in a known set of series.

        • HAS
          Posted Jun 10, 2012 at 3:27 AM | Permalink

          Ta.

          If so, that suggests it’s even more problematic than I thought. Must read a bit.

        • Posted Jun 10, 2012 at 8:37 AM | Permalink

          Nick Stokes

          Posted Jun 10, 2012 at 12:41 AM
          ….
          Screening proxies is a basic part of the analysis, and the community has been doing it for twenty years or more, mostly, as Jim Bouldin says, without de-trending
          ….

          While calibration should be done without detrending, screening the proxies first without detrending distorts the statistical results enormously, and in this case works to generate hockey sticks.

          As I have noted earlier, one way to correct the statistics is to include all 62 candidate proxies in the numerator degrees of freedom of the F test of joint significance of the full network. However, with only 70 observations (and therefore only 7 denominator DOF), this creates an impossibly high threshold.
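To see how punishing that threshold is, note that with 62 regressors and only 70 observations, pure noise already “explains” about 90% of the variance by chance. A Python sketch (using numpy; only n = 70 and k = 62 are taken from the comment above, everything else is illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 70, 62          # observations in the calibration period, candidate proxies

r2 = []
for _ in range(200):
    X = rng.standard_normal((n, k))   # 62 pure-noise "proxies"
    y = rng.standard_normal(n)        # a pure-noise "temperature" target
    Xc, yc = X - X.mean(0), y - y.mean()
    beta = np.linalg.lstsq(Xc, yc, rcond=None)[0]
    resid = yc - Xc @ beta
    r2.append(1 - resid @ resid / (yc @ yc))

print(np.mean(r2))  # ~ k/(n-1) = 62/69, i.e. roughly 0.90 from noise alone
```

With chance alone delivering R² near 0.9, a joint-significance test with 62 numerator and 7 denominator degrees of freedom demands a fit that real proxies cannot plausibly supply.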

          One solution to this dimensionality problem, which is sort of pursued by Gergis et al, is to first reduce the rank of the network to a few PCs. But this should be done on the unscreened network, not after screening on the variable of interest as Gergis et al ended up doing. And an appropriate method should be selected and reported, not buried in 3000 unspecified randomly generated variations.

          Screening on the detrended series, as Gergis et al said they were doing, would likely eliminate a spurious HS in the result, since its signature is a strong 20th-century uptrend, which would be removed. However, it will still bias the confidence intervals, since it will spuriously improve the fit of the proxies to the higher-frequency portion of the target series. So even if Gergis et al “correct” their paper, it will still be wrong.

          Your statement that the community has been pre-screening routinely for 20 years or more is a profound condemnation of its statistical methods.

          Another solution to the dimensionality problem is Partial Least Squares, as used in the recent Kinnard et al. Arctic Ice paper discussed some here last Dec. This cherry picks and overfits with a vengeance, but, at least as exposited by Hermann Wold and Svante Wold, then attempts to evaluate the results for spurious fit with a cross-validation Q2 statistic. Perhaps there will be further discussion of the Kinnard paper here in the future, where this can be taken up in more detail.

        • ferd berple
          Posted Jun 10, 2012 at 10:28 AM | Permalink

          Nick Stokes
          Posted Jun 10, 2012 at 3:16 AM | Permalink
          So you screen to see which ones do correlate to some specified degree, and select them. It’s just searching for measurable properties in a known set of series.
          ========
          Not correct. What the process does is select for those trees that accidentally match the short term average, and thus gives extra weight to the noise, while giving the appearance of increased confidence in the result.

          Thus we see a trend during the period used for screening, and little or no trend outside the period of screening. This shape has become known as the hockey stick.

          It is valid to use a screening method like this when you are for example calibrating true thermometers, that actually do only respond to temperature, and you want to eliminate those that are non functioning. However, when you have trees that respond to many different factors, and respond non-linearly to temperature, you cannot use this method reliably.

          For example. Tree growth is inhibited by temperatures that are both too high and too low. However, climate science assumes that high temperatures always lead to increased growth and low temperatures always lead to reduced growth.

          Thus, a tree that appears to correspond to low temperatures may in fact be responding to high temperatures. There is no way to know this during the screening. Other trees that appear to be responding to temperature may in fact be responding to moisture. Again there is no way to know this during the screening.

          Steve: ferd, in your commentary, please distinguish between “trees” and “site chronologies”. The Gergis screening is at a site-chronology level – not at the level of individual trees, all of which go into a site chronology. The screening point stands at a chronology level – not at an individual-tree level. This distinction needs to be made to persuade a Jim Bouldin.

        • HAS
          Posted Jun 10, 2012 at 4:54 PM | Permalink

          If screening is done simply to identify a temp signal in a proxy, I assume the model being fitted (I find it much easier to think in these terms) is one where each proxy can be decomposed into a temp and a non-temp related component, and screening is testing if the coefficient on the temp-related variable for each proxy is significantly different from “0”. Hence some of the suggestions about using partial least squares etc. Correct?

          I have a prior problem though. Given that these site chronologies have already been constructed on the basis of the individual tree rings showing some relationship to contemporary instrumental temperature series somewhere on the globe (or SH), just what is this second step doing? We know that the chronologies are sensitive to temp (that’s been designed in) so is not its primary purpose just to test the relationship between the various local temps and a regional construct? The problem with screening the chronologies is that we don’t know whether we are screening on the proxy to temp relationship or the relationship between local temp and regional construct.

          Wouldn’t we have increased the signal-to-noise ratio in this study if it had just reported the various chronologies and their local reconstructions (with errors), reported on how well the local instrumental temp series correlated with the regional construct (with errors), and reported the cumulative errors from each inference along the way?

          What additional information is being provided by the study? (I can see a lot more noise in it.)

    • Kenneth Fritsch
      Posted Jun 9, 2012 at 8:59 PM | Permalink

      “So, once again, if you are proposing that a random, red noise process with no actual relationship to the environmental variable of interest (seasonal temperature) causes a spurious correlation with that variable over the instrumental period, then I defy you to show how such a process, with ANY level of lag 1 auto-correlation, operating on individual trees, will lead to what you claim. And if it won’t produce spurious correlations at individual sites, then it won’t produce spurious correlations with larger networks of site either.”

      Jim, I think the Gergis exercise has shown exactly why the ARIMA/ARFIMA model does apply here. When doing the detrended high-frequency wiggle-matching correlations we see a degraded relationship compared to not detrending. The difference in this case is that the proxies can better trend match than wiggle match. If one can show that the low-frequency response of the ARIMA/ARFIMA model can produce series-ending trends (upwards and downwards) where no deterministic trends are added to the models, then the simple indication is that the low-frequency response of proxies can fit the low-frequency response of the models. Proxy series can be well modeled using ARIMA and ARFIMA.

      You have steadfastly hung your complaints on the high frequency matching of a proxy to temperature and that the models cannot do that. You are right. Proxies can respond in step to temperature but before the proxy is a valid thermometer it has to match or at least approximate the amplitude of those wiggles. In fact a good wiggle match by a proxy with temperature in time but not with amplitude is not a good indicator of a valid thermometer and can readily result in a reconstruction with divergence.

      Trend matching is what Gergis did by mistake and if one is allowed to do trend matching and those trends can occur by chance as in the models then selecting for trends invalidates the statistics of reconstructions.

      Think of the models as describing a low frequency response in the proxies with the higher frequency response of proxies replacing the noise in the models.

      • Posted Jun 9, 2012 at 9:57 PM | Permalink

        “Jim, I think the Gergis exercise has shown exactly why the ARIMA/ARFIMA model does apply here. When doing the detrended high-frequency wiggle-matching correlations we see a degraded relationship compared to not detrending. The difference in this case is that the proxies can better trend match than wiggle match. If one can show that the low-frequency response of the ARIMA/ARFIMA model can produce series-ending trends (upwards and downwards) where no deterministic trends are added to the models, then the simple indication is that the low-frequency response of proxies can fit the low-frequency response of the models. Proxy series can be well modeled using ARIMA and ARFIMA.

        You have steadfastly hung your complaints on the high frequency matching of a proxy to temperature and that the models cannot do that. You are right. Proxies can respond in step to temperature but before the proxy is a valid thermometer it has to match or at least approximate the amplitude of those wiggles. In fact a good wiggle match by a proxy with temperature in time but not with amplitude is not a good indicator of a valid thermometer and can readily result in a reconstruction with divergence.

        Trend matching is what Gergis did by mistake and if one is allowed to do trend matching and those trends can occur by chance as in the models then selecting for trends invalidates the statistics of reconstructions.

        Think of the models as describing a low frequency response in the proxies with the higher frequency response of proxies replacing the noise in the models.”

        Kenneth, I strongly disagree with you here, on several points.

        The issue of the temporal scale at which to calibrate, and the amplitudes of the variables involved in such calibrations is an interesting one and deserves more examination, but that’s not the principal issue here. I’ll return to it in a moment.

        The principal issue here is replication. Calibration is always performed against site *chronologies* (or groups of such, or principal components from groups of such). It is *never* performed on single trees or cores. Each site chronology derives from a set of, typically, at least 20 cores, sometimes (as in this case) many more than that.

        An ARMA, ARIMA, or ARFIMA model will *not*, except in the most extreme unlikelihood and with extremely high lag-1 ac, produce a multi-decadal “tailing up or down” of a site chronology. It might occasionally do it for a single core, depending on the ac value, but essentially never for a typical collection of 20 to 30 cores. If that’s possible, then I want to see the definite proof of it, because all of these “screening fallacy” arguments hinge directly upon this assumption, and nothing even remotely approaching proof of it has been offered.

        Calibration using annual data does not necessarily favor either low or high frequency correlations; only when you filter or detrend the data are you specifically targeting one frequency response. You can calibrate at the annual scale (what you call “high frequency”) and get a very strong relationship with a climate trend over say 100 years, but a very poor relationship with the detrended data (what you call wiggle matching). The reason for this is very simple–the individual yearly indices of the cores have a lot of noise due to individual tree responses, each of which causes some deviation from the climate driver each year, but the collective response of which +/- tracks the longer term trend.

        Conversely, one could conceivably also get a very strong correlation using the detrended data, but a poor one at the longer, century scale, if for example temperature is acting as a secondary, and largely independent, driver to something else. However, the other way you can get this is if there is some bias in the detrending of the age/size effect from each tree, one which systematically affects the estimates of low-frequency variance, as for example in the well-known “segment length curse”, where some low-frequency variation is lost but the high-frequency variation is all retained.

        Trend matching is not what Gergis et al did by mistake, but rather “detrended matching”, i.e. they regressed the climate residuals on the chronology residuals (or at least, stated that they did in the manuscript). That was the error.

        • Posted Jun 9, 2012 at 10:32 PM | Permalink

          @Jim:

          1. All the trees in one location could easily be subject to common non-temperature influences that follow an ARIMA process of some sort.

          2. I don’t see why trees couldn’t have very high AR coefficients even if each tree were i.i.d. from the others.

        • Posted Jun 9, 2012 at 10:47 PM | Permalink

          Steve,

          #1 yes I agree with that. It’s just not due to the red noise process that has been argued. At dry sites you have to be particularly conscious of water, and CO2 fert is a potential issue also.

          #2 they can, I’m not arguing that they can’t. I’m arguing that even a very high AR coefficient will not produce a *chronology* that spuriously correlates with climate over your typical 100-year period (and also passes calibration/validation screening): the process is random from tree to tree, by definition, so they’re not going to tail up or down in concert for any significant length of time – a few years at best.

        • Layman Lurker
          Posted Jun 10, 2012 at 12:35 AM | Permalink

          Jim, any chronology which follows a signal other than temperature is effectively “noise” when calibrated against temperature isn’t it?

        • Posted Jun 10, 2012 at 7:50 AM | Permalink

          Jim Bouldin
          Posted Jun 9, 2012 at 10:47 PM
          …. I’m arguing that even a very high AR coefficient will not produce a *chronology* that spuriously correlates with climate over your typical 100-year period (and also passes calibration/validation screening): the process is random from tree to tree, by definition, so they’re not going to tail up or down in concert for any significant length of time – a few years at best.

          Jim —
          The average of 2 identical and independent AR(1) processes is again AR(1), with the same AR coefficient. So if individual trees have AR(1) eccentricities, the site average will also be AR(1) (though with smaller variance).
          This of course leaves the interesting question, of whether the persistence in the average is due to temperature, or to precipitation, ground water, CO2 fertilization, logging, etc.
          CO2 is easy enough to control for. It’s amazing that studies like this don’t use it. (MBH99 only paid lip service to it — search Mannkovitch Bodge.)
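Hu’s averaging claim is easy to check numerically (a pure-Python sketch; phi = 0.7 and the 20-“tree” site are illustrative choices, not values from the paper):

```python
import random

def ar1(n, phi, seed):
    """One tree's AR(1) 'eccentricity' series."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1)]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0, 1))
    return x

def lag1(x):
    """Sample lag-1 autocorrelation."""
    m = sum(x) / len(x)
    num = sum((a - m) * (b - m) for a, b in zip(x, x[1:]))
    return num / sum((a - m) ** 2 for a in x)

n, phi, trees = 20_000, 0.7, 20
# Site "chronology": the year-by-year average of 20 independent trees
site = [sum(vals) / trees
        for vals in zip(*(ar1(n, phi, seed) for seed in range(trees)))]

print(lag1(site))   # close to phi = 0.7, despite averaging 20 series
```

Averaging shrinks the variance of the site series but leaves its persistence intact, which is the point at issue between Hu and Jim here.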

        • Armand MacMurray
          Posted Jun 10, 2012 at 8:44 AM | Permalink

          Jim, if I understand this correctly, I think Steve Mc’s original emphasis on red noise is confusing you.

          If I understand your argument, you argue that since multiple cores per tree are sampled, and large numbers of trees per site (tens? hundreds?) are sampled, then any process operating randomly on a per-tree basis will essentially be averaged out once a site-wide chronology is assembled from the various cores.

          However, in your reply “#1″ above, you agree that many/all the trees at a site might be *jointly* affected by an effect independent of the desired climate variable (temp in this case). You mentioned water and CO2 fertilization; crowding, general fertilization and disease have been mentioned by others.

          Thus, you should agree that when looking at a set of *chronologies* from different sites, the chronologies might indeed differ from each other on the basis of crowding, disease, and so forth varying from site to site, independent of local or distant temperature histories. In fact, non-temperature-caused variation could very well be the dominant variation in these chronologies. Much of this non-temperature-caused variation is likely to be autocorrelated, which is why Steve Mc and others have talked about *modeling* such a set of chronologies by using red noise. The red noise is used to model chronologies, not individual trees, so your earlier correlation arguments wouldn’t apply against them at this stage.

          Presumably, there would be no a priori way of identifying the relative magnitude of the contributions of temperature, water, crowding, Boy Scout camping or other factors to the tree ring width or density at any point in a given chronology, so no way to a priori choose cores or even whole site chronologies whose variations are due significantly to changes in temperature.

          OK, now you’ve got a set of chronologies as candidates for your reconstruction. If you screen them by looking simply at raw correlation with *non-detrended temps* during the instrumental temp period, what sorts of chronologies are selected for your reconstruction? (A) Clearly, chronologies with strong correlation to the instrumental temperatures come through; this is what you’re hoping for! (B) Unfortunately, you also select for chronologies that match the linear upward trend in the 20th century instrumental temps, but due to some other influence than temperature (water, bugs, crowding, so forth).

          This is the Screening Fallacy, since you are selecting not only chronologies that you want, you are also selecting a second group pretty much guaranteed to generate a classic Hockey Stick, although they are in fact independent of temperature. For all you know, group (B) greatly outnumbers group (A). Remember, you’re only screening the far right end (modern era) of each chronology, because that’s where the only instrumental records are available. Thus, the selection process pretty much guarantees that the selected chronologies (whether group A or group B) will have upward slopes (Hockey Stick “blades”) at the far right end, in the modern (selected-for) era.

          The more that the variation over pre-instrumental time *within* chronologies is caused by non-temperature factors that are not correlated between sites, the more that pre-instrumental era variation will be essentially averaged out when the chronologies are combined into a single reconstruction. Thus, a relatively low-variation pre-instrumental era “stick” of such a reconstruction may just mean that the chronologies used in its making had weak correlations with temperature, not that the true temperature during the pre-20th century era did not vary much.

          Gergis’ detrending procedure was presumably designed to avoid this Screening Fallacy, hoping that true temperature-correlated chronologies (Group A), once detrended, would still match the high-frequency yearly variation of the instrumental record enough to produce correlations high enough to be retained for use, while other chronologies (Group B) would be rejected.
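Armand’s group-(B) mechanism can be demonstrated with red noise alone (a Python sketch; the pool size, AR coefficient, and screening threshold are invented for illustration, not taken from Gergis et al):

```python
import random

random.seed(42)
YEARS, CAL = 1000, 70           # record length, "instrumental" window

def red_noise(n, phi=0.8):
    """An AR(1) 'chronology' with no temperature signal whatsoever."""
    x = [random.gauss(0, 1)]
    for _ in range(n - 1):
        x.append(phi * x[-1] + random.gauss(0, 1))
    return x

def corr(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return num / (va * vb) ** 0.5

trend = list(range(CAL))                       # rising "instrumental" temps
pool = [red_noise(YEARS) for _ in range(200)]  # candidate chronologies
kept = [c for c in pool if corr(c[-CAL:], trend) > 0.3]   # the screening step

composite = [sum(c[t] for c in kept) / len(kept) for t in range(YEARS)]
shaft = sum(composite[:YEARS - CAL]) / (YEARS - CAL)      # pre-instrumental mean
blade = sum(composite[-35:]) / 35 - sum(composite[-CAL:-35]) / 35

print(len(kept), round(shaft, 2), round(blade, 2))
```

The kept chronologies rise through the calibration window purely by chance, so the composite acquires a modern “blade”, while the mutually independent pre-instrumental noise averages toward a flat “shaft” – exactly the hockey-stick shape described above, from series containing no temperature signal at all.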

        • Kenneth Fritsch
          Posted Jun 10, 2012 at 9:14 AM | Permalink

          Jim Bouldin, you say:

          “Trend matching is not what Gergis et al did by mistake, but rather “detrended matching”, i.e. they regressed the climate residuals on the chronology residuals (or at least, stated that they did. in the manusciprt). That was the error.”

          David Karoly in his email to SteveM says:

          “While the paper states that “both proxy climate and instrumental data were linearly detrended over the 1921–1990 period”, we discovered on Tuesday 5 June that the records used in the final analysis were not detrended for proxy selection, making this statement incorrect.”

          I think your statement needs to be reconciled with Karoly’s before we can have further intelligent conversation about this issue.

    • Oakwood
      Posted Jun 10, 2012 at 1:34 AM | Permalink

      Jim Bouldin: “I don’t know why they calibrated on detrended data (if in fact they did, which is uncertain now). There are reasons to be made both for and against such a practice, regardless of how common it is. But I do know that if they actually calibrated without detrending, then (1) my arguments above apply and (2) both Nick Stokes and Kenneth Fritsch have provided their calculations indicating that a high percentage of their 27 sites would have passed a p = .05 screening criterion.”

      We all really know what’s going to happen. Gergis et al will say ‘we found the mistake’. Nothing to do with the science or methodology. We just made a mistake in the write-up. We didn’t mean to say “calibrated against de-trended data”. Take that out and nothing else changes. Then a lot of rallying round by the climate science establishment to say ‘come on, just a little slip that anyone could make, whipped up by the deniers.’ In the meantime, pressure will be put on the journal to retract or tone down the ‘thanks’ given to CA.

    • Paul Matthews
      Posted Jun 10, 2012 at 5:23 AM | Permalink

      Bouldin’s comments are ridiculous. He says he doesn’t know why they calibrated on de-trended data. If he’d bothered to read the paper, he’d know – it was to avoid the ‘spurious hockey stick effect’.
      He also says it is ‘uncertain’ whether they actually used de-trended data or not. No it isn’t, see the numbers posted up by various people here and the email from Karoly. And that’s why they pulled the paper.

      snip – [please don’t make comments like that]

      • Posted Jun 10, 2012 at 5:50 AM | Permalink

        Well, I can’t understand it either. What they say is:
        “For predictor selection, both proxy climate and instrumental data were linearly detrended over the 1921–1990 period to avoid inflating the correlation coefficient due to the presence of the global warming signal present in the observed temperature record.”

        But the global warming signal is temperature, and a temperature proxy should correlate with it, just as with any other temperature movement. I don’t see why it is described as inflating.

        • Armand MacMurray
          Posted Jun 10, 2012 at 8:53 AM | Permalink

          Nick, yes, that wording doesn’t make sense. They should have written something like “…to avoid inflating the correlation coefficient due to non-temperature-caused proxy trends spuriously matching the global warming trend present in …”.
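A toy illustration of the inflation being discussed (a Python sketch; the trend and noise levels are invented): two series that share nothing but a linear trend correlate strongly raw, but not once detrended.

```python
import random

def detrend(x):
    """Remove the OLS linear trend from a series."""
    n = len(x)
    mt, mx = (n - 1) / 2, sum(x) / n
    b = (sum((t - mt) * (v - mx) for t, v in enumerate(x))
         / sum((t - mt) ** 2 for t in range(n)))
    return [v - mx - b * (t - mt) for t, v in enumerate(x)]

def corr(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return num / (va * vb) ** 0.5

random.seed(3)
n = 70  # a 1921-1990-style calibration window
temp = [0.02 * t + random.gauss(0, 0.3) for t in range(n)]   # warming + weather
proxy = [0.02 * t + random.gauss(0, 0.3) for t in range(n)]  # non-climatic trend

print(round(corr(temp, proxy), 2))                    # inflated by shared trend
print(round(corr(detrend(temp), detrend(proxy)), 2))  # near zero: no wiggle match
```

The raw correlation is driven entirely by the two independent trends, which is the sense in which the warming signal “inflates” the coefficient; detrending exposes the absence of any year-to-year relationship.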

        • DocMartyn
          Posted Jun 10, 2012 at 1:36 PM | Permalink

          Nick, my reading of it is that they believed they were fitting to the 1921-1990 complex line-shape, not to the post-1960 linear(ish) rise.
          As the 1920-1990 temperature is complex, a sort of ^^/ shape, rather than the post-1960 /, then they should have a better chance of picking biological processes that act as thermometers.

  78. Manfred
    Posted Jun 9, 2012 at 7:15 PM | Permalink

    Is this “main” error the same error as in MBH98/99, a method that generates hockey sticks even from random noise?

    If so, why did Michael Mann, creator of the first Hockey Stick, not withdraw his papers, as they are still circulated and possibly even used in expensive decision-making?

  79. R.S.Brown
    Posted Jun 9, 2012 at 8:12 PM | Permalink

    Steve,

    I believe Ms. Gergis’s snooty and snotty closing in her letter to you refusing data stating:

    “We will not be entertaining any further correspondence on the matter.”

    employed the royal “we”… not an inclusive statement covering her co-authors.

    After interviews at the advent of publication, her current silence is deafening.

  80. Tom Anderson
    Posted Jun 9, 2012 at 10:01 PM | Permalink

    Steve, please don’t consider this piling on. I simply do not believe that no author knew that they had described their procedure incorrectly (lied). I believe that the fact that their screened data was not detrended was known before it was discovered by Jean S. These people are smart enough to know better. Too many authors for one or more not to notice. They let it fly to give the paper more (cred) credibility. Too much at stake here not to have a SH paper with a hockey stick.

    I have been around the block a couple times and this is my gut feeling on the situation. Don’t care if anyone else is on board but my gut has been pretty damn good.

    • Armand MacMurray
      Posted Jun 10, 2012 at 9:09 AM | Permalink

      It can be hard to distinguish between malfeasance and error without inside info. This type of error has nothing to do with being smart enough; as it is so obvious, basic, and trivial, (my guess is that) it had to do with a lack of organization or attention to detail. This error is the equivalent of walking out of the house on your way to a job interview having forgotten to put on your pants. The longer you go without noticing that, the worse for you, not the better. No-one in their right mind could expect to get away with that!

      Presumably, the detrending was to be accomplished by a command line switch that didn’t get set, or the non-detrended and detrended files were mixed up when passed on to the person performing the screening analysis. There wouldn’t necessarily be anything obvious in the file data itself to show if it was detrended or not, and the reconstruction itself used non-detrended data as planned, so it’s really only by doing some analysis of the actual data files as they were about to be screened that one could have spotted the error.

    • John M
      Posted Jun 10, 2012 at 9:39 AM | Permalink

      Too many authors for one or more not to notice.

      I suspect they didn’t notice because there were too many authors.

      My guess? They were trying everything they could to get a HS. When one of the authors managed to get one to pop up on his/her computer screen, the team ran with it and the actual technique used got lost in the rush. The described methodology was then written around the results and gussied up with appropriate justification for the stated technique.

      Note that this scenario doesn’t speak to motive as much as it argues confirmation bias, coupled with “Paper Writing 101″ skills.

  81. ferd berple
    Posted Jun 9, 2012 at 11:27 PM | Permalink

    As readers have noted in comments, it’s interesting that Karoly says that they had independently discovered this issue on June 5 – a claim that is distinctly shall-we-say Gavinesque (See the Feb 2009 posts on the Mystery Man.)
    ===========
    Oz is after all 14 hours ahead of Canada, so by reading Canadian Blogs they are able to discover all sorts of things as much as 14 hours sooner than anyone in Canada.

  82. ferd berple
    Posted Jun 9, 2012 at 11:53 PM | Permalink

    Jim Bouldin
    Posted Jun 9, 2012 at 6:29 PM | Permalink | Reply
    So your contention that this study falls four square within this so called “Screening Fallacy” is just plain wrong,
    =============
    By your logic, if we used the twenty five states that best matched the US average GDP over the past 10 years, these states would then provide a better proxy for US GDP over the past 100 years, as compared to using all 50 states to compute the average over the past 100 years.

    This is mathematical nonsense. Over the 100 years, the average of all 50 states will exactly match the national average, while the 25 that best matched the average over the past 10 years are very unlikely to match the 100-year average.

    Take any time series you like, where the full history is known; it is a trivial exercise to show that screening for samples that best fit the short-term mean as a proxy for the whole reduces the accuracy of the prediction of the long-term mean.

    That is why the process is a screening fallacy. The fallacy is that a sample that matches the short term mean is better able to predict the long term mean than the entire population. No sample can predict the long term mean better than the full sample, because the full sample itself is the long term mean. By definition you can’t improve on its accuracy.
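ferd’s GDP analogy can be simulated directly (a Python sketch; random walks stand in for state GDP histories, and all the numbers are illustrative):

```python
import random

random.seed(1)
STATES, YEARS, RECENT = 50, 100, 10

def walk(n):
    """A random-walk 'GDP' history for one state."""
    x = [0.0]
    for _ in range(n - 1):
        x.append(x[-1] + random.gauss(0, 1))
    return x

def mean(xs):
    return sum(xs) / len(xs)

states = [walk(YEARS) for _ in range(STATES)]
national = [mean(col) for col in zip(*states)]   # the true 100-year average
target = mean(national[-RECENT:])                # recent mean to screen on

# Keep the 25 states whose recent mean best matches the recent national mean
screened = sorted(states, key=lambda s: abs(mean(s[-RECENT:]) - target))[:25]
proxy = [mean(col) for col in zip(*screened)]

worst = max(abs(p - t) for p, t in zip(proxy, national))
print(round(worst, 2))   # the screened subset drifts from the true average
```

Screening pins down the recent fit by construction, but the earlier history of the screened subset wanders away from the full-population average, which the full sample matches exactly.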

    • Steve McIntyre
      Posted Jun 10, 2012 at 12:01 AM | Permalink

      ferd, your link to the article about “screening on the dependent variable” is interesting. I’ll probably post on it.

    • maxberan
      Posted Jun 10, 2012 at 6:44 AM | Permalink

      I don’t think this analogy fits the bill here though as we have no a priori knowledge of the true relationship nor is it one where the entire variance of the dependent variable can be explained by any imaginable model.

      My posting of 12:20 on June 9 outlines a framework for a closer simulation of the actual process of historical reconstruction using correlated proxies.

      I also would emphasise that the screening fallacy (as I understand it) relates to the overstatement of the confidence that is then attached to that prediction by virtue of its neglect of hidden sources of variance rather than relating, as you are suggesting, to the predicted values themselves.

    • Tom C
      Posted Jun 12, 2012 at 12:50 PM | Permalink

      Fred B – This is an excellent analogy.

  83. ferd berple
    Posted Jun 10, 2012 at 12:04 AM | Permalink

    This sort of screening, as used in tree proxies, has been widely discredited through its widespread use in the social sciences, where it led to numerous spurious correlations.

    Selecting trees that appear to correlate with temp leads to a false confidence that the trees are responding to temp. Only by including the trees that do not appear to correlate do you properly account for the other factors that can affect tree growth, and thus get an accurate measure of the confidence.

    The simple fact is that trees do not respond to temperature as though they were thermometers. Trees respond to an optimum temperature: not too hot, not too cold. So slow growth may mean cold, or it may mean hot.
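    The point can be made starker with a toy simulation (hypothetical; every parameter here is arbitrary): screen “proxies” that are pure red noise, with no temperature signal at all, on their correlation with a rising “temperature” over a 30-year calibration window. The screened composite still tracks the trend it was screened against.

    ```python
    import random

    random.seed(1)
    N, YEARS, CAL = 200, 150, 30

    # "Temperature": a simple rising trend over the record.
    temp = [0.02 * t for t in range(YEARS)]

    def corr(a, b):
        ma, mb = sum(a) / len(a), sum(b) / len(b)
        num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
        den = (sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b)) ** 0.5
        return num / den if den else 0.0

    # "Proxies": AR(1) red noise containing no temperature signal whatsoever.
    proxies = []
    for _ in range(N):
        x, path = 0.0, []
        for _ in range(YEARS):
            x = 0.7 * x + random.gauss(0.0, 1.0)
            path.append(x)
        proxies.append(path)

    # Screen on correlation with "temperature" over the final 30 years.
    passed = [p for p in proxies if corr(p[-CAL:], temp[-CAL:]) > 0.3]

    # Composite of the screened proxies: it tracks the trend it was
    # screened against, even though every input is pure noise.
    composite = [sum(p[t] for p in passed) / len(passed) for t in range(YEARS)]
    r = corr(composite[-CAL:], temp[-CAL:])
    print(len(passed), round(r, 2))
    ```

    A sizeable fraction of the noise series typically passes a modest correlation threshold, and averaging them manufactures an apparent calibration-period “signal” out of nothing.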

    • John Whitman
      Posted Jun 10, 2012 at 9:34 AM | Permalink

      ferd berple; posted Jun 10, 2012 at 12:04 AM

      – – – – – – –

      ferd berple,

      Would you consider the terms ‘screening fallacy’, ‘selection bias’, ‘begging the question fallacy’ and ‘circular reasoning fallacy’ to all mean the same?

      To me they all appear to be describing the same fundamental error in argumentation.

      BTW, your explanations of statistical issues in prose are appreciated.

      John

    • GeorgeGR
      Posted Jun 15, 2012 at 4:50 AM | Permalink

      Basic, I know, but all screening is based on the presumption that trees do respond to temperature as if they were thermometers. But they are not. That is the first fallacy, in my opinion. The screening fallacy is No. 2 in line.

      A banal example, but still: in my garden I have a hedge of about 60 beech trees. They were all planted at the same time, some 30 years ago, and have been given exactly the same care. Still, they vary wildly in width; some are more than twice the size of the tree next to them. Did the large ones experience hotter weather than their neighbours? I don’t think so. What does this tell us about chronologies? Only that growth factors differ not only from area to area, but from tree to tree in close vicinity. A lot of noise.

      A larger selection of trees (a chronology) does not in any way make it easier to tease out and isolate the temperature signal alone, although you may find trends in the growth rate of that selection of trees. Hot weather does not necessarily mean a larger tree-ring width if other growth factors are scarce. In hot weather over time, water is usually the scarcity factor. Growth is best in temperate weather with good access to the right mix of sunshine, water and other nutrients. Trees may have sweet spots in the temperature zone, but other factors influence growth too. How temperature’s influence on a tree relates to other changing factors is not easily understood or checked, especially on a five-hundred-year-old tree.

      Liebig’s law comes into play and may affect the whole chronology:

      http://wattsupwiththat.com/2009/09/28/a-look-at-treemometers-and-tree-ring-growth/

      So screening based on the temperature record will most likely not give you a set of chronologies that are temperature-sensitive and temperature-sensitive only. You may extract what seems like a signal from that chronology, but you do not know whether it is a temperature signal or a mixed signal. Screening against the temperature record will find those that correlate, but no clear and constant causation with temperature over time can be established from that.

      A dumb question maybe, but are the chronologies screened as well? How is the decision made whether a tree should be included in the chronology or not?

  84. Latimer Alder
    Posted Jun 10, 2012 at 2:36 AM | Permalink

    Forgive me if I am being over-simplistic, but I think I took the gist of this discussion as that the Australian guys fell into a fairly simple methodological trap and our gallant researcher JeanS was able to spot this before all the King’s Horses and all the King’s Men of the Climate Establishment noticed. Cue much embarrassment all round in the Augean Stables. Especially from the lady with the entirely unnecessary and credibility-limiting/ending ‘We call this ‘research’ and we will not entertain any more correspondence’ remarks.

    If I have understood correctly, would it not be a good idea for the Climate Auditors to do a quick check of every other possible paper that may have used the same method…and fallen into the same trap?

    Since every published scientific paper is, in theory at least, capable of being reproduced and replicated, there should not be any procedural obstacles placed in the way by the authors or their colleagues. And if there are it would be instructive to have them all documented in public….a website like this one would be a good place.

    The climate establishment do not seem to have done a very thorough job on ‘peer-reviewing’ this paper…it seems likely that there are others. Many readers here (me included) have long called for science to be done properly and openly, not shoddily stitched up behind closed doors. Here is a golden opportunity to demonstrate in practice what it would mean.

    Happy to lend my skills (A-level Maths a long time ago, a Science Masters (not much stats, lots of experiments), and a bit of practical project management) to such an endeavour.

    We could always call it ‘research’ :-)

    • Posted Jun 10, 2012 at 4:12 AM | Permalink

      No, there’s no subtle trap. They just made some kind of computer error so that they thought they were detrending but weren’t.

      • Venter
        Posted Jun 10, 2012 at 6:11 AM | Permalink

        [snip - gratuitous insult]

        • Posted Jun 10, 2012 at 6:23 AM | Permalink

          Well, what do you think the error was?

        • Venter
          Posted Jun 10, 2012 at 6:50 AM | Permalink

          [snip - gratuitous insult]

        • Posted Jun 10, 2012 at 8:52 AM | Permalink

          If the host is watching, the last comment could do with snipping, and perhaps the previous one by the same poster. Nobody known on CA (and I include those like Jean S and Roman, whose identity is no secret to the host) would say such a thing. Yet the reputation of CA is adversely affected, while that of the person responsible isn’t. Not the best trade for anyone who thinks something of real importance is at stake here.

          [mod. - thanks]

      • Posted Jun 10, 2012 at 7:07 AM | Permalink

        Some accuracy please: Latimer speculates it was “a fairly simple methodological trap”, not a “subtle trap”.

        Subtle, it is not.

        If I’d made a similar mistake in my own field, I wouldn’t blush to call it a stupid mistake, and to acknowledge that if a disinterested 3rd party had tested it, there’s a good chance it would have been caught.

        The question is, if these things aren’t open to disinterested parties, how prevalent are stupid mistakes in general?

        We now have at least one datapoint that shows in a closed system of reviewers, stupid mistakes can make it into publication after months of effort.

        • Posted Jun 10, 2012 at 7:30 AM | Permalink

          It’s not a trap of any kind. It is indeed a stupid mistake, and I don’t think Karoly was pretending otherwise. But there’s no reason to think, as LA seems to, that this mistake affects papers by other people. They may well have made mistakes too (it happens), but they won’t be related.

        • Posted Jun 10, 2012 at 8:27 AM | Permalink

          @Nick Stokes

          When the procedure cannot be reproduced by disinterested (or even hostile) 3rd parties, there’s no reason to trust the same (or similar) mistakes have not been made, either.

          Standard peer review clearly didn’t cut it in this case, and we’re agreed it was “stupid”.

        • Posted Jun 10, 2012 at 8:35 AM | Permalink

          “When the procedure cannot be reproduced …”

          How do you think it was discovered?

        • RomanM
          Posted Jun 10, 2012 at 8:56 AM | Permalink

          Amazing! I didn’t realize that everything done in the paper had been reproduced. Perhaps, you could show exactly how they calculated the annual summer HadCrut3v temperature series and the grid temperature PCs. I couldn’t find the descriptions in the paper. ;)

        • Posted Jun 10, 2012 at 8:43 AM | Permalink

          @Nick Stokes

          The full sentence:

          “When the procedure cannot be reproduced by disinterested (or even hostile) 3rd parties”

          Nick, do you believe that your selective quotation fairly and accurately conveys the point I was making?

        • Latimer Alder
          Posted Jun 10, 2012 at 8:51 AM | Permalink

          Re: mrsean2k (Jun 10 07:07),

          Nick Stokes says:

          But there’s no reason to think, as LA seems to, that this mistake affects papers by other people. They may well have made mistakes too (it happens), but they won’t be related.

          Umm…seems to me – viewing from the outside – that this is a mistake, or a member of a class of mistakes, that the climatological establishment are not well-attuned to spotting. The paper in question is seen as one of the most important of recent times for the Southern Hemisphere. There were senior guys involved in creating it and a fair chunk of change ($300K+) was spent on it. One might reasonably hope that some supposedly heavyweight people did the ‘peer-review’ and that they did a more serious job than idling over it while watching the latest episode of Neighbours, feeding the cat and keeping the kids away from the local kangaroos. And still, with all this ‘firepower’ thrown at it, it came to outsider ‘Jean Sibelius’ and Nick himself to note the error.

          Like the daft Himalayagate claim of glaciers being gone by 2035, this whole saga does not reflect well on the robustness of ‘quality control’ within climatology. So Nick may be right that we would not find the identical error in other papers. But that errors – and pretty simple ones at that – can ‘creep through’ the inadequate QC processes is now no longer a speculation but an established fact.

          Phil Jones of UEA, author of 200 papers, when asked by the Parliamentary Select Committee what he did when peer-reviewers asked to see his data, replied that ‘they had never asked’. Peer-review as currently conducted is clearly inadequate to detect such errors. And if I were to have an even more than normally cynical bent this afternoon, I might wonder if one of the causes of the anti-science, anti-replication and anti-openness attitudes so prevalent in refusals to publish data is that the authors know in their heart of hearts that their methods and results would not withstand really rigorous scrutiny. ‘Good enough to get a paper out of it’ is not the same as ‘good enough to be audited by McSteve, Sibelius and Stokes’ – nor good enough for the IPCC to advise our governments on its coat-tails.

          I still think that a good shakeout of all the related papers (start with all of them cited in Gergis et al (DNS/DNF) 2012) looking for similar failings would be very productive. And even more revealing would be those who refused to allow their data to be checked.

      • Latimer Alder
        Posted Jun 10, 2012 at 7:28 AM | Permalink

        Re: Nick Stokes (Jun 10 04:12),

        Nick Stokes

        Andrew Montford has written an accessible summary here

        http://bishophill.squarespace.com/blog/2012/6/7/another-hockey-stick-broken.html

        What he describes is not ‘some kind of computer error’, but a methodological boob of serious proportions.

        I’m sure you know from Montford’s previous work that he has a flair for being able to make complex technical stuff accessible to educated laymen like me. And it seems to hang together as a narrative.

        Do you have any reason/evidence to suppose that his explanation is incorrect? If so, please let us know. If not, I’m going with his until better evidence emerges. Thanks

        • Posted Jun 10, 2012 at 7:39 AM | Permalink

          Well, I can’t figure out what his explanation explains. If it’s accessible to you, could you pass it on?

          The only part that made sense to me was
          “This seems to suggest that Gergis’s declaration that the correlations were based on detrended data was false”
          and Karoly has confirmed that. They meant to detrend, but mistakenly ended up doing it like most other people do.

        • Salamano
          Posted Jun 10, 2012 at 7:53 AM | Permalink

          But they also proclaimed the detrending as a better way of doing things, in case they got a spurious relationship to temperature. The way it’s been promoted (at least at first) by Jim Bouldin and these authors was that this was a new technique that deviated from what has typically been done, because statistically it would produce a more robust and less criticisable result. Did it not?

          Now it seems like they’re going to go back and just ‘do it over the usual way’, which (a) puts us back at square one, not advancing beyond the typical complaints people raise regarding tree proxies, and (b) means they’ll have to convince folks that the old way does not need any ‘robustifying’, even though they explicitly stated that it opens itself up to spurious results.

          If they wanted to instead do what they say they initially wanted to do, more than likely they’ll have to go back to the unarchived cutting-room floor to find out if some of the trees they’ve trashed actually qualify under the new screening methods – unless they’re just going to work backwards from the result and re-explain their methodology in light of what they actually ended up doing, rather than going back and doing what they said they did but didn’t :) (my guess is that in light of the result, it’s simpler to just re-write the methodology section without changing a thing with regard to the results). The original idea was that this paper was to have the added interestingness of a more robust methodology.

          It might be good ‘for science’ that they explain the results that would have showed up had they done it the way they said they wanted to initially, because it’s not possible for anyone else to do it themselves if the trees that didn’t qualify are not archived anywhere. It’s even more difficult if owners of any of these cores, in the interest of science, are ‘not entertaining any more correspondence on the matter’.

        • John M
          Posted Jun 10, 2012 at 9:01 AM | Permalink

          Salamano,

          The way it’s been promoted (at least at first) by Jim Bouldin and these authors, was that this was a new technique that deviated from what has been typically been done because statistically it was to produce a more robust and less criticisable result.

          I don’t know that Jim Bouldin “promoted” the technique, but it certainly is interesting to go back to the Myles Allen “Name and Shame” thread and search it for “detrend”.

          Remarkable how the pre-Jean-S-discovery commentary from Jim changes relative to the post discovery commentary.

        • Duster
          Posted Jun 11, 2012 at 2:07 AM | Permalink

          There are TWO issues. Gergis et al. delineate their methodology. However, by some mischance, they failed to implement the procedures specified by that methodology. There are at least two ways this could happen – and it has happened to others in other fields as well. 1) The methodology as written is mistaken, in that the authors did not communicate properly with the analyst who was implementing the methods, and no “mistake” was made in execution; this happens far more frequently than we would like. 2) The methodology describes the planned analysis properly, providing the – in the authors’ eyes – appropriate rationale, but the execution was flawed – how NASA managed to hit Mars with a spacecraft by mistake. Given the postponement of publication, I would suspect the latter.

          The OTHER issue is whether the methodology is viable in any fashion. How reasonable is it to tune any model to a chunk of data with controls at one end only? Won’t that implicitly result in increasing error with distance from the “fixed” end? The fact is, instrument records and “proxy” records are not the same thing. In fact, some investigations find that increased temperature (as well as CO2) does not have any effect on ring increment, e.g. Rasmussen, Beier and Bergstedt (2000):

          http://www.sciencedirect.com/science/article/pii/S0378112700006770

          where the authors conclude:

          “The results showed that increased CO2 and/or temperature did not significantly influence tree growth—measured as tree ring increment. ”

          The paper is especially interesting because the authors are arguing that forests are not as efficient at carbon sequestration as the “laidbackists”* argued, and that therefore the “worrywarts”** had something to worry about. At the same time the work, which was experimental and not a computer model, showed that tree rings are not measures of temperature. Apparently, as far as trees are concerned, ring width is not a proxy for temperature.

          * Laidbackist – climate sceptic, denier, etc.

          ** Worrywart – warmist, AGW believer, etc.

  85. Cassio
    Posted Jun 10, 2012 at 6:39 AM | Permalink

    Nick Stokes:

    Jun 10, 2012 at 4:12 AM “No, there’s no subtle trap. They just made some kind of computer error so that they thought they were detrending but weren’t.”

    Jun 10, 2012 at 6:23 AM “Well, what do you think the error was?”

    What evidence do you have that it was an error (overlooked by all the authors, and indeed the reviewers!) ?

    Or, if simply your opinion, on what do you base it ?

    • Posted Jun 10, 2012 at 6:51 AM | Permalink

      Because Dr Karoly said so. And there’s no earthly reason why they would say they took the unorthodox step of detrending and then deliberately not do it.

      • Posted Jun 10, 2012 at 9:09 AM | Permalink

        I tend to agree. And is there any earthly reason why they shouldn’t explain the exact sequence of events very soon, making crystal clear whether their finding of the problem was truly independent of CA?

      • Kenneth Fritsch
        Posted Jun 10, 2012 at 10:09 AM | Permalink

        “Because Dr Karoly said so. And there’s no earthly reason why they would say they took the unorthodox step of detrending and then deliberately not do it.”

        I think it is human nature to look less critically at a result that agrees with your views going into an analysis. I posted an incorrect correlation in a post above and was probably too quick to post it, since, although it looked too low at the time, it was in agreement with my view that the Gergis PCR was more than a simple averaging method of the individual proxies. While I promptly corrected the result and I give myself some slack for doing a real time analysis, it shows sloppiness on my part.

        With the Gergis (2012) error, there is no slack given for real time analysis, since it was not done that way, and, in fact, was the premise of the paper and it was peer-reviewed. It was hardly caught by the offending party, but rather by what transpired at CA. While I give myself poor marks for sloppiness in my post above I have to give uber-bad sloppiness marks for the authors of Gergis (2012) and to the peer-reviewers. I suspect the same human weakness was operating whereby an expected/good result is not given sufficient critique.

      • DocMartyn
        Posted Jun 10, 2012 at 2:06 PM | Permalink

        “Nick Stokes
        Because Dr Karoly said so. And there’s no earthly reason why they would say they took the unorthodox step of detrending and then deliberately not do it.”

        You are wrong, yet again, Nick. You know not the difference between deliberately and intentionally.
        They quite deliberately used a computer program to dissect each proxy and examine the entrails. It was their intent that the program would examine only detrended entrails, but for whatever reason this did not happen in 62>n>0 cases.
        The authors, including Dr Karoly, believed that they had cross-correlated, and then established the statistical significance of, 62 possible temperature-sensitive detrended proxies against the detrended 1920–1990 regional temperature.

        Dr Karoly has since stated that their belief was ill-founded. We have no idea how each of the 62 proxies was transformed, whether detrended or undetrended; nor do we know if they were compared to the detrended or undetrended temperature record.
        The authors have not stated how each individual data set was treated before being accepted as a valid temperature proxy or discarded.
        So, at bare minimum, the authors will be required to release the 62 series in the raw. They will then be required to show the 62 detrended series from 1920–1990, and then the statistical tests they used to establish which datasets are temperature-sensitive and which are not. In making a cut-off, the authors must also state why that cut-off was chosen.
        The world awaits.

    • Armand MacMurray
      Posted Jun 10, 2012 at 9:49 AM | Permalink

      Cassio, making such an obvious and trivial error is terribly embarrassing (as I wrote above, as if you forgot to wear your pants to a job interview). However, since both original and detrended data are just files of numbers, just looking at them wouldn’t necessarily reveal the error without some analysis. Also, as soon as the proxies were selected by the initial screening process, all reconstruction work (the bulk of the time spent in preparing the paper) would have been done with the original non-detrended chronologies, so no natural opportunity to spot the error at that point.

      Having worked in a lab environment, it seems quite plausible to me that someone left out a “detrending” switch in the code, or mixed up detrended with non-detrended data files when feeding the screening process.

      Of course, this shows yet again why having all the data and code in a turnkey package is so useful: having to prepare the whole process for duplication by others would have ensured additional re-scrutiny of the initial steps, increasing the likelihood of spotting the error in-house.

      • Posted Jun 10, 2012 at 10:02 AM | Permalink

        Key point. Writing unit tests and running them in an automated way as regression tests every time one makes a change has a similar effect, as any agile programmer can tell you. And in the GitHub open source world such test suites are de rigueur as software gets passed on (forked) and improved by multiple teams faster than some of us old hands ever expected to witness. By the changes to the set of regression tests you can at once see the intention of the latest changes to the code. This is a world away from climate at the moment – but it simply shouldn’t be. Best practice is essential if this is the greatest threat humanity has ever faced. (Or perhaps it isn’t. But you can’t have it both ways.)
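        As a hypothetical sketch of the kind of regression test being described (the function names are invented, not from the paper’s code), a single slope check on whatever enters the screening step would catch a skipped detrending pass:

        ```python
        def ols_slope(y):
            """Least-squares slope of y against its index (year number)."""
            n = len(y)
            t = list(range(n))
            mt, my = sum(t) / n, sum(y) / n
            return sum((ti - mt) * (yi - my) for ti, yi in zip(t, y)) / \
                   sum((ti - mt) ** 2 for ti in t)

        def detrend(y):
            """Residuals from the least-squares line on the index."""
            s = ols_slope(y)
            n = len(y)
            mt, my = (n - 1) / 2.0, sum(y) / n
            return [yi - (my + s * (ti - mt)) for ti, yi in enumerate(y)]

        def assert_detrended(series_set, tol=1e-9):
            """Regression test: fail loudly if any input still carries a trend."""
            for s in series_set:
                assert abs(ols_slope(s)) < tol, "series entering screening is not detrended"

        raw = [0.1 * t for t in range(70)]   # a trending series
        assert_detrended([detrend(raw)])     # passes: residuals have zero slope
        try:
            assert_detrended([raw])          # fails: the detrend step was skipped
            caught = False
        except AssertionError:
            caught = True
        print(caught)
        ```

        Run on every change, a check like this turns the silent data-processing slip into an immediate red test.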

      • Paul Matthews
        Posted Jun 10, 2012 at 10:26 AM | Permalink

        ” However, since both original and detrended data are just files of numbers, just looking at them wouldn’t necessarily reveal the error without some analysis. ”

        I don’t really agree with this. One of the key things I tell my PhD students to do is to plot graphs of everything. In this case that would have been all the proxies, as someone has kindly done here, the de-trended proxies, the instrumental target, and the de-trended instrumental target. If they had done this their elementary error would have immediately been apparent.

        • Armand MacMurray
          Posted Jun 10, 2012 at 10:37 AM | Permalink

          You’re right of course; plotting and so forth is what I had meant to imply by “some analysis”. As you suggest, I think we can pretty confidently infer that the work didn’t quite live up to the goal of a full, well-organized analysis. It’s sad because, by comparison with some of the more usual suspects discussed on CA, this group did have the data for the *selected* proxies set up and publicly available, and a decent if not exhaustively-detailed description of their intended analysis.

        • Posted Jun 10, 2012 at 10:56 AM | Permalink

          I think Armand’s wider point stands: that the discipline of getting data and code ready for the world to see and to use concentrates the mind wonderfully, and through that errors are discovered. A big part of the agile software movement is about the earliest discovery and correction of errors, allowing development not only to proceed apace but to change direction in an uncertain business environment. Graphs clearly are an indispensable tool for early discovery in scientific programming. But if the leaders of these projects (reportedly three years and $300,000 worth in this case) knew that at the end there’d be full disclosure of data and code, it would (or certainly should) have an effect on the process from day one. ‘Maximum verification’ as we called one of the seven pillars as my company Objective promoted and practiced agile methods during the 90s.

      • Latimer Alder
        Posted Jun 10, 2012 at 1:41 PM | Permalink

        Re: Armand MacMurray (Jun 10 09:49),

        The casual observer, and/or one who works on important, but non-academic stuff, might find it hard to believe that an error of this importance was not spotted anywhere along the three year path from conception to publication. Nor that five quite senior authors did not design a research process with enough checks and balances built-in throughout to find and eliminate such things.

        Increasingly I wonder if the academic model of small teams working with paper publication as their ultimate goal is the best way to research the ‘most important problem humanity has ever faced’.

        And increasingly I conclude that it is a deeply flawed model. It fails on numerous levels. Most obviously if publication, not ‘truth’ is the objective, then gamesmanship and corner cutting will be rife. Even more so when the researchers are encouraged to produce a regular supply of small poor papers rather than fewer better ones. That academic teams are small (<10 in this case) means that they cannot/do not get on board the requisite expertise outside the very core area. We see this frequently with home-grown statistical methods (anything to do with hockey sticks), with inept and amateur IT practices (Harry_Read_Me), with a determination to 'do it all oneself' that is detrimental to the work quality but beneficial to the publication record.

        And all these deficiencies..and more..would apply equally to the most high-minded and rigorous set of researchers one could imagine. Even without individual failings, this is not an organisational structure that lends itself to producing high-quality results.

        The saving grace is that, much though the professional academics may resent and dislike it, the internet has actually helped to remove some of these disadvantages. Though it may be personally unpalatable, any team's work surely benefits from having near instant scrutiny by the likes of McSteve, Ross, Jean S bringing their expertise to bear on the problem. Their conclusions may be unwelcome, but the work is better because of it.
        And, as an IT guy used to running data centres, I wept and swore with Harry, as well as having some relevant experience that could help him.

        Outside of academe, there are plenty of organisational models that encourage and reward good work without 'publication' being the overriding objective. Seems to me that climatology could well do with taking a look at how it is organised and see how it could adapt itself.

      • DocMartyn
        Posted Jun 10, 2012 at 2:21 PM | Permalink

        “Having worked in a lab environment, it seems quite plausible to me that someone left out a “detrending” switch in the code, or mixed up detrended with non-detrended data files when feeding the screening process”

        I would not allow you in my lab.
        As a carpenter, I measure twice and cut once.
        As a scientist, I check each calculation twice, then run it in reverse.
        Then one other author also checks it.
        If you are doing statistics, you see a senior colleague or the unit statistician.
        In my paper that is going to press, I removed a p=0.05 from one of my figures. Using Tukey’s post hoc test I got 0.048, but after the paper was sent to the referee I looked at the data and applied the Holm–Bonferroni test; this came out just over 0.051. I took the p<0.05 off the figure and stated both results in the text.

        We used to have a bronze plaque on the wall of our microbiology lab at Chelsea.
        "Sterility, like virginity, is absolute"
        You either have aseptic technique or you don't. The data analysis is either right or it isn't.

  86. Jessie
    Posted Jun 10, 2012 at 7:57 AM | Permalink

    Why has the original grant, which proposed to research
    ‘Reconstructing pre-20th century rainfall, temperature and pressure for south-eastern Australia using palaeoclimate, documentary and early weather station data’

    http://www.arc.gov.au/pdf/LP09_Rnd2/Mel_Uni.pdf

    changed from ‘south-eastern Australia’ to ‘Australasian’, please?

    Also, the funding guidelines (2009), issued under the Australian Research Council Act 2001 (the ARC is a Statutory Authority), encourage the depositing of data
    (section 4.4.5.3, p. 13):

    http://www.arc.gov.au/pdf/LP09_FundingRules_Superseded.pdf

    source: http://www.arc.gov.au/ncgp/lp/lp_fundingrules.htm

  87. maxberan
    Posted Jun 10, 2012 at 8:44 AM | Permalink

    To my shame I do not know what folk mean here by the term “detrending”.

    True my own dendrochronologising days are back in the 1970s and then we followed Fritts by removing a negative exponential function from the whole-year tree-ring widths and analysed the residual from that fitted line. Climate data is detrended generally by removing a regression (on year number) line or perhaps polynomial curve and again inspecting the residuals.

    Is that sort of thing what most people here mean by detrending? I see a lot of references to ARMA etc which to my understanding concerns short-term persistence or memory but that wouldn’t normally come under the heading of “detrending”, the words I’d use are “filtering” or “smoothing” or “removal of short-term memory”.

    • Posted Jun 10, 2012 at 8:59 AM | Permalink

      Max —
      “Detrending” here just means looking at the residuals of a least squares regression on time (year number). Gergis said she was selecting proxies by picking high correlations of detrended proxy with detrended temperature, but in fact was using correlations of the raw data. (If I understand the issue here correctly.)
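      A toy example (hypothetical numbers, not the Gergis data) illustrates the difference: two series that share only a linear trend correlate strongly raw, but show almost no correlation once each is reduced to its regression residuals.

      ```python
      import random

      random.seed(2)
      YEARS = list(range(1921, 1991))

      def detrend(y):
          """Residuals of a least-squares regression of y on year number."""
          n = len(y)
          t = list(range(n))
          mt, my = sum(t) / n, sum(y) / n
          slope = sum((ti - mt) * (yi - my) for ti, yi in zip(t, y)) / \
                  sum((ti - mt) ** 2 for ti in t)
          return [yi - (my + slope * (ti - mt)) for ti, yi in zip(t, y)]

      def corr(a, b):
          ma, mb = sum(a) / len(a), sum(b) / len(b)
          num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
          den = (sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b)) ** 0.5
          return num / den

      # "Temperature" and "proxy" share a linear trend but have independent
      # wiggles, i.e. the proxy carries no real year-to-year temperature signal.
      temp = [0.01 * i + random.gauss(0.0, 0.1) for i, _ in enumerate(YEARS)]
      proxy = [0.01 * i + random.gauss(0.0, 0.1) for i, _ in enumerate(YEARS)]

      r_raw = corr(temp, proxy)                    # inflated by the shared trend
      r_det = corr(detrend(temp), detrend(proxy))  # near zero: no shared signal
      print(round(r_raw, 2), round(r_det, 2))
      ```

      That is precisely why screening on raw correlations admits series whose only resemblance to temperature is a coincident trend.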

  88. GregK
    Posted Jun 10, 2012 at 9:03 AM | Permalink

    Discussion of whether or not to detrend passes my understanding; however, I am puzzled by a mean of 0.09 with an error of plus or minus 0.19.
    Allowing that all of Gergis et al’s computations were actually correct, they come to the conclusion that the peak of medieval warming was between 0.28C warmer and 0.10C cooler than 1961–1990. Statistically, probably a bit iffy?

  89. GregK
    Posted Jun 10, 2012 at 9:05 AM | Permalink

    Whoops, that was the other way around.

  90. ferd berple
    Posted Jun 10, 2012 at 10:44 AM | Permalink

    The dependent variable problem is well recognized in other fields. In climate science, temperature is the independent variable and tree rings are the dependent variable.

    Climate science selects only those cases where the dependent variable correlates with the independent variable. Substitute “climate science” for “comparative politics” in the paper below:

    How the Cases You Choose Affect the Answers You Get:

    http://cooley.libarts.wsu.edu/schwartj/pdf/Geddes1.pdf

    This is not to say that studies of cases selected on the dependent variable have no place in comparative politics. They are ideal for digging into the details of how phenomena come about and for developing insights. They identify plausible causal variables. They bring to light anomalies that current theories cannot accommodate. In so doing, they contribute to building and revising theories. By themselves, however, they cannot test the theories they propose and, hence, cannot contribute to the accumulation of theoretical knowledge (compare Achen and Snidal 1989). To develop and test theories, one must select cases in a way that does not undermine the logic of explanation.

    If we want to begin accumulating a body of theoretical knowledge in comparative politics, we need to change the conventions governing the kinds of evidence we regard as theoretically relevant. Speculative arguments based on cases selected on the dependent variable have a long and distinguished history in the subfield, and they will continue to be important as generators of insights and hypotheses. For arguments with knowledge-building pretensions, however, more rigorous standards of evidence are essential.

    • HAS
      Posted Jun 10, 2012 at 4:02 PM | Permalink

      Amen to that. At best all this stuff seems to do is create hypothetical structures worth further investigation, and not a basis for pronouncements on what was colder when.
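The selection hazard described above can be illustrated with a toy simulation (purely hypothetical noise "proxies", not the Gergis et al data): screen many random series against an instrumental trend, keep the ones that correlate, and the average of the survivors tracks the trend by construction.

```python
import numpy as np

rng = np.random.default_rng(1)
n_proxies, n_years = 1000, 70
temp = np.linspace(0.0, 0.7, n_years)            # instrumental warming trend
proxies = rng.normal(size=(n_proxies, n_years))  # pure noise: no climate signal

# Screening on the dependent variable: keep only the proxies that
# happen to correlate with the instrumental record.
r = np.array([np.corrcoef(p, temp)[0, 1] for p in proxies])
selected = proxies[r > 0.25]

# The mean of the selected noise series "reconstructs" the trend anyway.
recon = selected.mean(axis=0)
r_recon = np.corrcoef(recon, temp)[0, 1]
```

A small fraction of pure-noise series passes the screen by chance, and their average correlates strongly with the screening-period trend, illustrating why out-of-sample tests are needed.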

  91. André van Delft
    Posted Jun 10, 2012 at 3:59 PM | Permalink

    One aspect of the Gergis temperature reconstruction, and probably many others as well, surprises me: the reconstruction method has not been validated at all. We have strong grounds to criticize the method, but even if we knew of no such grounds, we still would not know how the method performs.

    The “scientists” should have validated their method using several unrelated datasets. For each dataset they should reconstruct temperatures and compare these with instrumental records. Then we might gain some confidence in the method before applying it to reconstructions of periods for which no instrumental records exist.
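For what a basic validation exercise might look like (a sketch on synthetic data; split-sample calibration/verification with a reduction-of-error score is a standard approach in the reconstruction literature, not necessarily this paper's method):

```python
import numpy as np

rng = np.random.default_rng(2)
years = np.arange(1900, 1991)
temp = 0.005 * (years - 1900) + rng.normal(0, 0.1, years.size)  # synthetic temperature
proxy = 2.0 * temp + rng.normal(0, 0.2, years.size)             # hypothetical temp-sensitive proxy

calib = years < 1950   # calibration window
verif = ~calib         # held-out verification window

# Calibrate: regress temperature on the proxy in the calibration window only.
slope, intercept = np.polyfit(proxy[calib], temp[calib], 1)
pred = intercept + slope * proxy[verif]

# Reduction of Error (RE): skill relative to the calibration-mean baseline.
baseline = temp[calib].mean()
re = 1 - np.sum((temp[verif] - pred) ** 2) / np.sum((temp[verif] - baseline) ** 2)
```

RE greater than zero is conventionally read as the reconstruction having some skill over the held-out period relative to simply predicting the calibration mean.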

  92. Skiphil
    Posted Jun 10, 2012 at 5:34 PM | Permalink

    Steve and all, I think you might find it of interest to know how sweeping some of the statements of Karoly, Gergis, and Phipps were at the media event on May 17, 2012. I’ve posted my notes and some quotations from listening to this 30-minute online event:

    David Karoly tells journalists that 0.09C difference means there was no Medieval Warm Period for Australia

    Whatever may happen with the paper now, I think it more than puzzling that Karoly thinks even the original paper abolished any idea of a MWP in Australia. He seems to be assuming that unless medieval temps were significantly higher than present there is no MWP and no issue at all. They also wave away any dendro issues and make some claims that seem hyped.

  93. HR
    Posted Jun 11, 2012 at 12:53 AM | Permalink

    I posted a longer version of this but it hasn’t appeared so I’ll summarize.

    Gergis seems to be funded by an ARC Linkage grant. Those grants have rules. Since Dec 2006 those rules have included a section called “Dissemination of Research Outputs”, which states

    “The Australian Government makes a major investment in research to support its
    essential role in improving the wellbeing of our society. To maximise the benefits
    from research, findings need to be disseminated as broadly as possible to allow access
    by other researchers and the wider community.”

    and

    “Taking heed of these considerations, the ARC endeavours to ensure the widest possible dissemination of the research supported under its funding, in the most effective manner and at the earliest opportunity.”

    and

    “The ARC therefore encourages researchers to consider the benefits of depositing their
    data and any publications arising from a research project in an appropriate subject
    and/or institutional repository.”

    See for example the 2008 PDF from http://www.arc.gov.au/ncgp/lp/lp_fundingrules.htm

    Gergis’s petty refusal to share data seems to go against the spirit, if not the letter, of the ARC’s rules. As the ARC rules allude to, ultimately it’s not her research; it’s for the wider community.

  94. Roger
    Posted Jun 11, 2012 at 5:57 AM | Permalink

    That a published paper can be “on hold” is a little baffling to me. I recently had the unfortunate experience of spotting a mistake in one of my papers. The mistake was a missing term in an equation: we had followed the correct procedure but wrote the equation down wrongly in the draft. Even though we alerted the journal to this mistake only a few days after online publication, we still had to write an erratum. This was also what I expected. Once something is published it’s part of the record. One may write another document to correct the record, but one may never change the original record.

    It’s a little odd that this procedure isn’t being followed here. It would be useful to find out what the journal’s policies are for dealing with this type of thing.

    • Erica
      Posted Jun 11, 2012 at 6:23 AM | Permalink

      A corrected paper being easier to follow than an erroneous one followed by an erratum, perhaps what is (also) needed is for papers to have versions.
      So the faulty “on hold” one might be version 1, the corrected one version 2, etc.

  95. Skiphil
    Posted Jun 11, 2012 at 5:44 PM | Permalink

    Climate Audit gets some mentions in discussion of the Gergis et al (2012) paper now “on hold” etc. Both Revkin at NY Times blog and the Retraction Watch blog have picked up the story:

    Revkin in NY Times blog on Gergis et al 2012 paper

    Retraction Watch on the Gergis et al 2012 paper

    • Skiphil
      Posted Jun 11, 2012 at 6:08 PM | Permalink

      Steve and all, reading this Comment in Nature by the guys who run the Retraction Watch blog (Marcus and Oransky) makes me think this is an excellent time for anyone who’s able to make the case to various scientific and media entities that Climate Audit is a valuable part of this process:

      Nature comment on post-publication review by Marcus and Oransky

      Science publishing: The paper is not sacred

      Adam Marcus & Ivan Oransky

      Nature 480, 449–450 (22 December 2011), doi:10.1038/480449a. Published online 21 December 2011.

      Peer review continues long after a paper is published, and that analysis should become part of the scientific record, say Adam Marcus and Ivan Oransky.

  96. Steve McIntyre
    Posted Jun 12, 2012 at 10:28 AM | Permalink

    J Climate editor Broccoli wrote:

    “The article in question did not “disappear” from the Journal of Climate. It was removed from a preprint server for accepted manuscripts that have not yet been published. AMS maintains this preprint server for the convenience of authors and readers. The manuscripts on this server are labeled as “preliminary” and are not in final form.”

    • Roger
      Posted Jun 12, 2012 at 11:13 AM | Permalink

      This is sophistry. The difference between “accepted for publication” and “published” is in the proofs. A paper is “preliminary” only in the sense that the formatting may change; it is not “preliminary” in the sense that the results are about to change.

      Is Prof Broccoli really implying that there is a further layer of checks on the integrity of the study after acceptance?

      Steve – the webpage is entitled “Early Online Release.” Maybe it’s a trick.

    • Paul Matthews
      Posted Jun 12, 2012 at 12:11 PM | Permalink

      “Not yet published”? The Melbourne Uni press release on 17 May said “The study published today in the Journal of Climate…”. Phipps’s web page of publications describes the paper as “published online May 2012”.

      “Preprint server”? Is he referring to the “Early Online Releases of Papers in Press”?

      And as for the statement that the paper that disappeared on Friday didn’t disappear, that’s in the ‘Comical Ali’ category.

    • Bob Koss
      Posted Jun 12, 2012 at 1:23 PM | Permalink

      If they don’t re-issue their paper before the cut-off date for ar5 submissions, it wouldn’t surprise me if they use the original online publication date in order to get it accepted into ar5.

    • Skiphil
      Posted Jun 12, 2012 at 2:50 PM | Permalink

      Well if that was not supposed to be a form of “online publication” by the Journal of Climate then some corrections need to be made by someone, because the co-authors claimed in a variety of places that their study had been “published” on May 17, 2012. The Australian Science and Media Centre considered it “published” and held an online press event with Gergis, Karoly, and Phipps to announce the historic event.

      The University of Melbourne’s own PR reads:

      May 17, 2012: “The study published today in the Journal of Climate….”

      “Our study revealed that recent warming in a 1000 year context is highly unusual and cannot be explained by natural factors alone, suggesting a strong influence of human-caused climate change in the Australasian region,” she said.

      The study published today in the Journal of Climate will form the Australasian region’s contribution to the 5th IPCC climate change assessment report chapter on past climate. [emphasis added]

      Steve: I agree. Maybe editor Broccoli has been taking night classes on communications from the University of East Anglia.

      • Michael Kottek
        Posted Jul 11, 2012 at 8:32 PM | Permalink

        As a Melbourne Uni alumnus, I’ve just had the mailman deliver “Broadcast: The Latest News from the Faculty of Science” for July 2012. It highlights the Gergis et al paper and states that it is ‘published’ in the Journal of Climate.

        • Steve McIntyre
          Posted Jul 11, 2012 at 8:59 PM | Permalink

          Can you scan the relevant pages?

  97. Bob Koss
    Posted Jul 10, 2012 at 2:16 PM | Permalink

    It only took a month since Karoly’s thank you email for him to slag Steve in print.

    https://www.australianbookreview.com.au/feature-articles/1063-343-features-karoly

    I posted this comment over there, but suspect it won’t be published.

    I find it strange David Karoly includes Stephen McIntyre in a list of “Commentators with no scientific expertise …” when just one month ago he closed an email to McIntyre by saying …
    “We would like to thank you and the participants at the ClimateAudit blog for your scrutiny of our study, which also identified this data processing issue.
    Thanks, David Karoly”

    The full email can be found in this thread. http://climateaudit.org/2012/06/08/gergis-et-al-put-on-hold/

    It leads me to wonder how much “artistic license” was used in writing the rest of this article.

    • Skiphil
      Posted Jul 10, 2012 at 9:14 PM | Permalink

      That “book review” defies belief. While passing over all actual issues without one word of substance, Karoly cannot even imagine that Michael “Mother Teresa” Mann could be questioned or criticized about anything, ever. Not even for rhetorical purposes of explaining why a criticism is wrong. I want to note that the claim there is a diabolical “Serengeti strategy” which Mann promotes and which Karoly adopts uncritically is in fact silly and incoherent. Mann and his work were in no sense “at the edge of the herd” but in the very center of it, with maximum protection. If Mann’s work was “vulnerable”, it was not due to any possible “Serengeti strategy” (Mann has always had the maximum protection of the entire herd, to this day), but rather due to its own inherent problems. The stuff about the “Serengeti strategy” is a low propaganda meme, not a serious intellectual claim.

      [Karoly] “Mann was subjected to what he describes as the ‘Serengeti strategy’, in which predators ‘look for the most vulnerable animal at the edge of a herd’. In his case the predators were climate change ‘confusionists’, politicians and commentators who wish to confuse the public understanding of climate change science and delay action on reducing industrial emissions of green-house gases.

      • Steve McIntyre
        Posted Jul 10, 2012 at 9:58 PM | Permalink

        here is a picture of Karoly at the opening of the Hepburn Community Wind Farm in Victoria, Australia on November 5, 2011. The slogan on his shirt was the slogan of the radical group, the Weather Underground, in the late 1960s when I was at university. Their manifesto is here. Lots of stuff about pigs and imperialists.

        • Eddy
          Posted Jul 10, 2012 at 10:07 PM | Permalink

          Well, it’s originally a line from a Bob Dylan song, isn’t it? Subterranean Homesick Blues.

          Don’t think Karoly is necessarily promoting armed struggle or whatever.

        • Steve McIntyre
          Posted Jul 10, 2012 at 10:22 PM | Permalink

          Maybe it’s age-specific. For someone who grew up in the period, the phrase and the radical movement were inseparably linked. Wikipedia has an interesting article on the faction http://en.wikipedia.org/wiki/Weather_Underground.

          I still have my vinyl Dylan album with Subterranean Homesick Blues on it.

        • Eddy
          Posted Jul 10, 2012 at 10:35 PM | Permalink

          “Steve McIntyre

          Posted Jul 10, 2012 at 10:22 PM | Permalink

          Maybe it’s age-specific. For someone who grew up in the period, the phrase and the radical movement were inseparably linked.”

          Yup, but I think the phrase nowadays has been watered down by usage to, well, T-shirt slogan status.

          From another Wikipedia piece:

          In a 2007 study of legal opinions and briefs that found Bob Dylan was quoted by judges and lawyers more than any other songwriter, “you don’t need a weatherman…” was distinguished as the line most often cited.

          http://en.wikipedia.org/wiki/Subterranean_Homesick_Blues

          I get the impression Karoly is a pretty run-of-the-mill, watered-down old lefty, not e.g. somebody storing weapons in his mother’s basement against The Day. But you never know :)

        • Steve McIntyre
          Posted Jul 10, 2012 at 10:52 PM | Permalink

          I’m sure that you’re right about this. The original Weather Underground appropriated the phrase from Dylan. I guess that Dylan is entitled to have it back after the passage of time. I suspect that Karoly thinks more often of government grants than about revolution.

        • Speed
          Posted Jul 11, 2012 at 5:51 AM | Permalink

          The answer, my friend, is blowin’ in the wind.

        • theduke
          Posted Jul 11, 2012 at 11:05 AM | Permalink

          http://climateaudit.org/2012/06/08/gergis-et-al-put-on-hold/#comment-342271

          Personally, I prefer “The pump don’t work cuz the vandals took the handle.”

        • Posted Jul 11, 2012 at 11:22 PM | Permalink

          Here’s another image. The t-shirt seems to be an AGW “argument,” but there is not enough visible detail to be sure.

          David Karoly

      • Kenneth Fritsch
        Posted Jul 11, 2012 at 9:06 AM | Permalink

        “That “book review” defies belief. While passing over all actual issues without one word of substance, Karoly cannot even imagine that Michael “Mother Teresa” Mann could be questioned or criticized about anything, ever. Not even for rhetorical purposes of explaining why a criticism is wrong. I want to note that the claim there is a diabolical “Serengeti strategy” which Mann promotes and which Karoly adopts uncritically is in fact silly and incoherent.”

        I think what you point to here, Skiphil, is part and parcel of what maintains the consensus thinking on AGW. Keep it general and keep it vague. If we never have to face the details head on, we can all agree to almost anything, including that those who want to discuss and analyze those details are merely unworthy nit-picking obstructionists.

    • Skiphil
      Posted Jul 11, 2012 at 8:33 AM | Permalink

      re: Karoly’s review of Mann’s book

      One thing that strikes me is how one-sided the hagiography of Mann’s role in the field has been. If one considers the “state of the field” in 1999 as reflected in Barnett et al (1999) [h/t reader "berniel" at WUWT], then is it so obvious that Mann has improved the field in an unqualified way? Or has he also thrown a lot of sand and created dubious levels of confidence about reconstructions that did not exist to the same extent before?

      Here is how Barnett et al (1999), including Phil Jones and Ben Santer as co-authors, describe the situation just before the impact of Mann really begins to shape (distort) the field:

      Barnett et al (1999) on state of the research

      [emphasis added]

      [Barnett et al (July 1999)]: “Recent compilations of paleoclimatic data have offered the first opportunity to analyze this type of data on a global scale. Straightforward comparisons via cross-spectral analysis of the recent paleodata with the instrumental record show that most of the paleodata are not simple proxies of temperature (Barnett et al. 1996; Jones 1998; Jones et al. 1998; see Table 1). Indeed, only a few of the tree-ring records from mid to high-latitude sites can be interpreted directly as temperature changes. Attempts by Jones et al. (1998) to use these “good” records to construct a record of Northern Hemisphere (NH) temperature over the last five centuries are shown in Fig. 2. Also shown is a different reconstruction created using a full compilation of proxy data (Mann et al. 1998). The disparity between these reconstructions at some times over the last 400 years is as large as the observed changes in global temperature over the last 100 years. Some of the differences are due to different compilations of proxy data and also differences in the seasons reconstructed, but most of the disparity simply represents uncertainties in our knowledge of past changes in NH average temperature.”

      • Skiphil
        Posted Jul 11, 2012 at 8:38 AM | Permalink

        i.e., in 1999 it was still possible to get a more honest statement of “uncertainties” from some leaders in the field.

    • Bob Koss
      Posted Jul 11, 2012 at 7:48 PM | Permalink

      Heh! :)

      Just tried the link to Karoly’s article to see what comments have been made. It is no longer available. I’m getting the error “404 – Article #1063 not found”. Articles on the main page prompt you to subscribe to read them, so it is unlikely they just moved it behind the paywall.

      I guess it has been “put on hold”. Sort of like Gergis et al.

      • Skiphil
        Posted Jul 11, 2012 at 8:34 PM | Permalink

        Bob, this is bizarre. It definitely looks like it’s been removed from the site, because yesterday it was on the main page for the current issue, right under the first review of the book on Murdoch:

        https://www.australianbookreview.com.au/

        ABR JULY–AUGUST 2012, NO. 343

        Now it’s gone!! It’s not just some link problem, because the Home Page shows up fine and I know that yesterday that review article by Karoly on Mann was just below the one on Rupert Murdoch. Strange things happen when Karoly, Mann, Gergis et al are around.

        Also, at this moment when I search for “Karoly” in the search box on the site I get a glimpse of YOUR comment on the review (I can clearly see words of your comment), in the list of search results, but then when I click on your comment I get the ‘404’ error for the review itself.

        To take a wild guess, maybe either ABR or Karoly himself is intensely embarrassed by the reaction to the review? I’ve never seen a published book review seemingly “withdrawn” like this before. Maybe Karoly submitted it before the fuss over Gergis et al (2012), never thought about it again, and is now embarrassed?? I don’t know, but it is bizarre to see the review vanish like this.

        • Steve McIntyre
          Posted Jul 11, 2012 at 8:52 PM | Permalink

          I wrote a letter of complaint to Karoly with a copy to the Dean of Research. I’m busy on a submission to the Information Tribunal in the UK right now but will provide more particulars on this in the near future.

        • Brandon Shollenberger
          Posted Jul 11, 2012 at 8:59 PM | Permalink

          I noticed the same thing when I went to comment on the page. It’s good to hear why it was taken down, though I look forward to hearing more details when Steve gets the time.

        • Lucy Skywalker
          Posted Jul 12, 2012 at 3:26 AM | Permalink

          Indeedy, poof! gone!

          When it was there, I was very pleased to note that all comments were from people highly skeptical of Karoly (though reasonably courteous), several familiar names. Pleased, too, to see that ABR did publish the comments.

          No such good news can be said for comments at Mann’s Amazon book review page.

        • theduke
          Posted Jul 12, 2012 at 9:55 AM | Permalink

          Could this have been submitted for publication before the collapse of the Gergis paper? It doesn’t reflect the public pose he’s assumed since.

        • sue
          Posted Jul 12, 2012 at 10:15 AM | Permalink

          theduke, I think so, from this quote :

          “I am not so sure that this is the case in Australia. With the introduction of legislation setting a price on greenhouse gas emissions from human activities, the Climate Wars have heated up here, with coordinated misinformation campaigns from politicians, from media commentators…”

          When did that bill pass?

      • Steve McIntyre
        Posted Jul 11, 2012 at 8:55 PM | Permalink

        http://webcache.googleusercontent.com/search?q=cache:WiybdRjsoU0J:https://www.australianbookreview.com.au/feature-articles/1063-343-features-karoly+&cd=1&hl=en&ct=clnk&gl=ca&client=firefox-a

        • LC
          Posted Jul 11, 2012 at 10:53 PM | Permalink

          I’m guessing the Australian legal system is based on the UK model and still operates in pretty much the same way. Perhaps some of the Aussie commenters here might be able to tell us. If I am guessing correctly, then I’m surprised that Karoly’s review ever got into print in the first place. Not only does he accuse Steve and others of having “repeatedly promulgated misinformation”, but he also accuses several well-known Australian media and science personalities of being involved in “coordinated misinformation campaigns”. I know it’s overused in the “climate wars”, but the word libellous comes to mind.

        • maxberan
          Posted Jul 12, 2012 at 12:50 PM | Permalink

          Perhaps it was that “vested interest” accusation that the publishers thought an insult too far.

          Being labelled a “climate confusionist” leaves me unmoved; it is perhaps even an epithet I could welcome, if it can be interpreted as one who wishes to see a sense of complexity restored to this “settled” science, which has been simplified and linearised to the point where multiples of national GDPs are expended on mitigation policies.

        • theduke
          Posted Jul 12, 2012 at 2:30 PM | Permalink

          Apart from the fact that the review is just more hollow rhetoric from the “consensus” side, the use of the term “obscurantism” to describe Steve and others is particularly appalling. Karoly seems to believe they are opposed to progress and enlightenment and only he and his comrades are the shining light leading the way to scientific truth.

        • Posted Jul 12, 2012 at 4:37 PM | Permalink

          Re: Steve McIntyre (Jul 11 20:55), There’s a semi-innocent explanation for this review going up and then being pulled that people seem to have missed, or at least I haven’t read: Dr. Karoly may have written this review more than a month ago, before ever really reading CA or learning about Steve McIntyre first hand. He may have been parroting information fed to him and believed it at the time.

          After seeing the contribution CA made with the Gergis et al paper, and yes, finding flaws is a contribution in science, Dr. Karoly may have changed his opinion about Steve McIntyre and more. The review went live when previously scheduled, and it is possible that, having changed his mind in the intervening period, Karoly had it pulled. A more positive reading is that Dr. Karoly may be having second thoughts about a lot of things [cough, Mann, cough].

          Time will tell if this positive spin has any merit or not.

        • Steve McIntyre
          Posted Jul 12, 2012 at 4:40 PM | Permalink

          An interesting story from a commenter at WUWT:

          Apparently, Karoly’s admiration for Mann was so great that he offered Mann a kangaroo skin as a gift when Mann visited Australia. But Mann, being a vegetarian, said no thanks. The next day, a headline in the Sydney Morning Herald claimed that ‘Mann had Declined the Hide’.

        • Steve McIntyre
          Posted Jul 12, 2012 at 4:42 PM | Permalink

          I think that my complaint to the university probably had more to do with it.

        • Posted Jul 12, 2012 at 5:10 PM | Permalink

          I was just trying to be nice. My life coach has been beating the negativity out of me.

      • Posted Jul 11, 2012 at 10:15 PM | Permalink

        Perhaps we are witnessing the birth of a new strategy in the art and artifice of climate science: The Upside-Down-Under Publish’em and Hold’em. One wonders how long it will be before this technique of publication becomes known as a well-documented “standard practice” in the field … not unlike that which has been asserted regarding the use of “trick”.

  98. BFJ
    Posted Jul 11, 2012 at 9:08 AM | Permalink

    Seems to me this ‘Serengeti’ approach is exactly what is needed in science – the weakest ideas are the ones that get killed off.

    • Skiphil
      Posted Jul 11, 2012 at 9:15 AM | Permalink

      yes, agreed, but Mann and Karoly present it as though it were a “political” approach of looking for someone who is personally vulnerable at the edge of the herd, rather than looking at the weakness of scientific claims. My point is simply that Mann enjoyed maximal support, acclaim, protection, and promotion from within the IPCC and climate science, very early on, so whatever is debatable about his work, it is not that it was somehow more vulnerable to attack by *uninformed* outsiders.

      • Steve McIntyre
        Posted Jul 11, 2012 at 11:17 AM | Permalink

        Quite so. I looked at the Mann reconstruction because it was regarded as the most definitive, not because it was regarded as the weakest. And because it was used in IPCC and government promotions, which presumably had their choice of arguments and chose one that they believed to be among the strongest.

  99. theduke
    Posted Jul 11, 2012 at 11:20 AM | Permalink

    I’m having a difficult time posting. I type a sentence and it takes 45 seconds to show up on screen. It seems to only happen here and at WUWT. Been like this off and on for a week or two.

    Anyone else??

    • Bob Koss
      Posted Jul 11, 2012 at 1:17 PM | Permalink

      No problems with my WinXP machine.

      Sounds like some program is consuming all your CPU time. Virus maybe? You might want to virus scan your machine. I vaguely remember hearing about one virus which was targeted at wordpress.

  100. sue
    Posted Jul 14, 2012 at 1:26 AM | Permalink

    This is a comment by David Karoly:

    http://www.skepticalscience.com/news.php?n=1538

    dkaroly at 08:06 AM on 14 July, 2012
    This is a very welcome initiative. The threats of legal action and FOI requests are not just occurring in North America. In Australia, I have just received a threat of legal action from Steve McIntyre in Canada and am currently dealing with 6 different FOI requests.

    Is it true that you’ve ‘threatened’ legal action against Karoly?

    • Mooloo
      Posted Jul 14, 2012 at 2:29 AM | Permalink

      I wonder if he will “deal” with the FOI requests the quick and easy way — which is to release the data. (I used to deal with NZ Official Information Act requests, and they were a piece of cake if all you had to do was release the information. It was only withholding that caused you time and effort.)

      The “threats of legal action” are a red herring. Being threatened with legal action when you have done nothing wrong is hardly a problem. It costs you nothing apart from a bad feeling for a short while. Actually, it’s not really a problem even if you have done something wrong. Only if there is actual legal action do you have any real issues.

  101. Paul Matthews
    Posted Jul 24, 2012 at 9:14 AM | Permalink

    Some Gergis emails have been released, see Warwick Hughes’s blog.

    Claims that ‘our team discovered an error’.
    ‘When we went to recheck this on Tuesday 5 June, we discovered that the records used in the final analysis were not detrended’.

    Why did they go back and recheck this on June 5 I wonder.

    • Skiphil
      Posted Jul 24, 2012 at 9:40 AM | Permalink

      Oh yeah, sure, they just happened to be re-checking on June 5; her email says it was because they were posting some unpublished data to a NOAA site. Is that credible??

      here’s a hyperlink to the Gergis emails posted at the Warwick Hughes blog:

      Gergis email comments at the Warwick Hughes blog, per FOI request

      h/t reader “March” at Bishop Hill

    • matthu
      Posted Jul 24, 2012 at 10:11 AM | Permalink

      At least we now know how Gergis intends to handle the issue.

      1. the results are unlikely to change substantially (i.e. we know this already without revisiting the data)

      2. we will be rerunning everything using both detrended and non-detrended data before comparing the results and deciding which method to use and how to justify our choice of method.

      3. rest assured that it is highly unlikely our core conclusions will change

  102. Skiphil
    Posted Jul 24, 2012 at 10:58 AM | Permalink

    For FOI requests in Australia, one thing that is particularly needed is any document or email on or around June 5, 2012 (say date range May 17 – June 8 or beyond) which pertains to the Gergis assertion that Gergis and/or Neukom found the problem independently of Jean S. and Climate Audit.

    i.e., IF the Gergis claim is true that the authors identified the problem independently while preparing unpublished data for the NOAA site, surely there would be some contemporaneous email corroboration among the co-authors (“oh, no, look at what I just noticed….”).

    Of course this pretense is highly implausible at best, but FOI might get at any emails from that day, week, and multi-week period that would prove it if Gergis is telling the truth. One can’t prove a negative from absence of evidence, but it would contribute to doubt of the Gergis/Karoly tale on this if they cannot provide any evidence.

    • Skiphil
      Posted Jul 24, 2012 at 11:32 AM | Permalink

      p.s. and was there any Gergis et al submission of data to NOAA on or soon after June 5??

      Of course, they might pretend that they never got around to submitting data at that time because they noticed the problem with the paper, but surely posting data sets to the NOAA site is quite different from reviewing the methodology of the paper, so it could be interesting to know whether or not Gergis, Neukom, Phipps, Karoly, or Gallant was posting any data related to the paper on or around June 5 on NOAA.

      • JohnH
        Posted Jul 24, 2012 at 1:01 PM | Permalink

        Her email has guilt written all over it. Why mention CA in the first section (which is just an excuse to slag off Steve), but then launch straight into how 'they' found the problem, without pausing for breath or explaining why CA was mentioned in the previous para?

        It has activist written all over it, and it is playing to the Team.

        • Skiphil
          Posted Jul 24, 2012 at 1:49 PM | Permalink

          Yes, why mention Steve or CA at all in that context? Her email has a strong odor of "guys, here's our official story for anyone who asks" (and also a strong odor of b.s.). She was already in spin mode, rather than a scientific mode of "let's review all the data and methods again from the start."

  103. Michael K
    Posted Oct 22, 2012 at 11:14 PM | Permalink

    Well, Raphael Neukom refers to discovering the error 'today' in an email sent around 10 AM on Wednesday 6 June. I'm not sure what time zone the email refers to.

    • Skiphil
      Posted Oct 22, 2012 at 11:29 PM | Permalink

      Re: Michael K (Oct 22 23:14),

      where did you view this email??

    • Skiphil
      Posted Oct 23, 2012 at 12:01 AM | Permalink

      re time of Neukom email (Michael K, do you have a link/source??)

      IF there is such a Neukom email for a time of around 10 am on Wed. June 6 and IF in fact the Neukom/Gergis email server was time stamping for Melbourne time (which I believe is 16 hours ahead, to the next date, compared to Climate Audit time), THEN it may well be that Neukom’s email came after the Jean S post. Details must be established….

      It is speculative to discuss the timeline without firm facts, and of course Neukom et al. could claim they had already found the problem, blah blah, but it does appear that a key Jean S post was at 8:42 AM Melbourne time (4:42 PM the previous day in Climate Audit time).

      Of course there was an energetic discussion on CA by quite a few regulars already expressing all kinds of issues and puzzles for trying to make sense of what Gergis et al (2012) had done, which could already have sparked re-analysis by any of the Gergis group, but perhaps the comment from Jean S that set things on the track toward withdrawal of the paper was this one:

      key Jean S comment Posted Jun 5, 2012 at 4:42 PM

      • Steve McIntyre
        Posted Oct 23, 2012 at 12:39 AM | Permalink

        The idea that they located the error "independently" of Climate Audit is ludicrous. The screening issue had been raised at CA well before June 5; Gergis was aware of the CA criticism.

      • Michael K
        Posted Oct 23, 2012 at 1:00 AM | Permalink

        From an FOI request, I will provide more details when I am able to.

        Cheers

        Michael K

        • Posted Oct 28, 2012 at 7:04 AM | Permalink

          The results of my FOI request to the University of Melbourne can be seen here:

          http://tinyurl.com/96ey5dt

          I requested all correspondence between the authors and the journal regarding the paper. The referees' reports were exempted, as were documents relating to the resubmitted paper.

          I also requested correspondence between the authors after the paper was accepted. Once again emails relating to the resubmitted paper were exempted, and personal material redacted.

          I note that emails regarding the paper that were received by one author and not forwarded to the others would not have been covered by my request.

          Despite the embarrassment of the withdrawn paper, the University is to be commended for its no-nonsense approach to this request. As an alumnus, I am pleased that the response is far more sensible than the approach taken by the UEA and UVa.

        • redcords
          Posted Oct 28, 2012 at 8:09 AM | Permalink

          Quite a lot of interesting reading in all of this.

          David Karoly says that if they redo the paper according to the described method (detrending) then "only about 9" proxies remain, and only 1 before 1400: "No reliable reconstruction before 1400".

        • redcords
          Posted Oct 28, 2012 at 8:19 AM | Permalink

          One more comment if I may.

          Michael Mann, on the 9th June 2012, when trying to steer them away from detrending:

          “Well I’m afraid McIntyre has probably already leaked this anyway. I probably don’t have to tell you this, but don’t trust him to behave ethically or honestly here, and assume that anything you tell him will be cherry-picked in a way that maximally discredits the study and will be leaked as suits his purposes.”

          CA is mentioned a number of times, including Karoly taking advice from CA comments to question Neukom.

        • redcords
          Posted Oct 28, 2012 at 8:59 AM | Permalink

          David Karoly, 7th June 2012 to Gergis and Neukom:

          “The same argument applies for the Australasian proxy selection. If the selection is done on the proxies without detrending ie the full proxy records over the 20th century, then records with strong trends will be selected and that will effectively force a hockey stick result. Then Stephen McIntyre criticism is valid. I think that it is really important to use detrended proxy data for the selection, and then choose proxies that exceed a threshold for correlations over the calibration period for either interannual variability or decadal variability for detrended data. I would be happy for the proxy selection to be based on decadal correlations, rather than interannual correlations, but it needs to be with detrended data, in my opinion. The criticism that the selection process forces a hockey stick result will be valid if the trend is not excluded in the proxy selection step.”

          Raphael Neukom, same day to Karoly:

          “I agree, we don’t have enough strong proxy data with significant correlations after detrending to get a reasonable reconstruction.”

          There’s plenty more but I’ll leave it up to our host to sift through.

    • Jean S
      Posted Oct 23, 2012 at 3:15 AM | Permalink

      Re: Michael K (Oct 22 23:14),
      if such an email exists, then he is somewhat contradicting Gergis, who wrote:

      When we went to recheck this on Tuesday 5 June, we discovered that the records used in the final analysis were not detrended.

      If they had found the error on June 5th (Melbourne time), Neukom would definitely have written 'yesterday', as he was reporting the error at 10 AM in the morning. On the other hand, if they had independently found the error in the early morning of June 6th (Melbourne time, coinciding with the exact time I reported the problem here), I find it highly unlikely that one would already be reporting it to someone a maximum of 3–4 hours (depending on their working habits) later. When you find such an error, first you triple-check your own calculations/archives, then you contact your (main) co-authors (by phone) to let them check it a few times more.

      Just out of curiosity: I actually found the error a few days earlier (Saturday evening, Finnish time). I was at our summer cottage with only my mini laptop, so I decided to redo the analysis a couple of times on my desktop at home before announcing it. I simply could not believe they had made such a stupid error, and therefore I was very careful when writing my "announcement" (as I was still, even after checking the thing about 10 times, partly fearing that I had myself made some mysterious error or somehow misunderstood their text).

      • Posted Oct 23, 2012 at 5:04 AM | Permalink

        Two emails relating to the timing of the discovery are here: http://tinyurl.com/93un4uz

        There are many more I will probably release, but I thought it best to let the people affected know of my intention to do so.

        • Jean S
          Posted Oct 23, 2012 at 6:20 AM | Permalink

          Re: Michael Kottek (Oct 23 05:04),
          Thanks! So we have a timeline (please recheck these):
          – Jun 5 at 4:42 PM Climate Audit time (is that UTC/GMT -5 hours ???) [Steve - yes] (= Jun 6 at 7:42 AM Melbourne time (UTC/GMT +10 hours, as there was no daylight saving time there then) ???) [Steve - 15 hours between blog time and Melbourne time appears right to me as well] I posted my "announcement", which included a link to turnkey R code and data to replicate the problem (and went to bed afterwards, as I live at UTC/GMT +3 hours = 00:42 AM).
          – Jun 6 at 9:46 AM Melbourne time (= Jun 5 at 6:46 PM Climate Audit time ???) [Steve - agreed] Raphael Neukom sends an email to Gergis and Karoly stating:

          As just discussed with joelle on skype, I found a mistake in our paper in journal of climate today.

          But Neukom is in Switzerland, so his email time is 1:46 AM (UTC/GMT +2 hours), and his “today” likely refers to June 5th (in Switzerland).

          So if my timeline is correct, Neukom had over two hours after my post to check the issue, make a Skype call to Gergis, and write the e-mail. But of course, it is possible that he found the issue on his own, coincidentally within hours of my making the post. Maybe we have some type of telepathic connection unknown to me? :)

        • Steve McIntyre
          Posted Oct 23, 2012 at 6:42 AM | Permalink

          Climate Audit blog time is one hour further west than Eastern Daylight Time i.e. minus 5 hours to UTC and thus minus 15 hours to Melbourne time. Your post at 2012-06-05 16:42 blog time would have been 2012-06-06 07:42 Melbourne time, two hours before the Neukom-Gergis email and Skype.
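          The offset arithmetic in this exchange can be sanity-checked with a short script (a sketch assuming the fixed offsets stated in the thread: UTC-5 for blog time, UTC+10 for Melbourne with no daylight saving in June, and UTC+2 for Switzerland):

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC offsets in effect in early June 2012, as stated in the thread:
# Climate Audit blog time UTC-5, Melbourne UTC+10 (no DST in June),
# Switzerland UTC+2 (CEST).
BLOG = timezone(timedelta(hours=-5))
MELBOURNE = timezone(timedelta(hours=10))
SWISS = timezone(timedelta(hours=2))

# Jean S's comment: Jun 5 at 4:42 PM blog time
jean_s_post = datetime(2012, 6, 5, 16, 42, tzinfo=BLOG)
# Neukom's email: Jun 6 at 9:46 AM Melbourne time
neukom_email = datetime(2012, 6, 6, 9, 46, tzinfo=MELBOURNE)

print(jean_s_post.astimezone(MELBOURNE))   # 2012-06-06 07:42:00+10:00
print(neukom_email.astimezone(SWISS))      # 2012-06-06 01:46:00+02:00
print(neukom_email - jean_s_post)          # 2:04:00
```

          This reproduces the roughly two-hour gap between the Jean S post and the Neukom email discussed above.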

        • Steve McIntyre
          Posted Oct 23, 2012 at 11:04 AM | Permalink

          Jean S,
          Take another look at the following statement in Karoly’s email of June 8 quoted in the post:

          While the paper states that “both proxy climate and instrumental data were linearly detrended over the 1921–1990 period”, we discovered on Tuesday 5 June that the records used in the final analysis were not detrended for proxy selection, making this statement incorrect.

          At the time, commenters observed the peculiar precision of Karoly’s date stamp – an assertion both clarified and repudiated by the Neukom email. June 5 in Australia had come and gone without Gergis and Karoly being aware of the error. They did not become aware of the error until Neukom notified them on June 6, Australia time.

        • Steve McIntyre
          Posted Oct 23, 2012 at 6:36 AM | Permalink

          From Michael K here http://tinyurl.com/93un4uz

          Subject: Mistake in the Australasian TT paper
          Date: Wednesday, 6 June 2012 9:46 AM
          From: Raphael Neukom
          To: Joelle Gergis, David Karoly
          Conversation: Mistake in the Australasian TT paper

          Hi Joelle and David,

          As just discussed with Joelle on Skype, I found a mistake in our paper in Journal of Climate today.

          It is related to the proxy screening, so it is a delicate issue. In the paper we write that we do the correlation analysis for the screening based on detrended (instrumental and proxy) data, but in reality we did not use detrended data.

          The origin of the mistake is that at the stage when we were writing the paper my approaches had already evolved and I had made the proxy selection for the SH reconstruction based on detrended data. I therefore had in my mind that we had done the same for Australasia months ago and was very negligent not to check this carefully.

          Using detrended data would only select very few proxy records that would not allow a reasonable reconstruction. I think it is basically justifiable to do the screening without detrending but changing these words may cause troubles.

          Fortunately we have not received the proofs yet. So my suggestion is to write to the editor, explain the mistake and ask for permission to correct the error, if necessary via sending it out to review again.

          I apologize for the mistake and the troubles it may cause and hope that we can find a good way to correct it.

          David, your advice on this would be very much appreciated.

          Thanks a lot and best regards,
          Raphi

        • Jean S
          Posted Oct 23, 2012 at 6:56 AM | Permalink

          Re: Michael Kottek (Oct 23 05:04),

          this is a good one:

          Using detrended data would only select very few proxy records that would not allow a reasonable reconstruction. I think it is basically justifiable to do the screening without detrending but changing these words may cause troubles.

          Recall that on June 13th Gergis wrote:

          The results are unlikely to change substantially (a drop from 27 to 22 records based on an initial assessment) but there was a mistake in the sentence describing the method so we felt that we needed to voluntarily withdraw the paper in press with the journal, amend the text and add some extra supplementary material justifying our method.

          Why is there a drop if they use non-detrended screening (as was actually done in the paper)? I also recall Karoly firmly stating that they were going to stick to detrended screening (as described in the paper), but I can't find a link right now.

        • S. Geiger
          Posted Oct 23, 2012 at 12:13 PM | Permalink

          “Raphi got to bed at 2 am going through all of this…” (consistent with a reference to June 5?)

        • Jean S
          Posted Oct 23, 2012 at 2:39 PM | Permalink

          Re: Steve McIntyre (Oct 23 11:04),
          notice also that Neukom first Skyped Gergis. If you find a huge mistake in your work while you know your co-author(s) is (are) sleeping, do you wait until she wakes up in order to Skype her? Or do you send her/them an e-mail explaining the situation and asking them to Skype you ASAP (perhaps saying that you will be awake until a certain time)? Given that “Raphi” informed Gergis first by Skype, there are IMO only two possibilities:
          1) He Skyped Gergis almost immediately after becoming aware of the problem (since she was already online anyhow).
          2) There was something he wanted to discuss with Gergis privately before contacting the senior author (Karoly).
          Given the tone of Karoly’s response and the fact that Neukom informed Karoly about the Skype call, I have a hard time believing 2).

        • Spence_UK
          Posted Oct 28, 2012 at 5:11 PM | Permalink

          Exact timings and coincidences have always been very important to these international men of mystery :)

      • Jean S
        Posted Oct 23, 2012 at 2:16 PM | Permalink

        Re: S. Geiger (Oct 23 12:13),
        He went to bed after sending the email to Gergis and Karoly?

  104. Dr K.A. Rodgers
    Posted Oct 28, 2012 at 2:52 PM | Permalink

    Jean S
    “But of course, it is possible that he found the issue on his own coincedentally within hours of me making the post. Maybe we have some type of telepathic connection unknown to me?”

    Sounds like Sheldrake’s morphic resonance to me.

  105. Paul in CT
    Posted Mar 20, 2013 at 2:10 PM | Permalink

    Does anyone know the current status of this article?

    Thanks,
    Paul

