Climategate II Tools

A searchable version of Climategate 2 is online at and at

I’ve made a time-concordance of the new emails and placed it online at (Note: I’ve updated this to include julian seconds to match Climategate 1.)

For R-users, I’ve made an R-concordance of the 5349 emails as text files and placed it online at You can download to your own file location e.g. “d:/data/climategate2/”
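A minimal sketch of how the text files might be loaded once they are on disk. It assumes the emails sit as individual files named like 0243.txt in the folder mentioned above; the function name `read_emails` and the variable `email_dir` are hypothetical, introduced only for illustration.

```r
# Hypothetical local folder; adjust to wherever the text files were saved.
email_dir <- "d:/data/climategate2"

# Read every .txt file in a directory into a named list of character vectors,
# one list element per email, named by the file's numeric id.
read_emails <- function(dir) {
  files <- list.files(dir, pattern = "\\.txt$", full.names = TRUE)
  emails <- lapply(files, readLines, warn = FALSE)
  names(emails) <- sub("\\.txt$", "", basename(files))
  emails
}

# Only runs if the folder actually exists on this machine.
if (dir.exists(email_dir)) {
  emails <- read_emails(email_dir)
  print(length(emails))   # number of emails found
}
```

Each email can then be pulled out by id, e.g. `emails[["0243"]]`, provided a file with that name was present.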


  1. Posted Nov 23, 2011 at 10:23 AM | Permalink

    There do appear to be some duplicates from the first dossier.

    For example, by Ed Cook (you can find a lot of his stuff if you search on a certain term that starts with “f” and ends in “ck” and has a vowel in the middle bit.)

    That email is one I’ve brought up before, in which Ed so eloquently exposes the skill of dendro recons, at least that’s what it seems like to me.

  2. Steve McIntyre
    Posted Nov 23, 2011 at 10:25 AM | Permalink

    I would like to be able to write text direct to pdf in R. I can do plots this way through

    But the same strategy doesn’t stick if I use write or print. Does anyone know how to do this?

    • Posted Nov 23, 2011 at 10:28 AM | Permalink

      try cat

    • Posted Nov 23, 2011 at 10:34 AM | Permalink

      also check textplot() in the gplots package.

      also Sweave

    • Posted Nov 23, 2011 at 1:34 PM | Permalink

      If you have Linux (or Cygwin on Windows) this does the job, i.e. write a text file, then convert it to PDF.

      a2ps --medium=Letter -o - 4550.txt | ps2pdf - 4550.pdf

      (leave out the --medium option if you want A4 pdf files)

    • steven mosher
      Posted Nov 23, 2011 at 4:07 PM | Permalink


      attach the gplots package

      use textplot(), should work


    • phil
      Posted Nov 24, 2011 at 8:05 AM | Permalink

      Yes — using textplot() from the gplots package. Cut & paste from its

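      ### Make a nice 4 way display with two plots and two text summaries

      library(gplots)          # provides textplot() and plotmeans()
      data(iris)
      par(mfrow = c(2, 2))     # 2 x 2 grid, one cell per panel

      # Panel 1: boxplot of sepal length by species
      plot( Sepal.Length ~ Species, data=iris, border="blue", col="cyan",
            main="Boxplot of Sepal Length by Species" )
      # Panel 2: group means with confidence intervals
      plotmeans( Sepal.Length ~ Species, data=iris, barwidth=2, connect=FALSE,
                 main="Means and 95% Confidence Intervals\nof Sepal Length by Species")
      # Panel 3: summary table rendered as text (nobs() needs the gdata package installed)
      info <- sapply( split(iris$Sepal.Length, iris$Species),
                      function(x) round(c(Mean=mean(x), SD=sd(x), N=gdata::nobs(x)),2) )
      textplot( info, valign="top"  )
      title("Sepal Length by Species")
      # Panel 4: regression summary captured as text and drawn on the device
      reg <- lm( Sepal.Length ~ Species, data=iris )
      textplot( capture.output(summary(reg)), valign="top")
      title("Regression of Sepal Length by Species")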

      So textplot(capture.output(summary(xlm))) may become your new best friend.

      Hth, Dirk
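For anyone who would rather not install gplots, the same idea can be sketched with base graphics only: open a pdf() device, start an empty page, and draw the text onto it. This is an illustration rather than anything posted in the thread; the helper name `text_to_pdf` and its arguments are hypothetical, and long text will simply run off the single page.

```r
# Write a character vector to a one-page PDF using base graphics only.
text_to_pdf <- function(lines, file, cex = 0.8) {
  pdf(file, width = 8.5, height = 11)   # Letter-sized page
  on.exit(dev.off())                    # always close the device
  par(mar = c(1, 1, 1, 1))
  plot.new()                            # empty page, coordinates run 0..1
  # Anchor the text block at the top-left corner in a monospaced font.
  text(0, 1, paste(lines, collapse = "\n"),
       adj = c(0, 1), family = "mono", cex = cex)
  invisible(file)
}

# Example: capture printed output and send it to a temporary PDF file.
out_file <- tempfile(fileext = ".pdf")
text_to_pdf(capture.output(summary(lm(dist ~ speed, data = cars))), out_file)
```

capture.output() is what turns anything that print() would show into a character vector, which is why write() or print() alone never reach the graphics device.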

  3. Posted Nov 23, 2011 at 10:31 AM | Permalink


    An older email, August 99. Ed Cook actually agreeing with Fred Singer about a Wigley paper. Heaven forbid that he or any of the others mentioned in this email as agreeing that Wigley’s paper was horrible, would speak up on the record.

  4. John F. Pittman
    Posted Nov 23, 2011 at 10:50 AM | Permalink

    Steve McI and Craig (and Chris Colose), have you looked at 0114 and 0237, perhaps 0251 and 0262?

  5. Ken
    Posted Nov 23, 2011 at 10:55 AM | Permalink

    The incremental release MIGHT be indicative of a passive-aggressive manipulation: Release some data, let those involved take a firm stand…then…after those positions are firmly locked in….release more that exposes contradictions.

    If I was trying to maintain anonymity AND expose truth that’s how I’d go about it.

    WHICH MEANS: if some do a thorough scrub of the data released, cross-referencing with prior releases and the defenses/explanations those prompted, one is likely to find some very incriminating inconsistencies.

    The value isn’t in any particular e-mails & associated attestations, it’s in the piecing together of the right puzzle pieces to expose a more significant picture.

    Like finding needles in a haystack & then arranging them in their proper context. That’s better than finding gold, it’s more like finding the recipe for making gold…but a lot more work to sort out. I truly hope someone/some people have the patience & bookkeeping skills to tease out & compile the bigger pictures waiting to be found & exposed.

    • TerryMN
      Posted Nov 23, 2011 at 11:22 AM | Permalink

      Interesting theory. Also referred to as “Giving them enough rope to hang themselves.”

    • Posted Nov 23, 2011 at 1:16 PM | Permalink

      Again Keith Briffa comes out of it all fairly well – though one shouldn’t speculate…

    • ChE
      Posted Nov 23, 2011 at 1:34 PM | Permalink

      I suspect that this is it. The reason I say that is the encrypted portion is a third payload to be delivered at a later date. Releasing the encrypted archive was brilliant – it assured that no one can ever delete any of this material.

      • ChE
        Posted Nov 23, 2011 at 1:37 PM | Permalink

        Matter of fact, to take that a step further, it’s not beyond possibility that FOIA delivered the key to certain TEAM members, so that they are aware of what is already out in the wilds of the internet, for want only of a key that can fit on a business card.

        This has spy novel potential.

        • Posted Nov 24, 2011 at 2:03 AM | Permalink

          I bet he gave the key to swifthack

          that would be a devilish thing to do

    • Steve Garcia
      Posted Nov 23, 2011 at 1:58 PM | Permalink

      See my comment below at 1:57pm (EST?).

  6. Jason
    Posted Nov 23, 2011 at 11:24 AM | Permalink

    Has anyone yet merged all the emails from both releases into one date-ordered archive? That would help.

  7. Dave
    Posted Nov 23, 2011 at 11:26 AM | Permalink

    Phil Jones writes:
    “I don’t consider myself a public servant, and I doubt many working in the University sector in the UK would either. University workers in the UK are not what we call civil servants.”

    In one sense he is correct: he does not work for the UK government in the way that someone working for the Home Office does. In another real sense he is dead wrong, because he is mostly funded and administered from UK government (taxpayer) funds, and therefore ought to be treated the same as if he were a civil servant.


    • jeff taylor
      Posted Nov 23, 2011 at 12:05 PM | Permalink

      He is incorrect. All those that draw compensation from Government are by definition public servants.

      • Steve Garcia
        Posted Nov 23, 2011 at 1:57 PM | Permalink

        I am pretty much convinced at this point that FOIA was moved to this release by the BEST papers and the attention they received. That gave him just enough time to review and extract – just before Durban and just after the 2nd anniversary of Climategate I.

        BEST seemed to give momentum back to the warmers, and seeing what is in the emails, I don’t imagine FOIA wanted that momentum to last very long, not when he (she?) was sitting on all this.

        Ken above talked about FOIA exhibiting “passive-aggressive” characteristics. I disagree. I think FOIA is sitting on his stash and is using it as he sees fit. If Ken doesn’t like that, whoop de doo.

        I don’t even see where

        Release some data, let those involved take a firm stand…then…after those positions are firmly locked in….release more that exposes contradictions.

        is passive aggressive.

        But I also don’t see where FOIA would wait so long before releasing them if Ken’s analysis is right. The time for them to feel comfortable with the new lay of the land was a long time ago – EXCEPT FOR BEST. First of all, I don’t think they have EVER felt comfortable, not since Nov 19, 2009. They had the world on a string up till then, and they have not had it that way at all since then – up until BEST, which was the first significant shift since Climategate I. Until BEST, a good percentage of journalists had become wary of them (good thing), and they had lost their monopoly at the public podium.

        But I DO agree with Ken, if one considers that BEST gave them some momentum to the point where they would feel that they’ve got the upper hand again. THEN, yes, knock them back on their heels again – but also with the zip file now all over the world, ready to become their very own Pandora’s Box/plague of locusts, they should realize that there is plenty more ammo.

        But one other thing this timing does is that it shows that the basis of BEST’s work – the databases – has been mishandled by CRU, that the science that CRU did was fumbled, screwed up, sloppy, and not even enough to convince their own friends, not without a lot of leaning on them. (If it hadn’t been for the IPCC connection, it is probably doubtful that CRU’s people would have much standing; too many around them thought their work sloppy.)

        And if BEST agreed with CRU’s numbers, that is a weakening of the BEST=AGW-IS-RIGHT meme. If one’s study agrees with someone known to be sloppy, what does that say about one’s work, after all? And with the rush job BEST did (and the criticism from Judith Curry, the #2 on the papers), one must begin to wonder. Muller really did not look confident about his work, and for a semi-emeritus 40+ years into his career, that is not usual.

        I think mo is on our side again, even if not by a lot.

        • Steve Garcia
          Posted Nov 23, 2011 at 2:00 PM | Permalink

          This was not supposed to go in this sub-thread… (I know how it happened, but it was an accident.)

        • Posted Nov 24, 2011 at 2:01 AM | Permalink

          The BEST data doesn’t rely very heavily on CRU.

          and CRU unique data amounts to nothing

    • cdquarles
      Posted Nov 23, 2011 at 12:33 PM | Permalink

      Re: Dave (Nov 23 11:26), My definition of public servant is anyone who meets the needs of the public via voluntary transactions, but that’s just me 🙂 and anyone who is directly or indirectly paid by the government is a government agent. In this context, Phil Jones is a public servant in the sense of government agent.

  8. Bob Koss
    Posted Nov 23, 2011 at 11:37 AM | Permalink


    Your time-concordance link doesn’t work. Need to replace with

    • Bob Koss
      Posted Nov 23, 2011 at 12:29 PM | Permalink

      I noticed three of the email dates are incorrect in the time-concordance. None of those emails are significant.

      id, concordance date, actual date
      338, 2094-05-03, 2001-05-03
      663, 0200-05-12, 2004-05-12
      1380, 0199-02-24, 1997-02-24
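Dates like these can be caught mechanically by checking the year against a plausible window. The sketch below uses a toy stand-in for the concordance: the column names `id` and `date`, the function `flag_bad_years`, and the 1990 to 2011 window are all assumptions made for illustration, not the layout of the actual file.

```r
# Toy stand-in for the time-concordance; the real file's layout may differ.
concordance <- data.frame(
  id   = c(338, 663, 1380, 2002),
  date = c("2094-05-03", "0200-05-12", "0199-02-24", "2004-10-07"),
  stringsAsFactors = FALSE
)

# Pull out the four-digit year and keep rows that fall outside the window
# (or whose year cannot be parsed at all).
flag_bad_years <- function(df, min_year = 1990, max_year = 2011) {
  year <- suppressWarnings(as.integer(substr(df$date, 1, 4)))
  df[is.na(year) | year < min_year | year > max_year, , drop = FALSE]
}

flag_bad_years(concordance)
```

On the toy data this flags the three ids listed above and leaves the correctly dated row alone.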

  9. Stacey
    Posted Nov 23, 2011 at 11:39 AM | Permalink

    @ Dave

    In one sense he is definitely correct because a Public Servant, well, serves the public 🙂

  10. Posted Nov 23, 2011 at 11:46 AM | Permalink

    : The team hearts industry funding, when the need arises.

    • Posted Nov 23, 2011 at 11:46 AM | Permalink

      “:” is supposed to be 0243.txt.

      • Steve Garcia
        Posted Nov 23, 2011 at 2:16 PM | Permalink

        Yes, if you or Steve M or Anthony Watts had such a smoking gun email, showing an organizing of meetings and corporate/oil funding, the warmers would be all over you.

        CLIVAR (mentioned) is a warmist company.

        I’d suggest that the Electric Power Research Institute (mentioned) apparently was trying to find common cause with the warmists, in the face of the (then seeming) inevitability of having to deal with carbon credits and/or some level of Kyoto governmental policies.

        I find it amazing that you and Steve and Anthony were able to get enough traction for so long, in the face of the AGW monopoly. How you all did it is hard to fathom. When you think of it, though, if it hadn’t been for FOIA the first time, where would we all be right now? It has taken both – your longevity and Climategate I – to have the world the way it is now. Those and the failure of Copenhagen.

  11. Rickard B
    Posted Nov 23, 2011 at 11:55 AM | Permalink

    Here is a torrent with FOIA2011, no problem with the download compared to the other sites:

  12. genealogymaster
    Posted Nov 23, 2011 at 12:01 PM | Permalink

    I have the second set of emails; if someone has the first, I could string them together and let someone upload it as a file.

    • Posted Nov 23, 2011 at 1:05 PM | Permalink

      write to steve.

    • Posted Nov 23, 2011 at 2:30 PM | Permalink

      My website has both the first and 2nd set of emails. In a short while I hope to add dates to the 2nd set so that they get placed in order when you search.

      Steve: I’ve done the work of extracting the dates. I’ve also extracted julian times for the new batch and can match 125 emails to the prior nomenclature.

      • steven mosher
        Posted Nov 23, 2011 at 6:35 PM | Permalink

        what would be nice is for people to be able to add tags to the files.

        tags that are tied to topics or narratives.

        narratives are critical

        Steve or I can suggest narrative tags

  13. Sean Inglis
    Posted Nov 23, 2011 at 1:16 PM | Permalink

    For those with local caches of the release on Linux, Midnight Commander (mc) is an excellent way to browse. Set it with the left hand pane as file list, right hand as preview, and you can scroll up and down skimming easily.

    • ChE
      Posted Nov 23, 2011 at 1:41 PM | Permalink

      And if you don’t have a linux box, it’s pretty simple to download and burn a bootable CD of Ubuntu (or even a thumb drive, but that’s more work).

  14. Barclay E MacDonald
    Posted Nov 23, 2011 at 2:12 PM | Permalink

    This is very interesting to watch, the blogosphere as a conduit for a massively parallel, multithreaded attack on a fairly large and complex problem. Thanks to all contributors.

  15. Robert S.
    Posted Nov 23, 2011 at 2:23 PM | Permalink

    For anybody that’s browsing the /documents/ folder I would suggest having your word processor ‘show changes’. It should provide some insight into the editing of those documents.

  16. Interested Observer
    Posted Nov 23, 2011 at 2:32 PM | Permalink

    I wonder how long the 7z archive will remain closed.

  17. Kevin
    Posted Nov 23, 2011 at 2:32 PM | Permalink

    Steve, any comment on this?

    When I use I searched on “Vincent Courtillot” from France.

    It comes up with an interesting look at how they treat qualified skeptics. In one note Phil Jones says he doesn’t allow public access of data for the stations because of “McIntyre” and that you had posts about Courtillot’s analysis.

  18. genealogymaster
    Posted Nov 23, 2011 at 2:37 PM | Permalink

    I have strung all the emails from batch one and batch two together for comparison, would anyone like the file?

  19. Steve Garcia
    Posted Nov 23, 2011 at 2:55 PM | Permalink

    Email 2002 is a discussion on methodologies. I have to think that FOIA included this one to point out some shortcoming that is significant:

    cc: “Parker David (Met Office)” , David Easterling
    date: Thu Oct 7 10:46:07 2004
    from: Phil Jones
    subject: Re: IPCC base period question
    to: “Russell Vose” , Kevin Trenberth

    We did have a criterion for calculating 61-90 normals (in the paper in J. Climate in 2003 by Jones and Moberg). I don’t think this is crucial but we need a background reference for you method if you’ve made any important changes since the last publication on the subject.
    With the HC we are updating our dataset (will go to HadCRUT3) – some new stations, but mainly some more work on outliers, checking all normals and importantly producing errors on all grid boxes as well as on the NH/SH and global series. Incorporating errors due to homogeneity checks, errors in normals and errors from the bucket/intake adjustments as well as the sampling errors done before. Paper on all this in the New Year, so forget this for the first draft.

    So, answer, do what you think is reasonable and have it documented – either in a paper, or we can add something in the Appendix we have on many of these error issues. The paper I mentioned above will give a first attempt at some of the errors we’ve not considered before. It will be a first attempt though as many of the estimates we use are a little ad hoc. We will have a methodology to see what will happen with a range of estimates.
    Getting as many of the time series with error bars is important as Kevin says. We will also have spatial patterns as well, but these will be less crucial as we only plan to look at patterns over 1979-2003 (5 or 6 eventually) and also 1901-2003-5 as patterns won’t change much from previous reports.
    I briefly talked to Dave about colour maps of trends (and not the dot format). We can look at this later. Need to be able to be clear where the missing areas are and the dots don’t always make this obvious.

    At 22:23 06/10/2004, Russell Vose wrote:

    Thanks, Kevin.
    I’m pretty sure that Phil used a few specific criteria in some of his papers (e.g., a minimum of 20 years of data, 4 years in each decade). I just didn’t know if these details had already been ironed out for IPCC (and if not, then I guess we’re the ones to decide).
    Kevin Trenberth wrote:

    Russell, No doubt Phil will comment more. The baseline is to establish a common period for anomalies and thus it anables maps of anomalies to be more coherent if the same procedures are used everywhere. If there is missing data, this will clearly upset this consistency spatially. In principle this should still not be a problem as long as uncertainties: error bars, are appropriately calculated that fully account for the missing data. But while those details can perhaps be gone into in a paper it is difficult to do it well in IPCC. I think there are fairly standard sorts or requirements for % of data required in a month, months in season, and years etc for the result to have some credibility and it should really be linked to error bars. But most decisions have been ad hoc.
    This is where things like reanalysis can help fill spatial and temporal gaps if done right. But maybe that’s too ambitious here. All this is by way of saying I don’t know the answer. My guess is that yes it should have data in each decade and that it is OK to estimate a base value. But I would let Phil rule on this: if he is available (may be next week). But please do track the error bars.
    Russell Vose wrote:

    Hi guys…
    Dave Easterling thought I should drop the three of you an e-mail with a “base period” question. As I understand it, the idea is to use 1961-90 as the baseline for IPCC. But have any other subcriteria been discussed or otherwise set in stone? For instance, how many years of data must a station have during that period? Must it have at least some data in each decade? Is it okay to estimate a normal if the station lacks sufficient data during the base period?
    Thoughts/feedback appreciated.

    Russell S. Vose, Chief
    Climate Analysis Branch
    National Climatic Data Center
    151 Patton Avenue
    Asheville, North Carolina 28801
    Phone: (828) 271-4311
    Fax: (828) 271-4328

    Now, I have a problem if in the instrument period a station is acceptable if 4 years out of any decade is the minimum threshold and if 20 years out of the 40 years is okay.

    Yes, they talk about errors and missing data, and it clearly appears they are addressing those in terms of sizing the error bars. They then mention that a paper would explain the error bars, but for the IPCC, that is not feasible.

    What I have a problem with is two-fold. One, that the interpolation to infill the missing data is afterward treated as if it is solid, real data, and just as precise. The other is that with the anomaly graphs no one pays any attention to the error bars and everyone assumes the thin line trace is an exact number, down to the half of a tenth of a degree C. Especially the IPCC pols do.

    But 4 years out of 10 and 20 years out of 40 multiplies out to 2 years out of 10, or 8 years out of the 40 years. And how many days in those 8 years actually have data?

    And this is in the instrument period… Don’t even ASK me about what I think of the few data points they have to cover the globe in the proxy periods. Do wider error bars suffice? For statisticians, perhaps….

  20. RomanM
    Posted Nov 23, 2011 at 4:16 PM | Permalink

    It seems that a number of the emails contain embedded attachments of various types (pdf, excel, word, etc.). I have tried to find a decent MIME decoder on the web, but couldn’t find one that could extract the attachments. Can anybody suggest a method for easily doing the job?

    E.g., these emails contained the phrase “Content-Transfer-Encoding”, although some of them do not have attachments because the phrase appears to be from an email being replied to.

    [1] “0042” “0046” “0060” “0140” “0141” “0211” “0252” “0304” “0379” “0406” “0591” “0689” “0795” “0841” “0851” “0860” “0867” “0897” “0938” “0947” “1081” “1090” “1135”
    [24] “1427” “1521” “1615” “1676” “1723” “1909” “2053” “2056” “2074” “2078” “2089” “2141” “2161” “2358” “2363” “2367” “2376” “2410” “2450” “2544” “2596” “2599” “2642”
    [47] “2791” “2839” “2891” “2928” “2938” “2984” “3122” “3222” “3237” “3343” “3383” “3390” “3405” “3408” “3417” “3418” “3680” “3796” “3823” “3940” “3953” “4006” “4256”
    [70] “4364” “4503” “4602” “4677” “4722” “4842” “4936” “5056” “5135” “5145” “5146” “5155” “5209” “5337”
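A list like the one above can be produced with a short search over the text files. This is a sketch under the assumption that the emails are stored one per file with names like 0042.txt; the function name `emails_containing` is hypothetical and this is not necessarily how the list was actually generated.

```r
# Return the ids of the .txt files in `dir` that contain `phrase`.
# The phrase is matched as a fixed string, byte by byte, so oddly
# encoded emails do not stop the search.
emails_containing <- function(dir, phrase) {
  files <- list.files(dir, pattern = "\\.txt$", full.names = TRUE)
  hit <- vapply(files, function(f) {
    txt <- readLines(f, warn = FALSE)
    any(grepl(phrase, txt, fixed = TRUE, useBytes = TRUE))
  }, logical(1))
  sub("\\.txt$", "", basename(files[hit]))
}

# Hypothetical local folder; only searched if it exists on this machine.
email_dir <- "d:/data/climategate2"
if (dir.exists(email_dir)) {
  print(emails_containing(email_dir, "Content-Transfer-Encoding"))
}
```

As noted above, a hit only says the header text occurs somewhere in the file, so quoted replies will show up alongside emails with real attachments.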

    • Diogenes
      Posted Nov 23, 2011 at 8:08 PM | Permalink

      Try this for decoding:

      It worked on 0060, 0250, 0304, 0406, first of two files in 0591, second failed.
      Haven’t tried the rest.
      Paste the code in the window, save the file according to the filename listed.

      [RomanM: Thanks. I will give it a try.]

  21. DocMartyn
    Posted Nov 23, 2011 at 4:23 PM | Permalink

    in 2868.txt (2003)

    Tim Osborn sent the Met Office the Mann et al. 1999 series and uncertainties for comparison with HadCM3.

    contains the unfiltered series from 1000 to 1980, calibrated, so represent K anomalies

    is 1 and 2 standard errors.

  22. John Whitman
    Posted Nov 23, 2011 at 5:25 PM | Permalink

    I suggest the self-named ‘we’ who released the info of climategate 1.0 (Nov ’09) and of 2.0 (Nov ’11) did not time both the releases based on upcoming IPCC conferences.

    For the 1.0 climategate release it was likely timed with respect to the climatic buildup to the Copenhagen IPCC conference. Since Copenhagen there have been IPCC meetings/conferences prior to the upcoming Durban with no releases, and the imminent Durban conference looks to be impotent at best. So, Durban does not appear to be a significant reason for the major 2.0 climategate release.

    For this current 2.0 climategate release the timing, to me anyway, appears more likely based on the intervention by Mann in the court case of ATI’s FOI request for Mann’s info while at UVa. The evidence of this reason for the timing of release 2.0 is suggested because many of the emails are focused on Mann and cover the period Mann was at UVa; as well as other periods and other ‘Team’ members.

    As to whether ‘we’ is a single person or a number of people: looking back at the professional, cool execution of the releases and the patient strategy, I find it more likely that ‘we’ is indeed a group of professional people.


  23. Posted Nov 23, 2011 at 9:27 PM | Permalink

    A compendium in PDF format of the complete collection of mail messages:

    (It is 40mb)

    (May be convenient for iPad, Nook, Kindle, etc. users).

    • Posted Nov 24, 2011 at 12:23 AM | Permalink

      And here is a PDF in portrait form:

      • Posted Nov 24, 2011 at 1:58 AM | Permalink

        a concordance would be cool

        • Posted Nov 24, 2011 at 11:51 AM | Permalink

          Some kind of key word in context index should be quite doable…I’ll think about it.

        • Posted Nov 25, 2011 at 9:28 PM | Permalink

          I made a ‘key word in context’ (kwic) index from the mail messages, called kwic.txt, which is posted here:

          This file is 105mb. Once you have it on your machine, commands like:

          grep -i "cheque" kwic.txt

          …will give you a rapid sense of who is picking up what in the way of extra-curricular payments (sorry consulting). The benefit of the kwic index is that it is fast, and you see the context of the word. I also made a PDF of this file here:

          (The PDF file is 44mb)

        • Posted Nov 26, 2011 at 1:55 PM | Permalink

          I had a mistake in the script that created kwic.txt. I’ve fixed it now, and the updated file sizes are kwic.txt (197mb) and kwic.pdf (79mb). Yes, these files are large. They have a lot of redundancy because each sentence is reproduced several times with a different keyword highlighted. Their merit is that they quickly allow you to assess the context of a given term.
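For R users, the key-word-in-context idea can be sketched for a single keyword as follows. This is not the script that produced kwic.txt; the function name `kwic`, the bracket highlighting and the `width` argument are assumptions made for illustration, and the keyword is treated as a regular expression.

```r
# Build a simple key-word-in-context listing for one keyword.
# Each result shows up to `width` characters either side of a match,
# with the matched text wrapped in square brackets.
kwic <- function(lines, keyword, width = 30) {
  text <- paste(lines, collapse = " ")
  starts <- gregexpr(keyword, text, ignore.case = TRUE)[[1]]
  if (starts[1] == -1) return(character(0))      # no match at all
  ends <- starts + attr(starts, "match.length") - 1
  vapply(seq_along(starts), function(i) {
    left  <- substr(text, max(1, starts[i] - width), starts[i] - 1)
    hit   <- substr(text, starts[i], ends[i])
    right <- substr(text, ends[i] + 1, ends[i] + width)
    paste0(left, "[", hit, "]", right)
  }, character(1))
}

# Example on a made-up sentence.
kwic("Please send the cheque to the usual address.", "cheque", width = 10)
```

Running this over every email for every word of interest is what makes a full index so large, since each passage is repeated once per keyword.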

  24. MrPete
    Posted Nov 24, 2011 at 8:59 AM | Permalink

    I tried searching the online databases for IPPR and get a blank page. Anyone know how to do this search correctly?

    • Posted Nov 26, 2011 at 11:07 AM | Permalink

      You can use ‘grep’ in a linux or cygwin environment. Probably ‘midnight commander’ (as discussed here) can also be used. The files that contain the string ‘ippr’ are 1748.txt 3213.txt 4573.txt.

  25. Buffy Minton
    Posted Nov 24, 2011 at 9:11 AM | Permalink

    Apologies if this has already been done:

    Here’s the freshly decoded Mime attachments to the Climategate 2011 emails

    • RomanM
      Posted Nov 24, 2011 at 10:11 AM | Permalink

      Nicely done. Thank you. It will save a lot of time and effort.

      I notice that at least one of the attachments (a rather large ~10MB pdf) is included in the documents folder accompanying the CG2 emails. Most of the other attachments seem not to be otherwise reproduced in that folder.

      • Buffy Minton
        Posted Nov 24, 2011 at 10:25 AM | Permalink

        That’s quite possible……I only thought that there might be some duplication after I uploaded. There’s also various extraneous files generated by the decoder (Munpack on Linux) which I should have removed….but glad to be of help, anyway!

  26. jazznick
    Posted Nov 25, 2011 at 3:59 AM | Permalink

    Excellent e-mail search facility at

    This combines both Climategate 1 & 2 e-mails with a search engine!!!

  27. matthu
    Posted Nov 25, 2011 at 4:09 AM | Permalink

    Advice needed please!

    Let’s suppose we are using one of the following:

    Can anyone advise on advanced search techniques?
    e.g. looking for compound term, what delimiters can one use?

    use of wild cards?

    Suppose I want to search for BP – this seems to get rejected as being too short?


    • Bob Koss
      Posted Nov 25, 2011 at 12:06 PM | Permalink

      I don’t know about your other questions, but returns results for BP if you surround it with spaces.

      • matthu
        Posted Nov 25, 2011 at 1:51 PM | Permalink

        Thanks – I have since realised Ecowho seems to interpret compound phrases without any delimiters as well.

  28. Alix James
    Posted Nov 25, 2011 at 12:40 PM | Permalink

    Just wondering. At:

    there are files “missing”: is this a result of some not being released yet?


    • Bob Koss
      Posted Nov 25, 2011 at 1:06 PM | Permalink

      My understanding is missing emails probably concern strictly personal information such as health/family related.
