"Unprecedented" Data Purge At CRU

On July 31, 2009, the purge of public data at CRU reached levels “unprecedented” in its recorded history. Climate Audit reader Super-Grover said that the data purge was “worse” than we expected.

On Monday, July 27, 2009, as reported in a prior thread, CRU deleted three files pertaining to station data from their public directory ftp://ftp.cru.uea.ac.uk/.
The next day, on July 28, Phil Jones deleted data from his public file – see screenshot with timestamp in post here, leaving online a variety of files from the 1990s as shown in the following screenshot taken on July 28, 2009.

The following day, a listing of station data available since 1996 (discussed in my post CRU Then and Now) was deleted from public access: ftp://ftp.cru.uea.ac.uk/projects/advance10k/cruwlda2.zip, though other data in the file remained.

This morning, everything in Dr Phil’s directory had been removed.

This is part of a broader lockdown at CRU. Ian Harris, Dave Lister, Kate Willett, Tim Osborn, Dimitrios, Clive Wilkinson and Colin Harpham all altered their FTP directories this morning. Only one directory (Tim Osborn; see below) has added material.

Revisiting the Advance 10K webpage this morning, all Advance 10K data was deleted from their FTP site. None of the Advance 10K data links at http://www.cru.uea.ac.uk/advance10k/climdata.htm work any more.

If you go to the directory page ftp://ftp.cru.uea.ac.uk/projects, which formerly hosted the ftp://ftp.cru.uea.ac.uk/projects/advance10k directory, it now contains only two directories between Sept 1999 and the present, both dated 8/1/2008, but containing data from 2001.

On July 31, 2009 at 10:41 am, Tim Osborn published a webpage entitled “controversy.htm”. It is located in a folder entitled ftp://www.cru.uea.ac.uk/people/timosborn/censored/ and the webpage ftp://www.cru.uea.ac.uk/people/timosborn/censored/controversy.htm itself is of course censored. [Update: Later on July 31, Tim Osborn, obviously a faithful Climate Audit reader, censored the censored folder and even the existence of the censored folder (and the controversy webpage) is now censored.]

I presume that the data has not been totally destroyed, only that, after many years of public availability, it has been put under lock and key. It’s as though CRU is having a collective temper tantrum.

157 Comments

  1. Michael Ried
    Posted Jul 31, 2009 at 7:12 AM | Permalink

    So Thursday has come and gone. Did you ever get any kind of
    ‘destroy all files’ request from CRU?

    • henry
      Posted Jul 31, 2009 at 9:06 AM | Permalink

      Re: Michael Ried (#1),

      Michael Ried said: Did you ever get any kind of ‘destroy all files’ request from CRU?

      Seems so, only they misunderstood the request – Steve said he would destroy HIS copy. CRU agreed, and destroyed THEIRS.

      This will make the FOI requests more interesting, though.

  2. JustPassing
    Posted Jul 31, 2009 at 7:13 AM | Permalink

    Maybe those office party pics were a little too sensitive. 🙂

  3. Severian
    Posted Jul 31, 2009 at 7:14 AM | Permalink

    Hmmm…I can see a scenario playing out here:

    “You’ve deleted the data, but I have a copy from when it was up, and it doesn’t support your conclusions, there are errors.”

    “You don’t have the real, complete data, we have the complete data and it supports our conclusions.”

    “Show me the complete data then if mine is erroneous.”

    “We can’t show you the data, it’s secret, we have confidentiality agreements, showing it to you would destroy Science.”

    “I’ll file a FOI request.”

    “Oops, we deleted the data completely! Our bad.”

    Or some variation of the above. This is getting out of hand.

    • Steve McIntyre
      Posted Jul 31, 2009 at 7:24 AM | Permalink

      Re: Severian (#3),

      As I’ve repeatedly said (and I wish that readers would stop making such suggestions on my behalf), I have no reason to believe that there is any smoking gun in the withheld data. I’ve repeatedly said that the obstruction is entirely for CRU’s own commercial purposes – it would show how little work they actually do in the preparation of their climate index.

  4. Ryan O
    Posted Jul 31, 2009 at 7:34 AM | Permalink

    Isn’t this a bit like hitting the brakes after you’ve gone off the cliff?

  5. Tamara
    Posted Jul 31, 2009 at 7:36 AM | Permalink

    As my company’s legal department constantly warns us, even deleted files are discoverable. What is CRU accomplishing by doing this?

  6. Posted Jul 31, 2009 at 7:59 AM | Permalink

    So the timeline has passed without request for deletion?

    Steve: Yes, as I mentioned on the other thread. I’m thinking of contacting the FOI officer at CRU as to his position on the files that I downloaded already. It would probably make sense to FOI the deleted files as well.

  7. Barry
    Posted Jul 31, 2009 at 8:22 AM | Permalink

    I don’t know about the good folks of Great Britain. But if Dr. Phil Jones were a scientist on my US tax payer dollar I would be demanding his head.

    I could easily claim that pigs can fly and that my data analysis can prove it, but if I never release the data and methods what good is my claim. Somebody needs to verify my conclusions for my assertions to start the journey down the road from hypothesis to scientific fact. Just imagine if Sir Isaac Newton had claimed that his findings were “confidential”.

  8. Henry chance
    Posted Jul 31, 2009 at 8:22 AM | Permalink

    In the course of destruction of files and reports, the groundwork for indictments in financial fraud relies on data backup systems. Destruction of files can get people criminal charges. As people get near to a meltdown or some major confrontation, in panic it is common to see people destroy data. Just remember if they get fired or ever get taken to court, this brings them great emotional misery when folks ask them “what were you thinking when you destroyed the files”?
    Read up on interviews that are conducted on criminals by trained investigators.

  9. Steve McIntyre
    Posted Jul 31, 2009 at 8:37 AM | Permalink

    As others have observed, it is unlikely that data has been “destroyed”; more likely, it has been put under lock and key. At this point, all we can say is that the behavior is very unseemly. I’m not familiar enough with the provisions of the Freedom of Information Act or the Environmental Information Regulations to say whether this behavior is problematic under that legislation and I doubt that most readers are either. Nor do I see any purpose in such speculations until the provisions of that legislation are examined.

    CRU’s behavior is very strange for people used to business-like organizations. It’s as if the organization is having a temper tantrum. More like a 4-year-old child than a professional organization.

    • DaveR
      Posted Jul 31, 2009 at 9:04 AM | Permalink

      Re: Steve McIntyre (#10),

      At this point, all we can say is that the behavior is very unseemly.

      I’m not even sure we can say that. To me it looks like a petty squabble between you and Phil Jones from which neither party emerges with much credit.

    • Greg F
      Posted Jul 31, 2009 at 9:40 AM | Permalink

      Re: Steve McIntyre (#10),

      I would suggest that everything on the FTP server is a copy. Put there so another “academic” researcher could download it.

  10. Fred
    Posted Jul 31, 2009 at 8:55 AM | Permalink

    1. Lock down the data fort.

    2. Hold breath until blue.

    3. Pack belongings and threaten to run away from home.

    The kiddies are having a moment.

  11. John Luft
    Posted Jul 31, 2009 at 9:07 AM | Permalink

    This is the climate equivalent of the Enron paper shredders.

  12. clazy
    Posted Jul 31, 2009 at 9:12 AM | Permalink

    SM: Thanks for being a paragon of civility and reason. You do a great job protecting your comment section from the base rhetoric diminishing discussion on so many other climate blogs.

  13. Posted Jul 31, 2009 at 9:22 AM | Permalink

    Here’s a theory: People at CRU were being sloppy and storing things they aren’t supposed to store on the public ftp site. (Example: the pdf file for chapter 15 of Robok. The publisher might be displeased to see a copy freely available on the web.)

    Steve’s discussion made someone higher up remind them that this sort of stuff shouldn’t be on the public ftp server. So, everyone was given a directive to clean their ftp directories up. Things will now be archived where they belong.

    This would make disappearance of the files in nearly all directories both innocent and natural. (Still, sort of funny as heck to watch from afar. But innocent enough.)

    • Patrick M.
      Posted Jul 31, 2009 at 9:31 AM | Permalink

      Re: lucia (#16),

      Yeah, I think you got it right. I imagine the network admin told them to get all their files off the ftp server NOW! They will figure out what belongs on the ftp server later and I imagine the files that belong there will start showing up again, (after the network admin okays it).

    • Jonathan
      Posted Jul 31, 2009 at 10:39 AM | Permalink

      Re: lucia (#16), during my first postdoc I used to store all sorts of things on the ftp server which weren’t really “meant” to be there. Firstly, it was an easy way to ensure that I could get to critical files from anywhere (this was in the old days when we were still using tools like Archie and Veronica for file sharing), and secondly it was a nice way of bypassing quota limits. For researchers of a certain age this sort of behaviour is unsurprising.

  14. stan
    Posted Jul 31, 2009 at 9:33 AM | Permalink

    DaveR,

    This isn’t about a petty squabble. It’s about whether Phil Jones does science.

    If you don’t make your work transparent so others can audit and replicate it, it should NEVER be the source of public policy. The reasons cited by certain “scientists” for refusing to share their data, methods, stats, etc., may or may not make sense. Regardless, their work becomes inappropriate for public decision-making because the work becomes untrustworthy. It has been placed outside the realm of the scientific method. It’s no longer science. It’s hot air.

    Science never rests on anyone’s unsubstantiated work. In real science, unsubstantiated claims amount to nothing more than untested hypotheses. If scientist X claims his work demonstrates Y but he won’t allow his work to be examined critically, his claim is nothing more than an interesting rumor. No rational person would ever reach a conclusion on such a claim. To do so would make even less sense than entrusting investment money today with Bernie Madoff.

    That climate scientists do is beyond my ability to understand.

    • Jaye Bass
      Posted Aug 1, 2009 at 4:36 PM | Permalink

      Re: stan (#18), Exactly.

      Here is what is really going on. Certain factions are trying to use environmentalism like commerce (extension of the definition of interstate commerce) was used earlier in the century to chip away at the constitution. Blogs, an inquisitive public and the scientific method are just impediments to these people.

  15. rephelan
    Posted Jul 31, 2009 at 9:50 AM | Permalink

    Maybe they wanted to work from home and the only way to get to the data was to put it on the FTP site because those over-officious, security-obsessed administrators wouldn’t grant access through the fire walls…. yeah, there may be a really good reason why stuff is disappearing from the public directories.

  16. Scott
    Posted Jul 31, 2009 at 9:55 AM | Permalink

    Yeah, I doubt that they have anything “requiring” them to host the information out on a public FTP site. I would see it more as a courtesy that they did than anything. Removing it now strikes me as childish, very much closing the barn door after all the horses are already gone.

    More disturbing is their continued resistance to FOI requests.

  17. DERise
    Posted Jul 31, 2009 at 10:14 AM | Permalink

    An external auditing system is the bedrock of a quality program. I work within and am part of a program with a robust audit system involving government and private industry. Being on the receiving end can be like a painful prostate exam, but our accident loss rate is zero since program start over 40 years ago.
    NASA, however, is an insular program without an external audit program and the results are telling, and fatal.
    For a true quality outcome, anyone should welcome a knowledgeable audit (unless proprietary information is involved).

    snip – attribution of motive

    • Patrick M.
      Posted Jul 31, 2009 at 10:43 AM | Permalink

      Re: DERise (#22),

      [snip]

      .
      .
      Those who want to claim the high ground of “Science”, need to remember that “Science” means keeping an open mind and waiting for the facts before coming to a conclusion.

      • DERise
        Posted Jul 31, 2009 at 11:48 AM | Permalink

        Re: Patrick M. (#24),
        snip

        External audits, as a practice, though painful, are an improvement process where errors are found and corrected, and the root cause is found and corrected to prevent recurrence, i.e. improve the product, be it rockets, valves, or climate predictions.

        The NASA example I gave is an organization which following an accident with loss of life, adopted a program modeled on our program, but it does not have a robust audit system. They have since had additional fatal accidents.

        Phil Jones has stated that he doesn’t want someone else to find errors in it, –
        snip

        Though in consideration, upon reading your comment from many different sides, I would conclude that the reasons that a person/organization would not welcome an audit would be: arrogance, complacency, fear, malfeasance, [snip], cost and feel free to offer any other excuses.

      • Artifex
        Posted Jul 31, 2009 at 1:18 PM | Permalink

        Re: Patrick M. (#24),

        Those who want to claim the high ground of “Science”, need to remember that “Science” means keeping an open mind and waiting for the facts before coming to a conclusion.

        Based on your prior comments, I believe you may labor under some misconceptions. I would claim “Science” is a bit more than keeping an open mind else we would all believe in aromatherapy, dianetics and haruspicy. The very soul of science is to model an event with theory and then examine how that theory stands up to other known facts.

        You are correct that I don’t know Dr. Jones’s motives. I probably never will unless he clarifies things. I can theorize however, and Lucia’s theory looks pretty compelling from where I sit. Given your commendable dedication to scientific principles, care to forward a theory that takes into account what we know about Dr. Jones’s stonewalling and prior action and paints Jones in a positive light? I must admit I am struggling with this one. I can visualize theories based on everything from fraud to extreme annoyance that one of the hoi polloi would dare question the high priesthood, but I am having issues finding theories that fit in with pure motives. I will probably never know, but I am certainly allowed to speculate.

        Even the above is a bit of a distraction from the real issue. If we are going to make massive changes to Western civilization, what possible legitimate reason could one have for hiding data and methods ? In the long run, maybe the only weapon that Steve has against this sort of institutionalized stonewalling is ridicule. While you may be less than pleased to see your sacred ox gored, it seems that the ridicule is well deserved. It sure seems to me that you are directing more vitriol towards the man with the spotlight than the clown in the ring. All that needs to happen is for the clown to take off his nose, and floppy shoes and hand over the data. We can then all go back to watching the acrobats in the spotlight, eat our popcorn and see how the real science shakes out.

        • Patrick M.
          Posted Jul 31, 2009 at 3:11 PM | Permalink

          Re: Artifex (#35),

          Given your commendable dedication to scientific principles, care to forward a theory that takes into account what we know about Dr. Jones’s stonewalling and prior action and paints Jones in a positive light?

          No, I’m not going to guess about Jones’ motivation because it’s not productive and it’s not allowed on this blog.

          Based on your prior comments, I believe you may labor under some misconceptions. I would claim “Science” is a bit more than keeping an open mind

          My comment about keeping an open mind and waiting for the facts was directed at people assuming that because Jones has not released the data there must be something wrong with the data. Steve has already hinted that he suspects that there may be nothing “wrong” with the data.
          .
          .
          I have no problem with people pointing out that Science is based on reproducibility and that Jones’ obstruction of reproducibility prevents Science from being applied in this case. Indeed, that goes straight to the core of Steve’s argument.

          Steve: It’s not that there are no issues. IMO, the key issues pertain to things like potentially excessive reliance on urban airports (the CRU_TAR index) as opposed to something like Mannian principal components. Maybe it matters, maybe it doesn’t. The point is that you need to be working with common data sets to do the analysis.

          One huge advantage of dealing with CRU rather than GISS is that the CRU algorithm, though still unseen, seems far more transparent than the endless GISS blending where it’s very hard to replicate the calculation even with the code available, let alone analyze it for sensitivity.

        • Artifex
          Posted Jul 31, 2009 at 3:52 PM | Permalink

          Re: Patrick M. (#44),

          My comment about keeping an open mind and waiting for the facts was directed at people assuming that because Jones has not released the data there must be something wrong with the data. Steve has already hinted that he suspects that there may be nothing “wrong” with the data.

          This seems completely reasonable. That being said, I have never seen a “perfect” data set in my life. The main problem with this one is its absence. Once we have it, we can resume quibbling over how important its inevitable imperfections actually are.

        • steven mosher
          Posted Aug 1, 2009 at 1:45 AM | Permalink

          Re: Artifex (#46),

          Let me put SteveM’s comments about the CRU data in perspective. If you compare the two major temperature indexes ( CRU and GISS) you will find very little difference. There are four major differences that I see.

          1. Stations used ( and they are not that different)
          2. How they treat the polar region ( a larger difference)
          3. How they treat grids that have both a land and sea component.
          4. Start date

          It would be instructive to compare GISS without the polar region with CRU. I suspect people will find very little difference. What this means is that CRU adds NO VALUE over what GISS already do. Virtually none.
          They use roughly the same stations, handle area averaging roughly the same way, and cover roughly the same time periods. CRU adds nothing but mystery and intrigue. GISS, for all its faults, has open data, and open source, and they have the added bonus of being able to “extrapolate” to get coverage for the N pole. Once SteveM gets the data and stations sorted out people will see that CRU adds nothing over the value ( limited and flawed as it is) of GISS. Nothing. That is my takeaway from SteveM’s point. So when people say that “both CRU and GISS show warming” it’s just a form of double counting. CRU adds nothing of value. Once people get sorted on that we will be back to the only real issue in town. UHI and microsite bias.

        • thefordprefect
          Posted Aug 1, 2009 at 6:10 AM | Permalink

          Re: steven mosher (#53),

          I was going to do a comparison of raw GISS data from grade 1 and grade 2 sites (using the data which the public have provided to Watts for his surface station project) with the graphs in the NCDC Talking Points Memo. Watts suggests that this is invalid as both high grade and all plots have the same correction applied. He also says that the station list they used was out of date.
          A polite request was put in his blog for the full updated database to be posted. Nothing has been updated. And I note from his blurb that this will not be updated until after the “smoking is not dangerous group” The Heartland Institute publish his document.
          Watts claims that:

          “…my published book “Is The U.S. Surface Temperature Record Reliable?”. I hold the copyright on this book. The notice for copyright is in the inside front cover.”

          But the book is freely available on the web and according to you, if it is on the web it is fair game! Even if commercial interests are involved!

          Lukewarmer: “Free the data;Free the code;free the debate”

        • TerryS
          Posted Aug 1, 2009 at 7:29 AM | Permalink

          Re: thefordprefect (#54),

          But the book is freely available on the web and according to you, if it is on the web it is fair game! Even if commercial interests are involved!

          Anthony holds the copyright on the book so he is free to distribute it under whatever license terms he wishes. The book itself contains copyright notices informing the reader that they may not redistribute in part or in whole. This means that while you can download it you can not redistribute it.

          On the other hand, the CRU’s ftp site does not contain any copyright notices or terms and conditions for download. You are therefore free to download anything you find there. Should you wish to redistribute anything you find there then CRU would have a difficult task in claiming any breach of IPR because of the lack of notices. They might prevail claiming infringement of “Database Rights” or “Sweat of The Brow” but they would have a hard time proving it.

          You should note that if the IPR for any of the files is held by third parties then it is the CRU who infringed them by putting the files on a public site rather than anybody who downloaded it.

        • Posted Aug 1, 2009 at 9:12 AM | Permalink

          Re: TerryS (#58),

          On the other hand, the CRU’s ftp site does not contain any copyright notices or terms and conditions for download. You are therefore free to download anything you find there. Should you wish to redistribute anything you find there then CRU would have a difficult task in claiming any breach of IPR because of the lack of notices.

          Not true, copyright is established when the work is created (reduced to tangible form) no registration or notices are required, CRU has copyright to data they put on their site.

        • QBeamus
          Posted Aug 4, 2009 at 9:22 AM | Permalink

          Re: Phil. (#61),

          To be more precise, the absence of the copyright notice means that there can be no legal liability for copying or distributing the work. While you are technically correct that the copyright springs into existence when the work is reduced to a tangible medium, the subsequent publication without notice bars one from asserting that right. There are some circumstances under which one might revive that right, but they won’t create liability for acts taken before the copyright is “revived.” To oversimplify, if a work is published without a copyright notice, it creates a safe harbor in which the public can use the work without fear.

        • steven mosher
          Posted Aug 1, 2009 at 9:15 AM | Permalink

          Re: thefordprefect (#54),

          More on this later when time permits. First, you’ll be hard pressed to find me saying “if it’s on the web, it’s fair game.” WRT copyrights, copyrights need to be acknowledged and honored according to the law with the appropriate exemptions. I haven’t reviewed the Watts video, but from what I gather somebody used some of Anthony’s material ( probably a fair use use) without showing the proper copyrights and an appropriate take down action was taken. Most of my work these days, Ford, is copyleft. In copyleft we post our material with a copyleft (GPL) license. We want you to copy our stuff, but we have rules about what you have to do if you copy it ( always supply your source code, license back modifications). And we sue people all the time, successfully, for taking what we freely offer and refusing to allow others to copy derivative works. http://gpl-violations.org/ You will note of course that there are exceptions for the academic use of copyrighted material. Something I learned when I taught at the university. So WRT copyrights, unless I was drunk posting you’ll not find me saying that you can take somebody’s copyrighted material and exploit it for your own commercial materials ( Long ago I was a part of the movement that fought against RIAA, but there too this was about fair use exceptions to ) I’ll touch on surface stations data when I get back. But in general I’ll say that until data and methods are published and replication is possible, I don’t consider it to be science.

        • Keith
          Posted Aug 1, 2009 at 1:02 PM | Permalink

          Re: thefordprefect (#54),

          TFP – actually, the information is available. Just go to surfacestations.org . Every site review is available. I think you are just wanting a list of the rating 1 & 2 sites, and Anthony may not have compiled such a list. He does say they are going back through, visiting and re-evaluating some of the sites, as USHCN has discontinued some of the sites, and improved some of the others. He is being thorough. But the data is available, you just have to work to find it. Much like Steve and Anthony had to work to find the current version of CRU data that the “mole” released for them. It might take you a day or so to go through the individual station listings, but you yourself can compile a list of those stations with good ratings. Then you can do your own analysis.

        • thefordprefect
          Posted Aug 1, 2009 at 1:27 PM | Permalink

          Re: Keith (#70),

          TFP – actually, the information is available. Just go to surfacestations.org . Every site review is available. I think you are just wanting a list of the rating 1 & 2 sites, and Anthony may not have compiled such a list.
          For the last few days this is the info from the photo pages

          Maintenance
          Site is temporarily down for maintenance.
          Admin Login

          The station list is here:
          http://www.surfacestations.org/USHCN_stationlist.htm
          with this note at the top:
          .
          NOTE: This is NOT Current data – 4/18/2008

        • Keith
          Posted Aug 1, 2009 at 2:24 PM | Permalink

          Re: thefordprefect (#73),
          The tag line you cite refers to the fact that the USHCN listing is not current. USHCN has changed their station listings in the past 12 months. As for the station surveys being under maintenance, I am sure they will be back up soon. Anthony admits he has been traveling the last two weeks. He might have new information that he is adding to the survey compilation. He might even be compiling a quick reference list, for each different station rating. I do know the surveys were available at the beginning of July, as a station near my home was surveyed, and I checked on the station rating. It was one of the new rating 2 listings.

        • Anthony Watts
          Posted Aug 1, 2009 at 8:48 PM | Permalink

          Re: Keith (#80), Most of that 2 weeks travel has been to put bread on the table. I’ve surveyed all of 3 stations, 2 of which were closed and will require follow up research to determine the site rating.

        • Anthony Watts
          Posted Aug 1, 2009 at 8:56 PM | Permalink

          Re: thefordprefect (#77),

          For the last couple of days I’ve been getting numerous hack attacks and DOS attacks due to my increased popularity with the alarmosphere I suspect.

          Rather than worry about the security issues individually, I simply put the server into safe mode. In the over 2 years now that the image gallery website has been publicly available, this is the only time I’ve ever had to do this for an extended period.

          I’m also running an offsite backup (since the server resides over 90 miles away from my office) and that takes time, it is very slow. I put the gallery server outside of my own office network so that my own business would not be at risk from people with no scruples. Funny, on one continent we have people trying to hide data and on another people trying to destroy it.

          My actions are prudent I think in view of recent circumstances.

        • steven mosher
          Posted Aug 1, 2009 at 9:53 PM | Permalink

          Re: thefordprefect (#54),

          I would associate myself with McIntyre’s comments here. Personally I would like to see the data posted under a GPL-like license that would require everyone who USES the data to create published material to make their analysis code available under a GPL license. In short, if you use this data to produce a study, any code that uses this data must also be published. That would be a cool thing. This would prevent somebody like NOAA from taking open data, running it through closed code and producing results.

        • Geoff Sherrington
          Posted Aug 1, 2009 at 6:49 AM | Permalink

          Re: steven mosher (#53),

          If you look at my graph in #51, how can you justify saying that GISS has any intrinsic value, whether or not it is similar to CRU? The five sets of lines on that graph – and there are thousands more stations in the same plight – cannot all be right. We are talking about differences exceeding a degree C a year. That’s a lot of correction.

          As it stands, it’s a precarious investment of maths/stats time trying to work with temp data like these. Especially GISS.

        • thefordprefect
          Posted Aug 1, 2009 at 7:10 AM | Permalink

          Re: Geoff Sherrington (#56), looking at your graph it is a bit difficult to see, but to me it looks as if there is a constant offset of almost a degree on the giss data.

          to do a comparison you need to do a monthly average over a period (say 30 years) for each plot then create an anomaly graph. It looks like the KNMI data could show a large difference.
          Re: bernie (#55), Ok apologies for the Heartland thing, it is not very relevant just that I would not want my work published by ANY pressure group IF I wanted it taken seriously.
          As to the data,
          1. it has been created by the public
          2. Where’s the harm in keeping it updated?
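
The anomaly comparison suggested in this comment (average each calendar month over a base period of, say, 30 years, then subtract that average from every reading) can be sketched roughly as follows. This is only a minimal illustration under stated assumptions: the function names, the `(year, month)` dictionary layout and the 1961-1990 base period are hypothetical choices, not anything taken from GISS, CRU or KNMI.

```python
from collections import defaultdict


def monthly_climatology(series, base_start, base_end):
    """Mean temperature for each calendar month over the base period.

    `series` maps (year, month) -> temperature in deg C (assumed layout).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for (year, month), temp in series.items():
        if base_start <= year <= base_end:
            sums[month] += temp
            counts[month] += 1
    return {month: sums[month] / counts[month] for month in sums}


def anomalies(series, base_start, base_end):
    """Subtract each month's base-period mean from every reading."""
    clim = monthly_climatology(series, base_start, base_end)
    return {
        (year, month): temp - clim[month]
        for (year, month), temp in series.items()
        if month in clim  # skip months with no base-period data
    }


if __name__ == "__main__":
    # Two made-up series that differ only by a constant 1 deg C offset.
    station_a = {(y, m): 10.0 + m + 0.5 * (y - 1961)
                 for y in range(1961, 1991) for m in range(1, 13)}
    station_b = {key: temp + 1.0 for key, temp in station_a.items()}

    anom_a = anomalies(station_a, 1961, 1990)
    anom_b = anomalies(station_b, 1961, 1990)
    largest_gap = max(abs(anom_a[k] - anom_b[k]) for k in anom_a)
    print(f"largest anomaly difference: {largest_gap:.6f} deg C")
```

Under these assumptions a constant offset between two versions of a record drops out of the anomalies, so whatever difference is left reflects a change in shape rather than in level.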

        • David Cauthen
          Posted Aug 1, 2009 at 7:43 AM | Permalink

          Re: thefordprefect (#57),

          A suggestion. After Watts’ work is published, simultaneously ask Watts and Jones for their respective code, data, methods, etc.

        • Anthony Watts
          Posted Aug 1, 2009 at 8:46 PM | Permalink

          Re: David Cauthen (#59),

          For the record, once my work is published in a journal I plan on making it all available of my own volition.

        • David Cauthen
          Posted Aug 2, 2009 at 7:21 AM | Permalink

          Re: Anthony Watts (#88),
          Anthony, there was never any doubt in my mind that you would. Nor, I would guess in fordprefect’s either.

        • Geoff Sherrington
          Posted Aug 5, 2009 at 7:39 PM | Permalink

          Re: thefordprefect (#57),

          Please explain why I should have to fiddle with official data to make it look better?

          My data are taken directly from the public sources noted, the same way an aspiring author in climate science might do. If his/her results turn out wrong, it is not because I failed to warn of suspect official data. The problem is that some of the official data are WRONG.

          A temperature in Deg C taken on a day is a more basic measurement than one normalised to some arbitrary time period which itself carries estimation errors. You are encouraging the conduct of science in indirect ways when the primary way is better. Why?

          I have shown but the tip of the iceberg and I have often chosen favourable cases as well. There are plenty of worse ones. Remember, they all derive from the Australian Bureau of Meteorology, which I consider has done a collection/collation job as good as any I have seen worldwide.

        • tetris
          Posted Aug 1, 2009 at 9:59 AM | Permalink

          Re: steven mosher (#53),
          Mosher:
          Interesting point about the lack of value added in the CRU data. However, as Geoff Sherrington points out in #56, the GISS data is nothing to write home about. We all know that the GISS data at the source level is full of holes, some of them, as has been demonstrated on this site, big enough to drive an 18 wheeler through, and that to boot it has been “re-calibrated” and “adjusted” to the point of stretching credulity. So if we only have the GISS data to work with, what level of confidence does that give us about surface temperatures?
          And by the way I like the “luke warmer” credo, but as a skeptic it’s been my motto all along.

        • Steve McIntyre
          Posted Aug 1, 2009 at 10:50 AM | Permalink

          Re: tetris (#64),

          I don’t think that Steve Mosher has expressed this point quite right tho I think I know what he means.

          Neither CRU nor GISS add much value in the sense that neither agency does any quality control. Nor it seems does either agency have ongoing programs to do special technical studies to re-examine and improve historical collections (e.g. neither seems to have done any digitization of colonial records in over 20 years of the type recently done by Christy et al.) As Gavin Schmidt observed, GISS allocates about 0.25 man-years to their temperature index and carry out ongoing quality control.

          My guess is that CRU allocates less time.

          The computing effort in making a gridcell average of GHCN and MCDW station data is negligible once you have mechanics for collating data.
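
          To give a sense of how little computing is involved, here is a minimal sketch of a gridcell average. It is not CRU’s or GISS’s code: the 5-degree cell size, the cosine-of-latitude weighting, the function names and the demo values are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def gridcell_means(stations, cell_deg=5.0):
    """Average station values inside each lat/lon cell of size cell_deg.

    stations: iterable of (lat, lon, value) tuples, e.g. anomalies for one month.
    Returns {(lat_index, lon_index): mean value}.
    """
    cells = defaultdict(list)
    for lat, lon, value in stations:
        key = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        cells[key].append(value)
    return {key: sum(vals) / len(vals) for key, vals in cells.items()}


def area_weighted_mean(cell_means, cell_deg=5.0):
    """Combine cell means, weighting each cell by cos(latitude of its centre)."""
    num = 0.0
    den = 0.0
    for (lat_index, _lon_index), mean in cell_means.items():
        centre_lat = (lat_index + 0.5) * cell_deg
        weight = math.cos(math.radians(centre_lat))
        num += weight * mean
        den += weight
    return num / den


if __name__ == "__main__":
    # Invented anomalies: two stations share one cell, a third sits in another.
    demo = [(-37.5, 149.9, 1.0), (-36.0, 148.0, 3.0), (52.0, 1.0, 0.5)]
    cells = gridcell_means(demo)
    print(cells)
    print(area_weighted_mean(cells))
```

          The hard part is everything upstream of this step: collating the station files and deciding which stations and adjustments go in.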

          CRU’s systems look less weird than GISS’ and we’ll be able to emulate CRU in fairly short order when I start on it.

          The problems with the GISS method are the endless numbers of pointless operations – the Marvellous Toy syndrome. But at the end of the day, they are simply pointless operations and you still end up with an average of station data. The issues pertain to UHI and microsites as they always have.

          All this exercise does is clear the brush for access to these issues. We’ll see EXACTLY what they do and this can begin the process of analysis.

          But as I’ve said over and over, there’s not going to be a Mannian principal components moment in this stuff.

        • tetris
          Posted Aug 1, 2009 at 2:35 PM | Permalink

          Re: Steve McIntyre (#66),
          Steve,
          Thx for the rejoinder. As an aside, the HadCru situation highlights the issues of conflict that arise when government agencies act as “businesses” [ref: TAG’s comments in # 72].

        • steven mosher
          Posted Aug 2, 2009 at 10:44 AM | Permalink

          Re: tetris (#64),

          Tetris, thanks. I think Geoff may have been reading too much into what I wrote, so let me clarify. I think CRU offers no added value (scientifically speaking) over GISStemp. It can’t, and it can’t primarily because of lack of access to the data and the code. It’s mere anecdote. That means the science of record falls on Hansen’s shoulders: flawed as it is, GISStemp represents the best science has to offer today for an estimate of global temperatures. (RSS and UAH are their own can of worms.) That should be absolutely shocking to all the people who have looked at the process by which GISStemp is constructed.

          Let’s start with the patchwork of data. In today’s age it simply boggles the mind that there isn’t a cleaner, better organized repository of climate data (historic and current). The lack of metadata describing the history and evolution of sites and instruments is astounding. After a ton of volunteer hours we are finally getting to a point where the stations in the US have their CURRENT status documented (photos, etc). But researching the history of stations and snapping photos of them today is rather mundane, not a lot of money in that. No photo ops of scientists on glaciers there. No creation of exotic new statistical methods there. Just the drudgery of checking records, and rechecking.

          Next look at the methods of GISStemp. You’ve got some automated QC, and you also have some hand editing with no records to back up those decisions. When you move on to the code you are in for an even bigger shock. And then you realize that a good portion of climate science rests on this shoddy work.

          Theory is supposed to account for observations, past observations. And it is supposed to make predictions about future observations. All of this depends on getting the best record of observations you can. As I’ve said, if GISStemp went away tomorrow we will still be left with the fact that GHGs warm the planet. So this is not about “falsifying” AGW. It’s simply about doing better science. For now, GISStemp is the best you have for a global record. It’s all you have. Is it flawed, yup. Now go write a grant to get $$ to improve GISSTemp. Probably ain’t gunna happen.

          Lukewarmer: Free the data; free the code; free the debate

        • tetris
          Posted Aug 2, 2009 at 11:36 AM | Permalink

          Re: steven mosher (#104),
          The whole kerfuffle should not be about the accessibility of verifiable data and code and the ability to have a well informed debate on that basis, but about actually proving the contention you state as “fact” in the 3rd line from the bottom. Because as I hope that we all understand by now, things climatic are definitely not as straightforward as that.

        • steven mosher
          Posted Aug 2, 2009 at 12:24 PM | Permalink

          Re: tetris (#106),

          Tetris, you and I will have to agree to disagree on this. But let me modify what I said a bit for clarity. “Fact” is probably the wrong locution. I would say this. The scientific theory, basic physics, that increasing GHGs will increase the planet’s temperature is well established, well confirmed science. Now, I don’t want to get into a discussion about the details of this (how this happens, what’s the exact relationship, etc etc); those discussions quickly devolve into nonsense. Those discussions become so divergent, in part, because the most fundamental data has not been cared for properly. In any discussion of the relationship between GHGs and temperature you have two observational elephants in the room. GHGs and temperature. Any attempt to either reconfirm the basic physics (more GHGs = more warming) or disconfirm that theory must start with the elephants. And we can’t even examine the temperature data when it is kept under lock and key. Steve’s approach is one that I like. Understand the state of the accepted science. Start your investigation with the most straightforward investigation around. Where are the observations and are the files all in order. Basic lab procedure. Check your instruments. So, until that is done I’m not going to join other discussions. Let me give you one of my favorite examples: people who both criticize the temperature record and simultaneously try to find correlations between sunspots and that record. I have no patience for that.
          Now, of course, some will look at this and say.. the temperature record is “good enough” let’s move on to other discussions. Fine, let them move on. I’m not interested in those discussions. Don’t care about ice, or clouds, or aerosols, or sun spots, or any of that. Why? because in the end all of those fleas land on the elephant. I will however, bet on sea ice and hurricanes, maybe bet on sun spots as well. Others, of course are free to debate this stuff. I’m free not to join that. Still working on getting the data free and code free.

          Lukewarmer: free the data; free the code; free the debate.

        • thefordprefect
          Posted Aug 2, 2009 at 1:05 PM | Permalink

          Re: steven mosher (#107),
          Believe it or not I very much agree with your statements!

          Data should be verifiable by all with an interest. It is very sad that some is “commercial” (this is not CRU’s fault).
          I would like to find AGW to be a mistake – otherwise we may be in the poo already.

          No temperature recordings up until a few years ago were set up for particular accuracy. They were there just to answer: wow it’s cold today I wonder how cold?
          There is NOTHING that can be done to make the methods/recording/location better. It is obviously too late. But this inaccurate log is all there is. If AGW is a fact then waiting another 100 years to say “yup!, the planet’s over heating” is just not on. One is left with this log, glaciers, cherry blossom, grape harvest, sea ice and a few other indicators to judge how to face the future.
          climate models may eventually help, but they are too much in their infancy at present.

          If we had to decide what to do now, what indicators can be used?

          Mike

        • steven mosher
          Posted Aug 2, 2009 at 3:54 PM | Permalink

          Re: thefordprefect (#108),

          As a lukewarmer I’m used to taking it from both sides, until emotion clears
          and then some common ground emerges.

          Data should be verifiable by all with an interest. It is very sad that some is “commercial” (this is not CRU’s fault).
          I would like to find AGW to be a mistake – otherwise we may be in the poo already.

          Well, I’m certainly not interested in casting blame or not. As I’ve said, the “best” record we have is GISStemp. I say best in terms of data availability and as you know the code, while something of a mess from professional standards, is at least open and functioning. So, I would dustbin CRU and the whole issue of data availability, what contracts, etc goes away.
          If CRU or others want to improve on GISS, then they have a simple path forward. If they feel like different stations would be better, they can add those stations. In doing this they can take care to note and document which are Open and which are closed. Pressure can be brought to bear to open all the data they desire to use. But this current state of saying data is closed but not being able to revisit that or change that or saying the contracts are lost is just not acceptable. As for AGW being found to be a mistake? I would not hope for it to be found a mistake. A lot of science would have to unravel for that to be the case. here I take AGW to be the following bare claims:
          1. Humans are adding GHGs to the Atmosphere.
          2. Increasing GHGs warms the planet ( rates and limits and feedbacks and exceptions TBD)

          WRT to us being in the poo. I haven’t seen any science that indicates that I will be in the poo. I live on high ground away from hurricanes in a climate that could use a few more warm days and warmer nights. The question really has to do with future generations which turns on the question of the rate of change and the ethical question of intergenerational obligations and transfers of funds. Frankly, I’m not willing to incur a current cost to fund the ability of some future child of a movie star to live a life of luxury in Malibu, CA with landslides and fires to the right of them and high tides to the left. (I always shake my head when driving through there.)

          No temperature recordings up until a few years ago were set up for particular accuracy. They were there just to answer: wow it’s cold today I wonder how cold?

          All we will ever have are various proxies for temperature. These proxies (liquid in a glass, d18O in ice, rings in trees, etc etc) have differing accuracies, sampling distributions, biases etc. It’s a tough but tractable problem. There is no reason whatsoever to complicate the problem by hiding data, obscuring methods, and cherry picking results. There is no reason whatsoever to limit papers on the problems to 20 pages. It boggles the mind.

          There is NOTHING that can be done to make the methods/recording/location better. It is obviously too late. But this inaccurate log is all there is.

          Well, we don’t know that. It was funny when Anthony started surface stations what people said. Let me give a sampling:

          1. Your sample will be limited and skewed to areas close to urban areas.
          Wrong.
          2. Photos have no value. Oops, people find out that photographing the site is a required practice for CRN.
          3. You can’t tell what a site was like many years ago. Oops, old photos and records pop up.
          4. Microsite bias will even out. Oops, the few of us who looked at it found a slight positive bias (warmer), in line with theory of course.

          5. The world is oversampled, don’t worry about errors. But then… NOAA
          upgrades stations and decommissions others

          6. We can’t throw out faulty stations we need the coverage, we will adjust
          them instead.

          With respect to improving the record I would suggest a couple of areas.

          1. A re-examination of the TOBS adjustment. Is this adjustment made worldwide? The original paper on TOBS was very thin and covered only the US.
          2. Lapse rate adjustments during site changes. When sites change position
          in the past there was a SHAP adjustment. We never got to the bottom
          of lapse rate adjustments.
          3. UHI. The supporting studies here are woefully inadequate (Parker and Peterson and Jones) and they actually run counter to some established science.
          4. The whole bucket (SST) fiasco needs to be re visited.

          So, we don’t know that the inaccurate log is all we have. Look at the two Jeffs’ work on Steig. Personally I think they took the same records and provided an improved view of it, a more robust, more defensible view. At the risk of starting an engineer/scientist debate, I’d say something positive about engineers here…

          If AGW is a fact then waiting another 100 years to say “yup!, the planet’s over heating” is just not on. One is left with this log, glaciers, cherry blossom, grape harvest, sea ice and a few other indicators to judge how to face the future.

          We always have to act under uncertainty. The question is what actions are prudent given our present state of knowledge/uncertainty. The fact that we have to act (ignoring a problem is acting) doesn’t really remove the need to do a better job with that inaccurate log. I’m not however inclined to go off with a climate science version of Pascal’s wager and take drastic action because of tipping-point-burn-in-hell-if-you-dont-believe scenarios.

          climate models may eventually help, but they are too much in their infancy at present.

          If we had to decide what to do now, what indicators can be used?

          Climate models. Here too we see a waste of resources. There are multiple instances of climate models, all with varying degrees of skill. A typical model might have 1 million LOC, a fraction of the LOC in a modern operating system. But rather than focusing efforts on the best models, people continue to support many models, some that don’t even capture effects of volcanoes. What’s up with that? Keeping bad models in the herd does two things:
          1. generates a wider spread of model prediction ( hey observations FIT!)
          2. Keeps research jobs alive.

          Indicators: I would not use the sloppiness of the record to recommend inaction. The key would be to generate a prediction with best data available and then recommend actions accordingly. Now, however, we have people using data that hasn’t been properly QCed, run through code that definitely needs a rewrite, to generate alarmist scenarios that can only be avoided by drastic action.

          Lukewarmer: free the data; free the code; free the debate.

        • Jonathan
          Posted Aug 2, 2009 at 2:43 PM | Permalink

          Re: steven mosher (#107), it seems to me that there is a perfectly reasonable position that GISS/HADCRUT accurately capture the short timescale variations in the data, but that the long timescale variations may be badly contaminated. Of course the long timescale variations contain the “trend” that really matters. But this does provide a rationale for simultaneously questioning the data and correlating it with sunspots etc.
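
          The distinction between short and long timescale variations can be made concrete with a rough sketch: split a series into a slow part (a centred moving average) and a fast residual. The choice of a moving average, the window length and the demo numbers are assumptions for illustration only, not anything GISS or CRU do.

```python
def moving_average(values, window):
    """Centred moving average with an odd window; None where the window does not fit."""
    if window % 2 == 0:
        raise ValueError("window must be odd")
    half = window // 2
    out = []
    for i in range(len(values)):
        if i < half or i >= len(values) - half:
            out.append(None)
        else:
            out.append(sum(values[i - half:i + half + 1]) / window)
    return out


def split_timescales(values, window):
    """Return (slow, fast): the moving average and what is left after removing it."""
    slow = moving_average(values, window)
    fast = [None if s is None else v - s for v, s in zip(values, slow)]
    return slow, fast


if __name__ == "__main__":
    # Invented series, plus a copy contaminated by a slow linear drift.
    clean = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0]
    drifted = [v + 0.5 * i for i, v in enumerate(clean)]
    print(split_timescales(clean, 3))
    print(split_timescales(drifted, 3))
```

          In the demo the drift changes the slow component but leaves the fast residual essentially unchanged, which is the sense in which the short timescale information could survive a contaminated trend.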

          On the theory side I broadly agree with Calvin – so I’m a lukewarmer for the moment.

        • Geoff Sherrington
          Posted Aug 4, 2009 at 7:25 AM | Permalink

          Re: steven mosher (#104),

          Re this and following comments,

          The Australian temperature data story goes a little like this – I have researched the early years only quickly.

          Starting about 1850, records were kept systematically in some of the larger cities, perhaps by learned society volunteers. Use was for agriculture, travel, etc.

          The Bureau of Meteorology was established in 1908. Australia was in a period of wealth from many good gold discoveries from 1865 and it was a good era for science and instrumentation. There were possibly in excess of 100 recording stations by 1900, taking Tmax and Tmin. It is significant that the population around the recording station did not always grow, so there was no shift from rural to urban in cases such as small stations to relay the overland telegraph, or small mines that lapsed after a couple of years. This limits the number of retrospective adjustments that have a reason to be made to stations, sometimes to one – the Stevenson screen.

          There was a gradual shift from other surrounds to Stevenson screens in the 40 years to 1910. The way in which screen adjustments should be made is not always known. Apart from this, the other main early adjustment would be in the calibration of the max/min mercury thermometers and the accuracy with which they were read.

          The BOM has collected data to the present. In the early 1990s Simon Torok wrote a doctoral dissertation in which he drew together much of the data and applied a number of homogeneity corrections, to produce a dataset that the BOM sold on a CD. This was about 1993. I have not studied the adjustments, but it is on the agenda.

          The BOM released later CDs, for example one ending in early 2007. Station by station, there is not always a match between the 1993 version and the later versions. I do not know all of the reasons why. Some stations have significant missing data and in-filling to make tables easier is part of the reason.

          So far as I know, virtually all Australian data (including some Antarctic) derives from the BOM. It is copyrighted, but I do not know the arrangements under which compilers like CRU and NOAA obtain the data. Maybe some CRU reluctance to disclose is from copyright conditions. I do not know.

          Thus, for compilers like CRU and its later forms, or GHCN or NOAA or GISS or KNMI, the original source is constant. The variability shown by graphs like the one below results from adjustments. Because the BOM data go out already adjusted, it is difficult to envisage why more adjustments are made. The creation of grid cells from stations is one reason, but it is also on the study agenda. Work on this was done by Paul Della-Marta and others, including filnet-related work.

          My personal interest in the CRU data is to see how many versions there are, and why, so that the most appropriate one can be used for future studies. I know from pers comm that there are complications in combining SST data with coastal surface stations by CRU.

          In a practical sense, I can think of no better cause for concern than showing the next graph (but one of many available) where a picture speaks 1,000 words. This station, a remote lighthouse, was selected at random, not to show any specific points. (Is it worth opening a betting book as to where CRU will plot on selected graphs?)

          Now you know why I do not trust GISS data and why I have commended KNMI on initiative, but warned of possible errors, which they admit exist.

          The Gabo Island graph, WMO 94933, follows. It is completely rural. The yellow line is GISS homogenised. Please email me for a bigger image. The public metadata file is at http://www.bom.gov.au/climate/cdo/metadata/pdf/metadata084016.pdf

          It seems that the BOM had decided that the quality of data before 1957 is too poor to continue. It is possible that automatic recording commenced 1957. This has not deterred other compilers from using earlier data. What will CRU show?

          Those used to looking at numbers might well see some oddities in this sequence of raw data and also wonder if it is a basis for good science.

          Year, Month, Day, Tmax, Tmin

          1965, 8, 14, 16.7, 12.2
          1965, 8, 15, 15.8, 12.8
          1965, 8, 16, 15.0, 10
          1965, 8, 17, 13.6, 9.4
          1965, 8, 18, 14.7, 9.7
          1965, 8, 19, 16.4, 8.3
          1965, 8, 20, 12.5, 10
          1965, 8, 21, 12.2, 7.8
          1965, 8, 22, 12.8, 7.8
          1965, 8, 23, 12.8, 8.9
          1965, 8, 24, 14.2, 8.3
          1965, 8, 25, 16.1, 10.6
          1965, 8, 26, 15.6, 10
          1965, 8, 27, 16.1, 10
          1965, 8, 28, 16.1, 10
          1965, 8, 29, 16.1, 10.6
          1965, 8, 30, 16.7, 10

          There are many 10s in the Tmin. This is one of many examples from this station that should not be taken at face value. The BOM has done a good job, but it can only be as good as the data.
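
          The “many 10s” can be put in numbers with a simple frequency count. The sketch below uses only the 17 Tmin values listed above; the function name is made up for illustration.

```python
from collections import Counter

# Tmin values (deg C) for 14-30 Aug 1965, copied from the listing above.
TMIN = [12.2, 12.8, 10, 9.4, 9.7, 8.3, 10, 7.8, 7.8,
        8.9, 8.3, 10.6, 10, 10, 10, 10.6, 10]


def value_shares(values):
    """Return (value, count, share of all readings), most frequent first."""
    counts = Counter(values)
    total = len(values)
    return [(v, c, c / total) for v, c in counts.most_common()]


if __name__ == "__main__":
    for value, count, share in value_shares(TMIN):
        print(f"{value:>5}: {count} of {len(TMIN)} ({share:.0%})")
```

          On this listing the value 10 turns up 6 times in 17 readings. A count like this only flags the pattern. Whether it comes from rounding, unit conversion (10 deg C is exactly 50 deg F) or real weather is something it cannot settle.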

        • steven mosher
          Posted Aug 4, 2009 at 10:16 AM | Permalink

          Re: Geoff Sherrington (#136),

          Nice summary and good work. My position remains unchanged.

          1. CRU is to be rejected out of hand on the basis of data and source code availability. WRT to the land/sea issue
          that is one that RomanM and I have discussed here.

          2. NOAA is to be rejected out of hand for the lack of source code available

          3. GISStemp remains the only global record with open data and open source.

          4. I don’t hold that GISStemp is to be TRUSTED, precisely because of the work that Anthony has done and the types
          of things that you have shown. But neither do you throw it out completely. It’s a starting point for GLOBAL.

          So if you ask me what index deserves time and effort to improve, if you ask me what index one uses (if forced to today) then that standard is GISS. Is it a perfect standard? Nope. Are there problems? yup. Are they major problems?
          don’t know. Can I study those problems and the interaction between data and methods? yes. Can I do that with any other index? no.

        • Geoff Sherrington
          Posted Aug 4, 2009 at 5:39 PM | Permalink

          Re: steven mosher (#141),

          I guess we part philosophical company about here where you write

          So if you ask me what index deserves time and effort to improve, if you ask me what index one uses ( if forced to today) then that standard is GISS.

          Nobody forces me to use data I know to be questionable. I can leave it out. Science does not progress in the intended manner when people “force” themselves to questionable standards.

          Is there reason to question GISS data? Yes. Is it serious? Yes.

          There are many deeper reasons to question the value of a global temperature measure, many of which have been discussed on CA. Questionable data are but a small part. If scientists had made an early, concerted rebellion about the global temperature concept (my mates and I started about 1993 in Australia) then the whole business would look rather different today.

          BTW, the Gabo Island illustration I gave above is geographically one of the most favourable examples one can envisage for the absence of conditions needing adjustment. Compared to it, others can only be worse, mainly.

          Trust the BOM before you trust GISS. At Gabo, GISS just copies BOM CD 1993 line until 1993, when it stops.

        • Ian Castles
          Posted Aug 5, 2009 at 1:52 AM | Permalink

          Re: Geoff Sherrington (#136),

          Just a piece of trivia to supplement the results of your researches into early Australian meteorological records. Early in 1855 William Stanley Jevons, a 19-year-old assayer at the Sydney Branch of the Royal Mint who was later to become one of the leading economists of the day, began to make daily observations of temperature and rainfall at 9 a.m. and 9 p.m. He maintained this solo effort for more than three years, publishing his results in Henry Parkes’s newspaper “The Empire”, of which he became Meteorological Observer. In an entry in his Journal in September 1856, Jevons noted that he had been “engaged … in copying out, correcting and calculating my two daily observations for the last twenty months … it is a work of some forty or fifty thousand figures, independent of continual calculations, drawing of means, and other work.” The official meteorologist had absconded some time before this, and Jevons’s records are the only ones that survive for a period of (I think) about two years.

          When the Sydney Morning Herald announced its intention of resuming meteorological observations in 1857, Jevons professed himself to be “happy to be the means of connecting the Herald’s old set of observations with those which will be shortly commenced by the Government, thus preventing a break in a long serious (sic) of years which would have been most unfortunate, and as far as we know irrecoverable.”

        • Geoff Sherrington
          Posted Aug 5, 2009 at 3:46 AM | Permalink

          Re: Ian Castles (#151),

          There are probably quite a few stories showing dedication and probably a few when the observer was too sleepy or too inebriated to do a good recording job. The latter is human nature and it happens. The former sets standards that should act as models of how to get it right. Thank you for the story.

          When you look at the records pencilled day-by-day (instead of non-critically crunching them in adjusted gobbles of decades) you see blocks of data that you sense to have the wrong feel. You cannot prove that they are wrong. You can sometimes show that they are improbable, but not by how much or why. So, you can accept them against better judgement, or reject them. (Who am I to be lecturing you on this?)

          My point is that the detailed examination of past temperature data should result in much more rejection. It should not be in-filled as a mathematical processing convenience, for all in-filling is a guess.

          The disadvantage is that the instrumental period would be rather more sparse and shorter. The advantage is that processes set against the temperature have a better chance of success, with fewer false artefacts.

          If the long term outcome is that the early instrumental period is completely dismissed because of excessive error, proxy calibrations would change. That could be a good outcome. Many failed posts on CA can probably be traced to misplaced confidence in the fidelity of the temperature record.

          There is no imperative to “balance the books” as in datasets for accounting, or in counting the properties of all people in a census. The choice is open to reject suspect temperature data.

        • QBeamus
          Posted Aug 5, 2009 at 3:31 PM | Permalink

          Re: Geoff Sherrington (#152), But isn’t the problem even more difficult than that? If a researcher had rejected large swaths of data because they “didn’t feel right” I would feel entirely justified in discounting his conclusions, if I didn’t care for them. (I’d like to think I’d discount them even if I did, but I try to advert to my human frailty.) Even if they had used some operationalized test for choosing which swaths to discard, I would be suspicious that the test was reverse-engineered for effect.

          Shouldn’t we be asking ourselves how deep this rabbit hole goes? At what point should we simply give up, and conclude that the temperature record we have is simply not suitable for the task we’re trying to use it for? Has anyone tried to quantify the signal-to-noise ratio? It’s not that I don’t sympathize with the impracticality of having to start a multi-century experiment from scratch, but if the information isn’t in the data, no amount of massaging can fix that.

        • sky
          Posted Aug 5, 2009 at 4:46 PM | Permalink

          Re: QBeamus (#154),

          Most of the temperature measurements made in population centers of one kind or another were never intended for accurate analysis of subtle climate variations, but serve a much more mundane purpose. I think that bias and data gaps are far more serious problems than the S/N ratio in available temperature records.

  18. Posted Jul 31, 2009 at 11:03 AM | Permalink

    Jonathan–
    Absolutely. Loads of people find it convenient to store stuff on the ftp server. Network admins are usually too busy to constantly check. Usually, nothing bad happens. But network admins know that it’s better to be careful and implement rules. (Then, everyone else grumbles about those rules.)

    That’s why I think people put stuff up there when they shouldn’t have; when the network administrator discovered the issue, he told them to clear everything off.

  19. TAG
    Posted Jul 31, 2009 at 11:06 AM | Permalink

    It has been my experience that FTP sites are not normally backed up. If the server crashes, data on the FTP site may be lost. Files are placed there for transmission to others and not for permanent storage.

  20. jae
    Posted Jul 31, 2009 at 11:36 AM | Permalink

    I detect the presence of lawyers.

  21. jae
    Posted Jul 31, 2009 at 11:41 AM | Permalink

    Dr. Phil, eh? 🙂

  22. Gerald Machnee
    Posted Jul 31, 2009 at 11:42 AM | Permalink

    They are now spending (wasting) time password protecting their files. If they would spend half that time filling requests properly, they would earn some respect. But do not count on it unless there is a major housecleaning. So the FOI results will be the same.

  23. Posted Jul 31, 2009 at 11:48 AM | Permalink

    Hiding of CRU data is the same behavior that led to the snipping of any comment at all relating to Ryan’s work on the Antarctic after the last RC thread. We’re simply not allowed to comment. You can’t handle the truth!

    The whole CRU data issue is the same as Dr. Steig not releasing code, Dr. Comiso not replying to Ryan’s requests, Mann too busy to answer SteveM’s requests, people getting fired for claiming snow pack didn’t shrink, surfacestations being attacked disingenuously by the NCDC before anything is published, they don’t want anyone to see their work.

    While Lucia’s explanation for data disappearance makes sense, I’m a bit grumpy with the whole lot of them today.

    RC Correctness Censors

  24. Gary Hladik
    Posted Jul 31, 2009 at 12:08 PM | Permalink

    Whoa! Looks like the next FoI request will have to be a wee bit bigger…

  25. Geo
    Posted Jul 31, 2009 at 12:48 PM | Permalink

    I don’t think a collective temper tantrum is really the right way to look at it. A much more likely scenario is that after being embarrassed over the station data, some muckety-muck dug up an existing policy on what data is supposed to be out in those public sites, and sent an email to everyone reminding them of the existing policy and pointing at the recent breach as a sad commentary on the slackness of staff in following published policies and procedures in this area, etc etc etc.

    And so everyone had to go cull their public data because papa was pissed they hadn’t cleaned their rooms in weeks when they all knew they were supposed to do it every Saturday.

    More like that, rather than a collective tantrum.

  26. tetris
    Posted Jul 31, 2009 at 12:56 PM | Permalink

    Steve,

    Your self-imposed deadline of mid-night has passed. I’d be surprised if I were the only one here to whom it has occurred that you may have had a first go at the data. snip

    Or, as you and others have suggested all along, is the whole thing no more than a tempest in a tea cup, a mix of ineptitude, pique and ego issues on the part of the HadCRU team and the valiant Dr “No-you-can-not-have-my-data” Jones, and not worth the bureaucratic stone walling?

    Steve: I’ve said over and over that I don’t expect anything all that interesting in this data other than the triviality of the amount of what they do. I’ve spent some time making a concordance of CRU numbers to GHCN numbers, something that they’ve withheld. This is slow work and I like to have that sort of thing done before doing much analysis. All they do is average data so there’s not much way that can be screwed up. The battleground issue remains UHI as it always has. The difference will be the ability to examine specific sites.

  27. Dio Genes
    Posted Jul 31, 2009 at 1:24 PM | Permalink

    Has anyone noticed that the Arctic Sea Ice Area on Cryosphere is lower than last year at this time, and NSIDC shows Extent is almost down to the 2007 level for this date? Or are they both part of the Vast Scientific Conspiracy as well?

  28. David Holland
    Posted Jul 31, 2009 at 1:32 PM | Permalink

    I would caution, as Steve and Lucia keep doing, not to jump to conclusions. It may be unlikely, but perhaps CRU are “doing the right thing” by the EIR and preparing to,

    (a) progressively make the information available to the public by electronic means which are easily accessible; and
    (b) take reasonable steps to organize the information relevant to its functions with a view to the active and systematic dissemination to the public of the information.

    However it would do no harm for a public spirited person to email Mr Palmer (DavidDOTPalmerATuea.ac.uk) and ask. I have 6 EIR cases on the go with the ICO (which I will try to report on after a short holiday) – so I am not looking for any more fights.

    “Any information on the states of the elements of environment” is subject to the EIR, and I think that includes what they have done with the ftp data and what they plan to do with it.

    As Steve mentioned EIR Regulation 19 makes it a £5,000 offence for someone at UEA if he or she

    alters, defaces, blocks, erases, destroys or conceals any record held by the public authority, with the intention of preventing the disclosure by that authority of all, or any part, of the information to which the applicant would have been entitled.

    The statutory EIR Code of Practice states:

    All communications to a public authority, including those not in writing and those transmitted by electronic means, potentially amount to a request for information within the meaning of the EIR, and if they do they must be dealt with in accordance with the provisions of the EIR.

    A bit of creative argument might just persuade the Court that removing environmental information from publicly accessible web sites upon which the public make electronic searches (enquiries?) could be an r.19 offence if the purpose is to prevent the public accessing it.

    However be sensible. Don’t all send an email – that might be vexatious. Post a comment first if you intend to do it.

  29. Keith
    Posted Jul 31, 2009 at 1:37 PM | Permalink

    Maybe the data removal is being driven by the folders being accessed from off their main server. I checked the davelister folder under the people directory that I checked yesterday. The file forphilj is missing. That is the one with the China data that are mentioned on the previous thread.

    • steven mosher
      Posted Jul 31, 2009 at 1:46 PM | Permalink

      Re: Keith (#39),

      The forphil folder actually had a nice readme explaining the data that was available and the data that was under restriction. It showed that some actual thought was put into data management control and documentation.
      We may disagree with the need of sheltering that data, but davelister showed one way of handling the problem by detailing his postings with a readme.

      • Steve McIntyre
        Posted Jul 31, 2009 at 1:55 PM | Permalink

        Re: steven mosher (#39), There’s nothing there right now.

        • steven mosher
          Posted Jul 31, 2009 at 2:46 PM | Permalink

          Re: Steve McIntyre (#41), Re: Keith (#42),

          Yes, I know he removed them. But prior to removal he had a nice little notice in his folders, indicating which data was open and which was closed (he didn’t post the closed data). And yes Keith, he had data on Scotland and other UK sites that would have been of interest to people.

          Lukewarmer: “Free the data;Free the code;free the debate”

      • Keith
        Posted Jul 31, 2009 at 2:41 PM | Permalink

        Re: steven mosher (#39),

        I’m actually disappointed because there was a file in davelister named scotland that had a series of files that I think detailed most of Britain’s sites. It is not there any longer, and I was curious to check a couple.

  30. John Wright
    Posted Jul 31, 2009 at 1:54 PM | Permalink

    So basically Steve, you’re saying they are hiding it under the bed. It’s all very childish behaviour and a sign that their house of cards is rapidly collapsing. Don’t let them off the hook. Your request was perfectly reasonable and if the cat is now among the pigeons, well it’s all thanks to you (and when I say “thanks” I mean that literally) – sorry about the mixed metaphors. Anyway that puts me in total disagreement with DaveR’s comment.

    Otherwise I don’t quite see the nuance between a mole and a whistle-blower. If ever he is identified he’ll be able to publish his memoirs.

  31. Bill Jamison
    Posted Jul 31, 2009 at 3:20 PM | Permalink

    At this rate it’s clear that 2009 will set a new record for data deletion. If this trend continues there will be no data left in just a couple of years!

  32. Adam Soereg
    Posted Jul 31, 2009 at 4:09 PM | Permalink

    Data deletion is happening at least three times faster than predicted.

    And I haven’t even mentioned that the daily amount of deleted files seems to increase exponentially. It suggests a strong positive feedback mechanism.

  33. ianl
    Posted Jul 31, 2009 at 5:22 PM | Permalink

    I do wonder how many times this has to be pointed out:

    episodes such as this will make no difference to public policy as the populist “meeja” will simply not report it

  34. Calvin Ball
    Posted Jul 31, 2009 at 5:39 PM | Permalink

    Ordinarily, when major surgery is done on an FTP site, they leave a readme to sort of explain what happened and why. The reason why this was done may be innocent enough, but a readme would have been a reasonable and professional touch. Just clearing the decks like this is at the very least rather crude.

  35. Calvin Ball
    Posted Jul 31, 2009 at 5:41 PM | Permalink

    Oops. Didn’t see Mosher’s comment. Never mind.

  36. Geoff Sherrington
    Posted Jul 31, 2009 at 7:49 PM | Permalink

    Let’s be practical. Here is a graph from a single Australian station. It lacks the primary CRU data. The CRU data exist, but it is possible that there are many versions, each version adjusted for an unexplained reason. Some might have been used by other compilers and could be shown here under the name of another institution.

    Two questions:

    1. Where do I imagine the CRU data will sit on the graph when it is released?

    2. How do I know what adjustment has been done?

    To my simple mind, it is very hard to progress this sub-science because of its adjustment and lack of explanation of logic and magnitude of adjustment.

    Note. The primary data come from the Bureau of Meteorology. All other versions are derived therefrom. I have done very minor guesswork in-fills of occasional missing data days, so exact replication is impossible. I wanted to detect changes of up to 0.2 deg C per decade like some famous scientific authors do.

  37. Jimmy Haigh
    Posted Jul 31, 2009 at 11:30 PM | Permalink

    I bet the last couple of days were a fun time in the CRU offices…

  38. bernie
    Posted Aug 1, 2009 at 6:43 AM | Permalink

    thefordprefect:
    Anthony says explicitly in the same post you quote from that the updated surface station data “is not yet public domain, though I plan to make it so after I’ve published my paper.” That does not seem to me to be inappropriate and certainly appears to be in line with SM’s requests of Jones et al.

    Your reference to other issues that the Heartland Institute has addressed seems out of place and amounts to a “cheap shot”. You seem to be conflating many issues and misrepresenting them as well. [snip]

    Steve: please do not rise to the bait.

  39. Steve McIntyre
    Posted Aug 1, 2009 at 9:11 AM | Permalink

    It is my understanding that Anthony plans to submit an article to an academic journal using the updated classification and that the updated classification would be made available concurrent with publication in an academic journal.

    I have some sympathy for the argument that the Heartland article constitutes “publication” in a sense. So are blog posts. However, the people that criticize me for not “publishing” more are obviously not criticizing me for not making enough blog posts. They define “publication” as publication in an academic journal.

    If data were generally made available by climate scientists on such occasions, then this would eliminate most of the data archiving issues raised here. That seems to be Anthony’s plan here.

    Given the demands that are beginning to be made by fordprefect and others, I think that Anthony should place a drop-dead date on that process, i.e. if he’s not finished the process by, say, six months from now, he should undertake to place the database online regardless. If the classification remains unavailable after publication or some finite time, I’ll add my voice to the requests and make a critical post on the matter.

    This would obviously be a far more aggressive commitment to data availability than practiced in most of the climate science community, including by people that fordprefect doesn’t criticize.

    Until fordprefect criticizes data obstruction by CRU in clear and forceful terms, I don’t think that he has sufficiently clean hands (using the term in the legal sense) to criticize anybody.

    • thefordprefect
      Posted Aug 1, 2009 at 11:51 AM | Permalink

      Re: Steve McIntyre (#60),

      Until fordprefect criticizes data obstruction by CRU in clear and forceful terms, I don’t think that he has sufficiently clean hands (using the term in the legal sense) to criticize anybody.

      I will not criticise CRU for not releasing all their data:
      I will criticise them for not releasing the data that is not commercial
      I will severely criticise the data suppliers for not giving the data free.

      No one should be tweaking figures to fit beliefs and if they do then that is despicable.

      To try to correct data against non climatic changes is acceptable and necessary.

      To try to discredit another person/institution using invalid ruses in the minds of some of the cletuses on some blogs is despicable.

      Mike

      • Terry
        Posted Aug 1, 2009 at 4:08 PM | Permalink

        Re: thefordprefect (#68),

        To try to discredit another person/institution using invalid ruses in the minds of some of the cletuses on some blogs is despicable.

        What does this mean? This is the most nonsensical sentence/sentiment/thing I’ve read in all of 2009, and probably 2008 as well.

      • Gerald Machnee
        Posted Aug 1, 2009 at 10:38 PM | Permalink

        Re: thefordprefect (#69),

        I will not criticise CRU for not releasing all their data:
        I will criticise them for not releasing the data that is not commercial
        I will severely criticise the data suppliers for not giving the data free.

        We seem to be missing a criticism here. Looking at the first non-criticism, has thefordprefect asked for a copy of the confidentiality agreements that were not produced for Steve M? That seems to be the reason for the first I will not.

      • David Cauthen
        Posted Aug 2, 2009 at 7:45 AM | Permalink

        Re: thefordprefect (#69)

        To try to discredit another person/institution using invalid ruses in the minds of some of the cletuses on some blogs is despicable.

        Huh? Do you mean the ruses are invalid in the minds of some cletuses on some blogs? Or, do you mean the use of invalid ruses is considered despicable by cletuses on some blogs? Do all cletuses have problems using commas?

  40. Steve McIntyre
    Posted Aug 1, 2009 at 9:23 AM | Permalink

    For some reason, the word “copyright” always seems to set off a torrent of commentary that, in my opinion, seldom has much to do with issues at hand. Every so often, I have to intervene editorially in such discussions and this is one more time.

    There’s never been the slightest suggestion that anyone wants to use the CRU data without properly attributing the data to CRU or to appropriate it for commercial purposes. Nor have CRU used “copyright” issues to defend their actions.

    So please, no more opining on copyright.

    • Kenneth Fritsch
      Posted Aug 1, 2009 at 11:29 AM | Permalink

      Re: Steve McIntyre (#63),

      Good call.

      There’s never been the slightest suggestion that anyone wants to use the CRU data without properly attributing the data to CRU or to appropriate it for commercial purposes. Nor have CRU used “copyright” issues to defend their actions.

      So please, no more opining on copyright.

      I opined about copyright and (correctly) got snipped. It was rather obvious that I was responding to a prefect bait to derail the discussion from the issue at hand. I was looking for Steve M to do the right thing, after the fact, and he has.

      I would think all sides can agree that the CRU keepers have shown a sloppy and casual approach to the keeping of the temperature series records. The pushing by Steve M from the blogosphere has revealed sloppiness on their part that might not otherwise have been apparent.

      I like others here would not expect a closer examination of the CRU records to reveal (given the assumptions of the adjustments applied or not applied to the raw data) anything that would result in a major swing in the global temperature trends. Based on the sloppiness witnessed to date we might expect to see errors with lesser consequential effect on the trends.

      The point is and has been the adjustments to records, and further, in my view, whether the seemingly lackadaisical attitude and approach seen in record keeping spills over into formation of the adjustment algorithms.

      I further see the keeping of the source information under wraps as a way of averting attention from the (lack of) value added issue for using CRU data. And further, I think that the CRU keepers are motivated more by the academic pride (vanity) in the CRU series being used for scientific works and papers than any commercial concerns.

    • Jonathan
      Posted Aug 4, 2009 at 10:43 AM | Permalink

      Re: Steve McIntyre (#63), people might want to read our host’s comment again.

      • Steve McIntyre
        Posted Aug 4, 2009 at 11:35 AM | Permalink

        Re: Jonathan (#144),

        Quite so. As stated above:

        For some reason, the word “copyright” always seems to set off a torrent of commentary that, in my opinion, seldom has much to do with issues at hand. Every so often, I have to intervene editorially in such discussions and this is one more time.

        There’s never been the slightest suggestion that anyone wants to use the CRU data without properly attributing the data to CRU or to appropriate it for commercial purposes. Nor have CRU used “copyright” issues to defend their actions.

        So please, no more opining on copyright.

  41. Robinedwards
    Posted Aug 1, 2009 at 10:38 AM | Permalink

    I’ve looked at Geoff’s (#51) plots and tried to produce something like them (or one of them) from the data I have, which unfortunately ends at 2002.

    Now my data is from the GHCN archive, and I cannot say anything about its reliability. Searching in the rather confusing BoM site(s) I’ve not succeeded in locating a numerical data file or files for Alice Springs or any other Australian location. Am I being dumb as usual? Is there a way in that requires some sleight of finger that I do not have? Anyway, my plots of the raw data resemble Geoff’s, though through the welter of symbols and lines it’s hard to be sure of which!

    What is to be gleaned from the GHCN data? As usual its structure is not at all clear from plotting the “raw” data. And again as usual it is nevertheless possible to find periods of little or no detectable change interspersed with times of really rapid change. Late 1953 saw a substantial change followed by about twenty years of stability at a higher average temperature, again supplanted by a spectacular decrease (1.7 deg C) in Sept 1973 which endured for five years. Then changes upwards at 1980 and 1990, with fairly stable conditions to the end of my data (2002).

    So, patterns are to be found in GHCN, but what is the situation now? I need a bit of prompting to find the raw data. Help!

  42. Bill W
    Posted Aug 1, 2009 at 11:49 AM | Permalink

    “Weather records are a state secret”
    Christopher Booker, Telegraph UK

  43. fizzissist
    Posted Aug 1, 2009 at 12:00 PM | Permalink

    From the CRU website:

    “The various datasets on the CRU website are provided for all to use, provided the sources are acknowledged. Acknowledgement should preferably be by citing one or more of the papers referenced on the appropriate page. The website can also be acknowledged if deemed necessary. CRU will endeavour to update the majority of the data pages at timely intervals although this cannot be guaranteed by specific dates.”
    http://www.cru.uea.ac.uk/cru/data/

    This would seem to be at odds with the confidentiality claims? ‘…are provided for all to use..’?? No standard for amateur, scientist, or truck driver. Publicly posted sets the precedent, as does publicly funded without prior security clearance or classification.

    I am outraged at CRU’s conduct.

    • thefordprefect
      Posted Aug 1, 2009 at 1:08 PM | Permalink

      Re: fizzissist (#69), the data on the website is free for all – obviously. The FOI requests are for data that is not (or should not) be on the website.

      could you please detail what you find outrageous?

      • TAG
        Posted Aug 1, 2009 at 1:19 PM | Permalink

        Re: thefordprefect (#71),

        could you please detail what you find outrageous?

        That decisions that affect the world economy are being based on data that is being shared only with a select in-group.

        That some of the data cannot be shared because it is subject to non-disclosure agreements but that the non-disclosure agreements have been lost

        That nobody can recall which data were subject to these agreements.

        That the sources of data have been forgotten and would not be shared in any event.

        • thefordprefect
          Posted Aug 1, 2009 at 1:32 PM | Permalink

          Re: TAG (#72), If you go through the paper trail you will see that CRU says the free data was available elsewhere.

          The data has been around since 1980 – with changes in technology it is not surprising that agreements are misplaced (paper copies? 30 years old)
          Commercial data should not be shared.

        • steven mosher
          Posted Aug 1, 2009 at 11:41 PM | Permalink

          Re: thefordprefect (#78),

          Please see the records acts of 1958 and 1967. If CRU believe that there are confidentiality agreements covering SOME of the data but not all, then it is incumbent on them to notify the third parties (it’s actually required that they do so) and inform them of the request for the confidential records. Failure to keep good records cannot be an excuse, as that creates a disincentive to keep good records. If CRU contacts all the third parties and some (say Syria) respond that they want their data kept secret, then I have no issue with CRU keeping it secret. I would suggest that CRU, in the future, not use data that is under these restrictions. So, just to be clear: CRU claim the data (some of it) may be covered. They may have lost the contracts (how well did they look? You have to wonder, since they leave files lying around ftp sites). A fair solution is for them to contact the third parties and check the status of the contracts. Finally, given the kind of simple data transmission and data reading errors (you found one of those in GISS, remember), I would request the CRU data “AS USED” in their analysis. It’s not enough for them to say “we copied it from location X”. GISS tried that with temperature data and we got the Y2K bug. So, I’d like the data AS USED.
          If they claim they got data from GHCN, then I’d like to see what they think a good copy is. That’s the simplest QC check.

  44. Phillip Bratby
    Posted Aug 1, 2009 at 12:15 PM | Permalink

    As you would expect Steve, Christopher Booker gives you a good bit of praise in his Telegraph column:
    http://www.telegraph.co.uk/comment/columnists/christopherbooker/5955955/Weather-records-are-a-state-secret.html

  45. MetMole
    Posted Aug 1, 2009 at 12:52 PM | Permalink

    Mr McIntyre,
    Christopher Booker is on the case now, writing in the Daily Telegraph:
    Weather records are a state secret

  46. John Archer
    Posted Aug 1, 2009 at 1:26 PM | Permalink

    Christopher Booker is on the case in the Telegraph:
    Weather records are a state secret

  47. fizzissist
    Posted Aug 1, 2009 at 2:07 PM | Permalink

    The data has been around since 1980 – with changes in technology it is not surprising that agreements are misplaced (paper copies? 30 years old)
    Commercial data should not be shared.

    If the contract doesn’t exist, then there’s no enforceable contract. Commercial data wouldn’t be subject to a FOIA, and unless I’m misreading something here, Steve isn’t requesting commercial data, is he?

    TAG sums up the outrage pretty well, while you, thefordprefect, sound like a CRU employee who has lost his towel, or hiding the fact that CRU has lost theirs.

    • steven mosher
      Posted Aug 1, 2009 at 11:55 PM | Permalink

      Re: fizzissist (#79),

      Commercial data IS subject to FOIA. In the instructions to CRU personnel they are instructed to avoid entering into these types of contracts as FOIA may TRUMP the commercial interests. However, CRU and MET NEVER CLAIMED commercial interests per se. They claimed confidentiality agreements that pertained to “NON ACADEMICS”. Ford never can get his facts straight. When an FOIA request is for data that is covered by a contract with a third party, CRU MUST contact that third party and get their opinion. In this case, CRU is claiming this:

      1. we dont have the contracts
      2. we can’t remember exactly who we have them with
      3. we can however remember that they use the term “non academic” Simply,
      we remember that they all say this data cannot be released to a “non academic” and we remember that because err uh err because we just released the data to Webster, and err uh, err we only did that because he is an academic and err uh err McIntyre is not.
      4. And besides 98% of this data is available elsewhere, err uh.. but to figure that we would have to know who we have contracts with, but we don’t know that err uh err.. WHAT ABOUT WATTS? ya ya what about him.

      • steven mosher
        Posted Aug 2, 2009 at 12:00 AM | Permalink

        Re: steven mosher (#95),

        And

        5. The contracts that we don’t have and can’t check were all entered into before the FOIA act, we are err uh err sure of that because we can’t find them and we are sure that none of these contracts had expiration dates.. ya this data is secret for all times, pretty sure of that, and we moved twice, and why keep records when you have this great memory.. oops I just put the shit on the web.. err data? what data? we aint got no stinking data.. its GHCN we just do averages here.. can I have more money now.

  48. Kenneth Fritsch
    Posted Aug 1, 2009 at 4:04 PM | Permalink

    I have used the incomplete and publicly available Watts CRN evaluation data and made the proper attribution to him and his team.

    If Watts is going to use that data to publish a paper then I have no problem with him withholding the data until that time. In fact this is what I was hoping that he and his team would do – with the aid of proper and professional statisticians in the process.

    To make an analogy with a paper to be published for public viewing, I judge Watts to be on solid ground. To make the analogy with an incomplete data base, I judge Watts to be on solid ground.

    On the other hand, to use Watts CRN evaluations and their publication as analogous to the CRU data is to be grasping at straws or perhaps a life preserver at this point in the discussion.

  49. jh
    Posted Aug 1, 2009 at 5:23 PM | Permalink

    see this report in the uk press

    http://www.telegraph.co.uk/comment/columnists/christopherbooker/5955955/Weather-records-are-a-state-secret.html

  50. John Goetz
    Posted Aug 2, 2009 at 9:27 AM | Permalink

    The Russian Meteo site purged their daily temperature data a day after I sent them a question about the data in May of 2008. At that time I was comparing the daily records with what NOAA produced and GISS consumed.

    That data is still missing today. I still kick myself for not scraping a copy of everything before asking about it.

  51. David Cauthen
    Posted Aug 2, 2009 at 9:33 AM | Permalink

    Better get all your cletus comments in before Big Cletus gets back and erases ’em all. Think I’ll head over to RealCletus and see what dem boys is up to.

  52. EW
    Posted Aug 2, 2009 at 9:55 AM | Permalink

    John Goetz:
    Maybe the data are still available here – they can be downloaded as a .zip file for the respective stations. Some of the data go back to 1997, some to 2000. And they are in daily formats.
    http://meteo.infospace.ru/wcarch/html/e_sel_admin.sht?country=176

  53. EW
    Posted Aug 2, 2009 at 9:59 AM | Permalink

    John Goetz:
    Try the site meteo.infospace.ru – the data for stations can be downloaded in a zip format and there are daily values. The data go back some 10-15 years.

  54. Fred
    Posted Aug 2, 2009 at 10:33 AM | Permalink

    accurate UK met data, all for free, no FOI request required

    http://landedunderclass.wordpress.com/2009/08/01/the-met-office-enemies-of-the-people/

  55. Steven G
    Posted Aug 2, 2009 at 10:45 AM | Permalink

    CRU has just shot themselves in the foot. By making their data so secretive and unverifiable, they’ll also be taking themselves out of the loop when it comes to credible research. Researchers (even those with confidential agreements providing access to CRU data) will be forced to use other data sources because no sane person in the scientific community should believe research conclusions based on data that is unverifiable. Consequently, if I was a climate researcher, I wouldn’t waste my time using this data. I’d use other sources that my peers can corroborate.

  56. David Cauthen
    Posted Aug 2, 2009 at 1:30 PM | Permalink

    snip – please stop the abusive comments.

  57. Calvin Ball
    Posted Aug 2, 2009 at 2:21 PM | Permalink

    107, I don’t think we want to go off in the weeds on this, but I need to point out that you left a critical piece out: feedback. The raw greenhouse effect is relatively straightforward. It’s also not very scary. The theoretical controversy is pretty much entirely over feedback, which is anything but straightforward.

    The logical consequence of positing feedback, particularly sans any really good theoretical models, is that you have a theory, but the magnitude of the effect is unknown. This makes the data critically important, because the “proof” that IPCC uses essentially amounts to curve fitting the 20th century, or segments of it, to the Arrhenius equation, and backing out feedback as the curve fit parameter.

    I won’t waste time on what’s theoretically wrong with that, but it should be obvious that if the “model” is based on a curve fit, the data ends up driving the model. Thus the data is critical even in producing model results.
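
    The kind of curve fit described above can be sketched in a few lines. This is only an illustration under stated assumptions, not anyone's actual method: the logarithmic forcing coefficient `ALPHA`, the no-feedback response `LAMBDA0`, the baseline `c0` and the function name `fit_feedback` are all choices made for the example, and the sketch ignores ocean lag and every forcing other than CO2.

```python
import math

ALPHA = 5.35    # W/m^2 per unit ln(C/C0); commonly quoted simplified value (assumption)
LAMBDA0 = 0.3   # K per W/m^2; approximate no-feedback response (assumption)

def fit_feedback(co2_ppm, delta_t, c0=280.0):
    """Fit delta_t = s * ln(C/C0) through the origin and back out feedback f.

    The model assumed here is s = LAMBDA0 * ALPHA / (1 - f), so once the
    slope s has been fitted to the temperature data, f follows directly.
    """
    x = [math.log(c / c0) for c in co2_ppm]
    # One-parameter least squares through the origin: s = sum(x*y) / sum(x*x)
    s = sum(xi * yi for xi, yi in zip(x, delta_t)) / sum(xi * xi for xi in x)
    f = 1.0 - LAMBDA0 * ALPHA / s
    return s, f
```

    Because f is computed entirely from the fitted slope, any change to the temperature series changes the inferred feedback, which is the sense in which the data drive the model.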

    Which brings me to the same conclusion: free the data; free the code; free the debate. But don’t forget that in the process that got us here, there are several theoretical assumptions that are subject to review and possible overturning.

    • steven mosher
      Posted Aug 2, 2009 at 11:07 PM | Permalink

      Re: Calvin Ball (#112),

      yes, feedbacks of course are important ( mentioned that somewhere around here).

  58. Posted Aug 2, 2009 at 5:14 PM | Permalink

    While it’s amusing that CRU is purging its FTP servers in a panic, I don’t see that there is anything sinister here.

    When I want to send a multi-MB file to a long list of people, I ordinarily just place it on my website with no link, and then just send them an e-mail with the URL. That way if they want to read it, they can print it or download it, and if not, I haven’t clogged up their computers with a big file. The directory lists in question are not publicly viewable (I think), so the public can’t find the file until I create a link.

    So it seems reasonable that CRU people would do the same thing. In fact, it’s a little careless that they would leave the directories where they do this public, and entirely appropriate that they would now clean up their public directories.

    Of course, this does not excuse them for not revealing their data files, or at least the international agreements with Syria or whoever that prevent them from releasing the raw data.

    If CRU wants to use secret data to compile their indices, there is probably ultimately nothing that can be done to stop them. However, IPCC member governments should move to block IPCC from using any such secret-source indices in its forthcoming Fifth Assessment Report.

    The FOIs for the withheld data and the alleged confidentiality agreements should still proceed full force. However, there is no reason to additionally FOI all the purged files, or anything relating to the purged files.

  59. John Goetz
    Posted Aug 2, 2009 at 7:50 PM | Permalink

    #104 EW:

    Thank you for pointing me to that link. The data, as you note, only goes back about a decade. The data that used to be on their FTP site went back to the 1930’s and before. It is a shame it is gone.

  60. FredG
    Posted Aug 2, 2009 at 9:17 PM | Permalink

    Hu McCulloch,

    The directory lists in question are not publicly viewable (I think), so the public can’t find the file until I create a link.

    I believe this is incorrect. The files and directories are read-only, so you can browse the directories and files even if no link exists. You can use an FTP client in passive mode to view these files.

    Unless you place the files in a password protected directory…
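
    To illustrate the point about browsing without a link, here is a minimal sketch using Python's standard `ftplib`. The host name `ftp.example.org` in the usage note is a placeholder, the helper names are made up for illustration, and the sketch assumes the server allows anonymous logins and directory listings.

```python
from ftplib import FTP


def list_public_directory(ftp, path="."):
    """Return the names visible in `path` on an already-connected FTP client."""
    ftp.set_pasv(True)  # passive mode: the client opens the data connection
    return sorted(ftp.nlst(path))


def browse_anonymously(host, path="."):
    """Log in anonymously to `host` and list `path` (needs network access)."""
    with FTP(host, timeout=30) as ftp:
        ftp.login()  # no arguments means an anonymous login
        return list_public_directory(ftp, path)
```

    Calling `browse_anonymously("ftp.example.org", "projects")` would return whatever names the server is willing to list, whether or not any web page links to them. A password-protected directory would refuse the anonymous login instead.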

  61. Scott Gibson
    Posted Aug 3, 2009 at 1:42 AM | Permalink

    Steven Mosher et al,

    Because I am a geologist, I come to the table from a different view than many of you. Without getting into too much detail, it is clear from the geologic record that the climate has varied from much warmer than now to somewhat colder. Based on the fossils, sea level variations, and lithologies (rock types), it is clear that most of the Earth’s geologic history was much warmer than current temperatures. In fact, my view is that the current glacial/interglacial regime is probably the coolest period in the Earth’s history, but I can’t cite hard evidence proving this.

    As for carbon dioxide concentrations, it is difficult to be sure how they have varied because CO2 is such a tiny part of our atmosphere. I am sure that it is more soluble in water at colder temperatures and less soluble at warmer temperatures. I suspect from the rock record that the concentration of CO2 has been much higher in the past than currently, but don’t know if that caused the warmer climates or was caused by the warmer climates.

    My main concern is that such tiny concentrations of CO2 at least intuitively should not outweigh much more abundant greenhouse gases, the most obvious being water vapor. It is a cop-out when people wave their arms and say that water vapor doesn’t cause much variation because it is always there. Living in southern Arizona, I have experienced large swings in temperature due to changes in water vapor (as much as 30 degrees in less than two hours). Yes, that is weather, but weather is part of climate.

    The lack of ability to trust the data we see in all these conversations makes it difficult to have any certainty about the models we are given. So as a skeptic, I can agree wholeheartedly with the lukewarmer credo: free the data; free the code; free the debate.

    • steven mosher
      Posted Aug 3, 2009 at 3:20 AM | Permalink

      Re: Scott Gibson (#119),

      You’ll note that I was very careful not to say CO2, but rather GHG. SteveMc has a policy here which I think is wise. If we want to talk about CO2 it has to be in the context of a “foundational” (my word) text that discusses the relationship between increased CO2 and increased temps with particular emphasis on the question of sensitivity. One thing I like about this blog is Steve’s attention to linchpin papers (Parker on UHI, for example, Mann on proxies, Hansen on global temp).

      Lukewarmer: free the data; free the code; free the debate

      Steve: Let me endorse Steve Mosher’s comment here. Editorially I see little purpose in two-paragraph exchanges about the “big picture” effect of CO2. I’m interested in the topic but it’s surprisingly hard to find good references on it. If someone wishes to bring a reference to our discussion and use that reference as a starting point, fine, but otherwise let’s stick to smaller questions, which editorially have a chance of leading somewhere.

  62. Max
    Posted Aug 3, 2009 at 6:14 AM | Permalink

    Hmm, I always read sentences like this: “If the methods and data aren’t published, it is not science.” Yet, the topic I am most familiar with is mechanical engineering and here papers and stuff only have pictures but never data (though a method description is in it usually). Also, authors often refuse to openly distribute data that they have painstakingly acquired, often with the help of a private company. So, they don’t want to or can’t display the data freely.
    However, the difference is that engineering science is actually never used for policy decisions and when it is, the data usually has to be distributed. Yet, I don’t see why anyone in climate science would act differently. This is academia, not a “science fair”; it is the most narrow-minded society possible, at least when it comes to research…
    Often private companies are more forthcoming than research scientists, because researchers are a jealous lot and (rightly so) fear their fellow researchers. However, I would still like to see the data, especially since these studies are government-only studies that serve no other purpose than to inform politicians and the public. On top of that they are funded by us, so they had better stop pretending that this is all their IP. As with all inventions, the people running the company have a first shot at the patenting and thus the property of the data in question…
    Though this kind of thinking is still a stranger to Academia.

    • fFreddy
      Posted Aug 3, 2009 at 8:36 AM | Permalink

      Re: Max (#122),

      Yet, the topic I am most familiar with is mechanical engineering and here papers and stuff only have pictures but never data (though a method description is in it usually).

      In academic papers in mechanical engineering, are the method statements ever insufficient for a competent mechanical engineer to reproduce the paper’s results?

  63. Edward
    Posted Aug 3, 2009 at 9:02 AM | Permalink

    I understand CRU’s reaction to having their data obtained by Steve. I work for a very large international steel company. We and many other US steel companies provide daily information to CRU on a confidential basis which they use to create a weekly index of steel pricing levels. My company would consider it a violation of our agreement with CRU if our raw data was somehow shared publicly. CRU has agreements with the providers of the data that have to be lived up to. I feel their withdrawal of that information protects the confidentiality of their clients.

    I do not agree with Phil Jones withholding the data, but it’s not CRU’s fault. CRU is a for profit venture and they are just trying to protect their investment and the value of their product. It’s not some climate conspiracy as some seem to infer.
    thanks
    Ed

    • bernie
      Posted Aug 3, 2009 at 9:44 AM | Permalink

      Re: Edward (#124),
      I am not sure I understand the link between the Climatic Research Unit and an international steel index. Moreover, if there were or are confidentiality agreements, then Dr Jones could simply produce a list of them.

    • Mark T
      Posted Aug 3, 2009 at 9:52 AM | Permalink

      Re: Edward (#124),

      CRU is a for profit venture and they are just trying to protect their investment and the value of their product.

      There’s a difference between a large steel company protecting private interests and a company that operates on government funding whose work is used to make massive public policy decisions. These decisions are not otherwise protectable state secrets, e.g., some new weapon technology developed by a military contractor, but decisions that will directly affect the daily lives of the people that are paying for the data. By default, that gives them access rights, at the very least for legitimate scientific analysis, which Steve is certainly conducting.

      It’s not some climate conspiracy as some seem to infer.

      Phil Jones has made it clear that he refuses to release the data because he does not want to give skeptics an opportunity to find fault with it. Every other reason they (CRU) have proffered amounts to nothing but a smokescreen. If it looks like a duck, walks like a duck, and quacks like a duck, the only real conclusion is that it is a duck.
      .
      Mark

    • Keith
      Posted Aug 3, 2009 at 11:03 AM | Permalink

      Re: Edward (#124),

      Edward, I think you have mistaken which CRU we are discussing. I will hazard that you are used to this CRU (http://crugroup.com/Pages/default.aspx), which is an independent business analysis group, concentrating on particular industries. The CRU we are discussing is the Climatic Research Unit, located at the University of East Anglia, and affiliated with the Hadley Climate Center. Yours is a commercial enterprise, while ours is an academic center concentrating on Climate Research.

      • Mark T
        Posted Aug 3, 2009 at 4:38 PM | Permalink

        Re: Keith (#126), CRU is still a private entity, albeit a sort of government contractor, which was his point.

        Mark

        • Keith
          Posted Aug 4, 2009 at 11:42 AM | Permalink

          Re: Mark T (#129),

          Mark, that is not the point. His comment was that he sent metallurgic information to CRU, and he expected it to remain confidential. The CRU he sent his information to is an analysis corporation. Yes, private, but a commercial entity, bound by its charter of incorporation. The CRU we are discussing is an academic entity, a part of the Faculty of Science at the University of East Anglia (http://www.cru.uea.ac.uk/). As such, they operate under the Royal Charter that founded the University, making them a public concern. Per the UEA webpage –

          The University of East Anglia’s Royal Charter commits it to ‘advance learning and knowledge by teaching and research and to enable students to obtain the advantages of university education.’ As a modern institution in a global economy the university is also committed to ensuring that it operates at the highest standards with appropriate policies that meet the needs of staff, students and the wider public.

          and

          Introduction to Freedom of Information

          The Freedom of Information (FOI) Act came into effect in January 2005 and aims to promote transparency and accountability in the public sector. Under the terms of the Act, individuals have the right to request any information that is held by the University including all digital and print records and information whether current or archived. There are situations where information is not required to be released, or should not be released due to exemptions. The University, as a public body, is obliged to comply with the Act, and all staff have the responsibility to make themselves aware of their obligations under the Act.
          What does the Freedom of Information Act mean for the University?

          There are two main obligations imposed by the Act on the University:

          1. That UEA must maintain a Publication Scheme, which lists the types and format of information the University routinely provides to the public

          2. That any individual making a request for information is entitled to be informed in writing by the University whether or not the University holds the information, and if it does to have the information communicated to them within the specified time limit of 20 working days.

          For more specific guidance and advice on compliance with the FOI Act in your work please visit our guidance for staff page.

          For further information on how to make a request for information, or to make a request please visit our requests for information page.

          (http://www.uea.ac.uk/is/foi)

          Dr. Jones, as a member of the Staff of UEA, is bound by these terms.

  64. EW
    Posted Aug 3, 2009 at 9:20 AM | Permalink

    John Goetz:
    One more possibility to get old Russian data, but rather awkward. At this page in Russian, there are names of stations, and when you click one (optionally specifying the time interval), you’ll get a table of values for that station, although the download is rather long. The data are from the 1930s or even from the 19th century.
    Of course, collating this would probably be more difficult than with the standard weather file format, but if this is the only option left… maybe it helps.

  65. Scott Gibson
    Posted Aug 3, 2009 at 11:31 AM | Permalink

    Re: Steven Mosher (#121)

    You’re right, I made the mental leap of assuming anyone talking about GHG was talking about CO2, and I apologize. That is also why I like this blog: it is possible to have a discussion where the science is not settled, and people are able to separate reality from models, whether they are computer models or just our own mental models.

  66. John Goetz
    Posted Aug 3, 2009 at 7:12 PM | Permalink

    #123 EW

    Bingo! That is it…but what a painful way now to collect data.

  67. ianl
    Posted Aug 4, 2009 at 4:09 AM | Permalink

    Not only was the reaction to “lie in the weeds”, but in fact then to pull the weeds over the top of them

    Our saying in Aus is “sitting on their hands” – a common reaction when caught out

  68. Mike Lorrey
    Posted Aug 4, 2009 at 5:00 AM | Permalink

    “Dave Lister” at CRU has to be a pseudonym; it’s the name of a character from the BBC sci-fi comedy “Red Dwarf”.

  69. Steve McIntyre
    Posted Aug 4, 2009 at 7:22 AM | Permalink

    Later in the day on July 31, Tim Osborn, obviously a faithful Climate Audit reader, censored the censored folder, and even the existence of the censored folder (and the controversy webpage) is now censored.

    • Geoff Sherrington
      Posted Aug 4, 2009 at 7:28 AM | Permalink

      Re: Steve McIntyre (#135),

      Oh gracious, the folk at Wiki will have nothing left to censor.

      Maybe they can invent a document and put it through even more censorship steps than you relate at CA, just to show that they can.

  70. MetMole
    Posted Aug 4, 2009 at 8:30 AM | Permalink

    Veering off topic, but talking of censorship — or maybe not, as the case may be — I can find no mention of CRU’s recent activities/embarrassment over at RealClimate.morgue.

    Strange. Funny ol’ business.

  71. Posted Aug 4, 2009 at 10:03 AM | Permalink

    Re: Geoff Sherrington (#136),
    These numbers appear to be F (to the nearest .5deg) rounded to C: 50F = 10.0C, 61F = 16.1C, 61.5F = 16.4C, etc. [corr.]
    Was Aust on F back then?
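
    The arithmetic behind that guess can be checked with a short sketch. The function names are made up for illustration, and the test assumes the originals were read to the nearest 0.5 °F and then converted to Celsius rounded to one decimal.

```python
def fahrenheit_to_celsius(temp_f):
    """Convert Fahrenheit to Celsius, rounded to one decimal place."""
    return round((temp_f - 32.0) * 5.0 / 9.0, 1)


def looks_like_rounded_fahrenheit(temp_c):
    """True if temp_c is what a reading to the nearest 0.5 F would convert to."""
    approx_f = temp_c * 9.0 / 5.0 + 32.0
    nearest_half_f = round(approx_f * 2.0) / 2.0  # snap to the nearest 0.5 F
    return abs(fahrenheit_to_celsius(nearest_half_f) - temp_c) < 1e-9


for temp_f in (50.0, 61.0, 61.5):
    print(temp_f, "F =", fahrenheit_to_celsius(temp_f), "C")
```

    Only two of every nine possible tenths of a degree Celsius (16.1 and 16.4, for instance, but not 16.2 or 16.3) can arise this way, so a record in which certain decimals never appear is consistent with Fahrenheit originals.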

    • Geoff Sherrington
      Posted Aug 4, 2009 at 5:20 PM | Permalink

      Re: Hu McCulloch (#140),

      Australia went decimal on 14 Feb 1966. These temps might be in F originally, but even then the frequency of occurrence of certain figs does not look nice. Similar patterns happen after 1966 as well. The lighthouse keeper might have had a thermometer problem, because sometimes Tmax is given but Tmin can be missing on the same days for up to 2 months at a stretch. Nonetheless, this interval is still covered by GISS, with in-filling. I also in-filled, but by taking the same term from the following year and noting that I used infilling when describing the method.

      It is the failure to note detailed metadata and the failure to emphasise in-filling that causes my suspicion when GISS differs from BOM. Why adjust more, when adjustment has already been done? When you adjust, you should justify, right? (or left, or center).
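
      As a rough sketch of the kind of in-filling described above (not the actual procedure used, which is not shown in this thread), the following fills a missing daily value with the value for the same calendar day of the following year and records which days were filled. The data layout, a dict keyed by `datetime.date` with `None` for missing values, and the function name are assumptions made for illustration.

```python
from datetime import date


def infill_from_following_year(series):
    """Fill missing values from the same calendar day one year later.

    Returns the filled series plus the list of days that were in-filled,
    so the in-filling can be reported alongside the results.
    """
    filled = {}
    infilled_days = []
    for day, value in series.items():
        if value is None:
            try:
                donor_day = day.replace(year=day.year + 1)
            except ValueError:
                donor_day = None  # 29 Feb has no counterpart in the next year
            donor = series.get(donor_day) if donor_day is not None else None
            if donor is not None:
                value = donor
                infilled_days.append(day)
        filled[day] = value
    return filled, infilled_days
```

      Returning the list of filled days keeps the adjustment visible as metadata rather than leaving it buried in the series.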

  72. Posted Aug 4, 2009 at 10:16 AM | Permalink

    QBeamus

    To oversimplify, if a work is published without a copyright notice, it creates a safe harbor in which the public can use the work without fear.

    IANAL, but I think this is not entirely accurate. Failing to add the copyright notice no longer prevents one from asserting rights under copyright. The copyright notice permits the owner to sue for an expanded collection of remedies when others violate copyright, but failing to place a notice does not eliminate all rights.

    Some things, like facts and ideas, cannot be copyrighted at all. Fair use can permit some copying of copyrighted materials. For the most part, it’s the copyright owner and not others who can sue for copyright violations. Any of these may apply in the case of CRU data but they are separate from the question of whether or not failing to disclose the copyright notice bars the copyright owner from asserting rights under copyright. Owners can assert rights to their copyrighted materials even if they did not include the notice in a publication.

    • QBeamus
      Posted Aug 4, 2009 at 1:24 PM | Permalink

      Re: lucia (#142), I’ve now read down to the posts from Steve asking that we not hi-jack the thread with discussion of copyright, but I was so excited, after months of lurking, to have something I know enough about to actually contribute, I can’t resist. So with some trepidation, let me respond.

      I believe you just said pretty much what I was trying to say, but let me be more precise. Until 1988, publication without notice (at least in the U.S.) would kill the copyright, period. That was a feature of every U.S. copyright statute since 1790, the first. In ’88, Congress passed the new statute that implemented (theoretically) the Berne convention. The BCIA was a compromise–those who prefer the European regime complain that it failed to truly conform to the European standards. Perhaps this is where some of the confusion comes from…that is, the difference between the European rule, which it was supposed to implement, and the actual language and subsequent application by U.S. courts.

      In any event, as you say, the notice is important because it controls the range of remedies available. This is what I was referring to as the “safe harbor.” One who innocently copies a work that was published without the notice does not face legal liability. (See 17 U.S.C. sec. 405(b).) Assuming one is willing to stop distributing the work if and when they receive such notice, I believe my characterization is sound–they may use the work without fear (at least in the U.S.). That is, the worst that will happen is that they’ll have to stop, if and when the notice failure is remedied.

      In the context in which the issue came up in this thread, in particular, I believe it supports the conclusion, namely, that there is no reason, at present, to believe that downloading data from the CRU ftp site is a violation of (U.S.) IP law. That might conceivably be wrong–they might be able to “revive” a copyright. But, even if that’s so, you will be given notice and a chance to comply before facing any real liability.

      Your other observations about copyright law appear correct to me. My own professional opinion is that these data sets are copyrightable subject matter–“expression,” not an “idea”–(if a phone book can be, then this can, too), but that it would be a professional humiliation not to dedicate it to the public domain, and so I believe it is reasonable to believe that is what was done. Then again, the bizarre secrecy measures of the Met office are what got this thread started in the first place.

      • Steve McIntyre
        Posted Aug 4, 2009 at 1:31 PM | Permalink

        Re: QBeamus (#147),

        Point made. Your observations seem sensible to me. But let’s leave the matter where it stands. Thanks.

  73. Mark T
    Posted Aug 4, 2009 at 10:27 AM | Permalink

    Lucia’s got it right. Failing to file the copyright simply eliminates the possibility of collecting damages for infringement. Of course, I’m not sure how much protection one actually has if damages cannot be collected, i.e., if there’s no reason to sue.

    Mark

  74. Posted Aug 5, 2009 at 11:27 AM | Permalink

    Basically, if you are hiding your data, algorithms, etc., you aren’t doing science. Alchemy perhaps.

  75. Robinedwards
    Posted Aug 6, 2009 at 4:17 PM | Permalink

    Thanks, Geoff for those links. I have just tried a few, and they work, but I’ll not be looking for any data that are being sold! Currently a bit busy with data from Alaska, which show some remarkable features.

    I really am searching for series that are as long as possible, and wonder whether there is an index of any sort. Have not tried very hard to find one – just hoping that someone knows about these things.

    Your comments on “old” data are interesting. I am perhaps more sanguine (or naive) about old data. Really old stuff, such as for Berlin and other places in Europe, always impresses me with its consistency. The techniques I use tend to draw attention to data that are unreliable or “fiddled”, and so far the old values seem to have suffered less than some of the modern ones, which can show signs of prior treatment, like smoothing, before publication. Those old investigators were perhaps a bit more upfront about their work than those engaged on climate research these days.

    Robin

    • Geoff Sherrington
      Posted Aug 7, 2009 at 6:08 AM | Permalink

      Re: Robinedwards (#158),

      I apologise for the cost factor, but I don’t set it. Really, I think these past data have already been funded by the taxpayer.

      It would be easy to send you selected indices and long datasets, especially ones that I have analysed, but there are strict copyright provisions that I do not intend to break.

      (I’m not invoking the infamous “Why should I send you all this work when all you want to do is criticise it?”)

      Some of the references are cost free, but you need to go beyond Excel if you have not already, to condense them into bite-sized chunks. And as I have posted before, there are discrepancies between compilers of the data to the extent that it’s a bit hard to tell what is real and what is make-believe.

      Please persevere with the Aussie data. It’s quite rewarding and shrouded with less muck and mystery than that from some other countries. I’m happy to help.

  76. Steve McIntyre
    Posted Aug 12, 2009 at 4:03 PM | Permalink

    Olive Heffernan of Nature explains the deletion as follows:

    It transpires, however, that these data were on an anonymous ftp server intended for Met Office Hadley Centre project partners only, and were not for public use.

    • henry
      Posted Aug 13, 2009 at 6:38 AM | Permalink

      Re: Steve McIntyre (#161),

      Maybe we should ask Olive Heffernan exactly who these “Met Office Hadley Centre project partners” were, and why they couldn’t have been directed to a file with more security. The idea of an “anonymous ftp server” being loaded for use of a specific few is absurd.

      I mean, how do they know that the users of this file were academics, or would agree to the “confidentiality” that Jones requested?

      Or for that matter, what system was set up to prevent public use?
