NAS: Assuring the Integrity of Research Data

per inquired recently about obtaining a copy of Gerry North’s presentation to the newly minted NAS Panel on Assuring the Integrity of Research Data, which held its first hearings last week. Gerry North was appropriately the first speaker, as the new panel was occasioned by problems left unanswered by the North panel, although its terms of reference are much broader. The North presentation is here. Some background and thoughts follow.

The NAS Panel was formed at the request of the House Science Committee, which was protecting its turf after the earlier questions sent by the House Energy and Commerce Committee to MBH, IPCC and the NSF. Because of the controversy over replicability, one of the key Boehlert questions was:

(2) (c) Has the information needed to replicate their work been available? (d) Have other scientists been able to replicate their work?

At the NAS panel hearings, the panel was nonplussed when von Storch put this question on the board, as Ralph Cicerone, the President of NAS, had excluded this question from the terms of reference of the panel and they knew nothing of the Boehlert questions (although these were widely disseminated). My contemporary discussion, Sir Humphrey and the Boehlert Questions, is here. It seemed that the NAS Panel was more interested in dealing with “big” questions than in giving an opinion on the controversy that had occasioned their appointment.

At the end of the first day, David Goldston of the House Science Committee spoke up in the public portion, asking that the panel actually take some of the smaller controversies off the table, noting that there was going to be ongoing discussion of the “big” questions and lots of other occasions to deal with them. Several weeks later, the terms of reference were modified (contemporary CA report here) as follows:

– Comment [Evaluate -deleted] on the overall accuracy and precision of such reconstructions, relevant data quality and access issues, and future research challenges.

I thought that it would be very unlikely that the panel would grasp this nettle in view of their virtually total failure to even ask Mann about the questions raised by the House Committees and their total failure to follow up on Mann’s bizarre denial of even calculating the verification r2 statistic. And they didn’t, offering only some highly generalized observations that providing data was a good thing. A contemporary report at CA is here. The panel did not investigate or report on actual availability, merely observing platitudinously:

Our view is that all research benefits from full and open access to published datasets and that a clear explanation of analytical methods is mandatory. Peers should have access to the information needed to reproduce published results, so that increased confidence in the outcome of the study can be generated inside and outside the scientific community. Other committees and organizations have produced an extensive body of literature on the importance of open access to scientific data and on the related guidelines for data archiving and data access (e.g., NRC 1995).

I recall the idea of a new committee on Data being in the air around the time, but didn’t notice any contemporary notes. My guess is that Sir Humphrey Cicerone decided that, if they had no choice but to respond to these matters, his best approach was to dilute the paleoclimate problems where an answer was easy with complicated questions about software and every other scientific discipline under the sun – sort of like trying to develop a Napoleonic Code instead of making common law decisions.

Be that as it may, the new panel is entitled: Assuring the Integrity of Research Data (PIN: CSEP-Q-06-02-A). Its terms of reference here say:

Project Scope: An ad hoc committee will conduct a study of issues that have arisen from the evolution of practices in the collection, processing, oversight, publishing, ownership, accessing and archiving of research data. The key issues to be addressed are:

1. What are the growing varieties of research data? In addition to issues concerned with the direct products of research, what issues are involved in the treatment of raw data, pre-publication data, materials, algorithms, and computer codes?

2. Who owns research data, particularly that which results from federally-funded research? Is it the public? The research institution? The lab? The researcher?

3. To what extent is a scientist responsible for supplying research data to other scientists (including those who seek to reproduce the research) and to other parties who request them? Is a scientist responsible for supplying data, algorithms and computer codes to other scientists who request them?

4. What challenges does the science and technology community face arising from actions that would compromise the integrity of research data? What steps should be taken by the science and technology community, research institutions, journal publishers, and funders of research in response to these challenges?

5. What are the current standards for accessing and maintaining research data, and, how should these evolve in the future? How might such standards differ for federally-funded and privately-funded research, and for research conducted in academia, government, nongovernmental organizations, and industry?

The agenda for their first meeting is here. At my request, North sent me his presentation and gave me permission to post it up. My comments here are similar to my earlier comment. The PPT leads with an anti-Barton editorial from the Houston Chronicle and then has separate slides on both realclimate and climateaudit. One of his last slides is my post-NAS Panel letter to North asking for help getting data. (Not a single piece of data has been provided; the committee apparently did not ask whether any of the information was ever provided to me.)

Some of the points that North highlighted give a hugely misleading impression of paleoclimate data issues. For example, Phil Jones’ obstructionist refusal to identify stations is well-known to CA readers. At the NAS Panel hearings, Hans von Storch posted a slide with Phil Jones’ famous refusal: “We have 25 years invested in this, why should we let you see the data when your only objective is to find something wrong with it?” von Storch said that he could not believe that a responsible scientist had made such a statement, and he asked Jones to confirm the truth of this story – which Jones did. Von Storch condemned this attitude in the strongest possible terms. Instead of bringing this to the attention of the Data Integrity Panel, North posted up the following trivializing email exchange involving Phil Jones:

John,
You don’t need to ask permission to get HadCRUT2v. It is sitting on our web site.
http://www.cru.uea.ac.uk/cru/data/temperature/ . . .Cheers Phil

Readers of CA are familiar with the fact that key data sets used over and over by the Team in multiproxy studies – Yamal, Taymir etc. – are not archived and that Briffa refuses to disclose the data. These problems were brought to the attention of the NAS Panel. Indeed, I specifically confronted two presenters to the NAS Panel, D’Arrigo and Hegerl, about the unavailability of data and asked the NAS panel to get it. The NAS panel did nothing. Instead of reporting on these issues, North cited a 1997 study saying the following:

A considerable portion of tree ring data collected on all inhabited continents is freely available online (Grissino-Mayer and Fritts 1997)

While it’s true that much tree ring data is online, the trouble is obviously that this isn’t the case for some key series used by the Team. In his covering email to me, North said that he verbally took the position that data archiving procedures need to be tightened up:

My suggestion (not really on the slides) is that when an agency awards a grant like so many of these we have talked about, there should be a negotiation as to what is to be saved and in what form. There needs to be some consideration for the costs, etc. But the bottom line is that these things need to be agreed upon before the money is awarded.

In his covering email, he also said that he’d said that the paleoclimate community was “shocked” to find themselves thrust into the limelight and “totally unprepared” for it. That seems only partly true – they seem quite prepared for awards from Scientific American and quite prepared for adulation. Indeed, not only were they prepared for it, they went so far as to issue press releases for many scientific studies – you don’t issue press releases if you aren’t seeking the limelight. What they seem not to have been “prepared” for is someone saying: if you communicate with the public, then you have responsibilities of disclosure and due diligence that exceed those applicable to discussion in seminar rooms.

61 Comments

  1. Posted Apr 23, 2007 at 6:18 PM | Permalink

    “Instead of bringing this to the attention of the Data Integrity Panel, North posted up the following trivializing email exchange”

    An initial instinctive reaction would be that North is seriously afraid of something. For some reason the players seem to be afraid to have it put plainly in the public record that the data and methods are not available and these conclusions cannot be independently verified. I am not sure what the value might be in my speculation about exactly what he and the others might be afraid of, but I did want to share that this does appear to me to be a cover-up, and not a single soul is willing to acknowledge the elephant in the room. If I were a betting man, I would say they might have some serious skeletons in their closets that they want to see remain there. These people are afraid, in my opinion.

  2. Gary
    Posted Apr 23, 2007 at 7:22 PM | Permalink

    A sixth item seems to be missing from the AIRD Project Scope list: creation of an inventory of projects receiving federal grants that were completed in the last several (three? five?) years, with a simple indication of whether or not “raw data, pre-publication data, materials, algorithms, and computer codes” were archived in part or in whole. How can this panel be any more than another round of hand-waving if a benchmark isn’t established? And just to be honest about it, there ought to be relevant information about what archiving was required for each grant.

  3. Stan Palmer
    Posted Apr 23, 2007 at 9:12 PM | Permalink

    When post-modern philosophers point out that published science is a narrative agreed upon because it suits the interest of stakeholders, physicists and other scientists publish papers ridiculing such nonsense. Science is an unbiased search for the truth. When science as it is practiced is held up to the light, as it is here, it is illuminating to see which side is spouting nonsense.

    The issue here is that this is a political effort being pursued to advance the interest of the stakeholders. This blog keeps trying to treat it as a pure scientific effort. Pure science does not exist. Real science is all politics, and politics is about power, not truth. The truth is that which suits the interests of the powerful.

  4. Armand MacMurray
    Posted Apr 23, 2007 at 11:37 PM | Permalink

    I remain puzzled by the repeated implication that some sort of considerable cost will be incurred by performing acceptable archiving, as shown again in the quote of North above:

    There needs to be some consideration for the costs, etc

    This isn’t genomics generating gigabytes of data! It seems that at worst, any extra work would just involve scanning in a few notebook pages giving location/description details of cores/stations and uploading those along with the small data files containing ring measurements, temperatures, or whatever. Uploading code/calculation files would be nice, but also not very taxing.
    Perhaps there’s an “angry” climatologist out there who can point out the big time/money sink that I’m missing? 🙂

  5. bkc
    Posted Apr 24, 2007 at 12:13 AM | Permalink

    Dr. North, (I’m assuming you will read these comments)

    As a U.S. citizen, fellow Aggie (do Texas A&M Professors consider themselves Aggies?) and, with respect to Aggie (and TCO) tradition – after a couple of 101s and water, I would like to make a few comments on your presentation.

    First, it’s hard to get the flavor of your presentation from the slides alone. You present some of both sides, but without your commentary it isn’t really clear what your position is (at least on some of the issues).
    Would you consider doing a guest post on this site (as you’ve done on others)? I’m sure Mr. McIntyre would appreciate it. I think it would really help clarify your position, especially if you respond to some of the comments it would inspire.

    You come across as a very reasonable, intelligent kind of guy. However – you agreed with Mr. McIntyre’s criticisms of MBH in the body of the NAS report, but made excuses for them in the press conference. Dr. Mann essentially lied to your panel in his testimony regarding calculating R2, and you tolerated it. Mr. McIntyre has documented numerous examples of shoddy science, obfuscation, cherry picking, etc., but you are largely silent on the subject. Why is that?

    If I were as prominent and influential as you in my profession, I hope I would be the first to condemn these practices. Whatever you believe about AGW, permitting lousy science to go unchallenged will hurt your cause.

  6. TAC
    Posted Apr 24, 2007 at 4:19 AM | Permalink

    The five questions listed under “Project Scope” all seem reasonable, but they don’t reflect the evolving context of these questions. For one thing, technology has altered the landscape dramatically over the past decade. The cost of public distribution of data has dropped by orders of magnitude, and because essentially all data are now processed electronically, it has become trivially easy to publish them (as well as source code and detailed notes) on a website. While there may have been a valid “cost” argument at one time, it is hard to imagine making that argument today.

    In any event, where scientific work is used to inform public policy, the argument for open access — possibly with cost recovery when it comes to distribution of journal articles, for example — seems compelling. Is anyone prepared to argue that publicly funded scientific work should not be open to scrutiny?

    This Panel has the opportunity to recognize, endorse and promote a strong, open, and positive vision for government-funded 21st-century Science. I hope it takes advantage of it.

  7. Sara Chan
    Posted Apr 24, 2007 at 6:02 AM | Permalink

    #6, TAC, asks this: “Is anyone prepared to argue that publicly funded scientific work should not be open to scrutiny?” The answer is a very definite “Yes”. In particular, Peter Brown, the President of the Tree-Ring Society (the main society representing dendrochronologists), says that even when funding comes from public agencies, the data is his. No one else is necessarily allowed to scrutinize the data. Even more, Brown thinks that anyone who cannot see this is being grossly unreasonable.

  8. John A
    Posted Apr 24, 2007 at 6:17 AM | Permalink

    I can’t help feeling I’m watching climate science come to grips with “problems” and “questions” already posed and answered for the pharmaceutical industry, hospitals, medical centers, universities and health NGOs.

    If climate scientists weren’t getting a free pass, but were expected to be as slippery with the truth as pharmaceutical companies are claimed to be, would we really be looking at a document with such open-ended questions?

    Where are the questions concerning scientific ethics and the roles of audit and replication, Dr North?

  9. STAFFAN LINDSTRÖM
    Posted Apr 24, 2007 at 6:29 AM | Permalink

    4. Hey there 1 GB on DVD Cost of DVD-R or DVD+R 12-36 CENTS,
    depending on quality…US currency…too expensive…???

  10. John F. Pittman
    Posted Apr 24, 2007 at 7:32 AM | Permalink

    From #2

    A sixth item seems to be missing from the AIRD Project Scope list: creation of an inventory of projects receiving federal grants that were completed in the last several (three? five?) years with a simple indication of whether or not “raw data, pre-publication data, materials, algorithms, and computer codes” were archived in part or in whole.

    From #4

    This isn’t genomics generating gigabytes of data! It seems that at worst, any extra work would just involve scanning in a few notebook pages giving location/description details of cores/stations and uploading those along with the small data files containing ring measurements, temperatures, or whatever. Uploading code/calculation files would be nice, but also not very taxing.

    from #6

    The cost of public distribution of data has dropped by orders of magnitude, and because essentially all data are now processed electronically, it has become trivially easy to publish them (as well as source code and detailed notes) on a website. While there may have been a valid “cost” argument at one time, it is hard to imagine making that argument today.

    I am afraid this concern about cost is a red herring. As the original post indicates:

    In his covering email, he also said that he’d said that the paleoclimate community was “shocked” to find themselves thrust into the limelight and “totally unprepared” for it. That seems only partly true – they seem quite prepared for awards from Scientific American and quite prepared for adulation. Indeed, not only were they prepared for it, they went so far as to issue press releases for many scientific studies – you don’t issue press releases if you aren’t seeking the limelight. What they seem not to have been “prepared” for is someone saying: if you communicate with the public, then you have responsibilities of disclosure and due diligence that exceed those applicable to discussion in seminar rooms.

    Let’s think about this rationally, as #8 has indicated. What if we were discussing a cure for HIV? Does anyone think that the effort to record the data, the effort to be prepared for the “limelight”, and lawyers to secure data integrity would not have been mandated and required? Do you not believe that testimony would be required, such that if someone made such a claim, testified, and was proven to lie, they would be prosecuted for felony perjury?

    Yet the potential costs of reducing just CO2 emissions could dwarf what we are looking at as the cost of HIV. I started reading about these issues because regulations are being considered for the US. Other nations have already started. Nowhere in the cost estimates have I seen one of the fundamental facts about man’s fuel use. We take essentially pure, concentrated forms of hydrocarbon, and oxidize these forms efficiently to produce energy, resulting in a voluminous discharge of gases. Two fundamental facts of this process are that the concentrated forms are easy and relatively cheap to handle, and the gaseous emissions are neither easy nor cheap to handle. This makes me want to add another item to the Scope.

    Item x. Should the potential costs that the public, industry, and government communities face arising from science and technology reports be anticipated with respect to data integrity, transparency, and sharing? What steps should be taken by the science and technology community, research institutions, journal publishers, and funders of research to assure transparency, data sharing (including undocumented algorithms), and verification for reports where it can reasonably be assumed that these costs will be incurred?

    I would also like to add another item to the scope. Since I or my superiors always have to certify that the information is true and accurate or face multiple 5-year perjury felonies, I propose this addition:

    Item y. What certifications should the science and technology community pursue to ensure the integrity of research data, and what penalties should the science and technology community face arising from actions that would compromise the integrity of research data? What steps should be taken by the science and technology community, research institutions, journal publishers, and funders of research in response to these challenges?

  11. JP
    Posted Apr 24, 2007 at 8:58 AM | Permalink

    #4,
    Much of the temp data used by different researchers comes from the same sources. Armand, I certainly agree with you in that respect. How much money can it cost to archive what is probably less than 1 GB of data, various pieces of source code, sorting and filtering routines, etc.? If one were to take all of the hourly weather observations for the last 40 years and put them into a database, that database would be smaller than the database running an ERP system for a medium-sized factory.

    My solution is simple. Since the Federal government spends nearly 3 trillion dollars of taxpayers’ money, there should be enough money to fund a Climate Science clearing house which would provide this archiving service free of charge. This service would be mandatory for all who use federal funds or participate in the IPCC and other organizations.

  12. Mark T.
    Posted Apr 24, 2007 at 9:10 AM | Permalink

    Is anyone prepared to argue that publicly funded scientific work should not be open to scrutiny?

    Some, yes. Military work should not be open to public scrutiny, nor any other defense related work. Neither should any work that is based solely on the concept of private profit. This dendro work does not qualify in either respect. It should also be noted that those holding on to their work for profit motive do not publish in journals that require archiving, either. It should also be noted that patentable ideas remain so for some time period after initial publication (though it is unwise to publish first, then attempt to patent).

    Mark

  13. Richard deSousa
    Posted Apr 24, 2007 at 10:04 AM | Permalink

    If medical research to invent new drugs to cure illness were conducted like climate research we’d all be dead by now because of the inability to verify the efficacy of the drugs.

  14. trevor
    Posted Apr 24, 2007 at 10:17 AM | Permalink

    Aren’t there laws about this stuff if the work is funded by ‘we the people’ money?

  15. John Hekman
    Posted Apr 24, 2007 at 11:07 AM | Permalink

    Steve
    Are you aware of this article by Soon et al?

    Estimation and representation of long-term (>40 year) trends of Northern-Hemisphere-gridded surface temperature: A note of caution

    abstract:
    Several quantitative estimates of surface instrumental temperature trends in the late 20th century are compared by using published results and our independent analyses. These estimates highlight a significant sensitivity to the method of analysis, the treatment of data, and the choice of data presentation (i.e., size of the smoothing filter window). Providing an accurate description of both quantitative uncertainties and sensitivity to the treatment of data is recommended as well as avoiding subjective data-padding procedures.

    http://www.agu.org/pubs/crossref/2004/2003GL019141.shtml

  16. Posted Apr 24, 2007 at 12:20 PM | Permalink

    #15

    I think RC has debunked that already:

    http://www.realclimate.org/index.php/archives/2005/01/peer-review-a-necessary-but-not-sufficient-condition/

    Size of smoothing filter window is something I’ve been worried about (e.g. http://www.geocities.com/uc_edit/MA/moving_average.html); I’d like to see that paper.

    RC:

    It is unfortunate that a followup paper even had to be published, as the flaws in the original study were so severe as to have rendered the study of essentially no scientific value.

  17. Mark T.
    Posted Apr 24, 2007 at 12:36 PM | Permalink

    Aren’t there laws about this stuff if the work is funded by we the people’ money?

    Not that I know of.

    Mark

  18. Posted Apr 24, 2007 at 1:40 PM | Permalink

    It is obvious that Mann et al never opened their statistical signal processing textbooks. Reading MannGRL04 makes me sick (and that is the paper that supposedly invalidates Soon’s paper). Not my problem, but someone should go to some SIAM conference, present that paper (or any other in which these people are involved) and shout: Go get em boys! 😉

  19. Mark T.
    Posted Apr 24, 2007 at 2:03 PM | Permalink

    It is obvious that Mann et al never opened their statistical signal processing textbooks.

    Mann openly admits to not being a statistician, while vehemently defending his statistical methods.

    Mark

  20. Douglas Hoyt
    Posted Apr 24, 2007 at 2:41 PM | Permalink

    Has Mann ever published a paper that is not riddled with errors?

  21. John Lang
    Posted Apr 24, 2007 at 6:48 PM | Permalink

    Have a look at Michael Mann and Gavin Schmidt’s latest statistical analysis.

    Tropical Atlantic SSTs are up 2.5C. The highest correlation you have ever seen between tropical storm PDIs and the “unbelievably” high increase in SSTs in the past 33 years. LOL.

  22. Paul Linsay
    Posted Apr 24, 2007 at 7:35 PM | Permalink

    #16

    Size of smoothing filter window is something I’ve been worried about

    Amen. For some reason climate scientists have the bizarre notion that they should run a smoothing filter over any and every time series. Not only does it produce false correlations and spurious trends but it also removes a lot of interesting information that might exist in the noise. It happens a lot on this site too. I guess it’s to be expected, “bad company ruins your manners.”
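[The smoothing concern raised in this comment is easy to demonstrate numerically. The sketch below is purely illustrative — synthetic white noise, not any actual climate series, with an arbitrary window length — and shows how running a moving average over two completely independent series inflates the apparent correlation between them, because smoothing reduces the effective number of independent samples:]

```python
import numpy as np

rng = np.random.default_rng(0)


def moving_average(x, w):
    """Simple boxcar smoother of window length w (shortens the series)."""
    return np.convolve(x, np.ones(w) / w, mode="valid")


n, w, trials = 200, 21, 500  # series length, window, Monte Carlo trials
raw_r, smooth_r = [], []
for _ in range(trials):
    # Two independent white-noise series: true correlation is zero.
    a, b = rng.standard_normal(n), rng.standard_normal(n)
    raw_r.append(abs(np.corrcoef(a, b)[0, 1]))
    # Correlation of the smoothed versions of the same series.
    sa, sb = moving_average(a, w), moving_average(b, w)
    smooth_r.append(abs(np.corrcoef(sa, sb)[0, 1]))

print(f"mean |r|, raw series:      {np.mean(raw_r):.3f}")
print(f"mean |r|, smoothed series: {np.mean(smooth_r):.3f}")
```

[With these parameters the mean absolute correlation of the smoothed pairs comes out several times larger than for the raw pairs, even though the underlying series share no signal at all — the "false correlations" the comment describes.]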

  23. Steve Sadlov
    Posted Apr 24, 2007 at 7:38 PM | Permalink

    RE: #21 – and of course, the obligatory “with adjustments and corrections” – arrrrrrgh!

  24. Posted Apr 24, 2007 at 11:41 PM | Permalink

    #20

    I’m not aware of such a paper; maybe there is a paper where he doesn’t use math.

    #21

    RC:

    Some have even gone so far as to state that this study proves that recent trends in hurricane activity are part of a natural cycle.

    It seems that for those guys H0 is ‘this phenomenon is due to humans’, and you’ll need lots of evidence to reject that hypothesis.

  25. Don Keiller
    Posted Apr 25, 2007 at 9:07 AM | Permalink

    Not connected directly with this thread, but thought you guys would be interested in this http://environment.guardian.co.uk/climatechange/story/0,,2064925,00.html
    In short, dozens of climate scientists are trying to block the DVD release of a controversial Channel 4 programme that claimed global warming has nothing to do with human greenhouse gas emissions.

    Sir John Houghton, former head of the Met Office, and Bob May, former president of the Royal Society, are among 37 experts who have called for the DVD to be heavily edited or removed from sale. The film, the Great Global Warming Swindle, was first shown on March 8, and was criticised by scientists as distorted and misleading.

    In an open letter to Martin Durkin, head of Wag TV, the independent production company that made the film, the scientists say: “We believe that the misrepresentation of facts and views, both of which occur in your programme, are so serious that repeat broadcasts of the programme, without amendment, are not in the public interest … In fact, so serious and fundamental are the misrepresentations that the distribution of the DVD of the programme without their removal amounts to nothing more than an exercise in misleading the public.”

    I just wonder if these eminent scientists will be calling for the revision/withdrawal of “An Inconvenient Truth”? Or do they think that this latter film accurately represents the science?

  26. Stan Palmer
    Posted Apr 25, 2007 at 9:36 AM | Permalink

    re 25

    In regard to science as “an accurate representation of reality”, one must remember that this is a philosophical and not a scientific problem. Scientists have no particular expertise in determining what is real or not real. For this sort of discussion, the concept of hyperreality is a good one to consider. Polar bears are thriving. Their demise is a media invention. It was invented in the media because it suited the broader purposes of environmentalism. In nature, polar bear populations are increasing to the level that they are becoming pests in human settlements. In the reality of “An Inconvenient Truth”, they are rapidly disappearing. In the words of the Wikipedia entry

    Hyperreality is a means of characterizing the way consciousness defines what is actually “real” in a world where a multitude of media can radically shape and filter the original event or experience being depicted

    Scientific reality is something that is invented. The philosopher C.S. Peirce pointed out that it could be invented because it was useful in making predictions and calculations. It can be invented because it is pragmatically useful. On the other hand, it can be invented because it suits a larger social purpose. The AGW of “An Inconvenient Truth” is a scientific truth of that sort. AIT accurately represents the hyperreal view of a certain group of environmentalists.

  27. Steve Sadlov
    Posted Apr 25, 2007 at 10:33 AM | Permalink

    RE: #25 – The Ministry of Truth must edit the DVD first, to ensure content fitting of the Love of Big Brother, prior to any Telescreen viewing of it /s

    RE: #26 – Bear problems. This issue is heating up (pun intended) with regard to all North American species. Some are becoming so accustomed to being garbage bears that they don’t hibernate. UHI for warmth, trash cans for food, life is good! Interestingly, in a local rag I read recently, some local Greenie was trying to spin it into “early emergence from hibernation due to AGW” – when in fact, I doubt that enough bears are tagged and trackable to be able to assert this. There is no early emergence; they never went into hibernation at all. Suburbanized bears.

  28. fFreddy
    Posted Apr 25, 2007 at 12:39 PM | Permalink

    RE #25, Stan Palmer
    Rubbish.
    There are not multiple realities: there is only one reality and multiple perceptions of that reality.
    The demise of polar bears is not a media-invented reality, it is a media-promulgated misperception.
    Why do philosophy geeks spend their whole time trying to sneak complete redefinitions of the language into their arguments?

    This post is much politer than its first incarnation …

  29. Don Keiller
    Posted Apr 25, 2007 at 1:00 PM | Permalink

    Amazingly, Sir John Houghton still endorses the “Hockey Stick”
    http://www.jri.org.uk/brief/climatechange.htm

    Clearly he will be more than happy with its use by Al Gore.
    This is dismal stuff – this guy advises our Government!

  30. Steve Sadlov
    Posted Apr 25, 2007 at 1:12 PM | Permalink

    RE: #29 – (OT) Whether or not one agrees with its overall premises, Peter Hitchens’ book “The Abolition of Britain” does a good job of presenting snapshots of the British mass psyche at various points in post-WW2 history. If one were to extrapolate the trends based on said snapshots, from the date the book was completed (I believe it must have been ~2001 or so), none of this is a bit surprising. My only question would be: where will it lead? I will rule nothing in or out. The lessons of the 16th through late 18th centuries are particularly poignant. (/OT)

  31. per
    Posted Apr 25, 2007 at 1:41 PM | Permalink

    I understand that Prof North is acting on behalf of a committee, and so must give a presentation that reflects the committee’s view. Nonetheless, I would not describe the content of the powerpoints as making a case, although the accompanying narrative would make an enormous difference. I was a bit disappointed in the powerpoints, given the material he must have seen.

    There needs to be some consideration for the costs, etc

    I think the practicalities are important here, so I will accept this and other points. Someone needs to look after the server that makes the stuff available; someone needs to curate the archives, manage updates, etc. Even if the data is 1 kByte per entry, it still requires attention; and attention costs money.

    Not that this is any excuse for some of the issues publicised so well on this blog!
    per

  32. Steve McIntyre
    Posted Apr 25, 2007 at 1:57 PM | Permalink

    #31. per, that’s already looked after. The World Data Center for Paleoclimatology is already funded to do this. There’s a large existing collection of tree ring, ocean sediment and even ice core data. So costs at a permanent archive are NOT an issue. The only cost is sending the data in.

  33. Jaye
    Posted Apr 25, 2007 at 3:04 PM | Permalink

    Scientific reality is something that is invented.

    That’s an old argument. For instance, is mathematics discovered or invented? Not an easy question to answer. When I discovered that you could prove the Fundamental Theorem of Algebra using complex-analytic, topological or algebraic proofs, I switched from invented to discovered. Although some would say that because beings with the same basic brains conceived of these notions, there will naturally be cross-talk between the various forms. Though I still think it’s discovered. Reality does exist, unless of course we just think it does.

  34. per
    Posted Apr 25, 2007 at 4:23 PM | Permalink

    #32; it is clear to me that many of the points on data archiving raised at this blog are straightforward. There seems to me to be precious little excuse for failing to archive many of the series you have blogged about; and there is an existing archive. That is why I wrote that there is no excuse for these examples.

    I took North’s point to be broader, and I was responding in part to Armand MacMurray’s post, #4. Even without considering the prospects of archiving new and different forms of data, and data from proprietary programmes, etc., the archive has to get funded from the research pot. As you can imagine, this pot is fiercely contested, and it is very easy for data centres to lose funding; and data centres frequently do have to put in effort to offer a good service (curation, cataloguing, backup,…). So my perspective here is that North was right to identify this as an important issue, even if it isn’t much of an issue to a user of the service.

    per

  35. Stan Palmer
    Posted Apr 25, 2007 at 4:51 PM | Permalink

    re 28

    What ‘reality’ is available to the standard media consumer in the west? It is the reality reported on that media. In the media, polar bears are being gravely endangered by AGW. If reality is what we experience with our senses, then the AGW demise of the polar bear is real. This is the hyperreality that people use to make their decisions. The science of AIC is true in a hyperreal world.

    So in reply to 33, reality does exist; it is just that it is not particularly important in the discussion of AGW. If sceptics attempt to treat this as an issue of natural reality, they are bound to lose. Hyperreality is a creation of the media, and the media are a creation of what people want to believe.

  36. Steve Sadlov
    Posted Apr 25, 2007 at 5:00 PM | Permalink

    RE: #35 – Classic example of hyperreality. Many ski areas in California are suffering financially, and this in spite of some really wonderful, long seasons since the end of the 1987 – 1991 drought. Why? Because the masses, conditioned by the media to believe that “AGW will cause snow levels to rise and shorten ski seasons,” no longer bother to check ski areas’ websites or book trips there after the end of February. It’s great for us, since we dispense with the crowds during Mar – May, but terrible for the ski areas, which can barely afford to run half their available lifts during this now lengthened “shoulder season.” Maybe they should undertake class action lawsuits against Gore and Der Govinator.

  37. Sam
    Posted Apr 25, 2007 at 5:14 PM | Permalink

    While it has already been said many times here, imagine if a group of scientists antithetic to AGW produced a study claiming to, once and for all, disprove any human responsibility for the apparent climate change. While accepting all the laudatory acclaim (I’m dreaming), some disgruntled, and soon to be out of work, catastrophists desperately demanded the data sets that proved this finding. The new celebrity scientists calmly state that their data had been inadvertently destroyed, but their work was so thorough that everyone could rest assured that the results were solid and valid. Congress should withdraw any actions it had planned.

    I’m sure this would be gracefully accepted and no governmental investigations or hearings would be forthcoming.

  38. per
    Posted Apr 25, 2007 at 5:23 PM | Permalink

    imagine if a group of scientists antithetic to AGW…

    well, there are many examples of similar groupthink. One prime example is tobacco smoking; while it is clear that the tobacco companies did not act particularly ethically, I am also aware of a rather eminent professor who deliberately suppressed data because it showed that tobacco smoking had a beneficial effect.

    I am sure he had the best possible motivation; but just imagine the effect on the scientific literature if people deliberately self-censor what they publish; you wouldn’t be able to trust anything you read, for fear either that it was self-censored, or people hadn’t published the contradictory results. This type of analysis is one of the reasons why the scientific literature on second-hand smoking health effects is so weak.

    per

  39. TAC
    Posted Apr 25, 2007 at 6:25 PM | Permalink

    The Committee on Assuring the Integrity of Research Data in an Era of E-Science (roster) seems to be a diverse group consisting of independent-minded and highly accomplished physicists (one atmospheric scientist, Wofsy), biologists and lawyers. My bet is that we can look forward to a thoughtful and interesting report.

  40. Steve Sadlov
    Posted Apr 25, 2007 at 7:54 PM | Permalink

    RE: I scanned some snowfall records for a famous (think 1960 Winter Olympics) ski area. Since the winter of 98/99 there has been no discernible trend toward shorter seasons. The longest season in that time frame was actually 04/05 – which started October 20 and ended the week after Memorial Day. Even this season, in what is technically a drought year, is neither the shortest nor the one with the lowest cumulative snow fall amount.

  41. hswiseman
    Posted Apr 25, 2007 at 8:03 PM | Permalink

    The beginning, middle and end of scientific reality is measurement. Measurement can only exist within a context. Context is created through the use of conventions such as GMT, the length of a meter, the mass of a kilo, the quantification of a calorie, etc. This whole field looks like the Gang that couldn’t measure straight (certainly there are not too many straight shooters either). The conventions for measuring SST, surface temp, polar ice coverage, hurricane intensity, appropriate statistical technique and so on and so on are poorly defined, lack universal scientific acceptance and are reduced to practice with feeble rigor and dubious calibration. If we were serious about the endeavor to understand our climate (and if you have not yet figured out that a huge chunk of the “science” is nothing but agitprop, you may stop reading now), the first step would be the creation of comprehensive agreed conventions reliably executed. The failure to address this simple step speaks volumes about the hyperbolic search for climate truth. Wake me up when the real data gets here.

  42. bender
    Posted Apr 25, 2007 at 8:14 PM | Permalink

    I’m no expert, but 20th c. temperature measurements were never designed for global climate change detection. They were designed for local-scale weather forecasting. It’s not that global climatologists can’t measure straight. It’s that local meteorological observations aren’t optimized for global climatological inference. Consequently you need to make adjustments if you want to make correct inferences. Hence the complexity, the statistics, the arguments over methodology …

    No one here has proven that Jones’s adjustments are wrong. All that has been shown is that we don’t know what the adjustments really are.

  43. Steve McIntyre
    Posted Apr 25, 2007 at 9:41 PM | Permalink

    #42. bender, the CRU assumption that all SST measurements shifted in 1941 from Recyclable AbathyThermographs (buckets) to engine inlets is clearly wrong.

  44. MrPete
    Posted Apr 25, 2007 at 9:45 PM | Permalink

    bender said:

    No one here has proven that Jones’s adjustments are wrong. All that has been shown is that we don’t know what the adjustments really are.

    …and AFAIK, [known value] * [unknown adjustment] = [unknown value]

    Therefore, what has also been shown is that we don’t know what Jones’s results really are, even though they are published!

    If I’m not mistaken, the CI for unknown data is unknown…

    …and thus, any work that incorporates Jones’s data as a factor has actually introduced an unknown.

    How interesting.

    I used to postulate that only executable code (including scripts, macros, etc) could convey a computer virus… and that pure data was immune from viral attack.

    What we’re seeing here is a delayed-release data virus: the infection has been spreading since 1990. Now the trigger is being released. It will be fascinating to see how much scientific work has been infected.
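    MrPete’s unknown-propagation point can be sketched in a few lines of Python. This is purely illustrative (the reading and the NaN stand-in for an undisclosed adjustment are hypothetical values, not anyone’s actual data): once an unknown enters a calculation, everything downstream of it is unknown too.

    ```python
    import math

    known_reading = 14.2                # a measured value, deg C (hypothetical)
    unknown_adjustment = float("nan")   # an undisclosed correction

    # [known value] * [unknown adjustment] = [unknown value]
    adjusted = known_reading * unknown_adjustment
    anomaly = adjusted - 14.0
    decadal_mean = sum([anomaly, 0.1, 0.2]) / 3

    print(math.isnan(adjusted))      # True: known * unknown = unknown
    print(math.isnan(decadal_mean))  # True: the unknown propagates downstream
    ```

    This is the same behaviour as “#VALUE!” spreading through a spreadsheet: the arithmetic never recovers a known result from an unknown input.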

  45. tc
    Posted Apr 25, 2007 at 10:21 PM | Permalink

    Yes, there are federal laws regarding access to and integrity of research data (reply to trevor #14, Mark T. #17). Here’s an excerpt from Public Law 105-277, 105th Congress, enacted October 21, 1998:

    That the Director of OMB amends Section __.36 of OMB Circular A-110 to require Federal awarding agencies to ensure that all data produced under an award will be made available to the public through the procedures established under the Freedom of Information Act…

    After the law was enacted, OMB set about to implement that law by proposing to revise Section 36 of OMB Circular A-110 with a watered-down version of the law’s plain language. Even this watery version was too difficult for the scientific research establishment to swallow. The establishment was energized and swamped OMB with squeals of discomfort. As a result, here is an excerpt from the Section 36 that OMB adopted:

    (d) (1) In addition, in response to a Freedom of Information Act (FOIA) request for research data relating to published research findings produced under an award that were used by the Federal Government in developing an agency action that has the force and effect of law, the Federal awarding agency shall request, and the recipient shall provide, within a reasonable time, the research data so that they can be made available to the public through the procedures established under the FOIA.

    http://www.whitehouse.gov/omb/circulars/a110/a110.html#36

    Section 36 is a powerful and useful tool to obtain federally-funded research data used by the Federal Government in developing an agency action that has the force and effect of law. But compare the plain language of the law to the overly-restrictive language of OMB’s implementation.

    One action that I will be recommending to the NAS Committee on Assuring the Integrity of Research Data is that the NAS Committee request OMB to revise Section 36 of OMB Circular A-110 to better reflect the plain language of Public Law 105-277.

    Frankly, it was disappointing to see the NAS Project Scope’s laundry list of questions that suggest the scientists are wandering in the wilderness with no guide. In fact, there are already existing laws and regulations with established requirements for those questions that pertain to federally-funded research.

    It also was disappointing to see that the agenda of Committee’s first meeting (April 16-17, 2007) did not appear to include a major presentation on existing laws and regulations. Such a presentation would establish the context and basis for discussion of needed improvements. For example, it would have been useful to have a presentation titled: Current federal laws and regulations on access to and integrity of research data: what is working, what is not working, what needs improvement.

    On the project’s NAS public website, the lack of explicit recognition of existing laws and regulations in the Project Scope and in the first meeting agenda does not reflect well on the public face of this project. Through the NAS website, I will be making recommendations to the Committee to remedy this situation. You too can provide recommendations or comments on this project at the FEEDBACK link on the bottom of:

    http://www8.nationalacademies.org/cp/projectview.aspx?key=48721

    In the coming days, I will post more information about existing federal laws and regulations on access to and integrity of research data.

  46. bender
    Posted Apr 25, 2007 at 10:34 PM | Permalink

    Re #43 Ok, other than that, #42 stands. [Hadn’t read the RAT thread yet. And like I said, I’m no expert.]
    Re #44 You are throwing the baby out with the bathwater. Counterpoint:
    [biased value] * [correct bias-correction method] = [correct value]
    The problem is that without transparency you have no way of evaluating the correctness of the bias-correction method. Point is: transparency and correctness are different issues. So don’t go assuming that people who aren’t transparent are necessarily wrong.

  47. trevor
    Posted Apr 25, 2007 at 10:38 PM | Permalink

    Re #45: Thank you tc. Clearly an excellent well informed post, and your suggestions are sound and constructive. I will look forward to seeing your further posts.

  48. Steve McIntyre
    Posted Apr 25, 2007 at 11:02 PM | Permalink

    #46. I agree with this.

    I’m of two minds about spending time on temperature data. I have no doubt that late 20th century temperatures were warmer than mid-19th century temperatures, and the estimates are reasonable to within a couple of tenths. Is there a little point-shaving in AR4 to bump the number from 0.6 to 0.8 deg C? I wouldn’t be surprised if they were leaning on the data a little, but we’re just talking a couple of tenths.

    I’m partly interested because Phil Jones is being so obstructive. But just because he’s being ornery doesn’t necessarily mean that there’s a smoking gun anywhere. It’s possible that he’s just being ornery. Or maybe they aren’t doing a whole lot of work on the data; maybe they are just doing simple operations on GHCN data and subsidizing other work off this profit center. Maybe that’s why they don’t have an open kimono. But such a reason wouldn’t have an impact on their results.

    I’m more intrigued with the SST situation. I’d like to have control of the land situation, which seems simpler, prior to wading into the SSTs. But maybe the difference between the 1930s and the present isn’t as much as represented. If one thinks that the proxy information has any merit, many proxies show warm 1930s.

  49. fFreddy
    Posted Apr 26, 2007 at 2:43 AM | Permalink

    Re #35, Stan Palmer
    Sorry, this is still more rubbish. Please stop abusing the meaning of the word ‘real’.

    In the media, polar bears are being gravely endangered by AGW. If reality is what we experience with our senses, then the AGW demise of the polar bear is real.

    This statement assumes that the media is entirely correct, which they clearly are not. Listening to/reading a media report is not a direct experience of reality. There are good reasons for the courts to exclude hearsay evidence.

    This is the hyperreality that people use to make their decisions.

    Please also avoid totally bogus neologisms like ‘hyperreal’. I suggest that your #35 would benefit from replacing this non-word with the more accurate word ‘fantasy’.

    This is the fantasy that people use to make their decisions. The science of AIC is true in a fantasy world.

    See what I mean? Same applies to Steve S in #36.

    Your abuse of the meaning of the word real in your first paragraph leads into your second paragraph with the completely ridiculous statement that:

    So in reply to 33, reality does exist; it is just that it is not particularly important in the discussion of AGW.

    The next step in this progression (which I do not accuse you of having taken) is to claim that realities are personal, and all realities are equally valid, and not accepting other people’s realities is discriminatory. You can probably slip in that science is just another form of faith.
    The only way I know to stop this destructive nonsense is to cut it off at the first step. So please stop confusing ‘perception’ with ‘reality’.

  50. TAC
    Posted Apr 26, 2007 at 4:53 AM | Permalink

    #42, bender raises an important set of issues when he says

    20th c. temperature measurements were never designed for global climate change detection

    Collecting longitudinal data requires diligence and consistency if one hopes to conduct meaningful trend studies. Relatively modest changes in methods (instruments; sampling locations; sampling times; personnel; instrument calibration method; etc.) can easily introduce “statistically significant” bias.

    More important, IMHO: post facto bias corrections are invariably problematical (I’m not talking only about climate science here, but about earth-science monitoring in general). Often they are done for a procrustean purpose, never stated, of bringing data into conformance with theory. IMNSHO, where data have been adjusted, one should never be confident that an observed signal corresponds to a natural process rather than to the adjustment.

    What’s the answer? Funding extremely high quality data collection efforts (disclosure: I am paid in part to work on this). Otherwise, as I think we all know, everything that follows rests on an uncertain foundation.
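    TAC’s point about method changes is easy to demonstrate with a toy simulation. Everything here is hypothetical: a station with zero true trend, small measurement noise, and an instrument swap at year 50 that introduces a +0.3 deg step bias. A naive straight-line fit then reports an apparent warming “trend” out of nothing:

    ```python
    import random

    random.seed(0)
    n = 100  # years of record

    # Zero true trend; Gaussian noise; a +0.3 deg step bias when the
    # instrument is changed at year 50 (all values hypothetical).
    readings = [random.gauss(0, 0.1) + (0.3 if yr >= 50 else 0.0)
                for yr in range(n)]

    # Ordinary least-squares slope (deg per year), closed form.
    xbar = (n - 1) / 2
    ybar = sum(readings) / n
    num = sum((x - xbar) * (y - ybar) for x, y in enumerate(readings))
    den = sum((x - xbar) ** 2 for x in range(n))
    slope = num / den

    # Prints a clearly positive per-century "trend", though the true trend is zero.
    print(round(slope * 100, 2))
    ```

    The step bias is absorbed into the slope estimate, and with ordinary noise levels it will test as “statistically significant.” Unless the metadata recording the instrument change survives, the artifact is indistinguishable from a real signal, which is exactly why post facto adjustments and undocumented method changes are so troublesome.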

  51. MarkW
    Posted Apr 26, 2007 at 5:01 AM | Permalink

    42,

    The adjustments haven’t been proven right either. We are just told that we have to trust that Jones et. al. did them correctly since after all, they are “reputable” scientists.

  52. MrPete
    Posted Apr 26, 2007 at 5:50 PM | Permalink

    #46 (bender)

    You are throwing the baby out with the bathwater. Counterpoint:
    [biased value] * [correct bias-correction method] = [correct value]
    The problem is that without transparency you have no way of evaluating the correctness of the bias-correction method. Point is: transparency and correctness are different issues. So don’t go assuming that people who aren’t transparent are necessarily wrong.

    I hope I’m not throwing out the baby! Of what use are correct answers if their correctness can’t be evaluated? I didn’t say the answers are WRONG, just UNKNOWN.

    In most arenas, an Unknown data value poisons any formula it enters. “#VALUE” in Excel terms 😉

    I agree that unknown != incorrect. It’s just as bad from a practical viewpoint.

  53. hswiseman
    Posted Apr 26, 2007 at 10:51 PM | Permalink

    If you are using data with adjusted basis expressed in a new metric, and you defined both the adjustment and the metric, and the end result is a grand slam in favor of your theory (did I just hear someone say Emanuel, or was it just a stray prayer?), the burden of proof shifts against the theory proponent. If this is transparently presented, fine, let’s have at it and debate the merits. There is a lot of valuable work in this vein (Kossin’s latest comes to mind), so I agree with the Ridin’ High Gator that reflexively impugning motives is unfair and stifles a full discussion. Make no mistake though, these researchers are out on a limb and CA is a house of saws. If I were one of them, I am not sure I would rush in for my turn under the bare light bulb either.

  54. bender
    Posted Apr 27, 2007 at 6:02 AM | Permalink

    RE: “CA is a house of saws”.
    Some at CA are sawing limbs. Others are trying to coax the fervent few back to the trunk that is normal scientific practice: conjecture, study, document, publish, replicate. If GW is a problem and if A is to blame, then we need an auditable paper trail for the estimate of A. Some will say it is large. Some will say it is small. You want to know how the various estimates of A in AGW are derived before you start sawing too hard. You might catch your hand in there.

  55. Steve McIntyre
    Posted Apr 27, 2007 at 7:09 AM | Permalink

    #54. In this respect, readers may observe that I have not expressed any opinion on this. I do not exclude the possibility that there is some valid element of truth underneath the hype and over-promotion. Anyone familiar with mining promotions knows this – just because something’s being hyped doesn’t mean that the claims are false, just that you have to be careful. Also just because some overheated claims are made doesn’t mean that the stock is necessarily a bust; I can think of some opposite cases. But you need to be even more careful then.

    Personally I think that the Team’s cause would be better served for their purposes by having a clear verifiable trail of explanation of exactly why there’s a problem and how the quantum is calculated so that an educated non-specialist reader (with say the skill set of many of us here) could follow the logic from top to bottom and verify the calculations for themselves. That’s what I’d do if I were managing this aspect of IPCC. However, the Team has chosen not to do things this way.

  56. Posted Apr 27, 2007 at 12:00 PM | Permalink

    Re 48: “I’m more intrigued with the SST situation. I’d like to have control of the land situation, which seems simpler, prior to wading into the SSTs.”

    The SST graph I’ve filed from this site, which shows the anomaly with and without the bucket correction, seems extremely simple to me. Warming begins in 1910 and rises at approx 1.4 deg/decade for 100 years. There is a beautiful Kriegsmarine signal during WWII and a couple of intriguing minor spikes.

    This is as expected — clean data from the oceans will be slightly buffered by the medium and will be immune from heat island effects. The oceans, barring something like current changes, should give much more reliable answers.

    I think. Please don’t tell me that this data has been fiddled with as well.

    JF

  57. MarkW
    Posted Apr 27, 2007 at 1:43 PM | Permalink

    56,

    “approx 1.4 deg/decade for 100 years”

    The oceans have heated 14 degrees (F or C?) over the last 100 years?

  58. Posted Apr 27, 2007 at 2:41 PM | Permalink

    Re 57: Oh, bum! Divide by ten, of course. Does anyone use F any more?

    I think the graph came from the thread on bucket corrections here. Well worth looking at, it cuts out lots of noise and looks really simple.

    JF

  59. MrPete
    Posted Apr 27, 2007 at 7:45 PM | Permalink

    #54,55

    I do not exclude the possibility that there is some valid element of truth underneath the hype and over-promotion.

    Exactly. Back to Y2K again: there was a grain of truth hiding behind the hype. But because few if any took the time to convert unknowns to knowns… people were left twisting in the wind.

    The reality was: Y2K required some significant work. But it was not at ALL the risk the alarmists claimed. And once we got to work on investigating the reality, we were able to prove it.

  60. esceptico
    Posted Apr 28, 2007 at 12:57 AM | Permalink

    The new “No hair” theorem in climate science:

    Most climate reconstructions can be completely characterized by only three externally observable parameters: cherry picking, fudge factor, and “consensus”. All other information about the discovery context or analysis process falls into the black-hole event horizon; it “disappears” and is therefore permanently inaccessible to external observers.

  61. Posted Nov 15, 2008 at 12:03 AM | Permalink

    Interesting to note that this panel has still not reported. And yet they don’t seem to have had a meeting for nearly a year.

    Do they think we’re all going to get bored and go away?

One Trackback

  1. […] “Assuring the Integrity of Research Data“, Climate Audit, 23 April 2007 […]