GAO Report on Data Sharing in Climate Science

The U.S. Government Accountability Office (GAO) has issued a report, requested last year by the House Energy and Commerce Committee, on data sharing in climate science. The full report is here. The Republican press release is here.

There are no signs that Lonnie Thompson will be required to archive his Dunde ice core sample data, obtained in 1987 and still unarchived at the World Data Center for Paleoclimatology (which is perfectly well equipped to host this information without any new funding whatever). Nor is there any hint in this report that Ralph Cicerone of the National Academy of Sciences has acquiesced in Thompson withholding his data. On a quick read, the answers of the program managers are infuriating, and the report is far too prepared to accept pieties from them. For example, the authors do not appear to have made any attempt to follow up any of the cases that prompted the inquiry: Thompson, Jones, Mann, Briffa, etc.

Readers of this site are aware of my attempts to get paleoclimate data properly archived. Many climate researchers are pretty good about archiving their data; the problem is that some are not, and all too often it is their studies that are relied on. For example, Al Gore’s An Inconvenient Truth shows a Hockey Stick made from Lonnie Thompson’s ice core data. So let’s consider that as a type case. Readers of this site are aware that “grey” versions of Thompson’s data are inconsistent, that Thompson has grudgingly archived only a few cursory summaries (which are themselves often inconsistent) and that Thompson has refused to archive original sample data, a refusal that has been acquiesced in by the NSF, the National Academy of Sciences and Sciencemag. This would be an easy case in which to investigate how NSF policies worked when the rubber hit the road. Here’s what the GAO said:

NASA and NSF have data-sharing policies documented at the agency level that address openness and timing and apply to all topics of research; …

For example, the overarching data-sharing policy for NSF requires researchers to make data available to others but does not specify how, …

The Global Change Research Program observed that “proper preparation, validation, description, and care of data sets is critical to their use by the widest possible scientific community.” The CCSP has encouraged those agencies funding climate change research to incorporate the guidelines listed in this voluntary policy into their data-sharing policies and practices. Senior officials at DOE, NASA, NOAA, and NSF told us that their data-sharing policies and practices adhere to the principles of the guidelines. …

The NSF agencywide policy states that researchers are “expected to share with other researchers, at no more than incremental cost and within a reasonable time, the primary data, samples, physical collections and other supporting materials created or gathered.”13 (National Science Foundation, NSF Grant Policy Manual, Arlington, VA, 2005.) This agencywide policy establishes a general expectation that data are to be shared with other researchers. In order to address the needs of specific research programs, program-level policies often provide researchers more detailed guidance about how to carry out the agencywide data-sharing policy. The data-sharing policy for the oceans program—one of NSF’s programs funding particular climate change research—identifies particular archives for researcher use, such as one that preserves sediment samples from the ocean floor. …

Further, the agencywide policy states that data are to be shared “within a reasonable time” and the oceans program policy states that data should be shared as soon as possible but no later than 2 years after collection. …

We found that NSF expects researchers applying for grants to present, as appropriate, a clear description of “plans for preservation, documentation, and sharing of data, samples, physical collections, curriculum materials, and other related research and education products.”16 However, the general grant guidance materials for researchers applying for DOE, NASA, and NOAA climate change grants do not explicitly instruct them to include data-sharing plans in their proposals. Nevertheless, some program managers encourage researchers to do so in practice.

The extent to which federal climate change research agencies use various aspects of the grant review process to encourage data sharing varies, depending on the initiative of the program manager, in part because there are no requirements for them to do so. For example, an NSF official stated that the consideration of past data-sharing activities is not a discrete factor that the agencies require program managers to use in making award decisions.

The pieties are recapitulated, but there is no documentation of Thompson’s obdurate refusal to provide sample data, or of the failure of any program manager to even say boo to him.

I sent the following letter to the NSF Director, Arden Bement:

Dear Dr Bement,

I have read the recent GAO Report, which states:
The NSF agencywide policy states that researchers are “expected to share with other researchers, at no more than incremental cost and within a reasonable time, the primary data, samples, physical collections and other supporting materials created or gathered.”

For several years, I have been attempting to obtain the “primary data” pertaining to Thompson’s ice cores from Dunde, Guliya and elsewhere. This data was used recently in An Inconvenient Truth. For each core, there are typically over 3000 samples, and each sample has a suite of measurements including isotopes and chemistry. Thompson has failed to archive this “primary data” and has failed to share it with other researchers. Instead of archiving this important data, Thompson has (and this only after complaint) archived only gross summaries of the oxygen isotope information and not always for the complete core. Grey versions of the data are often inconsistent.

If existing NSF policies are sufficient to require Thompson to archive or share this data, could you please take immediate steps to require him to do so? If NSF policies are inadequate, could you please immediately advise the GAO that your policies do not require Thompson to archive or share his data, so that the GAO does not mislead readers who might interpret the language in their report as implying that NSF policies are binding on researchers?

Regards, Stephen McIntyre


31 Comments

  1. Gary
    Posted Oct 22, 2007 at 1:18 PM | Permalink

    Not even a slap on the wrist. Write up a few guidelines, suggest to grant writers they slap in some boilerplate about archiving data, and now get back to standard operating procedures because we’ve brushed off the congressman’s request. Well, at least you have Official Recommendations to wave around next time you ask for data.

  2. Henry
    Posted Oct 22, 2007 at 1:59 PM | Permalink

    Why not write to the GAO asking if they checked whether the policies were being followed (as would be normal in a policy audit), and in particular if they checked your problematic cases?

  3. Follow the Money
    Posted Oct 22, 2007 at 2:03 PM | Permalink

    “There are no signs”

    Is it incumbent on the GAO to render an opinion about change, or only to lay out current practices for the committee?

  4. James Erlandson
    Posted Oct 22, 2007 at 2:06 PM | Permalink

    And today this piece called Information R/evolution is making the rounds. Everywhere but the NSF, National Academy of Sciences and Sciencemag.
    (Five minutes)

  5. Larry
    Posted Oct 22, 2007 at 2:07 PM | Permalink

    I hope Sen. Inhofe is reading this entry. In my experience, the only thing that gets the attention of these agencies is someone from congress breathing down their necks. And even then, expect to have a more powerful delegation from the other side insisting that they do nothing to enforce this.

  6. SteveSadlov
    Posted Oct 22, 2007 at 2:25 PM | Permalink

    I hate to write it, but this is pushing a rope. The President of the US and most of his staff have bought into a certain AGW scenario. Many corporate leaders have as well. Needless to say, in the academic realm, the majority is massive. Some may be cynical about it – carbon trading, etc. Some may be more doctrinaire. Bottom line – the masses are bought in; changing anything will be, at best, like turning a super container carrier.

  7. MarkW
    Posted Oct 22, 2007 at 2:29 PM | Permalink

    SteveS,

    The masses have bought in, but not to the point of wanting to do anything that will cost them something.
    So there’s still a little time.

  8. Sam Urbinto
    Posted Oct 22, 2007 at 2:41 PM | Permalink

    And are you of the opinion we’ll ever be at the point of wanting to do anything that will cost something? I don’t think so, which is why I don’t let this stuff upset me too much. :)

    From what Steve posted, it looks like the GAO was only explaining the professed procedures and not checking anything. What’s the point of that? It seems rather lame.

  9. per
    Posted Oct 22, 2007 at 2:52 PM | Permalink

    I am not quite so downbeat. The GAO does admit that there are problems. The rubber will hit the road when someone decides what to do about this report, and that will be a rather big and messy affair. Amongst other things, it will cost universities and governments a lot of money.

    per

  10. Buddenbrook
    Posted Oct 22, 2007 at 3:37 PM | Permalink

    Is there data available on

    a) the source code of the GCMs they use,

    b) the parameters they have used to get varying projections, and

    c) the scientific process that has led them to conclude that projections of around +2.5C for a doubling of CO2 are the most accurate ones?

    These should be fundamental and central to the AGW debate, but that debate is nowhere to be seen. The IPCC, for example, seems to always dodge it.

    And where did Hansen’s long-term climate sensitivity of 6C suddenly come from? The whole concept of it, and that numeric value? Where is the study to substantiate this claim? The data and explanations on projections, AOGCM parameters and so forth?

    Any of this openly available?

  11. Marian
    Posted Oct 22, 2007 at 3:40 PM | Permalink

    I’ve always found archiving data I’ve collected to be inconvenient. And that’s the truth!

  12. Kenneth Fritsch
    Posted Oct 22, 2007 at 3:58 PM | Permalink

    I read through this most typical of bureaucratic government regulation reports, which says absolutely nothing about problem solving but does, perhaps unintentionally, point to what the basic problems are.

    Although researchers can contact the agency if other researchers withhold data, this is not an effective way to resolve situations involving incomplete or missing data. Several data managers told us that documentation about the data—such as conditions under which it was gathered—is crucial because important details about the data are likely to be forgotten as the researcher moves on to new projects. Furthermore, at some point, it may become too late for federal agencies to encourage data sharing because by the time one requests access to certain data—possibly years after the initial data collection—the original researcher may have lost the data or failed to record important metadata. Therefore, we believe that agencies’ reliance on self-policing by the research community does not provide adequate assurances that researchers will fulfill the data-sharing expectations set forth in the agencies’ policies.

    From this I can only conclude that original research is poorly documented and that the scientific promise of replicating the research gets lost in “moving on” to new projects and publications. In other words, old theories are never refuted by analysis of previous research but only questioned by doing and reporting new research. Let’s call that the gentlemanly and ladylike way of doing business.

    Senior agency officials at all four agencies told us that it is impractical for program managers to verify data sharing because they oversee many researchers and must focus on higher priority tasks. Moreover, several of these officials believe that current self-policing is effective because of the collaborative nature of climate change research.

    In other words, verifying and replicating data have a relatively low priority. The “collaborative nature” means that the experts “in the field” should be trusted, no matter how difficult this makes an evaluation from outside that particular scientific community.

    Researchers seeking data that have not been made widely available, such as through an archive, generally need to contact the original researcher(s) to request data. While most of the program managers we surveyed indicated that there are several incentives for researchers to make data available—such as maintaining informal relationships with other researchers, obtaining recognition in the scientific community for the work, or the potential for future collaboration—there is no guarantee that the original researcher will have the complete data readily available to comply with another researcher’s request for data.

    And from this one can readily conclude that if an outsider to that particular scientific community makes the request, there is no incentive for the researcher to deliver.

    Making data available often involves laborious and time-intensive tasks to adequately document the data and to perform quality assurance checks, such as correcting errors, to make them usable for other researchers.

    Oh my, God and Heaven forbid that a researcher must take another look at his/her research to double-check for errors. This could significantly slow their publishing rates.

    In case you do not want to believe my cynical eyes, try this final excerpt on for size.

    These officials stated that funding agencies and the scientific community expect researchers to both publish their results and make underlying data available, but researchers have traditionally been rewarded mainly for publication. According to a National Academies report on data access, “society fellowship and award committees generally do not place much value on the contributions their applicants may make to the infrastructure of science in the form of data compilation, organization, and evaluation work.”21 As a result, researchers who have to compete for funding are more likely to focus on publishing research results than preserving underlying data for future use, thereby putting the data at risk of being lost or inaccessible to other researchers.

  13. Larry
    Posted Oct 22, 2007 at 4:12 PM | Permalink

    #12, as opposed to an engineering report, where getting it right is considered more important than getting it first. They’ve said, in so many words, that publishing something novel is the only thing that matters, and that getting it right is too mundane to interest anyone. Kinda like the media, wanting news that’s splashy and flashy and not caring about whether any of it’s correct.

  14. MrPete
    Posted Oct 22, 2007 at 4:34 PM | Permalink

    Making data available often involves laborious and time-intensive tasks to adequately document the data and to perform quality assurance checks, such as correcting errors, to make them usable for other researchers.

    My turn to be “gobsmacked.”

    Are they admitting, in print, that scientific publication does NOT require…

    * Documented data collection

    * Quality-checked data

    * Error-corrected data

    ??!!

    AND they are stating, in writing, that data that’s UNusable by other researchers is usable for my own purposes and for publication?

    Wow.

  15. Posted Oct 22, 2007 at 6:39 PM | Permalink

    Bureaucratic business as usual…

  16. Larry
    Posted Oct 22, 2007 at 6:45 PM | Permalink

    Stupid question, but can/should there be standard language in any research contract that goes through the GSA that contractually obliges them to do this? If they’re contractually obliged, third parties can deal with enforcement (and treble damages for non-compliance would make that kinda fun to go after).

  17. Posted Oct 22, 2007 at 6:55 PM | Permalink

    I think it’s reprehensible that scientists do not share their data.
    To set a good example, I’d like to request that Steve McIntyre start archiving the data used in his articles. It only takes a minute to upload via FTP. (Gridded CRN12, hint, hint).
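
    For what it’s worth, an upload along those lines really is only a few lines of script. A minimal sketch in Python, using the standard ftplib module – the host, credentials, directory and file name below are hypothetical, purely for illustration:

        from ftplib import FTP

        # Hypothetical archive host, login and file name -- substitute the real details.
        HOST = "ftp.example.org"

        with open("dunde_samples.csv", "rb") as f:       # sample-level data as a flat CSV
            ftp = FTP(HOST)
            ftp.login(user="anonymous", passwd="guest@example.org")
            ftp.cwd("/incoming")                         # directory designated by the archive
            ftp.storbinary("STOR dunde_samples.csv", f)  # one call does the upload
            ftp.quit()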

  18. MarkR
    Posted Oct 22, 2007 at 7:12 PM | Permalink

    There is no compulsion for archiving data, but US Government sites can’t publish anything based on data that isn’t archived/reproducible. That is the lever. Just write to a few leading ones and tell them they have to take down any graph based on or containing non-archived data. If they don’t, they are probably suable.

  19. Geoff Sherrington
    Posted Oct 22, 2007 at 7:51 PM | Permalink

    I wrote to the NAS some weeks ago reminding them that some data are shared from other countries and that it is only courteous to make the contributions from such other countries open access. That is, the game is global, and the local rules of the USA have no intrinsic dominance. Result – no response. Action – continue to make representations. We in the SH collect a lot of data for the USA that is vital for military and other purposes. The tap can be turned off here too.

    #14 MrPete encapsulates some major conceptual problems. Please keep highlighting them. The foundations of scientific integrity are being weakened.

  20. James Bailey
    Posted Oct 22, 2007 at 8:54 PM | Permalink

    Amid the outrage over the failure to enforce data sharing, has anyone noticed that if they did follow their policies, all they would have to do to deny you access to their data is declare that you don’t qualify as a fellow researcher?
    Science research funded with public money should be made available to the public. Failure to do so should disqualify a researcher or research institution from receiving any more public funds. It should not be left to the science agencies; it should be a matter of law. Agencies should be forced to report back to Congress on their enforcement efforts, just like every other government agency.
    Proper exclusions can be made for research that is already classified, but that would have little to no effect on NSF- and NIH-funded research.
    Scientists, and science journals, should be laughed at and discredited unless they live up to their vaunted self-descriptions of openness, honesty, painstaking carefulness to avoid errors, and insistence that it is not science if it is not reproducible. But they live off of government largess. Government gets to set the rules. These are our governments and our money, and we should insist.

  21. Roger Dueck
    Posted Oct 22, 2007 at 9:50 PM | Permalink

    #13 Larry, you hit it on the head! In academia, being first is important, even if it is wrong!

    #20 James, AGREED – what’s with this “qualified scientist” BS?! So Joe from Hoboken gets ahold of the data. Is he going to build an A-bomb?! Remember the “peer reviewed” mantra. What could possibly go wrong with more brains looking at it? I have published, and I quite frankly don’t give a POOH who has access to the data, as it means little to anyone but the “qualified” scientist. Restriction of data access on the basis of qualification is egocentric and self-serving.

  22. Geoff Sherrington
    Posted Oct 23, 2007 at 3:25 AM | Permalink

    Re #20 James Bailey

    Yes, and the research is funded in part with my Australian taxes. Why does some American bureaucrat assume a position of refusing to give it back to me? I might not be a “fellow researcher” in terms of being a USA buddy, but I think you’ll see my position. Quite frankly, many USA space programs, including the Armstrong one, would not have been possible without Australian data sharing – from Australia to the USA. Reciprocal arrangements seem fair. By the way, in which country did Lonnie Thompson take his cores – the USA?

  23. Hasse@Norway
    Posted Oct 23, 2007 at 4:04 AM | Permalink

    Why doesn’t someone write a book about the lack of data access among climate scientists? The attacks from the AGW crowd would give it plenty of publicity. It could easily make the top 100 bestsellers in the US.

    My guess is that if it is well written and exact, it would be a bombshell. I think the reason they don’t want to show their data, apart from the data not being able to pass scrutiny, is that most people don’t care to look into the problem closely enough.

    If they did, people wouldn’t like decisions being made without proper research.

  24. JC
    Posted Oct 23, 2007 at 5:19 AM | Permalink

    If I were the GAO, I’d be worried about being seen as targeting climate science unless I took action on bad data sharing in all sciences, which is a very tall order. In Computer Science, funded by NSF CISE, I’m not sure anybody does data sharing.

    I’m not saying this to defend GAO or bad science across the board, and I realize that climate science has huge public policy implications, but imagine what the NYT headline would look like if the GAO (or NSF for that matter) fulfilled your request.

  25. Hans Erren
    Posted Oct 23, 2007 at 5:35 AM | Permalink

    James, AGREED – what’s with this “qualified scientist” BS?! So Joe from Hoboken gets ahold of the data. Is he going to build an A-bomb?! Remember the “peer reviewed” mantra. What could possibly go wrong with more brains looking at it? I have published, and I quite frankly don’t give a POOH who has access to the data, as it means little to anyone but the “qualified” scientist. Restriction of data access on the basis of qualification is egocentric and self-serving.

    Back in the old days, scientists published tables with data in an appendix; see, e.g.:
    Svante Arrhenius, Ueber den Einfluss des Atmosphärischen Kohlensäurengehalts auf die Temperatur der Erdoberfläche, Bihang till Kongliga Svenska Vetenskaps-Akademiens Handlingar, Stockholm 1896, Band 22 Afd I N:o 1, p1-101.

  26. Richard
    Posted Oct 23, 2007 at 8:01 AM | Permalink

    Archiving data does not have to take a significant amount of time or funds. Our engineering reports have an appendix that contains the original (sometimes hand written) raw data and subcontractor reports. All that is done is to put page numbers on the copies so that they can be referenced in the main report.

  27. Larry
    Posted Oct 23, 2007 at 8:41 AM | Permalink

    I think it’s pointless to even dignify the question of how onerous archiving that data is. It’s part of the job. It has to be done. Period. If you don’t do it, you haven’t completed the job.

    I think that if the standard contract language stated that you don’t get the final payment until it’s done and verified, and the funding agencies enforced that clause, this would quickly become a non-issue.

    This would also generate a niche for auditors (ahem!) to verify that the data has been properly archived. Would this make research more expensive? Yes. Would it be worth it, to have quality externally enforced? Certainly.
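
    One hypothetical shape such an audit could take: the funding agency requires a checksum manifest to be deposited alongside the archived files, and anyone – auditor or interested reader – can re-run the check. A minimal sketch in Python; the manifest format (digest, then filename, one file per line) and script name are invented for illustration:

        import hashlib
        import os
        import sys

        def sha256_of(path, chunk=1 << 20):
            """Compute the SHA-256 digest of a file, reading it in 1 MB chunks."""
            h = hashlib.sha256()
            with open(path, "rb") as f:
                for block in iter(lambda: f.read(chunk), b""):
                    h.update(block)
            return h.hexdigest()

        def verify(manifest_path):
            """Check each 'digest  filename' line of the manifest; report per-file status."""
            ok = True
            with open(manifest_path) as m:
                for line in m:
                    expected, name = line.split(None, 1)
                    name = name.strip()
                    if not os.path.exists(name):
                        print("MISSING ", name)
                        ok = False
                    elif sha256_of(name) != expected:
                        print("MODIFIED", name)
                        ok = False
                    else:
                        print("OK      ", name)
            return ok

        if __name__ == "__main__":
            # Usage: python verify_archive.py MANIFEST.sha256
            sys.exit(0 if verify(sys.argv[1]) else 1)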

  28. EddieQ
    Posted Oct 23, 2007 at 9:29 AM | Permalink

    #18 MarkR; Aye, there must be teeth. A pro golfer signing an incorrect scorecard? Has any golfer even inadvertently done that more than once? Disqualification is immediate and final, “for the encouragement of others,” and golf’s purity and spirit … (DQ any research team not archiving original data and methods, until it is.)

  29. Posted Oct 23, 2007 at 9:51 AM | Permalink

    Is there a definitive list of all the data/code which has been requested and not delivered up? I wonder if those of us who don’t have the scientific know-how to contribute directly to the discussions here might do our bit by putting pressure on politicians to get the data released.

  30. Gerald Machnee
    Posted Oct 23, 2007 at 11:42 AM | Permalink

    Re #18 – BBBBBut you are dealing with a Hockey Team, not golfers.

  31. John F. Pittman
    Posted Oct 23, 2007 at 3:13 PM | Permalink

    Sometimes there are conceptual problems; I asked specifically for the actual data that Hansen et al. used for the GISS analysis of surface temperature change:

    Hansen, J., R. Ruedy, J. Glascoe, and Mki. Sato, 1999: GISS analysis of surface temperature change. J. Geophys. Res., 104, 30997-31022, doi:10.1029/1999JD900835.

    The response was:

    A search was conducted utilizing the information you provided. Let me know if this is responsive to your request.

    1 – NASA will provide the data that GISS has – which is the closest thing NASA has to “raw” data. Those data are available to the requester on the GISS web site.

    2 – If the requester means by “raw” the original reports from the 31 sources, GISS is not in possession of those data. If requester wants to see all original reports, including the reports from the 8000 weather stations, we would suggest he contact NOAA as NASA does not have those data.

    So, is the data that was used in 1999 actually on-site? I was not provided a link. This response was to a FOIA request. If the data has changed, it would not be the same (see Steve’s thread on the changes made after he identified the “Y2K”-type error). If the data is there, why wasn’t a link provided?
