Jeff Id on the Air Vent has written a post pointing out the recent online publication of a report by the Committee on Ensuring the Utility and Integrity of Research Data in a Digital Age from the National Academy of Sciences: Ensuring the Integrity, Accessibility, and Stewardship of Research Data in the Digital Age. I am starting this thread so that it can also be discussed here.
The committee and its objectives were discussed on CA, for example, here and in a number of other places. You can find the other threads easily by using the search CA feature at the top right of the page for the phrase “NAS Committee”.
It makes for interesting light reading. From the summary:
Legitimate reasons may exist for keeping some data private or delaying their release, but the default assumption should be that research data, methods (including the techniques, procedures, and tools that have been used to collect, generate, or analyze data, such as models, computer code, and input data), and other information integral to a publicly reported result will be publicly accessible when results are reported, at no more than the cost of fulfilling a user request. This assumption underlies the following principle of accessibility:
Data Access and Sharing Principle: Research data, methods, and other information integral to publicly reported results should be publicly accessible.
(bold in report)
Maybe the folks at HadCru should pay attention…
Update (from my Comment 1):
It appears that there may be some caveats about who should have access to the data. From page 2 of the summary chapter (page 3 of the pdf) (all bold mine):
Documenting work flows, instruments, procedures, and measurements so that others can fully understand the context of data is a vital task, but this can be difficult and time-consuming. Furthermore, digital technologies can tempt those who are unaware of or dismissive of accepted practices in a particular research field to manipulate data inappropriately.
On the next page, this seems to be clarified somewhat:
The most effective method for ensuring the integrity of research data is to ensure high standards for openness and transparency. To the extent that data and other information integral to research results are provided to other experts, errors in data collection, analysis, and interpretation (intentional or unintentional) can be discovered and corrected. This requires that the methods and tools used to generate and manipulate the data be available to peers who have the background to understand that information.
The “public” appears to be those whom the owners of the data and methods deem deserving of access. After all, who knows what damage can be done when an examination of the data and methods is carried out by someone who doesn’t “understand the information” or associated “accepted practices”? ;)