Here are Judith Curry’s Comments on the Wegman Report. I appreciate these sorts of contributions and am obviously relying on such contributions (Willis, bender, etc.) more and more.
Here are my comments on the Wegman report. I am not going to comment on any technical aspects related to the hockey stick debate nor make comments on the behaviour of any individual scientists or “auditors”. Rather, I focus only on Wegman’s recommendations.
Recommendation 1. Especially when massive amounts of public monies and human lives are at stake, academic work should have a more intense level of scrutiny and review.
***100% agreement. Peer review is a valid procedure for weeding out papers that have obvious flaws, are topically unsuitable to the journal, or don’t provide anything new (i.e., are essentially duplicative of prior research). Unless a reviewer is very close to the topic being reviewed and is already conducting a related investigation, it is unlikely that deep flaws (as opposed to obvious ones) would be uncovered in the peer review process. Cliques do exist in any field, and continued peer review from within a clique can generate a false sense of “consensus”. However, inferring cliques from long lists of coauthors is misleading. For example, I have my name on about a half dozen papers where the number of coauthors runs into double digits. On one of these papers, I have never met half of the coauthors, and on each of the others there are some coauthors I have never met personally (including 2 of these papers on which I am first author).
Two anecdotes regarding peer review. The Hoyos et al. paper skated through Science’s review process with only minor comments. The media, however, apparently conducted an exhaustive peer review of the paper during the week prior to publication (owing to the extensive publicity surrounding the WHCC paper). I talked to reporters who mentioned the three mathematicians that they sent it to for review, I received a plethora of emails from scientists mentioning that they had received the paper to review and had a question, etc. The Hoyos paper seems to have survived pretty much unscathed, and the media did a good job in the peer review of a paper that they thought was highly relevant. The other anecdote is my BAMS article. When this was submitted (Nov 05), the hurricane media wars were especially intense. I requested that the paper not be reviewed by anyone (from either side) who was involved in the media debate, and requested that it be sent to 2 climate researchers and 2 hurricane researchers for review. The initial review process was a little bit bloody, but the review of the second version in mid Feb totally broke down owing to the infamous Feb 2 WSJ (front page) article of brain fossilization fame. Two of the reviewers were so hostile towards me as a result of the WSJ article that they could not even focus on the paper. With the review process broken down, I negotiated with the editor as to what the final manuscript should look like. The lesson from this is that good papers can get screwed in the peer review process, and highly relevant papers will receive a substantial amount of scrutiny once published.
It is especially the case that authors of policy-related documents like the IPCC report, Climate Change 2001: The Scientific Basis, should not be the same people as those that constructed the academic papers.
***Reports like the IPCC are not simply assessment reports, but rather synthesis reports. Assessment is certainly an element of synthesis, but synthesis is arguably a higher-order activity. It is hard to imagine a model for activities such as the IPCC in which the authors of the primary academic papers are not involved; not only is their expertise valuable, but it is hard to imagine people very far outside the field of expertise who would have the motivation to devote the time and energy to this endeavor, even if they were paid to do it. More of a focus on assessment by outsiders should be included in efforts like the IPCC and the CCSP, but it is the combination of assessment and synthesis that is the most powerful and of greatest value to policy makers.
Recommendation 2. We believe that federally funded research agencies should develop a more comprehensive and concise policy on disclosure. Some consideration should be granted to data collectors to have exclusive use of their data for one or two years, prior to publication. But data collected under federal support should be made publicly available.
***100% agreement. Data, plus some reasonable version of the metadata should be stored in a permanent data archive (NSF and NASA fund numerous such data archives). This should be a requirement, and scientists that do not do this should not receive further funding from that agency. Requirements for making the actual code available seem somewhat less defensible (this is a complex issue and should be considered further), although the method used should be completely transparent and reproducible. Scientists whose methods/codes/etc. are not shared should become less relevant in the scientific debate (this is of course not the case when the same scientists are in charge of assessment reports).
Recommendation 3. With clinical trials for drugs and devices to be approved for human use by the FDA, review and consultation with statisticians is expected. Indeed, it is standard practice to include statisticians in the application-for-approval process. We judge this to be a good policy when public health and also when substantial amounts of monies are involved, for example, when there are major policy decisions to be made based on statistical assessments. In such cases, evaluation by statisticians should be standard practice. This evaluation phase should be a mandatory part of all grant applications and funded accordingly.
***I agree with the first part of this statement, but not the last sentence. Statisticians should be involved in the climate assessment reports. For example, the assessment led by Jerry North under the auspices of the NAS/NRC Climate Research Committee (CRC) did include some scientists with statistical expertise, but arguably it would have been a good idea for the CRC to have contacted Wegman’s NAS Committee on Applied Statistics for suggested members of the assessment committee. NAS/NRC is very good about interacting across boards, committees, and disciplines, although in 3 years of serving on the CRC (I just rotated off), I never heard mention of Wegman’s committee. I will send a message to the NAS staffer at the CRC about this, although I suspect that they have already connected the dots.
It is not always clear in advance what research will be policy relevant. The Emanuel and Webster et al. studies (completed early summer 2005) became policy relevant as a result of Katrina. With regard to routine statistical evaluation of each paleoclimate research project, the funding for that community is minuscule. Now that the awareness of this community has been raised in terms of the importance of the statistical analysis (and they can count on being audited by climateaudit), it will be interesting to see if there is an increase in the rigor of statistical analysis by the paleoclimate community.
Recommendation 4. Emphasis should be placed on the Federal funding of research related to fundamental understanding of the mechanisms of climate change. Funding should focus on interdisciplinary teams and avoid narrowly focused discipline research.
***Climate is inherently a very multi- and interdisciplinary field. To make progress in this field requires observations, understanding of individual physical processes, and the cumulative integration of these physical processes in the context of climate variability. Climate models are the embodiment of our integral understanding of the climate system. A critical element in the establishment of any theory is whether it has predictive value. The complexity in the greenhouse warming issue is that if we wait until we are convinced as to whether the models have predictive value, then we may have missed a window of opportunity for action if the predictions of the models are actually correct. The funding of interdisciplinary projects is rather problematic in the funding agencies, particularly NSF, where the focus is disciplinary research (NSF would disagree with me, but I am very prepared to argue this point with them should they ever find they want to listen to me on this subject). The big “integration” activities related to climate, and particularly climate modelling, tend to occur at the govt labs such as NASA GISS, GFDL, and NCAR with some sort of “block funding”. The bottom line is that the greater involvement of statisticians in climate research would be a good thing, and Wegman’s NRC committee might be able to influence this in some way. I note that NCAR does have a Geophysical Statistics program (http://www.image.ucar.edu/GSP/), so our field is paying attention to such issues.
JC’s summary statement: As summarized in my BAMS article, a hypothesis must pass three tests if it is to be elevated to a theory:
1. Survive scrutiny and debate, including attacks by skeptics.
2. Be the best existing explanation (physical and statistical) for the particular phenomenon.
3. Demonstrate predictive capability.
Yes, we all know this, but I think we need periodic reminding of it so that we don’t end up getting all hung up on the minutiae, overreact to a single flaw in a paper, and infer that the larger hypothesis has been refuted. Skepticism about whether an argument has been made convincingly, and the identification of flaws, does not imply that the hypothesis has been refuted and that the converse must therefore be true. There is a difference between science and trial law. Rejection of a hypothesis requires falsification (innocent until proven guilty), while elevation of a hypothesis to a theory (complete exoneration and acceptance) must pass more stringent tests. A defense lawyer’s approach is to try to poke a single hole in the prosecution’s case, so the client can get off without conviction. The defense lawyer’s approach to “discrediting” scientific hypotheses doesn’t cut much mustard with scientists in terms of actually falsifying a hypothesis; the appropriate scientific response is to do further work to see whether the concern raised can be addressed. This is part of the “survive scrutiny and debate, including attacks by skeptics” test. Note, I am sticking to science qua science here, not addressing the fast-tracking of hypotheses to policy issues.