Schmidt’s recent post on Yamal attributed the following “conspiracy theory” to me:
McIntyre got the erroneous idea that studies were being done, but were being suppressed if they showed something ‘inconvenient’. This is of course a classic conspiracy theory and one that can’t be easily disproved. Accusation: you did something and then hid it. Response: No I didn’t, take a look. Accusation: You just hid it somewhere else.
One aspect of Schmidt’s response is beyond laughable. I agree that the best way of disarming suspicion is to show data: “take a look”, as Schmidt says. However, if Schmidt thinks that the conduct of the scientists involved in the various data refusals, obstructions and FOI refusals constitutes “take a look”, then he’s seriously in tin foil country. Comical Gav indeed.
Although I find it hard to believe that Schmidt is unfamiliar with the past incidents that gave rise to suspicion that adverse results and data have been withheld or not reported, I’ll review a couple of important ones. These do not, in any sense, constitute an inventory of such incidents; they are ones that are either partly familiar to CA readers or that illustrate an important aspect of the problem.
“Dirty Laundry” and Verification r2
In December 2003, despite a number of prior data refusals, I asked Mann for the residual series from the individual steps (termed “experiments”) in MBH98. Unknown to me at the time, Briffa and Osborn had made an almost identical request three months earlier (with which Mann had complied). Residual series permit a reader to carry out standard statistical tests (verification r2, RE, etc.) without having to re-do the entire calculation from scratch. I copied David Verardo of NSF on the request. Without even waiting for Mann to refuse (which surprised me), Verardo stated that Mann was not required to provide this data to me. Verardo’s letter was later cited in Mann’s evidence to the House Energy and Commerce Committee and in Stephen Schneider’s book:
His research is published in the peer-reviewed literature which has passed muster with the editors of those journals and other scientists who have reviewed his manuscripts. You are free to your analysis of climate data and he is free to his. The passing of time and evolving new knowledge about Earth’s climate will eventually tell the full story of changing climate. I would expect that you would respect the views of the US NSF on the issue of data access and intellectual property for US investigators as articulated by me to you in my last message under the advisement of the US NSF’s Office of General Counsel.
In response to the identical inquiry from CRU, Mann immediately sent the residual series to Osborn, warning him that the residual series were his “dirty laundry”, provided to Osborn only because he was a “trusted colleague”. Mann asked Osborn to ensure that the “dirty laundry” didn’t fall into the wrong hands, an assurance that Osborn readily gave.
None of the so-called “inquiries” delved into why Mann regarded the residual series as his “dirty laundry” and why he was so anxious to prevent this (apparently “inconvenient”) information from falling into the wrong hands.
One reason might, of course, have been that the residual series would immediately permit the calculation of the verification r2 for each step. (Favorable) verification r2 results for the AD1820 step were illustrated in MBH98 Figure 3; elsewhere MBH98 said that the verification r2 statistic had been considered. But in the SI to MBH98, Mann had archived the RE for each step but not the verification r2.
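To see how little computation is involved once the residuals are in hand, here is a minimal sketch of the two statistics. The data are synthetic stand-ins, not MBH98 output; the RE reference mean is assumed to be zero (as for anomaly series), and all variable names are mine.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-ins for a verification-period observed series and a
# reconstruction of it; the residual series is simply obs - recon
obs = np.cumsum(rng.standard_normal(50)) * 0.1
recon = 0.6 * obs + 0.2 * rng.standard_normal(50)  # illustrative "skill"
residuals = obs - recon

# Verification r2: squared Pearson correlation between observed and
# reconstructed values over the verification period
r2 = np.corrcoef(obs, recon)[0, 1] ** 2

# RE (reduction of error): 1 - SSE / sum of squared departures of the
# observations from the calibration-period mean (assumed 0 for anomalies)
re = 1 - np.sum(residuals**2) / np.sum(obs**2)

print(f"verification r2 = {r2:.3f}, RE = {re:.3f}")
```

Given the residuals together with the archived observed series, both statistics are a few lines of arithmetic, which is why supplying the residuals would have let any reader check the claimed skill for each step.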
In MM2005, we reported that the verification r2 for the AD1400 step was approximately zero – a very surprising result given the “skill” claimed for MBH98. (Zorita, for one, was surprised by this result and thought less of MBH98 accordingly.) In MM2005 (EE), we expressed our surprise that the results for such a central verification statistic had either not been calculated or not been reported.
In his testimony at the NAS panel in March 2006, Mann was directly asked whether he had calculated the verification r2 for the AD1400 step; he flatly denied doing the calculation, saying that such a calculation would have been a “foolish and incorrect thing to do”. However, by that time, Mann had archived part of his source code in response to the House Committee, and that code showed conclusively that the verification r2 values had been calculated for all steps.
That Mann had calculated verification r2 results is beyond dispute. That they were “inconvenient” is beyond dispute. That they were not reported is beyond dispute.
Wahl and Ammann announced in May 2005 that all our claims were “unfounded”. Since our codes were very close and I reconciled them almost immediately, I knew that their verification r2 results would be identical to ours. Again, I was asked to review the paper (though my review was disregarded). As a reviewer, I asked for the verification r2 results. Wahl and Ammann refused. Rather than rejecting the paper, Schneider terminated me as a reviewer. At AGU in December 2005, I asked Ammann what the verification r2 for their AD1400 step was. He refused to answer – a refusal noted by Eduardo Zorita and others.
I asked Ammann out to lunch after the paleo session (I bought). Since our codes reconciled, it should have been possible to clarify the dispute. I offered to jointly (with our coauthors) write a paper stating what we agreed on and what we disagreed on. He refused, saying that this would be “bad for his career”. To this day, I remain dismayed at this answer. I urged him to report the verification r2 results; he refused. I told him that I would not simply stand by while he refused to report the adverse verification r2 results that confirmed ours; he shrugged. I therefore filed an academic misconduct complaint at UCAR; while the complaint was shrugged off without investigation, the verification r2 results appeared in the final article, confirming our point.
The Climategate emails show that Phil Jones, also a reviewer of the paper, was outraged by our complaint about Wahl and Ammann suppressing the inconvenient data, not by their attempt to suppress it.
Jacoby and D’Arrigo
Another equally disquieting incident occurred before Climate Audit and may not be familiar to all readers. This incident also illuminates issues about when data is “used” – an issue that I believe to be relevant to the Yamal incident.
Jacoby and D’Arrigo (Clim Chg 1989), a study of northern North American tree rings, was extremely influential in expanding the application of tree rings to temperature reconstructions (as opposed to precipitation). (See the CA tag Jacoby for prior posts.) The Jacoby-D’Arrigo reconstruction was used in Jones et al 1998 and its components (especially Gaspe) were used in MBH98. It was used in the “bodge” of the Mann PC1 in MBH99; Mann’s “Milankovitch” argument rests almost entirely on this bodge – ably deconstructed by Jean S here.
Jacoby and D’Arrigo stated that they had selected the 10 most “temperature-influenced” sites from the 36 northern North American (boreal) sites that they had sampled in the previous decade, to which they added Cook’s (very HS) cedar series from Gaspe “because of the scarcity of data in the eastern region”. However, if you pick the 10 most “temperature-sensitive” series from a network of 36 autocorrelated red noise series, you will get a HS. This phenomenon has been more or less independently reported by me, Jeff Id, Lucia, Lubos Motl and David Stockwell. We noted it in our PNAS Comment on Mann et al 2008, taking some amusement in citing AIG News (Stockwell), since the phenomenon was unreported in the Peer Reviewed Litchurchur. It seems to baffle climate scientists.
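The effect is easy to demonstrate. What follows is a minimal sketch, not a reproduction of anyone’s actual procedure: the network size, AR(1) coefficient, series length and stand-in “temperature” target are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_series, n_years, phi = 36, 400, 0.7  # illustrative network and AR(1) "redness"

# 36 red-noise pseudo-chronologies containing no climate signal at all
shocks = rng.standard_normal((n_series, n_years))
series = np.zeros((n_series, n_years))
for t in range(1, n_years):
    series[:, t] = phi * series[:, t - 1] + shocks[:, t]

# Stand-in instrumental "temperature": flat, then rising over the last century
temp = np.concatenate([np.zeros(n_years - 100), np.linspace(0.0, 1.0, 100)])
calib = slice(n_years - 100, None)

# "Screen" for the 10 series most correlated with temperature in the
# calibration period, then average them into a composite
corrs = np.array([np.corrcoef(s[calib], temp[calib])[0, 1] for s in series])
composite = series[np.argsort(corrs)[-10:]].mean(axis=0)

# Selection alone manufactures a 20th-century upswing: a hockey stick
print("pre-calibration mean:", round(float(composite[: n_years - 100].mean()), 3))
print("calibration mean:    ", round(float(composite[calib].mean()), 3))
```

With autocorrelated noise, some series will drift upward in the calibration window by chance; picking the top correlates and averaging them retains that chance drift while cancelling the noise everywhere else, producing the familiar flat shaft and rising blade.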
Not only did Jacoby and D’Arrigo pick only 10 of the 36 sites, they archived only those 10. I asked D’Arrigo for the data but got nowhere. At the same time, I had also learned that a new Gaspe version had been calculated – one which did not have a HS (see CA here). I asked D’Arrigo to archive the data or provide it to me. She refused, saying that the version on file (cana036), which had a huge HS, was a better guide to NH temperature.
In early 2004, Climatic Change did not have a data policy when I was asked to review a submission by Mann et al savaging us. In my capacity as “reviewer”, I asked for the supporting data and code that Mann had previously refused. The late Stephen Schneider, then editor of Climatic Change, said that no one had ever previously asked for supporting data or code in the 28 years that he had edited the journal, and that such a request would require a change in editorial policy. The progress of my request is documented in the Climategate emails, since Phil Jones and Ben Santer were on the Climatic Change editorial board and both opposed the proposal. (Peter Gleick made his first cameo appearance at this time, also supporting obstruction.) Eventually Schneider adopted a policy requiring supporting data, but not code. Under the new policy, I asked for supporting data for the new submission, which Mann then withdrew. (Osborn expresses some annoyance at this in a CG2 email.) This was the first academic paper that I had been asked to review. While I think that Mann knew that I was the reviewer, if I were doing it again, I would act only as an identified reviewer, so that any adverse interest was clearly disclosed to the author.
With the benefit of Climatic Change’s newly minted data policy, I asked them to request the missing measurement data for the “other” 26 Jacoby sites. Jacoby refused in a truly remarkable letter, reported on in one of the very first CA posts here. The following is a lengthy excerpt (see the link for the full letter):
The inquiry is not asking for the data used in the paper (which is available), they are asking for the data that we did not use. We have received several requests of this sort and I guess it is time to provide a full explanation of our operating system to try to bring the question to closure…
We strive to develop and use the best data possible. The criteria are good common low and high-frequency variation, absence of evidence of disturbance (either observed at the site or in the data), and correspondence or correlation with local or regional temperature. If a chronology does not satisfy these criteria, we do not use it. The quality can be evaluated at various steps in the development process. As we are mission oriented, we do not waste time on further analyses if it is apparent that the resulting chronology would be of inferior quality.
If we get a good climatic story from a chronology, we write a paper using it. That is our funded mission. It does not make sense to expend efforts on marginal or poor data and it is a waste of funding agency and taxpayer dollars. The rejected data are set aside and not archived.
As we progress through the years from one computer medium to another, the unused data may be neglected. Some [researchers] feel that if you gather enough data and n approaches infinity, all noise will cancel out and a true signal will come through. That is not true. I maintain that one should not add data without signal. It only increases error bars and obscures signal.
As an ex-marine I refer to the concept of a few good men.
I was dumbfounded by Jacoby’s response. At the time, I expressed my disbelief as follows:
Imagine this argument in the context of a drug trial. Suppose that the researchers studied 36 patients, picked the 10 patients with the “best” responses, and then refused to produce data on the other 26 patients on the grounds that they didn’t discuss those patients in their study. It’s too ridiculous for words.
The incident also sheds light on the question of when data is “used”. I plan to cite this incident in another forthcoming post. No statistician would accept Jacoby’s argument for a minute. By examining 36 series and picking 10, all 36 series were “used”. I find it hard to believe that Jacoby’s position has any traction whatever, but I was unsuccessful in persuading Schneider.
Jacoby’s practices came up unexpectedly at the NAS panel workshop, when Rosanne D’Arrigo told an astonished panel that “you had to pick cherries if you want to make cherry pie” (see here). Although this evidence was highly relevant to the subject of the NAS inquiry and occasioned a flurry of Climategate emails, the NAS panel report avoided the issue entirely.
A related issue arises in respect to the Yamal-Urals regional chronology, where CRU examined several versions before reverting to the very HS-shaped chronology arising from the very small dataset used in Briffa 2000. I’ll discuss this in another post.
In the mining exploration business, investors who trade mining stocks know that “late” results are almost never “good” results. The reason is human nature. With public companies, you’re legally obligated to report results promptly, but there is some play in the timing. If promoters have “bad” results in the first part of a program, there is a great temptation to delay the bad news in the hope that later results will bail the program out. The best and only way to deal with the temptation to delay bad results is to establish an announcement schedule ahead of time and stick to it.
In 2006, I noticed that Thompson had swiftly reported Kilimanjaro results, but that results from Bona Churchill had not been reported with the same alacrity. My surmise at the time was that the results were “bad” (i.e., that they did not show elevated O18 values in the 20th century).
In saying this, I am not saying that climate scientists are less honorable than mining promoters, only that there are great human temptations to delay reporting “bad” results. And, after a while, delay can turn into neglect, without any explicit decision ever having been made not to report the “bad” results.
In the mining business, promoters are bursting to report good results. I presume that this temptation also affects climate scientists, who, as Schmidt tells us from time to time, are human and subject to human frailties. I remain convinced that climate scientists are more eager to publish “good” results than “uninteresting” ones, especially when Nature and Science have such an appetite for “worse than we thought” articles that it has become a mild object of satire even within the “community”.
Six years later, the Bona Churchill results remain not only unarchived but unpublished. At this point, one cannot say that they have been “suppressed” for good, but they have clearly been delayed. A graphic in a workshop presentation shows that my surmise was correct: contrary to Thompson’s expectation, 20th century O18 values were not elevated. They are “inconvenient”.
Back to Gavin’s point.
It’s not tin foil country to say that, in respect to Jacoby and D’Arrigo 1989, Jacoby failed to report or archive data that didn’t show what he was expecting. I typically describe this sort of phenomenon in less charged terms than Schmidt: “failed to report” as opposed to “suppressed”. But there’s nothing tin foil about saying that Jacoby was selective in what data he archived – Jacoby said so. One of the many problems in this field is that so many “real” scientists see nothing wrong with this.
Similarly with Mann’s residuals (“dirty laundry”) and verification r2. And Thompson’s Bona Churchill.
I find it hard to believe that Schmidt seriously believes that climate scientists have reacted to suspicion by saying “take a look”. The pages of Climate Audit are replete with one incident after another, where climate scientists have taken precisely the opposite attitude.
That includes the present case. The sane resolution of the present FOI request would have been for East Anglia to just send me the data, even if they felt that they didn’t “have” to. The best way to remove suspicion would have been to say: “Here, Steve, take a look”. If Schmidt thinks that that’s what CRU has done in this case, then he’s the one in tin foil country.