Federal Reserve Bank Working Paper on Replication

Richard G. Anderson, William H. Greene, Bruce D. McCullough and H. D. Vinod have some very interesting comments in a recent Federal Reserve Bank of St. Louis Working Paper about the importance of archiving data and code, in which they cite our work approvingly. Here’s a nice turn of phrase that they quote:

An applied economics article is only the advertising for the data and code that produced the published results.

Anderson et al. [2005], The Role of Data & Program Code Archives in the Future of Economic Research, Federal Reserve Bank of St. Louis Working Paper 2005-014B (revised March 2005, available here), says:

For all but the simplest applications, a published article cannot describe every step by which the data were filtered and all the implementation details of the estimation methods employed. Without knowledge of these details, results frequently cannot be replicated or, at times, even fully understood. Recognizing this fact, it is apparent that much of the discussion on replication has been misguided because it treats the article itself as if it were the sole contribution to scholarship — it is not. We assert that Jon Claerbout’s insight for computer science, slightly modified, also applies to the field of economics:

An applied economics article is only the advertising for the data and code that produced the published results.

They have the following comments about us:

The global-warming debate provides an illustration outside economics. In an important article, Mann, Bradley and Hughes (1998) presented evidence of temperature warming during the twentieth century, relative to the previous several centuries. Their article became prominent when one of its charts (a hockey-stick shaped plot, with a “shaft” consisting of historical data and a “blade” consisting of upward-sloping twentieth century data) was featured prominently in the 2001 report of the U.N. Intergovernmental Panel on Climate Change (the Kyoto treaty). As expected, high visibility invites replication and tests of robustness. In a series of papers, McIntyre and McKitrick (2003, 2005a, 2005b) have chronicled their difficulties in obtaining the data and program code; the publishing journal, Nature, did not archive the data and code. After some delay, the authors provided the data (see Mann et al., 2004) but have declined, at least as of this writing, to furnish their statistical estimation programs despite their statement that the statistical method is the principal contribution of their article, specifically, to “…take a new statistical approach to reconstructing global patterns of annual temperature back to the beginning of the fifteenth century, based on calibration of multiproxy data networks by the dominant patterns of temperature variability in the instrumental record.” (Mann et al. 1998, p. 779). McIntyre and McKitrick’s examination suggests that Mann et al.’s statistical procedure (a calibrated principal components estimator) lacks power and robustness; specifically, that the procedure induces hockey-stick shapes even when the true data generating process has none.

The quote from MBH98 selected by Anderson et al. [2005] rather neatly shows the contradiction between the claim in MBH98 that they were taking a "new statistical approach" and their failure to supply the details. Anderson et al. [2005] continues an important series of articles in applied economics, which should be required reading in the paleoclimate community (see the references in Anderson et al.).
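
To make concrete what "induces hockey-stick shapes even when the true data generating process has none" can mean, here is a minimal Monte Carlo sketch. It is not Mann et al.'s algorithm (which, as noted above, has not been released) and it is not McIntyre and McKitrick's code; it simply assumes, for illustration, that the step at issue is the centering choice they flagged: principal components computed after centering the proxy series on the twentieth-century calibration period rather than on the full record. The AR(1) persistence, the number of pseudo-proxies, and the crude "hockey-stick index" used to score the shapes are all illustrative assumptions, not anything taken from MBH98.

    import numpy as np

    # Monte Carlo sketch: apply PCA to trendless red-noise "proxies" using
    # (a) conventional full-period centering and (b) short-centering on the
    # calibration period only, then compare how hockey-stick-shaped PC1 is.

    rng = np.random.default_rng(0)

    YEARS = np.arange(1400, 1981)      # annual pseudo-proxy record, 1400-1980
    N_PROXIES = 50                     # number of simulated proxy series (illustrative)
    N_SIMS = 200                       # Monte Carlo replications
    CAL_MASK = YEARS >= 1902           # calibration period used for short-centering
    PHI = 0.9                          # illustrative AR(1) persistence of the noise


    def red_noise(n_years, n_series, phi, rng):
        """AR(1) series with no trend: the true data generating process is pure noise."""
        x = np.zeros((n_years, n_series))
        eps = rng.standard_normal((n_years, n_series))
        for t in range(1, n_years):
            x[t] = phi * x[t - 1] + eps[t]
        return x


    def first_pc(data, center_mask):
        """PC1 of `data` after centering each series on the rows selected by `center_mask`."""
        centered = data - data[center_mask].mean(axis=0)
        u, s, _ = np.linalg.svd(centered, full_matrices=False)
        return u[:, 0] * s[0]


    def hockey_stick_index(pc, blade_mask):
        """How far the late-period mean of PC1 sits from its overall mean, in std units."""
        return abs(pc[blade_mask].mean() - pc.mean()) / pc.std()


    full_idx, short_idx = [], []
    for _ in range(N_SIMS):
        proxies = red_noise(len(YEARS), N_PROXIES, PHI, rng)
        full_idx.append(hockey_stick_index(first_pc(proxies, np.ones(len(YEARS), bool)), CAL_MASK))
        short_idx.append(hockey_stick_index(first_pc(proxies, CAL_MASK), CAL_MASK))

    print(f"mean hockey-stick index, full-period centering: {np.mean(full_idx):.2f}")
    print(f"mean hockey-stick index, short centering      : {np.mean(short_idx):.2f}")

On trendless red noise, the short-centered PC1 consistently scores higher on this index than the conventionally centered PC1, which is the flavor of the lack-of-robustness claim quoted above; the sketch isolates only the centering choice and leaves out the other steps of the actual MBH98 procedure.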

2 Comments

  1. Dave Dardinger
    Posted Apr 23, 2005 at 8:13 AM | Permalink

    I like the quote. But another part of the picture is the data collection methods used. Most science papers have several statements of the form “using the method of Brown & Smith (1945a)”. Workers in the field, of course, understand what this means, but an interested observer from another field doesn’t and has to have access to the Brown & Smith paper to figure it out. This is what makes auditing climate papers so difficult. Unless someone has the time and resources to basically go about it full-time as you have, sooner or later you run into brick walls and can’t continue following up. It would be helpful to have a well-financed and well-staffed institute tasked explicitly with following up, in a skeptical manner, papers of policy importance. Of course then the question becomes how to be sure the ‘auditors’ aren’t gradually co-opted by the auditees.

  2. Peter Hartley
    Posted Apr 23, 2005 at 2:42 PM | Permalink

    Competition between researchers for priority in making genuine discoveries, uncovering problems with previous analyses, etc., is meant to be the main bulwark against the auditors being co-opted. One needs a repository of data and methods to facilitate independent checking. Economists have their own history of papers on politically sensitive issues being quickly embraced and publicized only to later be found to have serious flaws — for example, look into papers suggesting minimal, or even beneficial, effects of minimum wages on employment published in prestigious journals around the time an increase in the minimum wage was being debated in Congress. These examples are part of the reason many economics journals have become sensitive to the issue of accountability. The problem has perhaps not been as pressing in the physical sciences up to now because the studies have a more indirect connection with politics and policy. Climate science has become an exception, however, and along with the “prestige” of being “politically relevant” must come a responsibility to guard against contamination from politics (in the broad sense of that term).