Replication #3: What if a step is not replicable?

This is a short discussion of the issues arising when a calculation step in a multi-step study cannot be replicated.

The post Replication #2 pointed out a situation where a simple step in MBH98 like the selection of 1082 temperature gridcells was not replicable. This non-replicability of simple steps in MBH98 is unfortunately all too common in practice.

Obviously, one’s first inclination in the case of such non-replication is to wonder whether one has made an error in one’s own calculation. Later, one wonders whether the description of the procedure is incorrect, and there are unfortunately many such examples in MBH98. The problem is partly that making verbal descriptions of computer programs is difficult at the best of times. King [1995] and McCullough and Vinod (see citations in MM05(EE)) give a nice explanation of why the source code itself is necessary to show exactly how details are handled. (BTW we received a very complimentary email from McCullough on our efforts on replication matters.)

In the case of the temperature gridcells, obviously the 1082 gridcells were selected somehow. At this point, I’m not arguing or claiming that there is anything wrong with the selection of the 1082 gridcells – only that I’d like to know how they were actually selected, since the stated reasons do not appear to be correct. Same with the 219 gridcells.

The next problem in a replication study is: how do you proceed after a roadblock?

Do you proceed using the different gridcell set that you identified in verification?

Or do you proceed using the gridcell set that the original authors state they used, even though it appears to be incorrect?

My policy here was to proceed with the authors’ gridcell set even though I had reason to believe that it did not meet the stated criteria. This seemed to me to be the lesser of two evils. If you proceed with what seem to be the "correctly" selected gridcells, then your calculations diverge right away from MBH98. In some cases, this might be appropriate, but, generally, my interest was in assessing each individual step in MBH98 and I found it preferable to note the inconsistency and proceed. So in all our MBH98 calculations and emulations, we adopted the archived listing of 1082 gridcells even though we had reason to believe that Mann et al. had either inaccurately selected these gridcells or had used a different criterion for selection.
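The "note the inconsistency and proceed" policy can be sketched in a few lines of Python. This is only an illustration: the coverage threshold, the function names and the toy gridcells are all hypothetical and are not the MBH98 criterion or data. The point is simply that the discrepancy between the archived list and the list implied by a stated rule is recorded, while the downstream calculation still uses the archived list.

```python
def select_by_stated_criterion(coverage, min_fraction):
    """Hypothetical stand-in for a stated selection rule:
    keep cells whose fraction of non-missing values meets a threshold."""
    return {cell for cell, frac in coverage.items() if frac >= min_fraction}


def compare_gridcell_sets(archived, derived):
    """Report where the archived list and the derived list disagree."""
    archived, derived = set(archived), set(derived)
    return {
        "in_both": archived & derived,
        "archived_only": archived - derived,  # used by the authors, fails the stated rule
        "derived_only": derived - archived,   # passes the stated rule, not used by the authors
    }


# Toy data, invented for illustration: (lat, lon) -> fraction of months with data.
coverage = {
    (42.5, -72.5): 0.98,
    (47.5, 7.5): 0.95,
    (-12.5, 132.5): 0.40,
    (62.5, 27.5): 0.80,
}
archived = [(42.5, -72.5), (47.5, 7.5), (-12.5, 132.5)]

derived = select_by_stated_criterion(coverage, min_fraction=0.75)
report = compare_gridcell_sets(archived, derived)

# Policy: record the discrepancy, then proceed with the archived list.
gridcells_used = sorted(archived)
```

Keeping the report alongside the results means that a reader can later rerun the same steps with the derived set and see how much the choice matters.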

I’ll show a number of other similar situations in some forthcoming posts.

I had a very similar problem in connection with the retention of principal component series on a network/calculation step basis, which I reviewed here. In response to our Nature submissions, Mann et al. stated for the first time that Preisendorfer’s Rule N had been used for determining the number of PC series to retain in a network/calculation step combination for tree ring networks (MBH98 itself seeming to refer to a more pragmatic criterion related to evening out the spatial distribution of proxies). In August 2004, we were provided the diagram which described this procedure for the first time (later published at realclimate on Nov. 22, 2004) and have known of this supposed procedure since then. My first step was to ensure that I could emulate this diagram and then see if the retention of PCs in other networks could be replicated. I reported on these calculations here.
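For readers unfamiliar with this kind of retention rule, here is a minimal sketch of a Rule N-style test in Python with NumPy. It is not the MBH98 code and not the calculation behind the realclimate diagram. It assumes a white-noise null, standardized series, a 95th-percentile cutoff and retention of leading PCs up to the first failure; the function name and every parameter are my own choices for illustration, and a different null (red noise, for example) would give different thresholds.

```python
import numpy as np


def rule_n_retention(data, n_sims=200, percentile=95.0, seed=0):
    """Count leading PCs whose share of variance exceeds the chosen
    percentile of the same-rank share in white-noise simulations."""
    rng = np.random.default_rng(seed)
    n_obs, n_series = data.shape

    def normalized_eigenvalues(x):
        # Standardize each series, then take squared singular values as
        # eigenvalues and express them as shares of the total.
        x = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)
        s = np.linalg.svd(x, compute_uv=False)
        ev = s ** 2
        return ev / ev.sum()

    observed = normalized_eigenvalues(data)
    sims = np.array([
        normalized_eigenvalues(rng.standard_normal((n_obs, n_series)))
        for _ in range(n_sims)
    ])
    thresholds = np.percentile(sims, percentile, axis=0)

    # Retain leading PCs until the first one that fails the test.
    n_keep = 0
    for passed in observed > thresholds:
        if not passed:
            break
        n_keep += 1
    return n_keep, observed, thresholds


# Synthetic example (invented): 20 series sharing one strong common signal.
rng = np.random.default_rng(1)
n_obs, n_series = 200, 20
common = rng.standard_normal((n_obs, 1))
data = 3.0 * common + rng.standard_normal((n_obs, n_series))

n_keep, observed, thresholds = rule_n_retention(data)
```

Even in this toy form, the result depends on choices that a verbal description can easily leave out: how the series are scaled, what noise model is simulated, and whether retention stops at the first failure. That is the kind of detail that only source code settles.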

Mann et al. have not provided a comprehensive demonstration of the application of Preisendorfer’s Rule N to yield all PC retentions, but have only shown that its use in the AD1400 North American network would seemingly permit the use of the PC4 (bristlecone pines). I discuss this in Errors Matter #3, showing that there are other important considerations besides Preisendorfer’s Rule N.

Faced with the "roadblock" of the inability to replicate actual PC retentions using the stated policy, the way that we proceeded with PC retentions in our base case emulation was completely consistent with the implicit policy on "roadblocks" for gridcell calculation. We applied the actual choices of the authors – presuming that they had had some reason for their actual choices, but had just not yet articulated it accurately. In MM05(EE), we pointed out as explicitly as possible the effect of retentions, explicitly noting that MBH98-type results were obtained when the North American PC4 was present and MM-type results when the North American PC4 was not present. In MM05(EE), we tried to characterize these matters in terms of "robustness" rather than "correctness", since that is the easier point to prove and is a weighty point in its own right.

Since there was contemporary disclosure that Preisendorfer’s Rule N was applied in the form illustrated on Nov. 22, 2004 at realclimate and it is impossible to replicate actual PC retentions in other networks using the policy stated on Nov. 22, 2004, I think that the onus should be on Mann et al. to provide actual source code.

Climate Audit