Replication #8: Reconstructed PCs

This is the first replication note that gets into the meat of our emulation. Previously we’ve noted the non-replicability of certain MBH98 steps, but, since the relevant intermediate calculations were archived, we were able to proceed using them. Here we show the key replication step: calculating the “reconstructed [temperature] principal component” series. Archiving is incomplete, but we have figured out the mechanics of the 5 archived RPCs and have achieved a reasonably accurate replication. There are nonetheless puzzling breakdowns, including a substantial and unexplained difference in the controversial early 15th century, which we do not believe can be resolved without examining source code.

I’ll describe the detailed implementation of the calibration and estimation procedures on another occasion; for now, I just want to show what we’ve been able to replicate and what we haven’t. This page shows results for the PC1; follow the link to see the effect for all 5 RPCs.

MBH98 theoretically calculated 69 "reconstructed principal components" (RPCs) in the 11 steps of the stepwise calculations, going from one RPC in the AD1400 step to 11 RPCs in the AD1820 step. These original RPCs would each run from the start of the calculation step to 1980. Unfortunately, none of these RPCs are archived. Instead, what appear to be 5 RPCs, consisting of splices of the stepwise RPCs after re-scaling, are archived at NOAA and now mirrored at the Corrigendum SI. The README provides no explanation of these series, and the splicing is merely my surmise as to how to explain them. However, the accurate replication of some series strongly indicates that the surmise is correct and that the archived RPCs are actually splices. (In this context, we note that the dataset pcproxy.txt, formerly located on Mann’s FTP and originally said to be the dataset used in MBH98, also spliced principal component series, although, upon this error being pointed out, Mann et al. stated that this dataset was "corrupted".)

In the calculations below, the RPC in each calculation step had its variance re-scaled to match the variance of the corresponding temperature principal component in the 20th century. As noted in Variance Re-scaling, this variance re-scaling would appear to immunize MBH98 from the criticism made in von Storch et al. [2004] that this was not done. We then spliced the re-scaled RPC in each calculation step to make a spliced RPC. Figure 1 below shows our emulation of their RPC1 calculation. In this step, without re-scaling, there were very large differences in variances of the reconstructed PCs (>1) and of the temperature PC (<0.04). The left panel shows a scatter plot of archived versus emulated values; the right panel plots both the archived and emulated series. For the RPC1, our emulation obviously captures the main features of MBH98 quite accurately, but there is a puzzling deterioration in the controversial early 15th century, where our emulation of their method yields higher values than are archived. Without examining source code, we see no way to reconcile this difference.

Figure 1. Emulation of RPC1. The emulation is less accurate in the 1400-1450 period. Black – MBH98; red dashed – emulated.
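
For readers who want to see the mechanics, the re-scale-then-splice procedure described above can be sketched roughly as follows. This is only an illustration of my surmise, not MBH98's code or our own emulation script. The function names, the dictionary layout (year mapped to value) and the 1902-1980 window used to stand in for "the 20th century" are all assumptions made for the sketch, and no re-centering of the series is attempted.

```python
from statistics import pstdev


def rescale(rpc, temp_pc, calib=(1902, 1980)):
    """Scale a step's RPC so its standard deviation over the calibration
    window matches that of the temperature PC.

    rpc, temp_pc: dicts mapping year -> value.
    calib: assumed calibration window (hypothetical default).
    """
    window = [y for y in range(calib[0], calib[1] + 1)
              if y in rpc and y in temp_pc]
    factor = pstdev([temp_pc[y] for y in window]) / pstdev([rpc[y] for y in window])
    return {y: v * factor for y, v in rpc.items()}


def splice(step_rpcs):
    """Splice stepwise series into one series.

    step_rpcs: dict mapping step start year -> re-scaled series (year -> value).
    Each year takes its value from the latest step starting at or before it.
    """
    spliced = {}
    for start in sorted(step_rpcs):  # later steps overwrite earlier ones
        for year, value in step_rpcs[start].items():
            if year >= start:
                spliced[year] = value
    return spliced


# Toy data only: two made-up steps and a made-up temperature PC.
temp_pc = {y: 0.1 * (1 if y % 2 else -1) for y in range(1902, 1981)}
raw_steps = {
    1400: {y: 1.5 * (1 if y % 2 else -1) for y in range(1400, 1981)},
    1450: {y: 0.8 * (1 if y % 2 else -1) for y in range(1450, 1981)},
}
rescaled_steps = {s: rescale(r, temp_pc) for s, r in raw_steps.items()}
spliced_rpc1 = splice(rescaled_steps)
```

In this sketch the spliced series uses the AD1400 step for 1400-1449 and the AD1450 step from 1450 onward, which is the pattern I surmise for the archived series.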



  1. Hans Erren
    Posted Feb 25, 2005 at 3:29 AM | Permalink

    As for incomplete archiving, Dr Mann is committed to help you!

    We would like to take this opportunity to re-iterate our commitment to getting the science right, and as importantly, getting it right in real-time. We welcome all corrections or clarifications and we will endeavour to fix any errors, great or small, as quickly as we can.

  2. TCO
    Posted Sep 10, 2005 at 8:42 PM | Permalink

    WTF. Why did they splice? Why didn’t they say they did? And what effect did it have on the answer?

  3. Steve McIntyre
    Posted Sep 11, 2005 at 6:45 AM | Permalink

    TCO, even with all the work that I’ve done on this, I still don’t understand some aspects of what they’ve done. The new source code is of no help here. It doesn’t show any splicing. Where does the AD1650 step come from? I have no idea – it’s never mentioned.

    I’m particularly intrigued by this splicing, because the first dataset provided to us contained spliced tree ring PCs, a data set later said by Mann to be “garbage”. In a remarkable feat of prestidigitation, even though we had noticed the splicing and avoided the problem by re-assembling the data, Mann convinced the climate science world that we used the “garbage” data set. Among other things he lied about the date of its creation as it was created long before our inquiry. (He deleted the dataset together with its incriminating date evidence). The real and still unanswered question is whether Mann used the “garbage” data set. It’s possible that he didn’t; that it was an intermediate concoction of Rutherford (who uses the name of the dataset pcproxy in an archived graphic). But then it’s worth inquiring as to who used this dataset. Anyway, we didn’t.

    So when I see splicing, it has a resonance for me that it doesn’t for others. The new source code dump refers to data sets that don’t exist in the archived data, so it’s still hard to tell how they got their exact results. Wahl and Ammann’s claims that they have exactly replicated Mann are pure disinformation. They, like us, can somewhat emulate Mann’s results, but they haven’t exactly replicated them.

  4. TCO
    Posted Sep 23, 2005 at 9:41 PM | Permalink

    1. Well…of course they fully explained their methods and algorithm in the text of the publication, no? 😉

    2. What happened with the lying about you using the wrong version (link)?

    3. Link to story re deleting of database and speculation why? lose track of the intrigues…
