Replication #1: MBH98 Temperature Dataset

I have a considerable inventory of material on replication issues pertaining to MBH98 that does not really fit into academic journal formats. I’ll probably do about 25 of these notes, which may interest a few people and will illustrate the obstacles to replicating MBH98 without a close examination of source code. I’ll start with the temperature dataset used in MBH98, identifying the correct reference rather than the one provided in the Corrigendum SI.

MBH98 provides the following information on instrumental temperature provenance:

Monthly instrumental land air and sea surface temperature [10- Jones, P. D. & Briffa, K. R. Global surface air temperature variations during the 20th century: Part 1—Spatial, temporal and seasonal details. Holocene 1, 165–179 (1992)] grid-point data (Fig. 1b) from the period 1902–95 are used to calibrate the proxy data set.

Elsewhere they stated that 1128 months of data were used for temperature PC determination:

the N=1,128 months of data available from 1902 to 1995 were sufficient for a unique, overdetermined eigenvector decomposition

Intuitively, one wonders how Jones and Briffa [1992] could already have included 1995 results, and this concern turns out to be well-founded.

The Corrigendum SI (July 2004), which included a copy of the data set said to have been used, stated once again that Jones and Briffa [1992] had been used:

The original Climatic Research Unit (CRU) gridpoint surface temperature temperature database from 1854-1993 of Jones and Briffa [1992] used by MBH98 (this version of the dataset has been replaced by a different surface temperature dataset at CRU and is no longer available).

The temperature dataset anomalies-new itself, as archived in July 2004 at the Corrigendum SI, is a file of size 22.168 MB with 2592 gridcells (72 longitude bands x 36 latitude bands) and appears to be zeroed on a 1961-1990 reference period.

First, 72 longitude bands by 36 latitude bands implies a 5×5 degree grid, whereas the Jones and Briffa [1992] version uses 5×10 degree gridcells. So Jones and Briffa [1992] cannot be the correct provenance, despite the explicit statement in MBH98, reiterated in the Corrigendum.
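The gridcell arithmetic can be checked with a minimal sketch. The function name `cell_count` is hypothetical, and reading "5×10" as 5 degrees of latitude by 10 degrees of longitude is an assumption; the count comes out the same if the two are swapped.

```python
def cell_count(lat_step_deg: int, lon_step_deg: int) -> int:
    """Number of gridcells needed to tile the globe at a given resolution."""
    lat_bands = 180 // lat_step_deg   # pole to pole
    lon_bands = 360 // lon_step_deg   # all the way around
    return lat_bands * lon_bands

five_by_five = cell_count(5, 5)    # 36 latitude bands x 72 longitude bands
five_by_ten = cell_count(5, 10)    # 36 latitude bands x 36 longitude bands (assumed orientation)

print(five_by_five)  # 2592, matching the archived anomalies-new file
print(five_by_ten)   # 1296, half as many cells as the archived file
```

A 5×10 grid cannot produce a 2592-cell file, which is the basis for ruling out the 1992 version.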

Second, the dataset anomalies-new has monthly anomalies for only 1102 months from 1854 to 1993. Thus, the claims in MBH98 about using data from 1902-1995 (made twice) and about using 1128 months both appear to be incorrect.
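The month counts are easy to verify. The sketch below only counts calendar months in a span of whole years; the helper name `months_in_span` is hypothetical, and it says nothing about which months actually contain data in the archived file.

```python
def months_in_span(first_year: int, last_year: int) -> int:
    """Calendar months in an inclusive span of whole years."""
    return (last_year - first_year + 1) * 12

print(months_in_span(1902, 1995))  # 1128, the N quoted in MBH98
print(months_in_span(1902, 1993))  # 1104, the most a file ending in 1993 could supply from 1902 on
print(months_in_span(1854, 1993))  # 1680, the full span named in the Corrigendum SI
```

So N=1,128 is exactly what 1902-1995 would give, but a dataset that stops in 1993 cannot contain that many months after 1902.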

Elsewhere in the Corrigendum SI, Mann inconsistently cited Jones et al [1995], supposedly centered on 1951-1970, as the provenance of the temperature data:

Surface temperature anomaly [Jones et al (1995) version of the CRU instrumental surface air temperature dataset, available from 1854-1993, represented as anomalies from 1951-1970 base period] with nearly continuous monthly sampling were used.

However, the Corrigendum SI did not give any further details on Jones et al [1995] and inspection of Jones’ publications, as listed at CRU, does not yield any obvious candidates.

Examining the evolution of the Jones data set, Jones et al. [2001] at CDIAC stated:

The reanalysis of land surface data by the CRU (Jones 1994) resulted in (1) the inclusion of over 1000 additional stations, (2) a new reference period common to all stations (1961-1990; previously 1950-1979), and (3) increased grid-box resolution of the temperature anomalies (5° X 5°).

While there is conflicting information, it appears likely that the temperature data set used in MBH98 was an early version of the 5×5 degree Jones [1994] version (using 1854-1993).

In any event, I downloaded the file “anomalies-new” from the Corrigendum SI and collated it into an R-object as a matrix of time series (2592 columns arranged in Jones order, i.e. 87.5N to 87.5S (hour hand), 177.5W to 177.5E (minute hand)). Since I did the collation on a computer that is a couple of years old, I did it in two parts and spliced the table, but this would be unnecessary on a 2004 computer. The collation script is here. The collation was checked in several ways (as seen later), e.g. by plotting the location of cells with near-continuous coverage and by comparing time series from collated gridcells.
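For readers who want to locate a particular gridcell in the collated matrix, here is a minimal sketch of the column ordering described above. It is written in Python rather than R and is not the collation script itself. It assumes that "hour hand" means latitude changes slowest and "minute hand" means longitude changes fastest, and it uses 0-based column numbers (an R matrix would be 1-based). The name `cell_center` is hypothetical.

```python
from typing import Tuple

N_LAT = 36   # latitude bands, 87.5N down to 87.5S
N_LON = 72   # longitude bands, 177.5W across to 177.5E
STEP = 5.0   # gridcell size in degrees

def cell_center(col: int) -> Tuple[float, float]:
    """Return (lat, lon) of the gridcell centre for a 0-based column index.

    Assumed ordering: latitude is the slow index (north to south),
    longitude is the fast index (west to east). West and south are negative.
    """
    if not 0 <= col < N_LAT * N_LON:
        raise IndexError("column index out of range for a 36 x 72 grid")
    lat_band, lon_band = divmod(col, N_LON)
    lat = 87.5 - STEP * lat_band
    lon = -177.5 + STEP * lon_band
    return lat, lon

print(cell_center(0))     # (87.5, -177.5): first cell, 87.5N 177.5W
print(cell_center(72))    # (82.5, -177.5): start of the second latitude band
print(cell_center(2591))  # (-87.5, 177.5): last cell, 87.5S 177.5E
```

If the archived file turns out to use the opposite nesting, swapping the roles of the two indices in `divmod` is the only change needed.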

I have not compared the temperature dataset now archived to other vintage data sets from CRU. It would be worth comparing this newly archived vintage data to the following datasets archived at CDIAC: ndp03/R1, the early Jones 5×10 data to 1984; ndp021/Rev 1990 from 1851 to 1988 (5×10); ndp-022/R2 (Rev. 1993) from 1854 to 1991 (reference period 1950-1979); ndp032 (Antarctic); and a later file (2000) from 1856-2000, which seems to be the predecessor to the current HadCRU2, which also starts in 1856 (reference 1961-1990).

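Jones and Briffa [1992], Global surface air temperature variations during the 20th century: Part 1, Spatial, temporal and seasonal details. Holocene 2, 165-179.
Jones [1994], Hemispheric surface air temperature variations: A reanalysis and an update to 1993. Journal of Climate 7(11), 1794-1802.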

One Comment

  1. N. Joseph Potts
    Posted Feb 21, 2005 at 6:00 PM

    At 6:55PM EST, I was unable to get ("here") to work. Page-not-available error.

    Steve: I forgot to upload it. Should be better now. It’s not prettied up and you may need to adjust files for your own use.
