NAS: Assuring the Integrity of Research Data

per inquired recently about obtaining a copy of Gerry North’s presentation to the newly minted NAS Panel on Assuring the Integrity of Research Data, which held its first hearings last week. Gerry North was appropriately the first speaker, as the new panel was occasioned by problems left unanswered by the North panel, although its terms of reference are much broader. The North presentation is here. Some background and thoughts follow.

Measuring Precipitation on Willis' Boots

Willis writes in with the latest FOI refusal from CRU, which says that they are unable to provide a list of the sites used in HadCRU3.

Unthreaded #9

Continuation of Unthreaded #8

A Try for Thompson Data at PNAS

The recent success in getting at least some data from Phil Jones – which he had obstructed since my original request in 2003 – has caused me to refresh my attempts to get Lonnie Thompson to archive his data so that the scandalous inconsistencies between different versions can finally be appraised. Last year, he published an article drawing on seven tropical ice cores in PNAS, which has a data policy that provides inter alia:

Unique Materials: Authors must make Unique Materials (e.g., cloned DNAs; antibodies; bacterial, animal, or plant cells; viruses; and computer programs) promptly available on request by qualified researchers for their own use.

and:

Databases: Before publication, authors must deposit large data sets (including microarray data, protein or nucleic acid sequences, and atomic coordinates for macromolecular structures) in an approved database and provide an accession number for inclusion in the published paper. When no public repository exists, authors must provide the data as Supporting Information online or, in special circumstances when this is not possible, on the author’s institutional web site, provided that a copy of the data is provided to PNAS.

These policies seem a little better on paper than some other journal policies. In addition, the PNAS webpage invites people experiencing problems to write to them, so I did. Here’s my letter:

Dear Sirs,

Last year, I was invited to make a presentation to the National Academy of Sciences Panel on Surface Temperature Reconstructions on millennial temperature reconstructions and have published several peer-reviewed articles in the field, which were cited by the above panel in their report last year.

I am writing in connection pursuant to your policies for availability of unique materials and databases http://www.pnas.org/misc/iforc.shtml#submission in connection with Thompson et al 2006, Abrupt tropical climate change: Past and present, PNAS 103, 10536-10543

Your policy statement says that:

Unique Materials: Authors must make Unique Materials (e.g., cloned DNAs; antibodies; bacterial, animal, or plant cells; viruses; and computer programs) promptly available on request by qualified researchers for their own use.

and:

Databases: Before publication, authors must deposit large data sets (including microarray data, protein or nucleic acid sequences, and atomic coordinates for macromolecular structures) in an approved database and provide an accession number for inclusion in the published paper. When no public repository exists, authors must provide the data as Supporting Information online or, in special circumstances when this is not possible, on the author’s institutional web site, provided that a copy of the data is provided to PNAS.

Thompson et al 2006 describe results from ice cores drilled at Dunde, Guliya, Dasuopu, Puruogangri, Quelccaya, Huascaran and Sajama. For each core, several thousand samples were taken and analyses on a sample-by-sample basis made for isotopes, chemistry and other indicators. The information for each core constitutes a large data set within the meaning of your policies. There is an excellent public repository for ice core data at the World Data Center for Paleoclimatology, which satisfies your definition of a public repository. Under your policies, Thompson et al had an obligation to archive this data as a condition of publication, but this appears to have been overlooked. Although Thompson et al provided a highly abbreviated summary of isotope information as Supplementary Information, the Supplementary Information is incomplete and not compliant with journal policies.

There is a pressing need to ensure compliance with journal data policies, because numerous inconsistent summaries are in gray and peer-reviewed circulation. For example, the figure below illustrates substantial differences between Dunde δO18 data summaries. These discrepancies can only be reconciled through examination of the underlying large data sets, which should have been archived prior to publication had journal policies been followed.

Dunde Versions. Heavy black: Yao et al 2006 (3-year rolling average); thin black: MBH98 (annual); red: PNAS 2006 (5-year averages); blue: Clim Chg 2003 (10-year averages); purple: Yang et al 2002 (values in 50-year intervals); green: Crowley and Lowery 2000 (original in standardized format, re-fitted here for display by regression fit to MBH98).

I request that you ensure that Thompson et al comply with your data policy by forthwith archiving the large datasets used in the PNAS article for each individual ice core (Dunde, Dasuopu, Guliya, Puruoganri, Quelccaya, Sajama, Huascaran) and for the entire suite of isotopes and chemistry. In addition, because the discrepancies may result from changing algorithms for dating the ice cores, I further request that the dating procedure for each core be made available under your Unique Materials policy.

Thank you for your attention.

Stephen McIntyre

We’ll see what happens. BTW the National Academy panel on Data Integrity, promised in the wake of the Surface Temperature Panel, has been empanelled and held its first hearings this week. Gerry North was the first speaker. He’s sent me a copy of his presentation, which I’ll post at some point.

Loso: Varves in Alaska

I said that I would post the graphic from Loso et al if someone sent it to me today. In fact, Loso et al is online here, and interested parties can consult it for themselves. I don’t have time to comment on this study other than very briefly, but here are some of the key graphics.

FOI: The “Final” Answer on Jones et al 1990

I wrote again on Apr 17, 2007 about my FOI request, observing that part (B) had not been answered: the identification of the stations used as comparanda in the calculations of Jones et al 1990.

Thank you for your courtesy and attention in this matter, which has successfully resolved part (A) of my request. However part (B) remains outstanding and I re-iterate my previous request for this information:

A) the identification of the stations … for the following three Jones et al 1990 networks:
1. the west Russian network
2. the Chinese network
3. the Australian network

B) identification … of the stations used in the gridded network which was used as a comparandum in this study

Thank you for your attention.
Regards, Steve McIntyre

I received the following reply:

In your email of 17 April 2007, you re-iterated your request from your email of 12 March 2007, to see

“B) identification … of the stations used in the gridded network which was used as a comparandum in this study”

I have been in conversation with Dr. Jones and have been advised that, in fact, we are unable to answer (B) as we do not have a copy of the station data as we had it in 1990. The station database has evolved since that time and CRU was not able to keep versions of it as stations were added, amended and deleted. This was a consequence of a lack of data storage comparable to what we have at our disposal currently.

I have been advised that the best equivalent data available is within the current version of CRUTEM3(v) or CRUTEM2(v). The latter is still available on the CRU web site, though not updated beyond 2005.

These latest versions are likely different from what was used in 1990. Australia and China have both released more data since then – it is likely that much of this was not digitized in 1990. Dr. Jones acknowledges that the grid resolution is now different, but this is again due to greater disk storage available.

The details of our updating of the raw station data is discussed in the following article:
Jones, P.D. and Moberg, A., 2003: Hemispheric and large-scale surface air temperature variations: An extensive revision and an update to 2001. J. Climate 16, 206-223.

This is, in effect, our final attempt to resolve this matter informally. If this response is not to your satisfaction, I will initiate the second stage of our internal complaint process and will advise you of progress and outcome as appropriate. For your information, the complaint process is within our Code of Practice and can be found at:

1.2750!uea_manual_draft_04b.pdf

Yours sincerely

David Palmer
Information Policy Officer
University of East Anglia

Inversions from Partial Correlation Coefficients

Willis raised an interesting point about trying to invert series based on partial correlation coefficients with monthly temperatures. His post, together with UC’s response, is collected here.

More North American Upper Treeline: Wilson-Luckman 2002, 2003

In our continuing quest for a North American upper treeline chronology that exemplifies the IPCC AR4 claim that additional data since TAR shows “coherent behavior” across multiple indicators, we turn now to the upper treeline proxies of Wilson and Luckman, 2002, 2003. They collected 20 Engelmann spruce sites in British Columbia in 1998, which together with a site from Washington archived at ITRDB, were discussed in two articles, Wilson and Luckman (Dendrochronologia 2002, Holocene 2003). WL03 said that: “All sites were at or within 100–200 m of upper treeline”.

Of the 21 sites, 7 were analyzed for MXD. None of the data has been archived, but it’s only 10 years since it was collected. [Update: This data was archived in August 2007, several months after this post.]

Recyclable AbathyThermograph Instruments

Lyman et al have had to correct their paper on ocean cooling, as the effect that they observed has proved to derive in part from a bias in “eXpendable BathyThermograph (XBT) instruments”. They report:

The rapid decrease in globally integrated upper (0–750 m) ocean heat content anomalies (OHCA) between 2003 and 2005 reported by Lyman et al. [2006] appears to be an artifact resulting from the combination of two different instrument biases recently discovered in the in situ profile data…

The second instrument bias arises in data from eXpendable BathyThermograph (XBT) instruments. These inexpensive probes were not designed to provide climate quality scientific data….

The other bias, however, appears to be caused by eXpendable BathyThermograph (XBT) data that are systematically warm [Gouretski and Koltermann, 2007] compared to other instruments. Characterization and removal of this bias will be required before the historical record of ocean heat content can be reevaluated

Perhaps they can now turn their attention to characterizing the biases and uses of Recyclable AbathyThermograph (RAT) instruments (formerly known as “buckets”), also a form of inexpensive probe not designed to provide climate quality scientific data, used in an earlier generation of shallow ocean measurements and illustrated below. (As observed by commenters, the earlier generation of inexpensive probes measured shallow (abathos) water and not deep (bathos) water.)


Figure: Recyclable AbathyThermograph (RAT) instruments, formerly known as “buckets”.

CRU on Rural Data Exclusion

In CRU’s “explanation” of why Chinese rural data is not included in the CRU database, Jones says:

None of the rural data (because of the annual resolution) entered subsequent versions of the Climatic Research Unit temperature database.

This is obviously untrue, as the “rural” sites in TR055, from which the Jones et al data were derived, had the same monthly resolution as the “urban” stations. Why say this if it isn’t true?