An amusing farce with French protagonists is currently playing at realclimate here. The main protagonist is Raymond Pierrehumbert, the second line of whose CV proclaims that he is a Chevalier of some sort, a term which I will adopt in this post. He mocks another set of French authors, whom he labels as being Chevaliers of a rank presumably less dignified than his own. He accuses them of a variety of transgressions. As it turns out, I agree with most of his criticisms, but, as so often in climate science, Pierrehumbert (or the “Chevalier”) is silent on similar or more egregious transgressions by his fellow RC coauthors or IPCC.
In the farce at hand, the Chevalier raises a variety of issues. The only one that I review here pertains to the provenance of temperature data, since it is an area with which I am familiar and which also appears to be the most “Ugly” problem in the Chevalier’s beauty contest.
The temperature data problem originated with the Chevalier’s opponents (Courtillot et al), who, among other sins, failed to provide a proper data citation for their temperature series. (The lack of an accurate data citation is hardly a unique occurrence in climate science – it is fairer to say that it is the norm. It is something that I have regularly criticized, and the failure of Courtillot et al to provide one has led to much of the present farce.) Although the authors provided the provenance of the data (in a reply to a Comment by another set of French authors, Bard and Delaygue), the Chevalier contests whether they provided the actual provenance. He notes that Bard and Delaygue consulted the supposed originator of the data (Phil Jones), who denied ever producing the data in question. Bard and Delaygue reported this in a Note in Proof to their Comment, but this accusation was removed by the editor, at which the Chevalier took umbrage in his RC post. As we shall see, despite Jones’ denials of paternity, the data in question did originate with him – a classic case of mistaken identity.
All in all, it’s a French farce with the Chevalier often acting more like Inspector Clouseau than Hercule Poirot.
In order to properly identify digital data, an adequate citation requires an identification code, which, in practical terms, means a URL together with a download date. My daughter in an undergraduate non-climate course knows this. The AGU has a policy requiring this in its publications:
The following elements must be included in the [data set] reference: author(s), title of data set, access number or code, data center, location including city, state, and country, and date.
Unfortunately, this excellent policy is totally ignored by climate scientists, even in AGU journals like GRL, JGR etc.
The wisdom of a policy requiring adequate data citation is amply demonstrated by the present comedy.
The Chevalier commences his post by stating:
It’s the physics, stupid [in smaller font in the original post].
We put the last word in small letters since we’ve learned that it is not a good debating technique to imply (even inadvertently) that those who are having trouble seeing the force of our arguments might be stupid.
If I may say so, it’s not just a matter of debating technique. People are not stupid. If people have trouble seeing the “force of [his] arguments”, maybe he should spend a little time examining how he’s presenting these arguments. Maybe the problem isn’t that the people are stupid, but that his explanations are not as clear as he thinks they are. RC always seems a little quick to lay blame elsewhere, when a look in the mirror might be in order. Just a thought.
The Chevalier then announces:
we’ll expose a pattern of suspicious errors and omissions that pervades Courtillot’s paper. Sloppiness and ignorance is by far the most charitable interpretation that can be placed on this pattern.
We are completely in favor of realclimate authors spending the time to expose “patterns of suspicious errors and omissions” wherever they lie. (Verification r2 statistics anyone?). He then describes some papers as “Ugly” (as opposed to “Bad”):
These papers cross the line from the merely erroneous into the actively deceptive. Papers in this category commit what Damon and Laut judiciously call a “Pattern of strange errors.” Papers in this category often use questionable (and often hidden and undocumented) data manipulations to manufacture correlations where none exist.
One would have thought that a realclimate coauthor of Michael Mann’s would tread lightly when it came to raising the level of rhetoric about “questionable (and often hidden and undocumented) data manipulations to manufacture correlations where none exist”, but the Chevalier is charging into battle seemingly heedless of the effect that such standards might have on his associates.
In our critique of Mann’s principal components methodology, we criticized the standardization on the short calibration period, which introduced a bias in his methodology. It seems that Courtillot et al, the Chevalier’s opponents, also carried out short segment standardization, at which the Chevalier takes great umbrage as follows:
By snipping out just the last bit of the curve and normalizing to unit standard deviation, Courtillot inflates the variability and makes the fit look better than it would be if the full data set were used. As a bit of deceptive data manipulation, this has to go down in history with the selective smoothing used on some of the solar records that Damon and Laut discuss in their critique of the Danish solar work.
Obviously we share the Chevalier’s concern about the perils of short segment standardization. However, I would have thought that a more obvious comparandum for the perils of short segment standardization – and one surely familiar to RC coauthors – was MBH98 itself, where much publicity has been attached to the effect of short segment standardization. One would welcome the Chevalier’s expert opinion on whether short segment standardization in MBH98 was also a “bit of deceptive data manipulation [that] has to go down in history”, given that our own formal opinions have used more moderate language than that employed by the Chevalier.
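For readers who want to see the mechanics of short segment standardization, here is a minimal Python sketch. It uses a made-up alternating series (not any of the data discussed in this post), and the function name `standardize` and the 70/30 split are purely illustrative assumptions. It only shows that scaling a short segment by its own standard deviation can give a different apparent amplitude than scaling the same segment by the standard deviation of the full record.

```python
from statistics import mean, stdev

def standardize(values, ref):
    """Subtract the mean of `ref` and divide by the sample SD of `ref`."""
    m, s = mean(ref), stdev(ref)
    return [(v - m) / s for v in values]

# Synthetic placeholder series (NOT real data): 70 values with wide swings
# followed by 30 values with narrow swings.
series = [2.0 * (-1) ** i for i in range(70)] + [0.5 * (-1) ** i for i in range(30)]
segment = series[-30:]

# Same 30 values, scaled two ways.
by_segment = standardize(segment, ref=segment)  # short segment standardization
by_full = standardize(segment, ref=series)      # full-record standardization

print("SD when scaled on the segment itself:", round(stdev(by_segment), 3))
print("SD when scaled on the full record:   ", round(stdev(by_full), 3))
```

In this toy case the segment is quieter than the full record, so scaling it to unit standard deviation makes it look more variable than it does under full-record scaling. With a different series the effect could go the other way.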
The Chevalier also condemns opportunistic truncation of series – another issue discussed on this blog, where the truncation of the post-1960 values of the Briffa et al 2001 reconstruction by IPCC TAR and AR4 has been sharply criticized on several occasions. In Briffa et al 2001, post-1960 values go down, showing a sharp mismatch between the Briffa et al 2001 recon and both temperature and other reconstructions. As a reviewer of IPCC AR4, I specifically asked IPCC to show the post-1960 values and explain the mismatch as best they could. They refused, saying only that it would be “inappropriate” to show the post-1960 mismatch. The post-1960 values of Rutherford, Mann et al 2005 were also truncated in IPCC AR4. Here’s the Chevalier’s view on opportunistic truncation:
there is no legitimate reason in a paper published in 2007 for truncating the temperature record at 1992 as they did. There is, however, a very good illegitimate reason, in that truncating the curve in this way helps to conceal the strength of the trend from the reader, and shortens the period in which the most glaring mismatch between solar activity and temperature occurs.
Again, I completely share the Chevalier’s severe views on opportunistic truncation – especially where the effect of the truncation is to conceal a mismatch. Having condemned the practice, perhaps the Chevalier will move his sights to other even more prominent truncations, such as the truncation of post-1960 values in the Briffa et al 2001 and Rutherford, Mann et al 2005 reconstructions in IPCC AR4 (and for the former, in TAR). As a reminder, here is a graphic showing the untruncated Briffa reconstruction (you never see the downturn on the right in the spaghetti graphs).
On to the Farce
In his description of the really “Ugly” part of Courtillot et al, the Chevalier reported:
Bard and Delaygue noticed another strange thing. Courtillot’s “Tglobe” curve did not look much like the curve published by Jones. Jones’ curve, plotted from his actual data files, is shown in Bard and Delaygue’s corrected version of the figure; they also show the NASA reconstruction for comparison. These two curves are in agreement, but neither shows the sharp rise/dip pattern between 1940 and 1970 which is seen in Courtillot’s figure. So if Courtillot’s data is not Jones’ global mean temperature, what is it that Courtillot plotted? We may never know. In his response to Bard and Delaygue, Courtillot claims the data came from a file called: monthly_land_and_ocean_90S_90N_df_1901-2001mean_dat.txt. Bard and Delaygue point out, however, that Jones has no record of any such file in his dataset, and does not recognize the purported “Tglobe” curve as any version of a global mean temperature curve his own group has ever produced.
The Bard and Delaygue comment, cited by the Chevalier, included the following statement as a Note Added in Proof:
For the global temperature Tglobe curve cited from Jones et al. (1999) in Courtillot et al. (2007), these authors now state in their response that they had used the following data file: monthly_land_and_ocean_90S_90N_df_1901-2001mean_dat.txt We were unable to find this file even by contacting its putative author who specifically stated to us that it is not one of his files (Dr. Philip D. Jones, written communication dated Oct. 23, 2007).
The Chevalier chimed in with the following comment:
In the revised “Response” Courtillot now admits that the temperature record called “Tglobe” is not from any of Phil Jones’ datasets at all. Courtillot now claims that the data came from a study by Briffa et al. (2001), giving the address of a file stored at NCDC.
Now that the characters in this French farce are more or less all on stage, let’s examine some actual data. On the left is the Courtillot et al figure being criticized, as posted at realclimate from Courtillot et al. On the right is a graphic that I produced from the data in column 7 of ftp://ftp.ncdc.noaa.gov/pub/data/paleo/contributions_by_author/briffa1998/briffa2001jgr3.txt, which is entitled:
Observed temperatures from Jones et al. (1999) Rev Geophys
To produce the graphic on the right, I first applied a filter (running 11-year mean) and then standardized on the 1900-1990 period – both operations frequently carried out in paleoclimate. The code for retrieving the data and producing the graphic is shown in the first comment below. I’ve looked at hundreds of data versions of various climate series, and it is my opinion that the Tglobe series illustrated in the Courtillot graphic is a filtered and rescaled version of the digital data cited above.
Left: Diagram from Courtillot et al 2007 with temperature series in red; right – plot of column 7 from ftp://ftp.ncdc.noaa.gov/pub/data/paleo/contributions_by_author/briffa1998/briffa2001jgr3.txt, entitled “Observed temperatures from Jones et al. (1999) Rev Geophys”
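The two operations described above (an 11-year running mean followed by standardization on 1900-1990) can be sketched in a few lines of Python. This is not the script used for the right panel. It assumes a centered running mean and standardization by the sample mean and standard deviation of the reference period, and it runs on a synthetic placeholder series with a hypothetical 1871-1997 span rather than on the NCDC file.

```python
from statistics import mean, stdev

def running_mean(years, values, window=11):
    """Centered running mean; the first and last window // 2 years are dropped."""
    half = window // 2
    out_years = years[half:len(years) - half]
    out_values = [mean(values[i - half:i + half + 1])
                  for i in range(half, len(values) - half)]
    return out_years, out_values

def standardize(years, values, start=1900, end=1990):
    """Subtract the mean and divide by the sample SD of the start..end period."""
    ref = [v for y, v in zip(years, values) if start <= y <= end]
    m, s = mean(ref), stdev(ref)
    return [(v - m) / s for v in values]

# Synthetic placeholder anomalies (NOT the NCDC column 7 data);
# the 1871-1997 span is a hypothetical choice for illustration.
years = list(range(1871, 1998))
values = [0.005 * (y - 1871) + 0.1 * ((y % 7) - 3) / 3 for y in years]

# Filter first, then standardize, in the order described in the text.
sm_years, sm_values = running_mean(years, values)
z = standardize(sm_years, sm_values)

print("smoothed series spans", sm_years[0], "to", sm_years[-1])
```

To compare against a published figure, one would replace the placeholder `years` and `values` with the year column and column 7 of the archived file, after converting any missing-value codes.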
As noted above, Chevalier P-Humbert says that Courtillot has “admitted” that the temperature data did not come from any of Phil Jones’ datasets and comes instead from a Briffa et al 2001 study as follows:
Courtillot now admits that the temperature record called “Tglobe” is not from any of Phil Jones’ datasets at all. Courtillot now claims that the data came from a study by Briffa et al. (2001), giving the address of a file stored at NCDC.
We are well in the middle of the farce by this point. Readers of this blog know that Briffa and Jones are close associates and frequent coauthors, so, before concluding that an article by Briffa et al is unconnected with Jones, one should obviously check the co-authors, who in this case are:
K. R. Briffa, T. J. Osborn, F.H. Schweingruber, I.C. Harris, P. D. Jones, S.G. Shiyatov, and E.A. Vaganov.
As noted above, column 7 of the NCDC archive for Briffa et al 2001 states that the temperature data is:
Observed temperatures from Jones et al. (1999) Rev Geophys
So Chevalier P-Humbert’s allegation that the data in question did not come from any of Jones’ data sets appears to be simply untrue and an allegation that P-Humbert could have checked with the most elementary due diligence.
(Interestingly, Briffa et al 2001 referred to here is the very study whose post-1960 values were truncated by IPCC TAR and AR4, a truncation that, in my opinion, is as egregious or even more egregious than the truncation in Courtillot et al 2007 and, to avoid any perception of hypocrisy, I’m sure that we can expect the Chevalier to promptly communicate his vehement opposition to this truncation to IPCC authors, the Briffa et al 2001 coauthors and the Rutherford et al 2005 coauthors.)
So the citation for column 7 in the above archive by Briffa, Jones and coauthors was Jones et al. (1999) Rev Geophys; the citation in Courtillot et al 2007 was: P.D. Jones, M. New, D.E. Parker, S. Martin, G. Rigori, Surface air temperature and its changes over the past 150 years, Rev. Geophys. 37 (1999) 173–199.
Notwithstanding the identity established above, Chevalier P-Humbert reported that Phil Jones claimed that the series illustrated in Courtillot et al 2007 has nothing to do with him:
its putative author specifically stated to us that it is not one of his files (Dr. Philip D. Jones, written communication dated Oct. 23, 2007)
Unfortunately, the statement by the “putative author” was false. But, hey, this is climate science. Why should Jones have to undergo paternity tests? The Chevalier no doubt correctly reported that Jones had denied paternity, but the fact that Jones denied paternity did not mean that he was not a father or co-father of the data (even though he failed to recognize the filtered and smoothed version of the data provided in Briffa, Jones et al 2001).
As noted above, the editor of the journal removed the allegations added in proof, to which the Chevalier took great exception. However, in this case, and undoubtedly merely by happenstance, the editor’s judgement has been vindicated, as he removed an untrue allegation by one set of disputants.
Despite all the clumsiness by Phil Jones and the Chevalier, this is, after all, a French farce, and there is plenty of confusion to go around. Courtillot et al are not innocent: indeed, their failure to provide a proper data citation, even if standard in climate science, has fostered the comedy. They also misconstrued the series archived in Briffa et al 2001 as a global series instead of a 20-90N series. As noted above, column 7 of the data archive referred to above provides data purporting to be:
Observed temperatures from Jones et al. (1999) Rev Geophys
Jones et al 1999 reports Global, NH and SH temperature, and Courtillot apparently assumed unwarily that the digital data said to come from Jones et al (1999) actually did come from that publication, and then assumed that it was the global temperature (as opposed to the NH temperature). As it turns out, the data here does not appear to be any of the three series (NH, SH, Global) illustrated in Jones et al 1999, but a special-purpose series calculated for Briffa et al 2001 covering 20-90N. In addition to the temperature said to come from Jones et al 1999, the Briffa et al 2001 archive lists 6 temperature reconstructions with the following comment:
The following reconstructions have been taken from the source references listed below, and then RECALIBRATED to obtain estimates of April-September mean temperatures from all land regions north of 20N. All series are temperature anomalies in degrees C with respect to the 1961-1990 mean.
The reconstruction data is listed in columns 1-6. It appears that the column 7 data said to come from Jones et al 1999 is actually a new composite for 20N-90N calculated for Briffa Jones et al 2001 – it doesn’t say this – indeed it might be interpreted not to say this – but, based on familiarity with Team methods, this would be my guess and Courtillot seems to have been wrongfooted here. (My guess is that it’s probably not a whole lot different from the contemporary NH version but that’s just a guess)
Adding to the comedy is the obsolete version used here. The series in Briffa et al 2001 only comes up to 1997 and was itself over 3 years obsolete when published (although I don’t recall any previous objections). When Courtillot applied an 11-year filter, the smoothed series lost its last 5 years and thus ended in 1992, exacerbating the problem of using an obsolete data version in the first place.
I strongly disagree with the use of obsolete data – a position espoused on other occasions at this blog. I entirely concur with the Chevalier’s objection on this point. I’ve consistently objected to the use of obsolete data versions – as far back as MM03, we reported MBH98 use of obsolete data versions. But in Mann et al 2007, Mann hasn’t changed a comma in the MBH98 data set, continuing the use of data versions already obsolete in 1998. The Chevalier is silent as a lamb. Mann’s PC1 has surely become obsolete in light of the criticisms of Wegman and the NAS panel and yet Mann et al 2007 perpetuates its use. The Chevalier remains silent. So while I strongly agree with the Chevalier’s criticism of obsolete data, I disagree with the hypocrisy of his silence about other similar transgressions.
All in all, a comedy of errors and mistaken identity – a French farce indeed. Courtillot used an obsolete data set without providing a data citation. Courtillot did not realize that the series said to come from Jones et al 1999 was not one of the series illustrated in Jones et al 1999. Courtillot did not realize that the data entitled “observed temperatures from Jones et al. (1999) Rev Geophys” was actually temperature for 20-90N, a subcalculation nowhere mentioned in Jones et al 1999 (which showed global, NH and SH temperatures). When Courtillot identified the provenance (it seems accurately), the originating author, Jones, did not recognize the series from Briffa, Jones et al 2001 in Courtillot’s smoothed and re-scaled version and falsely denied paternity. Bard and Delaygue accepted Jones’ denial of paternity at face value and falsely accused Courtillot of misrepresenting where the data had been obtained. Neither Bard nor the Chevalier actually bothered plotting the data from the location (Briffa et al 2001 at NCDC) to see whether it matched the Courtillot figure. The Chevalier then recklessly accused Courtillot of making a “suspicious error” even though a simple plot of the NCDC data as done here would have proved the provenance. The editor of EPSL removed these allegations, which have now been shown to be false, and in doing so incurred the wrath of the Chevalier.
I have only discussed the farce of the temperature measurements. I have not visited the issues of magnetic data or solar data or correlations. None of the authors in this dispute provided proper data citations (with exact URLs). I have no doubt that there are many more episodes in this comedy, but even I have a limited appetite for French farce.
Code for Right Panel Above
In case the Chevalier or his entourage elect to deny that the temperature series in Courtillot can be derived from a Jones data set, here is source code which yields the above graphic on a turnkey basis:
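# Treat missing-value codes (anything below -999) as NA; Briffa is assumed to hold the table already read from the NCDC file
temp = (Briffa < (-999)); Briffa[temp] = NA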