Temperature stations are known to be affected by numerous forms of inhomogeneity. Allowing for such inhomogeneities is an interesting and far from easy statistical problem. Climate scientists have developed some homemade methods to adjust for such inhomogeneities, with Menne’s changepoint-based algorithm, introduced a few years ago in connection with USHCN, among the most prominent. Although the methodology was entirely statistical, they introduced it only in the climate literature, where peer reviewers tend to be weak in statistics and focused more on results than on methods.
Variants of this methodology have since been used in several important applied results. Phil Jones used it to purportedly show that the misrepresentations in the canonical Jones et al 1990 article about having inspected station histories of Chinese stations “didn’t matter” (TM- climate science). More recently, the Berkeley study used a variant.
In commentary on USHCN in 2007 and 2008, I observed the apparent tendency of the predecessor homogenization algorithm to spread warming from “bad” stations (in UHI sense) to “good” stations, thereby increasing the overall trend.
I requested a copy of Menne’s algorithm at the time of its introduction, but was refused. While information on the algorithm has since improved, I haven’t had occasion to re-visit the issue. At the time, I observed:
Menne’s methodology is another homemade statistical method developed by climate scientists introduced without peer review in the statistical literature. As a result, its properties are poorly known.
I expressed a particular concern that Menne’s algorithm might be spreading UHI warming at low-quality stations to better-quality rural stations through biased detection of changepoints. In a comment on the Berkeley study, which used a similar method, I noted their caveat that the methodology had not been demonstrated against systemic biases (such as widespread UHI):
however, we can’t rule out the possibility of large-scale systematic biases. Our reliability adjustment techniques can work well when one or a few records are noticeably inconsistent with their neighbors, but large scale biases affecting many stations could cause such comparative estimates to fail.
In a post on the application of changepoint methods to radiosonde data, I cited Sherwood’s criticisms of changepoint methods for homogenization, noting that they were similar to mine on USHCN adjustments:
Finally, when reference information from nearby stations was used, artifacts at neighbor stations tend to cause adjustment errors: the “bad neighbor” problem. In this case, after adjustment, climate signals became more similar at nearby stations even when the average bias over the whole network was not reduced.
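A toy simulation makes the “bad neighbor” effect concrete. The sketch below is not Menne’s algorithm, the Berkeley method, or any published changepoint procedure. It is a deliberately naive stand-in in which each station is adjusted by removing the linear trend of its difference from the mean of all other stations. The network size, the number of drifting stations, the trend and drift rates, and the function names are all assumptions chosen for illustration.

```python
import numpy as np


def simulate_network(n_stations=20, n_years=100, n_biased=8,
                     true_trend=0.005, drift=0.01, seed=0):
    """Synthetic network: a common climate trend plus noise, with a
    gradual UHI-like drift added to the first n_biased stations.
    All parameter values are hypothetical."""
    rng = np.random.default_rng(seed)
    years = np.arange(n_years, dtype=float)
    climate = true_trend * years
    noise = rng.normal(0.0, 0.1, size=(n_stations, n_years))
    bias = np.zeros((n_stations, n_years))
    bias[:n_biased] = drift * years
    return years, climate + noise + bias


def station_trends(years, data):
    """Least-squares slope of each station series."""
    return np.array([np.polyfit(years, row, 1)[0] for row in data])


def naive_neighbor_adjust(years, data):
    """Remove from each station the fitted linear trend of its difference
    from the mean of all other stations (a naive neighbor reference)."""
    n = data.shape[0]
    total = data.sum(axis=0)
    adjusted = np.empty_like(data)
    for i in range(n):
        reference = (total - data[i]) / (n - 1)
        slope, intercept = np.polyfit(years, data[i] - reference, 1)
        adjusted[i] = data[i] - (slope * years + intercept)
    return adjusted


if __name__ == "__main__":
    years, raw = simulate_network()
    adj = naive_neighbor_adjust(years, raw)
    raw_t, adj_t = station_trends(years, raw), station_trends(years, adj)
    print("network mean trend, raw vs adjusted:", raw_t.mean(), adj_t.mean())
    print("spread of trends,   raw vs adjusted:", raw_t.std(), adj_t.std())
    print("unbiased stations,  raw vs adjusted:", raw_t[8:].mean(), adj_t[8:].mean())
```

In this toy setup the adjusted stations agree with one another far more closely than the raw ones, yet the network-average trend is unchanged and the stations that started with no drift end up with a higher trend than before. That is the pattern Sherwood describes, though a simulation this crude says nothing about how large the effect is in any real algorithm.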
Working through the homogenization algorithms is not a small job and, unfortunately, it’s one of many issues that I haven’t pursued. Nonetheless, I’ve continued to be somewhat wary of changepoint algorithms as an automated method of curing defective data. My own instinct, based on practices of geologists, is that, for practical purposes, it’s best not to assume that all data is of equal quality, but to work outwards from the “best” data, “best” being in terms of ex ante standards. In the case of temperature data, that means long rural stations with known and consistent measurement methods.
I do not plan to parse the new study or to examine the impact of the biases identified in this study on the major temperature indices. It is evident to me that it is warmer now than in the 19th century, a point that has never been disputed at this site. However, the statistical properties of changepoint methods deserve very close examination, and the new study by Steirou and Koutsoyiannis will help to mitigate the neglect thus far of this important issue by the “community”.