A couple of years ago, Matthew Menne of NOAA applied a form of changepoint algorithm in USHCN v2. While changepoint methods do exist in conventional statistics, Menne’s use of these methods to introduce thousands of breaks in noisy and somewhat contaminated data was novel. BEST’s slicing methodology, in effect, implements a variation of Menne’s methodology on the larger GHCN data set. (It also introduces a curious reweighting scheme discussed by Jeff Id here.) With a little reflection, I think that it can be seen that the mathematics of Mennian methods will necessarily slightly increase the warming trend in temperature reconstructions from surface data, an effect previously seen with USHCN and now seen with GHCN.
Mennian changepoint methods break series into segments at breakpoints; the segments, rather than the longer parent series, are then fed into the averaging machinery. BEST describe their slicing as follows:
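The mechanics can be sketched in a few lines (a toy illustration of my own, not BEST's actual code; the function name is hypothetical): a record is cut at candidate breakpoints, and each resulting fragment enters the averaging as if it were a separate station.

```python
import numpy as np

def slice_at_breakpoints(series, breakpoints):
    """Split a 1-D record into fragments at the given break indices."""
    cuts = [0] + sorted(breakpoints) + [len(series)]
    return [series[a:b] for a, b in zip(cuts[:-1], cuts[1:])]

record = np.arange(10.0)            # a toy 10-step record
fragments = slice_at_breakpoints(record, [3, 7])
print([len(f) for f in fragments])  # three fragments of lengths 3, 4, 3
```

Each fragment subsequently receives its own fitted offset (baseline), which is what makes the asymmetry discussed below matter.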
Our method has two components: 1) Break time series into independent fragments at times when there is evidence of abrupt discontinuities, and 2) Adjust the weights within the fitting equations to account for differences in reliability.
The second aspect of this method was discussed by Jeff Id here. I limit myself in this post to the first aspect.
BEST’s variation of Mennian changepoint methods applied to GHCN resulted in thousands of slices:
This empirical technique results in approximately 1 cut for every 12.2 years of record, which is somewhat more than the changepoint occurrence rate of one every 15-20 years reported by Menne et al. 2009.
I invite readers to momentarily reflect upon the properties of slicing methodology.
If Mennian changepoint methods remove more negative steps than positive steps, it will increase the trend of the final temperature series (and vice versa). This is a simple and fundamental point about slicing methods that is not made in the articles, but is worth holding on to.
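The point can be illustrated with a toy simulation (my own sketch, not anything in the BEST code): a record with no true trend but a single negative step reads as a cooling trend when fit whole; slice at the step and align each fragment to its own baseline, and the cooling disappears, i.e. the reconstructed trend increases. The mirror-image holds for a positive step, so asymmetric detection produces a net trend bias.

```python
import numpy as np

def ols_slope(y):
    """OLS trend of a series against its time index."""
    x = np.arange(len(y))
    return np.polyfit(x, y, 1)[0]

n = 100
series = np.zeros(n)
series[n // 2:] -= 1.0           # one negative step, no true trend

slope_raw = ols_slope(series)    # negative: the step reads as cooling

# Slice at the step and align each fragment to its own mean,
# mimicking the per-fragment offset fit applied after slicing:
left, right = series[: n // 2], series[n // 2:]
aligned = np.concatenate([left - left.mean(), right - right.mean()])
slope_sliced = ols_slope(aligned)  # the apparent cooling is removed

print(slope_raw, slope_sliced)
```

Removing the negative step raises the fitted trend from negative to zero; had the step been positive and gone undetected, its warming contribution would have been retained.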
If there is an overall warming trend in the data (the mix between climatic and urbanization effects doesn’t seem to matter for this point), it seems highly plausible to me that the changepoint method will be more likely to pick up a negative step and miss a positive step. (I realize that the size of the effect would need to be established in the situations at hand. My guess is that a more artful mathematician than myself could prove the point from first principles, but the existence and direction of the effect seem self-evident.)
A couple of years ago, I observed that the introduction of Mennian methods into USHCN appeared to impact GISS US (resulting in warming relative to USHCN v1): the difference between the 1930s and the early 2000s increased by 0.3 deg C between 2007 and 2011, an increase that I postulated arose from Mennian methodology, though I did not analyse these methods further at the time.
Figure 1. NASA Adjustments. Post-2007 adjustments were surmised to arise primarily from changes in USHCNv2 (Mennian) relative to USHCNv1.
The fact that BEST is also running somewhat hotter than NOAA or CRU using the same GHCN data indicates to me that a similar phenomenon is at work here.
BEST do not directly reflect on this problem. They state that the introduction of unnecessary breakpoints “should be trend neutral”, though increasing uncertainty somewhat:
The addition of unnecessary breakpoints (i.e. adding breaks at time points which lack any real discontinuity), should be trend neutral in the fit as both halves of the record would then be expected to tend towards the same b_i value; however, unnecessary breakpoints can amplify noise and increase the resulting uncertainty in the record (discussed below).
This argument, as far as it goes, seems fair enough to me. But it doesn’t deal directly with a potential bias towards detecting negative breaks over positive breaks.
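BEST's trend-neutrality claim is itself easy to verify in a toy setting (again my own sketch, under the simplifying assumption that the fit amounts to a common slope with a separate offset per fragment): cutting a trend-plus-noise record at a point with no real discontinuity leaves the slope estimate unbiased but noisier.

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials, true_slope = 60, 2000, 0.02

def fit_common_slope(y, cut=None):
    """OLS slope with a separate intercept for each fragment."""
    x = np.arange(len(y), dtype=float)
    if cut is None:
        X = np.column_stack([x, np.ones(len(y))])
    else:  # one indicator column (offset) per fragment
        X = np.column_stack([x, (x < cut).astype(float),
                                (x >= cut).astype(float)])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[0]

whole, cut = [], []
for _ in range(trials):
    y = true_slope * np.arange(n) + rng.normal(0, 0.5, n)
    whole.append(fit_common_slope(y))           # unsliced fit
    cut.append(fit_common_slope(y, cut=n // 2)) # unnecessary break

whole, cut = np.array(whole), np.array(cut)
print(whole.mean(), cut.mean())  # both near true_slope: trend neutral
print(whole.std(), cut.std())    # the cut estimate is noisier
```

This reproduces what BEST assert: the unnecessary break does not bias the slope, but it discards the level information linking the two fragments, so the estimate's variance grows. It says nothing, however, about breaks detected asymmetrically, which is the concern raised above.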
From the perspective proposed in this post, the fact that Mennian slicing methods applied to GHCN data yield a slightly warmer trend than CRU and NOAA obtain from unsliced data should not be viewed as unexpected. Indeed, rather than classifying the result as unexpected, I think that the result is better described as “trivial” (as this term is used in mathematics to denote something simple rather than unimportant).
We already knew the results from (more or less) averaging unsliced GHCN without allowance for urbanization and landscape changes. So the results from sliced data also without allowance for urbanization and landscape changes are unsurprising.
The issue is, as it always has been, the contribution, if any, of urbanization or landscape changes to the trend. Here BEST provide a large loophole, as they warn that their methods will not cope with “large scale systematic biases”:
however, we can’t rule out the possibility of large-scale systematic biases. Our reliability adjustment techniques can work well when one or a few records are noticeably inconsistent with their neighbors, but large scale biases affecting many stations could cause such comparative estimates to fail.
Unfortunately, “large scale systematic biases” from urbanization and landscape changes are precisely what’s at issue, and BEST’s slicing methodology does not directly bear on this problem. They purport to address the issue in a companion paper on urbanization, which I will examine in a forthcoming post, where I will also discuss the large discrepancy between BEST and satellite data, a point not touched on in the articles themselves.