In the 1980s, John Christy and Roy Spencer revolutionized the measurement of temperature data through satellite measurement of oxygen radiance in the atmosphere. This accomplishment sidestepped the intractable problems of creating (what I’ll call) a “temperature reconstruction” from surface data known to be systematically contaminated (in unknown amounts) by urbanization, land use changes, station location changes, measurement changes, station discontinuities, etc.
Also in the 1980s, Phil Jones and Jim Hansen created land temperature indices from surface data, indices that attracted widespread interest in the IPCC period. The source data for their indices came predominantly from the GHCN station data archive maintained by NOAA (who added their own index to the mix.) The BEST temperature index is in this tradition, though their methodology varies somewhat from simpler CRU methods, as they calculate their index on “sliced” segments of longer station records (see CA here) and weight series according to “reliability” – the properties of which are poorly understood (see Jeff Id here.)
The graphic below compares trends in the satellite period to March 2010 – the last usable date in the BEST series. (SH values in Apr 2010 come only from Antarctica and introduce a spurious low in the BEST monthly data.) Take a look – comments follow.
The BEST and CRU series run hotter than TLT satellite data (GLB Land series from RSS and UAH considered here), with the difference exacerbated when the observed satellite trends are “downscaled” to surface using the amplification factor of approximately 1.4 (that underpins the “great red spot” observed in model diagrams). An amplification factor is common ground to both Lindzen and (say) Gavin Schmidt, who agree that tropospheric trends are necessarily higher than surface trends simply through properties of the moist adiabat. In the left barplot, I’ve divided the satellite trends by 1.4 to obtain “downscaled” surface trends. In a comment below, Gavin Schmidt observes that an amplification factor is not a property of lapse rates over land. In the right barplot, I’ve accordingly shown the same information without “downscaling” (adding this to the barplot in yesterday’s post.) (Note Nov 2, 2011 – I’ve edited the commentary to incorporate this amendment and have placed prior commentary in this section in the comments below.)
The UAH trend over land is 0.173 deg C/decade (0.124 deg C/decade downscaled) and the RSS trend is 0.198 deg C/decade (0.142 deg C/decade downscaled). I will examine this interesting property of the Great Red Spot on another occasion, but, for the purposes of this post, defer to Gavin Schmidt’s information on its properties.
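The “downscaling” above is simple arithmetic: divide the observed TLT trend by the assumed amplification factor. A minimal sketch, using the trend values quoted in the text (the amplification factor of 1.4 is itself the assumption under discussion, not a settled quantity):

```python
# Downscale observed TLT satellite trends to implied surface trends using an
# assumed tropospheric amplification factor of ~1.4. Trend values are those
# quoted in the post; the factor is the contested assumption.

AMPLIFICATION = 1.4  # assumed troposphere/surface trend ratio

satellite_trends = {"UAH land": 0.173, "RSS land": 0.198}  # deg C/decade

for series, tlt_trend in satellite_trends.items():
    surface = tlt_trend / AMPLIFICATION
    print(f"{series}: TLT {tlt_trend:.3f} -> downscaled surface {surface:.3f} deg C/decade")
```

This reproduces (to rounding) the downscaled values in the text.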
The simple barplot in Figure 1 clearly shows which controversies are real and which are straw men.
Christy and Spencer are two of the most prominent skeptics. Yet they are also authors of widely accepted satellite data showing warming in the past 30 years. To my knowledge, even the most adamant skydragon defers to the satellite data. BEST’s attempt to claim the territory up to and including satellite trends as unoccupied or contested Terra Nova is very misleading, since this part of the territory was already occupied by skeptics and warmists alike.
The territory in dispute (post-1979) is the farther reaches of the trend data – the difference between the satellite record and CRU, and then between CRU and the outer limits of BEST.
BEST versus CRU
On this basis, BEST (0.282 deg C/decade) runs about 0.06 deg C hotter than CRU (0.22 deg C/decade). My surmise, based on my post of Oct 31, 2011, is that this results from the combined effects of slicing and “reliability” reweighting, the precise proportion being hard to assign at this point and not relevant for present purposes.
Commenter Robert observed that CRU now runs cooler than NOAA or GISS. In the corresponding 1979-2010 period, NOAA has a virtually identical trend to BEST (0.283 deg C/decade), also using GHCN data. It turns out that NOAA changed its methodology earlier this year from one that was somewhat similar to CRU to one that uses Mennian sliced data. (I have thus far been unable to locate online information on previous NOAA versions.)
This indicates that the difference between BEST (and NOAA) versus CRU is probably more due to slicing than to reweighting.
CRU vs Satellites
CRU runs about 0.03-0.05 deg C/decade warmer than TLT satellite trends over land and about 0.08-0.10 deg C/decade warmer in the 1979-2010 period than downscaled satellite data.
Could this amount of increase be accounted for by urbanization and/or surface station quality problems?
In my opinion, this is entirely within the range of possibility. (This is not the same statement as saying that the difference has been proven to be due to these factors. In my opinion, no one has done a satisfactory reconciliation.) From time to time, I’ve made comparisons between “more urban” and “more rural” sites in relatively controlled situations (e.g. Hawaii, around Tucson following a predecessor local survey) and when I do the comparisons, I find noticeable differences of this order of magnitude. I’ve also done comparisons of good and bad stations from Anthony’s data set and again observe differences that would contribute to this order of magnitude. But, again, this is not the same thing as proof.
In the past, I’ve been wary of “unsupervised” comparisons of supposedly “urban” and supposedly “rural” subpopulations in papers by Jones, Peterson, Parker and others purporting to prove that UHI doesn’t “matter”. Such papers set up two populations – one “urban” and one “rural” – purport to show that the trends for each population are similar, and claim that this “shows” that UHI is a non-factor in trends. In my examination of prior papers, each one has tended to founder on similar points. All too often, the two populations are very poorly stratified – with the “rural” population all too often containing urban cities, sometimes even rather large cities.
The BEST urbanization paper is entirely in the tradition of prior studies by Jones, Peterson, Karl etc. They purport to identify a “very rural” population by MODIS information and show that they “get” the same answer. Unfortunately, BEST have not lived up to their commitment to transparency in this paper. Code is not available. Worse, even the classification of sites between very rural and very urban is not archived, with the pdf of the paper disconcertingly pointing to a warning that the link is unavailable (making it appear that no one even read the final preprint before placing it online.) Mosher has noted inaccuracies in their location data and observes that there are perils for inexperienced users of MODIS data; he reserves his opinion on whether the lead author of the urbanization paper, a grad student, managed to avoid these pitfalls until he’s had an opportunity to examine the still-unavailable nuts and bolts of the paper.
Mosher, who’s studied MODIS classification of station data as carefully as anyone, observes that there are no truly “rural” (in a USHCN target sense) locations in South America – all stations come from environments that are settled to a greater or lesser degree. Under Oke’s original UHI concept, the cumulative UHI effect was, as a rule of thumb, proportional to log(population). If “urbanization” is occurring in towns and villages as well as in large cities – which it is – then the contribution of UHI increase to temperature increase will depend on the percentage change in population (rather than absolute population). If proportional increases are the same, then the rate of temperature increase will be the same in towns and villages as in cities.
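The arithmetic behind that last point is worth spelling out. If cumulative UHI is k·log(population) for some proportionality constant k (a hypothetical value here, used only for illustration), then the UHI *increment* from population growth depends only on the growth ratio, not the absolute size:

```python
import math

# Sketch of Oke's rule of thumb: cumulative UHI proportional to log(population).
# k is a hypothetical proportionality constant; the point is that equal
# *percentage* growth yields equal UHI increments regardless of absolute size,
# since k*log(P2) - k*log(P1) = k*log(P2/P1).

def uhi_increment(pop_before, pop_after, k=1.0):
    return k * (math.log(pop_after) - math.log(pop_before))

village = uhi_increment(1_000, 1_500)          # 50% growth in a village
city = uhi_increment(1_000_000, 1_500_000)     # 50% growth in a large city

# Both equal k*log(1.5): the trend contribution is the same.
print(village, city)
```

So a growing village can contribute the same spurious warming trend as a growing city, which is why a “rural” classification by absolute population (or land cover) does not neutralize the UHI question.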
If one takes the view that satellite trends provide our most accurate present knowledge of surface trends, then one has to conclude that the BEST methodological innovations (praised by realclimate) actually provide a worse estimate of surface trends than even CRU.
In my opinion, it is highly legitimate (at least as a null hypothesis) to place greatest weight on satellite data and presume that the higher trends in CRU and BEST arise from combinations of urbanization, changed land use, station quality, Mennian methodology etc.
It seems to me that there is a high onus on anyone arguing in favor of a temperature reconstruction from surface station data (be it CRU or BEST) to demonstrate why this data with all its known problems should be preferred to the satellite data. This is not done in the BEST articles.
In discussions of proxy reconstructions, people sometimes ask: why does anyone care about proxy reconstructions in the modern period given the existence of the temperature record? The answer is that the modern period is used to calibrate the proxies. If the proxies don’t perform well in the modern period (e.g. the tree ring decline in the very large Briffa network), then the confidence, if any, that can be attached to reconstructions in pre-instrumental periods is reduced.
It seems to me that a very similar point can be made in respect to “temperature reconstructions” from somewhat flawed station records. Since 1979, we have satellite records of lower tropospheric temperatures over land that do not suffer from all the problems of surface stations. Yes, the satellite records have issues, but it seems to me that they are an order of magnitude more tractable than the surface station problems.
Continuing the analogy of proxy reconstructions, temperature reconstructions from surface station data in the satellite period (where we have benchmark data) should arguably be calibrated against satellite data. The calibration and reconstruction problem is not as difficult as trying to reconstruct past temperatures with tree rings, but neither is it trivial. And perhaps the problems encountered in one problem can shed a little light on the problems of the other.
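The calibration idea can be sketched in miniature. A minimal illustration, with invented annual-anomaly numbers (nothing here is BEST's or CRU's actual data or method): fit the station series to the satellite benchmark over their overlap by ordinary least squares, then use the fitted scale and offset to adjust the station series.

```python
# Toy sketch of calibrating a surface-station series against a satellite
# benchmark over an overlap period. All numbers are invented for illustration;
# this is the calibration *concept*, not any group's actual procedure.

def ols(x, y):
    # ordinary least squares slope and intercept for paired samples
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
            sum((xi - mx) ** 2 for xi in x)
    return slope, my - slope * mx

satellite = [0.00, 0.05, 0.12, 0.15, 0.22]  # benchmark anomalies (toy)
station = [0.02, 0.09, 0.18, 0.24, 0.33]    # station series running hotter (toy)

slope, intercept = ols(station, satellite)
calibrated = [slope * s + intercept for s in station]
# slope < 1 here: the fit scales the hotter station series down toward
# the satellite benchmark.
```

In a real application the fit would be done on monthly anomalies with attention to autocorrelation and error-in-variables issues, but the point stands: where a benchmark exists, the reconstruction can be tested and rescaled against it.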
Viewed as a reconstruction problem, the divergence between the satellite data and the BEST temperature reconstruction from surface data certainly suggests some sort of calibration problem in the BEST methodology. (Or alternatively, BEST have to show why the satellite data is wrong.) Given the relatively poor scaling of the BEST series in the calibration period relative to satellite data, one would have to take care against a similar effect in the pre-satellite period. However, the size of the effect appears likely to have been lower: both temperature trends and world urbanization were lower in the pre-satellite period.
I have one great regret about BEST’s overall strategy. My own instinct as to the actual best way to improve the quality of temperature reconstructions from station data is to focus on quality rather than quantity: to follow the practice of geophysicists working with data of uneven quality – start with the best data (according to objective standards) and work outwards, calibrating the next-best data against the best data.
They adopted the opposite strategy (a strategy equivalent to Mann’s proxy reconstructions): throw everything into the black box with no regard for quality and hope that the mess can be salvaged with software. Unfortunately, it seems to me that slicing the data actually makes the product (like NOAA’s) worse than CRU’s (using satellite data as a guide). It seems entirely reasonable to me that someone would attribute the difference between the higher BEST trend and satellite trends not to the accuracy of BEST with flawed data, but to known problems with surface stations and artifacts of Mennian methodology.
I don’t plan to spend much more time on it (due to other responsibilities).
[Nov 2 – there’s a good interview with Rich Muller here where Muller comes across as the straightforward person that I know. I might add that he did a really excellent and sane lecture (link) on policy implications a while ago that crosscuts most of the standard divides. ]
A Closing Editorial Comment
Finally, an editorial comment on attempts by commentators to frame BEST as a rebuttal of Climategate.
Climategate is about the Hockey Stick, not instrumental temperatures. CRUTEM is only mentioned a couple of times in passing in the Climategate emails. “Hide the decline” referred to a deception by the Team regarding a proxy reconstruction, not temperature.
In an early email, Briffa observed: “I believe that the recent warmth was probably matched about 1000 years ago.” Climategate is about Team efforts to suppress this thought, about Team efforts to promote undeserved certainty – a point clearly made in CRUTape Letters by Mosher and Fuller.
The new temperature calculations from Berkeley, whatever their merit or lack of merit, shed no light on the proxy reconstructions and do not rebut the misconduct evidenced in the Climategate emails.
[Nov 2. However, in fairness to the stated objectives of the BEST project, I should add the following.
Although I have never contested that a fairly simple average of GHCN data would yield something like CRUTEM, CRUTEM and Climategate have become co-mingled in much uninformed commentary – a misunderstanding disseminated by both Nature and Sarah Palin, and one that has frustrated me.
In such circumstances, a verification by an independent third party (and BEST qualifies here) serves a very useful purpose, rather like business audits which, 99% of the time, confirm management accounts, but improve public understanding and confidence. To the extent that the co-mingling of Climategate and CRUTEM (regardless of whether it was done by Nature or Sarah Palin) has contributed to public misunderstanding of the temperature records, an independent look at these records by independent parties is healthy – a point that I made in my first post and re-iterate in this one. While CA readers generally understand and concede the warming in the Christy and Spencer satellite records, this is obviously not the case in elements of the wider society, and there is a useful function in ensuring that people can reach common understanding on as much as they can. ]