"Worse than We Thought"

I’ve just collated the ERSST v3b and ERSST v2 versions for the tropics and compared them. CA readers were undoubtedly ready for adjustments, but, using the technical terminology of the leading climate journals, the adjustments were “worse than we thought”. ERSST v3 lowered SSTs before 1880 by up to 0.3 deg C.

Figure 1. Tropical SST: ERSST v3(B) less ERSST v2

The reference article for ERSST v3 is here, which states in the abstract:

Improvements in the late nineteenth century are due to improved tuning of the analysis methods

Update: I’ve amended this article in light of Ryan O’s comments, with which, on further reading, I agree. The difference between ERSST2 and ERSST3 shown in the above graphic is related to the information in SR Figure 2 shown below. You can sort of see the 0.3 deg C shift in the 19th century in this figure, though it’s not explained.

SR Figure 2.

The changes that precipitate the differences are changes in parameters for the Smith-Reynolds “extended reconstruction of SST” (ERSST) algorithm, summarized in the Table below:

SR08 Table 1.

What is a bit surprising, I guess, is that 19th century SST estimates are so volatile with respect to the parameters. The analysis reported uses a hurdle of 2 months per year (was 5), 2 years per LF (was 5), and a 25 deg area (was 15 deg).
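
As a rough illustration of how such sampling hurdles work, the sketch below encodes the two parameter sets quoted above and checks whether a sparsely sampled window would still yield an LF value. The class, field and function names are hypothetical, and the qualifying rules are an assumption about how a "hurdle" operates, not code from Smith and Reynolds.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class LFParams:
    """Hypothetical container for the three tuning parameters quoted above."""
    min_months_per_year: int   # months of data needed to define an annual average
    min_years_per_lf: int      # annual averages needed to define an LF value
    area_deg: int              # side of the spatial averaging area, in degrees


# Values as quoted in the post: old settings ("was") versus the reported ones.
ERSST_V2 = LFParams(min_months_per_year=5, min_years_per_lf=5, area_deg=15)
ERSST_V3 = LFParams(min_months_per_year=2, min_years_per_lf=2, area_deg=25)


def annual_average_defined(months_sampled: int, params: LFParams) -> bool:
    """Assumed rule: a year counts only if it clears the monthly hurdle."""
    return months_sampled >= params.min_months_per_year


def lf_value_defined(months_per_year: Sequence[int], params: LFParams) -> bool:
    """Assumed rule: an LF value needs enough qualifying years in its window."""
    good_years = sum(annual_average_defined(m, params) for m in months_per_year)
    return good_years >= params.min_years_per_lf


# A sparsely sampled 15-year window: no year has more than 4 months of data.
sparse_window = [0, 3, 2, 0, 1, 4, 0, 2, 0, 0, 3, 1, 0, 2, 0]
print(lf_value_defined(sparse_window, ERSST_V2))  # stricter hurdles
print(lf_value_defined(sparse_window, ERSST_V3))  # looser hurdles
```

Under these assumed rules the same sparse window is rejected by the old hurdles and accepted by the new ones, which is the kind of switch that could move early estimates away from a damped, near-zero anomaly.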

This entry was written by Stephen McIntyre, posted on May 27, 2009 at 9:03 PM, filed under General.

58 Comments

Ryan O

Posted May 27, 2009 at 9:31 PM | Permalink

In other news, either Nature or Science or Reader’s Digest, I forget which, reported that the newest climate models are “better than we thought”, noting a “remarkable” low-frequency fit between ERSST v3 and GFDL Model CM2.1.

.
*rolls eyes*
.
I would say that you have to be kidding me . . . but I know you’re not.
Soronel Haetir

Posted May 27, 2009 at 9:41 PM | Permalink

Did that abstract just admit that their historical temperature record is based in part on what computer models tell them to expect?
Dishman

Posted May 27, 2009 at 9:49 PM | Permalink

When the data doesn’t fit the model, change the data.

*sob*
Steve McIntyre

Posted May 27, 2009 at 10:11 PM | Permalink

I think that’s what it says. But it’s possible that they’ve done something else. It’s a typical climate science article – no Supporting Information providing details or source code for exactly what they did.
- stan
  
  Posted May 28, 2009 at 7:53 AM | Permalink
  
  Re: Steve McIntyre (#4),
  
  It’s a typical climate science article – no Supporting Information providing details or source code for exactly what they did.
  
  And I suspect that any requests for this information will be stonewalled. Which would leave us with nothing more than an unsupported assertion shielded from the scientific method.
  
  I have a question. We recently learned that the USHCN temperature data is considered to be accurate despite the fact that 90% of the monitoring stations are sited in ways that fail to meet basic scientific standards. This is because they use a super duper, all-knowing computer algorithm which fixes all the bad data. I assume that the details of how the algorithm works have not been made available for review by outsiders. Is that correct?
  
  If so, that would be another failure of the scientific method. Unsupported assertions unfettered by the inconvenience of real science.
  
  Steve: To my knowledge, USHCN has not published source code; their adjustments are described in cursory publications. While I obviously object to the lack of transparency, readers have to continually remind themselves, as I remind myself, that lack of transparency does not in itself invalidate the results. As to temperature increases, there is considerable evidence of temperature increase. The increase is not a concoction of USHCN adjustments. Having said that, I think that USHCN would do themselves and everyone a service by simply archiving all their methods and code in a completely transparent way and letting people check their adjustments.
  - stan
    
    Posted May 28, 2009 at 10:19 AM | Permalink
    
    Re: stan (#24),
    
    Steve,
    
    I’m still trying to get an understanding of the facts, but I think we are talking about different (albeit related) “adjustments”. I believe that before Anthony Watts started his project, the daily temperatures for each site were adjusted before recordation by a method that used the temperatures from other sites of various proximities. Anthony’s project then began to demonstrate that the vast majority of sites had been producing bad data. Even assuming that it is possible for surrounding sites to be used to “correct” the data of a bad site, they now had a much bigger problem — if 90% of the sites are bad, and they don’t know which ones are bad, or by how much and under what conditions, it would appear that they are trying to use bad data to correct other bad data. Yet, weren’t we recently told that their algorithm can take all this garbage and weave it into gold? That’s one heck of an achievement that serious people ought to take with perhaps a tad bit of skepticism. Without resort to the code, that’s not possible. We have an assertion of an extraordinary capability without anything to support it. Why would any scientist trust it?
    
    And the biggest issue with this is not whether temperatures have risen or not (although obviously the extent does matter). The issue goes to scientific competence. Quality scientists not only avoid the pitfalls which come from abandoning the scientific method, they also refuse to rely on the work of those who do. The biggest outrage regarding the hockey stick was not the shoddiness of the work. It was the acceptance by other scientists of a study which overturned all previous scientific understanding on the subject without so much as a raised eyebrow (no audit, no replication, no nothing). It would be an outrage, even if Mann’s work had eventually turned out to be flawless.
    
    It isn’t that the lack of transparency automatically makes the work wrong. It’s that it makes the work and the asserted “findings” untrustworthy. It’s a failure to use the scientific method. No scientific method? Then it’s not reputable science. It hasn’t reached the threshold that quality scientists ought to demand, if it is going to be considered reliable.
  - John
    
    Posted May 28, 2009 at 11:34 AM | Permalink
    
    Re: stan (#24),
    “Steve: To my knowledge, USHCN has not published source code; their adjustments are described in cursory publications. …”
    
    The problem is that this, while it may not invalidate results, takes the discourse out of the realm of a scientific discourse. One of the immensely harmful confusions of our time is the confusion of scientific and legal argument. Denying information to a “jury” is legal argumentation. Denying information to your scientific peers is a practical guarantee of being embarrassed. The fact that you might be well-intentioned and convinced of the correctness of your interpretation is no defense against shutting out your peers from relevant disciplines and refusing the advice of experts where your own understanding is less than expert (e.g. most scientific investigators vs. professional statisticians).
Steve McIntyre

Posted May 27, 2009 at 10:22 PM | Permalink

OK, it wasn’t Reader’s Digest. It was International Journal of Climatology. One of Stephen Schneider’s … debaters, Ben Santer, wrote:

In summary, considerable scientific progress has been made since the first report of the U.S. Climate Change Science Program (Karl et al., 2006). There is no longer a serious and fundamental discrepancy between modelled and observed trends in tropical lapse rates, despite DCPS07’s incorrect claim to the contrary. Progress has been achieved by the development of new TSST, TL+O, and T2LT datasets …

One of the ways that this discrepancy was “resolved” – and we’ve not talked about this yet – was to eliminate the comparison of satellite temperatures to GISS and NOAA surface temperatures. CCSP had used GISS and NOAA. Santer eliminated them, replacing them by the “independent” series: ERSST v2 and ERSST v3. Santer:

The other two SST products are versions 2 and 3 of the NOAA ERSST (‘Extended Reconstructed SST’) dataset

GFDL CM2.1 was one of the models in the Santer comparison. You’d think that it must have done pretty well in a low frequency comparison of models and observations.
John A

Posted May 27, 2009 at 10:34 PM | Permalink

This is beyond a joke. This is wholesale …snip
MarcH

Posted May 27, 2009 at 10:41 PM | Permalink

Seems like “The Ministry of Truth” have been hard at it again. In Newspeak this might read as “NOAA ERSST results malreported doubleplusungood-rectify”
To paraphrase the great George Orwell “He who controls the data, controls the past. He who controls the past, controls the climate.”
rxc

Posted May 27, 2009 at 11:04 PM | Permalink

This is what is called “generating data”. I used to regulate nuclear power plants, and we once had one of the reactor vendors try to do this, because it was not possible to measure some important parameters during an experiment. They used a sophisticated CFD model to “generate data” that was used to validate a large systems model.

We laughed them out of the room.
Manfred

Posted May 28, 2009 at 1:12 AM | Permalink

Was this lowering of past tropical SST also included in the computation of the global temperature trend?
david elder

Posted May 28, 2009 at 1:17 AM | Permalink

I am trying to grasp what this new work even if valid would really tell us. The key question surely is the AGW issue. The old and new tropical SST estimates differ prior to 1880, but our greenhouse emissions did not begin to rise until well after this, from around mid-20th-century. There would seem to be little change in the SST estimates for that period.
PaulM

Posted May 28, 2009 at 2:17 AM | Permalink

Ah, but they say they use “improved statistical methods”, and “Most of the improvements are justified by testing with simulated data.” so how could there be any doubt?
So the adjustment for 1880 is -0.3C. And what is the v3 SST anomaly for 1880? Yes, about -0.3C.
A useful project would be to collate all these examples of ‘adjustments’, including
1. Land temperatures
2. Sea surface temperatures
3. Satellite troposphere temperatures (adjusted upwards)
4. Sea level rise (e.g. Cazenave 2008: raw satellite data shows no rise until a ‘correction’ obtained from modelling is applied).
Other examples?
- JamesG
  
  Posted May 28, 2009 at 8:32 AM | Permalink
  
  Re: PaulM (#11),
  You can add radiosondes to the list.
Sylvain

Posted May 28, 2009 at 2:27 AM | Permalink

Is it just me, or did they not bother to fix the known 1940s glitch reported by Thompson more than a year ago?

Even though it was reported by Steve earlier.
Papertiger

Posted May 28, 2009 at 3:32 AM | Permalink

They are trying to crumple, spill coffee on, and generally distress the TANG Memo saying Bush was AWOL to make believe it was written on a seventies era typewriter, when everyone knows there wasn’t such a thing as Times New Roman font with kerning, before Windows 98. < just on the off chance there is someone reading Climate Audit who would like a non climate world analogy.

Isn’t this an attempt to conjure the tropical troposphere AGW fingerprint?
Geoff Sherrington

Posted May 28, 2009 at 4:44 AM | Permalink

On this and other famous blogs I have made the accusation that some of the fine texture wriggles in atmospheric CO2 levels are added cosmetics to make the graphs more believably like the Mauna Loa data. It is hard to imagine such fine detail preserved from Barrow to the South Pole (where suddenly the phase shifts 180 degrees or so) when CO2 is described as a well-mixed gas.

Here we have a worse problem, where the graph shown first above is derived from a synthesis of a low frequency trend and an added high frequency set of wriggles. As admitted,

To make our study more realistic, HF variations from observations are added. The HF observations are from a combination of the optimum interpolation (OI) SST over oceans and from the Global Historical Climate Network (GHCN) over land, for the recent period (Reynolds et al. 2002; Peterson and Vose 1997).

Further, from Smith 2008-

Thus, the same HF anomalies are repeated with a 10-yr cycle over the 1860–2000 period. To simulate random errors in the test data, the variance from the base period is scaled by random noise-to-signal variance ratio estimates.

Is this science or is this visual graphics with the intent to deceive? Ask that question of the poor statistician who one day uses these values for a power spectrum analysis, without knowing that they are partly synthetic.
Demesure

Posted May 28, 2009 at 5:11 AM | Permalink

Each time “considerable scientific progress” is made in climate science, it’s worse than expected.
James P

Posted May 28, 2009 at 5:33 AM | Permalink

In UK tabloid vernacular, “you couldn’t make it up”. Except, of course, they just did…
Henry

Posted May 28, 2009 at 5:49 AM | Permalink

snip – enough piling on.
bill-tb

Posted May 28, 2009 at 6:09 AM | Permalink

snip – please don’t pile on.
hunter

Posted May 28, 2009 at 6:16 AM | Permalink

snip – I ask that people refrain/limit “venting” and “piling on”.
Craig Loehle

Posted May 28, 2009 at 6:58 AM | Permalink

Ok I am “NOT” piling on…but I guess that leaves me speechless.
Jeff Id

Posted May 28, 2009 at 7:04 AM | Permalink

I’m angry and getting tired. Mashing data to fit the models and models to fit the data, it looks like the ‘peers’ all work for the same club. This is ugly work.
Stuart H

Posted May 28, 2009 at 7:19 AM | Permalink

Bournemouth attacks Met Office for ‘negative’ forecasts that ‘cost millions’

Headline from an article in the Daily Telegraph which you can read here: http://www.telegraph.co.uk/news/newstopics/howaboutthat/5394066/Bournemouth-attacks-Met-Office-for-negative-forecasts-that-cost-millions.html

A quote from the Met Office:

Helen Chivers, from the Met Office, admitted it made a mistake with its Bank Holiday predictions, but denied that forecasters had become overcautious.

“We get observations from satellites and local stations and all of that goes into the computers, and you take the best guidance out of that,” she said.

mmmmmmm would this computer by any chance use models and what did they do pre-computers?

I quite understand if you think this is off topic.
Steve McIntyre

Posted May 28, 2009 at 7:47 AM | Permalink

As a point of clarification on piling on here: as always, I don’t want people to editorialize about policy and that sort of stuff. Of course, there are consequences – that’s why we’re interested. If people want to object to the mixing of models into data, that’s OK; I just don’t want to have a lot of commentary about the world economy or generalized comments about AGW. I understand the concerns, but editorially they can easily make every thread look identical – so try to connect things to this particular data set.
Bill Illis

Posted May 28, 2009 at 8:13 AM | Permalink

This has been going on for a long time now. We don’t even know how much the raw data has been adjusted.

Ocean SSTs were always a problem to start with since the records are so sparse especially for the earlier time periods and for areas outside the main shipping routes.

But we have no idea how much the global temperature record has been “up-trended” by the adjustments. At least, the NCDC has confirmed the trend in US temps has been adjusted upward by 0.425C since about 1920.

Effectively we have to rely on just the satellite temperatures from now on.
- Steve McIntyre
  
  Posted May 28, 2009 at 8:19 AM | Permalink
  
  Re: Bill Illis (#25),
  
  At least, the NCDC has confirmed the trend in US temps has been adjusted upward by 0.425C since about 1920.
  
  Please don’t overstate things. I’m as annoyed as the next person, but there has been no such “confirmation”.
Craig Loehle

Posted May 28, 2009 at 8:35 AM | Permalink

If the purpose were to reconstruct SST in the past “just for basic science” then the circularity would not exist. BUT if you use models to reconstruct the SST and then test the models against this reconstruction (for purposes of proving global warming)…why is that not obviously a problem?
- Steve McIntyre
  
  Posted May 28, 2009 at 8:58 AM | Permalink
  
  Re: Craig Loehle (#29),
  
  Craig, of course, it’s a problem. It’s totally absurd.
  
  Something else that I’ll try to get to sometime – remind me if I forget – the GISS TRP data set is a bit of an outlier relative to CRU, NOAA and ERSST. It would be worth checking to see if GISS models verify better against GISS TRP data (and HadCRU against HadCRU TRP data.)
- Jeff Id
  
  Posted May 28, 2009 at 9:59 AM | Permalink
  
  Re: Craig Loehle (#29),
  
  You’d never get it through an undergrad class in any science. Somebody’s got their hand in the cookie jar.
Bill Illis

Posted May 28, 2009 at 8:59 AM | Permalink

USHCNV2 adjustments are included in Figure 4 and Figure 7 in this paper.

Click to access 10.1175_2008BAMS2613.1.pdf
Steve McIntyre

Posted May 28, 2009 at 9:21 AM | Permalink

The article doesn’t prove the efficacy of their new changepoint adjustments – it cites their own article: Menne and Williams, J Clim, in press, as statistical authority. What bothers me, here as with Mann, is the use of some hot-off-the-press and poorly understood statistical methodology developed in-house for a controversial applied result.

I haven’t seen the paper in question yet – maybe it’s brilliantly documented and an R package implementing the procedures has already been archived at CRAN. But I’d be astonished if that were the case.
- Mike B
  
  Posted May 28, 2009 at 10:27 AM | Permalink
  
  Re: Steve McIntyre (#32),
  
  Perhaps I’m missing the point somewhat, but from the way I read your graph, these latest adjustments cool the LIA, and don’t impact much else.
  
  One can quarrel with the methodology, or the serial check-kiting of the references, but in the large scheme of things, this doesn’t really change much, does it? It’s not like that graph just happens to track CO2 emissions or anything like that.
Dishman

Posted May 28, 2009 at 9:47 AM | Permalink

From what I’m understanding, the low frequency elements for early portions of this data set are now synthetic. This means that this data set is not valid for low frequency analysis.

If you’re checking for LF correlation against various oscillators (PDO, ENSO, AMO, SSN, etc), this data set is not suitable and will produce synthetic results.

It should be flagged as such.
Jason

Posted May 28, 2009 at 10:58 AM | Permalink

Mike B.,

The adjustments do not just cool the late 19th century.

They also increase the amount of warming since then.

More warming implies greater climate sensitivity.

Thus, lower temperatures in the 19th century result in higher temperature predictions for the 21st century.
- Dean
  
  Posted May 29, 2009 at 6:22 AM | Permalink
  
  Re: Jason (#37),
  
  Jason,
  
  You’re on the right track here… but the key is that this is non-CO2 sensitivity. If we take it at face value (and I’ll leave it in the capable statistical hands of Steve, Ryan and Jeff to show how bad an assumption that is), then there’s a huge warming that cannot be attributed to CO2.
  
  Eyeballing figure 2, from 1890 to 1950, the anomaly went from -0.6 to -0.15, or 0.75°C/decade. From 1970-2000 it went from -0.2 to 0.05, or about a 0.83°C/decade.
  
  So fine, say it was cooler back in the late 1800s, but please explain how something that was completely natural back then is now obviously the result of man.
  - Dean
    
    Posted May 29, 2009 at 6:49 AM | Permalink
    
    Re: Dean (#57),
    
    OOPS, in my last post the rates of change should read 0.075°C/decade and 0.083°C/decade… Off by a factor of ten…
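
The corrected figures follow directly from the endpoint values read off the figure. A minimal check, using the eyeballed anomalies quoted above (the function name is purely illustrative, and an endpoint difference is only a crude stand-in for a fitted trend):

```python
def trend_per_decade(start_year, end_year, start_anom, end_anom):
    """Endpoint-to-endpoint rate of change in deg C per decade."""
    decades = (end_year - start_year) / 10.0
    return (end_anom - start_anom) / decades


early = trend_per_decade(1890, 1950, -0.6, -0.15)   # 0.45 deg C over 6 decades
late = trend_per_decade(1970, 2000, -0.2, 0.05)     # 0.25 deg C over 3 decades
print(f"{early:.3f} {late:.3f}")
```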
  - Steve McIntyre
    
    Posted May 29, 2009 at 7:11 AM | Permalink
    
    Re: Dean (#56),
    
    I see little purpose in trying to convert using CO2 impact as a metric for every discussion.
    
    What we’ve learned here is something quite different – that seemingly innocuous parameter selections in ERSST have a large impact on SSTs before 1880. Why is this? What is it exactly in the interaction of data and parameter selection and methodology that leads to this enormous and unexpected impact? And what are the lessons of this for related applications of the methodology (which include studies that we’ve spent a lot of time on – Steig et al 2009, Mann et al 2008)?
    - Dean
      
      Posted May 29, 2009 at 7:21 AM | Permalink
      
      Re: Steve McIntyre (#58),
      
      Steve,
      
      But the “worse than we thought” comment that they used implies just that: CO2 is a much worse problem than we even imagined it was. I was just pointing out the flaw in their logic.
      
      Steve: “They” didn’t use ‘worse than we thought’ in this context. I used it sarcastically to comment on the methodology. And until we understand the instability in the SST estimates from parameter selection, it’s pointless speculating on its downstream impact and I don’t want to spend time or energy on such speculations at this time or on this thread.
Ryan O

Posted May 28, 2009 at 11:30 AM | Permalink

Hm.
.
I read their method descriptions, and I do not believe they incorporate any model data into the observations. My read of their methods is the following:
.
1. They extract the LF component from the model.
2. They sample the LF component at various rates to determine the MSE associated with each rate.
3. Based on this, they determine the minimum sampling rate below which the reconstruction algorithm will damp the values toward a zero anomaly.
.
They then perform a similar analysis using synthetic HF data based on actual observations from 1982-2001.
.
I do not believe any of the model data is then reincorporated into the observations. The point of the study seems to be focused on determining the sampling cutoff below which the reconstruction error widens dramatically. In the case of V2, it had a higher cutoff, resulting in the pre-1890 reconstruction being damped more than V3.
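.
The test design described above can be sketched in a few lines. This is a toy version under stated assumptions: a made-up smooth signal stands in for the model LF output, months are kept at random to mimic sparse sampling, and a year with fewer sampled months than the hurdle is set to a zero anomaly (the damping). None of the names or numbers come from the paper.

```python
import math
import random


def toy_lf_signal(n_years):
    """Made-up smooth annual anomaly series standing in for model LF output."""
    return [0.4 * math.sin(2 * math.pi * y / 60.0) for y in range(n_years)]


def reconstruct(truth, sampling_fraction, min_months, rng):
    """Keep each month with probability sampling_fraction; a year with fewer
    than min_months sampled months is damped to a zero anomaly."""
    recon = []
    for value in truth:
        months = sum(rng.random() < sampling_fraction for _ in range(12))
        recon.append(value if months >= min_months else 0.0)
    return recon


def mse(a, b):
    """Mean squared error between two equal-length series."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


rng = random.Random(0)
truth = toy_lf_signal(140)  # roughly the 1860-2000 test period
for fraction in (0.05, 0.1, 0.2, 0.4):
    for hurdle in (5, 2):
        err = mse(truth, reconstruct(truth, fraction, hurdle, rng))
        print(f"sampling={fraction:.2f} hurdle={hurdle} MSE={err:.4f}")
```

Because this toy has no measurement noise, a lower hurdle can only reduce the error; in a real test the gain from less damping would have to be weighed against noisier averages built from fewer observations.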
.
So that alleviates one of my concerns. Without seeing the reconstruction statistics or a full description of how the reconstructions are done, I now have an additional (and greater) concern about the reconstruction methods generating bad trends. Prior to this, I didn’t realize how much of the SST data was reconstructed instead of actual data. The data sets are already synthetic.
.
And with my only other encounters with “reconstructions” being the Mannian and Steigian varieties, I will express a deep reservation about the accuracy of the SST reconstruction – especially since they practically admit that not enough “modes” (I’m assuming PCs) are used in certain areas:
.

That is because the HF analysis will damp anomalies in regions where too few modes are chosen for the analysis. For example, the Niño-3.4 (5°S–5°N, 120°–170°W) area SSTs may be slightly damped early in the twentieth century, as discussed in section 2c (V. Kousky 2006, personal communication; see Fig. 8).

.
An additional concern is that when the error gets too high, the global averages are apparently calculated by simply truncating the offending data:
.

Damping of large-scale averages may be reduced by eliminating poorly sampled regions because anomalies in those regions may be greatly damped. In Smith et al. (2005) error estimates were used to show that most Arctic and Antarctic anomalies are unreliable and those regions were removed from the global-average computation.
…
Using the default SR05 parameters, the merged global MSE is minimized when the 25° region has at least 35% sampling. Using the improved tuning discussed above, the MSE is minimized when there is at least 20% sampling. The improved parameters yield a lower optimal sampling for global averages because they produce a less-damped analysis in the presence of sparse sampling. However, even with the improved parameters the MSE for global averages can be reduced by omitting some sparsely sampled regions.

.
This would seem to cause potential biases when just using the “global average” for subsequent analysis, since the spatial content of the average changes with time. To me, this seems to be simply a method for reducing the uncertainty ranges merely by truncating the “bad” data, rather than simply admitting that the sampling is so sparse that large uncertainties are appropriate.
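.
A toy example of that concern, with invented numbers: if a poorly sampled region is left out of the average in early years but included later, the "global" series shifts even though no region changed at all. The region names, area weights and anomalies below are all hypothetical.

```python
def area_weighted_mean(anomalies, weights, included):
    """Weighted mean over only the regions flagged as included."""
    total_w = sum(weights[r] for r in included)
    return sum(anomalies[r] * weights[r] for r in included) / total_w


weights = {"tropics": 0.5, "midlat": 0.4, "polar": 0.1}      # hypothetical area shares
anomalies = {"tropics": 0.1, "midlat": 0.0, "polar": -0.5}   # held constant in time

early = area_weighted_mean(anomalies, weights, ["tropics", "midlat"])          # polar omitted
late = area_weighted_mean(anomalies, weights, ["tropics", "midlat", "polar"])  # polar included
print(f"early={early:.3f} late={late:.3f} spurious change={late - early:.3f}")
```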
.
On the HF side, they repeat the 1982-1991 (half of the sampling period) data over and over again throughout the 1860-2000 test period – yet they do not discuss any kind of correction for spatial or temporal autocorrelation. If you repeat the same data over and over again, I would expect some significant corrections would need to be applied. The article doesn’t mention any, and that, too, is concerning.
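.
To see why, note that tiling one 10-year block across the whole test period makes the HF series perfectly correlated with itself at a 10-year lag, so the 140 years contain far fewer independent samples than their length suggests. A small sketch, with random numbers standing in for the 1982-1991 monthly anomalies (synthetic data, not the OI SST values):

```python
import random


def lag_correlation(series, lag):
    """Pearson correlation between the series and itself shifted by lag."""
    x, y = series[:-lag], series[lag:]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(42)
block = [rng.gauss(0.0, 1.0) for _ in range(120)]   # stand-in for 10 years of monthly HF anomalies
tiled = block * 14                                   # repeated to span 140 years

print(round(lag_correlation(tiled, 120), 6))   # 10-year lag: the series matches itself
print(round(lag_correlation(tiled, 7), 6))     # an arbitrary other lag, for contrast
```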
.
I also do not understand their satellite bias adjustments yet. That will require some additional reading. Something seems not quite right about it, but I cannot put my finger on it yet.
- Steve McIntyre
  
  Posted May 28, 2009 at 12:15 PM | Permalink
  
  Re: Ryan O (#38), you could be right about model data not being used. If models aren’t used, then it seems like a very poor idea to open the explanation of methodology changes with an exposition of model properties. But maybe it’s just execrable exposition.
  
  There is a connection back to Mannianism from this literature. MBH cites Smith-Reynolds “optimal interpolation” as the inspiration for Mannian “climate fields” – which are nothing more than principal components. The reduction of the Steig AVHRR data set to three PCs is very much in the same tradition – so your perception of a linkage here is very astute.
  
  Equally some of the puzzling features of ERSST that are unclear from the articles may seem a bit clearer if the methods are interpreted in Mann-Steig terms as you suggest – an excellent idea.
  - Ryan O
    
    Posted May 28, 2009 at 12:18 PM | Permalink
    
    Re: Steve McIntyre (#42), When I read the abstract, I thought the same thing you did. It wasn’t until the second read of the LF part of the paper (because I didn’t understand it the first time) that I realized that the model data is just used as a test target.
    .
    Regardless, I’m still wondering where their verification statistics are. They obviously ran multiple reconstructions, yet the article mentions no verification statistics at all. Is there an SI associated with this with the statistics tabulated?
  - Steve McIntyre
    
    Posted May 28, 2009 at 1:15 PM | Permalink
    
    Re: Steve McIntyre (#42),
    Here’s the relevant quote from MBH:
    
    Our approach to climate pattern reconstruction relates closely to statistical approaches which have recently been applied to the problem of filling-in sparse early instrumental climate fields, based on calibration of the sparse sub-networks against the more widespread patterns of variability that can be resolved in shorter data sets [22. Smith, T. M., Reynolds, R. W., Livezey, R. E. & Stokes, D. C. Reconstruction of historical sea surface temperatures using empirical orthogonal functions. J. Clim. 9, 1403–1420 (1996). 23. Kaplan, A. et al. Analyses of global sea surface temperature 1856–1991. J. Geophys. Res. (in the press).]
    
    Smith, Reynolds 1996 is the predecessor to the ERSST analyses.
    - Ryan O
      
      Posted May 28, 2009 at 2:09 PM | Permalink
      
      Re: Steve McIntyre (#45), Ah. Now I have another paper to read. 😉
    - Steve McIntyre
      
      Posted May 28, 2009 at 2:44 PM | Permalink
      
      Re: Ryan O (#46),
      
      Ryan, here’s a quote from Smith et al 1996 that pertains to your recent analysis of Steig retained PCs:
      
      This is exactly what’s going on with the Peninsula.
    - Ryan O
      
      Posted May 28, 2009 at 3:01 PM | Permalink
      
      Re: Steve McIntyre (#50), Fantastic. That is EXACTLY what is happening.
      .
      Jeff threw up the verification stats post on tAV, and not only can you see it in the reconstruction PCs themselves, you can also see that effect in the RegEM verification for Steig. RegEM simply ignores some of the actual station data because truncating to 3 eigenvectors does not provide enough flexibility for the fit.
      .
      I added the 2003 and 2004 Smith papers to my collection for reading as well. Hopefully the earlier ones have good descriptions of the reconstruction methods. It would be interesting to replicate their work and perform the same kind of sensitivity analysis we did with Steig.
Alan S. Blue

Posted May 28, 2009 at 11:58 AM | Permalink

Mike B,
.
Beyond the shift in the trend (and the focus always on the trend, not the absolute temperature values) there is also the effect on examining the models.
.
If I’m understanding correctly, using this same technique would make -any- model provide better hindcasts. Where a hindcast is just using your model to predict what should have happened historically and comparing that with what did actually happen.
.
Picture what would happen with a clearly crazy model. Assume it predicts temperatures falling off a cliff at 5C per century monotonically. Using the techniques outlined above, all the early -physical-measurements- would be strongly adjusted upwards. To better accord with the model. Voila – instant ice age data.
.
Not just a -model- that is -predicting- an ice age. But the raw data themselves have been massaged into becoming a pretty darn convincing predictor of an upcoming ice age entirely on their own.
Ryan O

Posted May 28, 2009 at 12:35 PM | Permalink

You know, I just had another thought that has me puzzled at the moment.
.
The concern with the sets is the uncertainty associated with the spotty historical raw data. This revision was attempting to quantify the minimum required sampling rate to provide an accurate picture of SSTs during periods of spotty coverage.
.
So to test, they took a coupled ocean model – but not an unforced one – to provide a sampling testbed:
.

This coupled general circulation model (CGCM) simulates the large-scale climate signal using variations in forcing by greenhouse gases, aerosols, and the best available estimates of solar radiation changes (Delworth et al. 2006).

.
So the first puzzling aspect is why they would take a forced model to simulate a period where the forcings are presumed to be negligible. Why not use a control run?
.
The second puzzling aspect is that the coupled ocean models have known problems with reproducing regional characteristics. For example, one of the big problems is the strange massive cooling off the east African coast that happens sometimes during spin-up (not sure if GFDL 2.1 has this specific problem or not). So if you’re already not going to use an unforced model, why not simply resample actual data to determine the appropriate cutoff?
Ivan

Posted May 28, 2009 at 2:13 PM | Permalink

CA readers were undoubtedly ready for adjustments, but, using the technical terminology of the leading climate journals, the adjustments were “worse than we thought”. ERSST v3 lowered SSTs before 1880 by up to 0.3 deg C.

One more application of Hansen’s Golden Rule of climate science: Old data are always too warm, while new data are always too cold.
Steve McIntyre

Posted May 28, 2009 at 2:24 PM | Permalink

I’ve amended the head post in light of Ryan O’s comments. As Ryan observes, the models were used to change certain parameters in the algorithm. What’s surprising is the size of the impact of these parameter settings on the results – amounting to up to 0.3 deg C in the 19th century – something that readers should keep in mind when presented with arbitrary parameter choices in things like Steig.
- Ryan O
  
  Posted May 28, 2009 at 3:06 PM | Permalink
  
  Re: Steve McIntyre (#48), The size of the change has me wondering, too. That’s an awfully big jump. Methinks one might be able to generate a “64 flavors of SST” like Burger and Cubasch’s analysis of MBH.
Steve McIntyre

Posted May 28, 2009 at 2:25 PM | Permalink

For reference, here’s what I removed from the head post in amending it in light of Ryan’s comment:

As I read the article, here’s what they’ve done. They state that GFDL CM2.1

simulates interdecadal signals with characteristics similar to the observations where data are available. However, shorter period signals are not simulated as well by this climate model.

In areas where data is spotty, they appear to coerce the “observations” to model output, described as follows:

output is filtered to extract the model LF component. This is done using the 15-yr LF filtering described by SR05. Briefly, the LF is computed by first averaging anomalies spatially over 15° latitude–longitude moving areas and then annually. The smoothed annual averages are then median filtered using 15 annual averages to produce the LF anomaly analysis. Because the model outputs are complete, there is no damping of this test LF output. These model LF anomalies are used for the 1860–2000 test period.
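
The time-domain part of that filter can be sketched roughly as follows. This is only an illustration for a single, already spatially averaged series with no missing months; the function names and the truncated-window treatment at the ends of the record are assumptions, since the quoted passage does not say how SR05 handles the edges.

```python
import numpy as np

def annual_means(monthly):
    """Average a 1-D series of monthly anomalies into calendar-year means.
    Any trailing partial year is dropped."""
    monthly = np.asarray(monthly, dtype=float)
    n_years = monthly.shape[0] // 12
    return monthly[: n_years * 12].reshape(n_years, 12).mean(axis=1)

def running_median(annual, window=15):
    """Centred running median over `window` annual values.
    Near the ends the window is simply truncated (an assumption)."""
    annual = np.asarray(annual, dtype=float)
    half = window // 2
    out = np.empty(annual.shape[0])
    for t in range(annual.shape[0]):
        lo = max(0, t - half)
        hi = min(annual.shape[0], t + half + 1)
        out[t] = np.median(annual[lo:hi])
    return out
```

Calling `running_median(annual_means(series), 15)` gives the low-frequency component in this simplified setting; the 15° moving-area averaging described in the quote would have to be applied to the gridded field before this step.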

They continue:

To make our study more realistic, HF variations from observations are added. The HF observations are from a combination of the optimum interpolation (OI) SST over oceans and from the Global Historical Climate Network (GHCN) over land, for the recent period (Reynolds et al. 2002; Peterson and Vose 1997). To form the merged complete data, the OI SST anomalies are averaged to the monthly 5° grid boxes for 1982–2001 and merged with the GHCN 5° monthly LST anomalies. These data fill nearly all monthly 5° grid squares within 1982–2001. The remaining unfilled grid squares are filled using linear spatial interpolation of the anomalies from their nearest neighbors.

In other news, either Nature or Science or Reader’s Digest, I forget which, reported that the newest climate models are “better than we thought”, noting a “remarkable” low-frequency fit between ERSST v3 and GFDL Model CM2.1.
david elder

Posted May 28, 2009 at 5:12 PM | Permalink

Jason (#37) says:

‘The adjustments do not just cool the late 19th century. They also increase the amount of warming since then. More warming implies greater climate sensitivity. Thus, lower temperatures in the 19th century result in higher temperature predictions for the 21st century.’

This is what I am wrestling with (#10). Would this work, if valid, imply increased climate sensitivity to human emissions? If so, why would the putative extra warming fall essentially before 1880, well before our emissions rose substantially? Wouldn’t the new work, even if valid, only imply a higher estimate of natural variation in the late 19th century?
Jeff Alberts

Posted May 28, 2009 at 7:34 PM | Permalink

So, without the SI, this paper deserves no more attention than if I had published it.
Jeff Alberts

Posted May 28, 2009 at 7:35 PM | Permalink

A whopping 0.2 deg C since about 1860. Simply alarming.
Jan W Merks

Posted Jun 8, 2009 at 6:06 PM | Permalink

Steve, why not chart a sampling variogram and derive the statistics for this set?
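
For readers unfamiliar with the term, one common form is the empirical semivariogram of an evenly spaced series, gamma(h) = mean((z[i+h] - z[i])^2) / 2. The sketch below uses that textbook estimator, which is an assumption about what is meant here; the function name is made up for illustration and the series is assumed to have no gaps.

```python
import numpy as np

def semivariogram(z, max_lag):
    """Empirical semivariogram of an evenly spaced, gap-free series.
    Returns the lags 1..max_lag and gamma(h) = 0.5 * mean squared
    difference between values h steps apart."""
    z = np.asarray(z, dtype=float)
    lags = np.arange(1, max_lag + 1)
    gamma = np.array([0.5 * np.mean((z[h:] - z[:-h]) ** 2) for h in lags])
    return lags, gamma
```

Plotting `gamma` against `lags` for the ERSST v2 and v3 series side by side would show whether the revision changes the short-lag variability as well as the level.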
