Kaufmann and Stern [2005] on GCMs

I have much unfinished business with multiproxy studies, but am getting dragged into discussing GCMs. I wish to make clear that I am not familiar with the GCM literature and am merely commenting on individual articles as I read them in the context of the current discussion. If I miss some nuance, I apologize and will try to correct it. Rasmus observed here that:

the most appropriate null-distributions are derived from long control simulations performed with such GCMs. The GCMs embody much more physically-based information, and do provide a physically consistent representation of the radiative balance, energy distribution and dynamical processes in our climate system.

Kaufmann and Stern [2005], A Statistical Evaluation of GCMs: Modeling the Temporal Relation between Radiative Forcing and Global Surface Temperature here consider the interesting question of whether GCMs out-perform elementary statistical models, given the same inputs. They come to very negative conclusions about GCMs and even question whether GCMs are an appropriate tool for assessing the issue of global temperature change (as opposed to regional impacts). They say:

none of the GCM’s have explanatory power for observed temperature additional to that provided by the radiative forcing variables that are used to simulate the GCM…

Curiously, Kaufmann weighed in at realclimate in a thread which was specifically discussing the utility of GCMs for making null distributions, but did not mention this article, instead mentioning some of his other work purporting to show cointegration between CO2 and global temperature change. In passing, I might observe once again that a GCM step of 25 years takes 1 calendar day. To obtain a null distribution of 1000 runs of 25,000 years (still less than 1 obliquity cycle) is well beyond the range of present computing power. So the "null distributions" that Rasmus is talking about are not "null distributions" as they are understood in statistics, where 1000 would be a bare minimum. Anyway on to some excerpts from Kaufmann and Stern.
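As a rough check on the computing estimate above, here is a minimal sketch of the arithmetic. It uses only the figures quoted in this post (25 simulated years per calendar day, 1000 runs of 25,000 years each) and assumes, purely for illustration, that the runs are done one after another on a single machine; running many copies in parallel would shorten the elapsed time proportionally.

```python
# Figures taken from the post; the serial-execution assumption is mine.
SIM_YEARS_PER_DAY = 25      # simulated years completed per calendar day
RUNS = 1000                 # number of control runs wanted
YEARS_PER_RUN = 25_000      # simulated years in each run


def wall_clock_days(runs, years_per_run, sim_years_per_day):
    """Calendar days needed if all runs are executed back to back."""
    return runs * years_per_run / sim_years_per_day


days = wall_clock_days(RUNS, YEARS_PER_RUN, SIM_YEARS_PER_DAY)
calendar_years = days / 365.25

print(f"{days:,.0f} calendar days, or about {calendar_years:,.0f} calendar years")
```

On these assumptions the total is one million calendar days, which is roughly 2,700 years of continuous computing.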

The Abstract is as follows:

Abstract: We evaluate the statistical adequacy of three general circulation models (GCMs) by testing three aspects of a GCM’s ability to reconstruct the historical record for global surface temperature: (1) how well the GCMs track observed temperature; (2) are the residuals from GCM simulations random (white noise) or are they systematic (red noise or a stochastic trend); (3) what is the explanatory power of the GCMs compared to a simple alternative model, which assumes that temperature is a linear function of radiative forcing. The results indicate that three of the eight experiments considered fail to reconstruct temperature accurately; the GCM errors are either red noise processes or contain a systematic error, and the radiative forcing variable used to simulate the GCM’s have considerable explanatory power relative to GCM simulations of global temperature. The GFDL model is superior to the other models considered. Three out of four Hadley Centre experiments also pass all the tests but show a poorer goodness of fit. The Max Planck model appears to perform poorly relative to the other two models.

The running text contains the following observations:

These results indicate that the GCM temperature reconstruction does not add significantly to the explanatory power provided by the radiative forcing aggregate that is used to simulate the GCM (Table 4). Conversely, we strongly reject (p< .01) restrictions that eliminate X and/or DX for all eight experiments. This indicates that the radiative forcing variables used to simulate the GCM have explanatory information about observed surface temperature that is not present in the GCM simulation for global surface temperature.

As described in section 4.3, none of the GCM’s have explanatory power for observed temperature additional to that provided by the radiative forcing variables that are used to simulate the GCM… we cannot make any precise determinations as to the cause for the explanatory power of the model inputs relative to the GCM output.

Conclusions about the effect of human activity on surface temperature are based in large part on comparisons of observed temperature and GCM simulations (Mitchell et al., 2001). But this may not be most effective means for attribution: the noise in even the best simulation (in this case the GFDL simulation) increases the uncertainty involved in attributing and predicting climate change. This uncertainty could be reduced by using appropriately specified and estimated statistical models to simulate the relation between human activity and observed temperature (Kaufmann and Stern, 1997, 2002; Stern and Kaufmann, 2000; Stern, 2004). Initial results with a new generation of time series models that are constrained by recently available observations on ocean heat content and energy balance considerations are very promising, yielding realistic rates of adjustment of atmospheric temperature to long-run equilibrium (Stern, 2004).

I’ve not attempted to parse through all the reasons for their findings and do not specifically endorse them. I’m merely mentioning them. If their conclusions are correct, then the GCMs do not actually assist in the calculation of global temperature, as compared to linear models from the input forcing series themselves.
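To make the kind of comparison described above a little more concrete, here is a minimal sketch of a restriction (F) test on a linear regression. It is not Kaufmann and Stern's specification, which I have not tried to reproduce; it is a simplified static version on synthetic data. The series `forcing`, `gcm` and `observed`, their coefficients and noise levels, and the helper names `rss` and `f_stat_for_restriction` are all invented for illustration.

```python
import numpy as np


def rss(y, X):
    """Residual sum of squares from an ordinary least squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)


def f_stat_for_restriction(y, X_full, X_restricted):
    """F statistic for dropping the columns that are in X_full but not in X_restricted."""
    n, k_full = X_full.shape
    q = k_full - X_restricted.shape[1]          # number of restrictions
    rss_full = rss(y, X_full)
    rss_restricted = rss(y, X_restricted)
    return ((rss_restricted - rss_full) / q) / (rss_full / (n - k_full))


# Synthetic data only: a trending "forcing" series, an "observed" temperature
# that is linear in forcing, and a "GCM" output that is a noisier function of
# the same forcing. None of these numbers come from the paper.
rng = np.random.default_rng(0)
n = 140
forcing = np.linspace(0.0, 2.0, n) + 0.1 * rng.standard_normal(n)
observed = 0.5 * forcing + 0.1 * rng.standard_normal(n)
gcm = 0.5 * forcing + 0.3 * rng.standard_normal(n)

const = np.ones(n)
X_full = np.column_stack([const, gcm, forcing])
X_no_gcm = np.column_stack([const, forcing])
X_no_forcing = np.column_stack([const, gcm])

# Does the GCM output add anything once forcing is in the regression?
f_drop_gcm = f_stat_for_restriction(observed, X_full, X_no_gcm)
# Does forcing add anything once the GCM output is in the regression?
f_drop_forcing = f_stat_for_restriction(observed, X_full, X_no_forcing)

print(f"F for dropping GCM output: {f_drop_gcm:.2f}")
print(f"F for dropping forcing:    {f_drop_forcing:.2f}")
```

In this toy setup the synthetic GCM series is by construction just forcing plus extra noise, so dropping forcing costs far more fit than dropping the GCM output. That mirrors the pattern the authors report, but it says nothing about whether real GCM output behaves this way.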

Reference: Kaufmann and Stern [2005], A Statistical Evaluation of GCMs: Modeling the Temporal Relation between Radiative Forcing and Global Surface Temperature here. http://www.bu.edu/cees/people/faculty/kaufmann/documents/Model-temporal-relation.pdf


  1. John A
    Posted Dec 21, 2005 at 11:14 AM | Permalink

    What are we paying all this money for?

  2. Terry
    Posted Dec 21, 2005 at 11:38 AM | Permalink

    The role of GCM’s in the AGW debate has always baffled me. I can understand their use in modeling local climate — to model climate at a fine scale requires a complicated model capable of resolving small areas … no argument there.

    But what do they add when talking about AGW which is by definition global and where local effects are secondary or irrelevant?

    It seems we should be focusing on a reduced-form system where global temperature is a function of various forcings with perhaps some lags to incorporate persistence in the system. Perhaps some non-linear terms too (although it sounds like the model would quickly become intractable … which would tell us something about our actual level of understanding and predictive ability).

    Such models would give a lot of intuition about how much we actually understand the issue as well as quick and intuitive estimates of the uncertainty about the predictions. I seem to remember Michael’s making some good points about how the magnitude of CO2 forcing affected predictions. I also seem to remember RealClimate making some good points about the magnitude of the effects of various forcings.

    I asked this over at RealClimate and Gavin gave a good reply saying that such simple models do exist (I can’t have been the only one who has thought of this). So why aren’t those the focus of attention and the workhorse for prediction purposes? Why doesn’t someone write a summary of these simple models so we can play around with them? Anyone with a modest scientific background would be able to manipulate and understand them. They would make things much more transparent.

  3. Pat Frank
    Posted Dec 21, 2005 at 7:37 PM | Permalink

    A climate modeling study that apparently hasn’t been widely appreciated is relevant to the GCM question: Craig Loehle (2004), “Climate change: detection and attribution of trends from long-term geologic data”, Ecological Modelling 171(4), 433-450. Sorry I don’t have a link.

    Loehle fit independent long-term climate trends obtained from a South African cave stalagmite and another from the Sargasso Sea. Both series extend over about 3000 years. He excluded the 20th century temperature trend from his fits. His models are all a simple linear term times a set of cosine functions, all with no physical meaning. They end up nicely reproducing known climate cycles, and pass right through the MWP and LIA. The extrapolation of the models into the 20th century is especially fun because they produce a cooling trend relatively early in the 21st century, with the temperature maximum right about, ummm, now.

    Here’s the abstract: “Two questions about climate change remain open: detection and attribution. Detection of change for a complex phenomenon like climate is far from simple, because of the necessary averaging and correcting of the various data sources. Given that change over some period is detected, how do we attribute that change to natural versus anthropogenic causes? Historical data may provide key insights in these critical areas. If historical climate data exhibit regularities such as cycles, then these cycles may be considered to be the “normal” behavior of the system, in which case deviations from the “normal” pattern would be evidence for anthropogenic effects on climate. This study uses this approach to examine the global warming question. Two 3000-year temperature series with minimal dating error were analyzed. A total of seven time-series models were fit to the two temperature series and to an average of the two series. None of these models used 20th Century data. In all cases, a good to excellent fit was obtained. Of the seven models, six show a warming trend over the 20th Century similar in timing and magnitude to the Northern Hemisphere instrumental series. One of the models passes right through the 20th Century data. These results suggest that 20th Century warming trends are plausibly a continuation of past climate patterns. Results are not precise enough to solve the attribution problem by partitioning warming into natural versus human-induced components. However, anywhere from a major portion to all of the warming of the 20th Century could plausibly result from natural causes according to these results. Six of the models project a cooling trend (in the absence of other forcings) over the next 200 years of 0.2-1.4 degrees C.”

  4. Paul
    Posted Dec 21, 2005 at 9:32 PM | Permalink

    But where are the physics? (oops! Sorry…wrong thread)

  5. Terry
    Posted Dec 22, 2005 at 12:30 AM | Permalink

    Re: #3.

    This sounds amazing because 1) it is such an obvious thing to do, and 2) I haven’t heard a peep about it.

    Any ideas why we haven’t heard about it? What kind of a journal is it? Are these series in the MBH proxy studies? If so, shouldn’t the MBH algorithm have put a lot of weight on them because they fit the 20th century data well?

  6. Posted Dec 22, 2005 at 3:18 AM | Permalink

    re 4: The problem with cycles is that they are non-linear, and not a priori derived from the physics as they relate to model geometry. You may compare it to resonance when shaking a tree.

    Here is a simple exercise in Excel using only four frequencies: millennial, bicentennial, multidecadal and ENSO

  7. Paul Gosling
    Posted Dec 22, 2005 at 4:54 AM | Permalink

    The Sargasso sea data is more convincing than the S. Africa data, which is not very convincing at all. Even so, is this not a case of fitting your model to the data, then claiming that the model describes your data so it must be correct?

  8. Posted Dec 22, 2005 at 5:09 AM | Permalink

    re 7: no, it is claiming that cycles have a stronger signal than forcings, which is clearly the case in ENSO. The problem is that mechanisms for cycles are speculative.

    see eg Keeling and Whorf, who postulate tidal mechanisms

    C. D. Keeling and T. P. Whorf, 2000, The 1,800-year oceanic tidal cycle: A possible cause of rapid climate change, PNAS, April 11, 2000; 97(8): 3814 – 3819.

    C. D. Keeling and T. P. Whorf, 1997, Possible forcing of global temperature by the oceanic tides, PNAS 1997 Aug 5;94(16):8321-8

  9. John A
    Posted Dec 22, 2005 at 5:27 AM | Permalink

    Curiously, the late Theodor Landscheidt also predicted a cooling to 2030. I should be then old enough to see yet another “Global Cooling” scare.

  10. Posted Dec 22, 2005 at 6:09 AM | Permalink

    L.B. Klyashtorin & A.A. Lyubushin, 2003, On the Coherence between Dynamics of the World Fuel Consumption and Global Temperature Anomaly, Energy & Environment Vol 14 No. 6 p773-783

    Analysis of the long-term dynamics of World Fuel Consumption (WFC) and the Global Temperature anomaly (dT) for the last 140 years (1861-2000) shows that unlike the monotonously and exponentially increasing WFC, the dynamics of global dT against the background of a linear, age-long trend, undergo quasi-cyclic fluctuations with about a 60-year period. No true linear correlation has taken place between the dT and WFC dynamics in the last century.
    Spectral analysis of reconstructed temperature for the last 1420 years and instrumentally measured for the last 140 years global dT shows that dominant period for its variations for the last 1000 years lies in the 50-60 years interval.
    Modeling of roughly 60-years cyclic dT changes suggest that the observed rise of dT will flatten in the next 5-10 years, and that we might expect a lowering of dT by nearly 0.1-0.15°C to the end of the 2020s.

  11. Jeff Norman
    Posted Dec 22, 2005 at 6:22 AM | Permalink

    Re:#9 John,

    Why, are you not old enough now? 😉

  12. Steve McIntyre
    Posted Dec 22, 2005 at 11:07 AM | Permalink

    Kaufmann posted about this article at realclimate today as follows (#60):

    I would like to pick up on a comment made by per (#58) about testing GCM’s against real-world data. As an outsider to the GCM community, I did such an analysis by testing whether the exogenous inputs to GCM (radiative forcing of greenhouse gases and anthropogenic sulfur emissions) have explanatory power about observed temperature relative to the temperature forecast generated by the GCM. In summary, I found that the data used to simulate the model have information about observed temperature beyond the temperature data generated by the GCM. This implies that the GCM’s tested do not incorporate all of the explanatory power in the radiative forcing data in the temperature forecast. If you would like to see the paper, it is titled “A statistical evaluation of GCM’s: Modeling the temporal relation between radiative forcing and global surface temperature” and is available from my website

    Needless to say, this paper was not received well by some GCM modelers. The paper would usually have two good reviews and one review that wanted more changes. Together with my co-author, we made the requested changes (including adding an “errors-in-variables” approach). The back and forth was so time consuming that in the most recent review, one reviewer now argues that we have to analyze the newest set of GCM runs – the runs from 2001 are too old.

    The reviewer did not state what the “current generation” of GCM forecasts are! Nor would the editor really push the reviewer to clarify which GCM experiments would satisfy him/her. I therefore ask readers what are the most recent set of GCM runs that simulate global temperature based on the historical change in radiative forcing and where I could obtain these data?

    Gavin replied. Worth reading. Free Rasmus now, free Rasmus now, free Rasmus now…

  13. John A
    Posted Dec 22, 2005 at 11:14 AM | Permalink

    Re: #11

    I feel like the two animals staring at the Revolutionary Slogan wall at the end of Animal Farm. I’m being told something now that I distinctly remember not being the case in the 1970s.

  14. hank
    Posted Dec 22, 2005 at 6:23 PM | Permalink

    > something now that I distinctly remember
    > not being the case in the 1970s

    I remember being taught, by science teachers, to expect that would be the case. They were right.

  15. Pat Frank
    Posted Dec 23, 2005 at 1:28 AM | Permalink

    Re: #5, the journal is published by Elsevier and seems to be of the usual peer-reviewed sort. I don’t know whether MBH used the Sargasso or South Africa proxies. Steve could answer that.

    Loehle’s paper has only one independent citation since publication, but that’s an interesting, fairly high-resolution Austrian O18 stalagmite proxy study, which showed clear evidence of a MWP and a LIA. It also corroborated M&M in showing a more intense MWP than the present. Likewise the Roman Warm Period. I wonder why the polar bears aren’t already extinct; they’ve had two good opportunities already.

    Anyway, that paper is Mangini, A., Spotl, C. and Verdes, P. (2005) “Reconstruction of temperature in the Central Alps during the past 2000 yr from a delta O-18 stalagmite record” Earth and Planetary Science Letters 235, 741-751. The proxy record also plots very closely onto tree-ring C-14 data, which the authors take to indicate a NH solar-climate connection.

    I don’t expect climatologists will pay much attention to Loehle’s study because it’s not based in physics as Paul already intimated in #4. Besides, it doesn’t support the AGW claim and so it’s wrong.

    Re: #7, Loehle’s fits EXcluded the 20th century, and were then propagated _through_ the 20th century. So, the fits were independent of the 20th century, and relied only on long-term periodicities detected in the earlier data sets. Several models then did a good job simulating 20th century temperature trends, and projected cooling later in the 21st century.

  16. Geoff Smith
    Posted Jan 6, 2006 at 9:46 PM | Permalink

    So many threads in the last month have dealt with the overlooked problems in climate models, but I imagine that CA readers would be interested to have a look at the recent posting by Hendrik Tennekes, who is the retired Director of Research, Royal Netherlands Meteorological Institute, on Roger Pielke Sr.’s Climate Science site ( http://climatesci.atmos.colostate.edu/2006/01/06/guest-weblog-reflections-of-a-climate-skeptic-henk-tennekes/ ).

    Dr. Tennekes gives an overview of some of the problems with models. Not to be missed is a link to one of his own cited papers, which argues against the view that “clouds are clocks”, i.e., “if we knew as much about clouds as we do about clocks, clouds would be just as predictable as clocks”.

  17. beng
    Posted Jan 18, 2006 at 10:17 AM | Permalink

    A little OT, but the below link (and pdf file) from Pielke Jr’s site has a fascinating cultural analysis on the GCM community:


  18. Steve McIntyre
    Posted Aug 9, 2006 at 6:07 AM | Permalink

    bender, thanks for pointing out the realclimate discussion on climate sensitivity. Here’s a comment by Isaac Held:

    I would also like to thank readers #1,#4, and #5 for pointing out this error in the figure. It is interesting how these things get through the review process!

    Another way of stating the results from this paper is that the feedbacks that we are moderately confident about (water vapor, lapse rate, and snow/sea ice albedo) seem to generate a sensitivity near the low end of the canonical range, with the more uncertain cloud feedbacks then providing the positive push, in these models, to generate all of the higher sensitivities. I think the picture that many of us had, speaking for myself at least, was that the first set of feedbacks brought us with moderate confidence to the middle of the canonical range, with cloud feedbacks, both positive and negative, then providing the spread about this midpoint. One evidently has to argue for a significantly positive cloud feedback to get to the 3K sensitivity that various empirical studies seem to be pointing towards.

    We needed to make a lot of approximations in this analysis, especially for the cloud feedback term, because of the limitations of what we could do with the model results that have been archived, so it will be interesting to see if this picture holds up. If, in fact, this is an accurate diagnosis of what the models are doing, why is it that they all have positive cloud feedbacks? This is in itself a bit surprising given the diverse schemes used to predict clouds in these models.

    Doesn’t it seem ridiculous that Held is surprised at the allocation between cloud feedback and the rest-of-the-model and surprised that the cloud feedbacks are all strongly positive in the models?

    I recall an arch comment by Ellingson when he did an intercomparison of infrared schemes in GCMs in the early 1990s. He found that the radiation schemes were all over the place, with some of them simply being wrong. However, they all agreed on one thing – what happened with 2xCO2. He archly wondered about tuning.

    My understanding is that shortwave absorption by clouds is substantially under-estimated in many GCMs (18-30 W m-2 in ARESE II) and it’s disquieting to see this type of systemic effect together with such strong connection of model response to positive cloud feedback.

  19. Dave Dardinger
    Posted Aug 9, 2006 at 8:16 AM | Permalink

    My first response on reading this was, “What!” but then I realized that it’s to be expected, and rather a boon for those of us who think the whole AGW thingee is as much political as scientific (if not more so). It pretty much proves that there was a conscious or unconscious selection of models or model parameters to match expectations.

    If it can be verified that the major models all assume a high positive cloud feedback (I suppose I should run over to RC and see what the thread’s about), then it’s not just another nail in the coffin of AGW but a lowering of the coffin into the ground. It’s funny though that the friends of the occupant of the coffin don’t realize he’s dead but believe his report (via mental telepathy; see discussion with/concerning Hotblack Desiato in the Restaurant at the End of the Universe by Douglas Adams) that it’s just been kinda dark lately.

  20. Ron Cram
    Posted Mar 5, 2011 at 2:09 PM | Permalink

    Interesting but the link to the Kaufmann and Stern paper is broken. I was not able to locate the paper using Google Scholar either. I was able to locate a 2004 paper which appears to be a precursor of sorts to the 2005 paper quoted by Steve.

    See http://www.economics.rpi.edu/workingpapers/rpi0411.pdf

    If the 2005 paper or an abstract is available online, please post URL here. I will greatly appreciate it.

  21. John M
    Posted Mar 5, 2011 at 2:25 PM | Permalink


    The Wayback Machine crawled the url linked to in this original post several times.

    Here’s a hit from 2007.

    Still looks like it might be a preprint.


  22. John M
    Posted Mar 5, 2011 at 2:29 PM | Permalink

    Doesn’t show up in Stern’s CV.


One Trackback

  1. By Willis on GISS Model E « Climate Audit on May 15, 2011 at 8:17 PM

    […] subversive discussion online, Schmidt asked they take the conversation offline. See CA discussions here […]
