Gerry Browning: Numerical Climate Models

Gerry Browning of CIRA has contributed a post today discussing climate models. If you go to Google Scholar and search “Browning Kreiss”, you will get a list of formidable papers on numerical questions. Gerry has tried to distill the issues for a wider audience here.

Recent awards include the NOAA Environmental Research Laboratories’ Outstanding Scientific Paper Award for: Browning, G.L. and H.O. Kreiss, “The Role of Gravity Waves in Slowly Varying in Time Mesoscale Motions,” Journal of the Atmospheric Sciences, 54, 9, 1166–1184 (1997).

Here is Gerry’s citation as winner of a 2002 Research Initiative Award from the Cooperative Institute for Research in the Atmosphere:

In a series of papers, Browning and Professor Heinz Kreiss, a colleague and mentor, have extended Kreiss’ Bounded Derivative Theory (BDT) to multiscale flows in the atmosphere and oceans. There are many ramifications of this new theory. The first is that the well-posed system introduced by Browning & Kreiss as a replacement for the ill-posed primitive equations (used in all models for large-scale atmospheric flows) also accurately describes both the dominant and gravity wave portions of all remaining atmospheric flows, i.e., the new system is the only well-posed multiscale system that accurately describes all atmospheric motions. The second is that the reduced system clearly indicates what balances are appropriate for all diabatic cases, i.e., it is the only method that has provided a hot start initialization for cases where the heating is the controlling influence on the solution. The impact of this theory is being felt in many areas, and Browning’s cutting-edge research and the potential for breakthrough applications in numerical modeling

An Introduction to Climate and Weather Models

Continuum Considerations

Although there remains residual debate about the validity of various time dependent systems used to describe fluid motions, a number of these systems are in general use in both the engineering and scientific communities, e.g. the viscous, compressible Navier-Stokes equations, the Euler equations of gas dynamics (essentially the inviscid, compressible Navier-Stokes equations), and the magneto-hydrodynamic (plasma) equations. The continuum behavior of specific solutions of these systems can sometimes be understood by considering special cases (such as the propagation of sound) that lead to simpler systems that are more amenable to classical analysis. Sometimes simplifications of these systems are also made to make the numerical approximations of specific solutions of the continuum system on a computer more tractable. There can be two problems associated with either type of simplification.

Although the original systems usually have known mathematical properties, e.g. the Navier-Stokes equations are a quasilinear differential system, the simplifications can sometimes lead to systems with unknown or even bad mathematical properties. A well-known example of an equation with bad mathematical properties is the heat equation run backwards in time. A small perturbation of the initial conditions for this equation can lead to instantaneous, unbounded growth, and time dependent systems that exhibit this type of behavior are called ill-posed systems. It is quite surprising how often simplifications that have been made in practice have led to this type of problem. Therefore, any simplification of the original continuum equations should be checked to ensure that the simplified system accurately approximates the continuum solution of interest and is properly posed.
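
The ill-posedness of the backward heat equation can be seen from its Fourier modes. What follows is a minimal sketch, not taken from the post: for u_t = u_xx a mode with wavenumber k is multiplied by exp(-k^2 t) after time t, while for the backward equation u_t = -u_xx it is multiplied by exp(+k^2 t). The function name `amplification`, the time interval, and the sample wavenumbers are illustrative choices.

```python
import math

def amplification(k, t, backward=False):
    """Factor multiplying the Fourier mode exp(i*k*x) after time t.

    Forward heat equation  u_t =  u_xx : exp(-k**2 * t)  (decay)
    Backward heat equation u_t = -u_xx : exp(+k**2 * t)  (growth)
    """
    sign = 1.0 if backward else -1.0
    return math.exp(sign * k * k * t)

t = 0.01  # a short time interval (illustrative value)
for k in (1, 10, 50):
    fwd = amplification(k, t)
    bwd = amplification(k, t, backward=True)
    print(f"k={k:3d}  forward factor={fwd:.3e}  backward factor={bwd:.3e}")
```

Because k can be arbitrarily large, a perturbation of size eps at wavenumber k grows to eps * exp(k^2 t) under the backward equation, which can be made as large as desired at any fixed t > 0 by choosing k large enough. No bound on the growth that is independent of k exists, which is what ill-posedness means here.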

Difference Methods for Initial-Value Problems: Richtmyer and Morton

Numerical Modeling Considerations

Once the continuum system to be approximated has been determined to be properly posed, it can be approximated by a number of numerical methods, but all must be both accurate (consistent) and stable for the method to converge to the continuum solution of the initial-value problem (see Lax Equivalence Theorem in above reference). The accuracy of the numerical method determines how fast the numerical method will converge to the continuum solution, e.g. a fourth order method will take fewer mesh points than a second order method (assuming both are stable). However, the numerical accuracy can be reduced by a number of factors, e.g. errors in the approximations of the continuum equations or errors in the model.
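
The statement about order of accuracy can be illustrated with a small experiment. This is a sketch under simple assumptions, not part of the post: it approximates the derivative of sin(x) at x = 1 with the standard centered second-order and fourth-order difference stencils and prints the error as the mesh spacing h is halved. The function names are illustrative.

```python
import math

def d1_second_order(f, x, h):
    """Centered second-order approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def d1_fourth_order(f, x, h):
    """Centered fourth-order approximation of f'(x)."""
    return (-f(x + 2 * h) + 8 * f(x + h) - 8 * f(x - h) + f(x - 2 * h)) / (12.0 * h)

def errors(h, x=1.0):
    """Absolute errors of both stencils for f = sin, whose derivative is cos."""
    exact = math.cos(x)
    e2 = abs(d1_second_order(math.sin, x, h) - exact)
    e4 = abs(d1_fourth_order(math.sin, x, h) - exact)
    return e2, e4

for h in (0.1, 0.05, 0.025):
    e2, e4 = errors(h)
    print(f"h={h:<6} second-order error={e2:.2e}  fourth-order error={e4:.2e}")
```

Halving h should reduce the second-order error by a factor of about 4 and the fourth-order error by a factor of about 16, so for a given error tolerance the fourth-order stencil needs far fewer mesh points, provided the solution is smooth and the scheme is stable.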

There can be two other significant problems with numerical models. If there are any boundaries present, those boundaries must be dealt with very carefully both in the continuum system and in the numerical model. This is an extremely delicate process and if handled improperly can reduce the accuracy of a numerical method and even lead to an incorrect solution. The other major problem is that the solution of the continuum system may contain a complete spectrum of waves, but a numerical model can only compute a finite part of that spectrum. This is a typical and very serious problem. Henshaw, Kreiss, and Reyna have determined the minimal scale that will be produced by the nonlinear, incompressible, Navier-Stokes equations for a given viscosity coefficient (molasses has a large viscosity and air a very small one).

Convergent numerical solutions have shown that the estimates of this scale are extremely accurate. If the numerical model does not resolve the correct number of waves indicated by the estimate, the model blows up. If the model resolves the number of waves indicated by the estimate, the numerical method will converge to the continuum solution for long periods of time. Thus, if a numerical model is unable to resolve the spectrum of the continuum solution, the model is forced to artificially increase the viscosity coefficient or use a numerical method that has nonphysical viscosity built into the method.

Analysis of Numerical Methods: Isaacson and Keller
Time Dependent Problems and Difference Methods: Gustafsson, Kreiss, and Oliger
Initial-Boundary Value Problems and the Navier Stokes Equations: Kreiss and Lorenz

Large-Scale Weather Prediction Models

Given the above brief introduction to time dependent partial differential equations and numerical methods for those systems, we can now discuss large-scale weather prediction and climate models.

Clearly, the atmosphere contains motions of many spatial and time scales and no numerical model can hope to resolve all of those motions. For large-scale motions in the midlatitudes above the turbulent lower boundary layer, the inviscid, unforced Euler equations on a rotating sphere can be scaled under certain mathematical assumptions. For these motions, the vertical acceleration term is approximately 6 orders of magnitude smaller than the remaining terms in the time dependent equation for the vertical velocity and thus is typically neglected in large-scale weather prediction models, leaving only the hydrostatic balance terms. The resulting system is sometimes referred to as the primitive or hydrostatic equations. The neglect of the vertical acceleration term made the equations tractable for computing (the inclusion of the vertical acceleration term would have required too small a time step to satisfy the stability requirement mentioned above), but it also altered the mathematical properties of the original system.
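
The time-step argument can be made concrete with a rough calculation. The sketch below assumes an explicit scheme subject to a CFL-type limit dt <= C * dx / c, which the post does not state explicitly. All numerical values (sound speed, grid spacings, Courant number) are illustrative assumptions, not figures from any particular model.

```python
def max_stable_dt(grid_spacing_m, wave_speed_m_s, courant_number=1.0):
    """Largest time step allowed by a CFL-type limit dt <= C * dx / c."""
    return courant_number * grid_spacing_m / wave_speed_m_s

# Illustrative values only (assumptions, not taken from the post)
SOUND_SPEED = 340.0           # m/s, fastest wave in the compressible system
HORIZONTAL_SPACING = 100.0e3  # m, coarse large-scale horizontal mesh
VERTICAL_SPACING = 500.0      # m, much finer vertical mesh

dt_horizontal = max_stable_dt(HORIZONTAL_SPACING, SOUND_SPEED)
dt_vertical = max_stable_dt(VERTICAL_SPACING, SOUND_SPEED)

print(f"limit from horizontal spacing: {dt_horizontal:.1f} s")
print(f"limit from vertical spacing:   {dt_vertical:.2f} s")
print(f"ratio: {dt_horizontal / dt_vertical:.0f}")
```

Under these assumptions the vertical mesh, which is far finer than the horizontal one, sets a time-step limit hundreds of times smaller once vertically propagating fast waves are retained. Dropping the vertical acceleration term removes those waves from the system and with them the restriction.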

After the derivation of the hydrostatic equations, approximations of the turbulent boundary layer, eddy viscosity (much larger than the true atmospheric viscosity and sometimes even of a different type, e.g. hyperviscosity), and all kinds of approximations to various atmospheric phenomena (parameterizations) are added onto the hydrostatic equations.

In the fall of 2001, Sylvie Gravel (RPN) ran a series of tests on the Canadian operational large-scale numerical weather prediction model. The parameterizations could all be turned off and the turbulent boundary layer approximation greatly simplified without significant difference in the 36 hour model forecast. However, the large-scale weather prediction model quickly started to deviate from the observations at a later time and only by updating the winds in the jet stream every 12 hours did the model stay on track. (The satellite data did not help the model forecast unless there was also radiosonde data available at the same site.) A simple change in the data assimilation program based on the Bounded Derivative Theory had a substantial impact on the forecast and the Canadian global weather prediction model continues to perform better than the NOAA global weather prediction model even though the latter model employs a more accurate numerical method.

Browning and Kreiss, 1986: Scaling and Computation of Smooth Atmospheric Motions, Tellus, 38A, 295-313 (and Charney reference therein)
Browning and Kreiss, 2002: Multiscale Bounded Derivative Initialization for an Arbitrary Domain, JAS, 59, 1680-1696
CMC Website

Climate Models

The updating discussed above is not possible in a climate model, and because climate models use an even coarser mesh than a large-scale weather prediction model, they must use an effectively larger viscosity than a global weather prediction model. Recently (BAMS, 2004), it has been shown that a climate model also deviates from reality in a matter of hours because of the errors in the parameterizations (not unexpected based on the result above), and over longer periods of time the effectively larger viscosity causes the numerical solution to produce a spectrum quite different from that of the real atmosphere unless forced in a nonphysical manner.


  1. Steve McIntyre
    Posted May 15, 2006 at 4:54 AM | Permalink

    I’ve discussed some related issues in a couple of earlier posts. The Navier-Stokes equations discussed briefly here are a highly intractable problem. One of the Clay Institute’s 21st Century Hilbert prizes is for any progress on understanding the equations.

    I also posted up a note on Holloway on ocean dynamics, who seemed to have got to similar conclusions to Browning from a different vantage point. Here are a couple of quotes from Holloway:

    traditional geophysical fluid dynamics (GFD), with traditional eddy viscosities, violates the Second Law of Thermodynamics, assuring the wrong answers. …

    In principle we suppose that we know a good approximation to the equations of motion on some scale, e.g., the Navier–Stokes equations coupled with heat and salt balances under gravity and rotation. In practice we cannot solve for oceans, lakes or most duck ponds on the scales for which these equations apply.

    This enterprise is like seeking to reinvent the steam engine from molecular dynamics’ simulation of water vapour. What a brave, but bizarre, thing to attempt!

    I also posted up some comments on Kaufmann and Stern who observed that:

    none of the GCM’s have explanatory power for observed temperature additional to that provided by the radiative forcing variables that are used to simulate the GCM…

    They have had difficulty getting this paper published, despite being well-known authors. Kaufmann entered into an interesting exchange at realclimate raising some pretty salient issues. He was too prominent for Gavin to simply censor him, so Gavin asked that the discussion, which was substantive and very interesting, be taken offline. I expressed my contempt here.

  2. Paul Linsay
    Posted May 15, 2006 at 6:45 AM | Permalink

    There’s also the issue of correct radiative modeling of the atmosphere. Instantaneous doubling of CO2 is not exactly physical, except maybe in the case of a giant meteorite hitting the earth.

  3. John A
    Posted May 15, 2006 at 6:52 AM | Permalink

    Re: #2

    Somehow I think that the greenhouse effect of a large meteorite hitting the earth will be the least of our problems…

  4. Doug L
    Posted May 15, 2006 at 6:53 AM | Permalink

    It’s not clear if the shell game on this one can continue indefinitely. On the one hand we read in Kaufmann’s RealClimate comment he’s told that the 2001 models are too old to be important, but Gavin indicates that the mean SAT is “a done deal”. I wonder who’s signed this contract and where it is?

    Apologies if I’m reading this wrong.

  5. Dave Dardinger
    Posted May 15, 2006 at 7:10 AM | Permalink

    none of the GCM’s have explanatory power for observed temperature additional to that provided by the radiative forcing variables that are used to simulate the GCM…

    So perhaps instead of a carbon tax to reduce AGW what we need is a VAT for GCMs.

  6. Pat Frank
    Posted May 15, 2006 at 3:21 PM | Permalink

    It also doesn’t matter how stable and well-posed a numerical model becomes, if the physics underlying the model is poorly understood.

    Recently (BAMS, 2004), it has been shown that a climate model also deviates from reality in a matter of hours because of the errors in the parameterizations (not unexpected based on result above) and over longer periods of time the effectively larger viscosity causes the numerical solution to produce a spectrum quite different than the real atmosphere unless forced in a nonphysical manner.

    Modelers have shown* that state-of-the-art GCMs cannot predict global climate further out than a year, and cannot predict smaller-scale regional responses further than 10 years. Despite this, the GCMs are used to support the ‘disaster in 2100’ claims of AGW. The disconnect between the state of the actual science and the claims made in its name is incredible.

    *e.g., M. Collins (2002) “Climate predictability on interannual to decadal time scales: the initial value problem” Climate Dynamics 19, 671-692, and Collins et al. (2002) “How far ahead could we predict El Nino?” GRL 29, 1492-95

  7. Mats Holmstrom
    Posted May 15, 2006 at 5:36 PM | Permalink

    Apart from the problem of constructing a correct physical model, there is the problem of implementing it correctly. I made some comments about this last year at Roger Pielke Sr’s blog,

    I can recommend those interested in GCMs the posts over there at

  8. Posted May 15, 2006 at 6:11 PM | Permalink

    First I want to thank Steve for his diligence. I visit both this site and RC and appreciate the back and forth between the competing camps, with the exception of the occasional venomous ad-hominem attacks. I am a skeptic by nature and believe that the best science occurs when hypothesis and theory survive aggressive challenges intact. Throw the things repeatedly against the rocks and see if they continue to stand. All of you here at ClimateAudit are doing a fine job providing this service. No matter the eventual outcome of this debate, there are few better examples of the real scientific process than this.

    I know that CM’s can be very helpful in many areas of scientific / engineering research. But it amazes me that so many people put so much faith in them, considering the garbage in / garbage out nature of data crunching for predicting long term trends, especially when dealing with complex systems containing many unknowns. I have heard some PETA types advocate for banning drug or cosmetic testing on animals, because, they say, the modern computer is capable of simulating the testing process. What they can’t seem to understand or grasp is that it is impossible to code every single chemical process in the animal / human body. For as long as we humans have been poking, prodding, analyzing and tearing through the human body, there is still so much we don’t know, which is why drug manufacturers don’t put a drug on the market simply because it did well against a bacteria or virus in a petri dish. GCM’s are no different. They only show possible outcomes based on the limited data fed into the model.

    Then again, what do I know. I’m a geology school drop-out (calc killed me).

    PS. I almost misspelled geology.

  9. Pat Frank
    Posted May 15, 2006 at 7:42 PM | Permalink

    I haven’t found any 2004 climate model papers for either GL Browning or Browning and Kreiss. Is it possible to get the actual citation for “BAMS (2004)”? Thanks

  10. Posted May 15, 2006 at 11:51 PM | Permalink

    Does anyone else have a problem with calling these things ‘gravity waves’ instead of just atmospheric waves? Is this some kind of physics envy?

  11. Thomas Bolger
    Posted May 16, 2006 at 1:28 AM | Permalink

    Climate is probably a Chaotic system.
    Chaos Theory tells us that any computed prediction of climate will increasingly diverge from reality as time increases.
    Thus no GCM will be reliable.
    Climate Prediction Net has surely demonstrated that.

  12. Stephan Harrison
    Posted May 16, 2006 at 1:59 AM | Permalink

    Hello all.
    I think that this is a fascinating discussion, and maybe scale is at the heart of the problem. When we try to analyse reductively a complex system like the sea, we are unable to make meaningful predictions. For example, we can’t predict the movement of a sand grain in a breaking wave, nor the height of the 10th next wave. This seems to me to be analogous to the chaos and unpredictability of the weather. However, we can predict the evolution of other characteristics of the ocean system (e.g. the timing of high tides). This is analogous to predicting the large scale behaviour of the climate system. Certain characteristics of complex systems emerge only at the large scale. What we appear to be dealing with is the tension between reductionism and emergence as explanatory devices in science.

  13. Louis Hissink
    Posted May 16, 2006 at 7:00 AM | Permalink

    # 10 – Gravity Waves


    I do have a problem with them, apart from the obvious, but it’s logical when dealing with a strictly mechanical system as the model for the atmosphere.

    The ubiquitous presence of lightning somewhere on the surface of the earth at any given time, together with the enormous voltages measured over hurricanes (some +6K V), would suggest that a simple mechanical model for the atmosphere is oversimplistic, despite the elegant mathematics.

    Much like asking a mechanic to describe a modern computer circuit board when the only objects allowed are spanners, spark plugs and batteries.

  14. Posted May 16, 2006 at 7:51 AM | Permalink

    Re #13. I mean they are just a form of atmospheric wave, not ‘waves of gravity’ in the sense predicted by general relativity theory.

  15. Posted May 16, 2006 at 2:18 PM | Permalink

    For the concept in physics, see gravitational radiation.

    In fluid dynamics, gravity waves are those generated in a fluid medium or on an interface (e.g. the atmosphere or ocean) and having a restoring force of gravity or buoyancy.

    gravity wave: (Also called gravitational wave.) A wave disturbance in which buoyancy (or reduced gravity) acts as the restoring force on parcels displaced from hydrostatic equilibrium.

  16. Mats Holmstrom
    Posted May 16, 2006 at 5:05 PM | Permalink

    Re #16.
    It is not that simple. Lorenz found that simple mechanical systems can exhibit chaotic behaviour, i.e. small differences in initial conditions will grow exponentially as time progresses. But all solutions will stay on the chaotic attractor (the Lorenz butterfly). So even if you cannot say where on the attractor a solution will be, you know that it is on the attractor. Thus, even for chaotic systems you can extract information from the solutions, in principle.

    There are a lot of things to critique regarding computer modeling, but the fact that a system is chaotic does not rule out the possibility of, e.g., extracting average properties.
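
The behaviour described in this comment can be illustrated with a minimal sketch. It assumes the classical Lorenz (1963) system with parameters sigma = 10, rho = 28, beta = 8/3, integrated with a fourth-order Runge-Kutta scheme. The function names, step size, run length, and initial conditions are illustrative choices, not anything specified in the thread.

```python
import math

def lorenz_rhs(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz (1963) equations."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)

def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(s, k, scale):
        return tuple(si + scale * ki for si, ki in zip(s, k))
    k1 = lorenz_rhs(state)
    k2 = lorenz_rhs(shifted(state, k1, dt / 2))
    k3 = lorenz_rhs(shifted(state, k2, dt / 2))
    k4 = lorenz_rhs(shifted(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def run(initial, t_end=100.0, dt=0.01):
    """Integrate from `initial` and return the list of visited states."""
    state, traj = initial, [initial]
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(state, dt)
        traj.append(state)
    return traj

a = run((1.0, 1.0, 20.0))
b = run((1.0 + 1e-8, 1.0, 20.0))  # tiny perturbation of the initial state

max_sep = max(math.dist(p, q) for p, q in zip(a, b))
mean_z_a = sum(p[2] for p in a) / len(a)
mean_z_b = sum(p[2] for p in b) / len(b)

print(f"largest separation between the two runs: {max_sep:.2f}")
print(f"time-mean z, run a: {mean_z_a:.2f}   run b: {mean_z_b:.2f}")
```

The two runs end up as far apart as the attractor is wide, so the state at a given time is not predictable, yet their time averages stay close to each other. This says nothing about whether the same holds for a system with millions of degrees of freedom.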

  17. Pat Frank
    Posted May 16, 2006 at 6:06 PM | Permalink

    #17, if you look at the Lorenz butterfly, here for example with two attractor sites, you’ll see that small changes in initial conditions (click rapidly twice) mean that over time your prediction can be ~180 degrees out of phase with physical reality. And for a chaotic system, that would be true even if your physical model was exactly correct.

    If you think of the inner trajectories of the two attractors as representing glacial or tropical climates, resp., with temperate climates represented by trajectories further out along each radius, then it would be possible for your model to be predicting temperate climates while reality was, e.g., a glaciation. As surface temperatures would certainly follow such climates, the conclusion is that not even global average surface temperatures are predictable at long times, even with a perfect model.

  18. Mats Holmstrom
    Posted May 16, 2006 at 7:06 PM | Permalink

    Re #18
    Yes, that is what I just said: small differences in initial conditions will grow exponentially as time progresses.

    My point is that for different sets of parameters you get different attractors. Making a hypothetical analogy to climate predictions, x2 CO2 might give you an attractor with different properties (average temperature) compared to the attractor for x1 CO2.
    One cannot conclude that no information can be extracted just because a system is chaotic.
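
The point about parameter dependence can be sketched in the same toy setting. The example below assumes the Lorenz (1963) system integrated with fourth-order Runge-Kutta and compares the long-time mean of z for two values of the parameter rho. The values 28 and 35, the function name `mean_z`, and the averaging window are illustrative assumptions. The analogy to CO2 levels is the commenter's hypothetical and is not modelled here.

```python
def mean_z(rho, sigma=10.0, beta=8.0 / 3.0, dt=0.01, t_spinup=20.0, t_avg=200.0):
    """Time average of z for the Lorenz-63 system, integrated with RK4."""
    def rhs(s):
        x, y, z = s
        return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)

    def step(s):
        k1 = rhs(s)
        k2 = rhs(tuple(si + 0.5 * dt * ki for si, ki in zip(s, k1)))
        k3 = rhs(tuple(si + 0.5 * dt * ki for si, ki in zip(s, k2)))
        k4 = rhs(tuple(si + dt * ki for si, ki in zip(s, k3)))
        return tuple(si + dt / 6.0 * (a + 2 * b + 2 * c + d)
                     for si, a, b, c, d in zip(s, k1, k2, k3, k4))

    state = (1.0, 1.0, 20.0)
    for _ in range(int(round(t_spinup / dt))):  # discard the initial transient
        state = step(state)
    total, n = 0.0, int(round(t_avg / dt))
    for _ in range(n):                          # accumulate the time average
        state = step(state)
        total += state[2]
    return total / n

z28 = mean_z(28.0)
z35 = mean_z(35.0)
print(f"mean z for rho=28: {z28:.2f}")
print(f"mean z for rho=35: {z35:.2f}")
```

Individual trajectories are unpredictable for both parameter values, but the averaged quantity shifts systematically when the parameter changes. That is the sense in which information can be extracted from a chaotic system in this low-dimensional example.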

  19. Paul Linsay
    Posted May 16, 2006 at 7:57 PM | Permalink

    #17-19: You and the climate modelers are assuming that a chaotic attractor is a useful concept for a system with literally millions of degrees of freedom. It works well for the Lorenz equations because there are only three degrees of freedom and it’s easy to integrate the equations for a great many cycles so that it’s possible to extract average properties.

    To do the same with the climate models you have to make a lot of assumptions. Assumption number one is that the equations of the climate system are completely known and understood. This is a big and very questionable assumption. Just wander over to Roger Pielke Sr’s web site for the new and different climate forcing of the day.

    Assumption two, you can accurately integrate the equations without wandering off the attractor, whatever it means to have a multi-million dimension attractor. The fluid dynamics people don’t even consider the concept of an attractor particularly useful except at the very onset of turbulence. Once turbulence sets in, statistical measures are more meaningful and useful. (See the thread on this web site by Gerry Browning for problems with integrating the climate equations.)

    Assumption three: Even if one and two are ok, you still have to integrate the equations for a very long time to be able to generate enough phase space trajectories to be able to take the averages. Is there enough computer power in the universe to do this? Integrating one or even a few hundred sets of initial conditions to simulate 100 years of climate is not enough. You have to integrate thousands (millions?) of initial conditions or a few initial conditions over a very long period of simulated time. You’d like to do both to make sure that you’re doing things right.

    OK, supposing that you are successful at integrating the multi-million degrees of freedom equations. Now you want to use it to predict the climate. You still have to measure all those millions of degrees of freedom in the real world and plug them into your model to make predictions. Good luck or you better start getting real clever in reducing the dimensionality of your model so that it’s actually tractable for use with available data.

  20. Pat Frank
    Posted May 16, 2006 at 9:19 PM | Permalink

    #20 (and #19), not only that, but in any set of attractor trajectories, the average for any given time is a transect across the entire set of possibilities. Such an average is likely to have very large uncertainties, making a time-wise ‘prediction’ extremely uncertain.

    One could use a good model to study the behavior of the system under various conditions, and discover what forcings can radically perturb a given quasi-stable state. However, that is not the same as predicting which new stable state, of the set of possible stable states, will emerge.

  21. MrPete
    Posted May 16, 2006 at 11:01 PM | Permalink

    [Just stopping by quickly in the midst of wild times… no time for substantial reflection for a month or three :(]

    Question: is there a standard statistical/numerical method that can be used to create an hypothesis surrounding ongoing discovery of important new elements in GCMs? I’m thinking this should be doable in a manner similar to proven/unproven natural resource reserves, and our confidence (or lack thereof) in the current or future state of GCM’s.

    * January 2006, it is announced that “…plants worldwide produce millions of tonnes of methane each year, with the greatest share coming from the tropics, and that the plant contribution is likely to count for 10–30 per cent of annual methane emissions…plants produce more methane at higher temperatures, the amount doubling every ten degrees above 30 degrees Celsius.”

    Let’s put that in context.

    2001 total radiative forcing was 2.425 W/m2
    2005 Updated Total was 2.7911 (a 15.1% increase)

    2006 discovery (above) now produces a total of 2.9351 (another 5.2% increase, a 21.0% increase over the 2001 estimate)

    Interestingly, ALL of the increases recorded are not due to measured changes in prior values, but to additional factors not previously identified.

    We could be silly and fit this data to a growth curve, to produce a reasonable (???) estimate that the total forcing discovered by 2020 will be ~18.3 W/m2, and thus that perhaps a discount rate of (1 – (2.9/18.3))= 84 percent could be applied to current GW forcing estimates (based on the assumption that we understand ‘A’ sources much better than ‘non-A’ sources). But let’s not do that.

    Instead, I’m curious how the above facts about ongoing factor-discovery impact quantitative confidence levels attributable to the various models and estimates used in climate science. I’ve seen awesome confidence level statements in many current papers. How much should the error bars grow, to take into account our obvious need for humility in claiming GCM completeness?
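
The percentages quoted in this comment can be checked directly from the forcing totals it lists. This is a small sketch using only the numbers given in the comment. The helper name `percent_increase` is illustrative, and the 18.3 W/m2 extrapolation is taken as quoted rather than reproduced, since the fitting inputs are not given.

```python
def percent_increase(old, new):
    """Relative increase from old to new, in percent."""
    return 100.0 * (new - old) / old

# Total radiative forcing values quoted in the comment, in W/m2
f2001, f2005, f2006 = 2.425, 2.7911, 2.9351

inc_2005 = percent_increase(f2001, f2005)   # 2001 -> 2005 update
inc_2006 = percent_increase(f2005, f2006)   # 2005 -> 2006 update
inc_total = percent_increase(f2001, f2006)  # 2001 -> 2006 overall

# "Discount rate" from the comment: 1 - (2.9 / 18.3), with 18.3 taken as quoted
discount = 100.0 * (1.0 - 2.9 / 18.3)

print(f"2001 to 2005: {inc_2005:.1f}%")
print(f"2005 to 2006: {inc_2006:.1f}%")
print(f"2001 to 2006: {inc_total:.1f}%")
print(f"discount:     {discount:.0f}%")
```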

  22. MrPete
    Posted May 16, 2006 at 11:03 PM | Permalink

    Oops, a typo: that 18.3 value is for 2025. [For the curious, I just used Excel’s GROWTH() function to create the trend.]

  23. Steve McIntyre
    Posted May 16, 2006 at 11:12 PM | Permalink

    #22. There were clerical errors in HITRAN for the near infrared absorption of water vapor used in IPCC TAR which were larger in W/m2 than the impact of 2xCO2. To my knowledge, no one ever reported on what the GCMs looked like merely with the changed water vapor values. Instead, the models were re-tuned.

  24. MrPete
    Posted May 16, 2006 at 11:18 PM | Permalink

    While I’m at it: I’ve yet to discover evidence that missing data is generally being handled in a proper way. As others have noted, this can be an extraordinary source of misunderstanding, miscalculation, and so forth.

    Put simply, zero and “I dunno” are NOT the same. And interpolating to fill gaps is not a valid solution, particularly for noisy data sources.

    When a new measurement source is added to the mix, it is wildly inaccurate to presume that its data can simply be incorporated into the calculations from year XXYY on, and ignored before that date. Yet that is mostly what appears to happen here! I was stunned when I saw that.

    As shown in my example posted above, one can observe an “increase” in measured data simply by converting values from “unknown” to a known value. Yet, it may be that no increase of any kind has been actually measured; all that’s happened is more complete data is available.

    See any parallels to financial hype? Sigh.

  25. MrPete
    Posted May 16, 2006 at 11:28 PM | Permalink

    #24 “the models were re-tuned.”

    Wow. 20/20 hindsight is such a wonderful thing!

    Who cares if the hypothesis represented by the old model fails to accommodate the new information? We just create a new model (hypothesis) and “move on.”

    Does any climate model, anywhere, attempt to accommodate future climate discoveries that might tend to invalidate present understanding? Steve M, I presume that natural resource models/estimates are relatively sophisticated in this regard, correct?

    (I’m becoming suspicious the whole house may fall down if realistic uncertainty levels are inserted… but don’t have the mathematical muscles to prove it. Is it possible to calculate something along the lines of “this modeling methodology will fail to converge if the data uncertainty exceeds N percent”?)

    Sorry for my uneducated and ignorant vocabulary. I’m a practician (knowing how numbers “feel” in practice), not an academic (able to prove how the numbers ought to work). No slight intended to either area of expertise BTW — we need both!

  26. Mats Holmstrom
    Posted May 17, 2006 at 12:32 AM | Permalink

    Re #20
    why do you write You and the climate modelers? I did not write about climate models above, but commented on chaotic systems in general. Please read what I wrote in #16:

    There are a lot of things to critique regarding computer modeling, but the fact that a system is chaotic does not rule out the possibility of, e.g., extracting average properties.

    That’s all I said. Then you get into a different discussion about all the assumptions going into climate models, which I mainly agree with (and could add some more).

  27. Paul Linsay
    Posted May 17, 2006 at 5:52 AM | Permalink

    #27 Mats, my apologies. I read your statement as a defense of the current state of climate models.

  28. Paul Linsay
    Posted May 17, 2006 at 5:58 AM | Permalink

    #27 Please expand on the add some more in

    assumptions going into climate models, that I mainly agree with (and could add some more)

  29. Steve McIntyre
    Posted May 17, 2006 at 7:10 AM | Permalink

    #26. With respect to the multiproxy models, the claimed confidence intervals are wildly inappropriate. If you look at the posts and exchanges in connection with MBH Confidence Intervals (see MBH98 Category), they calibrate confidence intervals in MBH98 (and other studies) based on calibration period standard errors. The standard error of the residuals is MUCH higher in the verification period (verification r2 of about 0 is the same effect expressed in different terms). Calculation of confidence intervals based on the verification period would lead to confidence intervals from the floor to the ceiling. Wahl and Ammann argue that MBH somehow salvages “low-frequency” results but then you have only a couple of degrees of freedom and again no ability to establish confidence intervals.

  30. MrPete
    Posted May 17, 2006 at 11:54 AM | Permalink

    #30 “they calibrate confidence intervals in MBH98 (and other studies) based on calibration period standard errors”

    The implication of this statement is finally sinking in. Unbelievable. Another case of obfuscation via fancy terminology.

    That’s not a confidence level, it is (perhaps) a “quality of fit” measure, and of course only applies to the calibration period itself.

    I realize I’m saying nothing new: standard error of a calibration period cannot possibly predict confidence levels for the future/past. Obvious when observing any measure you like, for real-world data that has some level of variability.

    An acquaintance was recently asked to provide a peek into the future for some high net worth folks. He spent the first half hour demonstrating that nobody has ever predicted significant futures with any accuracy. I’ll see if I can dig up his data sometime…

  31. Peter Hartley
    Posted May 17, 2006 at 11:56 AM | Permalink

    I think the tendency to present “scenarios” in climate modeling rather than probabilistic projections is entirely because the researchers want to finesse the issue of how much confidence one could reasonably have in the forecasts. I think there have also been indications that the error terms in the model forecasts are likely to be highly positively skewed so that the extreme upper projections that get emphasized in the media releases are extremely low probability events. I think that scientists in the climate field should start demanding their GCM colleagues produce probabilistic forecasts that would be more useful for others trying to do related research into likely ancillary effects, likely costs and benefits of different actions etc. The “scenario story lines” are really worthless as a basis for any sensible policy analysis.

  32. Steve McIntyre
    Posted May 17, 2006 at 12:08 PM | Permalink

    #31. It’s hard to believe that some of these practices could be in use. I think that’s why people tend to disbelieve some of my findings.

  33. Peter Hartley
    Posted May 17, 2006 at 12:26 PM | Permalink

    The Idso’s web site has an interesting review of what appears to be a related paper Williams, P.D. 2005. Modelling climate change: the role of unresolved processes. Philosophical Transactions of the Royal Society A 363: 2931-2946. It appears to propose “stochastic techniques” as “an immediate, convenient and computationally cheap solution” to the problem that some potentially important climatological phenomena are simply too small to be adequately modeled at the present time. I guess this raises an issue that is perhaps contrary to my previous post #32 as to whether stochastic models can really serve as an adequate simplification of chaotic models that cannot be solved. There is also the issue that even if the underlying non-linear dynamic equations could be solved for known parameter values, there is uncertainty about some of the parameter values. What happens when you mix non-linear dynamics with stochastic parameter distributions? Can some of the physicists here enlighten us?

  34. Willis Eschenbach
    Posted May 17, 2006 at 1:39 PM | Permalink

    Re 32, thanks, Peter, an interesting post. You say:

    I think the tendency to present “scenarios” in climate modeling rather than probabilistic projections is entirely because the researchers want to finesse the issue of how much confidence one could reasonably have in the forecasts. I think there have also been indications that the error terms in the model forecasts are likely to be highly positively skewed so that the extreme upper projections that get emphasized in the media releases are extremely low probability events. I think that scientists in the climate field should start demanding their GCM colleagues produce probabilistic forecasts that would be more useful for others trying to do related research into likely ancillary effects, likely costs and benefits of different actions etc. The “scenario story lines” are really worthless as a basis for any sensible policy analysis.

    There is an interesting paper on this question at the Climate Science web site. The basic message of the study is that the size of the albedo is very poorly represented in the climate models, with a mean! error on the order of 3-4 W/m2.

    Since this error is about the size of the IPCC value for a doubling of CO2, you can see what this would do to a confidence interval for their results … like Steve has said, “floor to ceiling” …


  35. Peter Hartley
    Posted May 17, 2006 at 2:55 PM | Permalink

    Willis: Knowing that the distribution would reach from “floor to ceiling” certainly would have dramatic implications for greenhouse policy. Such a situation would imply that any investments in controlling CO2 would be much more risky in terms of their payoffs. That in turn would imply that the expected return on those investments would have to be much higher (ie the mean “bad effects” from the increase would have to be greater) in order to justify the policy as a rational expenditure of resources. Perhaps even more significantly, an increase in the variance of the effects would greatly increase the value of any “options” (in the financial sense) associated with greenhouse policy. In particular, the benefits of waiting to get information that reduces the uncertainty in the forecasts before taking action increase substantially. That is, waiting for more satellite observations etc to settle the issue of how well the models really work becomes a much more attractive alternative.

  36. Peter Hartley
    Posted May 17, 2006 at 3:12 PM | Permalink

    Further to #36: Thinking about this some more, the options point implies that we are interested not just in the mean and variance of the distribution but also the “tails” — skewness, kurtosis etc. Indeed, one can think of the “precautionary principle” as an analogous options pricing issue. If there is a really bad outcome and an available option that you are sure would enable you to avoid that outcome, the value of the option would be increased by anything that raised the probability of the bad outcome. The conclusion is that to make sensible deductions about greenhouse policies one needs the models to deliver forecasts not just of means and variances but also higher moments of the outcomes. The scenario story lines are a wholly inadequate basis for constructing rational policies.
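As a rough illustration of why the higher moments matter, the sketch below computes the mean, variance, skewness and excess kurtosis of two invented samples with similar means, one symmetric and one positively skewed, and compares their upper-tail probabilities. The distributions, the threshold and the function names are assumptions made for the example and are not taken from any model output.

```python
import random


def moments(samples):
    """Sample mean, variance, skewness and excess kurtosis (population form)."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((s - mean) ** 2 for s in samples) / n
    m3 = sum((s - mean) ** 3 for s in samples) / n
    m4 = sum((s - mean) ** 4 for s in samples) / n
    return {
        "mean": mean,
        "variance": m2,
        "skewness": m3 / m2 ** 1.5,
        "excess_kurtosis": m4 / m2 ** 2 - 3.0,
    }


def upper_tail_probability(samples, threshold):
    """Fraction of samples above the threshold."""
    return sum(s > threshold for s in samples) / len(samples)


rng = random.Random(1)
# Hypothetical outcome distributions with means near 3.
symmetric = [rng.gauss(3.0, 1.0) for _ in range(20000)]
skewed = [rng.lognormvariate(1.0, 0.5) for _ in range(20000)]

if __name__ == "__main__":
    for name, data in (("symmetric", symmetric), ("skewed", skewed)):
        m = moments(data)
        tail = upper_tail_probability(data, 7.0)
        print(name, {k: round(v, 2) for k, v in m.items()}, "P(x > 7) =", tail)
```

Two forecasts can agree on the mean and still imply very different odds of an extreme outcome, which is the quantity the options argument depends on.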

  37. Paul Linsay
    Posted May 17, 2006 at 5:29 PM | Permalink


    What happens when you mix non-linear dynamics with stochastic parameter distributions?

    I don’t know about stochastic parameter distributions but I can say something about what happens when parameters change in a non-linear dynamical system. Most of the time the attractor changes shape but basically looks the same, it’s fatter, thinner, … But there are parameter values where the system will bifurcate and suddenly the attractor will be different. For example it might change from periodic to chaotic.

    My favorite is hysteretic transitions. In this case as you increase a parameter P there will come a value Pb where the system switches from attractor A to attractor B. But on lowering the parameter the switch back to attractor A won’t occur until you reach Pa. This means that there is a region Pa < P < Pb where two distinct states are possible but the one you’re in depends on past history. This may sound like fiction but Taylor-Couette flow is famous for doing this.

    This throws an extra twist into nonlinear systems beyond the famous Lorenz “Butterfly effect.” Not only can’t you model the future very well because small errors get amplified exponentially, you also have to know how you got there because a different parameter path to the identical set of parameters could lead you to a completely different scenario.
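The hysteretic transition described in this comment can be reproduced in a toy one-variable system. The sketch below is an assumption-laden stand-in, not a model of Taylor-Couette flow: it integrates dx/dt = P + x - x^3 with forward Euler, sweeps P slowly up and then back down, and records the state the system settles into at each value.

```python
def relax(x, p, dt=0.01, steps=2000):
    """Integrate dx/dt = p + x - x**3 with forward Euler for a fixed time."""
    for _ in range(steps):
        x += dt * (p + x - x ** 3)
    return x


def sweep(p_values, x0):
    """Step the parameter through p_values, carrying the state along."""
    x = x0
    states = []
    for p in p_values:
        x = relax(x, p)
        states.append(x)
    return states


n = 81
p_up = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]   # P from -1 to 1
p_down = list(reversed(p_up))                          # P from 1 back to -1

up = sweep(p_up, x0=-1.0)          # start on the lower branch
down = sweep(p_down, x0=up[-1])    # return sweep starts where the first ended

# Parameter values at which the state changes branch in each direction.
jump_up = next(p for p, x in zip(p_up, up) if x > 0)
jump_down = next(p for p, x in zip(p_down, down) if x < 0)

if __name__ == "__main__":
    print("switch on the way up at P =", round(jump_up, 3))
    print("switch on the way down at P =", round(jump_down, 3))
    print("state at P = 0:", round(up[40], 3), "(up) vs", round(down[40], 3), "(down)")
```

For this toy system the two stable branches coexist for |P| below about 0.385, so at P = 0 the state depends entirely on the direction from which P was approached.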

  38. Paul Linsay
    Posted May 17, 2006 at 5:33 PM | Permalink

    End of the second paragraph #38 should read

    This means that there is a region Pa less_than P less_than Pb where two distinct states are possible but the one you’re in depends on past history. This may sound like fiction but Taylor-Couette flow is famous for doing this.

  39. MrPete
    Posted May 17, 2006 at 6:06 PM | Permalink

    #36: “an increase in the variance of the effects would greatly increase the value of any “options” (in the financial sense) associated with greenhouse policy.”

    …particularly so since, as in the matter of tree-sourced methane, what was once thought to be a beneficial “option” may turn out to accomplish the opposite of what was intended at the first level. (Yes, planting trees has other benefits so no we don’t cut down all the forests… just saying that to avoid troll responses…)

  40. Reid
    Posted May 18, 2006 at 4:59 PM | Permalink

    The problem with computer models is far deeper than getting the equations right. Complex kinematic systems where the underlying equations-functions are known precisely cannot be accurately modeled. The output of the model diverges from reality as time proceeds in the model. In fact, when I visited NCAR in the late 90’s they had a display that illustrated the point. It was a swinging arm kinematic device that looked simple enough to any physics 101 student. But the problem was even though the kinematics was completely understood, the computer models all diverged from reality as time progressed in the model. Reason enough for extreme scepticism at any computer model that claims skill at predicting the future. That would be any future whether the economy, the stock market, climate, resources, population, etc.
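The divergence described here can be seen even in a system whose equation is known exactly. The sketch below uses the logistic map x -> 4x(1 - x) as a stand-in for the swinging-arm device (an assumption; the NCAR display itself is not modeled) and treats a 1e-10 difference in the starting value as the gap between model and reality.

```python
def logistic_orbit(x0, steps):
    """Iterate the logistic map x -> 4 x (1 - x) and return the whole orbit."""
    orbit = [x0]
    for _ in range(steps):
        x0 = 4.0 * x0 * (1.0 - x0)
        orbit.append(x0)
    return orbit


def separation(x0, delta, steps):
    """Absolute difference between two orbits started delta apart."""
    a = logistic_orbit(x0, steps)
    b = logistic_orbit(x0 + delta, steps)
    return [abs(u - v) for u, v in zip(a, b)]


sep = separation(0.3, 1e-10, 100)

if __name__ == "__main__":
    for step in (0, 10, 20, 30, 40, 50):
        print(step, sep[step])
```

The two orbits track each other closely for the first few dozen steps and then become unrelated, although both obey the identical equation.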

  41. Paul Linsay
    Posted May 20, 2006 at 7:13 AM | Permalink

    James Annan has an extremely revealing observation in the context of this thread (see reply 6 at Pielke’s site)

    You seem to be defining “skill” simply in terms of the model agreeing with the observations (to within their error), which is clearly an invalid use of the term, because under this definition, no model can ever be expected to have skill. That’s a simple exercise in elementary algebra.

    I’m flabbergasted by this statement, though I probably shouldn’t be.

  42. Dave Dardinger
    Posted May 20, 2006 at 9:57 AM | Permalink

    re: #42

    In this case I agree with Annan, at least within the context where he was speaking. The definition of skill being used was that the model must do better than some other method of post-dicting the known climate. His point that you’re flabbergasted by was simply that using the actual instrumental data as the ‘method’ creates an impossible barrier. It’s impossible by definition for a model to do better at matching the actual climate than the measurements of climate themselves. Of course, as perhaps Steve will explain, this isn’t where MBH98 fails.

  43. Willis Eschenbach
    Posted May 20, 2006 at 10:23 AM | Permalink

    Re 42, 43: I don’t understand Annan’s point. He says that “skill” consists of doing better than some other prediction method. Post 43 agrees.

    But Pielke is not proposing that the computer models should “do better at matching the actual climate than the measurements of climate themselves” as # 43 proposes. For example, Pielke says:

    Skill can be quantified, for instance, by a model forecast as having a root mean squared error of 1.2C with respect to 500 hPa temperatures, for example. This is one type of model “skill”.

    The idea that Pielke is asking that the computer models should do better than the observations is a straw man. Pielke is only (and reasonably, in my opinion) asking that the results be what I call “lifelike” — that the standard deviation, mean, first derivatives, etc. of the model results be similar (in some specified mathematical sense) to the observational data.

    The appropriate metric is not, as Annan proposes, whether a computer model does better than some alternative method such as persistence or climatology. It is whether the results are in close agreement (again in some specified mathematical sense) with observational data.

    Annan claims that

    There are no examples using your definition to be found in the literature, because it simply doesn’t make sense.

    This is simply not true. The “Gerrity Score”, for example, is used by the UK Met Office to assess the “skill” of their forecasts. From their web site:

    The Monthly Outlook includes forecasts of expected temperature and rainfall categories. Five categories are used; (1) well below average, (2) below average, (3) near average, (4) above average or (5) well above average conditions for the time of year.

    To assess the accuracy of the forecast we compare the predicted category with the category that was actually observed to occur. We use a points-based scoring system in which maximum points are awarded to forecasts that are ‘spot on’ (i.e. the forecast category exactly matches the category that actually occurred), fewer points are awarded for ‘near misses’ (e.g. the forecast is wrong by one category), and points are subtracted for misleading forecasts (i.e. a forecast of above normal when below normal is observed). The score used is called the Gerrity Skill Score (GSS), and is one of the scores recommended by the World Meteorological Organization (WMO) for evaluation of long-range forecasts. The score is designed so that forecasts that are always ‘spot-on’ would achieve a score of 1.0, and forecasts based on simply ‘forecasting’ the long-term average (category 3) would receive a score of zero. Thus a positive score means the forecast is better than guesswork and better than assuming future conditions will be similar to the long-term average. Although the theoretical maximum score is 1.0, best scores achieved at the monthly range are of order 0.6, and found in the more predictable tropical regions.

    Note that they are not comparing to other forecasts, but to observational reality.
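The two notions of skill being argued over in this thread can be set side by side in a few lines. The sketch below computes a root mean squared error against observations, and a reference-relative score of the form 1 - MSE(forecast) / MSE(reference). The numbers are invented for illustration, and this is not the Gerrity Skill Score, which is a categorical score with its own scoring matrix.

```python
import math


def rmse(forecast, observed):
    """Root mean squared error of a forecast against observations."""
    sq = sum((f - o) ** 2 for f, o in zip(forecast, observed))
    return math.sqrt(sq / len(observed))


def skill_vs_reference(forecast, reference, observed):
    """1 is perfect, 0 matches the reference forecast, negative is worse."""
    mse_forecast = rmse(forecast, observed) ** 2
    mse_reference = rmse(reference, observed) ** 2
    return 1.0 - mse_forecast / mse_reference


# Hypothetical values, in degrees C.
observed = [14.1, 14.3, 14.0, 14.6, 14.8]
forecast = [14.0, 14.4, 14.2, 14.5, 14.9]
climatology = [14.2] * len(observed)   # assumed long-term average

if __name__ == "__main__":
    print("RMSE vs observations:", round(rmse(forecast, observed), 3))
    print("skill vs climatology:",
          round(skill_vs_reference(forecast, climatology, observed), 3))
```

Both measures are computed from the observations; the reference-relative score only adds a baseline against which the same error is judged.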

    I have repeatedly asked Gavin Schmidt whether they use the Gerrity Score or some other method to assess the “skill” of their models. To date, he has refused to reply to the question.


  44. Paul Linsay
    Posted May 20, 2006 at 1:11 PM | Permalink

    #43. All the uproar in climate science is because the models are predicting catastrophic warming a century from now. Yet Annan doesn’t want to test the models against observations. There is no scientific context in which his statement makes sense. What possible other criterion is there? If the models don’t agree with the observations why the excitement? Model versus model is just an exercise in computer generated fantasy. Who cares except the people who wrote the code?

  45. Dave Dardinger
    Posted May 20, 2006 at 2:08 PM | Permalink

    Well, I’m not claiming that the definition of “skill” used by Annan et al. is all that useful, at least not if you don’t have a process for estimating future temperatures which you’ve presented and are comparing your model to. Still, something like the HS isn’t the same as a model, so one can compare them to see which has the most ‘skill’ in matching the observational data. However there are other possibilities such as Annan mentions, such as assuming the continuation of various cycles and trends. I don’t know that this is actually what MBH98 did however. But I’ll wait to see what Steve has to say on the subject.

  46. TCO
    Posted May 21, 2006 at 9:55 AM | Permalink

    Hartley’s econ-jock oriented comments were very interesting. However, I doubt that even if the danger is well identified, that we will really take firm steps to limit GHG’s. The large issue is NOT Kyoto. It is the funding of Mannites and other sillies.

  47. Steve Sadlov
    Posted May 22, 2006 at 1:21 PM | Permalink

    It’s really annoying to me, as a data driven guy, to see how after he made his post at RC, there came a flurry of scare mongering, junk science links, which were as far away from any sort of real mathematical rebuttal as tree sitters are from particle physics experiments.

  48. Steve Sadlov
    Posted May 22, 2006 at 1:46 PM | Permalink

    RE: #20. This is the essence of the problem. Full stop.

  49. Steve Sadlov
    Posted May 22, 2006 at 1:49 PM | Permalink

    RE: #32. Monte Carlo Simulation would be a start, but with millions of degrees of freedom, that’d be some fair gnarly calculatin’ 😉

  50. Francois Ouellette
    Posted May 24, 2006 at 5:14 PM | Permalink

    There was this very interesting post on modelling on Roger Pielke Sr.’s blog, by a fellow named Gregor L. I’m copying it here for your interest:

    “After reading several of the comments, I had to chime in given that I formerly studied to be both a statistician and atmospheric scientist but am now engaged in creating highly sophisticated economic/statistical models for one of the top 5 largest hedge funds. Because of the proprietary nature of what I do, I cannot go into the nature of the models that I create and work on, but suffice it to say that both the complexity and computing power required are entirely on order of many GCMs – one may notice how many big/fast computers in the top 500 list are at financial institutions or trading organizations.

    Modeling financial markets is brutally difficult. The fundamental relationships are highly nonlinear and the associated data possess pathological distributions. Even worse, the nature of the markets means that one does not have a convenient set of Navier-Stokes equations upon which to ground the modeling, but instead has to deal with:

    1) Underlying equations that may not be knowable from any form of first principles, and
    2) Even when known, equations that are nonstationary in a very nasty and nonlinear way owing to the psychological/sociological aspects of financial markets

    More importantly, we have a very natural measure of skill. It is:

    1) Be better on a relative level than our competition, and
    2) Be right significantly more often than we are wrong

    The climate community too often embraces only the first aspect. On the other hand, if we ignore either one of our “skill” dimensions in my line of work, then the lights get turned off and we are out of business.

    Moreover, because of the very natural feedback of our industry, we cannot “game” our models. For example, I cannot create a model that looks great in testing but loses money in actual use; the finance literature is full of such models, but using them will cause one to go bankrupt. In the same fashion, I cannot examine behavior in individual financial markets and then “tweak” various differential equations and statistical parameters so that the model forecasts better match that market’s returns – such gamed models also fail miserably when used in reality.

    Now I will tie this in more directly with climate research. The measure of skill quoted by Pielke from Annan’s website:

    “Forecast skill is generally defined as the performance of particular forecast system in comparison to some other reference technique.”

    is exactly the definition I would expect from someone who does not have to be right more often than wrong. In fact, unlike what I do in the applied finance world, there has been little feedback whatsoever on whether or not climate model forecasts created to date have been correct thus far. Instead, the feedback seems to be how well one can fit a GCM to an observed data set in a publication (such data include historical near-surface temps, satellite observations, etc.) to see if it indeed matches historical data. Even though such GCM forecasts are integrated over time, the continual process of re-running and re-running them until better matches are created is a form of in-sample maximization that provides no information on what the model’s realized forecast efficacy will be (even the in-sample matches are currently poor, as has been noted above).

    From an outsider’s viewpoint, I have to ask a troubling question: What are the incentives to create an accurate climate model forecast going forward? I am not talking about short-term (upcoming season) or even early medium-term forecasting, but the long-run enhanced CO2 emissions GCM forecasts. Who will validate these forecasts? Will the validations be during the researchers’ careers? One can cynically note that there is certainly incentive to create attention-getting forecasts, a result that is generating a lot of climate research funding without any direct tie-in as to whether or not the forecasts will later prove to be true (which, I should note, they very well could be right after all).

    I am left with only two incentives to get the forecasts right: The integrity of the researchers/groups, and the overall integrity of the scientific process. I do believe that most researchers have such integrity, including those posting here, but I call into question the scientific process here, especially in the short run. “Group think” behavior has been a stunning and recurrent problem throughout the history of scientific progress. And it can flourish when there is a lack of experimental verification and predictive accuracy feedback, especially when performance measures do not require one to be right. I know that in my world, if we could be successful under such lax performance measures, my own life would be a lot less stressful.”

  51. Steve Sadlov
    Posted Jun 14, 2006 at 1:36 PM | Permalink

    This is rich. My retort to a raypierreism. Probably will never be allowed to post over there, so here it is:

    RE: “Response: At this point, forecasting hurricane trends based on the supposed future of the AMO seems more speculative and less based on fundamental physics than factoring in the contribution of anthropogenic global warming. –raypierre”

    So, are you implying some sort of conspiracy, where NOAA are silencing those who share your own bias, and are promoting those who are, as you would call them, “deniers?” You actually believe that NOAA would risk poor forecasts to satisfy some sort of “Bushie political agenda?” You actually believe that? And you call yourself a scientist? Got tin foil? ….

    by Steve Sadlov

