March 2106

According to KNMI’s version of UKMO CM3, in March 2106, the tropics (20S-20N) will temporarily have an inhospitable temperature of 0.1E21.

In a statement, the Hadley Center said that these results showed that the situation was “worse than we thought”. In an interview, Stefan Rahmstorf said that not only was it worse than we thought, it was happening faster than we thought. Using the most recent and most sophisticated parallel-universe embedded canonical manifold multismoothing of Simpson et al, (H. Simpson et al, Springfield Gazette, June 11, 2009), Rahmstorf said that we could expect a temperature of 0.1E19 by 2020. [For realclimate readers, I’m just joking – they didn’t really say that.]

If any life survives, it will fortunately revert to more hospitable temperatures in April 2106. Below is a screenshot of my download.


  1. Andrew
    Posted Jul 4, 2009 at 10:04 PM | Permalink

    Hehe, now that’s funny. Nice one Steve! I’d file this under “even Scientists are human” – and can make simple archiving errors (I’m assuming, of course, that the data point is a mistake and not real, which would be even more embarrassing!).

  2. Steve McIntyre
    Posted Jul 4, 2009 at 10:07 PM | Permalink

    This is all computer generated stuff. It isn’t a typo. It’s not an “archival” error; it’s a computer error. I have no idea why it would get one month so weird. Or whether it originates at KNMI, PCMDI or UKMO.

  3. Andrew
    Posted Jul 4, 2009 at 10:12 PM | Permalink

    Er, sorry, I didn’t quite understand. Apologies.

  4. Nicholas
    Posted Jul 4, 2009 at 11:33 PM | Permalink

    It could be memory or disk corruption on the machine running the model. Flip one bit in the floating point number somewhere during the output phase and you can get a crazy result like this. Who knows how many other more subtle errors could be in the data set?
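Nicholas’s single-bit-flip scenario is easy to demonstrate. The sketch below (Python, purely illustrative) flips one high exponent bit in the IEEE-754 representation of a plausible tropical monthly mean and turns it into an astronomically large number:

```python
import struct

def flip_bit(x: float, bit: int) -> float:
    """Return x with one bit of its IEEE-754 double representation flipped."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    (flipped,) = struct.unpack("<d", struct.pack("<Q", bits ^ (1 << bit)))
    return flipped

temp = 29.5                # a plausible tropical monthly mean, deg C
bad = flip_bit(temp, 61)   # flip one high exponent bit
print(temp, "->", bad)     # 29.5 -> roughly 4e155
```

A flip in a low mantissa bit, by contrast, changes the value imperceptibly, which is why such corruption can also hide as a subtle error rather than a whopper.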

    • TerryS
      Posted Jul 5, 2009 at 2:22 AM | Permalink

      Re: Nicholas (#5),

      It could be memory or disk corruption on the machine running the model. Flip one bit in the floating point number somewhere during the output phase and you can get a crazy result like this.

      Wow, a comprehensive model simulation that takes into account the effect of cosmic rays on the system.

  5. Barclay E. MacDonald
    Posted Jul 5, 2009 at 12:16 AM | Permalink

    Are you telling us we can no longer rely on the Springfield Gazette for our climate science? Oh, its really just a computer error. Whew!

  6. Posted Jul 5, 2009 at 12:31 AM | Permalink

    Try smoothing that spike baby!

  7. Richard Henry Lee
    Posted Jul 5, 2009 at 12:47 AM | Permalink

    Maybe March 2106 is when mankind finally gets nuclear fusion to work on an industrial scale, but something very bad happened.

    BTW, I got the same temperature results using the same input data. I also noticed the Chinese characters on Steve’s Firefox browser. Is this a clue about future data mining?

  8. Bob Koss
    Posted Jul 5, 2009 at 1:13 AM | Permalink

    I cropped out a different grid area, hadcm3 sresa1b tas 5-15E 35-40N, and the erroneous value is still there. Also appears when masking land or sea.

  9. Dr Virtanen 2nd
    Posted Jul 5, 2009 at 2:31 AM | Permalink

    Just send enough information about the incident to replicate the problem to the programmer. That includes all of your inputs and selections and the above screen capture to document what happened. I assume that you can find someone who knows the programmer.

  10. benpal
    Posted Jul 5, 2009 at 2:52 AM | Permalink

    Blame it on the computer? You must be kidding. A computer with this type of error will likely crash with a blue screen before finishing the calculation.

  11. Demesure
    Posted Jul 5, 2009 at 3:06 AM | Permalink

    Rahmstorf could not have said “0.1E19 by 2020”. He would have, with “hindsight”, changed the manifold of his turbocharged filter to have a warming (compared to 2016), not a cooling.

  12. TAC
    Posted Jul 5, 2009 at 4:09 AM | Permalink

    Very funny!

    Also, in addition to being funny, it is a huge service to the Community to point out errors like this, because they may somewhat humble the modelers (who, like any parent, tend not to notice their “child’s” problems) and cause others to have a more realistic assessment of the models’ qualities.

    Finally, this error got caught because it’s a whopper; less dramatic errors are found only through careful analysis — which is what SteveM and ClimateAudit do so wonderfully.

    Thank you!

  13. Bart van Deenen
    Posted Jul 5, 2009 at 5:25 AM | Permalink

    This points out exactly the problem with all these climate models: they are unverifiable. Bugs get found only when the results look funny; otherwise they go undetected. Has anyone ever seen any “Software Verification and Validation Plan” for any of these models? I’ve had to write quite a few of them, and that was for projects of only some 5 million Euro. So where are they for the climate models?

  14. tesla
    Posted Jul 5, 2009 at 6:00 AM | Permalink

    Without knowing any of the details of the calculation this smells like a floating point arithmetic glitch (dividing by a very small number).

  15. Posted Jul 5, 2009 at 6:20 AM | Permalink

    Very likely it’s a variable BIG hardwired to 1.0E20. What I can’t figure out is the editing format. Typically, the Gw format can be used to handle this kind of situation. And looking at all the other numbers, w is something that has 4 decimal places. So it’s very strange, I think. Maybe a scaling factor has been used?

    All software, every piece from the pre-processing routines that handle ICs and BCs to the post-processing that does these kinds of data handling and display, is required to be Independently Verified. It is, after all, these latter post-processing routines that ultimately determine the numerical values that are the focus of all decisions.

  16. Michael Jankowski
    Posted Jul 5, 2009 at 7:26 AM | Permalink

    The lack of QA/QC is disheartening and shockingly poor.

    But I am more confused as to how the output suddenly returns back to normal. Usually when a model diverges to producing ridiculous values, it keeps producing nonsense. I would think with all of the feedback mechanisms and potential for “runaway greenhouse,” we should keep seeing nonsensical values.

    It is as if all of the conditions and parameters that exist at time i are not used in the calculations for time i+1. In the context of a supposedly sophisticated climate model, I don’t see how that could be the case.

  17. missingdata
    Posted Jul 5, 2009 at 7:29 AM | Permalink

    Before people start screaming conspiracy, has anyone thought to check what the missing data indicator actually is in the source file, or what the KNMI data explorer software would convert it to? I suspect it’s set to a large number such as this one and would indicate a failure to archive the timestep – nothing more, nothing less. There are probably numerous such examples. Given that the data is sensible before and after, it’s highly unlikely to be anything more complex.

    Steve: no one has used the word “conspiracy”, let alone screamed it. Please do not use this word at this site.

    I observed in the post that the error might well arise from KNMI software. However it is popular and useful software and the error interferes with making averages – that’s how I noticed it.

    As a matter of interest (I’m not familiar with numerical aspects of climate models), what plausible circumstances would result in March 2106 results being missing? It seems like an odd thing to go AWOL.

    • Posted Jul 5, 2009 at 8:00 AM | Permalink

      Re: missingdata (#19),

      Yet Another Naked Strawman (YANS):

      Before people start screaming conspiracy . . .

      A clear display of a massive misunderstanding of the basic nature of numerical solution methods:

      . . . a failure to archive the timestep – nothing more, nothing less.

      If this step failed, it was apparently corrected in a manner that does not involve the evolution of the solution by means of the discrete approximations of the PDEs and ODEs. It was re-set by an algebraic IF statement independent of the calculated state of the physical system.

      How many of these are acceptable? Do you fly on aircraft whose flight-control software exhibits this kind of behavior?

      Note that in my comment at 17 above, I assumed the code was written in (shudder) fortran. Fortran is my most favorite language for parsing character strings and GOTO is my most favorite construct 🙂

  18. Bob Koss
    Posted Jul 5, 2009 at 7:59 AM | Permalink

    The last entry in the matrix is -999.900 for Dec 2199 and seems to be what they are using for missing data.

    Why the digit 1 isn’t put to the left of the decimal seems very odd. 1.00E20 would be the usual way to write that figure.

    Maybe that is where they balanced the energy in the system. 😉

  19. Bob Koss
    Posted Jul 5, 2009 at 8:00 AM | Permalink

    1.00E+20 would be the usual way to write that figure.

  20. missingdata
    Posted Jul 5, 2009 at 8:20 AM | Permalink

    Archival is, as I understand it, independent of the underlying model – it’s simply a timestamped dump. No need to restart, and if a dump failed it wouldn’t likely be picked up instantaneously. It would have no impact on the model, which would carry on merrily on its way. And 1.E+20 is an obvious mdi setting.

  21. AnonyMoose
    Posted Jul 5, 2009 at 8:29 AM | Permalink

    I wonder if such values leaked into the wild, where they got used. Include that number in an average and there will be heating in the results.

  22. missingdata
    Posted Jul 5, 2009 at 8:38 AM | Permalink

    Addendum: Or the file could have become corrupted in the archive (generally to magnetic tape) at source or anywhere along the way.

    • TAG
      Posted Jul 5, 2009 at 8:49 AM | Permalink

      Re: missingdata (#25),

      Addendum: Or the file could have become corrupted in the archive (generally to magnetic tape) at source or anywhere along the way

      Error detection and correction was invented to handle issues like this. I wonder how banks which handle very large datasets address the problem.

      “Dear Mr. Smith:

      It has come to our attention that your current credit card balance is $1,000,000,000,000,000,000,000.00. The bank suggests that you reduce this balance to a manageable amount. You may contact our credit counselling service


      Your Bank Manager”

    • TerryS
      Posted Jul 5, 2009 at 9:03 AM | Permalink

      Re: missingdata (#25),

      Addendum: Or the file could have become corrupted in the archive (generally to magnetic tape) at source or anywhere along the way.

      When data is saved to magnetic tape there are checksums saved with the data in order to pick up errors (and sometimes correct them). For this to be corruption on a magnetic tape the corruption would have to occur in a minimum of 2 separate places and be so fortuitous as to not be picked up by any of the checksums. A highly improbable occurrence.
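TerryS’s point about checksums can be illustrated with any common error-detecting code; a minimal sketch using CRC-32 (tape formats use their own codes, so this is only an analogy):

```python
import zlib

record = b"2106  29.0254  29.3977  30.1693"
crc = zlib.crc32(record)  # checksum stored alongside the data

# Simulate tape corruption: flip a single bit in the stored record.
corrupted = bytes([record[0] ^ 0x01]) + record[1:]

print(crc == zlib.crc32(corrupted))  # False: even a one-bit flip is detected
```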

  23. missingdata
    Posted Jul 5, 2009 at 9:26 AM | Permalink

    It could have become corrupted after archiving and before retrieval. Magnetic tape isn’t exactly a robust medium. Pen and paper is still the best …

    • TerryS
      Posted Jul 5, 2009 at 12:07 PM | Permalink

      Re: missingdata (#28),

      It could have become corrupted after archiving and before retrieval. Magnetic tape isn’t exactly a robust medium. Pen and paper is still the best …

      They have been using magnetic tape to store data for over 50 years. Over that period they have managed to work out one or two methods to detect and let the user know the data is corrupt. The most common way of letting the user know is to stop reading it from the tape and say “data corrupt, retrieval aborted” or something similar.

    • Paul Penrose
      Posted Jul 5, 2009 at 12:46 PM | Permalink

      Re: missingdata (#28),
      OK, Now you are pulling our legs. It seemed to me that your previous messages were subtle little jokes, but this makes it obvious you are not serious. Now that you’ve had your fun, let’s get serious.

      I tend to agree with Dan Hughes. This is probably an error in the output conversion/formatting routine. It likely came across some condition that it was not programmed for and substituted this bogus value. Another possibility is that it did not properly initialize a local variable because of this condition. It seems much less likely that it’s in the KNMI software given that it’s just a data retrieval tool. You would expect blanks if it missed a value, or maybe asterisks.

  24. Hu McCulloch
    Posted Jul 5, 2009 at 9:31 AM | Permalink

    According to KNMI’s version of UKMO CM3, in March 2106, the tropics (20S-20N) will temporarily have an inhospitable temperature of 0.1E21.

    But is that °C or °F ??

    • MarkR
      Posted Jul 6, 2009 at 4:28 PM | Permalink

      Re: Hu McCulloch (#29), No, no, no! The degrees used are Mannian. They are time sensitive, so they are always colder in the past, and warmer in the future, and because of Teleconnection, one can measure them anywhere in the world from atop a Bristlecone Pine on Sheep Mountain, and still be at Starbucks in time for Tea.

  25. tesla
    Posted Jul 5, 2009 at 9:48 AM | Permalink

    I have on occasion seen output like this when doing polynomial fits and matrix inversions. If you get an ill-conditioned polynomial / inverse all hell can break loose and it’s not necessarily obvious ahead of time that it will happen.

  26. Posted Jul 5, 2009 at 9:58 AM | Permalink

    Or, although it is not a number calculated by the underlying fundamental equations, it could be a number produced by a GCM and thus not an artifact of any pre- or post-processing.

    For me, that is the real problem. And I consider it to be a very serious problem, one that very likely has been present in the GCM for nobody knows how long. The type of problem that should never have existed long enough for its first discovery to be the subject of a blog post by an outsider interested only in understanding the data produced by GCMs. The problem cannot be dismissed by hand-waving speculation about the thousands of potential sources for the number; its source is required to be determined.

    Very likely, KNMI merely reads the presented data and passes the info along to whoever requested the information. KNMI should have Verified that the software that they use cannot introduce such artifacts.

  27. Posted Jul 5, 2009 at 10:01 AM | Permalink

    I’ll wager that the number was not calculated as a result of any numerical operation. It was instead set as a result of a comparison test.

  28. Steve McIntyre
    Posted Jul 5, 2009 at 10:06 AM | Permalink

    March 2106 tos and pr are both shown as -999.90. So something’s going on with tos and pr as well as tas in March 2106. Whatever it is seems to affect more than one number.

  29. Kenneth Fritsch
    Posted Jul 5, 2009 at 10:46 AM | Permalink

    “Dear Mr. Smith:

    It has come to our attention that your current credit card balance is $1,000,000,000,000,000,000,000.00. The bank suggests that you reduce this balance to a manageable amount. You may contact our credit counselling service


    Your Bank Manager”

    Just to show that these glitches are contextual, change that customer to Uncle Sam and date it July 4 and we have something more believable.

  30. Bill Drissel
    Posted Jul 5, 2009 at 11:04 AM | Permalink

    Haven’t these “programmers” heard of sanity checks? Where my output is likely to be used downstream, I sanity check all inputs and outputs. If an output is out of range, I fill the field with asterisks so that any attempt to use the field will crash … also makes visual appearance of the field stand out.
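Bill Drissel’s asterisk convention might look like this in practice. A hypothetical sketch; the field width and range limits are invented for illustration:

```python
def format_temp(value_c: float, lo: float = -100.0, hi: float = 70.0,
                width: int = 9) -> str:
    """Range-check an output field before writing it.

    Out-of-range values become asterisks, so any downstream attempt to
    parse the field as a number fails loudly instead of propagating
    nonsense, and the field stands out visually.
    """
    if not (lo <= value_c <= hi):
        return "*" * width
    return f"{value_c:{width}.4f}"

print(format_temp(29.5308))   # '  29.5308'
print(format_temp(1.0e20))    # '*********'
```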

  31. Posted Jul 5, 2009 at 11:20 AM | Permalink

    I find it hard to believe that anyone that would dare call him/herself a scientist upon reaching a data figure of …..

    100,000,000,000,000,000,000 F or C

    wouldn’t immediately re-evaluate their data model based upon past known data.

    snip – please do not editorialize like this

  32. tty
    Posted Jul 5, 2009 at 11:56 AM | Permalink

    Probably somebody has forgotten the eleventh commandment:

    Thou shalt not divide by zero.

  33. Craig Loehle
    Posted Jul 5, 2009 at 12:02 PM | Permalink

    It is nice to get messages from the future. I get spam sometimes dated 2038 for some reason, for example. I won’t be outside on that date in March 2106, and I’m buying lots of ice the day before.

  34. Chad
    Posted Jul 5, 2009 at 12:56 PM | Permalink

    Did anyone bother to check the metadata?

    float tas(time,lat,lon), shape = [2399 73 96]
    tas:standard_name = “air_temperature”
    tas:long_name = “Surface Air Temperature”
    tas:units = “K”
    tas:cell_methods = “time: mean”
    tas:coordinates = “height”
    tas:_FillValue = 100000002004087730000.000000 f
    tas:missing_value = 100000002004087730000.000000 f
    tas:history = ” At 12:31:57 on 01/12/2005: CMOR altered the data in the following ways: replaced missing value flag (-1.07374E+09) with standard missing value (1.00000E+20); Dimension order was changed; lat dimension direction was reversed;”

    I processed the data myself and for 2106 found (in °C)
    2106 29.0254 29.3977 NaN 30.1693 30.1227 29.6709 29.3102 29.2326 29.4684 29.5792 29.2292 28.5928

    When I looked at the data for the timestep corresponding to the third month it’s all NaNs. The Matlab function I use automatically converts the 1.00000E+20 flag to NaN. Mystery partially solved. It’s KNMI’s fault for not changing it to the usual -999.9.
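Chad’s finding suggests the practical fix on the user side: treat the 1.00000E+20 flag as missing before averaging. A small sketch in plain Python; the comparison tolerance is an assumption, needed because the flag is stored as the float 1.00000002E+20:

```python
import math

FILL = 1.0e20  # CMOR/IPCC AR4 standard missing-value flag (tas:missing_value)

def mask_missing(series, fill=FILL, rel_tol=1e-6):
    """Replace fill-flag entries with NaN so they cannot contaminate averages."""
    return [float("nan") if math.isclose(x, fill, rel_tol=rel_tol) else x
            for x in series]

def mean_ignoring_nan(series):
    vals = [x for x in series if not math.isnan(x)]
    return sum(vals) / len(vals)

# Jan, Feb, the flagged March 2106, and Apr from the table above:
months = [29.0254, 29.3977, 1.00000002e20, 30.1693]
print(round(mean_ignoring_nan(mask_missing(months)), 4))  # 29.5308
```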

  35. Gary Strand
    Posted Jul 5, 2009 at 1:36 PM | Permalink

    The IPCC AR4-defined value for missing data is 1.e20, ergo, that value for that gridbox for that month. It’s possible that the data for that month was missing and unable to be regenerated during the original model run, so that it was filled with the defined missing value.

    No need for other, more exotic, explanations. Something mundane that, while not common with climate models, is not unheard-of.

    • Steve McIntyre
      Posted Jul 5, 2009 at 2:25 PM | Permalink

      Re: Gary Strand (#44),

      Gary, as Chad notes above, that’s only a “partial” explanation. You haven’t explained why March 2106 went AWOL – other than saying that months sometimes go AWOL in climate models.

      For those of us (including myself) who were previously unaware of this phenomenon, can you explain why this happens and where in the literature it is reported.

    • Michael Jankowski
      Posted Jul 5, 2009 at 4:02 PM | Permalink

      Re: Gary Strand (#44), why exactly is there data missing in the first place? Why didn’t anyone seem to notice? It seems to stand-out.

  36. Thor
    Posted Jul 5, 2009 at 2:44 PM | Permalink

    I tried exporting these data to netCDF instead of a standard text file, and used a netCDF file browser (ncBrowse) to convert to CDL format (which I believe is readable netCDF). I got the results below for a subset including 2106. The extreme temperature jumps to 1e+21 etc. are not here. Could it be a file conversion issue at KNMI?

    netcdf tas {
    time = 180 ;
    float time(time);
    time:units = “months since 1400-01-15”;
    float tas(time);
    tas:long_name = “hadcm3 sresa1b tas 0-360E -20-20N”;

    time = 8400.0, 8401.0, 8402.0, 8403.0, 8404.0, 8405.0, 8406.0, 8407.0,
    8408.0, 8409.0, 8410.0, 8411.0, 8412.0, 8413.0, 8414.0, 8415.0, 8416.0,
    8417.0, 8418.0, 8419.0, 8420.0, 8421.0, 8422.0, 8423.0, 8424.0, 8425.0,
    8426.0, 8427.0, 8428.0, 8429.0, 8430.0, 8431.0, 8432.0, 8433.0, 8434.0,
    8435.0, 8436.0, 8437.0, 8438.0, 8439.0, 8440.0, 8441.0, 8442.0, 8443.0,
    8444.0, 8445.0, 8446.0, 8447.0, 8448.0, 8449.0, 8450.0, 8451.0, 8452.0,
    8453.0, 8454.0, 8455.0, 8456.0, 8457.0, 8458.0, 8459.0, 8460.0, 8461.0,
    8462.0, 8463.0, 8464.0, 8465.0, 8466.0, 8467.0, 8468.0, 8469.0, 8470.0,
    8471.0, 8472.0, 8473.0, 8475.0, 8476.0, 8477.0, 8478.0, 8479.0, 8480.0,
    8481.0, 8482.0, 8483.0, 8484.0, 8485.0, 8486.0, 8487.0, 8488.0, 8489.0,
    8490.0, 8491.0, 8492.0, 8493.0, 8494.0, 8495.0, 8496.0, 8497.0, 8498.0,
    8499.0, 8500.0, 8501.0, 8502.0, 8503.0, 8504.0, 8505.0, 8506.0, 8507.0,
    8508.0, 8509.0, 8510.0, 8511.0, 8512.0, 8513.0, 8514.0, 8515.0, 8516.0,
    8517.0, 8518.0, 8519.0, 8520.0, 8521.0, 8522.0, 8523.0, 8524.0, 8525.0,
    8526.0, 8527.0, 8528.0, 8529.0, 8530.0, 8531.0, 8532.0, 8533.0, 8534.0,
    8535.0, 8536.0, 8537.0, 8538.0, 8539.0, 8540.0, 8541.0, 8542.0, 8543.0,
    8544.0, 8545.0, 8546.0, 8547.0, 8548.0, 8549.0, 8550.0, 8551.0, 8552.0,
    8553.0, 8554.0, 8555.0, 8556.0, 8557.0, 8558.0, 8559.0, 8560.0, 8561.0,
    8562.0, 8563.0, 8564.0, 8565.0, 8566.0, 8567.0, 8568.0, 8569.0, 8570.0,
    8571.0, 8572.0, 8573.0, 8574.0, 8575.0, 8576.0, 8577.0, 8578.0, 8579.0,
    8580.0 ;

    tas = 28.7589, 29.0695, 29.8826, 30.2045, 30.2281, 29.9681, 29.5614,
    29.7699, 29.9789, 29.9171, 29.7522, 29.1971, 28.9005, 29.0066, 29.485,
    29.7662, 29.7117, 29.263, 28.9865, 29.0127, 29.3104, 29.5285, 29.1739,
    28.5424, 28.4209, 28.678, 29.1165, 29.4769, 29.1823, 28.9475, 28.4944,
    28.7312, 29.0313, 29.1388, 28.889, 28.5816, 28.3492, 28.8771, 29.1225,
    29.5649, 29.4781, 29.1362, 28.7393, 29.0191, 29.2482, 29.4283, 29.1953,
    28.7486, 28.6128, 28.9474, 29.6259, 29.8681, 29.7258, 29.4725, 29.0857,
    29.0743, 29.2954, 29.5012, 29.3354, 28.9804, 28.7809, 29.3003, 29.8724,
    30.1412, 30.0722, 29.9423, 29.6167, 29.5947, 29.8137, 29.8554, 29.6346,
    29.2097, 29.0255, 29.398, 30.1691, 30.1227, 29.6709, 29.3103, 29.2325,
    29.4683, 29.5795, 29.2292, 28.593, 28.5018, 28.9702, 29.4003, 29.8085,
    29.6341, 29.3541, 28.9181, 29.0198, 29.2759, 29.5296, 29.3058, 28.7724,
    28.8105, 28.9779, 29.3562, 29.6838, 29.5977, 29.2273, 28.8419, 28.8621,
    29.1122, 29.0973, 28.9226, 28.6669, 28.4789, 28.9636, 29.5978, 29.9213,
    29.8095, 29.4322, 29.2451, 29.2812, 29.6423, 29.7195, 29.2798, 28.8758,
    28.8383, 29.2798, 29.7481, 30.1849, 30.121, 29.7077, 29.2961, 29.2966,
    29.6336, 29.636, 29.3528, 29.0814, 28.6197, 28.8398, 29.4277, 29.7108,
    29.7051, 29.3595, 29.1733, 29.1383, 29.4418, 29.6465, 29.268, 28.9194,
    28.6745, 28.9525, 29.3797, 29.7137, 29.6329, 28.8982, 28.5125, 28.6018,
    28.8697, 28.993, 28.8285, 28.211, 28.1993, 28.7224, 29.0793, 29.3585,
    29.2739, 28.9667, 28.6687, 28.8935, 29.2633, 29.3799, 29.154, 28.8023,
    28.7828, 29.2663, 29.6248, 29.9254, 29.7821, 29.3741, 28.8474, 28.8483,
    29.2046, 29.2339, 28.924, 28.6165, 28.3427 ;


  37. Thor
    Posted Jul 5, 2009 at 2:58 PM | Permalink

    … and just noticed that month 8474 is missing. I guess the question is still – why?

    Perhaps some algorithm is assuming a certain number of days per month and got the leap year calculations wrong – after all, 2100 is NOT a leap year. And thus, two values landed in March and none in February, and only one was kept. Not a totally unthinkable explanation. But still a wild guess, of course 😀
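Thor’s two observations are easy to check: decoding the “months since 1400-01” index, and the Gregorian century rule he mentions. A quick sketch, assuming the index simply counts whole months from January 1400:

```python
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

def decode_month(index: int, base_year: int = 1400):
    """Convert a 'months since base_year-01' index to (year, month name)."""
    years, month = divmod(index, 12)
    return base_year + years, MONTHS[month]

def is_leap(year: int) -> bool:
    """Gregorian rule: divisible by 4, except centuries not divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

print(decode_month(8474))  # (2106, 'Mar'), the missing timestep
print(is_leap(2100))       # False: a century year not divisible by 400
```

Decoding index 8474 does land on March 2106, consistent with the gap in the netCDF time axis above.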

  38. Gary Strand
    Posted Jul 5, 2009 at 4:59 PM | Permalink

    There are any number of ways an output file from a climate model could have a missing value. I’ve experienced a tape failure causing a loss of data because the file in question didn’t have a backup copy. It’s easy to require backup copies, but that can double your archival storage charge, and at NCAR specifically, storage charges are counted against one’s overall computing allocation, so the more you store (and dual copies consume twice the charges) the less you can compute. There’s also a time factor – the longer you store a file, the more it costs.

    I’ve also had to substitute missing values for months for which the months in question were missing from the model run – usually because the model stopped before that point, and it wasn’t possible to regenerate the missing data. If I’m missing (say) Dec 2099 from a run, and either the compiler, OS, or hardware used to generate the data aren’t available any more, I fill in that data with all missing values.

    Very occasionally a file is corrupted at some point, either during the write to disk from memory or from disk to tape, and I try to repair it, if the above regeneration isn’t possible. Sometimes this is rather difficult.

    One issue I’ve seen mentioned is QC – one problem with exhaustive QC is that it itself is an I/O intensive process, and when running on a “supercomputer”, you don’t want to use up too much time on simple I/O, instead of calculation. There are ways to use other, less expensive machines for QC, but that entails a rather complicated setup to examine files as the model is running, to avoid archival access charges. Additionally, climate models typically dump out hundreds of fields every month (and sometimes daily and subdaily time resolution), so over the course of a 100-year run, you could have hundreds of thousands of fields to examine. Certainly, one can easily check for some ranges of values for some fields, but you don’t want to falsely flag some field at some timestep as wrong, when it could possibly have a valid value.

    I agree that the software that comprises climate models could use a fair bit of work to make it meet higher standards – but software engineering isn’t the “sexy” side of climate science – the science is. Plus, the pay ain’t that great, in comparison to private industry, and the skill set (Fortran, MPI, netCDF/HDF5, etc.) isn’t as common as it used to be. We do the best we can with the resources available.

    • Gary B
      Posted Jul 5, 2009 at 6:05 PM | Permalink

      Re: Gary Strand (#49),
      All very reasonable explanations for what can happen, but in the end it’s simply inadequate justification, given that the global economy and human life are at risk from decisions based largely on the model output. It’s not the programmers’ fault that they’re forced by circumstance to deliver inexact results despite their best efforts. It is the responsibility of the project managers to make the situation better, or to explain that results are less certain than what is assumed by the policy-makers and public.

      That said, thank you for helping to clarify the causes of these weird values.

    • Richard Henry Lee
      Posted Jul 5, 2009 at 7:41 PM | Permalink

      Re: Gary Strand (#49),

      Thanks for the exposition about the lack of QC in the climate models since, as you say, it is not as sexy as the science. But the lack of software quality control or quality assurance is troubling since these models are responsible for driving much of the governmental decision-making around the world.
      The problem seems to be institutional. Academic institutions do not have a culture of requiring QA/QC programs for their studies, yet we now rely on these same studies to make far-reaching policy decisions.

  39. Gary Strand
    Posted Jul 5, 2009 at 5:09 PM | Permalink

    I can confirm that March 2106 for SRESA1B, run1, model UKMO CM3, is set to all missing values. The global attributes for the original netCDF file located at PCMDI are:

    // global attributes:
    :title = “Met Office model output prepared for IPCC Fourth Assessment 720 ppm stabilization e
    xperiment (SRES A1B)” ;
    :institution = “Met Office (Exeter, Devon, EX1 3PB, UK)” ;
    :source = “HadCM3 (1998): atmosphere: (2.5 x 3.75); ocean: (1.25 x 1.25); sea ice: ; land: MOS
    ES1” ;
    :contact = “,” ;
    :project_id = “IPCC Fourth Assessment” ;
    :table_id = “Table A1 (17 November 2004)” ;
    :experiment_id = “720 ppm stabilization experiment (SRES A1B)” ;
    :realization = 1 ;
    :cmor_version = 0.94f ;
    :Conventions = “CF-1.0″ ;
    :history = ” At 12:31:57 on 01/12/2005, CMOR rewrote data to comply with CF standards and IPC
    C Fourth Assessment requirements” ;
    :references = “Gordon, C., C. Cooper, C.A. Senior, H.T. Banks, J.M. Gregory, T.C. Johns, J.F.B
    . Mitchell and R.A. Wood, 2000. The simulation of SST, sea ice extents and ocean heat transports in a version
    of the Hadley Centre coupled model without flux adjustments. Clim. Dyn., 16, 147-168. Johns, T.C., R.E. Carnel
    l, J.F. Crossley, J.M. Gregory, J.F.B. Mitchell, C.A. Senior, S.F.B. Tett and R.A. Wood, 1997. The Second Hadl
    ey Centre Coupled Ocean-Atmosphere GCM: Model Description, Spinup and Validation. Clim. Dyn. 13, 103-134.” ;

  40. Gary Strand
    Posted Jul 5, 2009 at 5:26 PM | Permalink

    It appears the UKMO CM3 model has a 360 day year, i.e., 12 x 30-day months.

    • Posted Jul 6, 2009 at 10:05 AM | Permalink

      Re: Gary Strand (#51),

      It appears the UKMO CM3 model has a 360 day year, i.e., 12 x 30-day months.

      Out of curiosity, do they make days a little longer to match the total number of seconds in a year? Or is a model year 98% the length of an Earth year?

      • Gary Strand
        Posted Jul 7, 2009 at 7:34 AM | Permalink

        Re: lucia (#86),

        Out of curiosity, do they make days a little longer to match the total number of seconds in a year? Or is a model year 98% the length of an Earth year?

        I don’t know exactly how their calendar works.

  41. Dishman
    Posted Jul 5, 2009 at 5:54 PM | Permalink


    For comparison, that’s about 15 times the temperature of the Large Hadron Collider using lead nuclei (1200 times LHC using protons). If the air in a region hundreds of miles across had that temperature for an instant, it would be in the nova range. Kaboom.

    More seriously, this is another sad reflection of the overall commitment to quality exhibited throughout “climate change” science.

  42. Leon Palmer
    Posted Jul 5, 2009 at 6:01 PM | Permalink

    I found this interesting recent news item on models vs reality at

    Think of the billions Boeing has spent on their models, and they still don’t have it right!

    “Specifically, Boeing found that portions of the airframe — those where the top of the wings join the fuselage — experienced greater strain than computer models had predicted. Boeing could take months to fix the 787 design, run more ground tests and adjust computer models to better reflect reality.”

    If realclimatescientists ran Boeing, they would defend the “models”, build the plane, kill thousands, bankrupt Boeing and blame Steve!

  43. Dishman
    Posted Jul 5, 2009 at 6:35 PM | Permalink

    Gary Strand wrote:

    I agree that the software that comprises climate models could use a fair bit of work to make it meet higher standards – but software engineering isn’t the “sexy” side of climate science – the science is. Plus, the pay ain’t that great, in comparison to private industry, and the skill set (Fortran, MPI, netCDF/HDF5, etc.) isn’t as common as it used to be. We do the best we can with the resources available.

    I think you’re looking at a false economy here.

    What’s the impact of using something that is later invalidated?

    Suppose some critical tool you used is found to be critically flawed, such that it produces incorrect results. At that point, all the results you paid for are potentially worthless.

    It’s building on a foundation of sand.

    NASA has an excellent standard for quality, NASA-STD-8739.8

    From 8739.8, Table A-3, footnote:

    Potential for waste of resource investment: This is a measure or projection of the effort (in
    work-years: civil service, contractor, and other) invested in the software. The measure of effort
    includes all software life cycle phases (e.g., planning, design, maintenance). This shows the level
    of effort that could potentially be wasted if the software does not meet requirements.

    I’ll note here that as of August 2008, NASA/GISS has not produced a “Software Assurance Classification Report” for either GISTEMP or GCM Model E (GSFC FOIA #08-109, no responsive documents). In other words, they haven’t even evaluated the impact of errors.

    The question I think you should be asking yourself is “What could possibly go wrong?”

    It beats being afraid of having your work burned in front of you, and it definitely beats that actually happening.

    • Geoff Sherrington
      Posted Jul 6, 2009 at 3:24 AM | Permalink

      Re: Dishman (#55),

      The old story of the announcement on board new aircraft “This is a recording. This aircraft is being flown completely by computer. Be relaxed. Nothing can possibly go wrong … go wrong … go wrong … go wrong”

  44. Gary Strand
    Posted Jul 5, 2009 at 6:43 PM | Permalink

    As a software engineer, I know that climate model software doesn’t meet the best standards available. We’ve made quite a lot of progress, but we’ve still quite a ways to go.

    If we can convince funding agencies to better-fund software development, and continued training, then we’ll be on our way. It’s a little harsh, IMHO, to assign blame to software engineers when they’re underpaid and overworked.

    • PhilH
      Posted Jul 5, 2009 at 7:19 PM | Permalink

      Re: Gary Strand (#56), Blaming software engineers is not really the problem here. The problem is that the politicians are asking us, no, ordering us, to bet on the results of these models that are the equivalent of Boeing’s flawed 787. These flaws rarely, if ever, float to the surface of ordinary attention.

    • Steve McIntyre
      Posted Jul 5, 2009 at 7:44 PM | Permalink

      Re: Gary Strand (#56),

      Gary, if this is what you think, then this should have been reported in IPCC AR4 so that politicians could advise themselves accordingly. I do not recall seeing any such comment in AR4 – nor for that matter in any review comments.

    • Steve McIntyre
      Posted Jul 5, 2009 at 7:49 PM | Permalink

      Re: Gary Strand (#56),

      If we can convince funding agencies to better-fund software development, and continued training, then we’ll be on our way. It’s a little harsh, IMHO, to assign blame to software engineers when they’re underpaid and overworked.

      Boo-hoo. Hundreds of millions of dollars, if not billions, are being spent. Perhaps the money should be budgeted differently, but IMO there’s an ample amount of overall funding to have adequate software engineers. Maybe there should be some consolidation in the climate model industry, as in the auto industry. If none of the models have adequate software engineering, then how about voluntarily shutting down one of the models and suggesting that the resources be redeployed so that the better models are enhanced?

      • Gary Strand
        Posted Jul 5, 2009 at 8:14 PM | Permalink

        Re: Steve McIntyre (#62), Hundreds of millions of dollars, if not billions, are being spent.

        Billions? Not even close. The total budget devoted to climate modeling worldwide is under $100 million, IMHO. I have no strong evidence of that, but based on what I know of NCAR’s budget (a small fraction of which goes to CCSM), the global total is quite modest. Climate modeling is only a small piece of the money spent on climate science; the costs of the science are inflated by things like satellites, which aren’t cheap.

        Perhaps the money should be budgeted differently, but IMO there’s an ample amount of overall funding to have adequate software engineers. Maybe there should be some consolidation in the climate model industry, as in the auto industry. If none of the models have adequate software engineering, then how about voluntarily shutting down one of the models and suggesting that the resources be redeployed so that the better models are enhanced?

        Like I said, software engineering isn’t what climate modeling is about – it’s about the science. Yes, the science would be better if the software was better, but scientists don’t do software, they do science.

        PCM is an example of an “expired” climate model. Consolidation across national boundaries would be impossible, IMHO, and across institutions would be a huge task. Part of the reason is that really fast computers are now within reach of rather modest budgets – the days of $35-$40 million Crays are long gone. One can do decent modeling on a machine that costs much much less. Besides, competition is a spur to improvement.

  45. Dishman
    Posted Jul 5, 2009 at 7:05 PM | Permalink


    I regard this as mostly management decisions. In the case of NASA/GISS, it appears to me that the director of that center has decided to ignore NASA standards.

    Software engineers do have the option to make clear to their management that failure to apply quality standards can result in invalidation (loss) of both their work and all derived work. That’s a really big hammer to management if you choose to take it up.

  46. Gary Strand
    Posted Jul 5, 2009 at 7:15 PM | Permalink

    Whether or not SEs are taken seriously depends on the management and the management’s “ideology”.

  47. Dishman
    Posted Jul 5, 2009 at 8:02 PM | Permalink

    Steve, in Gary’s defense:

    Whether or not SEs are taken seriously depends on the management and the management’s “ideology”.

    I think Gary understands your frustration.

    • Steve McIntyre
      Posted Jul 5, 2009 at 8:11 PM | Permalink

      Re: Dishman (#63),

      Gary has spent time at blogs criticizing viewpoints expressed here (and I welcome the exchange – don’t get me wrong on this).

      All I’m saying is that IPCC Review Comments were an opportunity to put this sort of comment on the record that IMO he could have taken advantage of.

      I don’t necessarily fault him for not doing so – he might not have thought of that as a venue. In which case, he could say – “I hadn’t thought of doing that, but it’s a good idea. I’ll say so in Review Comments for AR5.” Unless he’s exhausted such venues, I don’t have much sympathy for the complaints.

      I made Review Comments for AR5 with almost total certainty that they would be ignored and that I’d get cross with the supercilious author responses. All of which happened according to expectation. But at least no one could say – well, you didn’t bother commenting, how could we have read your mind?

  48. Gary Strand
    Posted Jul 5, 2009 at 8:14 PM | Permalink

    Whoops, shoulda closed the HTML tag in #65. My apologies.

  49. Dishman
    Posted Jul 5, 2009 at 8:57 PM | Permalink


    I’m a software engineer. As you’ve noted, the software engineering portion of the work isn’t really solid.

    Steve’s an econometrician. He’s noted that the portions relating to his expertise are not up to standards.

    It seems that wherever experienced people poke, the workmanship is not up to standards, and worse, there seems to be an effort to conceal the quality of the workmanship.

    This does not inspire confidence.

    There seems to be a general theme of ideological motivation, particularly on the part of management.

    You see the symptoms within your focus. I see some of the same symptoms as well, though as an outsider.

    Unfortunately, we’re seeing related symptoms from various places and roles in the organizations. That would seem to indicate the problem is systemic.

    • Richard Henry Lee
      Posted Jul 5, 2009 at 9:37 PM | Permalink

      Re: Dishman (#67),
      Steve is a mining engineer. Ross McKitrick is the econometrician who co-authored some papers with Steve.

      Steve: Nor am I a “mining engineer”.

      • Richard Henry Lee
        Posted Jul 5, 2009 at 9:53 PM | Permalink

        Re: Richard Henry Lee (#70),
        I stand corrected. You have been heavily involved in various aspects of mining and mineral exploration but not as an engineer. I notice that your BS degree is in mathematics. In any case, I prefer to think of you as a Renaissance man who defies labels.

  50. Shallow Climate
    Posted Jul 5, 2009 at 9:01 PM | Permalink

    Well now, I don’t really know what all you guys are talking about: I’m still so hung up in total awe over Rahmstorf’s latest and greatest, “the most recent and most sophisticated parallel-universe embedded canonical manifold multismoothing”, that I can’t get to this thingie you-all are talking about, which, apparently, occurs later in the post.

  51. Plimple
    Posted Jul 5, 2009 at 9:07 PM | Permalink


    You make a good point about satellites. Much of the current budget spent on climate science is directed towards satellites. Take Aura, Aqua and Terra, for instance: 3 satellites, approx. 12 instruments at $100-300 million a pop, and you’ve got over a billion dollars of hardware flying around in space. That hardware is only going to last 5-15 years.

  52. James Lane
    Posted Jul 5, 2009 at 9:37 PM | Permalink

    I also have some sympathy for Gary Strand’s POV on this issue, and good on him for acknowledging that the software QC is not what it should be for the climate models.

    That said, the situation is highly unsatisfactory, and as Dishman suggests, is likely systemic.

  53. Jaye Bass
    Posted Jul 5, 2009 at 10:14 PM | Permalink

    Hmm, this sounds familiar. I’ve worked with engineers over the years whose only product and means of analysis was software, yet the code itself was sloppy and inefficient. Any attempts at fixing it by software professionals was sneered at by the engineers. Sometimes I think SME’s don’t get that software, in general, is just as complex and difficult to get right as the actual field of endeavor for which the software was written. In many cases the software is actually harder to do than the equations were to derive.

  54. Gary Strand
    Posted Jul 5, 2009 at 10:21 PM | Permalink

    Let me be clear – I don’t believe the CCSM code is a big mess – far from it. Incredible progress has been made in bringing up its coding standards, testing, version control, performance, portability and so on. There are many very capable people working on it. In a collaborative project involving the input of literally dozens of people, both inside and outside NCAR, it’s not an easy task to properly manage it. Last I heard, the code itself was roughly a million lines long.

    • Steve McIntyre
      Posted Jul 6, 2009 at 5:05 AM | Permalink

      Re: Gary Strand (#74),

      Gary, the programming issues are different in the statistical proxy studies that have been the main topic at this site. In such studies, you do not have “production” programming, but one-off statistical analyses where the “science” is inseparable from the script. Yes, there are people who collect original data, but the analyses that led to this site (Mann, Jones, Briffa, Hansen …) are done by people who didn’t collect the data that they are analysing.

      As someone who’s spent a lot of time analysing these articles, it is my opinion that muddy scripts are typically associated with muddy “science”. For example, the Mann 2008 script is not simply poorly documented – it’s poorly organized, horrendously repetitive and IMO shows an almost complete lack of insight into the phenomena being analysed.

  55. GaryC
    Posted Jul 6, 2009 at 12:00 AM | Permalink

    I remember listening to a briefing in the mid 1980s about the difficulty in making the software for the Strategic Defense Initiative (SDI or Star Wars) work reliably. The speaker, whose name I have forgotten, said:

    “Some people tell me that they do not know how to make the SDI software system work.

    Others tell me that they do.

    I believe both groups.”

  56. steven mosher
    Posted Jul 6, 2009 at 1:38 AM | Permalink

    74. A million lines is mouse nuts. Android is about 11 million.

  57. Steve McIntyre
    Posted Jul 6, 2009 at 5:09 AM | Permalink

    PLEASE – limit generalized complaining about models. Editorially, generalized complaining gets tiresome, regardless of the justification.

    I’m OK and welcome critical comments about particular details, but let’s stick to details rather than piling on.

    • jeez
      Posted Jul 6, 2009 at 2:03 PM | Permalink

      Re: Steve McIntyre (#79),

      I think you mean I’m ok with. Or should we have been worried about your health Steve?

  58. Robinson
    Posted Jul 6, 2009 at 8:53 AM | Permalink

    As a Software Developer, I feel I have some expertise in this area (in a general sense). It’s certainly true that “muddy” scripts/software will lead to muddy results. Indeed, muddy software is almost always associated with a lack of requirements gathering and piling in new features that weren’t anticipated in the original spec. Given that these models were started perhaps decades ago, I would be surprised if they were anywhere close to clean and tidy today. One gets diminishing returns from maintenance efforts, without a total re-write, over time (entropy always increases).

    This leads me to ask whether the source is available online for these GCMs. Statistical practices aren’t the only things that may benefit from an audit.

  59. counters
    Posted Jul 6, 2009 at 9:21 AM | Permalink


    Yes, the source is very readily available. You can find all the source (in downloadable archives or in a browser-friendly viewer) for the NCAR CCSM3 (the model I presume Mr. Strand most closely works with) here. You can also find documentation and input files which, if you really wanted to, could be used to port the actual model to run on your own system. It’s not that hard to do – I ported the model to run on a machine at my University last semester. You can find the source for most models very easily by googling for the model name (GISS ModelE can be found here, for example).

    I understand the frustration with poorly documented scripts and poor archiving practices that are discussed here. But please don’t jump to sweeping generalizations. For my own research and analysis, I break with the climate science tradition and utilize newer programming tools like Python/Matplotlib/RPy/SciPy, and utilize source version control (in the Python spirit I use Mercurial) to archive and record my work. Although I can work with netCDF directly via CDO/NCO, I prefer using higher-level tools for my analysis. Also, when I need to modify the CCSM3 (the research group I work with utilizes it almost exclusively), I work directly in Fortran, although I admit it can be incredibly frustrating for someone weaned on Scheme/Java and O-O programming.

    The people who are piling on about Gary’s ‘admission’ about the software engineering side of climate modeling should check out the CCSM3 source linked above before jumping to conclusions about any implications on the science performed with such tools. You should keep in mind that the models are in a state of constant flux, with many researchers heavily modifying or adding code to the model to perform individual experiments. The piling-on about ‘quality control’ misses the point that we have a dozen+ modeling groups around the world with unique models that all point towards the same conclusions. It would be great if some independent group came in and dove into the code of a model such as the CCSM3 and extensively looked for ways of optimizing it and improving it. But the reality is that there isn’t a great deal of money available to support more than a small number of core workers to perform this task. Then again, the source is readily available, so there’s nothing stopping any of the software engineers here from taking on that task voluntarily in the spirit of science.
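
    [Ed.: The anomalous 0.1E21 value that prompted this post is exactly the kind of thing a basic range check on downloaded output would catch before analysis. A minimal sketch in Python; the series and thresholds are hypothetical, and a real check would read the data via netCDF tooling such as the CDO/NCO or Python stacks counters mentions above.]

    ```python
    # Scan a monthly surface-temperature series (Kelvin) for physically
    # impossible values, such as the 0.1E21 reading in the UKMO CM3 download.

    def find_impossible(series, low=150.0, high=350.0):
        """Return (index, value) pairs outside a plausible Kelvin range.
        The bounds are illustrative, not a published QC standard."""
        return [(i, v) for i, v in enumerate(series) if not (low <= v <= high)]

    # Hypothetical tropical-mean series with one corrupted month:
    monthly = [299.1, 299.4, 1.0e20, 299.8]
    print(find_impossible(monthly))  # [(2, 1e+20)]
    ```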

    • Steve McIntyre
      Posted Jul 6, 2009 at 9:55 AM | Permalink

      Re: counters (#81),

      Again I re-iterate that in other areas of climate science, requests for data and scripts are routinely denied. Phil Jones has even refused FOi requests for something as mundane as station data. Lonnie Thompson has refused to provide ice core data.

      When scripts have become available under unusual circumstances e.g. the Mann controversy, they evidence muddy thinking, not merely poor documentation, and sometimes even outright errors.

    • Steve McIntyre
      Posted Jul 6, 2009 at 9:58 AM | Permalink

      Re: counters (#81),

      I’ve tried to discourage piling on criticism, but equally let’s not get carried overboard with self-congratulation in this thread.

      I still would like to know why March 2106 went AWOL and why no one seems to care.

      • Michael Jankowski
        Posted Jul 6, 2009 at 10:06 AM | Permalink

        Re: Steve McIntyre (#84), “no one seems to care” because, as counters said, “we have a dozen+ modeling groups around the world with unique models that all point towards the same conclusions” (i.e., the errors don’t matter because other ‘independent’ studies get similar results).

        As you surmised earlier, “it doesn’t matter.”

  60. Gary Strand
    Posted Jul 6, 2009 at 9:36 AM | Permalink

    Thanks for your comments, counters.

  61. Posted Jul 6, 2009 at 10:00 AM | Permalink

    For computer software the calculated results of which guide decisions that affect the health and safety of the public, there are no valid rationalizations; none.

    If problems exist, all of them must be fixed; whatever the problems and the alleged source of the problems.

    There is not sufficient information available to allow independent Verification of the NASA/GISS GISTemp and ModelE codes. Attempting to reverse engineer documentation out of the coding, and by this method construct specification documents, is a foolish waste of time. Especially if the coding has evolved over decades without any consideration given to building production-grade software.

    None of us have University-grade machines at our disposal. And, contrary to much speculation, none of us are in the pay of Big Oil, Big Coal, Big Energy, Big Hydrocarbon, etc.

  62. Posted Jul 6, 2009 at 10:10 AM | Permalink

    Gary Strand:

    PCM is an example of an “expired” climate model.

    PCM is also an example of a model whose runs were used to create the projections in the AR4.

  63. Jaye Bass
    Posted Jul 6, 2009 at 10:15 AM | Permalink

    “we have a dozen+ modeling groups around the world with unique models that all point towards the same conclusions”

    Try to bring that kind of attitude to a company like Airbus, Boeing, LMCO, or anybody else that builds safety critical systems. I think the point here is that vast and sweeping budget/societal changes are hinging on quasi-hobby shop code that is “constantly being modified”. I think to do this properly you would have the scientists tinker THEN put the models into the hands of an organization with a more mature process to do things like IV&V, QC, and other things that would seem to be necessary for work that has the capability to change so many lives. This large unorganized code base has an element of criticality that is not currently being adequately handled.

    • Steve McIntyre
      Posted Jul 6, 2009 at 10:20 AM | Permalink

      Re: Jaye Bass (#89),

      Folks, please stop piling on about quality control. The point’s been made.

      That’s not to say that there isn’t a pressing need for an independent evaluation of at least one climate model. For some time I’ve suggested that an engineering-quality evaluation be properly funded and carried out.

      IMO this is a higher priority than much, if not most, present work and should be a funding recommendation from IPCC or CCSP or NCAR.

      By an independent evaluation, I do not mean fellow climate modelers reviewing one another’s work – all the proponents are committed to the enterprise being roughly right. I mean a truly searching evaluation from competent modeling professionals from another field e.g. aerospace engineers, econometricians, statisticians, whatever.

      Speaking personally, unlike many readers of this blog, I do not assume that professionals screw everything up, despite the discouraging number of incidents otherwise. Nor do I assume that the fundamental conclusions of the models are “wrong” or that people should do nothing pending perfect certainty. Quite the contrary. However, I am very uncomfortable with the adequacy of the due diligence, and clearly many others share this opinion.

      I firmly believe that a substantial and properly funded due diligence program (and I’d start the budget at $10-20 million) would be very healthy in many respects. Professionals who are confident in their results and methodology should be unafraid of such a program. If $10-20 million wasn’t available, maybe one of the weaker models should be shut down to provide the funding. But it’s silly to suggest that a major enterprise like this be carried out part-time by volunteer software engineers. (Software engineering, in any event, is merely one aspect of the due diligence.)

  64. counters
    Posted Jul 6, 2009 at 10:42 AM | Permalink


    I’m not trying to excuse the issues you raise. I appreciate your efforts to keep the comment thread on task, and I think you and your moderation team have done a very good job (and have done a good job in the past). I don’t have a solution for trying to sign up all scientists to follow open-source code and data archival ethics. It would be very nice if such an ethic were the status quo in any scientific field, let alone climate science.


    The CCSM3 will run on a desktop; I’ve ported the CAM (its atmosphere model component) to run on my ASUS laptop with a bit of fuss. As you point out, AOGCMs (and, increasingly, Earth System Models) are far too complex to reverse-engineer. At this point, I think independent verification would have to rely upon analyses of model output (which is constantly being performed), or building other controlled models from scratch to compare against. The latter is not a trivial task, and as I already mentioned, the former is a constant, continual process. This is where climate modeling is; the reality is that these are the circumstances within which any independent verification of the model must proceed.


    Steve and I weren’t discussing exactly the same thing in those quote snippets. I don’t know why Steve encountered the anomaly/flaw/whatever that he did and I’d be interested to hear an explanation. My point about the various modeling groups pertains to independent verification – the fact that there are many models out there which point to the same scientific answers to experiments is an implicit part of the verification process and a reason why we have some confidence in our results.


    If you check out the sources I previously linked, you’d see that a modern climate model is hardly an unorganized code base. But to address your point, you’re not hitting on what I meant about “constantly being modified.” The base models actually aren’t being constantly modified. They iterate in generations tied to the IPCC reports, with updated models being published ~2 years before an AR. Those updated models will incorporate some of the major changes discussed in the peer-reviewed literature, such as parameterizations, dynamical core formulations, model domains, etc. The base, updated model will be spun up and used to generate control runs and transient runs (such as the IPCC emissions scenarios) which will be used for future reference and for the IPCC report.

    It’s from here that the model will be ‘constantly… modified’. Different experiments require different modifications. For instance, one experiment might deal with aerosol forcing using the most recent science from atmospheric chemistry; a researcher might add a custom module to the model for this particular experiment, or use a new forcing dataset. The base model code will not have changed; the researcher will have basically ‘plugged in’ a module or patch. Over the course of the model generation’s lifecycle, major advances might be made with respect to dynamical formulations or pertinent physics or chemistry. It is the job of the model’s maintainers to incorporate these major changes into the next iteration of the model. The aim is to improve the model by the time the next IPCC AR rolls around.

    Things are more controlled than just “scientists tinker.”

  65. counters
    Posted Jul 6, 2009 at 10:45 AM | Permalink

    Previous post was started before Steve’s most recent post, so it would probably be wise to drop the line of discussion I’m continuing. Also, I’m not seriously suggesting that volunteers or CA readers should download the source and analyze it; I’m merely suggesting that the open-source philosophy would greatly help alleviate the issues that Steve raises across the entire science.

    • Steve McIntyre
      Posted Jul 6, 2009 at 11:17 AM | Permalink

      Re: counters (#92),

      those comments are reasonable.

      From an editorial point of view, I get tired of people complaining about models – or more precisely, making the same complaints, and I want to tone that down. I’m not asking that people give up these opinions, but asking that people avoid piling on.

  66. Jaye Bass
    Posted Jul 6, 2009 at 11:20 AM | Permalink


    Can you reconstitute the model code and input versions for any given set of “runs for record”?

  67. counters
    Posted Jul 6, 2009 at 12:17 PM | Permalink


    I’m not sure I follow what you’re asking. Can you elaborate a bit?

    • Jaye Bass
      Posted Jul 7, 2009 at 10:15 AM | Permalink

      Re: counters (#95),

      Sure, suppose one produces a set of runs for a paper…what I would call “runs for record”. Are the versions of the code (source and/or binaries) and the set-up files sufficiently configuration managed such that, a year from now another researcher could exactly reconstitute that set of runs down to the exact version of any component of the code base and the inputs?

      • counters
        Posted Jul 7, 2009 at 10:42 AM | Permalink

        Re: Jaye Bass (#118),

        Ah, I see exactly what you mean now. Speaking from experience with the CCSM3 (I imagine it would be similar for other models although I don’t work with them), you would only need three things to accomplish this task: 1) the model configuration script (which is just a shell script, although I’ve experimented using Python scripts instead); 2) the same initial data files (which are often the freely available files published on NCAR’s website); and 3) any SourceMods utilized by the researchers (rather than plugging new code into the model source, the common practice is to copy a Fortran module you’ll modify and place the modified version in a directory off the model root; the model will then use that source file instead of the identically named one in its source).

        Often, researchers will ‘spin up’ the model to perform a transient experiment. In this case, you could also ask for the branch or restart file from which the actual experiment was run. This is probably the most common practice. All of these files are available on the MSS at NCAR, so accessing them is trivial to active researchers with NCAR accounts. Also, it is good practice to make local copies of the datasets, so most researchers would probably be able to send someone copies of the files directly if someone doesn’t have NCAR access (which is limited to active researchers for obvious reasons).
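
        [Ed.: The SourceMods convention described above is, at bottom, a path-precedence rule: a modified copy of a Fortran module shadows the identically named file in the base source tree. A rough sketch of that lookup in Python; the directory and file names are illustrative, not the actual CCSM3 layout.]

        ```python
        import os

        def resolve_module(name, sourcemods_dir, model_src_dir):
            """Return the path the build should compile for a Fortran module,
            preferring a researcher's SourceMods copy over the base source."""
            override = os.path.join(sourcemods_dir, name)
            if os.path.isfile(override):
                return override  # the modified module shadows the original
            return os.path.join(model_src_dir, name)
        ```

        Given that rule, reconstituting a set of runs comes down to archiving the configuration script, the input datasets, and the contents of the SourceMods directory, as described above.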

        There is one caveat, though: if you’re not running the same compilation of the model on the exact same system as the original research run, you won’t get *exactly* the same results. But that’s just the computer science: there will obviously be minor differences depending on what compiler you use, and there are a suite of reasons why different architectures will yield slightly different results. For this reason, the first thing everyone does once they port the model is run a validation test against control data from NCAR’s current supercomputer (bluefire). If validation goes well, you still do a full simulation (decade-to-century) and compare against control runs on bluefire just to be safe. To bypass this issue, most climate scientists running the CCSM3 will just queue up runs on bluefire. Bluefire is a hell of a lot faster than any other computer that most climate scientists have access to (although I’m very interested in setting up a Tesla desktop supercomputer and porting a simple model to CUDA…).

        I hope this answers your question! Let me know if you want me to elaborate a bit on any part of this answer.
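
        [Ed.: The “not *exactly* the same results” caveat largely comes down to floating-point arithmetic: addition is not associative, so a compiler or parallel decomposition that reorders a long sum can change the last bits of the result, and in a chaotic model those differences grow. A toy demonstration, with contrived values chosen to make the effect visible:]

        ```python
        # Summing the same numbers in two orders gives two different answers
        # in IEEE-754 double precision, because intermediate rounding differs.
        vals = [1.0e16, 1.0, -1.0e16, 1.0]

        left_to_right = ((vals[0] + vals[1]) + vals[2]) + vals[3]  # 1.0e16 + 1.0 rounds back to 1.0e16
        reordered = (vals[0] + vals[2]) + (vals[1] + vals[3])      # the big terms cancel exactly first

        print(left_to_right, reordered)  # 1.0 2.0
        ```

        This is why ported models are validated statistically against control runs rather than compared bit-for-bit.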

  68. Posted Jul 6, 2009 at 12:32 PM | Permalink

    I find discussion of messy codes or whatever somewhat misses the point about so many climate models coming to the same answer. I have yet to find any accessible critical evaluation of the models themselves – i.e. how the science gets translated into mathematical form and then into code. This process is ‘translation’ from one language to another – and what gets lost as well as what gets captured are important issues. I presume – from lack of detailed knowledge but also ‘first principles’ – that coding issues have some feedback on the mathematics selected, and that mathematical constraints affect the way the science is selected. For example – in the science of climate change there are clearly cyclic phenomena at play – but the periodicities are irregular, and hence when it comes to modelling you cannot provide an accurate starting point within a cycle (and as there is more than one and they interact, this becomes almost impossible) – so the cycles get replaced with a mathematical form that supposedly replicates the overall pattern of variability – the model is not replicating the climate processes, it is replicating the final pattern.

    There is no way of validating the ‘hindcasting’ used to gain faith in the models.

    But there are things that can arise that shake the faith in the ‘replication’. For example: the AR4 models all replicated the upper ocean heat content signal between 1950-2000, and this was hailed in the pages of Science as a validation; and the models all replicated the ‘global dimming’ episode from 1945-1980 by using a sulphate aerosol cooling component to reflect anthropogenic sources of sulphur. BOTH these validations have proven wrong – the first, after AR4, with the re-analysis of the heat content data and correction of bias, cutting warming by a factor of two; the second, following reanalysis of short-wave flux data, global dimming was not an anthropogenic phenomenon (which was localised on land) but a global (and primarily oceanic) phenomenon – and this did get into AR4, but not with any discussion of its implications for hindcasting!

    There are two other major areas of error/uncertainty: 1) the gain factor of 300% assumed for the water vapour amplifier (which has had plenty of discussion on WUWT), much criticised by Lindzen and central to the issue of cloud thinning and SW flux to the surface, with NASA’s Wang stating in a recent newsletter that the models cannot discriminate between (AGW) warmer oceans driving thinner clouds and (natural, cyclic) thinner clouds driving warmer oceans; 2) the inability to model cycles such as the PDO/AMO – let alone longer-term Little Ice Age/Medieval Warm Period drivers.

    On the latter – I recently talked to two groups of modelers – one at the Met Office at the Hadley Centre, the other at Oxford (UKCIP) – with both groups relying upon the same Hadley model to provide projections into the future. My interest was triggered by the UK government’s latest projections to 2080, where most of the UK warms by 4 degrees Celsius (plus or minus 3). At Hadley I asked about cycles – because part of their team includes oceanographers who had been party to the 200% revisions, as well as attempts to model the AMO. They told me they had two teams – one working on medium-range projections (e.g. 2020) and the other long term (2050-2080) – and they thought cycles would only affect the medium term. And yes, they agreed, there would be some cooling in the medium term, but after that … normal warming would resume.

    At Oxford, I was set to ask about the medium projections, as a seminar had been set up for ‘users’ of the projections and as a land-use advisor I felt 2080 and the large range was of little use, even with ‘probability’ estimates attached – but was astounded at the honesty of the presenters who referred to their methodologies as ‘flakey’ and replete with uncertainty. They agreed that the medium term could include cooling due to cycles – but then re-iterated the faith that warming would resume.

    There was not time for a detailed critical discussion. Nor is one welcomed. The medium projections will not arrive in time for Copenhagen. The UK position is being formulated right now.

    It is obvious to me that the models are potentially seriously flawed. If natural cycles have the power to override the GHG effect, then they also had the power to amplify the 1980-2005 ‘signal’ of warming that IPCC AR4 assumes is 100% due to GHGs. The fact that the models replicate the 1980-2005 temperature curve WITHOUT any reference to the PDO/AMO or Arctic Oscillation means that, without a doubt, the models are not replicating the mechanisms of climate change and the GHG effect.

    These are serious issues and obvious criticisms fully supported by peer-reviewed literature. I have done what I can to catalogue the potential flaws (in a book CHILL: a reassessment of global warming theory) in a way that ought to engender a scientific discussion here in the UK – but it appears in no-one’s interests to debate. The modelers are actually quite honest about their limitations – but the institutions of science are not. And then there are few politicians or campaigners who want to muddy the water.

    I therefore agree with your commentators – major decisions are being made on very flimsy grounds.

  69. Richard Henry Lee
    Posted Jul 6, 2009 at 12:48 PM | Permalink

    There is a possible explanation for problems with this model run at the following errata link

    A search for “sresa1b” reveals several instances where the time variable has problems. For example, in an entry dated 12/10/04, it states “Time array is invalid, has zero values.”

    Some of the other errata notations are interesting as well: “Time range is encoded as 1860-1960, should be 2000-2100.”

    • Steve McIntyre
      Posted Jul 6, 2009 at 6:30 PM | Permalink

      Re: Richard Henry Lee (#97),

      Interesting link.

      In June 2005, the absence of March 2106 data from the UKMO model was reported at PCMDI here. The “status” column said “data was available”, but the data still has not been provided four years later. Maybe someone who is familiar with the errata page can explain. There are many curious entries on the errata page, BTW.

  70. CG
    Posted Jul 6, 2009 at 1:13 PM | Permalink

    “My point about the various modeling groups pertains to independent verification – the fact that there are many models out there which point to the same scientific answers to experiments is an implicit part of the verification process and a reason why we have some confidence in our results.”

    The problem that most readers here have with this statement is that confirmation bias removes some of the independence of these models. If you produced a new model from scratch and it didn’t project things anywhere near what other models did, you’d assume you did something wrong and look at what the others did. That tends to bring you into line with the consensus view and defeats much of the purpose of building the model.

  71. counters
    Posted Jul 6, 2009 at 2:05 PM | Permalink


    But is that really the process used to check models against one another?

    If you follow the literature on modeling efforts, you find a rather different pattern. All models behave somewhat differently – especially now, as various ESMs have different components incorporated (such as interactive nitrogen cycling in the most recent CCSM). No one expects them to match exactly. Instead, we expect them to act in a manner that is physically plausible. Or, we expect them to share certain glaring flaws – the double-ITCZ comes to mind.

    What really happens is that you don’t immediately compare models against each other. First, you would want to check if the model is physically plausible – does it behave realistically? Can it reproduce the past century’s climate? Then, you compare how it does in these tasks to other, better-established models. At the end of this process, you don’t ‘tune knobs’ (to borrow a colloquialism often levied against models) to bring them more in line with the other models; instead, you start performing experiments to understand why your model doesn’t match exactly and if the differences are statistically significant.

    You can’t perform transient climate experiments until you understand these core aspects of your model, because you need to be able to account for the differences you’ll see in your projection runs compared to other models’.

    • Pat Frank
      Posted Jul 6, 2009 at 8:00 PM | Permalink

      Re: counters (#100), “First, … does [the GCM] behave realistically?” ‘Realistic’ does not mean physically predictive. An explosion simulated for a movie can look realistic, and have nothing to say about how explosions actually propagate. Likewise climate models relative to Earth climate. They may produce ocean heat contents, for example, that are ‘realistic,’ but may do so in physically non-legitimate ways. That other models may do something similar is not a validation. It just means one model is a kind of simulacrum of another. They may just make similar mistakes. In my Skeptic analysis, for example, cloud projection errors were found strongly correlated among 10 different GCMs, and non-random.

      Parameter sets are chosen to yield ‘realistic’ outcomes. How many different sets of parameters will produce equivalently ‘realistic’ outcomes in a given GCM? What are the predictive uncertainties associated with multiple alternative parameterizations? If climate models are non-physical (atmospheric hyperviscosity, for example), how can the outcomes be physically valid?

      Validation of models by comparison of outputs with other analogous models is anathema to any experimental scientist. Such procedures hardly rise above circular.

      • Craig Loehle
        Posted Jul 6, 2009 at 8:14 PM | Permalink

        Re: Pat Frank (#110), The irony here of course is that the models do NOT behave similarly. On the hotly contested question of the tropical hotspot, the models vary so much that the simulation output envelope includes all possibilities. In simulations of climates 6000 years ago, or during the ice age, or other periods, there are huge discrepancies (which of course are often blamed on the data). The models do not simulate the same absolute average temperature for the Earth, so everyone uses anomalies vs the model’s own mean, which is truly bizarre when black-body radiation operates on the fourth power of temperature. The models do not agree on the distribution of clouds with latitude. They don’t agree on how long Arctic ice will last under a warming scenario. Etc. I fail to see consensus among the models except “it will get warmer”, which is not very robust IMO.
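
        The fourth-power point can be made concrete with a short calculation. This is an illustrative sketch only – the baseline temperatures below are hypothetical, not taken from any model:

```python
# Illustrative sketch: the Stefan-Boltzmann flux sigma*T^4 is nonlinear,
# so models that disagree on absolute mean temperature imply different
# flux changes for the *same* 1 K anomaly.
SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant, W m^-2 K^-4

def flux(t_kelvin):
    """Black-body flux (W/m^2) at temperature t_kelvin."""
    return SIGMA * t_kelvin ** 4

for baseline in (285.0, 287.0, 289.0):  # hypothetical model mean temperatures
    dflux = flux(baseline + 1.0) - flux(baseline)
    print(f"baseline {baseline} K: +1 K anomaly -> +{dflux:.2f} W/m^2")
```

The flux change per degree grows with the baseline, so identical anomalies in two models with different absolute means do not correspond to identical radiative responses.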

      • counters
        Posted Jul 7, 2009 at 8:28 AM | Permalink

        Re: Pat Frank (#109),

        Now we’re only looking at the tail end of the modeling process. You need to consider the beginning end as well – the core constituents of the model. At this level, we’re not talking about ‘parameterizations’; the core of a climate model is just a suite of physics equations. For instance, since we’ve been discussing the CCSM3, you can read (in detail) about the core physics and dynamics equations in the CAM3 here. A JoC article from a special edition back in 2006 details these equations in even more detail – I can fish it up if you’re interested.

        Consider your point about cloud errors for a moment. It’s widely known that GCMs are deficient in their cloud-modeling capabilities. But keep in mind the ‘why’ of this question – model resolution is too coarse (due to limited processing budgets) to accurately resolve clouds. No one has ever discovered a panacea for solving this problem without increasing the model resolution. Thus, is it any surprise that most models utilized similar cloud parameterizations? Considering this fact, is it really that surprising that most models suffer the same flaw? I think it’s far more interesting to consider the huge variability in the last model generation’s projections of variables such as precipitation.

        Your point boils down to alleging that the models are unphysical. There is a huge body of literature discussing this topic, and in general, those who work directly with the models disagree with your analysis.

        • Pat Frank
          Posted Jul 7, 2009 at 6:47 PM | Permalink

          Re: counters (#116), I have nothing but admiration and respect for the physics that goes into climate models. I also think the project to model climate is a magnificent enterprise. I also greatly respect the protocols and methodologies that experts apply in their field. Specialists have thought deeply about their problems and their solutions, and are aware of the tricky ways nature has hidden in their observables. Coming newly into a field, it’s easy to fall into traps that experts in the field have learned to avoid, or to make mistakes that experts easily see.

          That said, there are a few sine qua nons of science that are universal. One of them is that models are tested against observables, not against other models. Only climate science seems to claim an exception from this principle. There is no warrant for such exceptionalism. Modelers should be looking at observables to improve their model, not to other models. Model parameter sets ought to be rationalized strictly from physics, not from whether they give good results in someone else’s model.

          You by-passed the mention of hyperviscosity. This non-physical graft is necessary to prevent model divergence. Further, Carl Wunsch has noted that ocean models do not converge. He noted that modelers brush aside his queries about that, because the non-converged outputs “look reasonable” (Wunsch’s description). ‘Looks reasonable’ sounds a lot like your “realistic,” and is no grounds for confidence.

          With respect to clouds, condensation takes place across microns, not kilometers or even meters, and the microphysics is not well-understood. Nevertheless it is critical to cloud formation. Cloud opacity is dependent, in part, on the presence or absence of aerosols and particulates, and condensation depends on the nature of the particulate — silica vs. soot, for example. These differences can’t yet be modeled. The equations of turbulence can’t be solved exactly and Gerald Browning in this forum has gone through the problems of energy dissipation and upward cascades in some detail. These are not problems of resolution.

          You noted that clouds are not well-modeled and stopped there, as though this problem — whatever the cause — is not important to climate projections. Only a few percent change in tropical cloudiness can obviate any warming due to extra GHGs. And yet, whether from resolution or from physics, clouds are poorly modeled. That means climate projections are not reliable. But somehow, this doesn’t seem to matter when modelers publish projected climate futures.

          You mentioned “the huge variability in the last model generation’s projections of variables such as precipitation.” But precipitation problems occur in part when the models have been trained to reproduce observed temperature fields. When models are trained on precipitation fields, the temperature variables are not reproduced. These see-saw divergences show a fundamental difficulty with thermal dynamics within the models, including accounting for the latent heat of condensation – evaporation. This problem probably contributes a good part of the error models produce in the TOA thermal flux. When thermal problems are subsumed within errors in precipitation fields, projections of air temperatures must obviously be in error. But somehow, this is never admitted when air temperature projections are published.

          Climate science exceptionalism. It’s self-serving and pervades the field.

        • John F. Pittman
          Posted Jul 8, 2009 at 6:44 AM | Permalink

          Re: Pat Frank (#122),

          Nor has heat convection via water vapor to the TOA, or the effect of hyperviscosity on that transfer, been well defined. When you, Jerry, and I tried to pin Gavin down on this, he also reverted to: 1) the model backcast well; 2) it agreed with other models; 3) the models were right because they got the changes from major volcano eruptions correct.

  72. Steve McIntyre
    Posted Jul 6, 2009 at 2:09 PM | Permalink

    Here’s another odd UKMO CM3 output from KNMI – March values in this slice are all absolute 0 except for March 2106.
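
    To illustrate Nicholas’s corruption hypothesis (#5): a single flipped exponent bit in an IEEE-754 double is enough to turn an ordinary temperature into a number of this general magnitude. A minimal sketch – the input value and bit positions are hypothetical, chosen for illustration:

```python
# Illustrative only: flip one bit of a double's 64-bit pattern and an
# ordinary temperature becomes an astronomical value, consistent with
# the single-bit-corruption hypothesis discussed upthread.
import struct

def flip_bit(x, bit):
    """Return x with the given bit (0 = LSB) of its 64-bit pattern flipped."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    (y,) = struct.unpack("<d", struct.pack("<Q", bits ^ (1 << bit)))
    return y

temp = 300.25  # a plausible tropical temperature, in kelvin
for bit in (58, 59, 60):  # high exponent bits of the binary64 layout
    print(f"bit {bit}: {temp} -> {flip_bit(temp, bit):.3e}")
```

Flipping a high exponent bit multiplies the value by an enormous power of two, which is one plausible way a physically sensible field could acquire a single ~1E20 outlier.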

  73. Les Johnson
    Posted Jul 6, 2009 at 2:38 PM | Permalink

    There are only two conditions in this universe that could lead to getting that hot (1E20).

    a. the universe immediately after the big bang.

    b. my ex-wife

  74. Posted Jul 6, 2009 at 3:09 PM | Permalink

    Gary Strand #49,

    What that sounds like to me is: “Quality costs more than we can afford”.

    I can tell you true that in aerospace that would never fly. Why? Government inspectors would never allow it.

  75. Shane
    Posted Jul 6, 2009 at 3:57 PM | Permalink

    snip – I repeatedly ask people not to pile on with generalized complaints.

  76. Neil Fisher
    Posted Jul 6, 2009 at 4:51 PM | Permalink

    This is probably OT, but I can’t think where else to post it.

    Re: counters (#91),

    Things are more controlled than just “scientists tinker.”

    It may seem that way to you, but I assure you it’s not. I have seen similar things happen at the interface between old-fashioned comms people and, um, new-fashioned (if you will) data comms people. In that case, the old-fashioned comms people have “been there, done that” and are fully aware of the problems that sloppy practices introduce – especially sloppy record keeping. Yes, it’s tedious, time-consuming and, let’s face it, downright boring work, but it needs to be done, and the longer it is put off, the worse things get and the harder they are to fix. It may seem easier now to just “go with the flow” and “focus on results” etc, but when something goes wrong – and something will go wrong, if it hasn’t already – and no-one has a reasonable explanation of why, or knows how to fix it quickly, or can even detail exactly what happened, the brown stuff will hit the rotating thing and splatter over a lot of people who probably don’t deserve it.

    That’s why details matter, and that’s the value of CA – the point is not that someone found a mistake, it’s that no-one has any procedures in place to either i) find those mistakes, or ii) document them so they don’t propagate any further, or iii) make sure similar things can’t happen again. What makes it insulting is not that these things happen – they happen in all areas of engineering and science at some point – but that the people who point them out are denigrated for doing so. Oddly enough, these people are actually trying to help, by providing insights that only specialists in that particular area truly appreciate. Which is exactly the same complaint made by climate scientists about their results! If people are “deniers” for not listening to specialist climate scientists on climate, aren’t climate scientists also “deniers” for not listening to professional software engineers on matters of software, or stats people on matters of stats, or QA/QC people on matters of QA/QC?

  77. ianl
    Posted Jul 6, 2009 at 6:12 PM | Permalink

    snip – please do not editorialize on policy implications.

  78. Steve McIntyre
    Posted Jul 6, 2009 at 9:10 PM | Permalink

    #109, 110. As I’ve said repeatedly, I see no purpose in one paragraph venting about models. Please stick to more finite issues and avoid general complaining.

  79. Poptech
    Posted Jul 6, 2009 at 9:44 PM | Permalink

    Very occasionally a file is corrupted at some point, either during the write to disk from memory…

    Really? Your workstations/servers do not have ECC memory, are using a Journaling file system and your HDs are not in a RAID setup? Are you serious? You have to be kidding me.

    [ed: author renamed to Poptech]

    • DeWitt Payne
      Posted Jul 7, 2009 at 1:44 AM | Permalink

      Re: Andrew (#114),

      Too slow and too expensive would be my guess as to why models aren’t run on true enterprise-class systems. I can’t even begin to imagine what it would take to make a massively parallel computer system meet enterprise specs. ECC memory and redundant disk arrays would be just the tip of the iceberg. A first-order approximation would be several times as many CPUs, so each calculation could be run in parallel and tested.

    • Gary Strand
      Posted Jul 7, 2009 at 7:47 AM | Permalink

      Re: Andrew (#112),

      Really? Your workstations/servers do not have ECC memory, are using a Journaling file system and your HDs are not in a RAID setup? Are you serious? You have to be kidding me.

      I didn’t mean to say that the hardware is flawed (yes, AFAIK, the systems we use have ECC memory, RAIDs, and so on); the instance I’m recollecting from old PCM runs is that the layout of the file (header + data) was corrupted at some point. It wasn’t reproducible, it was random, and it wasn’t easily detectable. Basically, fixing the files required examining the header of every single file and checking the value of a certain field in the header, and if it was incorrect, then replacing the header (after suitable modification) and rewriting the file. That was a weird instance.

      The greatest non-replaceable data loss I’ve ever had (and this is over a time span of many years and literally hundreds of millions [billions?] of I/O operations) was from an archival tape that was damaged and for which we didn’t have a backup copy of the data stored upon it. I believe I already explained that one.

      At one time, the model would do a checksum on any output file, write it to archival tape, remove it, read it back from archival tape, do another checksum, and compare the new checksum to the older one to make sure they matched, but that’s a bit overboard, IMHO. It also turns I/O into a major chunk of the model’s wallclock and run time, and since run time is finite, and expensive, the process was dropped. It’s still possible to do it, but unnecessary.
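
      That write/read-back verification can be sketched in a few lines. The hash algorithm and file handling below are illustrative assumptions, not details of the actual PCM archival tooling:

```python
# A minimal sketch of checksum-verified archiving: hash the file,
# copy it (standing in for a write to archival tape), read the copy
# back, re-hash, and compare. SHA-256 is an illustrative choice.
import hashlib
import os
import shutil
import tempfile

def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            h.update(chunk)
    return h.hexdigest()

def archive_and_verify(src, dst):
    """Copy src to dst and confirm the round-tripped copy hashes identically."""
    before = sha256_of(src)
    shutil.copyfile(src, dst)  # "write to archive"
    after = sha256_of(dst)     # "read back and re-checksum"
    return before == after

with tempfile.TemporaryDirectory() as d:
    src = os.path.join(d, "model_output.nc")
    with open(src, "wb") as f:
        f.write(os.urandom(4096))  # stand-in for real model output
    print("verified:", archive_and_verify(src, src + ".bak"))
```

As noted above, the cost is a second full read of every file, which is why the real pipeline dropped the step once I/O became a significant fraction of wallclock time.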

    • Andrew
      Posted Jul 7, 2009 at 10:12 AM | Permalink

      Re: Andrew (#112), For the record, this person is subverting my name. I did not say that.

      Since I’m 90% sure I was here first, I think if he comes back, he should change his name.

  80. Jaye Bass
    Posted Jul 7, 2009 at 4:39 PM | Permalink


    Thanks for the direct answer. You are aware that there are rumblings to the contrary out in cyberspace. It’s good to know that, at least for this model, some of this repeatability is being handled in a professional manner.

    There is one caveat, though: if you’re not running the same compilation of the model on the exact same system as the original research run, you won’t get *exactly* the same results. But this is the computer science *exactly*; there will obviously be minor differences in the model depending on what compiler you use, and there is a suite of reasons why different architecture systems will yield slightly different results.

    Yes, I mean “exact” up to the uncertainties wrt compiler versions, machine precision, etc.
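
    That compiler/precision caveat is easy to demonstrate: floating-point addition is not associative, so any reordering of a sum (by an optimizer, or a parallel reduction split across processors) can change the low bits of the result:

```python
# Floating-point addition is not associative, so a compiler, an
# optimizer, or a parallel reduction that reorders a sum can change
# the low bits of the result -- one reason bitwise-"exact" runs need
# the same binary on the same machine.
import math

a = (0.1 + 0.2) + 0.3
b = 0.1 + (0.2 + 0.3)
print(a == b)          # False: grouping changes the rounded result

naive = sum([0.1] * 10)        # sequential left-to-right summation
exact = math.fsum([0.1] * 10)  # exactly rounded summation
print(naive == exact)  # False on IEEE-754 doubles
```

In a long climate run these last-bit differences feed back through a chaotic system, so trajectories from differently compiled binaries diverge even though both are "correct" to machine precision.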

    (although I’m very interested in setting up a Tesla desktop-supercomputer and porting a simple model to CUDA…)

    I know of several efforts, outside of climate science, where atmospheric models are being ported to this sort of environment quite successfully.

  81. dhmo
    Posted Jul 7, 2009 at 6:07 PM | Permalink

    I am an analyst/developer with a formal education in computer science. After 30-odd years in computing I have retired, and one of the things I want to do is understand the workings of computer models; to that end I have downloaded a copy of Model E and a Fortran compiler.

    This thread is about how an obviously incorrect value occurred. If such an error occurred in any software I had produced, my first assumption would be that there is a bug in the code. Having read a book by Mueller and Storch about modelling, this should not be a surprise. I have worked for the last ten years for government, in a section that ran financial systems funding universities and students. The testing was rigorous and very necessary. I expected it would be the same for GCMs, but from this thread it seems not! Why aren’t assert statements used? They are common in my experience.
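
    A range assert of the kind I mean is only a few lines; the temperature bounds below are my own illustrative choice, not anyone’s actual QC criteria:

```python
# Sketch of a plausibility assert on model output: a simple bounds
# check would have flagged the 0.1E21 value immediately. The lo/hi
# limits are illustrative, not from any real QC procedure.
def check_temperature_field(values, lo=150.0, hi=350.0):
    """Raise AssertionError if any temperature (kelvin) is implausible."""
    bad = [(i, v) for i, v in enumerate(values) if not lo <= v <= hi]
    assert not bad, f"implausible temperatures at (index, value): {bad[:5]}"

check_temperature_field([288.1, 301.4, 275.0])       # passes silently
try:
    check_temperature_field([288.1, 0.1e21, 275.0])  # the March 2106 value
except AssertionError as e:
    print("caught:", e)
```

The check costs one pass over the output array, which is negligible next to the model timestep itself.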

    My guess is that it is to do with speed. I have never used a supercomputer, but surely the compilers on these systems are not that much different from the ones I have worked with. You can switch code optimisation on or off; if the ultimate goal is speed then you switch it on. This can only be valid if you can adequately test that your results are still the same, because you are actually giving permission for your code to be messed with. Maybe they are not testable?

    My assessment so far is that GCMs are an attempt at a virtual reality of an analog, event-driven world, using (relatively) low-resolution cells and iteration. My assumption is that this is because computers are digital and too slow for anything other than iteration. Beyond this, my world had business people, analysts, software developers and testers – about 40 or 50 people all told. Business, in consultation with the analysts, produced specifications for the developers to code from. The software was then tested against test plans by the testers. I would hope a similar process happens with GCMs. Our operational budget was in the tens of millions, so I would hope the budget for a climate model is much higher than that – or am I wrong? I hope not.

  82. Poptech
    Posted Jul 7, 2009 at 6:59 PM | Permalink

    I do not find the excuses for not having backups acceptable at all. Nor do I accept that the modelers “believe” the low resolution and lack of cloud-modeling capability are OK; discussing this in the “literature” does not make it any more relevant to reality. These are serious deficiencies in the simulations, making any conclusions purely theoretical and unrelated to the actual climate.
