Gavin Schmidt: station data "not used" in climate models

Gavin Schmidt has told Anthony Watts that the problematic station data are not used in climate models and any suggestion to the contrary is, in realclimate terminology, “just plain wrong”. If station data is not used to validate climate models, then what is?

His point seems to be that the climate models use gridded data.

But isn’t the gridded data calculated from station data? Well, yes. (And it wasn’t very hard to watch the pea under the thimble here.) So Gavin then argues that the adjustments made in calculating the gridded products have “removed the artefacts” from these poor stations:

If you are of the opinion that this station is contaminated, then you have to admit that the process designed to remove artefacts in the GISS or CRU products has in fact done so -

At this point, all we know is that the process has smoothed out the artefacts. Whether the artefacts have biased the record is a different question entirely, and one that is not answered by Gavin’s rhetoric here. While we have a list of GISS stations, there is still no list of CRU stations or CRU station data. How could one tell right now whether CRU has “removed the artefacts” or not? So on the present record Anthony doesn’t have to admit anything of the sort. Of course, if the data and code are made available and it becomes possible to confirm the truth of Gavin’s claim, this situation may change. But right now, no one can say for sure.
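
As a toy illustration of the distinction between smoothing an artefact and removing its bias (a Python sketch with invented numbers; this is not the GISS or CRU gridding procedure): if a grid-cell series is simply an average of station anomalies, a contaminated station’s spurious trend gets diluted and smoothed, but the grid-cell trend still carries its share of the contamination.

    # Toy sketch: averaging stations smooths a contaminated trend but does not remove its bias.
    # Invented numbers; not the GISS or CRU gridding algorithm.
    import numpy as np

    years = np.arange(1900, 2001)
    rng = np.random.default_rng(0)

    true_trend = 0.5 / 100.0   # assumed real warming, deg C per year (0.5 deg C/century)
    spurious = 2.0 / 100.0     # assumed extra trend at one contaminated station (2 deg C/century)

    # Four "good" stations and one contaminated one, each with independent noise
    good = [true_trend * (years - 1900) + rng.normal(0, 0.2, years.size) for _ in range(4)]
    bad = (true_trend + spurious) * (years - 1900) + rng.normal(0, 0.2, years.size)

    grid = np.mean(good + [bad], axis=0)                 # naive grid-cell average of the five stations
    slope = np.polyfit(years, grid, 1)[0] * 100
    print(f"grid-cell trend: {slope:.2f} deg C/century")  # about 0.9, not the true 0.5

The averaged series looks perfectly smooth, yet its trend is biased upward by one fifth of the contaminated station’s spurious warming.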

Gavin then asserts that any removal of contaminated stations would improve model fit. I’m amazed that he can make this claim without even knowing the impact of such removal.

Personally I’m still of the view that modern temperatures are warmer than the 1930s, notwithstanding the USHCN shenanigans. But suppose that weren’t the case: suppose all the USHCN stations with very big differentials turned out to be problematic and the good stations showed little change. Surely that wouldn’t improve the fit of the models. I’m not saying that this will be the outcome of the verification. I think the verification is interesting and long overdue, but I’d be surprised if it resulted in big changes.

But you don’t know that in advance and for Gavin to make such a statement seems like a “foolish and incorrect” thing to do. :twisted:

He urges Anthony not to ascribe “consequences to your project that clearly do not follow” but obviously feels no compunction in making such ascriptions himself. Check it out.


157 Comments

  1. mjrod
    Posted Jun 23, 2007 at 12:22 AM | Permalink

    Well, it’s clear that Anthony needs to look for gridded thermometers. No wonder all those stations have problems. Apparently, the NOAA have moved on to more reliable technology. Anyone have a picture of a gridded thermometer?

  2. Jim Edwards
    Posted Jun 23, 2007 at 12:24 AM | Permalink

    Somebody poured MTBE, mercury, and dioxin in the well water but the water was subsequently pumped into that big holding tank before it was piped into your house so you have to admit that the water is now pristine.

    Sounds good to me, I’m gonna make some Kool-Aid for my kids…

  3. Graham AS
    Posted Jun 23, 2007 at 12:43 AM | Permalink

    I think what Gavin is saying is that regardless of the nature of the changes in the data consequent on removing the ‘bad’ records he will be able to make a change in the model (parameters, initial conditions etc) which will maintain, or even improve, the fit. Sadly, I tend to believe him. Of course, whether such model changes are justified I’ll leave to others to judge (when such changes are subjected to proper scientific scrutiny).

  4. IL
    Posted Jun 23, 2007 at 2:11 AM | Permalink

    Another analogy would be that Gavin is saying that the large cheque his uncle has paid him is perfectly valid and legal. Yes, he is aware of serious allegations that his uncle robbed a bank and deposited the money in his personal bank account, but because the only thing that he, Gavin, sees is the money paid from his uncle’s bank account, it’s all perfectly OK. Unbelievable.

  5. Steven mosher
    Posted Jun 23, 2007 at 3:19 AM | Permalink

    Anthony,

    You’ve kept it civil with Gavin. I’d ask him if his claim about ModelE can be verified and replicated, including the CRU data. Did he verify CRU, or take it on face value? Etc., etc. Until it is replicated (not merely peer reviewed) it’s simply a warm version of Pons and Fleischmann.

  6. Steven mosher
    Posted Jun 23, 2007 at 3:25 AM | Permalink

    Hey where is the Bunny?

    We don’t have to “falsify GISTEMP”; NASA doesn’t use it. At least that’s how I read Gavin.

    Here is an interesting question. Does Mann use it?

  7. Bob Meyer
    Posted Jun 23, 2007 at 3:36 AM | Permalink

    Schmidt stated (on Anthony’s Blog):

    We compare the models to the gridded products that deal with individual station problems as best they can. We have used the GISTEMP and CRU products to do so.

    He continues farther down:

    Thus reductions of the trend at [a particular] station would actually improve the match to the model – always being clear that you shouldn’t really compare model grid boxes to individual stations…

    Is Schmidt actually suggesting that large changes in the individual station data would have no effect on the grid data because by some occult process they have already “fixed” the deviant station data? If so, why not place the list of deviant stations on line so that we can compare them to the actual station audits? Or better yet, like the old TV quiz shows they can send the list of deviant stations by a certified, bonded courier to an escrow company that will store them unseen in a huge vault with an intimidating armed guard standing watch. Then, when we complete the station audits, we can see if we “correctly” identified the deviant stations. Obviously, the gurus at NASA can tell simply from the data which stations are deviant while we poor mortals not having arrived at their level of initiation into the great Mysteries of Climate must actually go to the stations and hope that we can tell if a station in an asphalt parking lot, next to an air conditioner might be producing erratic data.

    At my age I should no longer be surprised if people like Schmidt actually believe that they can derive all of nature’s oddities from first principles without ever actually looking at the world.

  8. Terry
    Posted Jun 23, 2007 at 6:48 AM | Permalink

    Gavin Schmidt has told Anthony Watts that the problematic station data are not used in climate models

    Gavin then asserts than any removal of contaminated stations would improve model fit.

    So let me get this straight … station data is not used in climate models, but if you change the station data by removing some of them, it changes the model results. Is this another example of telekinesis?

  9. John Norris
    Posted Jun 23, 2007 at 7:38 AM | Permalink

    re

    If you are of the opinion that this station is contaminated, then you have to admit that the process designed to remove artefacts in the GISS or CRU products has in fact done so – (i.e. that grid box in the product does not have a 2 deg/Century trend).

    It appears to me that the world’s greatest climate modeler is struggling with “if P then Q”. That doesn’t bode well for the community.

  10. James Lane
    Posted Jun 23, 2007 at 8:03 AM | Permalink

    #8 Terry

    So let me get this straight … station data is not used in climate models, but if you change the station data by removing some of them, it changes the model results. Is this another example of telekinesis?

    That’s not what Gavin is saying. He’s saying that the models are “compared” to station data processed by GISS and CRU. Then he uses a cute rhetorical device to claim that because the station discussed by Watts has a larger trend than the corresponding adjusted grid cell, the adjustments made by GISS and CRU are all tickety-boo. Best of all, in the same breath Gavin says you can’t tell anything from a single site temp series.

  11. Posted Jun 23, 2007 at 9:23 AM | Permalink

    OK. So I have a V8 engine that doesn’t run very well. I haven’t bothered to compression-check each cylinder to find out if the problem is one bad cylinder (burnt valve) or bad compression (worn piston rings). Instead I go out and buy expensive SplitFire spark plugs, Nology spark plug wires, an expensive distributor and cap, and since the engine runs better than it did before, the problem is fixed. Of course you increased the performance some, but you still have an engine that’s going south, and sooner or later (probably sooner) you’re still going to find yourself stranded on the side of the road.

  12. Posted Jun 23, 2007 at 9:40 AM | Permalink

    If station data is not used in any form, then the models must be constructed entirely from first principles of physics. Any ‘calibration’ of the model, specification of initial conditions, CO2 sensitivity, etc. would refute the claim.

    Secondly, if removing the contaminated stations reduced the 20th century increase to the point where there was no increase in temperature, how could that possibly improve model fit, when the models show an increase of 0.5 deg?

  13. pk
    Posted Jun 23, 2007 at 9:44 AM | Permalink

    So what is the actual output of ModelE? Is it gridded or regional temperatures?

  14. Sudha Shenoy
    Posted Jun 23, 2007 at 10:13 AM | Permalink

    In his first post Gavin Schmidt says:

    Observational data at large scale (not individual stations) are used to evaluate the models after they’ve been run – but again generally only at the continental scale and above. The evaluation is not just with trends but with patterns of variability (El Nino responses, NAO etc.) and obviously, the better the data the more reliable the evaluation.

    I’m only an historian. As I read it: (1) they run the models (2) The models are then evaluated against ‘data’ at the continental level & above [?the entire globe - ?], large-scale variability, etc. (3) Better data mean better evaluation.

    With respect to (3): Surely it’s the other way around? The model has to explain the data? If the data are already poor, then ??? where does the model come from?? How do they know how good/bad the ‘data’ are?

    (2) Where do these data come from? At the continental/global levels? Do any individual station data ever get used, at any point, to develop the continental-level data?? Global data?? Is he saying that the ‘good’ stations swamp the ‘bad’ stations? How does anyone know?

    Sorry for these simplistic questions.

  15. Anthony Watts
    Posted Jun 23, 2007 at 10:47 AM | Permalink

    Kevin Trenberth of NCAR made some astounding claims about GCMs such as “E” on Roger Pielke’s Climate Science blog. He says: “None of the models used by IPCC are initialized to the observed state and none of the climate states in the models correspond even remotely to the current observed climate.”

    and also

    “Moreover, the starting climate state in several of the models may depart significantly from the real climate owing to model errors. I postulate that regional climate change is impossible to deal with properly unless the models are initialized.”

    as discussed at:

    http://climatesci.colorado.edu/2007/06/18/comment-on-the-nature-weblog-by-kevin-trenberth-entitled-predictions-of-climate/

  16. Sir O B
    Posted Jun 23, 2007 at 11:04 AM | Permalink

    Amazingly now that CA has broached this issue, the glaciers have started to unmelt. More progress is needed.

  17. Steven mosher
    Posted Jun 23, 2007 at 11:15 AM | Permalink

    And you all ignored me when I said ModelE was poop.

    And you all ignored me when I said Gavin was no fool when he said all you need are 60 good stations.

    SteveM, I was under the assumption that the models had to do a calibration against the observed record. Ever see that? Or results from that?

    Now, I’ve seen that CRU and GISS differ somewhat. Is Gavin saying or implying that they check ModelE against both “records”? Hmm, choose the factual record that best fits your model?

    It would be interesting to see what adjustments were made to CRU (if any) after initial testing.

    Another random point: when I look at the projections of the IPCC GCMs (after ensemble runs) I see an “error band” of 0.5°C around a given SRES.

  18. samoore
    Posted Jun 23, 2007 at 11:17 AM | Permalink

    I guess I just don’t understand…

    I have a chemical process that I want to analyse.
    I make a model of the process, using known chemistry and physics.
    Then, I collect daily average data (from hourly averages) to determine how the process actually behaves.
    Then, I compare my model to reality.

    If I understand Gavin Schmidt correctly, I’m not only NOT using reality in my analysis, but the hourly averages aren’t a part of the daily data!

    I haven’t had a drink in over 10 years, but this might be enough to make it seem worthwhile.

  19. Steven mosher
    Posted Jun 23, 2007 at 11:28 AM | Permalink

    Oops, I hit enter too early. Premature exclamation.

    I think Gavin’s comment to Anthony leads to some interesting questions about the “60 good sites” comment he is supposed to have made. How would he know unless he tested? Probably did. And so, which 60 sites? And what does good mean? Minimizing error in the hindcast?

    I think it’s also interesting that both “Hansen” and “Jones” have dramatically reduced the number of sites that go into the land surface record.

    If I were calibrating a model against a “factual” record, I’d look at two methods.

    1. Fiddle with the model (knobs and dials to force a fit).
    2. Fiddle with the “factual” record: smooth stuff, which potentially improves the model fit.

    Other random points.

    I think Warwick posted a map on here showing the sites for CRU grid 35N-40N 115W-120W.

    I think USHCN says that gridding at 5×5 is not recommended, and I think they suggested a 2.5 grid and some non-square gridding.

  20. Gerald Browning
    Posted Jun 23, 2007 at 11:38 AM | Permalink

    It has already been shown that there is insufficient global large scale data to provide adequate initial data for global numerical weather prediction. For example, the relative errors in the numerical solution for the Canadian global model over the US (where the observations are most dense) approach 70 % in 36-48 hours (see Sylvie Gravel’s manuscript on Exponential Growth thread). The only way in which these global large scale NWP models stay on track is by continually updating (assimilating or inserting) new large scale observational data into the model every 6-12 hours. And it has also been shown that one only need interpolate the wind data and periodically insert the result into the model to obtain an equally valid forecast during the same time period (this is a simple procedure to analyze mathematically).

    If there is not adequate large scale data for the large scale NWP models, how can there be adequate data for large scale climate models that are not updated with new observational data? The parameterizations (physical forcing approximations) in the NWP models lead to the large errors. The climate model forcings are even more crude (have larger errors).
    Gavin Schmidt’s argument is nonsense.

    Jerry

  21. Frank M. Tuttle
    Posted Jun 23, 2007 at 11:55 AM | Permalink

    Is the following excerpt from the July 2, 2007 Issue of Time Magazine relevant to this ongoing discussion?

    “According to the National Aeronautics and Space Administration, Atlanta’s temperature is now five to eight degrees higher than the surrounding countryside following decades of development that bulldozed wooded areas.”

  22. Steve McIntyre
    Posted Jun 23, 2007 at 12:02 PM | Permalink

    #15. Trenberth’s comment and Pielke’s discussion are indeed worth reading. Here’s another gem from Trenberth:

    Of course one can initialize a climate model, but a biased model will immediately drift back to the model climate and the predicted trends will then be wrong. Therefore the problem of overcoming this shortcoming, and facing up to initializing climate models means not only obtaining sufficient reliable observations of all aspects of the climate system, but also overcoming model biases. So this is a major challenge.

    This is interesting. As I interpret it, this suggests that the models have a “centre” (mean state) and if you start off away from the centre, the model tends to revert to the mean, interfering with the study of the effect of a contrast, e.g. increased CO2. You know, in one sense, I can accept this as long as this is made clear. But I’ve never seen this said before.

    I don’t think that this means that it is impossible to estimate the effect of increased CO2. However I still don’t understand the relevance of GCMs to this job as opposed to energy-balance models.

  23. Stan Palmer
    Posted Jun 23, 2007 at 12:26 PM | Permalink

    What Schmidt seems to be saying is that the AGW theory is based on model results only. Temperature data is of such poor quality that it is of next to no use. If this is the case, then he would seem to be in strong agreement with the viewpoint of this blog: that paleoclimatology is of very little use and that the only evidence for AGW is found in the GCMs.

    Of course the utility of the GCM models has not been verified. That they are based on poorly understood physics, poorly understood algorithms and are not implemented with any recognized software methodology are issues about their utility that have not been analyzed.

    I wonder what Michael Mann would say about this?

  24. Kristen Byrnes
    Posted Jun 23, 2007 at 12:35 PM | Permalink

    The rhetorical empire is getting pretty desperate now that the pictures are getting around.
    There is no excuse for what is being documented so they are doing everything to save what they can.
    Seems to me that this is the time to step up the effort and help Anthony as much as possible.
    BTW, awfully cool here in NY today despite being the day after the solstice. :)

  25. dennis
    Posted Jun 23, 2007 at 12:43 PM | Permalink

    I want to echo David Stockwell’s point (#12). If ModelE is in fact constructed using only first principles of physics, and yet is capable of modeling the current climate, that is an astounding feat. I wish Anthony would ask Dr. Schmidt if this is a correct characterization of the model. If not, then it would be helpful if Dr. Schmidt would describe exactly the inputs to the model.

  26. JP
    Posted Jun 23, 2007 at 1:26 PM | Permalink

    What if a particular grid cell is composed entirely of corrupted stations? Does that grid get eliminated, or is there a means of interpolating adjacent grid data to the poor one? How would the people at NOAA or Hadley know? Do they visit and audit these stations? What if the adjacent grids also contain stations with questionable temps? Again, how would they know?

    Schmidt has in a way admitted that the adjusted grid cell data undergoes some fairly robust corrections (i.e. corrections that are weighted to the “warm side”). I think this should be kept in mind the next time NOAA publishes a report that such and such was the hottest on record.

  27. cytochrome_sea
    Posted Jun 23, 2007 at 1:48 PM | Permalink

    Is Gavin Schmidt honest?

  28. Anthony Watts
    Posted Jun 23, 2007 at 2:14 PM | Permalink

    RE27 I don’t want questions about data and methods to turn into attacks on a person’s integrity. Dr. Schmidt is an expert on this subject; let us not lose sight of that. What is most important is to get independent reviews of the data, formulae, and methodology used in such models.

    Let’s focus on obtaining that. Let’s also focus on the task of completing the surface station survey so that we can help the entire science community do a better job of data analysis.

  29. Michael Jankowski
    Posted Jun 23, 2007 at 2:24 PM | Permalink

    So what is the actual output of ModelE?

    Pretty much trash :)

    Hansen (and Schmidt) et al admit:
    ModelE [2006] compares the atmospheric model climatology with observations. Model shortcomings include ~25% regional deficiency of summer stratus cloud cover off the west coast of the continents with resulting excessive absorption of solar radiation by as much as 50 W/m2, deficiency in absorbed solar radiation and net radiation over other tropical regions by typically 20 W/m2, sea level pressure too high by 4-8 hPa in the winter in the Arctic and 2-4 hPa too low in all seasons in the tropics, ~20% deficiency of rainfall over the Amazon basin, ~25% deficiency in summer cloud cover in the western United States and central Asia with a corresponding ~5 °C excessive summer warmth in these regions…

  30. Steven mosher
    Posted Jun 23, 2007 at 2:34 PM | Permalink

    This from a while back.

    Warwick posted a map of CRU stations for Anthony’s first area of interest (40N 120W) on this thread:

    http://www.climateaudit.org/?p=1603#comments

    Comment 65.

    Thoughts:

    1. What is the grid average and std. dev.?
    2. What does GISTEMP say for this grid in comparison?
    3. How will the new network (CRN) estimate this grid?

  31. James Erlandson
    Posted Jun 23, 2007 at 3:15 PM | Permalink

    Present-Day Atmospheric Simulations Using GISS ModelE: Comparison to In Situ, Satellite, and Reanalysis Data
    Fig. 17. SAT anomalies compared to the CRU dataset (Jones et al 1999 and updates) for the DJF and JJA seasons. (journal page 181, pdf page 29)
    The anomalies range from -10 degrees C to +10 degrees C.

  32. Phil B.
    Posted Jun 23, 2007 at 3:38 PM | Permalink

    Do the grid cell temperature series products of GISS and CRU have 2 deg/century trends for all or most of grid cells? Gavin seems to consider them artefacts if they don’t. Are individual station adjustments and station selection criteria made to assure the 2 deg/century trend? Inquiring minds are considering circular arguments.

    Phil B.

  33. Anthony Watts
    Posted Jun 23, 2007 at 4:04 PM | Permalink

    RE 30 Here is Warwick’s Los Angeles grid cell page, which has CRU stations:

    http://www.warwickhughes.com/climate/la.htm

  34. Posted Jun 23, 2007 at 4:32 PM | Permalink

    Remember the misleading Vose et al 2005 paper:

    http://www.warwickhughes.com/papers/vose05.htm

    It purported to show great agreement between CRU and GHCN, yet over the 1976-2003 period chosen by Vose et al, the two gridded datasets varied by more than 0.1°C per decade for the L.A. and San Francisco grid cells.
    In fact the L.A. grid cell was the subject of special comment in the paper.

  35. Philip B
    Posted Jun 23, 2007 at 4:44 PM | Permalink

    Re #14

    You hit the nail on the head. Developing large complex software systems is a highly error-prone activity. As a practical matter, a system can only be as correct as the data used to test it.

    This is why I have no faith in the models. Even if every single assumption made in them is correct (extremely unlikely), they will still contain many implementation errors, because it is impossible(?) to adequately test them.

  36. Anthony Watts
    Posted Jun 23, 2007 at 5:02 PM | Permalink

    Sorry to post off topic, but I just had to show this picture:

    USHCN – Urbana Ohio Waste Water Treatment Plant

    Full gallery here:

    http://gallery.surfacestations.org/main.php?g2_itemId=5322

  37. Philip B
    Posted Jun 23, 2007 at 5:03 PM | Permalink

    And I’d add, this (poor quality data) is why the term GIGO – Garbage In Garbage Out – was coined.

  38. matt
    Posted Jun 23, 2007 at 5:08 PM | Permalink

    A bit of insight into why the models might not need a starting point:

    Even if there were, the projections are based on model results that provide differences of the future climate relative to that today. None of the models used by IPCC are initialized to the observed state and none of the climate states in the models correspond even remotely to the current observed climate. In particular, the state of the oceans, sea ice, and soil moisture has no relationship to the observed state at any recent time in any of the IPCC models. There is neither an El Niño sequence nor any Pacific Decadal Oscillation that replicates the recent past; yet these are critical modes of variability that affect Pacific rim countries and beyond. The Atlantic Multidecadal Oscillation, that may depend on the thermohaline circulation and thus ocean currents in the Atlantic, is not set up to match today’s state, but it is a critical component of the Atlantic hurricanes and it undoubtedly affects forecasts for the next decade from Brazil to Europe. Moreover, the starting climate state in several of the models may depart significantly from the real climate owing to model errors. I postulate that regional climate change is impossible to deal with properly unless the models are initialized.

    The current projection method works to the extent it does because it utilizes differences from one time to another and the main model bias and systematic errors are thereby subtracted out. This assumes linearity. It works for global forced variations, but it can not work for many aspects of climate, especially those related to the water cycle. For instance, if the current state is one of drought then it is unlikely to get drier, but unrealistic model states and model biases can easily violate such constraints and project drier conditions. Of course one can initialize a climate model, but a biased model will immediately drift back to the model climate and the predicted trends will then be wrong. Therefore the problem of overcoming this shortcoming, and facing up to initializing climate models means not only obtaining sufficient reliable observations of all aspects of the climate system, but also overcoming model biases. So this is a major challenge.

    Full link here (http://blogs.nature.com/climatefeedback/2007/06/predictions_of_climate.html), it’s an informative read.

    So, the summary seems to be the model output is relative to some starting point. If the starting point drops a degree due to bad stations, then the output in 100 years drops a degree too. But the warming rate is still the same.

    I think that is how the argument might go. The point on models being linear and not knowing that things can’t get any drier than a drought is interesting.

  39. Anthony Watts
    Posted Jun 23, 2007 at 5:08 PM | Permalink

    Forgot to mention, that picture at 36 comes from surfacestations.org volunteer Steve Tiemeier.

  40. Steven mosher
    Posted Jun 23, 2007 at 5:56 PM | Permalink

    OK,

    ModelE should be no mystery. You can download it. I did. Then I started to go through it. Now, I am no novice with OPC: Other People’s Code. I cut my teeth on it. Funny, most of it was NASA code.

    I expected to find documentation. You will find none, oh, a CVS. This means you must wade through about 100K lines of code. No design docs, no specs, no test plans, no test results, no unit tests, no integration tests, no style guide, no calling tree, no drivers, no exception handling: a spaghetti farm. Is the result accurate? You could not even tell if it met its own standard, because it has none published with the code.

    Here’s one, off the top of my head: ModelE, when initialized to historically observed climate conditions 30 years ago, shall produce a climate prediction that matches the land record (within 5%) for all subsequent 30 years on a monthly basis in the following respects:

    1. Grid-level maximum temp and minimum temp
    2. Grid-level precipitation
    3. Grid-level wind speed and direction

    Then you would provide the test data set, the code, the test results, the test report. Simple. Thousands of engineers do that every day. It’s not rocket science. Done right, nobody dies.

    Very simply, if the model takes input at the grid level, and we have grid-level data about max/min temp, and grid-level wind, and grid-level precip, and grid-level…. Then the model ought to be able to hindcast this with accuracy if we expect it to forecast this with accuracy.

    How accurately? I dunno. Global warming is pretty damn important. Stopping it will be pretty damn expensive. So the code that predicts this should be pretty damn robust. So, I’d say ModelE should probably exceed the product reliability of, say, Apollo 1, Challenger, and Columbia. It’s not rocket science, this software engineering thing.

    Now, I don’t really care what the “spec” is. What I care is that they set a goal and met it. Set a goal. Met the goal. Tested that they met the goal. Published the results. And showed you how to replicate it. Not rocket science.
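
    For what it’s worth, a spec like that is easy to express as an executable check. A minimal sketch (Python, with synthetic arrays standing in for real gridded model output and observations; the 5% tolerance and the variable choice are just illustrative numbers, not an actual ModelE requirement):

      # Sketch of the kind of pass/fail hindcast check described above.
      # Synthetic data; the 5% criterion is illustrative, not an actual ModelE spec.
      import numpy as np

      def hindcast_passes(model, observed, rel_tol=0.05):
          # Both arrays are (months, lat, lon), e.g. monthly surface temperature anomalies.
          # "Within 5%" is interpreted here relative to the observed range.
          scale = np.nanmax(observed) - np.nanmin(observed)
          err = np.abs(model - observed) / scale
          frac_ok = np.mean(err <= rel_tol)
          print(f"{100 * frac_ok:.1f}% of cells within {100 * rel_tol:.0f}% tolerance")
          return bool(np.all(err <= rel_tol))

      # Example with synthetic stand-ins for gridded model output and observations
      rng = np.random.default_rng(1)
      obs = rng.normal(0.0, 1.0, size=(360, 46, 72))        # 30 years of monthly 4x5 degree fields
      model = obs + rng.normal(0.0, 0.02, size=obs.shape)   # a model that tracks obs closely
      print(hindcast_passes(model, obs))

    Publish the test data, the check, and the result, and anyone can replicate it.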

  41. Anthony Watts
    Posted Jun 23, 2007 at 6:10 PM | Permalink

    RE40 Steve, ModelE has been an ongoing work, touched by many programmers, with a lineage that goes back almost 20 years, possibly more. Done in Fortran. Code maintained over that period of time is bound to become spaghetti code. It’s the nature of programming. But there should be documentation, test plans, etc., I agree. Perhaps it’s time to make an official request for them.

  42. EW
    Posted Jun 23, 2007 at 6:24 PM | Permalink

    ModelE, when initialized to historically observed climate conditions 30 years ago, shall produce a climate prediction that matches the land record (within 5%) for all subsequent 30 years

    And does it produce such results?

  43. David Smith
    Posted Jun 23, 2007 at 6:34 PM | Permalink

    Re #36 Anthony, I think that Urbana is now in first place for most-poorly-located thermometer.

  44. Anthony Watts
    Posted Jun 23, 2007 at 6:36 PM | Permalink

    RE40

    Funny, most of it was NASA code.

    It just hit me. The code is so old, I’m betting they’ve lost a certain amount of control over it. I have the same problem in my own shop for code that goes back to 1998. Four programmers have worked on it. Documentation by the first two programmers was sparse and in some modules, non-existent. Two programmers since then have worked on it, but some modules remain as “black boxes” where data goes in, data comes out, but we don’t know what happens inside the black box because the person who wrote it isn’t there any more and didn’t leave enough comments in the code or documentation.

    With a large entity like NASA, with code spanning 20+ years, the problem has to be magnified.

    They may not know how it all works today because I’m betting the original programmers aren’t there anymore or have been reassigned to other divisions or projects. And, I’m betting this was contract work, where a spec was written but the original climatologists weren’t the programmers.

  45. Anthony Watts
    Posted Jun 23, 2007 at 6:41 PM | Permalink

    RE43, I’d have to agree. Marysville is now second. I had tears in my eyes from laughing after seeing that picture, and it wasn’t from the fumes.

  46. tetris
    Posted Jun 23, 2007 at 6:41 PM | Permalink

    Re: 38
    matt,
    I must be missing something. How can one [with a straight face at least] use a linear model to model what in its very essence is a highly non-linear system?

  47. John F. Pittman
    Posted Jun 23, 2007 at 7:03 PM | Permalink

    Well #40, #41, #44 I wonder how ModelE can meet the DQA. Several groups are launching lawsuits. Interesting link.

    http://www.ombwatch.org/article/archive/231

  48. Steven mosher
    Posted Jun 23, 2007 at 7:04 PM | Permalink

    Warwick,

    Thanks for coming back. For a while I went down the path of trying to figure out what CRU were up to in the San Fran area. As I recall, the grid (35-40, 115-120) had a mean temp of 13.5°C in the 61-90 time frame. Stations I was looking at (Marysville, Quincy, Lake Spaulding) were ±3°C from this.

    Made me think about how much junk you could hide in a grid cell.

  49. Philip B
    Posted Jun 23, 2007 at 7:20 PM | Permalink

    Re: 40

    The standard and almost universal practice in business is independent testing of software. Frequently, multiple levels of independent testing are performed. And as you point out, test scripts, data sets and results are retained as proof of both the tests and the results.

    It’s way past time for independent testing (with published results) of these models.

  50. Pat Frank
    Posted Jun 23, 2007 at 7:32 PM | Permalink

    [Gavin argues], “If you are of the opinion that this station is contaminated, then you have to admit that the process designed to remove artefacts in the GISS or CRU products has in fact done so.”

    This statement is, on its face, a non-sequitur. There is no logical connection at all between believing station data to be contaminated, and believing that the artifact removal process is valid. If this connection is what Gavin meant to convey, he’s being irrational.

    Further, if “Gavin [really does assert that] any removal of contaminated stations would improve model fit.“, then he is implicitly asserting that GCMs elaborate an exactly correct theory of climate. If he really believes that, he’s living in a fantasy world. On the other hand, maybe it’s time to re-visit this thread: http://www.climateaudit.org/?p=419

  51. Steven mosher
    Posted Jun 23, 2007 at 7:39 PM | Permalink

    RE 41.

    Anthony, I think Dan Hughes has been going down this path. SteveM links to him. Gavin has commented a few times on Dan’s site. Essentially Gavin makes the same defense:

    1. A lot of this code is legacy code. (We can’t help ourselves.)
    2. We are scientists, not software engineers. (One class would fix this.)
    3. Make constructive suggestions, don’t throw stones. (Help us.)
    4. Yes, the documentation sucks, but we make progress. (Shrugging: yes, you are right, but what’s your point?)
    5. We don’t have budget to do this your way. (Challenger, go with throttle up.)
    6. This is different from commercial code. (We’ve got 100K lines of code, for Christ’s sake!)
    7. V&V doesn’t really apply. (Err, what’s V&V again?)

    Some of these are reasonable, in the short term. But in the end, NASA needs to move to modern coding practice. Coding language, practice and documentation aside, the issue of requirements still remains.

    The clue is this: without a customer, you never set SPECS. You just make yourself happy.

    I am banned on RC. Excessive snarkiness, which Gavin, a gentleman all around, did not cotton to. I don’t blame him or complain. I suspect we met on a playground in the past.

    Dan Hughes has better credentials on this and is less snarky.

  52. Steven mosher
    Posted Jun 23, 2007 at 8:34 PM | Permalink

    RE 42..

    How ModelE performs… Found this. I’m sure subsequent stuff exists. I’m sure these issues were addressed. I’m sure it performs better today….

    I’m a couple pages into this. It needs more eyes.

    http://pubs.giss.nasa.gov/docs/2000/2000_Russell_etal_2.pdf

    MONEY QUOTE for me:

    “Starting from an observed atmospheric state, zero ocean currents, and climatological ocean temperature and salinity distributions [Levitus et al., 1994], the atmosphere-ocean model was spun up for 40 simulated years with constant 1950 atmospheric composition. From this spin up state, three simulations were integrated from 1950 to 2099: a control simulation that continues the spin up run, a GHG experiment with observed greenhouse gases up to 1990 and compounded 0.5% CO2 annual increases thereafter, and a GHG+SO4 experiment with the same varying greenhouse gases plus varying tropospheric sulfate aerosols.
    The control simulation had a significant climate drift in its surface air temperature amounting to 0.5°C in its first 60 years but <0.1°C for the last 90 years.”

    In short, OUR CONTROL drifted (read: DIVERGED) 0.5°C in 60 years.

    I’m sure they addressed this, but since I raised the issue of calibration, I thought I’d give the link.

    I’ll keep reading, but the sharper pencils in the box should have a look.

  53. Kenneth Fritsch
    Posted Jun 23, 2007 at 8:34 PM | Permalink

    Re: #15

    Kevin Trenberth of NCAR made some astounding claims about GCMs such as “E” on Roger Pielke’s Climate Science blog. He says: “None of the models used by IPCC are initialized to the observed state and none of the climate states in the models correspond even remotely to the current observed climate.”

    and also

    “Moreover, the starting climate state in several of the models may depart significantly from the real climate owing to model errors. I postulate that regional climate change is impossible to deal with properly unless the models are initialized.”

    I have stated here that climate and climate change (as it matters to individuals in different regions) could be essentially local, and that as geographic scale is decreased in computer models these local differences become impossible, or at least significantly more difficult, to predict. Now as Steve M has noted, and I agree, we have not completely separated these effects into differences due to temperature measurement errors and real temperature changes.

    I repeat (as I am prone to do) that the local “bad” climate stuff (think drought and resulting starvation) needed to push for attempts at mitigation becomes more difficult to show with modeling and even temperature reconstructions. Global averages of temperature increases, even significantly different from today’s, would, I think, be more difficult to sell to the general public.

  54. Steven mosher
    Posted Jun 23, 2007 at 8:42 PM | Permalink

    Son, are you trying to blow smoke up my ass?

    Last quote from a ModelE paper, I promise.

    We fit the global observed surface air temperature data from 1858 to 1998 [Hansen et al., 1999] by the least squares fit exponential of the form A + B exp(C × time). Subtracting the exponential fit from the observed temperature record, we see that the 1930s (0.11°C) and 1940s (0.11°C) were the warmest decades of the twentieth century relative to their atmospheric gas compositions. Regardless of the reason, after 1950 the real world behaved as though much of its possible unrealized warming from 1850 to 1950 had been realized during the 1930s and 1940s. Thus, starting the model experiments from cold start mode in 1950 and comparing with observed surface temperatures, we would estimate an effective unrealized temperature disparity (causing the model to be colder than the observations) to be ~0.23°C. Because model results are warmer than the observations for the first 30 years, there is no clear evidence of a cold start problem. The model results are not modified for this unclear temperature disparity due to unrealized warming.
    When compared with the real world, the atmosphere-ocean model’s climate variables have errors of various magnitude in different seasons and locations. In addition, the ocean prognostic variables show systematic climate drifts of various magnitude. Thus the model’s climate predictions should be based on differences between experiments and control simulations in order to reduce deviations from observations, and for the same time periods in order to reduce climate drift. The model’s climate changes are then based on atmospheric composition changes since 1950, the year whose fixed composition is used in the controls.
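
    For anyone who wants to see what that fit amounts to, here is a bare-bones sketch of a least-squares fit of A + B exp(C × time) followed by decadal residuals (Python/SciPy, with synthetic data standing in for the Hansen et al. 1999 record; this is not the paper’s actual code or data):

      # Minimal sketch of the least-squares exponential fit described in the quote,
      # y = A + B * exp(C * t). Synthetic data stand in for the observed record.
      import numpy as np
      from scipy.optimize import curve_fit

      def fit_form(t, A, B, C):
          return A + B * np.exp(C * t)

      years = np.arange(1858, 1999)
      t = years - years[0]                        # shift origin so exp() stays well behaved
      temps = -0.3 + 0.02 * np.exp(0.03 * t) + np.random.default_rng(2).normal(0, 0.1, t.size)

      params, _ = curve_fit(fit_form, t, temps, p0=(-0.3, 0.01, 0.02), maxfev=10000)
      residuals = temps - fit_form(t, *params)    # what is left after the smooth fit
      for decade in range(1900, 2000, 10):
          sel = (years >= decade) & (years < decade + 10)
          print(decade, f"{residuals[sel].mean():+.2f} C")

    The quoted decadal anomalies (1930s and 1940s warmest relative to composition) are residuals of exactly this kind.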

  55. Steven mosher
    Posted Jun 23, 2007 at 8:50 PM | Permalink

    P.S. In this paper it looks like Gavin used GISTEMP. No Jones cites; bows to Hansen’s data.

    http://pubs.giss.nasa.gov/docs/2000/2000_Russell_etal_2.pdf

  56. matt
    Posted Jun 23, 2007 at 9:47 PM | Permalink

    Re: 38
    matt,
    I must be missing something. How can one [with a straight face at least] use a linear model to model what in its very essence is a highly non-linear system?

    I’m just the messenger.

    Re the size of the code and the quality of the code that someone else noted.

    I got into a thread on RC asking about this, as I was surprised to learn that climate models had reached 1M lines of code. If the defect density of the code were typical of industry, say 10 defects per thousand lines, then there’d be about 10,000 bugs in that code, and that assumes a fairly sizable full-time test team. With a bit of poking, I got the impression it was a bunch of PhDs cranking away on this code. Not that they can’t do a good job, but much of the world fancies themselves SW people because they write “a little C and VB here and there”. I’ll state that, having spent 9 years at one of the world’s biggest SW companies, there is a world of difference between someone who writes SW for a living and someone who writes software to make their day job easier. FWIW, I’m an EE, so it took me a while to really get that. But I do now.

    The RC folks argued that because it was mathematical code the defect rate would be much, much lower. But without a test plan, a formal test org, or 500K lines of test code, I’m not sure how they can really know one way or the other what their code defect density is.

    We had bit of a tangent discussion on what do you do if one model says the output rises 2 degrees and one says 1 degree. In most engineering, you’d dig until they agreed. What I think happened for IPCC is that they just averaged the various models. But then that made me ask “what if someone shows up with a model that says the temp goes down?” but that question didn’t make it through the filter.

  57. Steven mosher
    Posted Jun 23, 2007 at 10:47 PM | Permalink

    RE 44 Anthony

    It just hit me. The code is so old, I’m betting they’ve lost a certain amount of control over it. I have the same problem in my own shop for code that goes back to 1998. Four programmers have worked on it. Documentation by the first two programmers was sparse and in some modules, non-existent. Two programmers since then have worked on it, but some modules remain as “black boxes” where data goes in, data comes out, but we don’t know what happens inside the black box because the person who wrote it isn’t there any more and didn’t leave enough comments in the code or documentation.

    With a large entity like NASA, with code spanning 20+ years, the problem has to be magnified.

    Yes, exactly. Stuff gets added like barnacles. People move on. They all have different styles. You can tell by walking through the code that it has been written without standards or guides. You can even tell guys apart by their coding style (math heads especially). Then there is the code that nobody dares to touch. I remember the first time I stumbled on quaternion code doing a walk-through. (WTF??) It was holy grail code. Don’t touch the code, you’ll disintegrate.

    Figure a 25 man-year effort to rewrite and document the code to standard.

  58. Jaye
    Posted Jun 23, 2007 at 11:00 PM | Permalink

    It’s the nature of programming.

    It’s the nature of bad programming practice. There are ways to deal with legacy code. The way we ensure that quality code comes out of our shop is that the engineers are not allowed to write too much code… and we throw away anything that comes from a university.

    The RC folks argued that because it was mathmatical code that the defect rate would be much much lower. But without a test plan, a formal test org, 500K lines of test code,

    These people are amateurs. Large Fortran programs have big nasty common blocks with lots of array accesses, both places where errors are likely to occur regardless of the sort of logic/calculations being performed. Mosher is right: without common-sense process, testing, V&V, documentation, etc., it’s likely that the stuff is very buggy.

  59. Jaye
    Posted Jun 23, 2007 at 11:06 PM | Permalink

    Run ModelE through valgrind…you would likely be shocked.

  60. dave c
    Posted Jun 23, 2007 at 11:34 PM | Permalink

    Re #59,

    Come on Jaye, don’t hold us in suspense. What did you/vgrind find?

  61. Posted Jun 24, 2007 at 5:02 AM | Permalink

    #40 Steve Mosher.

    Uh. 5% is really wonderful for electronic or mechanical simulations. So that is a great spec.

    5% of 300K = 15K. Lousy for climate.

    Which always made me suspect the models. If well understood and well modeled systems like servos and electronic stuff are considered good if they come within 5% (continuously tested against the real world by sceptical engineers, I might add), how in the heck can climate modelers claim an understanding of a way more complex system, with so many known unknowns (let alone unknown unknowns), at better than 1%, i.e. +/- 3 deg K? It could happen. What are the odds?

    If we are going to bet trillions of $ and hundreds of millions of lives on this code it had better be built to FAA (my expertise) or FDA (similar) levels of quality. Of course the FAA guys are not dummies. They insist aircraft mfgrs build as much redundancy into the system as possible. Any one got any ideas for a test planet?

  62. Steven mosher
    Posted Jun 24, 2007 at 5:25 AM | Permalink

    RE 61.

    5%. I made it up to illustrate an example of what one would expect to
    see in the FORM of a spec. Not as an actual spec. Yes anyone who has ever delivered
    code under the watchful eye of the FAA would kinda go purple in the face when looking
    at ModelE.

  63. Posted Jun 24, 2007 at 5:38 AM | Permalink

    Since I’m working backwards I got to Steve Mosher’s #17.

    A 0.5 deg C error band? That would be 0.2%.

    On its best day, with a relatively easy system, a really good electronics model would come within 1%.

    I have been saying this at various places for over a year. There is no way the model(s) is/are as good as claimed. Then add in data errors and it gets worse. Then add calculation noise that accumulates (sometimes multiplies) at every iteration.

    However, it is worse than I feared. I always imagined a team working on a model that was coherent, cohesive, and continuous; even if the models (or at least the parameters used) were secret, they were at least coherent. Now I find it is nothing of the sort. Now I find that it is more like: if circles don’t work, add epicycles. Why? Well, they can be adjusted to give the correct result. You just keep adding epicycles until it comes out right. At least for the past.

  64. Steven mosher
    Posted Jun 24, 2007 at 5:56 AM | Permalink

    Last time I looked, ModelE was 100K LOC, not a million.

    Have a look. Oh, kudos to Gavin and company, there is an online code viewer, so they are making progress.

    http://www.giss.nasa.gov/tools/modelE/modelEsrc/

  65. JG
    Posted Jun 24, 2007 at 8:40 AM | Permalink

    I posted this comment on Anthony’s site as well:

    The data may not be used in the climate models, but isn’t it used by NASA to compare the global temperature from one year to the next so that they can issue periodic press releases declaring a particular year to be “the warmest on record”? I think that is the real problem with poor quality control of the stations. Their data are used to generate these press releases that are then picked up by the mass media who can use them – because of their sensationalist nature – to sell more newspapers and magazines.

  66. Anthony Watts
    Posted Jun 24, 2007 at 8:44 AM | Permalink

    RE64 Steve Mosher

    Here are the stats for model E

    1 Main program
    549 subroutines
    1 HTML export program

    91877 lines of code
    1108547 characters

    The subroutines calculate a wide variety of things, everything from soil type albedos to something called “canopy drip flux”, which reads like this:

    C*** computes the flux of drip water (and its heat cont.) from canopy
    !@+ the drip water is split into liquid water dripw() and snow drips()
    !@+ for bare soil fraction the canopy is transparent

    While I would imagine it would be easy enough to do empirical tests on each of these subroutines to determine how well they model reality, I would expect the respective errors of each subroutine, combined with the wildly nonlinear nature of earth’s systems, to produce a significant cumulative error in the output for any given set of starting parameters and data.
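
    A toy calculation of that intuition (a Python sketch with invented per-step error sizes; nothing here is taken from ModelE itself): chain a few hundred steps that each carry a small relative error, and the random part of the error grows roughly with the square root of the number of steps, while any systematic part compounds steadily.

      # Toy error-propagation sketch: N chained steps, each multiplying its input by (1 + eps).
      # Error sizes are invented for illustration; this is not derived from ModelE.
      import numpy as np

      rng = np.random.default_rng(3)
      n_steps, n_runs = 500, 2000
      sigma = 0.01        # assumed 1% random relative error per step
      bias = 0.001        # assumed 0.1% systematic error per step

      eps = rng.normal(bias, sigma, size=(n_runs, n_steps))
      outputs = np.prod(1.0 + eps, axis=1)                   # value after chaining all steps, per run

      print(f"systematic drift: {outputs.mean() - 1:.3f}")   # the per-step bias compounds
      print(f"random spread (1 sd): {outputs.std():.3f}")    # grows roughly with sqrt(n_steps)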

    It reminds me of tuning a piano with a flexible string anchor board, where each of the strings is a subroutine and the keyboard is the main routine. The input is sheet music. The output is sound. The piano is earth. This may be simplistic, but it’s the best I can do on a Sunday with half a cup of coffee.

    Getting one string in tune is easy enough, but adjust another and the board starts flexing, changing the tension on the other strings.

    Then, having done that, you find that the piano itself has been adjusting the length of the other strings, and the shape of the sounding cavity, and the thickness of the wood it is made of.

    No matter what you do, you’ll never get all the strings precisely in tune, because adjusting one affects the others and the system itself is dynamic.

    An off key approximation of Chopin doesn’t do much to impress confidence in the music. But like music practice, it can get better over time. Whether it ever gets close enough to be recognized as Chopin’s ‘Minute’ Waltz is the question.

  67. steven mosher
    Posted Jun 24, 2007 at 8:45 AM | Permalink

    RE 63.

    Simon, sorry, I didn’t finish that post and hit enter too early. I was going to check on the error estimate (I was working from memory). So, here it is in words, then some explanation. From Chapter 10 of IPCC WG1:

    The multi-model mean SAT warming and associated uncertainty ranges for
    2090 to 2099 relative to 1980 to 1999 are
    B1: +1.8°C (1.1°C to 2.9°C),
    B2: +2.4°C (1.4°C to 3.8°C),
    A1B: +2.8°C (1.7°C to 4.4°C),
    A1T: +2.4°C (1.4°C to 3.8°C),
    A2: +3.4°C (2.0°C to 5.4°C)
    and A1FI: +4.0°C (2.4°C to 6.4°C).

    But these are not really error bands. They are the range of results given by the models. Now, what are B1, B2, etc.? These are emission STORIES. So B2, for example, is a projection of GHG emissions from now until 2100. In that story the models produce results that range from a 1.4 to 3.8°C increase! So, a bit bigger than I remembered (I misremembered it as ±0.5°C).

    Anyway, the dirty little secret is the emission scenarios. When you look at all these SRES you’ll see they are all considered equally probable (this has since been updated).

    SO, when they say warming will be from 1.1 to 6.4°C, they are looking at vastly different emission scenarios with many models. Within a given scenario, results vary as you see above.

    Now, any given model, let’s say ModelE, may have a narrower band of results, but when you combine all “19” models you get results like 1.4°C to 3.8°C. Comforting, huh?
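
    To make the arithmetic concrete, here is a sketch of what a range like that amounts to on this reading: the spread of individual model results within one scenario, not a confidence interval from any single model. (Python, with invented per-model numbers; the actual AR4 ranges were assessed, not computed this crudely.)

      # A scenario "range" read as the spread of per-model results, not an error bar on one model.
      # The warming numbers below are invented for illustration only.
      model_warming = [1.1, 1.5, 1.7, 1.8, 1.9, 2.2, 2.9]   # deg C, 2090-2099 vs 1980-1999, one scenario

      mean = sum(model_warming) / len(model_warming)
      print(f"+{mean:.1f} C ({min(model_warming)} C to {max(model_warming)} C)")   # +1.9 C (1.1 C to 2.9 C)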

  68. Bob Meyer
    Posted Jun 24, 2007 at 8:59 AM | Permalink

    M Simon said:

    Any one got any ideas for a test planet?

    No, but Schmidt does. He wants to test his model on the earth.

  69. Stan Palmer
    Posted Jun 24, 2007 at 9:07 AM | Permalink

    Here are the stats for model E

    1 Main program
    549 subroutines
    1 HTML export program

    91877 lines of code
    1108547 characters

    This is not a big program. If it is a spaghetti farm, and considering the billions that are now being mandated and spent for AGW causes, then it would seem a simple matter to have it rewritten into a useful form. In itself, this effort would probably be cheaper than fixing the bugs that are created and triggered by each modification. However, if there is no verification effort, bugs will remain undiscovered, so perhaps the spaghetti farm technique is cheaper after all.

  70. tetris
    Posted Jun 24, 2007 at 9:37 AM | Permalink

    Re: 54 and 68
    On the basis of this voodoo “science”, governments around the world are now implementing trillion dollar “policies” purporting to mitigate carbon emissions. These “policies” have recently started impacting conventional liquid hydrocarbon markets, the consequences of which are interesting to ponder if we are indeed heading towards a cooling period. Much more disturbing in the short term, access to staple foods is being disrupted [so we can make ethanol we don't need], driving key food crop prices through the roof around the world with disastrous consequences for those who can least afford it. And no one involved in producing this shambles will ever be held accountable.

  71. steven mosher
    Posted Jun 24, 2007 at 9:45 AM | Permalink

    Anthony,

    I loved the piano tuning analogy. In addition, you wrote:

    C*** computes the flux of drip water (and its heat cont.) from canopy
    !@+ the drip water is split into liquid water dripw() and snow drips()
    !@+ for bare soil fraction the canopy is transparent

    While I would imagine it would be easy enough to do empirical tests on each of these subroutines to determine how well they model reality, I would expect the respective errors of each subroutine, combined with the wildly nonlinear nature of earth’s systems, to produce a significant cumulative error in the output for any given set of starting parameters and data.

    Yes, the issue with big math models is not bugs. You fix the underflows and overflows and index-out-of-bounds errors very quickly. Pardon my French, but “… hangs”, so you fix it. The issue is the subtle little errors that creep in. The control simulations that “drift”. See my post a ways back on the ModelE paper.

    Finding these “errors” is tough.

  72. steven mosher
    Posted Jun 24, 2007 at 10:21 AM | Permalink

    RE 69.

    Yes, with 100K lines of F90, I guessed a 2 LOC per hour rewrite. Even at 1 LOC/hour we are talking 50 man-years and maybe $10M-15M. (Domestic labor… grin.)

    I think it’s worth twice $15M to bring the coding up to standards. The real benefit would be credibility. I don’t see why they don’t get that.

    Anyway, I’m looking at the MIT GCM, which has a team of scientists and software engineers working together. An approach I can vouch for.

    It’s an interesting problem. The scientist types gravitate to Fortran. It’s a world they have to live in. The software types run from that pidgin (OK, I’m being unfair to Fortran). But still, it’s the slide rule of programming (oops, I did it again).

    Written properly, Fortran is fine. (Ducking tomatoes.)

  73. steven mosher
    Posted Jun 24, 2007 at 10:34 AM | Permalink

    RE 65.

    Interesting slant. Now that’s a DQA lawsuit that might be interesting. Hansen loves to talk.

  74. Curt
    Posted Jun 24, 2007 at 11:24 AM | Permalink

    steven mosher:

    My brother is a geophysicist and I am an engineer. From time to time I tease him about his use of Fortran. His defense has always been that the math libraries are better, or at least better validated, than those in C. He felt he could trust the Fortran libraries, but not the C libraries. That was several years ago, so I should see if his opinion has changed.

    At my company, after almost two decades of evolving a generation of code, we are making a clean break and starting from scratch for a new generation. Too much crud builds up as something is patched here and tacked on there for the issue or crisis of the day. This new effort is taking a long time, and “the money” is getting impatient, but I really don’t think there’s an alternative for us.

  75. Michael Jankowski
    Posted Jun 24, 2007 at 12:18 PM | Permalink

    Re #74, Fortran libraries were always a big selling point. Frankly, I hated Fortran and found it peculiar that a language so often preferred by scientists was one in which matrix notation was the opposite of what is used seemingly anywhere else in the real world. Too many profs still around who grew up using it almost exclusively and who aren’t up for learning new tricks.

  76. Posted Jun 24, 2007 at 1:52 PM | Permalink

    Steve Mosher #67,

    My statistical understanding is weak, so maybe someone will correct me, but shouldn’t the error bands be symmetrically distributed?

    i.e. -1/+2 for one type of calculation. -2/+1 for another?

    Admittedly the sample size is small but what does it mean when the error bands are biased in a given direction?

    BTW thanks for the numbers. Giving them the benefit of the doubt, we have climate models that are good to say +/- 2 deg. C. About .7%

    Yep. Way more realistic than .2% Cough.

    Has some one tested the canopy drip factor? How about calculation drift from rounding errors? How many iterations make up a year? A decade? A century?

    To get some idea of calculation drift for small differences, you could use say 2.00000 for a parameter and then change it to 1.9999847412109375 (i.e. 2 - 1/2^16) and see what the runs look like. The same on the high side. Then a little farther up and down. Hopefully your results should be roughly evenly distributed for small differences. What would be worrisome is if some small differences give results wildly off the trend line for a given calculation. I suppose annealing could be invoked. The question is – has anyone tested the belief in annealing?
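
    A minimal Python sketch of that perturbation test (the “model” here is a stand-in logistic iteration, since the real GCM obviously isn’t callable from a blog comment):

    def toy_model(param, steps=1000):
        # placeholder for a full model run; one line stands in for a time step
        x = 0.1
        for _ in range(steps):
            x = param * x * (1.0 - x)
        return x

    base = toy_model(2.0)
    for delta in (2**-16, 2**-12, 2**-8):
        lo = toy_model(2.0 - delta)
        hi = toy_model(2.0 + delta)
        print(f"delta={delta:.8f}  low shift={lo - base:+.3e}  high shift={hi - base:+.3e}")
    # Smooth, roughly symmetric shifts are the reassuring outcome; results wildly
    # off the trend line for tiny deltas would be the worrying case described above.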

    In reality you not only have to have parameters, but error bands for the parameters. Error bands for the data. Then you Monte Carlo that stinkin pile (assuming you know the likely distribution profile of each of the errors) and you do a run. Ten thousand runs and maybe you know something. Not counting things like leaving out a UV model because it is “too complicated”. But hey canopy drips are well modeled so it shouldn’t matter. I wonder if the height of the trees in the canopy make a difference? Have they modeled the variations in pest infestation of the trees vs temperature? The variation of growth rates vs. CO2? Well you know. How much stuff could they account for in 100,000 lines of code? Which is why I find that number astounding. As in way too small.
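
    And a sketch of the Monte Carlo idea, with a made-up two-parameter “model” and made-up error bands standing in for the real thing:

    import random

    def toy_model(params):
        # stand-in for a full model run
        return 0.3 * params["sensitivity"] + 0.1 * params["cloud_factor"]

    param_bands = {                       # assumed means and standard deviations
        "sensitivity":  (3.0, 0.7),
        "cloud_factor": (1.0, 0.3),
    }

    results = []
    for _ in range(10_000):               # "ten thousand runs and maybe you know something"
        draw = {k: random.gauss(mu, sd) for k, (mu, sd) in param_bands.items()}
        results.append(toy_model(draw))

    results.sort()
    print(f"5th percentile ~ {results[500]:.2f}, 95th percentile ~ {results[9500]:.2f}")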

    I’m doing some work in plasma physics these days. That stuff is very hard to model when it comes to large numbers of particles because everything affects everything. Charges attract particles. Moving particles generate magnetic fields. Magnetic fields bend particle trajectories. Which affects the magnetic fields. When the magnetic fields move that creates an electric field. Then you add fixed magnets to the fixed charges. Then you add variable over time charges and magnetic fields. Out of all this if you have divided your working area into sufficiently small chunks the model might give you some idea of trends. Did I mention you have to add in particle transmutations as the collision energies get sufficiently high? The only way to be sure however, is to do the experiment.

    Now, as I see it, “climate science” is vastly more complicated than plasma physics. Has anyone done runs with smaller chunk sizes to see if there is convergence on a solution?

  77. Posted Jun 24, 2007 at 1:59 PM | Permalink

    Has any one done runs with smaller chunk sizes to see if there is convergence on a solution?

  78. DocMartyn
    Posted Jun 24, 2007 at 2:47 PM | Permalink

    I invent a drug and test it on 300 rats; my drug is aimed at preventing them from dying of sepsis, so I give them a dose of excreta in the belly and either my test compound or a sugar-solution placebo.
    150 rats get the placebo and 150 get the drug.
    50 rats that get the drug die the same day; however, despite the fact that none of the controls die, I exclude these results because I believe that it was the shock of giving them the drug, not the drug itself, that killed them.
    I can now compare 100 treated with 150 controls.
    Half the control rats die at day three and 80 drug-treated rats die in the same time period. However, I note that most of the surviving rats were a little bit darker than the controls. It is obvious that the drug only works on dark-colored rats. I note that almost all the dark-colored rats have survived and so I redo my calculations.
    Of the controls, half die of sepsis; however, in a model testing my drug I found that 20 out of 23 of the right sort of rats survived.
    I publish the results and wait for the money to pour in from the drug companies.

  79. matt
    Posted Jun 24, 2007 at 4:09 PM | Permalink

    Last time I looked ModelE was 100K LOC. not a million.

    Yes, but realize you don’t get EVERYTHING with the download. There’s a bunch of secret sauce that they hold back. :)

    Here’s an article noting many current climate models are a million LOC: http://physicsweb.org/articles/world/20/2/3

    My point on RealClimate was that software projects of this size that were being run as a university thesis were in big trouble. A company like MSFT would have 20-30 devs and 30-50 testers alone working on that. And that’s a smallish effort for something like photo viewer and sync software. It’d be interesting to hear what level of validation Boeing might put into a 1MLOC piece of code that was responsible for something very important on an airplane.

    I’d also asked about sensitivity analysis: whether work was done to determine if a small change in an initial condition led to a larger change down the road. They claim it didn’t.

    I’m still mystified by that because if I change an interest rate just a little bit in my 401K model it can have a huge result 100 years in the future. And since climate seems to be a big question of how the various physical systems store, accumulate, transfer and release energy, it seems that if you are off just a bit on the amount of coupling between each system, then over 100 years the errors could be substantial.
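
    The 401K analogy made explicit (numbers invented; the point is only the compounding):

    principal = 1.0
    for rate in (0.050, 0.055):
        value = principal * (1 + rate) ** 100
        print(f"rate {rate:.3f} -> factor of {value:.0f} after 100 years")
    # roughly 132x vs 211x: a ten percent error in the rate becomes a sixty
    # percent error in the century-scale outcome.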

  80. steven mosher
    Posted Jun 24, 2007 at 5:32 PM | Permalink

    RE 79.

    Well, matt, don’t believe everything you read. If you can’t see the code, don’t believe the line count.

    ModelE and the MIT GCM are the only ones I’ve looked at. Have not counted the latter.

    I like your question about sensitivity analysis. That’s math model testing 101. Funny,
    when you look at the 2000 article they say the control simulation drifted .5C after
    60 years. I read that as NO input change and .5C growth after 60 years.

    The bottom line is you should not have to ask the question. The proper way to do things is
    to have a properly documented test plan, test files, test results, and test reports.

    Any Boeing engineer could show them how.

  81. BradH
    Posted Jun 24, 2007 at 5:56 PM | Permalink

    I find this astonishing. It’s like playing the pea-under-cup game.

    First, the cup labelled “hockey stick proxy” is lifted and there’s no pea.

    Next, the cup labelled “short-term model forecasts vs. actual outcomes” is lifted and, once again, no pea.

    Finally, the “model initial state” cup is lifted and…guess what?…no pea!

    When will so-called “science” journalists come to realize that the whole thing is a charade? That they’ve been had?

  82. steven mosher
    Posted Jun 24, 2007 at 6:39 PM | Permalink

    RE 74 Curt.

    The argument never changes. The math libraries are better, the math libraries are better.
    So I tested that. Ha! It depends.

    This is not about language choice. Good code can be written in every language. And good programmers
    should know all the major languages; it’s not rocket science. The issue is this:

    My toaster has to have UL certs.
    My computer has to have FCC certs.
    NASA’s global climate model should have to pass a certification test.

    You see, they cannot have it both ways. They cannot, on one hand, argue that their modelling
    is mission-critical to planet Earth and then, on the other hand, wave off quality questions.

  83. steven mosher
    Posted Jun 24, 2007 at 7:09 PM | Permalink

    RE 81.

    Brad, don’t rely on journalists to do the job. You have a camera? Type in
    http://www.surfacestations.org. Find a weather station in your area. Take some photos.
    If you have a kid in school, make this her science project as opposed to
    building the usual vinegar-and-baking-soda volcano simulation. The kid
    will learn about Google Earth, lat/lon, satellites, instrumentation, calibration,
    statistics…

  84. Jaye
    Posted Jun 24, 2007 at 9:33 PM | Permalink

    Getting one string in tune is easy enough, but adjust another and the board starts flexing, changing the tension on the other strings.

    Or equally as challenging is to tune multiple Zenith/Strombergs…

  85. Jaye
    Posted Jun 24, 2007 at 9:35 PM | Permalink

    !@sum rtsafe use Newton-Rapheson + safeguards to solve F(x)=0
    !@auth Numerical Recipes
    !@ver 1.0

    Hope they paid the license fees.

  86. Jaye
    Posted Jun 24, 2007 at 9:44 PM | Permalink

    It’d be interesting to hear what level of validation Boeing might put into a 1MLOC piece of code that was responsible for something very important on an airplane.

    Safety-critical systems have to go through the most stringent software testing of any domain I can think of. It goes beyond just the software to the code/RTOS/hardware interactions. They usually have to be able to prove that their code is analyzable in certain ways, like showing that certain processes will absolutely perform on time, that memory is partitioned correctly, etc.

  87. Posted Jun 25, 2007 at 2:52 AM | Permalink

    Jaye #86,

    If the testers are really good they will use “impossible inputs” to look for problems.

    I did that with some engine monitor software I designed. First rule – the designer never gets to test his own stuff. Well, I put on my take-no-prisoners hat and squeezed that sucker. Last test of the day, last test before we sealed the config files. Normal input was 0 to 1,000 Hz. I put in 10,000 Hz. A real engine would have been pieces scattered over the universe at that speed. Just for fun. Found a race condition, which would have caused a glitch in 1 start-up in 100. My bosses were (a) pissed (didn’t you RTFM?) and (b) relieved (do you know the cost of fixing that stuff when it is out in the field? Especially if it is intermittent?). Took about a day and a half to fix and another 3 days for retest.

    You can find out all kinds of interesting stuff about software if you don’t play nice.
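
    A minimal sketch of that “impossible inputs” idea in Python (the handler below is a hypothetical stand-in, not the actual monitor code):

    def handle_rpm_signal(freq_hz):
        """Hypothetical stand-in for the engine-monitor input handler."""
        if not (0.0 <= freq_hz <= 1_000.0):     # spec'd input range, 0-1,000 Hz
            raise ValueError(f"frequency out of range: {freq_hz}")
        return freq_hz * 60.0                   # placeholder conversion

    # Inputs a real engine could never produce; each one should fail loudly.
    for freq in (-1.0, 10_000.0, float("nan"), float("inf")):
        try:
            handle_rpm_signal(freq)
            print(f"PROBLEM: {freq} was silently accepted")
        except ValueError as err:
            print(f"OK, rejected: {err}")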

  88. Posted Jun 25, 2007 at 3:50 AM | Permalink

    Jaye #86,

    Don’t forget one of the most important functions: failure will be recognized. And recorded. Before it all dies. Or reboots.

    Are there even any internal failure criteria for GCM software? Other than divide by zero?

  89. MarkW
    Posted Jun 25, 2007 at 5:35 AM | Permalink

    #36,

    Is the probe actually mounted to the south facing side of a brick building? I see all the other problems as well.

  90. MarkW
    Posted Jun 25, 2007 at 5:46 AM | Permalink

    #51,

    It takes a lot more than a couple of classes to turn anybody into a programmer.
    I’ve been a professional programmer for over 20 years, and I’m still learning.

    Most of the worst code that I’ve ever been called on to fix, has been written by people with just one or two classes, who then convinced themselves that they were now programmers.

    If these models aren’t being written and maintained by teams of programmers, led by someone with at least 10 to 20 years of experience writing big systems, then that’s another reason not to trust the models.

    Given the importance of this issue, these models should be required to meet the same standards that aeronautical code is required to meet.
    If this code is wrong, then millions, if not billions of people are going to suffer, and some will die, for no reason whatsoever.
    (Ask any economist, reducing net worth kills people.)

  91. Reference
    Posted Jun 25, 2007 at 5:46 AM | Permalink

    From a cursory look at the docs and code it appears feasible to audit ModelE, in particular the key subroutines can be identified, analyzed and their behavior determined.

    Running the full model may need a Beowulf cluster (anyone got access to one?) or if possible, a BOINC port.

    So the question is: how useful would it be to audit ModelE?

  92. MarkW
    Posted Jun 25, 2007 at 6:01 AM | Permalink

    #79,

    In my experience, compared to writing specs, test specs, test cases, and test procedures, and then running and documenting your tests, writing the code is the easy part.

    I would say that testing and documenting the tests takes at least twice as much time as the actual coding.

  93. MarkW
    Posted Jun 25, 2007 at 6:10 AM | Permalink

    For DO-178B, Level A code, you have to PROVE that every line of code has been executed in your tests. You have to prove that every branch was taken, and every combination of conditions in your conditionals was tested.

    For example

    if(A && B && C)
    {
    }

    your tests have to contain the conditions
    A == false, B == false, C == false
    A == true, B == false, C == false
    A == false, B == true, C == false
    A == true, B == true, C == false
    A == false, B == false, C == true
    A == true, B == false, C == true
    A == false, B == true, C == true
    A == true, B == true, C == true

    You have to document that you have tested each case.
    You have to document that you have tested each line.
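
    (Strictly speaking, Level A mandates MC/DC, which needs only a subset of the full truth table, but enumerating and recording every combination is the simplest way to show nothing was missed. A throwaway Python sketch, with a stand-in predicate rather than real flight code:)

    from itertools import product

    def decision(a, b, c):
        return a and b and c          # stand-in for `if (A && B && C)`

    for a, b, c in product((False, True), repeat=3):
        print(f"A={a!s:5} B={b!s:5} C={c!s:5} -> branch taken: {decision(a, b, c)}")
    # Eight rows, each one a test case that has to be executed and documented.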

  94. BradH
    Posted Jun 25, 2007 at 8:09 AM | Permalink

    RE 83

    Steven,

    I’ve already identified the sites in my area. Unfortunately, the ones I can get to relatively quickly have limited temperature histories (back to 1940’s – 1950’s), but I’ll endeavour to visit them as soon as I can.

    However, you ask why I would bother to wait for science journalists. It’s because once, a long time ago, a journalist interviewed someone who was concerned about the warming effect we might have on the atmosphere. The journalist wrote a story…eventually, quite a few journalists wrote stories and asked questions for which politicians had no answers…however, politicians had budget allocations and they could ask the scientists to investigate it. Well, if a government department says: Funding will be provided for investigation into whether and, if so, how much industrial activities are heating the Earth…there might be a few takers!

    Journalist finds someone with a scary theory -> journalist writes stories which sell more papers -> journalist asks politician what they will do about it -> politician expresses concern at extent of problem (and blames previous government for inaction) -> politician funds study, so can give good answer next time they meet journalist -> journalist writes story saying the problem is so dire, that the government will study it and report back.

    99 times out of 100, that’s the last we hear of it. However, 1% of the time, the subject matter captures the attention of the population. When that 1% happens, journalists (and, by extension, politicians) are relentless.

    So, that’s what I mean when I say that science journalists need to understand that there are now no peas under the cups.

    Climate Audit vs. Fox News? I don’t think so. Science journalists being convinced that predictions of our doom are based on a future-machine model which can’t replicate either past or present reality, and so completely ignores it, yet still pretends to predict the future? Now, that would be priceless!

    Cheers,

    BradH

  95. steven mosher
    Posted Jun 25, 2007 at 8:14 AM | Permalink

    BradH.

    well put. I concur. I was just trying to drum up support for Anthony’s project.

    Cheers

  96. BradH
    Posted Jun 25, 2007 at 8:26 AM | Permalink

    Steven,

    It is definitely a worthwhile project and, needless to say, extremely compelling and media-friendly.

    The opinions of scientists and predictions of models are all open to interpretation; photographic evidence, however, is especially powerful.

  97. DeWitt Payne
    Posted Jun 25, 2007 at 9:18 AM | Permalink

    Re: #84

    Or equally as challenging is to tune multiple Zenith/Strombergs…

    Two SU’s on my 1965 MGB were doable with a flow meter. More than two may be equivalent to the three-body gravitational problem: there is no analytic solution and the system may be chaotic.

  98. Anthony Watts
    Posted Jun 25, 2007 at 10:07 AM | Permalink

    RE 97, I used to own an Austin-Healey Sprite…with SU carbs…I know the challenge. Try tuning a Jaguar V12 with them. And then there was Lucas, the “Prince of Darkness”.

  99. Ian Foulsham
    Posted Jun 25, 2007 at 10:31 AM | Permalink

    Surely the “requirements” are that the models show catastrophic CO2-induced warming?
    Therefore the models are tested so that they meet and, in this case, exceed requirements.

    Software testing is my career, and I do it rather well, even if I say so myself.
    Unfortunately, I am not sure I can help a lot, as it seems to be almost untestable, and my experience is not of safety-critical systems. Also, my (poor) understanding is that it is not so much the code, but the parameters that you pass into the code, including all those constants that are added to make the figures match.
    However, if you are going to do it, I would suggest initially testing the “sub-routines” in isolation and seeing if they produce expected results, but again you would need to know what the expected results should be, and that probably isn’t easy. This is probably the easiest part, but the least likely to yield results since, being easier to test, it probably already has been. The problem is going to be integration testing, i.e., putting all those little bits together.

    On the other hand, I haven’t followed all the posts here, but has anyone mentioned J Scott Armstrong, forecasting principles and the climate bet website? Apparently he has a shed load of rules for economic forecasting and forecasting models in general. I don’t know who rattled his cage, but it makes for hilarious reading. No doubt his Wikipedia entry will be adjusted to mention denialism, Big Oil and not being a real scientist.
    He has a few rules on how to test a model “in the round” as we testers like to say.

  100. Steve McIntyre
    Posted Jun 25, 2007 at 10:36 AM | Permalink

    Ellingson did an inter-comparison of infra-red code in GCMs around 1991. As I recall, he found that some of the GCMs had what could only be described as incorrect code for CO2, with differences reaching an order of magnitude greater than the effect of 2xCO2. He archly observed that the one thing that the models agreed on, regardless of whether they had correct IR coding, was the impact of 2xCO2, at least raising the question in his mind of tuning. While the IR codes have changed, I doubt that the incentives have.

  101. Steve McIntyre
    Posted Jun 25, 2007 at 10:40 AM | Permalink

    In the climateprediction.net experiment, runs were made of the Hadley GCM using different parameterizations. As I recall, a proportion of the runs led to equatorial cooling and these were rejected from the sample. I thought about examining the rejection protocol a while ago, but never got around to it.

    The ability to select runs must also be considered in assessing the models.

  102. Anthony Watts
    Posted Jun 25, 2007 at 11:28 AM | Permalink

    For anyone who wants to try out modeling, here is a resource worth trying:

    http://edgcm.columbia.edu/

    It’s called EdGCM and is designed to introduce climate modeling to education. It appears to be ModelE Fortran code with a pretty front-end GUI applied by programmers at Columbia University, just down the hall from Gavin, since GISS is housed at Columbia U.

    I’ve been able to make it run even on the nefarious Vista operating system. There’s also a Mac version. It’s better documented than E as far as I can tell.

    It may be worthwhile to try some runs…I’d do it but my resources are maxed at the moment.

  103. Jaye
    Posted Jun 25, 2007 at 11:32 AM | Permalink

    I had a GT-6 with a 3 carb intake.

    Coolest car I ever owned was a Jensen GT… wish I still had it.

  104. Steve Sadlov
    Posted Jun 25, 2007 at 11:48 AM | Permalink

    Is this some sort of NASA endgame? Or perhaps, is this the endgame of a particular faction at NASA who feel embattled now that the manned mission folks seem to have gotten some concessions of late?

    The age old saw that doomed manned missions via decisions made during the late 1960s was “it is immoral to undertake manned, plant the flag, missions, while we have sooooooooo many problems here on earth.” Gavin must be part of this crowd (albeit an apprentice, since the original folk have largely retired). Just a thought …..

  105. Steve Sadlov
    Posted Jun 25, 2007 at 11:49 AM | Permalink

    BTW – I hereby admit to the world that I suffer from a pro-big-science bias. ;)

  106. steven mosher
    Posted Jun 25, 2007 at 11:52 AM | Permalink

    Thanks Anthony,

    I downloaded that model a while back. It’s based on an older version of ModelE. I trashbinned
    it when I found the ModelE source. Anyway, you got it to run under Vista? ( ha)
    My main interest was seeing if one could do a proper hindcast with it.

    Oh, have you had a look at the British Columbia lighthouse site cited on another thread here?
    Lovely, lovely stuff.

  107. Anthony Watts
    Posted Jun 25, 2007 at 12:01 PM | Permalink

    RE 106: no, I didn’t see the BC lighthouse site. Where is it?

  108. samoore
    Posted Jun 25, 2007 at 12:03 PM | Permalink

    RE: #98, “The Prince of Darkness”

    I had an MGB for about 7 years. By the time I resold it, I had replaced almost every electrical component with a Bosch part.

  109. Jim Edwards
    Posted Jun 25, 2007 at 12:16 PM | Permalink

    #104, Steve Sadlov:

    It reminds me of Feynman’s post-Challenger investigation of the management – engineering disconnect at NASA. NASA had designed the shuttle to operate well outside the envelope of prior engineering experience, so evaluating risk was problematic. Engineers estimated the risk of catastrophic failure of the shuttle at ~1% [a close prediction of the actual rate we've seen over the past 20+ years], while managers stubbornly stuck to the line that “manned space flight necessarily had a failure rate on the order of 0.001%.” ["Necessarily" b/c if Congress thought the failure rate were 1000x higher than what NASA administrators had been advising them was the case, the shuttle might not be funded...]

    The difference is that the shuttle program had technical and policy factions working on manned space flight, here it appears that the policy wonks are in charge of the “science.”

  110. Steve Sadlov
    Posted Jun 25, 2007 at 12:53 PM | Permalink

    RE: #109 – The shuttle is an excellent/bad example of how the policy wonks convinced Washington that they should kill heavy lift, Moonbase, Mars, etc., while the technical folks figured that the shuttle would be some sort of compromise to allow at least some sort of manned program – albeit one stuck in LEO. Even within the auspices of this compromise, further compromises were made. The original shuttle concept as done by Von Braun included a heavy-lift recoverable booster with the lifting body on top – no strap-on solid boosters and tank. A variation was a two fixed-wing craft approach where the winged booster flew with the mission ship on its back up to about 100K feet, then the mission ship shot up to orbit from there, the booster ship returning a la SpaceShipOne’s mother ship. The wonks must be feeling a bit embattled right now, with Moonbase and Mars at least being nominally worked on again. I do also note some hearty missives against the current management over at Rabett Run.

  111. Jaye
    Posted Jun 25, 2007 at 1:17 PM | Permalink

    I live in Huntsville, so this NASA stuff, especially Ares, gets lots of press. Rumor has it that a bunch of engineers were at the Space and Rocket Center (we have two Saturn V’s, a couple of 1B’s and a few F-1 nozzles lying around) measuring everything about the F-1 engines they could think of. Unfortunately, not many drawings have survived; VB just did things by intuition, then added 3x to the margins.

  112. steven mosher
    Posted Jun 25, 2007 at 1:37 PM | Permalink

    re 107.

    here’s the link Anthony

    http://www.fogwhistle.ca/bclights/

    Every year or two they open up positions. Wonder if one has to be a canuck?

  113. steven mosher
    Posted Jun 25, 2007 at 1:50 PM | Permalink

    re 90.

    MarkW, it is uncanny how many flight control or aircraft system programmers see the problem
    with ModelE and other GCMs.

    Imagine a system that had a non-linear response (rapid temp growth), but a HUGE delay
    in showing effects from input changes.

    In short, assume AGW theory is true. Assume we make huge changes (pull back on the stick).
    The changes won’t show up for decades. Now, you try to fly that plane. Put in a full input and
    convince other people that the effect will show up 30 years down the road.

    And they thought saving SSI was a political nightmare.

    True, one class won’t fix the NASA code quality issue. But these guys are not dumb bunnies.
    They are merely C students. (Ever notice that about NASA guys?)

  114. steven mosher
    Posted Jun 25, 2007 at 2:14 PM | Permalink

    re 100 and 101.

    Was that the distributed computing experiment? I think I signed up for it just for grins.
    Core dumped.

    A couple of points on ModelE and code like this:

    1. When I downloaded ModelE I read the FAQ (out of the ordinary for me). I found this:

    7) The model crashed in the dynamics near the pole. What should I do?

    Occasionally (every 15-20 model years), the model will produce very fast velocities in the lower stratosphere near the pole (levels 7 or 8 for the standard layering). This will produce a number of warnings from the advection (such as limitq warning: abs(a)>1) and then finally a crash (limitq error: new sn

    Now, I made a snarky comment to Gavin about this and he said this was fixed.

    Your point about the rejection protocol is spot on. Physical models of this complexity
    will simply go ballistic for utterly unknown reasons (especially on a spherical surface).
    Then you either detect and artificially bound those conditions… or fix the bug… or refine
    the math library… or not allow the inputs that screw the result up.

    It’s not a pretty business.
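
    A bare-bones sketch of the “detect and artificially bound” option (thresholds and function names invented; real models do this inside the time loop):

    import math

    MAX_ABS_WIND = 500.0   # m/s, hypothetical sanity limit for one model level

    def check_state(step, winds):
        """Raise if any wind value is NaN or has gone ballistic."""
        for w in winds:
            if math.isnan(w) or abs(w) > MAX_ABS_WIND:
                raise RuntimeError(f"run rejected at step {step}: wind = {w}")

    # Usage inside a hypothetical time loop:
    # for step in range(n_steps):
    #     winds = advance_one_step(state)   # advance_one_step is a placeholder
    #     check_state(step, winds)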

  115. pk
    Posted Jun 25, 2007 at 2:18 PM | Permalink

    RE: 114 – then it’s a good thing the model doesn’t accurately model what is really going on…lest we have things going ballistic on Earth for utterly unknown reasons.

  116. Steve Sadlov
    Posted Jun 25, 2007 at 2:19 PM | Permalink

    RE: #111 – the rumor is true. Those engineers were, as I understand it, from LMMSC Sunnyvale.

  117. steven mosher
    Posted Jun 25, 2007 at 2:29 PM | Permalink

    RE 104:

    SteveS, I don’t like ascribing motives to folks, but it would seem to me that the
    institution of NASA would feel it was in its interest to move toward projects
    that did not come to explosive conclusions in the lifetimes of top administrators.

    So, just enough manned stuff to keep that marketing pitch alive (commercial space travel
    will take that gem away IMHO) and focus on the things you can’t get wrong… er, too wrong.

  118. steven mosher
    Posted Jun 25, 2007 at 3:14 PM | Permalink

    RE 116.

    Hmm. LMMSC. I’ve heard tell of a lot of activity thereabouts recently….

  119. steven mosher
    Posted Jun 25, 2007 at 3:38 PM | Permalink

    RE 100,

    In the past, too many years ago to matter, I had to work with “VALIDATED” code.
    God rest your soul if you ever found an error (I found a couple of whoppers).
    One invalidated 5 years of design work. Oops.
    First they deny. Then they attack. Then they get quiet. Then they implement your fix
    and take credit. Then they claim it really doesn’t matter. Why did you fix it, then?

    So, I see this stuff. Déjà vu kinda thing.

    PS: My sense is you would have been a fun SOB to work for. Just sayin’.

  120. Anthony Watts
    Posted Jun 25, 2007 at 4:17 PM | Permalink

    Here is one of the latest to come in from Joel McDade

    http://gallery.surfacestations.org/main.php?g2_itemId=6728

    The discarded appliance is a nice touch.

  121. steven mosher
    Posted Jun 25, 2007 at 4:38 PM | Permalink

    #120.

    Anthony, I have been looking at this DQA thing (the Data Quality Act).

    I’m not sure if NASA Goddard is covered.

    Anyway, it could make a very interesting lawsuit.

  122. Neil Fisher
    Posted Jun 25, 2007 at 5:19 PM | Permalink

    Re #97:

    Many years ago I worked on some (then) ancient UHF radio gear. In order to get the ~2GHz needed, they pumped 25W @ ~175MHz into a passive cascaded multiplier: 2 times into 3 times into 2 times (12 times total). Output was 1.2W. The tuning was so critical that the driver (175MHz) was *never* replaced separately from the multiplier and the next stage (up-convertor) – instead all three subsystems were replaced as a “chain”. It would take 3-5 days to tune the multiplier to spec – start at one end, go to the other; then back again; then back again, etc., ad nauseam. Then, when you thought you had it right, connect it to the up-convertor and most times you’d need to spend several hours tweaking it yet again, as the up-convertor needed to be tuned as well – which changed the multiplier tuning, which changed the up-convertor tuning, which…. Then more time making sure that it stayed in spec over the Mil-spec temp range plus mechanical stresses (hit with screwdriver handle etc). Yet for all of that, it was actually very reliable equipment *providing* you took the time to do it right – if you got it, as some did, just “good enough”, it would fail inside 3 months, but if you persisted and got it to meet all manufacturer’s specs (even the ones that “didn’t matter” for our purposes) it went for years without a problem (my personal record was over 6 years).
    And that last bit is the bit that’s on-topic – coupled non-linear systems that require tuning to meet a certain specification are *hideously* difficult to get right. No detail is too small and with everything affecting everything else, small changes and checking every specification after every change is not just advisable, it’s vital. So you can imagine my opinion of GCM software that has no formal test specs. and also my opinion of the concept that “it doesn’t matter” if some detail is not what it should be – IMO, in this type of system *every* detail matters, and “almost right” is simply not good enough if you want a reliable and predictable system.

  123. Paul Linsay
    Posted Jun 25, 2007 at 6:15 PM | Permalink

    #120,Anthony,

    I love your project. The pictures always give me a good laugh. What make is the car on blocks? It sort of looks like my old Ford Galaxie 500. The tires aren’t off, so I guess the exhaust could keep the MMTS warm on a cold night. One of these days I’ll rustle up a GPS and get you a few sites around here.

  124. Posted Jun 25, 2007 at 10:36 PM | Permalink

    AW #98,

    You are aware that uncontrolled sparks from a magneto can play havoc with a gasoline engine are you not? Well Lucas figured out how to avoid that problem entirely. ’58 Triumph 650 MC. Best handling bike I ever owned. When I could get it on the road.

    BTW Steve Mosher,

    In the uncertainty numbers you posted is that the model uncertainty alone or does that include data inputs?

  125. Anthony Watts
    Posted Jun 26, 2007 at 6:55 AM | Permalink

    RE 123: Thanks, it looks like a Ford Fairlane.

    And I think I may have discovered another “first principle” of applied meteorology that complements this “law”: “mobile homes attract tornadoes”

    From the Vacaville, CA Fire Station courtesy of surfacestations.org volunteer Frank Rowand:

    http://gallery.surfacestations.org/main.php?g2_itemId=6968

    “Stevenson Screens and MMTS shelters attract barbeques”

  126. steven mosher
    Posted Jun 26, 2007 at 8:34 AM | Permalink

    RE 124.

    M Simon. May I call you Simon? or M?

    Anyway, if you are referring to the B1, B2, A1, A1T, A1FI numbers, let me give you a cartoon
    version of how they do this. You must read the WG1 (Working Group 1 report). THEN you
    must, everyone must, read the report on the SRES. Just google SRES IPCC…

    So, Cartoon version, vaguely valid in two dimensions. RTFM if you want the full scoop.

    1. The IPCC get a team of folks to predict the use of fossil fuels (and other stuff like cement)
    into the future, for the next 100 years. These predictions fall into FAMILIES.
    Low-use families, high-use families. Scenarios where we all work together (read UN control).
    Scenarios where we tackle this crisis on an individual basis. Scenarios with LOTS OF BABIES
    (15B world pop). Scenarios with a less vigorous, how shall we say, coupling probability
    (7B world pop). You can see these scenarios documented in the SRES. They are called storylines.
    For all the storylines you get a set of inputs to the GCM. For now, let’s just say, cartoon version,
    that they predict TONS OF CARBON spewed per decade. Now, it’s broken down much finer than this
    and I don’t want to belittle these futurists (honest), but for the sake of illustration you
    get the picture. Some scenarios (B class) project a future of low CO2 spewing. Others project
    high spewage, like A1FI. B2 is basically we all go green. A1FI is everyone catches the US in
    per capita CO2. The spread in total emissions, integrated over a century, is, how shall I say…
    really bleeping big.

    BOTTOM LINE: we have a bunch of guesses about how much GHG we will spew for the next 100 years.
    The spread in predictions is huge. Oh, and the scenarios are deemed equally probable.
    That is, it is just as probable that we will grow emissions 1% per annum as 3% per annum.

    2. These scenarios (data files) are fed into the GCM. The GCM is then run for 100 years. Now, when
    they run the GCM, they don’t just make one run. They run an “ensemble”. Details on
    how they do this are not known to me, but typically one would perturb the model
    a little bit to see how it reacted to ever so slightly different initial conditions,
    or one might identify some process in the model that was represented as a stochastic
    process and use its random nature to produce a “family” of outputs for that
    SRES. So, you take every SRES and you output an ensemble. You get strings of spaghetti
    from T=0 to T=2100. So, for example, your B1 scenario will produce a bunch of runs that
    may vary from 1.1C warming to 1.5C warming (for example, OK).

    3. Other guys do the same thing with their models. Their B1 may vary from 1.4C warming to 2.3C warming.
    In the end the results from all 19 models are “collected” and you get a range of results. B1 ranges from
    1.1 to 2.4. It’s NOT an error band. The center is not a mean. It is a range of model results.
    Pick different models, you get a different range. All models are ranked equally, REGARDLESS of their
    hindcast skill. (That’s how I read it.)

    Final point: if you look at the 19 models, it’s actually NOT 19 models. ModelE, for example, appears
    in various configurations, what I would call different levels of detail.

    clear?
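
    To see how the quoted “range” arises, here is a toy Python sketch: pool the ensemble members from every model for one scenario and report min/max (the numbers are made up):

    b1_runs = {
        "model_A": [1.1, 1.3, 1.5],   # hypothetical warming (deg C) per ensemble member
        "model_B": [1.4, 1.8, 2.3],
        "model_C": [1.2, 1.6, 2.4],
    }

    pooled = [t for runs in b1_runs.values() for t in runs]
    print(f"B1 range: {min(pooled)} to {max(pooled)} C")
    # Just the spread of model results, not a statistical error band, and every
    # model counts equally regardless of hindcast skill.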

    • Posted Aug 14, 2010 at 9:23 PM | Permalink

      Exactly what happened. The first were students, they were called “skunks”; they never saw their summations again, on to more and more “authors”.
      Four models then had input and output, and the one that had the scenario of the “warming” was passed forward, on and on till they reached the peak they needed. I’m not a scientist, not a programmer, I’m a 50-something lady. I have the data program involving Tim Lister; the other guy is redoing work because it still wasn’t “peaked” enough. I have collected so much data but am unable to know what to do with it. I also have the “min temp, max temp” of Australia from, I think, 1940. Am happy to pass anything on if it helps stop this madness. Regards, Lorraine

  127. Jaye
    Posted Jun 26, 2007 at 8:50 AM | Permalink

    RE:116

    LMCO guys, huh… well, that motor never failed once in flight. Of course, it does show the sorry state of NASA that LMCO is scurrying around copying an undocumented 40-year-old rocket motor.

  128. jae
    Posted Jun 26, 2007 at 9:31 AM | Permalink

    126: LOL. And NONE of the models “knows” how to accommodate clouds and rain, arguably the most important variables in the whole climate system. I don’t see how grown-up scientists can have ANY faith in these models.

  129. MarkW
    Posted Jun 26, 2007 at 10:18 AM | Permalink

    From the economists that I have talked to, the economic projections put forward as the basis of these “stories” are problematic at best.

    For example, in the story where they assume that all countries will catch up to the US, they make a gross currency conversion error, that causes them to overestimate the increased economic activity by a factor of 2.

    Additionally, they don’t assume that the other countries catch up to the US in terms of CO2 output, they assume that the other countries catch up to the US in terms of economic output. They assume that the CO2/unit of economic output for each country stays the same. The result is that the CO2 per capita in these other countries, at the end of the run, vastly exceeds that put out per capita by the US.

    Another point of contention is the population growth projections. There are many people, including myself, who believe that the lowest UN projection is actually closer to the highest realistic possibility. That is, the UN says population will be at least 7 billion by 2100. More likely, it will be at most 7 billion.

    So not only are there serious problems with the models themselves, there are serious problems with the projections of how much CO2 will be in the atmosphere in the future.

  130. steven mosher
    Posted Jun 26, 2007 at 10:31 AM | Permalink

    re 128.

    “And NONE of the models “knows” how to accommodate clouds and rain,
    arguably the most important variables in the whole climate system.
    I don’t see how grown-up scientists can have ANY faith in these models.”

    I think the importance of clouds KILLS hindcast skill, and impairs the
    forecast skill. We all know clouds
    are important. We all know getting cloud modelling right is critical.

    Problem: hindcasting clouds. Calibrating your model. Anybody got records
    of cloud cover, cloud height, density, etc., for the past 60 years?

    Just asking.

  131. Steve Sadlov
    Posted Jun 26, 2007 at 10:40 AM | Permalink

    RE: #125 – For those unfamiliar with it, that BBQ is specifically tailored for burning oak, in pursuit of the perfect tri-tip. Once you get the oak flaming, obviously there will be hours upon hours of consistent, dense heat.

  132. Steve Sadlov
    Posted Jun 26, 2007 at 10:42 AM | Permalink

    RE: #127 – Well, I can’t blame them, I blame the idiots who shut down heavy lift during the 1960s, thinking we’d never need that technology again. Fools! Morons! Brigands!

  133. Mats Holmstrom
    Posted Jun 26, 2007 at 11:17 AM | Permalink

    Re #129.
    Some scientists think that all the IPCC scenarios use more fossil fuel than is available.

  134. steven mosher
    Posted Jun 26, 2007 at 11:30 AM | Permalink

    RE 125.

    Mobile homes attract tornadoes only in the Midwest. Ask SteveS about the Sunnyvale F2,
    back in ’98 I think.

    I walked out. My midwestern spidey senses went on alert. Felt like twister weather. In NorCal?
    No way. Ask SteveS, I think he’s posted on it before. Freaky day.

  135. MarkW
    Posted Jun 26, 2007 at 11:35 AM | Permalink

    133,

    The problem with the Pemex field has more to do with poor maintenance and investment than it does with how much oil is left in the world. In the last few years, several major finds have been announced.
    Even when oil does start running low, we can tap oil shale and tar sands. Some estimates of oil shale in the Colorado region exceed the reserves of Saudi Arabia. When that stuff runs low, we can convert coal to oil, and we have hundreds of years’ worth of that.

    Those who claim we are about to run out of fossil fuels are smoking something, and it ain’t oil based.

  136. Keith Herbert
    Posted Jun 26, 2007 at 12:03 PM | Permalink

    First off, let me state that I have little statistical knowledge; that is one of the main reasons I come here. I’m trying to grasp grid cell data but lack sufficient understanding to get Gavin’s point. The grid cell temperature (as far as I can discern) is determined by use of a weighted mean. Doesn’t this assume they know what to weight?
    If station temperature is used (are there other surface temperature measurements?), then they must rely on the accuracy of the measurement devices. Much of the erroneous data from an individual station will be time dependent, and likely show as a smooth transition. These include changes related to protective covering deterioration, urban growth, and land use changes. How would these be accounted for if they are not known to be errors?
    If one were to merely observe the data and not the site, the smooth changes would appear normal. I suspect a discontinuity such as changing the location of the station, introducing a new measurement device, or installing a heat-producing element adjacent to the device would appear anomalous and could be considered, but this is a small subset of the possible influencing factors.
    Additionally, it would seem likely that many stations within a grid would undergo similar station dependent transitions. So how would they account for any errors associated with this?
    I searched NASA’s GISS site for “grid cell” and the top article that came up was this http://gacp.giss.nasa.gov/publications/special/knapp.pdf. This pertains to aerosols but may be relevant in terms of the use of grid cells. On page 15 they discuss the assumptions of grid cell homogeneity and possible bias. Is this not possible with surface temperature also? I didn’t see any NASA articles in the grid cell list relating to Gavin’s work.
    Any help on understanding this would be appreciated.
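
    For what it’s worth, here is a generic illustration in Python of a distance-weighted grid-cell mean (not the actual GISS or CRU algorithm; the stations, distances and influence radius are made up):

    stations = [
        # (anomaly in deg C, distance to cell centre in km)
        (0.4, 100.0),
        (0.6, 450.0),
        (1.9, 300.0),   # pretend this one has a siting problem
    ]
    RADIUS_KM = 1200.0   # assumed influence radius

    weights = [max(0.0, 1.0 - d / RADIUS_KM) for _, d in stations]
    cell_mean = sum(w * a for w, (a, _) in zip(weights, stations)) / sum(weights)
    print(f"grid-cell anomaly: {cell_mean:.2f} C")
    # The problem station is diluted, not removed: smoothing an artefact is not
    # the same as taking its bias out.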

  137. steven mosher
    Posted Jun 26, 2007 at 12:06 PM | Permalink

    re 133.

    Yes, There is a interesting interplay between the peak oil, peak gas, peak coal stories
    and the Global warming stories.

    The thing that stuns me stupid is we spend a huge amount of time looking at reasons for a
    .6C warming or .12 warming etc etc etc. BUT we spend no time looking at the emissions scenarios, when
    the scenarios differ by an order of magnatude.. nearly.

  138. Keith Herbert
    Posted Jun 26, 2007 at 12:09 PM | Permalink

    re #136
    omit the ending period to make link work

  139. Steve Sadlov
    Posted Jun 26, 2007 at 2:00 PM | Permalink

    RE: #134 – A freaky day during a freaky spring. A few days prior there had been an F1 reported at Moffett Field, a mere 3 miles to the north-northwest. A few days after, another outbreak of wall clouds and funnel clouds, with a possible F1 in SSF. One funnel cloud, perhaps the remnant of that F1, rotated right over SFO, freaking out the tower crew!

    That was during the last really good El Nino; we had a series of cut-off lows slipping down the coast. For the Sunnyvale one, there was drizzle in the AM, which later evaporated into massive buildups along the Santa Cruz mountain front. There was an odd shear scenario, aided and abetted by a sea breeze front…. more in line with the Gulf than the Pacific coast. Indeed, freaky.

  140. Curt
    Posted Jun 26, 2007 at 2:28 PM | Permalink

    Re 125, 134: When Kerry Emanuel was an undergrad, he wrote and circulated a joke scientific paper purporting to give a physical basis for the phenomenon that tornadoes are attracted to trailer parks. It was well enough written that some people were taking it seriously.

  141. steven mosher
    Posted Jun 26, 2007 at 3:35 PM | Permalink

    re 139.

    I totally forgot about the Moffett one. I lost a lemon tree that spring, not uprooted
    but snapped off. It was like being back home… except without the cellar.

  142. Steve Sadlov
    Posted Jun 26, 2007 at 6:55 PM | Permalink

    RE: #129 – I would bet good money that by 2100, not only will global population be 7 B max, but it will be falling rapidly. Also, I would bet good money there will be at least one “unexpected” (“we honestly thought that people were more rational than this / we honestly thought that [the fiction of] MAD would keep great powers from fighting”) full-scale, massively destructive world war sometime prior to 2100. Add to that the chances of at least one truly tragic pandemic, and there you have it. The UN / Malthusians once again completely miss the mark.

  143. Marvin Jones
    Posted Jun 26, 2007 at 9:33 PM | Permalink

    RE: #128

    It isn’t about science! It’s about politics and wild guessing. “S”WAG

  144. Marvin Jones
    Posted Jun 26, 2007 at 9:35 PM | Permalink

    RE: #130 Clouds

    Lindzen is an idiot, and an industry hack*.

    *Note to those paying attention, that is sarcasm.

  145. MarkW
    Posted Jun 27, 2007 at 5:22 AM | Permalink

    Personally, I believe the population will top out a little over 6 billion in the 2050 to 2060 period, and then start falling. By 2100, I believe the population will be down to around 5 billion.

    Of course a new discovery that dramatically increases life span would throw all of our guesses out.

  146. Reference
    Posted Jun 27, 2007 at 7:09 AM | Permalink

    MarkW said in #145:

    Personally, I believe the population will top out a little over 6 billion in the 2050 to 2060 period, and then start falling. By 2100, I believe the population will be down to around 5 billion

    According to the US Census Bureau the world population is already 6.6 billion and expected to climb to 9 billion by 2042. Expect an even bigger increase in the energy demanded as more people want cars etc etc.

  147. MarkW
    Posted Jun 27, 2007 at 8:49 AM | Permalink

    My last comment on this OT subject.

    Make that we will add about another billion by 2050, then start falling. Every 10 years the UN estimates of population numbers are even less accurate than their guesses regarding how much the world is going to warm up.

  148. Steve Sadlov
    Posted Jun 27, 2007 at 1:31 PM | Permalink

    RE: #146 – You go for it, bet on that. In fact, bet your whole portfolio on that assumption. You and a few hundred million other su…. sorry, I meant to write, souls.

  149. steven mosher
    Posted Jun 27, 2007 at 3:53 PM | Permalink

    UN Malthusians. (Sadlov hits the bullseye.)

    I’m a bit surprised that few people have seen this agenda. If one bought the AGW
    scenarios, one would not spend money on sequestering carbon. One would sequester,
    cough cough, precious bodily fluids. Want to cut carbon emissions? Cut the vas deferens.

    Then, in total irony, claim your actions were taken for the benefit of future generations.

  150. Joe Ellebracht
    Posted Jun 27, 2007 at 5:00 PM | Permalink

    To get back to the original purpose of this discussion, station data and climate models: isn’t it obvious that the models use the temp data (gridded or otherwise) just for tuning purposes? All of them have embedded in them an equation that says CO2 up equals temperature up (yes, expressed in a very complicated way). This equation is not negotiable, and never will be. It is this relationship that determines the whole debate, not aerosols or worms.

    Once all of the North American stations are audited, adjustments (continental) will be made by the regular crew of adjusters. I believe that the upward slope of continental temperatures will remain intact, as the adjusters will have to be conservative in making their adjustments, not wanting to doom humanity or lose their fame. At worst, the temperature lag will be increased by a tiny fraction to accommodate the new tuning requirements.

    Hopefully, a set of reasonably valid stations can be carved out and counter-assessments of the temperature record can be published (but probably not in Nature). Sorry to be gloomy; ’tis cloudy here.

  151. Scott-in-WA
    Posted Jun 30, 2007 at 10:38 AM | Permalink

    #150

    I suspect one of the reasons Dr. Vincent Gray is so vehement in raising objections to the use of the term “experiment” in relation to the output from the GCMs is that he knows these global climate models are in fact data manufacturing tools, not data collection tools — as most reputable scientists would understand and employ the term “data collection” when applied within the study and replication of a complex physical system using a code-driven model.

    However, there is no way for the interested public to gain a useful appreciation of the important differences between how climate modelers do their work and how scientists in other disciplines use computer models as a means of investigation and inquiry, as opposed to using their models for the custom manufacture of data to support one, and just one, preordained conclusion.

    What this implies is that if the general public is to receive an opposing view concerning AGW, the raw data will have to tell its own story. (Assuming anybody is listening, of course.)

  152. aurbo
    Posted Jul 1, 2007 at 7:54 PM | Permalink

    Re #s 125, 134 & 140; Mobile home parks being tornado targets:

    A few years ago, a tornado, I believe it was in either MN or WI, destroyed a mobile home manufacturing plant…thereby effectively eliminating the middleman.

    The problems with mobile homes are that they are rarely tied down and are aerodynamically unstable. A minimal F1 tornado can make them airborne. They’re not called mobile homes for nothing.

  153. Joe Ellebracht
    Posted Jul 3, 2007 at 11:14 AM | Permalink

    Re Gavin Schmidt’s July 2 RealClimate essay on this topic: he got a lot of feedback. That which made it through the political correctness filters seemed mostly pretty sceptical of the “models don’t need no stinkin’ station data” theory of climate modeling.
    Perhaps something like a debate is beginning to occur.

  154. A Azure
    Posted Jul 3, 2007 at 1:19 PM | Permalink

    Dr. Pielke has a very strong rebuttal to the RealClimate folks; well worth reading….

    Cheers!

  155. steven mosher
    Posted Jul 3, 2007 at 2:16 PM | Permalink

    Six strawgirl arguments. A strawgirl hockey team. NASA style.

  156. steven mosher
    Posted Jul 3, 2007 at 2:49 PM | Permalink

    RE 152.

    Having lived in tornado land for a couple of decades, and having driven through the carnage
    “just to see”, I noticed this:

    Mobile home parks tend to be located in wide-open, flat, treeless landscapes.

    One wonders what it is about that groundscape that sucks them in.

    Put another way, SteveS and others can probably speak to the ground conditions that are necessary and sufficient
    for twister touchdown.

    As a kid, I learned to steer clear of the flat, wide-open spaces.

One Trackback

  1. […] see Roger Pielke Sr’s blog posts here and here and the comments on the threads at ClimateAudit here and here. That blog post from Trenberth is still being referenced in blog posts (this one […]
