Crossword Puzzle #3

Let’s move from the 2-column case to the 3-column case (e.g. Bagdarin, already considered), applying the results here. Some of the hypotheses from the earlier discussion have to be revisited. Bagdarin has 3 dset=0 versions (0, 1, 2). As we’ve seen in the 2-column case, the column that continues to the present (version 2) exactly matches the dset=1 version in the later part of the record, when there is only one version. Bagdarin version 0 is reduced by 0.3 deg C, version 2 by 0.2 deg C. Given these adjustments, dset=1 more or less follows.

Versions 0 and 1 are scribal variations for 1980 and after and from 1951 to 1960, but are discrepant between 1960 and 1980. For analysis, it might be a good idea to find a record that has 3 columns and is only scribal.

Hansen and Lebedeff 1987 describe an iterative procedure for combining the versions net of the deductions. My experiments indicate that this boils down to a simple unweighted average of the versions net of deductions, but this is experimental so far.
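
As a rough illustration of that reading (a sketch, not Hansen’s code), the snippet below subtracts a fixed deduction from each version and then takes the unweighted mean of whichever versions report in a given year. The function name, the series values and the deductions are all hypothetical placeholders, not Bagdarin data.

```python
from statistics import mean

def combine_net_of_deductions(versions, deductions):
    """Unweighted mean, per year, of each version minus its deduction.

    versions:   {name: {year: temperature}}
    deductions: {name: deg C subtracted from that version}
    """
    years = sorted({y for series in versions.values() for y in series})
    combined = {}
    for year in years:
        adjusted = [series[year] - deductions[name]
                    for name, series in versions.items() if year in series]
        combined[year] = round(mean(adjusted), 2)
    return combined

# Hypothetical values for illustration only.
versions = {
    "v0": {1960: -5.0, 1961: -4.6},
    "v1": {1960: -5.3, 1961: -4.9, 1962: -5.1},
}
deductions = {"v0": 0.3, "v1": 0.0}
combined = combine_net_of_deductions(versions, deductions)
```

With these made-up numbers the two versions agree once the deduction is applied, so the combined series simply reproduces the adjusted values.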

Hansen’s description (for whatever that’s worth) indicates to me that he first calculated the delta between versions 0 and 1 (or alternatively between 1 and 2), then formed an interim composite and repeated the procedure. However, I couldn’t get everything to work. If I apply the proposed 2-column Hansen-delta calculation to versions 0 and 1, I get a delta of -0.2, followed by a delta between the interim series and version 2 of +0.1: so this doesn’t work.

If I try version 1 against version 2 first, I get a delta of 0 followed by a delta of -0.2 between the interim series and version 1.

These deltas seem quite unstable to ordering in a first peek.
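
To see how ordering alone can change the deltas, here is a toy sketch. It assumes the delta is simply the mean difference over the years both series report, and that combining means shifting one series by that delta and averaging; both are stand-in assumptions, not the 2-column calculation from the earlier threads, and the three series are invented.

```python
from statistics import mean

def overlap_delta(a, b):
    """Mean of (a - b) over the years both series report."""
    common = sorted(set(a) & set(b))
    return mean(a[y] - b[y] for y in common)

def merge(a, b):
    """Shift a by the overlap delta so it matches b on average, then average."""
    d = overlap_delta(a, b)
    shifted = {y: v - d for y, v in a.items()}
    merged = {}
    for y in sorted(set(shifted) | set(b)):
        vals = [s[y] for s in (shifted, b) if y in s]
        merged[y] = mean(vals)
    return d, merged

# Three invented series with partial overlaps (keys are years).
v0 = {1: 1.0, 2: 2.0}
v1 = {2: 1.0, 3: 3.0}
v2 = {1: 0.0, 3: 0.0}

# Order A: (v0 with v1) first, then the interim series with v2.
d_a1, interim_a = merge(v0, v1)
d_a2, final_a = merge(interim_a, v2)

# Order B: (v1 with v2) first, then v0 with the interim series.
d_b1, interim_b = merge(v1, v2)
d_b2, final_b = merge(v0, interim_b)
```

Even in this tiny example the two orderings give different deltas and different final series, which is the kind of instability described above.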

So today’s puzzle: find a system for the 3-column case, consistent with 2-column results.

Of course, Hansen could always free the code.


  1. Falafulu Fisi
    Posted Sep 6, 2007 at 8:12 PM | Permalink

    Steve M, what is the reason behind Dr. Hansen’s refusal to provide the code for his publication if it was peer reviewed? I find it unusual, since if it was a commercial work, then you could understand the refusal to make the codes available to the general public, because it has commercial value. I don’t publish peer review work myself, however I scour different types of peer reviewed science & engineering journals looking for interesting algorithms that might be useful in what I do.

    If I come across an algorithm that would be of interest, I first ask the authors whether their code is available to the general public. If the published algorithm was developed under commercial sponsorship, I am told that the implementation is not available to the public, but the authors are very happy to answer any questions about the published algorithm if I want to implement my own version based on their paper. For example, the paper “Numerical pricing of discrete barrier and lookback options via Laplace transforms” was published in the Journal of Computational Finance and sponsored by CSFB (Credit Suisse First Boston). I contacted the author, Prof. Steven Kou of Columbia University, to see if his implementation was available. He told me that his work was commercial and that he was sorry he couldn’t give me his code, but that I was welcome to ask him any questions about the algorithm if I wanted to implement it myself. For non-commercial publications, I have always requested the authors’ code, and they do send it through if they already have an implementation.

  2. Falafulu Fisi
    Posted Sep 6, 2007 at 8:22 PM | Permalink

    The following link is a must read about Reproducible Research. Prof. David Donoho and colleagues at Stanford have developed a wavelet toolkit, named WaveLab (I have used it for a few years now) where they made their codes available to the public, and their main reason is so that anyone could reproduce their research. Read about it below.

    WaveLab and Reproducible Research

  3. Mike
    Posted Sep 6, 2007 at 8:50 PM | Permalink

    Have you tried a FOI request yet? Also, is there anyone else who works with Hansen who might be more sympathetic to the need for auditing? Someone who might leak the code anonymously? How about that guy who was asking for the scraped GISS data, any chance of him returning the favour?

  4. Posted Sep 6, 2007 at 8:50 PM | Permalink

    Hey Steve,

    Big fan here, but I have a question. Please excuse it if it sounds confrontational; I don’t mean it to be.

    I’ve noticed you are continually auditing the work of Hansen, HadCRUT3 and all the others for mistakes, and pointing out that they have made a lot of them, which is excellent. I was wondering when, if ever, you would just get fed up with waiting for them to “free the code” and make your own analysis of the world’s temperature data, and write your own paper on the subject. If you then released the code and the system used to make the chart, wouldn’t yours become as “official” as theirs? After all, the only reason theirs is official is because they say it is, obviously not because anyone has actually tested their methodology and found it to be correct. They could test yours.

    It’s just a thought, and might help put a lot of added pressure on many of the so called “experts” to show their methodology (free the code), or be left as yet another Piltdown Man in the footnotes of history?

    Do you think their temperature estimates would differ substantially from your own?

    Sorry if I’m off topic; after reading the last 2 or 3 posts, it’s obvious the system needs a total overhaul, and I doubt very much it will happen from within.

  5. John V.
    Posted Sep 6, 2007 at 9:06 PM | Permalink

    I agree with you that generating an alternative analysis with open code has many advantages. Most importantly, it would lead to a better understanding of the truth. Second, it would pressure the authors of previous analyses to open their code. Third, it would help dispel any conspiracy theories.

    I am looking into the feasibility of writing new analysis code myself (see my comment #115 here). With support and feedback from this community a very solid program could be written and publicized in a relatively short time.

  6. David
    Posted Sep 6, 2007 at 9:29 PM | Permalink

    #4, #5: One can either choose to play the game by their rules, or expose the flaws in the system. I would rather Steve keep doing what he is doing because he is helping to expose the flaws in the system.

  7. Steve McIntyre
    Posted Sep 6, 2007 at 9:36 PM | Permalink

    #4. If I were doing it, I would want to develop proper information on every station used in the network. It’s a big job and would take time. They’ve been doing this for 25 years and obtained hundreds of thousands, if not millions, of dollars in funding. It doesn’t seem like they’ve done much in the way of data quality control, and so what they seem to have produced is some fairly crappy code to calculate average temperatures from poorly QCed data.

    Just because Hansen, like Mann, has some gross errors, doesn’t mean that you can obtain something meaningful just by fixing the gross errors.

  8. Posted Sep 6, 2007 at 10:08 PM | Permalink

    # 7


    That’s nothing compared with what we discovered in the Meteorological Station at Linares 🙂 The barometer was a laundry bathtub and the thermometers were alcohol thermometers manufactured in 1959! And our discovery happened in 2001!

  9. Steve McIntyre
    Posted Sep 6, 2007 at 10:11 PM | Permalink

    #GISS/GHCN only show Linares up to 1983. Any plausible reasons why?

  10. Damek
    Posted Sep 6, 2007 at 11:31 PM | Permalink

    So what was the conclusion on the 2 column cases? Were we using Hansen’s yearly averages to determine the deltas? Is there a complete description in one of the threads about this?

  11. Damek
    Posted Sep 6, 2007 at 11:37 PM | Permalink

    I know that this doesn’t match Hansen’s description, but does calculating the deltas from version 2 before doing any combining work? So right off the bat calculate the delta for version 1 & 2 and also for versions 0 & 2. Then after applying the deltas, combine versions 1 & 2 (I’ll call 1-2). Then combine 1-2 with version 0. I would test this myself but I don’t have the yearly averages or the delta calculation method in front of me.

  12. Leonard Herchen
    Posted Sep 6, 2007 at 11:53 PM | Permalink

    Is there somewhere all the temperatures are stored in a single database, easy to download? Or, if I wanted to play with the data, do I have to go to GISS and download it all station by station? I could get a scraper to do that, but it would still take some time.

    Bottom line is, if I want to see the raw data on all the thermometers, is anyone sharing that? It can’t be that many megabytes.


  13. KDT
    Posted Sep 7, 2007 at 2:09 AM | Permalink


    First, January 89 is a special case. I suspect it may have been dealt with as an outlier, but the method is not clear. Whatever the method, it wasn’t dealt with very well. Need to read that part of the paper again.

    If you ignore that month, this worked for me.

    Combine Rec0 with Rec1 by applying a bias of -0.11 to -0.15 to Rec0
    Round the resulting record (RecNew) to the nearest 0.1 (round in the positive direction)
    Combine RecNew with Rec2 by applying a bias of -0.21 to -0.25 to RecNew
    Round again for output

    I think I ruled out any solution without rounding the combined record.

    I think I also ruled out the other way to round (-1.05 rounds to -1.1, no good.) Round to the positive direction.

    I think I also ruled out any other combination order for this set, including combining them all at the same time.

    And finally, I think I ruled out rounding the bias itself, it just about has to be at least 2 decimal places, probably just floating point.

    I think I think I think.

    See if that works.
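
The recipe in the comment above can be sketched as follows. This is only an illustration under stated assumptions: “combine” is taken to mean a simple average wherever both records report, the biases -0.13 and -0.23 are arbitrary picks from inside the quoted ranges, and the record values are invented. Decimal arithmetic is used so that a tie such as -1.05 really is a tie and rounds toward the positive direction, to -1.0.

```python
from decimal import Decimal, ROUND_FLOOR

def round_tenths_toward_positive(x):
    """Round to the nearest 0.1, sending exact ties toward +infinity."""
    d = x if isinstance(x, Decimal) else Decimal(str(x))
    scaled = d * 10 + Decimal("0.5")
    return float(scaled.to_integral_value(rounding=ROUND_FLOOR) / 10)

def combine_pair(biased, other, bias):
    """Add bias to `biased`, average with `other` where both report, round.

    Averaging is an assumption; the comment does not say how records merge.
    """
    out = {}
    for key in sorted(set(biased) | set(other)):
        vals = []
        if key in biased:
            vals.append(Decimal(str(biased[key])) + Decimal(str(bias)))
        if key in other:
            vals.append(Decimal(str(other[key])))
        out[key] = round_tenths_toward_positive(sum(vals) / len(vals))
    return out

# Invented one-month records, keyed by "YYYY-MM".
rec0 = {"1988-01": -20.0}
rec1 = {"1988-01": -20.4}
rec2 = {"1988-01": -20.6}

rec_new = combine_pair(rec0, rec1, bias=-0.13)    # step 1, rounded
result = combine_pair(rec_new, rec2, bias=-0.23)  # step 2, rounded again
```

Swapping `ROUND_FLOOR` with a different tie rule is the easiest way to test the other rounding direction that the comment rules out.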

  14. KDT
    Posted Sep 7, 2007 at 4:06 AM | Permalink

    On how to combine multiple series, the following rules should work.

    Assume the records are numbered in the order to be combined.
    For each pair:
    Pin the present: Apply bias to the record that ends earlier.
    If both end on the same date, bias the shorter one.

    I’ve combined a lot of pairs, these rules hold so far. The ordering assumption makes sense for batch processing.
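
A minimal sketch of those two rules, assuming each record is a dict keyed by (year, month) and that “shorter” means fewer reported months (the comment does not define it). The function name is hypothetical.

```python
def record_to_bias(a, b):
    """Return 'a' or 'b': which record gets the bias under the rules above.

    a, b: dicts keyed by (year, month) tuples.
    Returns None when the rules do not decide (same end date, same length).
    """
    end_a, end_b = max(a), max(b)
    if end_a != end_b:
        # Pin the present: bias the record that ends earlier.
        return "a" if end_a < end_b else "b"
    if len(a) != len(b):
        # Same end date: bias the shorter record.
        return "a" if len(a) < len(b) else "b"
    return None

# Invented records for illustration.
older = {(1980, 1): -20.1, (1990, 12): -18.4}
current = {(1985, 1): -19.7, (2007, 8): 15.2}
short = {(2007, 7): 16.0, (2007, 8): 15.2}
long_ = {(2007, 6): 12.1, (2007, 7): 16.0, (2007, 8): 15.2}
```

For example, `record_to_bias(older, current)` picks the record ending in 1990, and `record_to_bias(short, long_)` picks the two-month record.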

    I’m a night owl! Hoot!

  15. JerryB
    Posted Sep 7, 2007 at 5:34 AM | Permalink

    Re #12


    GISS gets temperature data from GHCN which publishes temperature, as well as
    precipitation, data at .
    See the temperature readme file for format info.

    See PDF file at
    for some background on data collections.

  16. Posted Sep 7, 2007 at 5:37 AM | Permalink

    Re: 12



  17. Murray Duffin
    Posted Sep 7, 2007 at 7:45 AM | Permalink

    Re 7: snip #4. If I were doing it, I would want to develop proper information on every station used in the network. snip
    Steve, could you elaborate on what you mean by “proper information”? I’m guessing that it would be mostly metadata and UHI effects. Agreed, on over 6000 stations that would be a big job, probably needing funding.
    However, there may be other ways to address the issue. One would be for this group to agree on the best way to process the existing data, based on the kinds of data problems known now, and on the known defects in the Hansen method. Then write the program to do that. Then process the same data that Hansen has used, and compare the results. This would expose systematic averaging errors, if any, and give a better GW result. This job is well within the range of capability of this group, probably easier than puzzles #1 and #2, so far. (Could be solving puzzles is perceived as more challenging/fun). John V, in #5, seems willing to tackle the SW development for this one.
    Step 2 would be for this group to agree a set of rules for treating apparent UHI for all stations where population growth from ca 1975 to ca 2005 is available. One of the rules would have to deal with apparent saturation, because it appears that urban agglomerations do reach a point where the temp. increase flattens. Then all sites where temp. and population delta are known could be recalculated, and then step one re-run. The comparison of GISS, step one and step two, would probably be very enlightening.
    Step 3 would be agreeing how to treat all stations according to known metadata station changes. That is the big one from a work point of view, but might make the smallest contribution to getting as correct results as can be got. (For sure we can’t go back and regenerate high quality initial historic data).
    I think that all the participants (and lurkers) here believe that the surface instrument average GW as generated by GISS is wrong due to flawed methodology, and the active contributors are trying to demonstrate that. Doing steps one and two above would probably be the best demonstration you could do. Given that a good methodology, implemented by good SW would be the result, actually knowing all the flaws in the GISS methodology would be superfluous.
    Might not be as much fun as puzzle solving and ferreting out GISS flaws, but seems to me it would be vastly more productive. Murray

  18. Murray Duffin
    Posted Sep 7, 2007 at 7:57 AM | Permalink

    Re: #17
    Steve, I can’t fund step 3 above, but I will put up $1000.00 for the person or team that implements step one, i.e. write a set of rules that key contributors here agree to, write a program that all agree implements those rules, and then rerun the available data through that SW. You define how the judging will be done, and tell me how to put the stake in escrow.
    If this proposal seems ok to you, we can go ahead as now, or I plan to be in Toronto the 14th through 17th Sept., and would be delighted to meet you. Murray

  19. John V.
    Posted Sep 7, 2007 at 8:26 AM | Permalink

    Step 1 can be broken down into sub-steps:

    1a. Generate station monthly data from daily data:
    Although the monthly data is already available in GHCN v2, it may be useful to generate the monthly data from scratch so that error bounds on the monthly averages can be determined and recorded. The result of this step would be compared to GISS dset=0.

    1b. Combine station data:
    Combining multiple sets of station records can be done directly from the daily data (my preference) or from the monthly averages. In either case, the variance of the offset should be calculated and stored. The result of this step would be compared to GISS dset=1.

    1c. Homogeneity adjustments:
    I am not sure how this could be done as I have not looked for any reference documents. The result of this step would be compared to GISS dset=2.

    1d. Generate regional and worldwide temperature trends:
    The boxing method used in GISS could be applied, but I think there are better methods. Would have to research this.

    As for step 2, a good starting point for determining UHI effects would be creating a new study similar to Peterson 2003. It should not be difficult to get population statistics and trends for North American stations. We could use these to differentiate purely rural stations, long-time urban stations, and newly urban stations.

    I apologize for hijacking this thread. Steve M, would you consider opening a new thread for this discussion? Thanks.

  20. Steve McIntyre
    Posted Sep 7, 2007 at 8:37 AM | Permalink

    #19. 1b – you don’t have daily data on most stations.

    I’m aware of these steps. Right now I was trying to solve particular issues with GISS, beginning at the GISS beginning. There are several steps which we’ll get to and we’ll get to GHCN as well.

  21. Ron Cram
    Posted Sep 7, 2007 at 8:40 AM | Permalink


    Falafulu Fisi,
    Thank you so much for the link to WaveLab. The pdf was very interesting reading. Those guys feel strongly about supplying the code because they could not reproduce their own results when the code was lost. I have included a link to the website and pdf paper on Wikipedia’s article on data sharing.

  22. Jan F
    Posted Sep 7, 2007 at 8:41 AM | Permalink

    #19 step 1a and 1b

    Why not always use the daily data?
    Computers are powerful enough these days, plus averaging an average can generate a bias and is sensitive to the applied sequence.
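
A small made-up example of the point about averaging an average: when months contribute different numbers of daily readings, the mean of the monthly means is not the same as the mean of all the daily values. The numbers below are invented.

```python
from statistics import mean

# Invented daily readings; February has only a week of data.
daily = {
    "Jan": [-20.0] * 31,
    "Feb": [-10.0] * 7,
}

# Average the monthly averages (each month counts equally).
mean_of_monthly_means = mean(mean(v) for v in daily.values())

# Average every daily reading directly (each day counts equally).
mean_of_all_days = mean(x for v in daily.values() for x in v)
```

Here the first figure is -15.0 while the second is about -18.2, purely because of how the readings were grouped before averaging.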

  23. John V.
    Posted Sep 7, 2007 at 8:50 AM | Permalink


    you don’t have daily data on most stations.

    If that’s the case, then monthly data it is.

    Steve, I realize that you are aware of the steps and that you are working on other things right now. I am offering to help where I think I am most able (writing new analysis code).

  24. KDT
    Posted Sep 7, 2007 at 9:42 AM | Permalink

    If my bias figures are confirmed, I am flabbergasted. I already knew this record was flawed, but to see the process in action is amazing. When record 2 is added to the mix, Hansen tries to detect a .14 bias that exists between these records. But he comes up with -.24, a difference of -.38 degrees.

    His error is triple the bias he is looking for.

    And here’s the kicker. That .14 bias was introduced by Hansen’s same error in the first combination! If he hadn’t screwed that one up, there would be no bias to detect.


  25. John V.
    Posted Sep 7, 2007 at 9:49 AM | Permalink

    I agree that daily data should be used for most work (assuming daily data is available).
    I can think of a few reasons for generating monthly data though:

    – comparison against GISS dset=0 (for validation)
    – monthly and seasonal temperature trends
    – plotting and visualization

  26. Al
    Posted Sep 7, 2007 at 12:08 PM | Permalink

    This entire area is sufficiently screwed up that a simple reproduction of the historical papers would be worthy of publication.

  27. steven mosher
    Posted Sep 7, 2007 at 12:24 PM | Permalink

    Daily data… Tmax and Tmin is critical if you want to do UHI and/or Microsite

  28. Sam Urbinto
    Posted Sep 7, 2007 at 1:07 PM | Permalink

    Come on Jimmy, free the code!

  29. steven mosher
    Posted Sep 7, 2007 at 3:38 PM | Permalink

    RE 19.

    John V., have a read through Hansen87, Hansen99 and 2001. The latter two are on
    GISS; the former is linked in the Hansen BIAS thread.

    Nothing made sense to me until I read H87.

  30. Posted Sep 7, 2007 at 6:39 PM | Permalink

    # 9

    Steve McIntyre,

    No, I’ve not a plausible explanation. Perhaps it was the obsoleteness of their equipment. The case went to the public dominion, but I didn’t follow the results and if they restructured their methods. I didn’t know that GISS/GHCN on Linares had stopped in 1983; could you give me the link? Perhaps I’ll find something about.

  31. Steve McIntyre
    Posted Sep 7, 2007 at 8:07 PM | Permalink

    #30: follow the choices.

  32. Posted Sep 7, 2007 at 11:37 PM | Permalink

    Thank you, Steve! And you’re right, Linares is not working now! I’ve sent e-mails to some colleagues working there to find out the cause. As soon as I have the answer, I’ll let you know. I’m really ashamed by this inconvenience.
