19 Versions and Whadda You Get

Recently Anthony Watts noted that the Lampasas TX station was relocated in 2000 to an extremely poor location and attributed a hockey-sticking of the Lampasas series to this relocation. In a comparison that I made with nearby Blanco TX (which is the sort of comparison that USHCN says that they do), it seemed plausible that the move could have added over 1 deg C. Atmoz argued that the site problems had a negligible impact and that there was some presently unknown problem with the GISS algorithm, which I dubbed a UFA (unidentified faulty algorithm). In order to examine this a little more closely, I waded one more time into the swamp of temperature data, re-collating various versions of the Lampasas temperature series and eventually ending up with 19(!) different versions of the Lampasas TX temperature history. Perhaps, borrowing the language of climate modelers, we could dub these an “ensemble”.

These include various versions of the NASA GISS “raw” and “adjusted” series, scraped from the NASA website last year (causing some controversy within the climate blog world), which now yield an interesting resource for comparing these different versions.

Here’s one comparison that caught my eye and caused me to re-work the material in a little more detail. This shows the impact of the Hansen adjustment (USHCN adjustments are additional) over three stages: green – pre-identification of the “Y2K” error; red – immediately after identification of the Y2K error, reflecting corrections implemented in Aug 2007; black – the current adjustment, which includes the various changes made (without announcement) in Sept 2007 and which caused considerable puzzlement here as we decoded them. NASA has now provided information, and the present documentation, while hardly up to the standards that I would recommend, is an improvement and better than the documentation for rival collations, such as CRU.

As soon as one sees this graph, one wonders: what caused the NASA-stage adjustment for Lampasas TX to increase by as much as 0.3 deg C in the early part of the 20th century? By raising this question, I do not imply that all policy initiatives should be put on hold pending resolution of this matter (which seems to be an all too typical straw man response), but mere curiosity: what is it in the behavior of the algorithms that leads to this result? Why is the temperature history of Lampasas TX being thrashed around this way? And this revision is taking place in one of the best documented networks in the entire world? And the changes are not just the Y2K adjustment, as the major change occurred after the initial adjustment for the Y2K error.

texas53.gif
Figure 1. NASA GISS Adjustments to the Lampasas TX station history.

The 19 Versions
From time to time, I post information on Station Data sources on a permanent page (see left frame) and this should be consulted for URLs.

So far I’ve identified 4 different locations where a total of 19 different versions are archived (this does not include multiple editions of the same version). The 4 locations are NOAA, CDIAC, GHCN and NASA GISS. Some of the data handling seems rather hard to justify, but, hey,

NOAA: NOAA is by far the most up-to-date (up to May 2007) – it has Raw, TOBS and Filnet versions. I recommend that this version be adopted by all users.

CDIAC: CDIAC’s data ends in Dec 2005. They are at least 2 updates behind NOAA for reasons that are unclear. I’ve spot-checked CDIAC versions against the corresponding NOAA versions and found them identical in the examples that I studied. In addition to the Raw, TOBS and Filnet versions, CDIAC produces an Urban-adjusted version (also to Dec 2005). Intermediate versions (SHAP, MMTS) up to early 2000 were formerly available at CDIAC but were deleted in the past year. Jerry Brennan managed to archive the vintage versions before they were deleted.

GHCN: GHCN’s version of USHCN data ends in March 2006 and is also at least 2 updates behind NOAA, again for reasons that are unclear. Other than being short-dated, the GHCN Raw version corresponds to the USHCN Raw version up to rounding. For the most part, other than being short-dated, the GHCN Adjusted version matches the USHCN Filnet (Adjusted) version up to rounding. For reasons that are hard to understand, the two versions do not match in the late 1940s or the early 1890s, where the GHCN Adjusted version appears to be drawn from the USHCN TOBS version or some analogue. Right now, it’s hard to say – other than that the versions are related, but distinct.
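
As a minimal sketch of what “matches up to rounding” means operationally, here is a comparison check in Python (the deg F units for USHCN and the 0.1 deg C granularity of GHCN v2 are my assumptions about the two formats; series are dicts keyed by (year, month)):

    def rounding_mismatches(ushcn_f, ghcn_c, tol=0.06):
        # Convert USHCN (deg F) to deg C and flag months where the two
        # versions differ by more than rounding slack: GHCN v2 reports
        # to 0.1 deg C, so agreement within ~0.05 deg C counts as a match.
        out = []
        for key, tf in ushcn_f.items():
            if key in ghcn_c:
                tc = (tf - 32.0) * 5.0 / 9.0
                if abs(tc - ghcn_c[key]) > tol:
                    out.append(key)
        return sorted(out)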

texas67.gif

NASA: Now for the really thorny part. NASA went through some fairly wild gyrations last summer, making it difficult to tell what was going on at any given time. However, I think that it’s now possible to diagnose things fairly accurately. At present, their most recent data is drawn from GHCN (and thus, like GHCN, only goes to March 2006) even though data is available at NOAA up to May 2007. In addition, NASA does a bizarre adjustment in order to splice two different data sets – a torturing of the data that would be completely unnecessary if they merely used the readily available and more up-to-date NOAA version, as I’ll show below.

Pre-Y2K: As of mid-2007, prior to the identification of the Y2K error, NASA used the vintage (2000) CDIAC SHAP version up to 2000 and the GHCN Raw version for 2000 and after. They removed any USHCN interpolations (made during the SHAP adjustment, which is included in the Filnet version BTW); instead of using the SHAP/Filnet interpolation, they used the Hansenizing interpolation that we’ve discussed elsewhere in connection with dset=1 [raise eyebrows]. The splice was particularly problematic if there was a TOBS or other adjustment in effect as of 2000, and this led to the Y2K error (and resulting bias). This process yielded the GISS “Raw” version (dset0). For USHCN stations, dset1 is equal to dset0 since there is only one record and the Hansenizing dset1 splice does not come into play. From this, they then did another adjustment – their two-legged trend adjustment to coerce results to unlit stations. What’s a little hard to understand about this methodology is why “lit” stations are being used at all, if the unlit stations are driving matters – but that’s a question for another day.
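
To see why a splice of this kind produces a step, here is a minimal sketch with made-up numbers (not actual Lampasas data):

    # An adjusted series (SHAP, used through 1999) is joined to a raw
    # series (GHCN, used from 2000 on). If an adjustment of, say, 0.5 deg C
    # is in effect at the splice point, the adjusted leg carries it and the
    # raw leg does not, leaving a spurious 0.5 deg C step at Jan 2000.
    adjustment_at_splice = 0.5
    shap_to_1999 = [14.2, 15.1, 14.8]              # adjusted values (illustrative)
    raw_from_2000 = [15.3, 15.0, 15.6]             # raw values (illustrative)
    naive_splice = shap_to_1999 + raw_from_2000    # step included
    consistent = shap_to_1999 + [t + adjustment_at_splice for t in raw_from_2000]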

Post Y2K: After the Y2K problem was brought to their attention, they appear to have patched the above method by calculating (separately for each month) the difference between the GHCN Raw and SHAP versions for the 15 years up to and including 1999 and then adjusting all data up to and including 1999 by this amount, in effect re-writing their entire USHCN data set to cope with the Y2K splicing error. I think that it would have made more sense to leave the 99% of the data in place and adjust the post-2000 data. The NASA two-legged adjustment was re-applied to the revised data, leading to revised adjustments.
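
A minimal sketch in Python of that patch logic as I read it (not NASA’s code; made-up series keyed by (year, month) tuples):

    def patch_pre_splice(raw, adjusted, last_year=1999, window=15):
        # For each calendar month, average (raw - adjusted) over the
        # `window` years up to and including last_year, then add that
        # monthly offset to every adjusted value through last_year.
        patched = dict(adjusted)
        for month in range(1, 13):
            diffs = [raw[(y, month)] - adjusted[(y, month)]
                     for y in range(last_year - window + 1, last_year + 1)
                     if (y, month) in raw and (y, month) in adjusted]
            if not diffs:
                continue
            offset = sum(diffs) / len(diffs)
            for (y, m) in adjusted:
                if m == month and y <= last_year:
                    patched[(y, m)] = adjusted[(y, m)] + offset
        return patched

On my reading, the current (third) version described below applies the same logic with the splice point at Dec 2005 rather than Dec 1999.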

Current: After these first changes were made in mid-August 2007, NASA scrubbed the modified version and issued an entirely new version, to our considerable consternation last September. It got a little hard trying to keep up, but things seem to have settled down. In their third go at this, they’ve used the CDIAC Filnet version up to December 2005, which is now combined with the GHCN Raw version (for the three further months up to March 2006). As in the first patch, they appear to have patched the discontinuity by calculating the 15 year difference by month and then applying this difference to all records prior to Dec 2005.

It’s a bizarre and goofy way of doing things on a number of counts. First and most obviously, NOAA has a Filnet version through to May 2007 that is online and easily accessible. So why even use the stale GHCN version? And since they are using GHCN – why use the GHCN raw version and calculate a patch? If they’re going that route, which is stupid, why not just use the GHCN Adjusted version?

Filnet and UHI
When Hansen applies the NASA adjustment to data, relying on “unlit” sites, the effect of the USHCN Filnet adjustment needs to be considered. This adjustment appears to do some quite odd things, which we’ve talked about from time to time. There are some “good” USHCN sites, and it’s a little disquieting to see some of the adjustments to the “good” sites; while I haven’t assessed the quantitative effectiveness of the TOBS adjustment or the MMTS adjustment, I can see why such adjustments are reasonable in principle even for a “good” site, but the other adjustments applied to “good” sites trouble me. My impression of the impact of the SHAP/Filnet adjustments is that, whatever their stated intention, they end up merely creating a blend of good and bad sites, diluting the “good” sites with lower quality information from sites that are “bad” in some (objective) sense. When this version gets passed to Hansen, even his “unlit” sites no longer reflect original information, but are “adjusted” versions of unlit sites, in which it looks to me like there is blending from the very sites that are supposed to be excluded from the calculation.

Having said all this, even if the Hansen methodology is somewhat incoherent, as we discussed last fall, there are enough decent sites in the USHCN network that the overall results are not implausible.

The real issue in all of this is the quality of information in the ROW. On this topic, there are some opposing tendencies. I’m satisfied that the Lampasas site relocation in 2000 had a measurable impact on reported temperatures and demonstrates one more time, if further demonstration were needed, that you need impeccable meta-information and “good” sites if you want to develop the best quality long-term time series. You need to work out from the best data.

There are pros and cons to the USHCN network. It is a volunteer network, leading to some peculiar locations. Of course, some of these locations were chosen by volunteers who should know better, such as, presumably, Atmoz’ University of Arizona. Somehow it seems unlikely to me that Canadian or Swiss weather services would have quite the same profusion of exotic violations of WMO standards.

On the other hand, there doesn’t seem to be any shortage of exotic stories about mis-measurement in various parts of the world. Without proper documentation by the national services, it’s hard to assess the situation.

However, weather stations in other countries seem far more likely to be located at airports, and stations in the MCDW network, which make up the lion’s share of reporting stations in the post-1993 GHCN network used by both CRU and GISS, appear to be especially so. The classic urban heat island effect, as opposed to gross violations of WMO policy, would appear to be more of a consideration in these networks than in the USHCN network. (And of course, SSTs are a different story again.)

Regardless of whether these station histories “matter”, surely there’s no harm in NASA (and GHCN) adopting rational approaches to their handling of the USHCN network. To that end, I would make several recommendations to NASA:

1. Use the NOAA USHCN version rather than the stale CDIAC and/or GHCN versions.
2. Lose the splice and the patch.
3. Use USHCN interpolations rather than Hansenizing interpolations.
4. Use TOBS or perhaps MMTS, and if MMTS is used, ensure that NOAA places this online.

For GHCN, I’d be interested in knowing exactly where they are getting their data from. (It’s not impossible that the difference that I noticed here results from inconsistent USHCN versions – so maybe there’s something going on there as well.)

Here’s the range of the ensemble (with the two versions containing the Y2K error removed). The range of the ensemble in the early part of the century reaches about 1 deg C. The variation is not limitless. This range does NOT include the impact of the relocation – the effect that we originally sought to measure – which is on top of this ensemble. I may try to update this graphic at some point when we know more about the relocation error.
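
For what it’s worth, once the versions are collated the spread is easy to compute; a minimal sketch (versions as a dict mapping version name to a {year: value} series):

    def ensemble_range(versions):
        # Per-year spread (max minus min) across all collated versions.
        years = sorted(set().union(*(v.keys() for v in versions.values())))
        return {y: max(v[y] for v in versions.values() if y in v)
                   - min(v[y] for v in versions.values() if y in v)
                for y in years}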

texas68.gif

240 Comments

  1. Vic Sage
    Posted Feb 20, 2008 at 3:35 PM | Permalink

    Steve, I think what you’re doing is a good thing. Science, especially when it’s being used to make social policy, should be questioned.

    But do you think that the questioning will ultimately do any good?

  2. Joe Black
    Posted Feb 20, 2008 at 3:56 PM | Permalink

    but, hey,

    Heh, heh, heh, he said “but, hey”.

  3. jeez
    Posted Feb 20, 2008 at 4:08 PM | Permalink

    I’ll buy that for a dollar!

  4. Bill
    Posted Feb 20, 2008 at 4:13 PM | Permalink

    Is this a ‘proof’ for my new favorite saying (and I don’t remember who first stated it so I cannot give proper attribution much to my chagrin), “Numbers are like people, torture them enough and they’ll tell you whatever you want to hear.”

  5. steven mosher
    Posted Feb 20, 2008 at 4:27 PM | Permalink

    Ideally, NASA shouldn’t even do a temp series.

    NOAA controls the network and the data and the adjustments.

    NOAA should provide 1 climate-science-qualified dataset. Quality screened and
    adjusted.

    Intermediate files can be provided (raw etc.) but one file is quality assured.

  6. Posted Feb 20, 2008 at 4:33 PM | Permalink

    Is there any easy way to find out how many stations have had their records adjusted with the “current” NASA “cool the past to warm the present” method? That step down in the ’30’s and ’20’s is pretty impressive.

  7. crosspatch
    Posted Feb 20, 2008 at 4:38 PM | Permalink

    “But do you think that the questioning will ultimately do any good?”

    That was the thought I had. I was wondering if Steve M really expected any answers to those questions here or if they were rhetorical questions.

    What appears to me to be the case is that the modifications aren’t so much about values as they are about slope. It seems to be more important to get the slope of the change of temperatures over recent decades to be the way they want it than it is to get the values “correct”. “Fixing” the Y2K error caused the shape to change, with more modern temperatures being made cooler. So changes since that initial change seem to be about making the past cooler so that the old shape can be brought back to the graphs without making the recent times warmer. You just make the past colder. Then the graphs come back into the same shape they were before, albeit shifted down somewhat in absolute value. It seems that “rate of change” is being preserved rather than “magnitude of change”.

  8. Mike B
    Posted Feb 20, 2008 at 4:45 PM | Permalink

    Steve, your “ensemble” comment about made me fall out of my chair. Nicely played.

    It would be interesting to see what would happen if you put a dozen or so central Texas temperature series through the Mannomatic. I’ll bet you a cold one that PC1 would look like Lampasas.

  9. Robert in Calgary
    Posted Feb 20, 2008 at 5:08 PM | Permalink

    Re: 2

    Exactly what I was thinking (heh)

  10. Anthony Watts
    Posted Feb 20, 2008 at 5:15 PM | Permalink

    RE9,10 Steve S, Robert

    There may be a more plausible explanation, a flaw in methodology related to night lights. Steve Mc, Mosh, and I are looking at this now.

    Let’s not accuse with such broad strokes. We must find out what is going on with adjustment methodology first.

  11. Posted Feb 20, 2008 at 5:28 PM | Permalink

    Anthony,

    Since you have 100% coverage of the California stations maybe a focus there would provide the easiest path to determining whether it’s night lights or just “tune to cool”.

  12. Steve McIntyre
    Posted Feb 20, 2008 at 5:40 PM | Permalink

    Aside from anything in this network, one logical question is: if it’s necessary to do TOBS, SHAP, MMTS, FILNET and NASA adjustments to estimate the temperature in Lampasas TX, shouldn’t one do all those things for Shanghai as well?

  13. Geoff Sherrington
    Posted Feb 20, 2008 at 6:11 PM | Permalink

    A revisit needed?

    Since TOBS is a dominant land temperature adjustment, might we please be able to revisit the matter so eloquently put by Bob Meyer June 23rd, 2007 under “Stamford CT”, post #1?

    There’s something that I don’t understand about changing Time of Observation. If the system only records maximum and minimum temperatures what difference does TOBS make? The same high temp and the same low temp are recorded (although they might appear on different days). Over a month the average high and the average low temp should be very nearly the same with only one measurement changed in a thirty day period. Over a year there would be no discernible difference.

    Yes, I have done further reading to understand it and it eludes me still. In particular, (a) should the adjustment for a large area like USA trend towards zero over a long term, whereas adjusted graphs commonly show non-constant, non-zero adjustment? (b) is/was it still used after the shift to MMTS etc? (c) do we now have more recent papers where thermometers were read several times a day, the real max and min determined and the TOBS adjustment modelled under various prior operational assumptions, or compared with MMTS? (d) does the TOBS correction feed back to the calibration of satellite temp data?

  14. JimC
    Posted Feb 20, 2008 at 6:35 PM | Permalink

    Dr. Pielke’s latest paper and today’s discussion is on this very subject. One of the conclusions is thus:

    the poorly sited locations can not be “corrected” by using nearby better sited locations in order to provide added sources of independent data;

    http://climatesci.org/2008/02/19/photographic-documentation-of-poor-sitings-part-iii-from-our-jgr-paper/

    Not being in climate science I am not familiar with much of the methodology, however working in metrology (not to be confused with meteorology) for many years conditioned me to have a good sniffer for suspicious data. To be frank, this whole surface temperature affair would be totally unacceptable in any industrial capacity. Climate controlled labs don’t afford the same uncertainty in measurement claimed by the powers that be in climatology.

    Are the measurement devices calibrated? How often? History? What is their uncertainty? Have outside influences been accounted for and quantified? What is the resolution of the device? And on, and on.

    From what I gather, there is a considerable amount of subjectivity and despite reassurances the data is both precise and accurate to xx, it is idealistic at best IMO. Please convince me I’m wrong.

  15. George M
    Posted Feb 20, 2008 at 6:43 PM | Permalink

    Tell me again what possible justification there is for changing temperatures recorded a hundred years ago, back when the observers were more likely to be dedicated to their tasks? Instrumental error? Time of day? (I still don’t see that one) Poor observer eyesight? Or the reason no one wants to discuss? Funny that the adjustment matches the model ‘predictions’ so well. Seems to me that it ought to be the other way around, but there is that model vs. the real world conflict again, and we all know the models are much better than any observations. Is any of this for real?

  16. eric mcfarland
    Posted Feb 20, 2008 at 6:57 PM | Permalink

    What about the fact that Watts, et al., can’t even seem to find the sites … or line them up with their official locations … does that not make all of this “bad site” judgment a little bit premature?

  17. Earle Williams
    Posted Feb 20, 2008 at 7:02 PM | Permalink

    This is a bit off the topic, but I want to note the homogeneity correction for station moves due to change in elevation. As I understand from discussions of the GISTEMP code, a correction is applied based on the atmospheric lapse rate of 6.5 C per 1000 m. Any bit fiddlers recall if this is accurate? (mosher or John V in particular)

    My point is that the stations are not moved vertically in the atmosphere, they are moved to new locations on the earth surface. The free atmosphere assumption really doesn’t apply when your observation point is still ~ 1.5 m above the ground, which is known to have substantial reflective, conductive and radiative properties.

    To satisfy my curiosity I plotted up elevations and mean temperatures for about 35 random locations in the western US, noting the least square fit to be 4.4 C / 1000 m for the mean temperature in January, compared to 3.4 C / 1000 m for the mean temperature in July. Splitting the difference, that’s 3.9 C / 1000 m that would be justified in doing elevation adjustments. These locations run from the California central valley to the Missouri River between latitudes 36.0 and 41.5. Mean temperatures are from NWS calculations for the period 1971-2000.

    It seems to me that the USHCN has all the information necessary to do this calculation on a grand scale, to arrive at a much more meaningful value for doing elevation adjustments. What would be the net effect of using 3.9 versus 6.5 C per 1000 m? Any thoughts?
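
    A minimal sketch of this sort of least-squares fit in Python (placeholder values, not the 35 stations described above):

        # Ordinary least-squares slope of mean temperature vs elevation --
        # the empirical lapse rate discussed above. Data are placeholders.
        elev_m = [50.0, 800.0, 1500.0, 2200.0]
        tmean_c = [12.0, 9.1, 6.2, 3.4]
        n = len(elev_m)
        mx, my = sum(elev_m) / n, sum(tmean_c) / n
        slope = (sum((x - mx) * (y - my) for x, y in zip(elev_m, tmean_c))
                 / sum((x - mx) ** 2 for x in elev_m))
        print(slope * 1000.0, "deg C per 1000 m")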

  18. Sam Urbinto
    Posted Feb 20, 2008 at 7:03 PM | Permalink

    If it’s necessary to do TOBS, SHAP, MMTS, FILNET, NASA adjustments to estimate the temperature in Lampasas TX, shouldn’t one do all those things for Shanghai as well?

    It’s not a temperature estimate, it’s a sampling of temperature to determine an anomaly down the road to combine with other stations in the grid in question.

    On the other hand, I think you’ve been doing this long enough, Steve, to realize that one would probably have to be interested in getting something reliable in order to do things with any consistency and reproducibility.

    On yet the other hand, figures don’t lie, but…..

  19. Raven
    Posted Feb 20, 2008 at 7:05 PM | Permalink

    eric mcfarland says:

    What about the fact that Watts, et al., can’t even seem to find the sites … or line them up with their official locations

    I am not sure what your point is. If Anthony and company cannot find the sites based on the official description and location then that means any GISS UHI adjustments applied based on that data are wrong no matter what bias they might add.

  20. Earle Williams
    Posted Feb 20, 2008 at 7:05 PM | Permalink

    lowercase eric,

    What does the station metadata say about the location?

  21. Posted Feb 20, 2008 at 7:09 PM | Permalink

    Ref 15 George M. I agree completely. The temperatures measured were the temperatures measured. Error bars for instrumentation changes are fine, but suppressing temperatures in the past is nonsense. If the corrections are required stick them on the modern end.

  22. Sam Urbinto
    Posted Feb 20, 2008 at 7:12 PM | Permalink

    What about the fact that Watts, et al., can’t even seem to find the sites … or line them up with their official locations … does that not make all of this “bad site” judgment a little bit premature?

    I need you to go to find the station at Longitude 52.3 and Latitude 27.9 Oh, sorry, I meant -52.341 27.991

    No, I didn’t mean Iran or the center area of the Atlantic ocean, I meant the one someplace or another near -104 39

    Let me know when you find it.

  23. Joe Black
    Posted Feb 20, 2008 at 7:18 PM | Permalink

    eric mcfarland says:
    February 20th, 2008 at 6:57 pm

    What about the fact that Watts, et al., can’t even seem to find the sites … or line them up with their official locations … does that not make all of this “bad site” judgment a little bit premature?

    Gone looking for any stations on your own yet?

  24. eric mcfarland
    Posted Feb 20, 2008 at 7:23 PM | Permalink

    If you are going to critique sites based upon physical location … you’d better be certain about their location … or you may find out that you’ve been snapping pics of mine equipment or local weather station instruments. I realize that this level of skepticism may be too much for some of you who are born again contrarians … but it should be addressed. For one, if I were Watt, et al., I’d make a serious effort to work out my “finds” with ncdc and/or seek revisions in the official record as to site locations. Simply asserting that the government has its coordinates wrong … “ours are correct” (trust us) is not enough … at least from an auditing point of view … which I thought was the purpose and ethos of this page.

  25. Dave Dardinger
    Posted Feb 20, 2008 at 7:31 PM | Permalink

    One thing I think needs to be pointed out again, so people don’t get confused. The reason the past is changed rather than the present measurements is that if you do it the other way, you end up with the silly situation where the “official” temperature is rather colder than the measured temperature. This may be correct in a sense if you’re correcting an airport temperature for UHI for example, but to the man-on-the-street it looks like the weather people don’t know what they’re doing. Since what’s wanted, for climate study purposes, is just the change over time, and not an absolute measurement, it doesn’t much matter where the temperature is pegged, so it might as well be the moving present.

    But that said, people also need to be careful what they’re comparing to, say, the MWP temperatures via proxy. Here I’d think you’d want the corrected temperature all the way. But as this thread shows, that’s not very easy to know with the questionable locations of the thermometers today.

  26. eric mcfarland
    Posted Feb 20, 2008 at 7:38 PM | Permalink

    23: I just took my 3rd Tamiflu of 10. As soon as my temp. drops … I may.

  27. Dave Dardinger
    Posted Feb 20, 2008 at 7:45 PM | Permalink

    BTW, eric,

    Anthony’s data is public and easily accessible precisely so that any errors can be found and corrected. If the powers that be don’t have anyone checking his data and sending corrections, too bad. That said, it’s fine for you to make the point, now and again, that Watts’ results are only good pending verification. Just don’t make it a moving target, ok? Our old trolls here used to love to tell Steve Mc to stop complaining and do some tree boring. I don’t think I’ve seen anyone try that one lately. So what are you going to do about people telling you to go verify some USHCN locations?

  28. Earle Williams
    Posted Feb 20, 2008 at 7:52 PM | Permalink

    Re #26

    eric mcfarland,

    In the meantime be sure to note the metadata. What’s metadata you ask? The Wiki on it is a start. When you’ve got that fully digested you can then contemplate data quality while you rest up.

    You or anyone can put these site location aspersions to rest by simply demonstrating that the stations are in fact located where the metadata claim they are, and that the information they provide is fit for its intended use.

    Simple, no?

  29. steven mosher
    Posted Feb 20, 2008 at 8:05 PM | Permalink

    RE 24. Eric, as I’ve said before, your sceptical spirit is welcome. A long time ago,
    when I was defending Anthony’s work to other folks, I made what I thought were modest claims: “at the least Anthony is verifying the coordinates that have been published.” A number of people thought that was not a worthy endeavor. So when you question the coordinates of a site, that is a good thing. It means you acknowledge the value of getting data right.

    I’m surprised how often this point is lost on people.

  30. eric mcfarland
    Posted Feb 20, 2008 at 8:06 PM | Permalink

    28: Ptolemaic circles … perhaps?

  31. kim
    Posted Feb 20, 2008 at 8:07 PM | Permalink

    Systemic viremia runs deep. Into your head it will creep.
    =================================

  32. kim
    Posted Feb 20, 2008 at 8:25 PM | Permalink

    Another day older and deeper in dirt.
    San Diego don’t ya call me ‘cuz I can’t go.
    I owe my record to the company line.
    =======================

  33. deadwood
    Posted Feb 20, 2008 at 8:53 PM | Permalink

    I had wondered, particularly after the Y2K episode last summer, why GISS temp anomaly graphs were showing the early 2000’s so far above the 1930’s and 40’s. Perhaps this “night-light” adjustment explains it?

  34. D. Patterson
    Posted Feb 20, 2008 at 8:59 PM | Permalink

    Everyone seems to be neglecting a temperature data resource, and the impact of using this data resource could have far reaching consequences upon all of the air temperature studies. As I have previously noted, there have been a large number of weather stations which used thermographs to observe and record the air temperatures in the special, 1 hour, 3 hour, 6 hour, and daily min/max weather reports. Thermographs have been in use by the National Weather Bureau, National Weather Service, U.S. Army, U.S. Navy, U.S. Air Force, and other agencies over a period of 80 years or longer in some instances. Air temperatures are recorded continuously from second-to-second and minute-to-minute as a continuous trace on a bar chart. This bar chart is removed and replaced as needed, and the bar chart record is forwarded to an appropriate archive (such as NCDC). I’m not currently acquainted with the retention period of these bar charts in the various archives. If a researcher were to access these bar charts to obtain the continuous record of air temperatures observed at a site, it would then be possible to determine the actual observed average air temperature for a given period and not just the TOBS and variety of other schemes used to interpolate estimated temperature values. Obviously, there are a multitude of valuable lessons to be learned from comparing the actual continuous measurements to all of the interpolated and adjusted values being used in the multitude of manipulated air temperature datasets.

    Has anyone tried or considered trying to access these archived thermographic data charts? Try comparing those continuous records to the conventional datasets.

  35. eric mcfarland
    Posted Feb 20, 2008 at 9:12 PM | Permalink

    Here’s a suggestion: Mr. Watt and friends should provide a list of all of the sites where their “finds” do not match the coordinates published by ncdc. Then, somebody to at least begin doing a true-up with the agency. Any takers?

  36. eric mcfarland
    Posted Feb 20, 2008 at 9:15 PM | Permalink

    35: then somebody could do a true-up ….

  37. M. Jeff
    Posted Feb 20, 2008 at 9:27 PM | Permalink

    From the Lampasas Texas thread:

    eric mcfarland #77, says:

    February 18th, 2008 at 8:58 pm
    The lat and long are wrong … or these are different sites:
    http://www4.ncdc.noaa.gov/cgi-win/wwcgi.dll?wwDI~StnSrch~StnID~20024915

    Help me out here?

    Several replies to the Eric post on that thread focused on trying to help. Now on the current thread, Eric is seemingly raising the same issue without refuting or responding to the previous information that was given in an attempt to explain the issues.

    I suggest that Eric might profit from touring surfacestations.org for a few hours.

  38. eric mcfarland
    Posted Feb 20, 2008 at 9:35 PM | Permalink

    37: I think the biggest help was Mr Watts admitting that his “finds” often occur at locations that do not match the official coordinates put out by ncdc for the stations in question. So, what’s your point … aside from being sour?

  39. John Goetz
    Posted Feb 20, 2008 at 9:38 PM | Permalink

    The shape of the graph one gets when subtracting raw from adjusted for Lampasas, TX and others seems pretty typical. Past temperatures are forced further downward as a function of their distance from the present. The following plots from three stations near the Canadian border – two in northern Vermont and one in northern New York, present a shape I have not seen before. Has anyone looked closely enough at Hansen’s homogenization algorithm to understand how such adjustments can be made?

  40. M. Jeff
    Posted Feb 20, 2008 at 9:40 PM | Permalink

    surfacestations.org is an interesting and informative site.

  41. D. Patterson
    Posted Feb 20, 2008 at 9:41 PM | Permalink

    National Climatic Data Center (NCDC) Dataset 9957 Surface Records Retention System (SRRS) of Weather Observations, Forecasts, Summaries, Warnings and Advisories Issued by the National Weather Service (NWS) says the data retention period for observational records is only 5 years. I have to wonder whether or not the thermographic bar charts could have ended up in longer-term storage in some other archive or archives besides the NCDC?

  42. Jeff C.
    Posted Feb 20, 2008 at 9:48 PM | Permalink

    Re eric,
    Please folks, a skeptic is not someone who brings up the same point over and over despite multiple attempts by others to answer his question. That is a troll. Let’s not let him hijack another thread. It’s almost like TCO is back.

  43. aurbo
    Posted Feb 20, 2008 at 10:09 PM | Permalink

    Re #13, TOBS:

    TOBS is significant if the daily mean is computed from a single Max and a single Min temperature observed in a 24-hour period. Much of the US data, especially the COOP Data, is based strictly on Max and Min observed in this manner…once a day.

    Let’s say the time of observations is at 7AM LST. This is close to the usual time of minimum temperature. L

  46. eric mcfarland
    Posted Feb 20, 2008 at 10:16 PM | Permalink

    let the seance continue … by all means.

  47. Posted Feb 20, 2008 at 10:38 PM | Permalink

    I’ve lost track of the various revisions of US temperature history so I’m perplexed.
    Here is the difference between the Woodville Mississippi GISS temperature history as of August 2007 and as of now. Basically the new GISS version is about 0.5C warmer 1950-2000 than the prior version, but about the same since 2000.

    Is this associated with the Y2K problem? Interestingly, the absolute values of the post-2000 temperatures are the same in both versions and nearby rural stations don’t seem to show the same adjustment pattern.

    (Note: the August 07 values are estimated by reading the plot posted at Anthony Watts’ station gallery.)

  48. henry
    Posted Feb 20, 2008 at 11:07 PM | Permalink

    eric mcfarland says (again and again):

    February 20th, 2008 at 9:12 pm; 9:15 pm; 9:35 pm

    Here’s a suggestion for you: Look at the list of sites, find one near you, and go look for yourself. Shouldn’t be hard, there’s over 1200 of them. Report your findings on location, condition, etc. to us and to the Govt. If you trust the Govt locations and their description as “high quality sites”, it shouldn’t be a problem.

    You’re right: Mr Watts IS finding the equipment is NOT at the official coordinates put out by ncdc for the stations in question. He’s finding microsite problems, too. So, what’s your point?

  49. aurbo
    Posted Feb 20, 2008 at 11:36 PM | Permalink

    My Multi-posted #s 43, 44 and 45 should be ignored/eliminated. Cockpit error on my part. Here is the real post:

    Re #13, TOBS:

    TOBS is significant if the daily mean is computed from a single Max and a single Min temperature observed in a 24-hour period. Much of the US data, especially the COOP Data is indeed based strictly on Max and Min observed in this manner…once a day. There are two types of the classic mercury-in-glass thermometers used in the CRS obs. One is designed to record and retain the max temperature and the other to record and maintain the min temp. Both are reset manually at the time of observation.

    Let’s say the time of observation is at 7AM LST. This is close to the usual time of minimum temperature. Look at the following 2-day example: Day 1 Min at 7AM = 30°, Max = 60° at 4PM, for Day 2 Min 20° at 7AM, Max 50° at 4PM, night-time low = 40°. If the once-a-day time of observation is 7AM, then the Day 1 ob is taken on the morning of the following day when the min is 20° (the lowest temperature in the prior 24-hour period). The 24-hour mean for day 1 is (Max + Min)/2 = (20+60)/2 = 40°. If the night-time temps on day 2 never fall below, say 40°, then the 24-hour min for day 2 is once again 20°(!) (from the prior 24-hour Min at 7AM). This makes the day-2 mean (20 + 50)/2 = 35°.

    Now if the once-a-day obs are taken around 4PM near the time of max temperature, then for day 1 the mean is (30 + 60)/2 = 45° and for day 2 (20 + 60)/2 = 40°

    What happens is that in the first case, the minimum temperature often gets double-counted, while in the second case, the max temp gets double-counted.

    This problem is minimized if the mean is derived from an average of 24 hourly readings, and the problem is reduced (but not minimized) if the Max/Min temps are observed on a calendar-day basis.

    NCDC did a study on this many years ago and empirically derived a statistical correction factor to compensate for these TOBS biases. I don’t have a problem with this factor as regards the integrity of the climate observations.
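
    A minimal sketch encoding the arithmetic above, for anyone who wants to experiment with other observation times (the numbers are the ones in the example):

        # 7AM observer: the reset happens just after the morning min, so
        # day 2's 20-degree min is booked against day 1 and then carried
        # into day 2 as well.
        mean_7am_day1 = (20 + 60) / 2.0   # 40.0 -- min double-counted
        mean_7am_day2 = (20 + 50) / 2.0   # 35.0 -- reset value never undercut
        # 4PM observer: the reset happens near the max, so day 1's 60-degree
        # max is booked for day 1 and carried into day 2.
        mean_4pm_day1 = (30 + 60) / 2.0   # 45.0
        mean_4pm_day2 = (20 + 60) / 2.0   # 40.0 -- max double-counted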

  50. deadwood
    Posted Feb 21, 2008 at 12:10 AM | Permalink

    I was absolutely amazed recently during a visit to tamino’s site to read that folks there think GISS, by virtue of its large numbers, does a better job than any other organization at tracking warming.

    I see now they were right on the money. I also am beginning to see why they might be saying more than I thought they were.

    I also think it might be time to free the temperature from the self-interested.

  51. Anthony Watts
    Posted Feb 21, 2008 at 12:39 AM | Permalink

    RE35, Eric, If you are going to complain about my work being inaccurate, at least learn to spell my name, sheesh.

  52. Raven
    Posted Feb 21, 2008 at 12:42 AM | Permalink

    deadwood says:

    I also think it might be time to free the temperature from the self-interested.

    The accidental situation with the satellite data is nice. Two competing groups analyzing exactly the same data, and one of the groups owes no allegiance to AGW. However, I suspect the biases in the GISS data have infected the satellite data too, because the satellite groups were forced to tweak their algorithms until they produced the same trend as the GISS.

  53. eric mcfarland
    Posted Feb 21, 2008 at 12:47 AM | Permalink

    How about that list, Mr. Watts?

  54. bender
    Posted Feb 21, 2008 at 1:13 AM | Permalink

    Where does the burden of proof lie? IMO bold claims require strong evidence. Alarmists who want to re-engineer the economy should provide their data to the auditors. Who audits the auditors? Always a good question. Not eric mcfarland, that’s who.

  55. Anthony Watts
    Posted Feb 21, 2008 at 1:32 AM | Permalink

    RE35, ah… misread what you were asking for. In fact that list is already done; you can see the USHCN master list at http://www.surfacestations.org (XLS and HTML versions are available), and all of the GPS coordinates where the station was actually found have been recorded there. You can compare those to the NCDC database and whatever else you choose.

    the USHCN photographic database is at http://gallery.surfacestations.org/main.php

    Careful though, just saying they don’t match the NCDC database is not proof the wrong station has been surveyed; you have to dig deeper than that.

    But here is the way you can prove it; as Henry says, go find a station yourself.

    We love challenges here, so here is your chance to prove us all wrong. Pick one from the USHCN database at http://www.surfacestations.org, or from the http://www.co2science.org USHCN database, and tell us ahead of time which one you’ve picked so there will be no questions later. There are 1221 USHCN stations to choose from; I’ll make it easy for you – you can even survey one that’s been surveyed already, if that makes it easier, in the city you live in. That way you can prove we have surveyed the wrong station.

    Use the NCDC MMS database here to find your way to the station:
    http://mi3.ncdc.noaa.gov/mi3qry/login.cfm use the “guest login” and then search for the station.

    By doing so you can then prove to yourself and to us that the official coordinates given in the NCDC database don’t match GPS coordinates onsite or Google Earth coordinates, even though they are “close”, let’s say within a mile, and then prove you have the wrong USHCN station with photographs or observer testimony or NCDC station history/metadata records.

    Then, find the other official USHCN station(s) in the one mile radius that is the right one. Get new GPS coordinates and pictures. Present your findings.

    So go find a USHCN station, report back. Not just any station, a USHCN station, because that is what the project is about.

    From your view you’d think there’s one on every corner, and we are just getting the wrong one(s). Prove to us that there are many USHCN stations with MMTS thermometers or Stevenson Screens in a city and that we just have the wrong one down the block from the right one.

    I eagerly await the results from you. This blog is about duplicability, proof, and auditing; if you want your opinion respected, you have to do something of value to back it up. Otherwise, it’s just bits in the ether.

    Who knows, you may find something useful to contribute. Sometimes the best ideas come from challenges like this. Here is your chance. Prove me, the survey volunteers, and the other contributors to this effort wrong by demonstrating with data and photos that we have the wrong stations.

    When you get your station surveyed, let me know, and I’ll register you as a user so you can upload your results to the photo database.

  56. Geoff Sherrington
    Posted Feb 21, 2008 at 2:51 AM | Permalink

    Re # 49 aurbo

    Re TOBS

    Thank you for yet another patient explanation.

    The key to your essay is that Tmax is added to Tmin and then a mean is taken for the period of observation, typically a day or (worse) several. These are then accumulated to monthly, yearly, etc. When the TOBS correction is made, it is applied to many correct results as well as to a few wrong ones. If a 2-day reading is taken instead of a one-day one, there is a generalisation that the effect is probably small because it’s not often, though it does happen, that you go from flaming summer one day to freezing winter the next.

    If Tmax was averaged uncorrected over a year, then Tmin, and a mean taken, then surely the positive and negative biases would approximately cancel and you would be left with a yearly mean that had almost all of the Tmax right, almost all the Tmin right and therefore the Tmean just about correct for the year. (I do not endorse the use of Tmean at all, by the way).

    This is because a record has been kept, through thermometer design, of the correct Tmax and Tmin for almost every day. Why adjust correct results? The correct ones will not agree with the TOBS adjusted data as now performed.

  57. D. Patterson
    Posted Feb 21, 2008 at 4:22 AM | Permalink

    49 aurbo says:

    February 20th, 2008 at 11:36 pm
    My Multi-posted #s 43, 44 and 45 should be ignored/eliminated. Cockpit error on my part. Here is the real post:

    Re #13, TOBS:

    [….]

    NCDC did a study on this many years ago and empirically derived a statistical correction factor to compensate for these TOBS biases. I don’t have a problem with this factor as regards the integrity of the climate observations.

    How did NCDC factor into the statistical correction varying cyclical phenomena such as the temperature anomalies introduced by FROPA, adiabatic heating, inversions, and so forth; and how did NCDC determine such corrections would remain valid despite the changes resulting from cyclical climate effects introduced by El Nino, La Nina, ENSO, PDO, and like phenomena; or did they?

  58. MarkW
    Posted Feb 21, 2008 at 5:44 AM | Permalink

    In the spirit of the dendro-climatologists, I propose that we just accept that Lampasas is teleconnected to the rest of the world, and accept the temperature readings as is.

  59. MarkW
    Posted Feb 21, 2008 at 5:50 AM | Permalink

    eric is as usual, playing the troll by distorting what was said.
    Anthony’s survey team has complained about problems finding the official sites, and in a few cases have had to give up. But in the vast majority of cases, the official sites have been located and confirmed.

  60. Harold Pierce Jr
    Posted Feb 21, 2008 at 6:09 AM | Permalink

    Here is a copy of my recent post from Atmoz’s blog which is relevant to this topic.

    Question for Steve and Anthony: I visit a number of climate change blogs and websites, but I never see any link to John Daly’s website. Is there some reason for this?

    Hello Atmoz!

    Suspecting that the integrity of many land-based weather stations has been compromised over the years, such that they produce inaccurate data, the late John Daly analyzed the temperature records of several hundred mostly rural (i.e., remote) weather stations and displayed the results as temperature-time plots of the annual mean temperature. Go to http://www.john-daly.com, scroll down, and click on “Station Temperature Data”.

    These results are quite spectacular and in some cases absolutely astounding, e.g., Death Valley, whose graph shows flat lines since 1927. He specifically mentions Alice Springs AU, which is located in the middle of the continent and in the desert, and whose temperature records start in 1879. The plot for Alice Springs shows little or no change in the annual mean temperature, i.e., the deviation from the mean is zero for the last few years as compared to the early part of the temperature record.

    The results of his analyses confirmed quite conclusively his suspicions, and showed without a doubt that there is no evidence for any global warming. Given the number and world-wide distribution of these sites, I whole-heartedly concur with his conclusions.

    Unfortunately, John Daly passed away quite unexpectedly in 2004 from a heart attack. He was only 61, and many of his plots end ca. 2000-01. It would be of extreme interest to see additions of data to the end of 2007. I am going to do this for Alice Springs, Death Valley, and Yuma and Tombstone AZ.

    I’m really keen on desert weather stations since, due to the low rel. humidity, most of the complicating effects of water vapor are minimal. More importantly, there is usually little if any human activity to bias temperature measurements.

    Desert temperature records are the crucial data that can easily shoot down all this climate change claptrap and global warming gobbledygook. Deserts get very hot in the daytime because the heat energy from the surface (and plants in some cases) is removed mostly by conduction and convection and to some extent by emission of IR energy. However, after the sun sets the temperature plunges quite rapidly and often to near freezing. This is due to the absence of water vapor. If CO2 has any effect on air temperature, then we would expect the mean minimum temperature, which usually occurs just before sunrise, to show a small but discernible increase from ca. 1900 to the present, following the increase in the concentration of CO2. Since the plots from Death Valley are straight flat lines, I conclude that CO2 has no role in climate, at least at this site.

    I posted the ref to John’s website over at RC, but Gavin the Grinch whacked it. He and his crowd at RC are without a doubt aware of John’s work, but for obvious reasons have ignored and suppressed any ref to his site and the results of his research.

    -=-Harold Pierce Jr

  61. Joe Black
    Posted Feb 21, 2008 at 6:10 AM | Permalink

    eric mcfarland says:
    February 20th, 2008 at 9:12 pm

    Here’s a suggestion: Mr. Watt and friends should provide a list of all of the sites where their “finds” do not match the coordinates published by ncdc. Then, somebody to at least begin doing a true-up with the agency. Any takers?

    Dear eric,

    (North) America was initially populated by folks who strongly tended to do things for themselves. In a continuation of that streak, most (perhaps all) of the efforts here and at surfacestation.org are self-funded in treasure and time.

    If you want something done, I’d suggest that you do it yourself (you DO seem to be a helpless whiner though, so I doubt there will be any useful output from you).

  62. Reference
    Posted Feb 21, 2008 at 6:46 AM | Permalink

    At what point should NASA be called on to conduct an IV&V investigation of this situation? NASA has a worldwide reputation for quality control and some of the best experts in measurement and data analysis. NASA conducts inquiries into all aspects of its operations, from the recent allegations about astronaut health to the root causes of the Columbia disaster. They even investigated interference with the right of scientists to speak out after some guy called Hansen complained. Given that GISTEMP is one of the core datasets used to advise policymakers worldwide, aren’t there enough questions about its validity to justify at least an internal report?

  63. Bernie
    Posted Feb 21, 2008 at 6:47 AM | Permalink

    Harold:
    Interesting post. Desert temperatures do appear to be an interesting subset. I assume you would include “cold” deserts in this group of stations. They are certainly unlit! I am less sure that they are unimpacted by man, since it appears to take only a few yards of asphalt, a pond or a nearby a/c unit to create a micro-climate. It is sad that John Daly was not able to see how many others have made use of his original work.

  64. Glacierman
    Posted Feb 21, 2008 at 7:30 AM | Permalink

    RE: eric

    Your sniping has come to an abrupt end. You imply that Mr. Watts has surveyed the wrong sites because the coordinates do not match government records. You made several requests for a list, and then it was revealed that that work has already been done and the information is available to you. Show some initiative and do some leg work for yourself instead of throwing stones at others who are doing important work. Publish something showing that mining equipment is being surveyed and the actual surface stations were missed; then we can discuss it. If the stations are not where they are reported to be, that is important. If they are not sited properly, that is important. If there are problems with the surface station network, isn’t it important to know that? If it is shown that problems exist, shouldn’t the data be questioned? And I’m not even mentioning the adjustments.

  65. steven mosher
    Posted Feb 21, 2008 at 7:54 AM | Permalink

    re 17. On the lapse rate adjustment. Station moves in the past have been adjusted for
    in SHAP. The program is available I think, but I haven’t looked at it. From limited descriptions
    it would appear that changes in elevation, changes in moving from building top to ground,
    and perhaps latitude changes would be taken into account. I don’t know the lapse rate they use,
    or if they use a single rate or different rates for different seasons. Hansen makes 2 explicit
    lapse rate adjustments, documented in H2001: St Helena station and one other I can’t recall.
    He uses a constant 6C as opposed to the 6.5 you cite.

  66. steven mosher
    Posted Feb 21, 2008 at 8:09 AM | Permalink

    re 39. The adjustment code is in step2 of GISTEMP; the routine is PApars.f.

    The adjustment works roughly like this. If a site is dim or bright, it is compared to its dark
    or unlit neighbors:

    “The goal of the homogeneization effort is to avoid any impact (warming
    or cooling) of the changing environment that some stations experienced
    by changing the global trend of any non-rural station to match the
    global trend of their rural neighbors. If no such neighbors exist,
    the station is completely dropped, if the rural records are shorter,
    part of the non-rural record is dropped.”

    the rural stations (within 500km or 1000km) are combined to create a trend for that region

    C**** The combining of rural stations is done as follows:
    C**** Stations within Rngbr km of the urban center U contribute

    C**** to the mean at U with weight 1.- d/Rngbr (d = distance
    C**** between rural and urban station in km). To remove the station
    C**** bias, station data are shifted before combining them with the
    C**** current mean. The shift is such that the means over the time
    C**** period they have in common remains unchanged. If that common
    C**** period is less than 20(NCRIT) years, the station is disregarded.
    C**** To decrease that chance, stations are combined successively in
    C**** order of the length of their time record.

    So rural stations are combined, and then the urban station is subtracted to create
    a bias, warming or cooling; then a curve-fitting routine is called to
    create an adjustment to the urban station, which can be a two-legged linear adjustment
    with a variable hinge point.

    More later – I just found something in the code that puzzles me on the creation
    of the rural combination of stations.
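
    A rough Python paraphrase of that combining rule, on my reading of the comments above (annual values as {year: temp} dicts; the 20-year NCRIT screen is omitted):

        def combine_rural(stations, rngbr=1000.0):
            # stations: list of (distance_km, record), pre-sorted with the
            # longest record first, as the GISS comments describe. Each
            # station gets weight 1 - d/Rngbr and is shifted so its mean
            # over the overlap with the running combination is preserved.
            comb, wsum = {}, {}
            for d, rec in stations:
                w = 1.0 - d / rngbr
                common = [y for y in rec if y in comb]
                shift = 0.0
                if common:
                    shift = (sum(comb[y] for y in common) / len(common)
                             - sum(rec[y] for y in common) / len(common))
                for y, t in rec.items():
                    tw = wsum.get(y, 0.0)
                    comb[y] = (comb.get(y, 0.0) * tw + w * (t + shift)) / (tw + w)
                    wsum[y] = tw + w
            return comb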

  67. steven mosher
    Posted Feb 21, 2008 at 8:13 AM | Permalink

    John Goetz, can you provide a list of all stations in the US that receive no adjustment?

  68. PaulM
    Posted Feb 21, 2008 at 8:24 AM | Permalink

    Wow – temperatures around 1900 are adjusted downwards by GISS by almost a degree – that’s about the same as the alleged US temperature rise over the century. And that’s just the GISS adjustment – prior to that there were USHCN adjustments (and can anyone guess which way they moved the temperature trend? – if not, see the Atmoz blog).
    But it’s all your fault, Steve! After you found the error in August 2007, and after correction, 1934 was warmer than 1998, so Hansen just had to find another adjustment to bring the older temperatures down. This he did in Sept 2007, and it had quite a significant effect; for example, the 1890 US mean dropped from 0.23 to 0.10.
    It would be interesting to see the real raw data for, say, the 1890s at this site and then list all the ‘adjustments’ that have been applied to it.

  69. Max
    Posted Feb 21, 2008 at 8:29 AM | Permalink

    Most of the Canadian sites and data are online; is anyone looking at that data? It’s most likely “uncorrected”. For verification purposes pick some within a reasonable radius of Calgary; some of us lurkers can check the sites.

  70. MarkW
    Posted Feb 21, 2008 at 8:33 AM | Permalink

    In order to calculate changes due to the lapse rate, wouldn’t you need to know with some accuracy where the station used to be and where it is now?

  71. steven mosher
    Posted Feb 21, 2008 at 9:05 AM | Permalink

    STEVEMC and ANTHONY!!!!

    I found the dark, dim, bright problem.

    http://data.giss.nasa.gov/gistemp/station_data/station_list.txt

    This is the file GISS uses. See where A/B/C says dark, dim, bright.

    The program doesn’t use that field!!

    Scroll down to a site in the USA:

    725830060 CEDARVILLE lat,lon (.1deg) 415 -1202 R2A cc=425 0
    725910004 RED BLUFF/MUN

    See “R2A”? For sites in the US, Canada and Mexico, a numeric value is added – in this case ‘2’.
    1 = dark, 2 = dim, 3 = bright.

    In this case Cedarville is coded as dim, while the A value indicates DARK or unlit.

    The program uses the numeric field. See PApars.f.

    mystery deepens
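
    A quick Python sketch of pulling out the numeric field (the layout is guessed from the sample lines, not taken from any GISS documentation):

        import re

        def light_flags(line):
            # Matches e.g. "R2A": R/S/U population class, an optional numeric
            # night-lights code (1=dark, 2=dim, 3=bright) -- the field the
            # program actually uses -- and the A/B/C letter it ignores.
            m = re.search(r'\s([RSU])([123]?)([ABC]?)\s', line)
            return m.groups() if m else None

        print(light_flags("725830060 CEDARVILLE lat,lon (.1deg) 415 -1202 R2A cc=425 0"))
        # -> ('R', '2', 'A')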

  72. John Goetz
    Posted Feb 21, 2008 at 9:30 AM | Permalink

    #67 Steve

    Unfortunately I have not attempted to generate such a list. I stumbled across the strangeness while taking a closer look at the Burlington, VT station.

    #71 Steve

    Thanks for clarifying the use of the numeric. I had thought they were using A/B/C. I see that often A corresponds to 1, B to 2, and C to 3, but this is not universal.

    Anyway, it explains why little tiny Dannemora, NY is not considered rural. The Clinton County prison is located nearby, and it is so brightly lit one can see the glow from MILES away.

  73. bender
    Posted Feb 21, 2008 at 9:41 AM | Permalink

    #55 ah, the peaceful sound of crickets

  74. Darwin
    Posted Feb 21, 2008 at 10:01 AM | Permalink

    Re 73 — … after the toad croaked?

  75. Steve McIntyre
    Posted Feb 21, 2008 at 10:03 AM | Permalink

    #71. Excellent find. I’ll collate that information into my details data base and post this afternoon.

    One more example of why you need to inspect code with this bunch. How the hell could anyone figure out from the written description that this field was used as a brightness index while the field labeled “brightness” wasn’t? Grrrr….

    BTW I’ll be backatcha with a list of USHCN stations with no current NASA-stage adjustment. I’ve got scraped versions of dset1 and dset2 and it’s just a matter of comparing, schematically like the sketch below.
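
    (A minimal sketch of that comparison, in Python, assuming the scraped data are keyed by station id; the data layout is hypothetical – whatever the scrape produced:)

    # Call a station "unadjusted" if its scraped dset1 (raw) and
    # dset2 (homogenized) series coincide.
    def unadjusted_ids(dset1, dset2):
        # dset1, dset2: dicts mapping station id -> list of monthly values
        return sorted(sid for sid, vals in dset1.items()
                      if sid in dset2 and vals == dset2[sid])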

  76. Steve McIntyre
    Posted Feb 21, 2008 at 10:13 AM | Permalink

    A list of GISS-unadjusted USHCN stations (according to my collation of scraped data and not triple-checked) with particulars is online at http://data.climateaudit.org/data/giss/ushcn.unadjusted.dat. There’s a column for the Watts rating, but I haven’t updated this for about 6 months.

  77. Steve McIntyre
    Posted Feb 21, 2008 at 10:24 AM | Permalink

    Steve Mosher – this is a good idea to isolate these stations. Probably something that should have been done a long time ago.

    Some elements of the GISS approach to the data aren’t bad. They at least make an effort to identify “good” stations. And if one had to choose between this list and the MCDW urban airport list, you’d have to prefer the GISS list.

    I think that where this is going is to use the “good” US data to benchmark the sort of difference that can arise between “good” and MCDW sites, along the lines of the Peterson analysis last summer.

    Offshore where Hansen doesn’t use the same approach, it’s really a bait-and-switch to project from US results to the ROW. There’s a definite opportunism in some of the feedback on this as people deprecate the US 1930s-2000s delta on the basis that the US is only x% of the world area, but then use relative coherence of the GISS index in the US to “good” data as support for the ROW where they use different methods.

  78. M.Villeger
    Posted Feb 21, 2008 at 11:09 AM | Permalink

    #14 Jim C., as a casual observer I tend to agree with you!

  79. Craig Loehle
    Posted Feb 21, 2008 at 11:26 AM | Permalink

    What is really amazing is that no one from NASA or NOAA or Hadley or anywhere drops in here to offer an explanation for the weird adjustments, multiple versions, and changing versions. It can not possibly be because they are not aware of the site, because we know certain people comment on other blogs about this site. It is not really to their advantage to ignore it, because the discrepancies keep piling up (rather than going away).

  80. steven mosher
    Posted Feb 21, 2008 at 11:34 AM | Permalink

    Alright, this gets even better folks, and it’s somewhat complicated.

    Let’s start with the basics. GISTEMP has a file that describes the stations used

    http://data.giss.nasa.gov/gistemp/station_data/station_list.txt

    —ID—- Legend: R/S/U=rural/sm.town/urban A/B/C=dark/dim/bright cc=country-code brightness-index

    200460003 GMO IM.E.T. lat,lon (.1deg) 806 581 R A cc=222 0
    200690003 OSTROV VIZE lat,lon (.1deg) 795 770 R A cc=222 0
    202740000 OSTROV UEDINE lat,lon (.1deg) 775 822 R A cc=222 0

    I ask you to note 3 things:
    The Rural/small town/Urban
    The Dark/dim/bright ( A/B/C)
    And the brightness index.

    R stands for Rural, S for small town, and U for Urban. This is based on population
    data, where Rural is less than 10K, Small town or periurban is 10-50K, and Urban is >50K.

    A/B/C stand for dark/dim/bright, as noted in the header.

    Brightness Index, which is not defined. I don’t know the source for this.

    Now let’s look at a site from Mexico:

    711400010 ST ALBANS CANADA lat,lon (.1deg) 497 -996 R1A cc=403 0
    711400020 HILLVIEW CANADA lat,lon (.1deg) 499 -1006 R1A cc=403 0
    711400030 MINNEDOSA,MA lat,lon (.1deg) 503 -998 R2B cc=403 27

    Oops, that’s Canada, our insecure border to the north.

    Do you see R1A? That means
    R = Rural population and 1 = dark; 3 is bright,

    and by implication 2 = dim.

    So when GISTEMP starts to do adjustments it has to count rural stations and urban stations,
    like so in PApars.f:

    IF(NAME(31:32).EQ.' R'.or.NAME(31:31).EQ.'1') THEN
    ISR=ISR+1

    ……more code

    NSTAu=ISU ! total number of bright/urban or dim/sm.town stations
    NSTAr=ISR ! total number of dark/rural stations

    More on this line of code later, since it creates adjustments that do not match the descriptions
    in H2001. But for now, we note that ISR is the number of dark/rural stations (actually I believe this is a misdescription, which I will explain later).

    For now, here is what we have. We have a description of a site that includes
    an index of population, R/S/U; an index of brightness, A/B/C; another index of brightness,
    1/2/3; and yet another index of brightness, which is an integer value ranging from zero upwards.

    The R/S/U field is used by the program.
    The 1/2/3 field is used by the program.
    The A/B/C field?? I have not found it used.
    The brightness index? I have not found it used.

    Everybody with me?

    So, here is how H2001 describes the process.

    4.2.1. Classification of meteorological stations. Meteorological stations were classified by Hansen et al. [1999]
    as rural, small town, or urban, based on the population estimate provided as metadata in the GHCN record, as in the
    earlier study by Easterling et al. [1997]. This classification was used to identify which stations would be corrected
    for possible urban warming (adjustments were made only in “urban” areas, i.e., those with a population over 50,000)
    and also to identify nearby rural stations that could be used to define the magnitude of the adjustment. Problems
    with this approach include not only the age of the population data and the poor geographical resolution but also the
    fact that a population of even 10,000 (the division between “rural” and “small town”) or less can produce significant
    local climate effects [Mitchell, 1953; Landsberg, 1981].
    In the current GISS analysis within the United States the long-term temperature trend is based on only the
    “unlit” stations identified by satellite data, as the long-term temperature trends of the periurban and urban stations
    are adjusted to match the mean trend of neighboring unlit stations. Only about one quarter of the “small-town”
    stations are unlit, but this more stringent definition of a rural area still leaves about 250 stations in the United States.
    As the contiguous United States covers only about 2% of the Earth’s area, the 250 stations are sufficient for an
    accurate estimate of national long-term temperature change, but the process inherently introduces a smoothing of the
    geographical pattern of temperature change.
    This reclassification of stations is carried out here only for the United States and bordering regions in
    Canada and Mexico, where Imhoff et al. [1997] have analyzed brightness data into these three categories. Thus for
    the rest of the world we continue to use the GHCN population classification of stations to decide which stations
    should be adjusted.”

    So, Imhoff 1997 (which I will have to return to later) creates 3 categories: dark, dim, bright. In the program the value for this field is 1/2/3.

    NAME(31:31)=brightnessIndex 1=dark->3=bright

    Ok. So what gets adjusted and how does it get adjusted.

    H2001:

    The urban adjustment in the current GISS analysis is a similar two-legged adjustment, but the date of the
    hinge point is no longer fixed at 1950, the maximum distance used for rural neighbors is 500 km provided that
    sufficient stations are available, and “small-town” (population 10,000 to 50,000) stations are also adjusted. The
    hinge date is now also chosen to minimize the difference between the adjusted urban record and the mean of its
    neighbors. In the United States (and nearby Canada and Mexico regions) the rural stations are now those that are
    “unlit” in satellite data, but in the rest of the world, rural stations are still defined to be places with a population less
    than 10,000. The added flexibility in the hinge point allows more realistic local adjustments, as the initiation of
    significant urban growth occurred at different times in different parts of the world.
    The urban adjustment, based on the long-term trends at neighboring stations, introduces a regional
    smoothing of the analyzed temperature field. To limit the degree of this smoothing, the present GISS analysis first
    attempts to define the adjustment based on rural stations located within 500 km of the station. Only if these stations
    are insufficient to define a long-term trend are stations at greater distances employed. As in the previous GISS
    analysis, the maximum distance of the rural stations employed is 1000 km.”

    So in the US and parts of Mexico/Canada UNLIT stations are used, and in the rest of the world Rural.
    What does the code do? Nothing of the sort.

    In the GISTEMP code, the sites that are counted as Rural (ISR):

    IF(NAME(31:32).EQ.' R'.or.NAME(31:31).EQ.'1') THEN
    ISR=ISR+1

    In words: the population index has to be R, OR the Imhoff nightlights index has to be 1.
    RURAL OR DARK.

    HOWEVER, this condition is satisfied by the following, where R is population and the integer is brightness:

    R1: Rural Dark
    R2: Rural Dim
    R3: Rural Bright
    S1: Small Dark
    U1: Urban Dark.
    Some examples

    717000020 MINTO,NB lat,lon (.1deg) 460 -660 R1A cc=403 0
    717000030 WOODSTOCK,NB lat,lon (.1deg) 462 -676 R3C cc=403 23
    717000040 HARDWOOD RIDGE,NB lat,lon (.1deg) 462 -659 R2A cc=403 10

    722070020 BEAUFORT 7SW lat,lon (.1deg) 324 -808 R2B cc=425 0
    722210000 VALPARAISO/ lat,lon (.1deg) 305 -865 R3C cc=425 20
    722480030 EL DORADO/FAA AIRPORT lat,lon (.1deg) 332 -928 S1A cc=425 0
    717050000 MONCTON,N.B. lat,lon (.1deg) 461 -647 U1C cc=403 26

    So, if a site is Rural in population and has ANY brightness level, it is counted in
    ISR (the number of rural stations),
    OR if a site is DARK it is counted as Rural regardless of population.

    So, when GISTEMP says:
    NSTAu=ISU ! total number of bright/urban or dim/sm.town stations
    NSTAr=ISR ! total number of dark/rural stations

    it’s not exactly accurate:

    ISR is the number of Rural dark, Rural dim, Rural bright, Small dark, Urban dark.
    ISU is the number of Small dim, Small bright, Urban dim, Urban bright.

    So, in the US and parts of Canada and Mexico, RuralDark, RuralDim, RuralBright,
    SmallDark, and UrbanDark

    are used to adjust everything. So, when H2001 says that GISTEMP
    uses only UNLIT data, it’s wrong. They use DIM and BRIGHT iff the population
    is R (rural).

  81. Steve McIntyre
    Posted Feb 21, 2008 at 11:39 AM | Permalink

    #80. you left us hanging in suspense.

  82. Steve McIntyre
    Posted Feb 21, 2008 at 11:40 AM | Permalink

    #32. Bingo.

  83. steven mosher
    Posted Feb 21, 2008 at 11:41 AM | Permalink

    Oops, some lines dropped out from that post. Quirky brackets.

    How do I know that 1 = dark, 2 = dim and 3 = bright?

    PApars.f

    NAME(31:31)=brightnessIndex 1=dark–3=bright

  84. Patrick M.
    Posted Feb 21, 2008 at 11:46 AM | Permalink

    Re 73 (bender):

    Shall we assume that now that Eric has read #55 he is too busy checking out stations, to post any more?

  85. Anthony Watts
    Posted Feb 21, 2008 at 11:48 AM | Permalink

    Mosh, You have excelled! My hat is off to you! Three cheers for Mr. Mosher.

    I’m just happy to be a facilitator for you and Steve by pointing out things that don’t make sense in our surveys.

    Steve Mc. I welcome your analysis.

  86. Larry T
    Posted Feb 21, 2008 at 12:03 PM | Permalink

    re:80

    If that is Fortran, some implementations treat a comparison of a one-character constant “R” against a two-character field as comparing “R-null” vs “R-space”, giving a false condition. That could change the logic tremendously.

  87. John Goetz
    Posted Feb 21, 2008 at 12:18 PM | Permalink

    #80 Steve

    I’m not sure I fully understand the implication. Are you saying that GISS says they do not adjust only rural or only dim but they in fact do not adjust bright rural?

    For example, you are saying that VALPARAISO and MONCTON are used to adjust nearby stations?

    If that is the case then it is interesting indeed. I took a quick look at both. MONCTON, which is U1C, shows no adjustment. VALPARAISO, which is R3C, shows a 0.1C adjustment downward from 1949-1953. So if VALPARAISO is adjusted, and then is itself used to make adjustments…curious indeed.

    And referring to my comment #39 above. Dannemora, NY is 53km from Burlington, VT. Dannemora is R2B and Burlington is S3C. Dannemora is adjusted, and so is Burlington. However, is Dannemora being used to adjust Burlington?

  88. steven mosher
    Posted Feb 21, 2008 at 12:20 PM | Permalink

    H2001 has a table on page 14 showing all the categorizations and how they overlap.

    It doesn’t note an Urban dark.

    but then we have this

    717050000 MONCTON,N.B. lat,lon (.1deg) 461 -647 U1C cc=403 26

    And the code says that 1 is dark and 3 is bright; for example, Ottawa:

    716280051 OTTAWA CANADA lat,lon (.1deg) 454 -757 U3C cc=403 141

    From H2001 it’s clear that Hansen wanted to use sites that were ONLY RURAL and ONLY UNLIT:

    “Only about one quarter of the “small-town”
    stations are unlit, but this more stringent definition of a rural area still leaves about 250 stations in the United States.
    As the contiguous United States covers only about 2% of the Earth’s area, the 250 stations are sufficient for an
    accurate estimate of national long-term temperature change, but the process inherently introduces a smoothing of the
    geographical pattern of temperature change.”

    In H2001’s Table 1 (page 14) Hansen shows 942 stations with populations less than 10K.
    Of these, 256 are UNLIT.

    But the code in PApars.f appears to select a different class of stations. As I noted:
    Rural OR Dark.

    I have two more files to search before I can say this for sure.

    I’ve found one place where the code does not do that. I’ll keep searching.

  89. PeteW
    Posted Feb 21, 2008 at 12:33 PM | Permalink

    IF(NAME(31:32).EQ.' R'.or.NAME(31:31).EQ.'1') THEN

    It’s a long time (~30 years) since I wrote any Fortran, so excuse this comment,
    but should the test be NAME(31:32).EQ.' R'.or.NAME(33:33).EQ.'1'

    assuming NAME is a character array and it’s looking at characters, e.g. ' R1'?

  90. Gary
    Posted Feb 21, 2008 at 12:37 PM | Permalink

    Mosh,

    If you’re looking for stations to do a little experiment, I suggest trying three really closely spaced ones that run the range from rural-unlit to urban-bright.

    725070010 BLOCK ISLAND STATE AP lat,lon (.1deg) 412 -716 R1A cc=425 0
    725070030 KINGSTON lat,lon (.1deg) 415 -715 R2C cc=425 22
    725070060 PROVIDENCE WSO AP lat,lon (.1deg) 417 -714 U3C cc=425 67

    They’re in a 50-mile north-south transect and two of the three have been surveyed in Surfacestations at http://gallery.surfacestations.org/main.php?g2_itemId=231 (not Block Island which was closed a few years ago). However, this station is interesting because of the marine influence (the island is 12 miles offshore).

    I briefly looked at the data awhile back and know that the wiggles match pretty well. It seems you could get a good handle on UHI adjustments comparing these three stations.

  91. Posted Feb 21, 2008 at 12:39 PM | Permalink

    steven mosher, maybe I’m reading the code wrong, but the IF(NAME…) statement seems to be checking 31:32 for a (blank, letter) combination but then checks 31:31 for a number. The data, however, while they do have a (blank, letter) combination, have the number in 33:33, and I don’t see a number in either 31 or 32????

    And it seems that these comment statements:

    C**** ETC. NAME(31:31)=BRIGHTNESSINDEX 1=DARK->3=BRIGHT
    C**** ETC. NAME(32:32)=POP.FLAG R/S/U RUR/SM.TOWN/URBAN
    C**** ETC. NAME(34:36)=COUNTRY CODE
    imply that the number should be in 31:31 and the letter in 32:32.
    I have not studied the code in detail.

  92. wkkruse
    Posted Feb 21, 2008 at 12:40 PM | Permalink

    PeteW, I was just about to send in almost the same message. But I think it should be

    NAME(31:32).EQ.'R '.or.NAME(32:32).EQ.'1'. Position 32 is not blank only when there is a bright/dim/dark indicator. Note in my suggestion that a space follows R. So if the S, U, and R are in position 31, and dim/dark/bright in 32, then Hansen is correct.

  93. Earle Williams
    Posted Feb 21, 2008 at 12:47 PM | Permalink

    Re #89

    PeteW,

    It’s been a long time for me as well, ~20 years and I don’t recall the indexing convention for FORTRAN character arrays. Does 31:32 refer to two characters, or just one? In either case, the test is inconsistent by using 31:31 when looking at the light code.

    Disregarding the position of the U or R, testing two characters against a one character constant will always return FALSE. If the code snippet
    IF(NAME(31:32).EQ.' R'.or.NAME(31:31).EQ.'1') THEN

    is the actual test then it will only ever be true when NAME(31:31) is 1.

  94. Earle Williams
    Posted Feb 21, 2008 at 12:52 PM | Permalink

    Self-correction: I didn’t see the space preceding the ‘ R’ in the code snippet. In that case it is a 2 to 2 comparison.

  95. Steve McIntyre
    Posted Feb 21, 2008 at 12:55 PM | Permalink

    Steve Mosh, can you refresh on where you think the middle code in R1A etc comes from? Or for that matter what the provenance of column 3 is?

  96. wkkruse
    Posted Feb 21, 2008 at 12:58 PM | Permalink

    steven mosher

    If I am interpreting it correctly, the station list that you referenced in #80 above, has the R,S, U codes in the column preceding the 1,2,3 code. If the code in GISSTEMP is as I suggested in 92, then it will do the right thing. Your if statement implies that the R,U,S precedes the 1,2,3 code. Note also that in your if statement, the check is against ‘spaceR’, i.e. 2 positions.

  97. wkkruse
    Posted Feb 21, 2008 at 1:01 PM | Permalink

    Correction, your if statement implies that 123 precedes R,S,U

  98. eric mcfarland
    Posted Feb 21, 2008 at 1:04 PM | Permalink

    I see that I have been burned and dragged through the streets in my absence. Oh well. As soon as my flu lifts, I plan to pick up the trail. Things will not happen over night, however.

  99. John Goetz
    Posted Feb 21, 2008 at 1:19 PM | Permalink

    I would have thought the middle code would be derived from the brightness index at the end, but then I found:

    744130010 HOLLISTER lat,lon (.1deg) 424 -1146 R1A cc=425 10
    744130020 HAZELTON lat,lon (.1deg) 426 -1141 R2A cc=425 8

    which blew that theory out of the water, as Hazelton has a brightness index of 8 and a middle code of 2, while Hollister has a brightness index of 10 and a middle code of 1.

  100. Doug
    Posted Feb 21, 2008 at 1:23 PM | Permalink

    mosher,

    I’ve been a quiet observer here for about a year and really enjoy the good work so many people do. I have a BS in Mechanical Engineering from URI in Kingston, RI, have been a professional software engineer for ~20 years, and actually wrote and maintained some Fortran code back in the late 80’s. Do you have the snippet of code that reads in the data file? I agree with Dan Hughes’s assessment but would like to see the code to be sure. If Dan is correct then the first half of the “.or.” statement would never be satisfied. I’m interested in the parsing logic. I think I even have an old Fortran book in the attic!

    Also, at first glance the metadata for Block Island, Kingston and Providence look reasonable. Block Island is very rural, nine months out of the year the population is less than one thousand. To your point there is a heavy marine influence. Kingston is the home of URI which can swell to right around 10,000 residents when classes are in session, which accounts for the dim rating. Otherwise the town runs around 2,000 in population. Providence is a city of about 100,000 but there is also a strong marine influence there. It is situated at the mouth of Narragansett Bay and often lies on the wet side of the snow/rain line when Alberta Clippers roll through. Hope this helps and I would like to contribute as a code reviewer if possible.

  101. Steve McIntyre
    Posted Feb 21, 2008 at 1:28 PM | Permalink

    #80,91. Steve Mosher, a question about Fortran.

    If you have a station with values “1R” in cols 31:32, will the condition IF(NAME(31:32).EQ.' R' be true or false? It would be FALSE in R and I presume also in Fortran. If so, isn’t this a red herring?

  102. Anthony Watts
    Posted Feb 21, 2008 at 1:29 PM | Permalink

    Just got off the phone with Mosher, he’s traveling, so will be offline a bit.

    Here is how I see it:

    By using a logical “or” statement rather than “and”, the number of rural stations that meet the defined criteria increases dramatically, providing little if any selectivity for a broad swath of rural stations.

    IF(NAME(31:32).EQ.' R'.or.NAME(31:31).EQ.'1') THEN
    ISR=ISR+1

    Should read:

    IF(NAME(31:32).EQ.' R'.and.NAME(31:31).EQ.'1') THEN
    ISR=ISR+1

    A simple logic error – “or” instead of “and” – allows adjustment under a broader station criterion.

    Which explains why Cedarville, a lights=0 station with no urbanization, about as middle-of-nowhere as nowhere can be, gets its surface record cranked up for no apparent reason.
    See this:

    http://wattsupwiththat.wordpress.com/2008/02/17/cedarville-and-giss-adjustments/
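
    (A toy Python illustration – mine, not anyone’s production code – of how much the OR and an AND reading differ on the two-character NAME(31:32) codes discussed above, digit first. Note the two quoted conditions can never literally both hold at once, since column 31 cannot be both blank and '1', so the sketch reads the intended fix as “rural AND dark”:)

    # Example NAME(31:32) values from this thread, digit-first.
    samples = [' R', '1R', '2R', '3R', '1S', '1U', '3U']

    or_rural  = [s for s in samples if s == ' R' or s[0] == '1']
    and_rural = [s for s in samples if s[1] == 'R' and s[0] == '1']

    print('OR  selects:', or_rural)    # [' R', '1R', '1S', '1U']
    print('AND selects:', and_rural)   # ['1R'] -- rural AND dark only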

  103. Posted Feb 21, 2008 at 1:34 PM | Permalink

    re:#100

    Based on the data examples in the above comments, it looks like all have _LNL where _ is a blank, L is a Letter, N is a Number, L is a Letter. The IF test will always be true if _L = _R, independent of the N. It’s counting all Rs.

    There is the possibility that data are read in one order and then re-arranged somewhere so that the Number precedes the Letter and the Number is put into 31:31. In this case the first part of the IF always tests False but the second part tests True if the Number in 31:31 is 1.

    I hope I have this correct.

  104. DeWitt Payne
    Posted Feb 21, 2008 at 1:37 PM | Permalink

    Raven #52,

    However, suspect the biases in the GISS data have infected the satellite data too because the satellite groups were force to tweak their algorithms until they produced the same trend as the GISS.

    Please cite a credible source for this statement. AFAIK, all adjustments to the calculation algorithms have been the result of the discovery of errors in those algorithms (orbital decay, orbital drift, etc.). The most recent being an incorrect formula in the UAH algorithm. The fact that the trends from RSS and UAH are now similar to GISS and Hadley is not convincing evidence that RSS and UAH were “forced to tweak their algorithms…”.

  105. wkkruse
    Posted Feb 21, 2008 at 1:45 PM | Permalink

    These 2 lines are taken from the station list.

    711040070 BARKERVILLE,BC lat,lon (.1deg) 531 -1215 R A cc=403 0
    711080010 NORTH NICOMEN,BC lat,lon (.1deg) 492 -1220 R2B cc=403 12
    It looks to me that the R,S,U code precedes the 1,2,3 code. For the ROW, the 1,2,3 is blank (except for those nearby Canadian and Mexican stations.) R,S,U clearly comes before the 1,2,3 and must be in column 31 and 1,2,3 must be in column 32. So if column 31 and 32 are ‘R ‘, then we have a ROW rural station. Or if column 32 is ‘1’, we have a US rural station.

  106. JM
    Posted Feb 21, 2008 at 1:48 PM | Permalink

    Please cite a credible source for this statement. AFAIK, all adjustments to the calculation algorithms have been the result of the discovery of errors in those algorithms (orbital decay, orbital drift, etc.). The most recent being an incorrect formula in the UAH algorithm. The fact that the trends from RSS and UAH are now similar to GISS and Hadley is not convincing evidence that RSS and UAH were “forced to tweak their algorithms…”.

    What may happen in this kind of correction is that corrections are made in the direction of what people believe to be the correct value, and only until there is no longer a difference. Potential corrections in the opposite direction are never researched.

  107. Larry T
    Posted Feb 21, 2008 at 1:56 PM | Permalink

    re: my previous Fortran comment – I had the misfortune to work on a Unisys-to-IBM Fortran conversion, and many errors in the code were of the kind that I referred to. I do not know what machines the code runs on, but I do know that different implementations of Fortran handle constant comparisons differently. I am not sure it is a problem – just something that I would test to see if it was valid in my evaluation of the software for bugs.

  108. Doug
    Posted Feb 21, 2008 at 2:08 PM | Permalink

    re:#103

    I’ve seen a lot of things in code over the years, so I shy away from assuming what a string array should look like at the time of inspection. I would be happy to reverse engineer the code and compare results with anyone. This is what I do as part of my job. I manage a team of ten C#/.Net developers and regularly review code for adherence to standards. When I saw this code it reminded me of some old tricks. Most often I’ve seen cases like this in block programming languages where a seemingly useless logical clause was left in place 1) to support backwards compatibility for an older data format or 2) to provide a quick way to test an upper or lower bound condition by quickly adding white space in the right place to a data set. Of course the third option is a logic flaw, but without the actual program requirements the best we can do is make an educated guess. I don’t endorse such practices and such code would never hit production in the financial industry where I now work. However, I’ve worked in academia where standards were not as stringent and also contracted for the DoD, from where most of the industry’s standards and best practices have emanated. Like I said, I’ve seen a lot and am very curious as to what the code looks like.

  109. MarkW
    Posted Feb 21, 2008 at 2:16 PM | Permalink

    I wonder how college towns are calculated?

    Their populations can swell tremendously depending on whether classes are in session. Do they use the vacation population, or the in session population, or some kind of average?

    Regardless, there has to be enough concrete and asphalt around to support the population at the in session levels.
    Also the brightness would vary tremendously depending on the time of year.

  110. Posted Feb 21, 2008 at 2:19 PM | Permalink

    Doug, the code is here: http://data.giss.nasa.gov/gistemp/sources/GISTEMP_sources.tar.gz

    PApars.f is in STEP2

    Thanks for any assistance.

  111. MarkW
    Posted Feb 21, 2008 at 2:19 PM | Permalink

    Another question about how brightness is calculated: are all the measurements taken at the same time of year?
    I’ve been in some towns that go all out on Christmas decorations. Other towns barely notice the season. All of that outdoor lighting is going to affect the numbers.
    Beyond that, I suspect in most areas there is more after-dark outdoor activity in the summer than in the winter, with a corresponding increase in outdoor lighting.

    The more I think about it, the worse the idea of using “brightness” as a proxy for UHI seems to be.

  112. Posted Feb 21, 2008 at 2:26 PM | Permalink

    re:#101

    It is False in Fortran.

  113. Kenneth Fritsch
    Posted Feb 21, 2008 at 2:27 PM | Permalink

    Re: #104

    Please cite a credible source for this statement. AFAIK, all adjustments to the calculation algorithms have been the result of the discovery of errors in those algorithms (orbital decay, orbital drift, etc.). The most recent being an incorrect formula in the UAH algorithm. The fact that the trends from RSS and UAH are now similar to GISS and Hadley is not convincing evidence that RSS and UAH were “forced to tweak their algorithms…”.

    I second this request. I recall having a discussion with DeWitt P previously at CA on this subject without resolving whether there is a direct link from surface temperature measurements to MSU measurements – for calibration, be it implicit or explicit.

    Unless someone can lay out the case for a proper and accurate adjustment to land surface temperatures in the face of the “quality control” problems revealed here and elsewhere, I have been thinking that an easier exercise in getting a handle on temperature trends might be tracking down the accuracy/precision of satellite and SST measurements, or at least giving them equal bandwidth with what is used in discussing the surface temperatures in the US.

  114. Gary
    Posted Feb 21, 2008 at 2:31 PM | Permalink

    Not all college towns are the same, even the rural ones, because the local population doesn’t have to be adjacent to the campus. At Kingston, RI the town center and most of the suburbanization is about four miles away. Since the 1930s, temperatures have been taken at the bottom of the hill on which the university is located. Cold air drains downslope while the “urban” heat of the campus at 150′ higher elevation goes up. Then there is the summer seabreeze from the ocean 10 miles to the south chilling hot afternoons. Compare this with the station at Ithaca, NY – a small city containing Cornell University and Ithaca College. The station is to the east by a couple of miles in an open agricultural field. Downtown lights give this an “S2C” code (small city – bright), yet the station is probably far enough away from the UHI and may be more affected by the asphalt road 30′ away.

  115. D. Patterson
    Posted Feb 21, 2008 at 2:39 PM | Permalink

    Here is an example and source on the calibration of microwave sounding units and the use of adjustments which result in conformity with surface observations already known to have problems with incorrect measurement and adjustment methods.

    Grody et al., Calibration of multisatellite observations for climatic studies: Microwave Sounding Unit (MSU), Journal of Geophysical Research, Vol. 109, D24104, doi:10.1029/2004JD005079, 2004.
    [….]
    One of the major difficulties in using empirical corrections, such as (10), is that inaccuracies in both the
    model and adjustment parameters can produce errors in the corrected measurements.
    [….]
    It should be understood, however, that the corrections listed in Table 3 are based on the MSU measurements supplied to us by NOAA, which contains “corrections” to the NOAA 12 nonlinear coefficients based on the paper by Mo [1995]. Obviously, the adjustments listed in Table 3 would be different if, for example, the MSU was calibrated using the “incorrect” set of nonlinear coefficients or if in fact no nonlinearity was applied.
    [….]
    This study reports our latest findings concerning the MSU calibration. However, of major importance is the effect of these new calibration adjustments on the derived global climatic trend. In our first investigation [Vinnikov and Grody, 2003] where only a constant bias was applied to the measurements, the global trend increased from 0.22 K/decade to 0.26 K/decade when a second harmonic diurnal adjustment was included in the determination of the bias and trend. However, the application of these new calibration adjustments with no second-harmonic adjustments results in a global trend with nearly the same 0.17 K/decade obtained from surface observations (Vinnikov et al., submitted manuscript, 2004).

  116. steven mosher
    Posted Feb 21, 2008 at 2:41 PM | Permalink

    RE 91. First, Dan Hughes, since he understands the current mystery I am working on.

    Yes, that is the last thing that is bugging me: this reordering of columns.
    I’m not ready to throw the bug flag until I figure that out.

    The station_list.txt is an OUTPUT FILE, not input, so the order MAY have
    been switched. I KNOW it’s an output file because stations have been removed
    (Crater Lake), but I have not found the place in the code where station_list.txt is
    written. I have also not found the place where the 'R1C' fields are initially
    read in.

    A friggin data flow diagram would be nice, even a cartoon version.

    In any case, the condition that Hansen described in H2001,

    RURAL AND UNLIT (256 out of 942),

    is not captured by the code: rural OR unlit.

    IF(NAME(31:32).EQ.' R'.or.NAME(31:31).EQ.'1') THEN
    ISR=ISR+1 !!! INCREMENT number of rural
    IF(ISR.GT.NSTAM) STOP 'ERROR: NSTAM TOO SMALL' !! Rural stations exceed max stations
    MFSR(ISR)=MFCUR
    I1SR(ISR)=I1Snow
    ILSR(ISR)=I1Snow+LENGTH-1
    CSLATR(ISR)=COS(XLAT*PI180)
    SNLATR(ISR)=SIN(XLAT*PI180)
    CSLONR(ISR)=COS(XLON*PI180)
    SNLONR(ISR)=SIN(XLON*PI180)
    IDR(ISR)=ITRL(3)
    ELSE
    ISU=ISU+1
    IF(ISU.GT.NSTAM) STOP 'ERROR: NSTAM TOO SMALL'
    MFSU(ISU)=MFCUR
    I1SU(ISU)=I1Snow
    ILSU(ISU)=I1Snow+LENGTH-1
    CSLATU(ISU)=COS(XLAT*PI180)
    SNLATU(ISU)=SIN(XLAT*PI180)
    CSLONU(ISU)=COS(XLON*PI180)
    SNLONU(ISU)=SIN(XLON*PI180)
    IDU(ISU)=ITRL(3)
    CC(ISU)=NAME(34:36)
    END IF
    I1Snow=I1Snow+LENGTH
    GO TO 50
    90 CONTINUE
    NSTAu=ISU ! total number of bright/urban or dim/sm.town stations
    NSTAr=ISR ! total number of dark/rural stations
    LDTOT=I1Snow-1 ! total length of IDATA used
    write(*,*) 'number of rural/urban stations',NSTAr,NSTAu

  118. conard
    Posted Feb 21, 2008 at 2:45 PM | Permalink

    Take a look at text_to_binary.f line 50

    name(31:31)=li(102:102) ! US-brightness index 1/2/3=dark/dim/brite
    name(32:32)=li(68:68) ! population index (R/S/U=rural/other)
    name(33:33)=li(101:101) ! GHCN-brightness index A/B/C=dark/dim/brt
    name(34:36)=li(1:3) ! country code (425=US)
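
    (In other words, the inventory record gets shuffled so that the nightlights digit lands in column 31 and the population letter in column 32 – digit first, the order the PApars.f test expects. A hypothetical Python rendering of those four assignments, remembering that Fortran substrings are 1-indexed and inclusive:)

    def build_name_tail(li):
        # li: one v2.inv record (a string of 102+ chars);
        # Fortran li(102:102) -> Python li[101], etc.
        name = [' '] * 36
        name[30]    = li[101]    # NAME(31:31) US nightlights 1/2/3
        name[31]    = li[67]     # NAME(32:32) population flag R/S/U
        name[32]    = li[100]    # NAME(33:33) GHCN brightness A/B/C
        name[33:36] = li[0:3]    # NAME(34:36) country code (425=US)
        return ''.join(name)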

  119. wkkruse
    Posted Feb 21, 2008 at 2:46 PM | Permalink

    Dan Hughes #110

    My earlier surmise was wrong. The code for PApars.f says in the early comments that NAME(31:31) = brightness index and NAME(32:32) = R/S/U index. So the if statement that mosher referenced, IF(NAME(31:32).EQ.' R'.or.NAME(31:31).EQ.'1') THEN ISR=ISR+1, will do what it is supposed to do if NAME(31:31) is blank when there is no brightness index. Remember that NAME(31:32) is a concatenation of positions 31 and 32 of NAME. ' R' in positions 31 and 32 will indicate a ROW rural station, and '1' in position 31 will indicate a US rural station. The code is correct.
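
    (To make that concrete, here is a hypothetical Python re-implementation of the test – not the GISS code – enumerating what it selects under each possible column order, since the order is exactly what has been in dispute:)

    # Fortran NAME(31:32) is 1-indexed and inclusive; in Python, a
    # two-character string. The PApars.f test, re-implemented:
    def counted_rural(c31, c32):
        return (c31 + c32) == ' R' or c31 == '1'

    combos = [(n, p) for n in ' 123' for p in 'RSU']  # nightlights x population

    # If column 31 holds the 1/2/3 digit and 32 the R/S/U letter:
    print('digit-first :', [n + p for n, p in combos if counted_rural(n, p)])
    # -> [' R', '1R', '1S', '1U'] : ROW rural, plus any US-region DARK station

    # If column 31 held the letter and 32 the digit, nothing would ever match:
    print('letter-first:', [p + n for n, p in combos if counted_rural(p, n)])
    # -> []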

  120. Philip_B
    Posted Feb 21, 2008 at 3:01 PM | Permalink

    I’m really keen on desert weather stations since, due to the low relative humidity, most of the complicating effects of water vapor are minimal.

    Harold, unfortunately, I think that is something of a myth. I live in Western Australia, where we have quite a lot of desert. I am struck by two phenomena. One is how cold cloudy days in summer are. You can see that in the temperature data for Kalgoorlie, where the last few days have been cloudy (with rain) and the max temperature hasn’t gone above 15C versus the February average max of 32C. The other is how often we get humid winds (but no clouds) from the desert interior and how hot those days are (always the hottest days of the year).

    So paradoxically, high humidity with clouds results in very cold days, whereas high humidity without clouds results in very hot days.

  121. Anthony Watts
    Posted Feb 21, 2008 at 3:37 PM | Permalink

    Steve Mc, I think we need a new thread for the GISS code discussion; we have 2-3 conversations going on here in this one.

  122. steven mosher
    Posted Feb 21, 2008 at 3:41 PM | Permalink

    RE 101. SteveMc, read the code again. This one fooled me as well.

    The TEST in the code is

    IF(NAME(31:32).EQ.' R'.or.NAME(31:31).EQ.'1') THEN…..

    ' R' fooled you – that is 'space R'.

    The description of the input file is

    NAME(32:32)=pop.flag R/S/U rur/sm.town/urban

    The OUTPUT file (station_list.txt) records those stations that are ACTUALLY USED.

    So I’m trying to sort out this confusion in the ordering, BUT it really doesn’t matter.

    If you believe H2001, he says he looks at 250 or so Rural Unlit sites – as he describes it, 25%
    of the small-town stations.

    In the data, Table 1 (page 14), 256 of 942 GHCN stations are Rural Unlit.

    Steve: Steve Mosh, when I compare the stations that are actually adjusted (and I have classified every one of the 7364 GISS stations now, including the USHCN stations), I’m satisfied that, within the USHCN network, it’s the NAME(31:31)=1 stations that are left unadjusted and the other stations go into the adjustment algorithm. I counted 221 code=1 stations. There are 10 code=2 and 4 code=3 stations which aren’t adjusted, but I suspect that they go into the algorithm and empirically are left unchanged as the trends match, though I can’t be sure. If this is not the case, then I have no idea how the branches are made.

  124. DeWitt Payne
    Posted Feb 21, 2008 at 3:43 PM | Permalink

    D.Patterson #115,

    Back when UAH was still producing very low trends, Vinnikov and Grody wrote a paper claiming the actual trend was higher than the surface trend (the 0.26 C/decade cited above). The rebuttal by Christy and Spencer showed that Vinnikov and Grody’s method produced unphysical diurnal temperature data (two peaks rather than one).

    I read this as comparing satellite trends to surface trends after the fact, not as a result of adjusting the algorithm to produce a particular trend. Calibration of the MSU’s is an interesting subject in itself. The basic calibration is done before the satellite is launched by measuring the response to objects over a wide range of temperature. That’s the source of the twelve coefficients cited above, IIRC. IIRC, the coefficients were changed after it was discovered that the calibration algorithm was producing incorrect coefficients. Calibration is maintained in space by correcting slope and intercept by observing the emission from a plate on the satellite maintained at a constant temperature and emission from the cosmic microwave background.

    The MSU data has the problem that the sensors have finite bandwidth. That means that a range of altitude and temperature contributes to the intensity measured. There is no a priori reason to prefer a particular altitude or temperature (ill-posed problem) so I’m pretty sure that reference must be made at some point to other measurements to produce the actual anomaly data. I found a reference to what I think is the source of the method, but I’m too lazy to look it up again. It’s here on CA somewhere.

  125. Steve McIntyre
    Posted Feb 21, 2008 at 3:48 PM | Permalink

    #121. Let’s leave it here for now; we know where to find it.

    Some new information. I’ve made a compilation of USHCN stations that have been adjusted (nice to have scraped data from GISS to work with) and compared that to the various flags.

    All the stations with the middle code=1 are not adjusted. Nearly 99% of the stations with middle code 2 or 3 are adjusted; I suspect that the other 1% are ones that go into the algorithm but empirically aren’t adjusted because the trends coincide.

    So we can boil things down to the middle code – the one which does not even have a column heading. In the first 4000 stations or so, this column is blank and just looks like a separator between the two tags on the left and right. I didn’t even notice this tag when I first read this table last year.

    Now that Steve Mosher has spotted this tag from working back from the code, it provides some clues to an old mystery – Grand Canyon was one of the very first stations that I looked at in decoding the Y2K problem. It was R-rural and lights=0, but it was adjusted, ending up with something that looked more like Tucson. We can now see why (in algorithm terms) it is adjusted – it has a middle code of 2 and therefore gets adjusted. But why does it have a middle code of 2? Where do these things come from? Why would a station which is R-rural and lights=0 be classified 2 rather than 1?

    Today’s NASA mystery.

  126. steven mosher
    Posted Feb 21, 2008 at 3:55 PM | Permalink

    Sorry.

    Bracket things are messing up my posts.

    I’ll simplify. H2001 indicates that they have 250 or so Unlit Rural stations.
    This is supported by Table1 page 14.

    The code does this to select Rural Unlit

    IF(NAME(31:32).EQ.' R'.or.NAME(31:31).EQ.'1') THEN

    So a site that was URBAN DARK (U1) would satisfy this condition

    717050000 MONCTON,N.B. lat,lon (.1deg) 461 -647 U1C cc=403 26

    Population = 55K

    MONCTON Gissraw – MONCTON Homogenized.

    Does MONCTON get adjusted ( U1) or not?

    Simple test.

  127. steven mosher
    Posted Feb 21, 2008 at 4:06 PM | Permalink

    The test is Name(31:32) .eq. ' R' – space R.

    In the text of H2001 he says there are about 25% or about 250 rural Unlit stations.

    table 1 page 14 shows

    256 stations of 942 are Rural Unlit.

    The code says If name = R or lights =1.

    Find the ‘and’

  128. conard
    Posted Feb 21, 2008 at 4:17 PM | Permalink

    is it my imagination or did the heading of the station list file just change:

    http://data.giss.nasa.gov/gistemp/station_data/station_list.txt

    was:
    —ID—- Legend: R/S/U=rural/sm.town/urban A/B/C=dark/dim/bright cc=country-code brightness-index

    now:
    —ID—- Legend: cc=country-code
    P: R/S/U=rural/sm.town/urban (population)
    N: 1/2/3=dark/dim/bright (satellite light data 1995, only near cont. US)
    B: A/B/C=dark/dim/bright (part of GHCN’s inventory file)
    PNB brightness-index

  129. conard
    Posted Feb 21, 2008 at 4:24 PM | Permalink

    steven mosher:

    perhaps I do not understand what you are asking? using my previous posts as a guide …

    if there is a station without a N value and a rural P value select it.
    if there is a station with a N dark value select it

    the first case covers ROW, the second the cont. us

  130. John Goetz
    Posted Feb 21, 2008 at 4:32 PM | Permalink

    #128 conard

    Yes, I still have a cached copy up in a tab, so I went to the link you provided, did a reload, and voila!

  131. Kenneth Fritsch
    Posted Feb 21, 2008 at 4:39 PM | Permalink

    http://www.climateaudit.org/?p=2746#comment-214781

    Back when UAH was still producing very low trends, Vinnikov and Grody wrote a paper claiming the actual trend was higher than the surface trend (the 0.26 C/decade cited above). The rebuttal by Christy and Spencer showed that Vinnikov and Grody’s method produced unphysical diurnal temperature data (two peaks rather than one).

    I continue to see references to Vinnikov and Grody, usually as an aside, but not often as being refuted by Christy and Spencer.

    The MSU data has the problem that the sensors have finite bandwidth. That means that a range of altitude and temperature contributes to the intensity measured. There is no a priori reason to prefer a particular altitude or temperature (ill-posed problem) so I’m pretty sure that reference must be made at some point to other measurements to produce the actual anomaly data. I found a reference to what I think is the source of the method, but I’m too lazy to look it up again. It’s here on CA somewhere.

    When all is said and done we still do not have a connection between surface temperatures and satellite measurements. It is an important issue in all these discussions in my judgment. Perhaps it is time to pose the question to Christy and/or Spencer.

  132. steven mosher
    Posted Feb 21, 2008 at 4:42 PM | Permalink

    RE 128… IT JUST CHANGED. I checked it when I did my number 80 post.

    This is too weird.

    Somebody go download the latest drop of the GISS code.

  133. Gene
    Posted Feb 21, 2008 at 4:51 PM | Permalink

    http://tamino.wordpress.com/2008/02/21/practical-pca/
    work is already being done to correct the misconceptions on this page.

    Steve:
    Tamino makes no reference to Climate Audit and contrary to your claim here he makes no suggestion that there are any misconceptions on this page. None of the sites mentioned in Tamino’s post are discussed here. In an earlier post, I compared Blanco to Lampasas – an exercise quite similar in concept to what Tamino did – to provide evidence that the Lampasas re-location introduced problems after 2000. Tamino suggests that there were errors at Cedarville and Mt Shasta in the 1920s. That’s quite possible. I’ve hardly suggested that the process of identifying errors in this data set was anywhere near completion.

    It’s my understanding that the sort of dataset error that Tamino is pointing out here is what USHCN was already supposed to have ironed out in their SHAP and FILNET procedures. So what you’re saying is that Tamino is giving evidence of the failure of the USHCN procedure, which hardly comes as a shock to me. The FILNET objective makes sense to me, but I’m not presently in a position to appraise their actual methodology as I’m not sure exactly what they do.

    I’ve also not expressed any opinion to date on the merits or otherwise of the GISS two-legged adjustment. Sometimes things that seem to make sense create odd biases, e.g. the Hansen Y2K problem, which had a bias, but one couldn’t tell from its face what the direction was.

  134. Steve McIntyre
    Posted Feb 21, 2008 at 5:00 PM | Permalink

    #128. You’re right. It did just change. I had the prior version in my browser. It was, as you said,

    —ID—- Legend: R/S/U=rural/sm.town/urban A/B/C=dark/dim/bright cc=country-code brightness-index
    200460003 GMO IM.E.T. lat,lon (.1deg) 806 581 R A cc=222 0
    200690003 OSTROV VIZE lat,lon (.1deg) 795 770 R A cc=222 0
    202740000 OSTROV UEDINE lat,lon (.1deg) 775 822 R A cc=222 0
    202920005 GMO IM.E.K. F lat,lon (.1deg) 777 1043 R A cc=222 0
    203530000 MYS ZELANIJA lat,lon (.1deg) 770 686 R A cc=222 0
    206740006 OSTROV DIKSON lat,lon (.1deg) 735 804 R A cc=222 0
    207440000 MALYE KARMAKU lat,lon (.1deg) 724 527 R A cc=222 0
    207440001 MALYE KARMAKU lat,lon (.1deg) 724 527 R A cc=222 0
    208910006 HATANGA lat,lon (.1deg) 720 1025 R A cc=222 12

    I refreshed the cache and, lo and behold, we now have a defined term:

    —ID—- Legend: cc=country-code
    P: R/S/U=rural/sm.town/urban (population)
    N: 1/2/3=dark/dim/bright (satellite light data 1995, only near cont. US)
    B: A/B/C=dark/dim/bright (part of GHCN’s inventory file)
    PNB brightness-index
    200460003 GMO IM.E.T. lat,lon (.1deg) 806 581 R A cc=222 0
    200690003 OSTROV VIZE lat,lon (.1deg) 795 770 R A cc=222 0

    Hey, Reto and Jim, don’t be shy. You can come and play with us.

  135. steven mosher
    Posted Feb 21, 2008 at 5:07 PM | Permalink

    Alright, at 11:34 AM I accessed the GISS station_list.txt

    And This is what it said

    —ID—- Legend: R/S/U=rural/sm.town/urban A/B/C=dark/dim/bright cc=country-code brightness-index

    200460003 GMO IM.E.T. lat,lon (.1deg) 806 581 R A cc=222 0
    200690003 OSTROV VIZE lat,lon (.1deg) 795 770 R A cc=222 0
    202740000 OSTROV UEDINE lat,lon (.1deg) 775 822 R A cc=222 0

    For several days, here and at Tamino and at Atmoz, I have been trying to figure out
    this mystery of Miles City, Cedarville, and other apparently rural places that are
    adjusted.

    Now, less than 3 hours after posting about this weird little thing in the code,
    where RURAL AND UNLIT is picked out by code that says ' R' OR 1
    (and the MONCTON thing), we see the header
    magically change to this:

    —ID—- Legend: cc=country-code
    P: R/S/U=rural/sm.town/urban (population)
    N: 1/2/3=dark/dim/bright (satellite light data 1995, only near cont. US)
    B: A/B/C=dark/dim/bright (part of GHCN’s inventory file)
    PNB brightness-index

    Holynightlights batman.

    Conard… explain to me in a logic table what conditions you think
    the test pulls out.

    Column 31 is 1/2/3 or space.
    Column 32 is R/S/U.

    Then (name(31:32) eq ' R' OR name(31:31) eq '1') is true when…

    well, when N eq 1.

    So this picks out RuralDARK, PeriUrbanDARK and UrbanDARK,

    regardless of the truth value of name(31:32) eq ' R'.

    Witness MONCTON: U1, urban dark and unadjusted.

    So, can you explain the various possibilities for name(31:32)?

    Also, I’m not finding the bin-to-text file you referenced.

  137. Posted Feb 21, 2008 at 5:07 PM | Permalink

    #134 Too funny! You never know who is lurking CA.

  138. steven mosher
    Posted Feb 21, 2008 at 5:18 PM | Permalink

    Good god!! Well, one nice thing is this: if you post something at Tamino and they can’t figure it out, then NASA gets put on the job – because god forbid Tamino, the proxy, should be embarrassed (I actually like him).

    It would be SO simple to post a proper list of input files, proper documentation
    of the stations, documentation of the stations changed, and which stations were used to
    adjust which stations.

    It’s 10K lines of code, and 60% of it is a nonsensical input/output dance… AND, as I said,
    a friggin nightmare waiting to happen.

    There’s more here guys..

  139. Gary Gulrud
    Posted Feb 21, 2008 at 5:28 PM | Permalink

    This is like watching the forensics on CSI (actually flipping past CSI) with the perps in the closet, under the desk, etc., sppppoookkyy stuff.

    I’m not thinking of betting on the merely clever, bad karma.

  140. Steve McIntyre
    Posted Feb 21, 2008 at 5:34 PM | Permalink

    Most of the code looks like it was written for a machine with about 512K memory or something like that. Writing data out to files and back in.

  141. MarkR
    Posted Feb 21, 2008 at 5:39 PM | Permalink

    JohnV. Just Posted on Tamino…

    “They fail to mention that the two adjustments mostly cancel each other out…. The(y) fail even to consider that in the global record, there’s a Mt. Shasta for every Cedarville, in fact one might guess that there are far more stations with a cooling adjustment than with a warming one, designed to mimic UHI which is predominantly false warming.”

    The old “two wrongs make a right law”, I guess?

    Will it won’t it get through the Tamino trap?

  142. Steve McIntyre
    Posted Feb 21, 2008 at 5:42 PM | Permalink

    In the other GISS data file
    http://data.giss.nasa.gov/gistemp/station_data/v2.temperature.inv.txt
    which has 7364 records (about 1000 more than the station file that we’ve been looking at with the codes), there is a quantitative index for “lights”. I’ve done a concordance of the two data sets to see if the quantitative lights measurement matched the classification – there’s a correlation, but it doesn’t match.

  143. conard
    Posted Feb 21, 2008 at 5:42 PM | Permalink

    STEP2/text_to_binary.f

    STEP2/input_files/v2.inv
    P is column 68
    B is column 101
    N is column 102

    oh yeah, column numbers start at 1 not zero.

    I don’t have time right now to examine your question but will later this evening

  144. Anthony Watts
    Posted Feb 21, 2008 at 5:45 PM | Permalink

    RE140, Steve that would be about right. Remember much of this started out on older mainframes, where RAM was at a premium but hard disk space or tape spool was more readily available.

    Coding on older mainframes in FORTRAN was always a challenge, as there were many ways to overflow the limited memory if not careful.

    Big projects like this then evolved a strategy of pipeline processing of a series of black box code modules run in batch, with each black box creating an output file for the next black box to use as an input file. The problem then became one of keeping track of what happens as the data flows along the pipeline. A mistake at one stage, if not caught, affects the entire remaining pipeline.

    I once did a 3D rendering system in the mid/late 80’s that consisted of just such a pipeline operation, done in 640k on an 8MHZ IBM AT…the batch files were huge.

    So think pipeline processing when thinking through the GISS process.

  145. wkkruse
    Posted Feb 21, 2008 at 5:58 PM | Permalink

    steven mosher, I read Hansen’s description of the process to say that the brightness code supplants the population code for US and neighboring stations. So getting URBAN DARK (‘1U’) or PERI-URBAN DARK (‘1S’) is irrelevant. They’re both unlit and therefore rural according to GISS. The code also has to check NAME(31:32) to get rural stations not in the US or nearby.

  146. Steve McIntyre
    Posted Feb 21, 2008 at 6:08 PM | Permalink

    Here’s a plot of the middle (lit) code in http://data.giss.nasa.gov/gistemp/station_data/station_list.txt against the lights index in http://data.giss.nasa.gov/gistemp/station_data/v2.temperature.inv.txt . Today’s mystery. There’s some sort of connection. Perhaps the code is drawn from a different version of the lights index and they’ve put a different and inconsistent version up because, well, just, because. Or perhaps there’s some more complicated algorithm. It’s a mystery, but we’re making headway on this.

  147. Steve McIntyre
    Posted Feb 21, 2008 at 6:19 PM | Permalink

    Now that we’ve decoded the adjustment tag, let’s re-visit Miles City, which Atmoz reasonably identified as a “good” site. Here’s the record for Miles City showing that, despite its seemingly “good” location, it is classified as a 3 and therefore does not contribute to the subset used as a reference point for the hinge analysis.

    742300020 MILES CITY FCWOS lat,lon (.1deg) 464 -1059 R3C cc=425 26

    So the question really is: what do the code=1 sites look like? And note that there are quite a few (48 or so) non-USHCN sites that are code=1. My guess is that the code=1 network is probably going to be pretty reasonable as these things go. The question will then turn to whether the code=R sites in China are “good”.

  148. pouncer
    Posted Feb 21, 2008 at 6:45 PM | Permalink

    Y’see, guys? This is exactly why reputable scientists are reluctant to archive stuff.

    Put your data and your code out there for the mob to pick at, and sure as the
    world they’ll find SOMETHING to gripe about. Dark places coded as bright,
    Canadian cities coded with lat/long of Panama, logical “AND” tests that include
    things and logical “OR” tests that wind up excluding … Murphy’s Law bites everybody
    but as long as the shroud of authority covers the fine details none of the
    general population need worry about it. But let the sun shine in and the
    details be examined and soon, well, the whole “reputable” thing kinda winds up
    in trouble, y’know.

  149. Anthony Watts
    Posted Feb 21, 2008 at 7:50 PM | Permalink

    Mosh/Steve Mc

    column 31 is 1/2/3 or space

    I wonder what the code does in the case of a space? Does it assign a null value or something else?

    I’ve seen situations where code assigns the ASCII value of a string literally: a space is hex 20 (decimal 32), not a null 0.
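
    A quick stand-alone check of that point (nothing to do with the GISS code itself):

    program space_test
      ! A blank in a CHARACTER field is ASCII 32, not a null 0,
      ! so ' R' and '1R' really do differ in their first byte.
      print *, ichar(' '), ichar('1'), ichar('R')   ! prints 32 49 82
      print *, (' R' == '1R')                       ! prints F
    end program space_test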

  150. Sam Urbinto
    Posted Feb 21, 2008 at 7:55 PM | Permalink

    I’m running out of time, and haven’t read everything here, but here’s something to ponder, from an idea I got on the BB regarding something along the lines of “If you go outside at midnight on August 1st in these 5 cities, what temperature do you expect?”

    My answer would be something like “Whatever the temperature range was 90% of the time for the years in question at that location.” Obviously, if 90% of the day/time readings were between 45-50 degrees, I’d expect it to be 45-50 degrees, but we humans are adaptable; it could be 116 or it could be 30. Made me think about averages.

    So let’s take an experiment. Bizarro City is the only monitoring area, in the center of a 5×5 grid, but it is a very very very large city and has records for the center of each of the 4 sectors that make up the city. The average temperature is 50 F. Odd Grid has data for 4 years also, and it has four stations, one in the center of each quadrant of the 5×5 grid, at an airport. The average temperature is 50 F also.

    So at 23:45 GMT, on August 1st of the four years as an average, both have 50 F. What does that tell us? Nothing, really. If I walk out the door, I might expect 50. I’d be wrong in each case, at least based upon the past, without knowing the specifics.

    For years 1-4: temperatures each year, mean, and standard deviation (rounded to the nearest whole degree).

    Station                  Yr1  Yr2  Yr3  Yr4  Mean  StdDev  Naive call for year 5
    Bizarro City Sector A    -50   50  -50   50     0      58  expect -50
    Bizarro City Sector B      0   50    0   50    25      29  expect 0
    Bizarro City Sector C    125   75   90  110   100      26  expect 125?
    Bizarro City Sector D     75   75   75   75    75       0  expect 75

    Bizarro City: mean 50 F, STDDEV 24

    Odd Grid Airport 1        40   60   40   60    50      12  expect 40
    Odd Grid Airport 2        50   50   50   50    50       0  expect 50
    Odd Grid Airport 3        20   80   20   80    50      35  expect 20
    Odd Grid Airport 4        -8   47  102   60    50      55  ???

    Odd Grid: mean 50 F, STDDEV 25

    Now, if I knew the exact per-year temp for a given place at the 23:45 GMT reading, for each of the 4 years, in the 5th year I could somewhat know what it was likely going to be; for the others, well, not so much… But as you see, I’m treating these all as some sort of patterns, where even if something as bizarre and odd were to happen, I could no more tell next year’s temperature for sure at any of those locations than I could tell whether the next spin of the wheel will come up green, black, or red, or what the next roll of the die will be. Would I have a better chance in Bizarro City Sector D or Odd Grid Airport 2 just because the standard deviation is 0? From the data, probably.

    So, again, the point is: what is our 90th percentile, what’s within 1 standard deviation, and how many measurements do we have?

    There are 8,760 hours in a year. Out of those, the highest and lowest hourly reading is picked, for each station, to get a mean for “the day”: 2 out of 24 per day. Within each hour there are 3,600 possible 1-second readings; we get 1 of 3,600 per hour. So out of a possible 31.5 million observations per year, we take 8,760 and use 730.

    Interesting, isn’t it?
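
    To make the mean-versus-spread point concrete, a minimal sketch using two of the rows above and the sample (n-1) standard deviation:

    program spread_demo
      ! Same mean, very different spread: Odd Grid Airports 2 and 3.
      real :: a2(4) = (/ 50., 50., 50., 50. /)
      real :: a3(4) = (/ 20., 80., 20., 80. /)
      print *, 'Airport 2: mean', mean(a2), ' sd', sd(a2)   ! 50, 0
      print *, 'Airport 3: mean', mean(a3), ' sd', sd(a3)   ! 50, ~35
    contains
      real function mean(x)
        real, intent(in) :: x(:)
        mean = sum(x) / size(x)
      end function mean
      real function sd(x)
        real, intent(in) :: x(:)
        sd = sqrt(sum((x - mean(x))**2) / (size(x) - 1))   ! sample form
      end function sd
    end program spread_demo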

  151. steven mosher
    Posted Feb 21, 2008 at 7:57 PM | Permalink

    RE 141. I could not find this JohnV post.

    I respect the hell out of JohnV.

    (I guess we should revisit the CRN123R and make sure that it is CRN123 Rural and DARK.)

    When the team have a code question they turn to JohnV, and he writes good code. He is honest and open and not snarky. I miss his presence here.

    But this isn’t about adjustments cancelling each other out (funny, I had pulled Mount Shasta as an interesting case to look at). GISS thinks it’s Rural Bright!!!

    MORE PRECISELY: by population it’s rural; by IMHOFF NIGHTLIGHTS it’s bright.

    The real question is: is there UHI?

    FIRST THE GISS DATA RECORD.

    725920030 MOUNT SHASTA lat,lon (.1deg) 413 -1223 R3B cc=425 13

    R3B

    P: R = RURAL, population less than 10K
    N: 3 = BRIGHT by nightlights (Imhoff97)
    B: B = DIM by GHCN

    Mount Shasta is in northern California. I love the place. Anthony lives closer than I do, so I suppose he loves it too. RURAL bright? What does Imhoff97 have to say about BRIGHT and RURAL? What does he have to say about BRIGHT and URBAN?

    “In the western United States, the DMSP/OLS approach gave larger urban-area estimates than did the 1990 Census. At this point in time, it is difficult to differentiate between two probable causes for this discrepancy: (1) the DMSP/OLS may be accurately delineating urban growth and is “correct”; (2) our DMSP approach is indeed overestimating by picking up too much rural infrastructure and classifying it as urban area.”

    Basically, nightlights may overestimate RURAL as URBAN… where? In the western United States. And where is Mount Shasta, California??????

    Is Mount Shasta BRIGHT and UHI-infected?

    Now, I have read Imhoff97, which defines Hansen’s BRIGHT. I have been to Mount Shasta (it’s a mystical place, guys). Anthony has been to Mount Shasta. Heck, he lives only a couple of hours away.

    Mount Shasta is not urban. Why does it light the satellite pixel?

    How does nightlights work? What does BRIGHT mean? How was it tested?

    What did Imhoff97 PURPORT to establish? Urbanity? Or rurality?

    “A thresholding technique was used to convert a prototype “city lights” data set from the National Oceanic and Atmospheric Administration’s National Geophysical Data Center (NOAA/NGDC) into a map of “urban areas” for the continental United States. Thresholding was required to adapt the Defense Meteorological Satellite Program’s Operational Linescan System (DMSP/OLS)-based NGDC data set into an urban map because the values reported in the prototype represent a cumulative percentage lighted for each pixel extracted from hundreds of nighttime cloud-screened orbits, rather than any suitable land-cover classification. The cumulative percentage lighted data could not be used alone because the very high gain of the OLS nighttime photomultiplier configuration can lead to a pixel (2.7×2.7 km) appearing “lighted” even with very low intensity, nonurban light sources. We found that a threshold of 89% yielded the best results, removing ephemeral light sources and “blooming” of light onto water when adjacent to cities while still leaving the dense urban core intact. This approach gave very good results when compared with the urban areas as defined by the 1990 U.S. Census; the “urban” area from our analysis being only 5% less than that of the Census.”

    In short, Imhoff tests one boundary: the boundary between URBAN and non-urban.

    That boundary is tested by looking at cities like Chicago, Sacramento, and Miami.

    Here is how it works.

    The satellite made 231 passes. The passes that matter are the CLOUDLESS passes.

    On cloudless passes it sees the city lights.

    If 89% of the cloudless passes lit the pixel, that 2.7 km pixel is called BRIGHT.

    Chicago is bright. Sacramento is bright. Miami is bright.

    “Using this approach, we determined a threshold value for the whole U.S. data set based on the average value from three large metropolitan areas: Miami, FL; Chicago, IL; and Sacramento, CA (Fig. 3). The average value was 89%.”

    “Figure 1. Composite DMSP/OLS image of the continental United States showing lit area as percent occurrence, or the percentage of time during which a grid cell was lit in the building of the composite. The NGDC product was made from 231 cloud-screened orbital swaths. The raw data was divided into two classes: 8–88% (black) and GT 88% (white). The nonthresholded, or total lit area for the raw composite, can be assessed by summing both classes. The 8–88% class was eliminated by the spatial-integrity thresholding technique used in this study. The GT 88% class had the best match to urban area estimated by the Census.”

    URBAN was defined by GT 88% of the cloudless passes by the satellite being LIT. THIS was matched against the Census. So BRIGHT means that GT 88% of the cloudless passes have a lit pixel.

    So, where do DARK and DIM come from? Well, there is another LINE drawn. BUT this line is a noise line.

    In English… dark/dim is determined by the 8% line, a noise threshold. The ability of this boundary to DETECT ruralness is never tested.

    “Using all the data in the NGDC city-lights data with DN GT 8% (their lowest value) was unacceptable owing to the VNIR sensitivity of the OLS sensor. The most obvious error induced by this sensitivity is blooming.”

    SO, you have two boundaries: one drawn at 8%, which is a noise floor of sorts, and another drawn at 89%. Imhoff97 tests the adequacy of the upper 89% boundary to capture urbanity against census figures. That is the only test he does.

    NOW, why is Mount Shasta BRIGHT?

    Let’s rewind the tape. What does Imhoff say in terms of the limitations of his method:

    “We found that a
    threshold of 89% yielded the best results, removing
    ephemeral light sources and “blooming” of light onto water
    when adjacent to cities while still leaving the dense
    urban core intact. This approach gave very good results
    when compared with the urban areas as defined by the
    1990 U. S. Census”

    Anthony, how close to the Lake is the station?
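
    Reading Imhoff97 that way, the whole classification reduces to two thresholds. A minimal sketch of that reading (a stand-alone test, not GISS code; the pixel percentages are made up):

    program nightlights_classify
      ! Two-threshold scheme as described above: a 2.7 km pixel is
      ! binned by the percentage of cloud-free passes that saw it lit.
      real :: pct(4) = (/ 3., 50., 89.5, 95. /)   ! hypothetical pixels
      integer :: i
      do i = 1, 4
        if (pct(i) > 88.) then
          print *, pct(i), ' -> 3 BRIGHT (above the ~89% urban line)'
        else if (pct(i) > 8.) then
          print *, pct(i), ' -> 2 DIM (between noise floor and urban line)'
        else
          print *, pct(i), ' -> 1 DARK (below the 8% noise floor)'
        end if
      end do
    end program nightlights_classify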

  152. steven mosher
    Posted Feb 21, 2008 at 8:25 PM | Permalink

    RE 108. Doug, the code is online at NASA GISS.

    Before I throw the bug flag on this I want to hear conard’s explanation of
    the truth table for IF NAME(31:32) = ‘ R’. And I need to track down the provenance of
    all the columns. Some come from GHCN, others from USHCN.

    All that said, NIGHTLIGHTS is inconsistent with other measures. In Hansen99, Hansen
    referred to Gallo’s work, which I believe was based on NDVI, an index of vegetation.
    There are also satellite products for impervious surfaces (concrete).

    So, between nightlights, vegetative index, impervious surfaces, and Anthony Watts we should
    be able to find the BEST 135 stations in the USA. Why 135? NOAA (Vose) argued that the
    long-term climate trend of the USA could be captured by 135 high-quality stations. This
    was a CRN study.

    WHEW!

    Lots of stuff. The more eyes the better, as reading OPC (other people’s code) is always
    a challenge and easy to get wrong. I wish my Fortran were more up to date… In any case:

    Google NASA GISS, go to the GISTEMP page; the source files are there.

    That reminds me I need to get the latest version.

  153. conard
    Posted Feb 21, 2008 at 8:48 PM | Permalink

    I have been trying to get f95 running on my Solaris virtual machine but it seems to have an affinity for core dumps. I will abandon the effort and play run-time after I get the kids to bed. For now I am standing by my comments in 129 and wkkruse’s in comment 145.

    Parenthetically, if anyone (Dan Hughes?) has had good luck getting Fortran working with gcc on Leopard, I would love to hear from you.

    In short, the station is selected:

    if N does not have a value [space] AND it has a P value of ‘R’
    OR
    it has an N value of ‘1’

    thereby preferring N == ‘1’ over P == ‘R’.

    If I am still missing your point, let me know and I will go back and read your comments more slowly.

    PS: conard, not conrad 😉
    A common mistake: I had to sign my mortgage papers three times before they got it right.

  154. Anthony Watts
    Posted Feb 21, 2008 at 9:07 PM | Permalink

    RE151, Mount Shasta, odd station, is probably bright; Mt. Shasta has street lights and a tourist trade. The station (ASOS, but no airport) is downtown (as was the previous max-min + CRS), well away from Lake Shasta (15+ miles), and about 2 blocks east of Interstate 5.

    See here; I surveyed it this summer:

    http://gallery.surfacestations.org/main.php?g2_itemId=664

    See my post #149: possible string/integer issue?

  155. Gary
    Posted Feb 21, 2008 at 9:29 PM | Permalink

    #128 – FWIW, the station file list was changed at Thursday, February 21, 2008 4:34:59 PM (their server clock time ’cause it’s after your post time). Find it with View Page Info in Firefox.

  156. BarryW
    Posted Feb 21, 2008 at 9:49 PM | Permalink

    Dumb question: what compiler are they using? The if(name(31:32) .eq. ' R') statement won’t compile in the Fortran 95 compiler I tried. I’m assuming that name is a character array. I’m seeing that construct throughout the code.

  157. Brian
    Posted Feb 21, 2008 at 10:09 PM | Permalink

    Maybe this is too simple, but why can’t these station sites be found on google earth or something similar to see how rural or urban they are? You’d think that would be more accurate than this “lights” business. It would sure cut down on the code a bit.

    BTW, I appreciate all the work you do to make sure things are correct.

  158. steven mosher
    Posted Feb 21, 2008 at 10:13 PM | Permalink

    RE 154. Anthony, according to Imhoff97, CHICAGO is bright. Sacramento is BRIGHT.

    Shasta ain’t Chicago.

  159. steven mosher
    Posted Feb 21, 2008 at 10:15 PM | Permalink

    RE 156. Don’t know what compiler; we could never get the code working.

    If you look through the Shell scripts you’ll see some of the settings.

    I think we assumed it was AIX OS.

  160. Anthony Watts
    Posted Feb 21, 2008 at 10:17 PM | Permalink

    RE161 I was thinking about it in terms of surrounding areas, not GISS.

  161. Sarah Hamilton
    Posted Feb 21, 2008 at 10:33 PM | Permalink

    To clarify for conard and others, the question is the interpretation of Hansen’s 2001 paper. If indeed he meant that the stations selected should be rural AND unlit, then the code isn’t quite correct for the North American stations. It should be:

    IF(NAME(31:32).EQ.' R'.or.NAME(31:32).EQ.'1R') THEN

    Instead we have this:
    IF(NAME(31:32).EQ.' R'.or.NAME(31:31).EQ.'1') THEN

    which means that stations 1U and 1S (urban and periurban) are also being selected to adjust neighbouring stations. Does it make sense to do this? Certainly the city of Moncton NB seems like an odd definition of rural, regardless of whether the satellites say it is unlit or not. How many 1U and 1S stations are there?
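
    To see the difference concretely, here is a small stand-alone test (not GISS code) running both predicates over sample flag pairs:

    program select_test
      ! Char 1 = nightlights code (1/2/3 or blank), char 2 = population
      ! code (R/S/U). First logical: the test as coded; second: the
      ! stricter rural-AND-dark test.
      character(2) :: c(5) = (/ ' R', '1R', '1U', '1S', '2R' /)
      integer :: i
      do i = 1, 5
        print '(a2,2l4)', c(i), &
          (c(i) == ' R' .or. c(i)(1:1) == '1'), &
          (c(i) == ' R' .or. c(i) == '1R')
      end do
    end program select_test

    The two tests disagree only on the 1U and 1S rows: selected as coded, rejected under rural-AND-dark.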

  162. Steve McIntyre
    Posted Feb 21, 2008 at 10:43 PM | Permalink

    Folks, the code=1 stations are NOT adjusted; the code=2 and code=3 stations are adjusted. So there’s no point worrying about how it compiles, because we know what it DOES. It treats NAME(31:32) = ' R' as FALSE if it’s '1R', '2R', or '3R'. Let’s go on to the next problem: how the code values are obtained.

  163. Posted Feb 21, 2008 at 10:47 PM | Permalink

    I’m not sure how you guys get any real work done with all the discussion that goes on here.

    It appears the temperature data for Lampasas, TX was not corrected in the USHCN for the station move in 2000. More Lampasas, TX Corrected Data

  164. Evan Jones
    Posted Feb 21, 2008 at 10:50 PM | Permalink

    Why not simply take (for 1900-date):

    –the World GISS hundred year adjusted
    –IPCC World,
    –HadCRU World
    –NOAA World adjusted,
    –NOAA World raw (with only outliers removed)

    and lay them on top of each other, and then we can get an eyeball view of how radical these adjustments are in aggregate.

    That will give us a basis to be going on with.

    Or link me to the yearly stats and I’ll do it. (I already have HadCRU and NOAA adjusted but keep getting lost trying to find the rest.)

  165. Evan Jones
    Posted Feb 21, 2008 at 10:55 PM | Permalink

    Actually I don’t have HadCRU and NOAA. I only have the US numbers. So I need the global data. A general look would be helpful.

  166. conard
    Posted Feb 21, 2008 at 11:24 PM | Permalink

    NASA folks:
    Thanks for the quick and competent response.

    Steve Mosher / Sarah Hamilton:
    Thanks. I do understand your position (N and P should have equal weights) but do not necessarily agree that the code or method is in error. Now, I would change my mind if it can be demonstrated that the disputed 20 stations from the rural set { ‘ R’, ‘1R’, ‘1S’, ‘1U’ } have a significant impact on the results of the calculation.

    Station Counts:
    ‘1S’ == 18
    ‘1U’ == 2

    ‘1S’ stations:
    715270020 DUNNVILLE PUMPING STN,ON lat,lon (.1deg) 428 -796 S1A cc=403 0
    722480030 EL DORADO/FAA AIRPORT lat,lon (.1deg) 332 -928 S1A cc=425 0
    722860020 BEAUMONT USA lat,lon (.1deg) 339 -1170 S1B cc=425 0
    723060070 WELDON USA lat,lon (.1deg) 364 -776 S1B cc=425 0
    723090030 KINSTON 5SE lat,lon (.1deg) 352 -775 S1A cc=425 9
    723520030 ALTUS IRRIGATION RES STN lat,lon (.1deg) 346 -993 S1A cc=425 0
    723650040 LAS VEGAS SEWAGE PLT lat,lon (.1deg) 355 -1052 S1A cc=425 0
    724380090 CRAWFORDSVILLE 5S lat,lon (.1deg) 400 -869 S1A cc=425 12
    724450010 VICHY/ROLLA NAT’L ARPT lat,lon (.1deg) 381 -918 S1A cc=425 0
    725230010 BRADFORD/FAA AIRPORT lat,lon (.1deg) 418 -786 S1A cc=425 0
    725330030 HUNTINGTON lat,lon (.1deg) 409 -855 S1A cc=425 9
    725570020 VERMILLION 2SE lat,lon (.1deg) 428 -969 S1A cc=425 0
    727650010 DICKINSON/FAA AIRPORT lat,lon (.1deg) 468 -1028 S1A cc=425 0
    744900030 CLINTON lat,lon (.1deg) 424 -717 S1C cc=425 9
    745460010 EMPORIA/FAA AIRPORT lat,lon (.1deg) 383 -962 S1A cc=425 0
    747910010 GEORGETOWN 2E lat,lon (.1deg) 334 -792 S1B cc=425 9
    764050000 LA PAZ, B.C.S lat,lon (.1deg) 243 -1104 S1A cc=414 0
    722030010 BELLE GLADE EXP STN lat,lon (.1deg) 267 -806 S1A cc=425 0

    ‘1U’ stations:
    715270040 BRANTFORD CANADA lat,lon (.1deg) 431 -803 U1B cc=403 14
    717050000 MONCTON,N.B. lat,lon (.1deg) 461 -647 U1C cc=403 26

  167. Jon
    Posted Feb 21, 2008 at 11:25 PM | Permalink

    Sarah in #161 gets it right. The code should read:

    IF(NAME(31:32).EQ.' R'.or.NAME(31:32).EQ.'1R') THEN

    Which would be “rural & no lights measurement | rural & dark”.

    Steve’s comment #162 is thus a red herring.

    It treats NAME(31:32) = ' R' as FALSE if it’s '1R', '2R', or '3R'.

    This is exactly correct, exactly what the code does, and exactly not the problem. The problem is with the other part of the expression.

  168. steven mosher
    Posted Feb 21, 2008 at 11:25 PM | Permalink

    RE 156.

    You and wkkruse figured this out very quickly. Great. Also thanks for the heads-up on the
    datafile change.

    In any case, on the chance that either
    of you knows Dr. Lo or Dr. Sato, I would like to say thank you for releasing the code.

    Here is what confused me:

    name(31:31)=li(102:102) ! US-brightness index 1/2/3=dark/dim/brite
    name(32:32)=li(68:68) ! population index (R/S/U=rural/other)
    name(33:33)=li(101:101) ! GHCN-brightness index A/B/C=dark/dim/brt
    name(34:36)=li(1:3) ! country code (425=US)

    This puts BRIGHTNESS (nightlights) in column 31, POPULATION in column 32, GHCN brightness
    in column 33, and country code in columns 34-36.

    Let’s take a look at a couple of lines from station_list.txt:

    715270040 BRANTFORD CANADA lat,lon (.1deg) 431 -803 U1B cc=403 14
    717050000 MONCTON,N.B. lat,lon (.1deg) 461 -647 U1C cc=403 26

    What do we see? U1C, then a country code for Canada.

    Brantford is population 74,000 according to GISS, and MONCTON is population 55,000.

    So it looked to me like Urban, right? It looked like the order of the variables had been
    switched. Plus I could NEVER find the place where the nightlights data was added
    to the files. Can you help?

    One other thing that bugs me: H2001 says there are no urban unlit stations.

    Now, the 1 variable is clearly the nightlights variable (now that the documentation has magically appeared), and it’s clearly that because only the US and some of Canada and Mexico have this variable.

    SO: in the station_list file we have U1C, for example. U comes before 1. U means urban;
    1 means dark.

    This looked to me like this

    Name(31:31) = R/S/U
    Name (32:32) = 1/2/3
    Name (33:33) = A/B/C
    Name (34:36) = COUNTRY CODE

    Which made no sense.

    That is what got me confused and wondering.

    Anyway, the other thing that confused me was that Table 1 of Hansen2001 on page 14
    shows NO urban dark. Yet the station file has 2. Weird? Plus I could not find the
    code that created this table. So that made for more confusion.

    Actually, if you can get the code running, there are some very, very
    simple tests that would put this all to rest.

  169. Steve McIntyre
    Posted Feb 22, 2008 at 12:34 AM | Permalink

    Here is the difference between the USHCN TOBS and Filnet for Lampasas and the 5-station average for the 5 stations used by Atmoz in his current comparison: Albany, Ballinger, Blanco, Dublin, and Llano.

    This shows convincingly (as did the comparison with Blanco by itself) that the 2000 station move had a noticeable impact and that the USHCN adjustments failed to pick up the station move. I suspect that Atmoz is going to concede the point. It looks like he has some sort of collation problem in his mean comparisons.

    BTW these series would work well in the Mannomatic if you spread the series out over the millennium.

  170. Bob Koss
    Posted Feb 22, 2008 at 3:04 AM | Permalink

    #155

    Gary,

    Thanks for the headsup.

    I just compared them and the only changes are in the legend, and that seems to be just formatting. The stations are exactly the same all the way through.

  171. D. Patterson
    Posted Feb 22, 2008 at 3:49 AM | Permalink

    124 DeWitt Payne says:

    February 21st, 2008 at 3:43 pm

    131 Kenneth Fritsch says:

    February 21st, 2008 at 4:39 pm
    http://www.climateaudit.org/?p=2746#comment-214781

    [….]
    When all is said and done we still do not have a connection between surface temperatures and satellite measurements. It is an important issue in all these discussions in my judgment. Perhaps it is time to pose the question to Christy and/or Spencer.

    The excerpts from Grody et al. were offered as only an example of what just one quick search can reveal about the acknowledged problems with error when using the MSU data to derive inferred temperature information. Implicit was the reliance of the final results upon accurate and valid mathematical, statistical, and scientific methodologies after the collection of the observational data. The language in the Grody paper may somewhat obscure how some investigators tend to indirectly guide methodologies by keeping in mind the surface observational research of Trenberth, Hansen, and others; but even when surface data is not used directly, such surface data indirectly influences the results from MSU data when researchers tailor methodologies to mimic results from surface data and/or GCMs related to surface data. For one of many possible other examples:

    Mears et al. A Reanalysis of the MSU Channel 2 Tropospheric Temperature Record. Remote Sensing Systems, Santa Rosa, California (Manuscript received 10 October 2002, in final form 23 May 2003)

    [….]
    Researchers generally agree that the surface warming observed over the past century is at least partially anthropogenic in origin, particularly that seen in the past two decades (Hansen et al. 2001; Houghton et al. 2001).[…] Despite excellent coverage (more than half the earth’s surface daily), the MSU data suffer from a number of calibration issues and time-varying biases that must be addressed if they are to be used for climate change studies.
    [….]
    Initial studies of the midtroposphere MSU channel 2 data performed by Christy and Spencer (Christy and Spencer 1995; Christy et al. 1998, 2000, 2003; Spencer and Christy 1990, 1992a,b) uncovered a number of important sources of error in those data, including intersatellite offsets, the significance of diurnal warming with slow evolution in the satellite local equator crossing times (LECT), and the presence of a significant correlation between observed intersatellite brightness temperature differences and satellite hot calibration load temperature.
    [….]

    The absence of an absolute external calibration reference requires us to choose a single baseline brightness temperature as the arbitrary reference offset[….]

    [….]

    There are a number of differences in methodology between RSS and CS that may contribute to the observed discrepancies in the deduced trends. In Table 6, we summarize differences between the two methods that could significantly change the long-term global time series. First, there are significant differences in the origin and application of the adjustments used to correct for diurnal drift. We calculate the diurnal cycle for each 2.5° × 2.5° grid point using a GCM (CCM3), and then validate it in a number of ways using consistency with MSU data. Christy and Spencer account for diurnal drifts by considering systematic cross-scan differences between measurements taken at slightly different local times (Christy et al. 2000, 2003), thus deducing the effect of diurnal drift directly from the MSU measurements themselves. Despite this difference, on a global scale, the two adjustments are in good agreement with each other. When we reprocess our data using the CS diurnal correction, the global trend is decreased by only 0.006 K decade⁻¹, less than 7% of the total difference between our results.

    Temperatures are inferred from MSU satellite data using post-observational mathematical and statistical methods. The post-observational mathematical and statistical methods are widely acknowledged to be subject to significant potential error. Methodologies which do not rely upon GCM estimates and their fundamental parameterization by reference to surface observational trends demonstrate a divergence from surface observational trends, while methodologies which do rely upon GCM estimates demonstrate much lesser divergence from the surface observational trends used to generate the parameterizations used to calibrate the GCMs used to calibrate the MSU data.

    Suffice it to note, some of the methodologies used to generate results from the satellite microwave soundings engage investigators in yet another round of chasing circular references to the authorities used in supporting the research. Accordingly, caution is necessary before making any assumptions or conclusions whatsoever about any and all claims of independent verification of MSU temperature results. With respect to the satellite MSU-derived temperature results, there are enough post-observational mathematical and statistical assumptions and methods to keep an ensemble of investigators busy for a long time to come just trying to trace the authorities for the mathematics and methodologies employed, much less validating the computations and results. As they say, Never Assume Anything [especially when dealing with climate science sources]. Accordingly, it cannot be assumed or concluded that the MSU results are produced without the influence of surface observational trends, because we already know some methods and adjustments (the diurnal cycle adjustment) use GCMs, assumptions, and biases based upon the surface observational trends published by Trenberth, Hansen, and others. This is evident from only a cursory search, and a much more comprehensive search and investigation by the readers of this blog could doubtless produce much more evidence of the MSU adjustments being influenced by surface observational trends, should they choose to do so.

  172. PHE
    Posted Feb 22, 2008 at 5:27 AM | Permalink

    Sorry. Completely off-topic, but I found this quote from RC’s latest topic quite charming:
    “Well, what can we conclude from the discussion: Paleo-tempestology is a brand new field of study and there is undoubtedly a long way to go before the reconstruction of extreme events (like hurricanes) in the past will be anything more than suggestive. However, there’s a lot more data out there waiting to be collected and analysed.”

    And adapted to a similar subject: “Paleo-dendrochronology is a field of study where there is undoubtedly a long way to go before the reconstruction of millenial global temperatures in the past will be anything more than suggestive. However, there’s a lot more data out there waiting to be collected and analysed.”

  173. Posted Feb 22, 2008 at 7:26 AM | Permalink

    Re Geoff Sherrington #13, 56 and Aurbo #49, there was a whole thread on the TOB (aka TOBS) bias and adjustment in Thread #2106, dated 9/24/07.

    I started off as a TOBS Denialist like Geoff (see #10, 15 of that thread), but Aurbo made a True Believer of me with his posts #19, 22 in that thread. See my Recantation in #110 (as corrected at 112, 113).

  174. steven mosher
    Posted Feb 22, 2008 at 7:41 AM | Permalink

    RE 166, I agree the code picks out rural dark, small-town dark, and urban dark. The name(31:32) thing
    threw me, primarily because station_list.txt has the columns ordered like R1, S2, etc.

    The larger problem is how nightlights can rank a city of 77,000 (Brantford) as dark and
    a city of 3,600 (Mount Shasta) as bright.

  175. D. Patterson
    Posted Feb 22, 2008 at 7:48 AM | Permalink

    173 Hu McCulloch says:

    February 22nd, 2008 at 7:26 am

    Your conversion in faith is quite premature. Try comparing observations from stations that are on a 5-minute reporting schedule and are subject to frequent FROPA with observations from stations that are on a 5-minute reporting schedule and are NOT subject to frequent FROPA, and make this comparison across four seasons or longer. Likewise, try similar comparisons of continuous observations from stations subject to other types of discontinuous weather events between seasons and years, such as the seasonal frequency of inversions, adiabatic wind patterns, and so forth. The failings of averaging only the maximum and minimum hourly, 3-hourly, or 6-hourly observations become apparent when such an average is compared to the average from a continuous observational record.

  176. steven mosher
    Posted Feb 22, 2008 at 7:56 AM | Permalink

    re 157 Brian, Hansen needed a method that could be applied automatically; there are 1221 sites in the USA. In his 1999 paper he suggested he was going to look at Gallo’s work; Gallo’s paper looked
    at the vegetative index. By 2001 Hansen had settled on Imhoff’s work, which used nightlights
    to determine urban. As you can see, nightlights says a city of 77,000 is dark, while a city of 3,600
    is bright.

    In the rest of the world, Mount Shasta (pop 3,600) would not get adjusted; in the US it does.
    In the rest of the world, Moncton (pop 55,000) would get adjusted; since it’s close to the USA,
    it does not get adjusted.

    The problem is this: Imhoff did not test the accuracy of nightlights at capturing RURALITY,
    which is how Hansen2001 uses it.

    There are probably some better satellite products today that could be used.
    Finally, I have not been able to locate the source that H2001 used, that is, the actual image.
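
    If that reading is right, the rule reduces to something like this sketch (my paraphrase of H2001, not its code, using the populations quoted above):

    program adjust_rule
      ! Where nightlights exist (US and nearby), brightness decides
      ! whether a station is adjusted; elsewhere, population does.
      print *, 'Mount Shasta (lights=3, pop 3600): ', adj(.true., 3, 3600)
      print *, 'Moncton (lights=1, pop 55000):     ', adj(.true., 1, 55000)
      print *, 'Same two pops, population rule:    ', &
               adj(.false., 0, 3600), adj(.false., 0, 55000)
    contains
      logical function adj(has_lights, lights, pop)
        logical, intent(in) :: has_lights
        integer, intent(in) :: lights, pop   ! lights 1/2/3 = dark/dim/bright
        if (has_lights) then
          adj = lights > 1       ! dim or bright gets adjusted
        else
          adj = pop >= 10000     ! urban-by-population gets adjusted
        end if
      end function adj
    end program adjust_rule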

  177. Joe Black
    Posted Feb 22, 2008 at 8:26 AM | Permalink

    We need to resurrect a version of the real time global temperature web site
    http://www.junkscience.com/GMT/Change/history.gif

    using the Rev’s automated land system combined with that flotilla of bob-and-weave ocean temp bots, or find the Earth’s asshole (I’ve heard many Canadians suggest Windsor, eh?) to stick a thermometer in there and all gather around to watch.

  178. Posted Feb 22, 2008 at 9:07 AM | Permalink

    It looks like they are rearranged when written in routine invnt.f in STEP2
    WRITE(6,'(A,I9,1X,A30,1x,a,I5,I6,1x,a1,a1,a1,a4,a3)')
    * fname(1:NDOT+2),ITR1(3),name(1:30),
    * 'lat,lon (.1deg)',(ITR1(i),i=1,2),
    * name(32:32),name(31:31),name(33:33),' cc=',name(34:36)

    Re #143: can anyone point me to the file that has the variable N in column 102?

    thanks

  179. D. Patterson
    Posted Feb 22, 2008 at 9:07 AM | Permalink

    177 Joe Black says:

    February 22nd, 2008 at 8:26 am
    We need to resurrect a version of the real time global temperature web site

    using the Rev’s automated land system combined with that flotilla of bob-and-weave ocean temp bots, or find the Earth’s asshole (I’ve heard many Canadians suggest Windsor, eh?) to stick a thermometer in there and all gather around to watch.

    Kola Peninsula, 180°C/356°F, -12,000+ m… which does what for the mean global temperature[s]?

  180. John Goetz
    Posted Feb 22, 2008 at 9:18 AM | Permalink

    #176 steven

    there are probably some better satillite products today that could be used.

    Brian in comment #157 had it right: use Google Earth. It really would not take a grad student or low-level employee that long to do, and the accuracy should be at least as good as, if not significantly better than, what is used now.

    KML files are available, which makes the job VERY easy. I just took a look at the locations for 10 stations and was able to draw a conclusion on all 10 in less than 10 minutes. But let’s give this grad student/low-level employee the benefit of the doubt and allocate 10 minutes per site to zoom in and out to render a decision (1, 2, 3 score), take a screen shot(s) of the location for archival purposes, and file everything away before moving on to the next location. That should allow analysis of 48 sites per business day, but we will round down to 40. If there are 1221 US sites to be studied, that translates to 31 business days, or 6 weeks’ pay. This study could be extended to all stations worldwide if additional funds could be found.

    We all know based on Anthony Watts’s work that the stations are often not located at precisely the locations indicated by the records, but so what. The locations that went into the satellite lights analysis were no more accurate. What we do know is that the satellite imagery used by Google Earth is much more recent than what went into Imhoff (1995 data).

  181. steven mosher
    Posted Feb 22, 2008 at 9:18 AM | Permalink

    RE 178. I’ve been looking off and on for it for 2 days..

    Can you grep for it?

  182. Posted Feb 22, 2008 at 9:27 AM | Permalink

    I hope I’m not going backwards here, but it is likely that I am.

    The file gistemp.txt, which comes with the GISTEMP download, says at the very top:

    Basic data set: GHCN – ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/v2
    v2.mean.Z (data file)
    v2.temperature.inv.Z (station information file)

    I’ve been to ncdc.noaa.gov/pub/data/ghcn/v2 by both ftp and http.

    I see a file v2.temperature.inv. I do not see v2.temperature.inv.Z. The readme v2.temperature.readme says:

    All files except this one were compressed with a standard UNIX compression.

    Whatever the case, I’ve downloaded the v2.temperature.inv file. A file v2.inv comes with the GISTEMP download.

    The two files do not seem to be the same. The file I get from NCDC stops at column 101 ???

    All pointers appreciated.

  183. MarkW
    Posted Feb 22, 2008 at 9:29 AM | Permalink

    Hansen needed a method that could be applied automatically; there are 1221 sites in the USA.

    This is one of my biggest complaints about the current state of climate science. The practitioners spend more time looking for a method that is easy than they do looking for one that is accurate.

    Such attitudes are acceptable (barely) when a science is in its infancy. But you don’t base policies on the findings of a science that is in its infancy.

  184. John Goetz
    Posted Feb 22, 2008 at 9:33 AM | Permalink

    Referring to #166 conard:

    Conard gave us a list of 1S stations, and I started going through the list and noticed that the first four:

    715270020 DUNNVILLE PUMPING STN,ON lat,lon (.1deg) 428 -796 S1A cc=403 0
    722480030 EL DORADO/FAA AIRPORT lat,lon (.1deg) 332 -928 S1A cc=425 0
    722860020 BEAUMONT USA lat,lon (.1deg) 339 -1170 S1B cc=425 0
    723060070 WELDON USA lat,lon (.1deg) 364 -776 S1B cc=425 0

    stopped reporting before 1995. In the case of Weldon, it was 24 years before 1995. I wonder how meaningful it is to use the 1995 satellite data to rate stations that stopped reporting before 1995, and then to use the resulting information to adjust stations that continued to report post-1995.

  185. Posted Feb 22, 2008 at 9:44 AM | Permalink

    D. Patterson writes,

    Your conversion in faith is quite premature. Try comparing observations from stations that are on a 5-minute reporting schedule and are subject to frequent FROPA with observations from stations that are on a 5-minute reporting schedule and are NOT subject to frequent FROPA, and make this comparison across four seasons or longer.

    It’s true that the min/max mean is only a noisy approximation to the continuously monitored average daily temperature. However, it’s the only data we have for most USHCN stations, and lots of others in the world, so we have to make do with it. Because of Aurbo’s double-counting issue, there is not just noise, but a bias that depends on the time of observation, and when this gets changed, say from 5 PM to 7 AM, the bias changes. The substantial official TOB adjustment may not be perfect, but it sounds reasonable.

    I’m a lot more worried by the “homogenization” procedures that apparently adjust the minority of good stations to agree with the consensus of their urban airport, parking lot, and sewage plant neighbors. It would make more sense to me to just throw out the LaGuardias, Marysvilles and Urbanas for climate history purposes, and not try to “urban adjust” these bad stations. Of course the de-icing crews at those airports need to know whether it’s above or below freezing at the airport, so those stations do serve a valuable purpose and should not just be closed. The others should just be shut down, perhaps after a year of overlap with a new station nearby for comparison. Stations like Lampasas that have a well-defined date of move to unacceptable location can be used up to their move, but there’s no point trying to adjust their subsequent data.

    (LaGuardia, Newark, O’Hare, SFO, etc, are all hand-picked CRU stations. LaGuardia is not on the GISS list Mosh posted, but Newark and O’Hare are. SFO (the airport itself) is GISS station #729940002.)

    FROPA, BTW, = FROntal PAssage. Nothing a Claritin-D won’t clear up!

  186. Anthony Watts
    Posted Feb 22, 2008 at 10:07 AM | Permalink

    RE 180, John Goetz,

    Google Earth and Microsoft Live Earth can be useful, but the coverage is not complete enough yet for the majority of the network to be surveyed. I’ve tried. The detail is not good enough for much of the Great Plains, for example.

    But here is one I found using nothing but Google Earth, Cheesman Lake in Colorado.

    http://maps.google.com/maps?ie=UTF8&q=39.2202,-105.2783&t=h&z=17&iwloc=addr

    Zoomed in closer, I can see the Stevenson screen box just west of the concrete pad.

    http://maps.google.com/maps?ie=UTF8&q=39.2202,-105.2783&ll=39.220308,-105.278233&spn=0.00165,0.002722&t=h&z=19

  187. BDAABAT
    Posted Feb 22, 2008 at 10:18 AM | Permalink

    This has been an incredible thread to read through! I can’t follow the FORTRAN, but how cool is it that basic investigations by some curious folks lead to such intriguing information!

    Keep up the good work everyone!
    Bruce

  188. steven mosher
    Posted Feb 22, 2008 at 11:07 AM | Permalink

    RE 180. Here is my thought.

    You go back to the climate science on UHI: basically Oke’s work, but others as well.

    http://en.wikipedia.org/wiki/Urban_heat_island

    UHI is caused by several factors.

    1. GEOMETRY: As Oke describes it, the urban environment can have radiative canyons
    and wind shelters. A radiative canyon is like a corner reflector: the heat goes
    in and is delayed in getting out by reflection, multipath, etc. A canyon (buildings)
    is a solar concentrator (not very efficient, but a concentrator). Buildings ALSO limit
    sky view, so heat absorbed doesn’t see the sky and can’t escape as easily. Also, buildings
    reduce the effect of turbulent mixing; basically, wind shelters. In a flat area the
    breeze will allow vertical mixing and cooling. Where there are buildings, the “boundary
    layer” will be thick. Basically this is the UHI bubble. It’s a function of building height,
    amongst other things.

    2. EVAPOTRANSPIRATION: how the surface stores and gives up heat. In urban areas you have
    concrete and asphalt, which store heat and give it up more slowly. This impacts
    nighttime temps, basically TMIN. If your TMAX stays stable and your TMIN goes up, that’s
    a signal of concrete infection.

    3. WASTE HEAT: waste heat is heat from humans, their cars, their buildings. Population
    density is most important.

    Your final UHI is going to be a result of all three. So to identify areas with NO UHI
    you want to test for the following:

    LOW BUILDING HEIGHT: USHCN actually has a data field for this, but it isn’t used.
    You could process satellite photos to determine average building height.
    HIGH VEGETATIVE INDEX: there is a satellite product for this.
    LOW IMPERVIOUS SURFACE: no concrete; there is a satellite product for this.
    LOW POPULATION: that field exists.

    Nightlights doesn’t capture UHI. Nightlights captures the fact that a PIXEL was lit.
    It assumes that lights mean buildings and concrete and people. It’s a crude proxy.

    Let’s do a TEST. Google Map Brantford, ON: nightlights calls this city of 77,000 DARK.
    Then Google Map Mount Shasta, California, population 3,600: nightlights calls it BRIGHT.
    Then Google Map Chicago: that is also bright.

    Bottom line: what is needed is a test that picks out rural: few or low buildings, little or no concrete, few people, and, last but not least, quality siting to eliminate microsite problems. A rough screen might look like the sketch below.
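
    Something like this hypothetical screen; all four thresholds are invented for illustration and come from no paper:

    program rural_screen
      ! Combine the four indicia above into a single rural/not-rural call.
      print *, 'Shasta-like site:    ', rural(5., 0.6, 2., 3600)
      print *, 'Brantford-like site: ', rural(15., 0.3, 30., 77000)
    contains
      logical function rural(bldg_m, ndvi, imperv_pct, pop)
        real, intent(in) :: bldg_m, ndvi, imperv_pct
        integer, intent(in) :: pop
        rural = bldg_m < 8.      .and. &   ! low building height
                ndvi > 0.4       .and. &   ! high vegetative index
                imperv_pct < 5.  .and. &   ! little impervious surface
                pop < 10000                ! low population
      end function rural
    end program rural_screen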

  189. steven mosher
    Posted Feb 22, 2008 at 11:24 AM | Permalink

    Here is a thing that puzzles me: why restrict nightlights to the US and use population for the rest of the world?

    Isn’t there good data today on the extent of rural/urban areas for the whole globe?
    I started here (still a work in progress):

    http://beta.sedac.ciesin.columbia.edu/gpw/

    Columbia? Where have I heard about that place?

  190. steven mosher
    Posted Feb 22, 2008 at 11:31 AM | Permalink

    Hmm. Hansen2001 uses nightlights only for the US and nearby Canada and Mexico.

    The nightlights data have a resolution of 2.7 km.

    Why not use the 1 km data?

    http://beta.sedac.ciesin.columbia.edu/gpw/ancillaryfigures.jsp#1kmdens

    It would be a fun project to take a look at how a better version of nightlights
    classifies all the sites in the world. Why one method for the US and another
    method for the rest?

    And a method (nightlights) that has been improved upon since 1995. Puzzling.

  191. steven mosher
    Posted Feb 22, 2008 at 11:38 AM | Permalink

    RE 182. I cannot find the source of the v2 file that comes with the distribution. It’s just there.
    I thought the “manual steps” readme might explain it, but it didn’t.

    There are probably other support programs that are required to maintain this stuff, but
    they are not released. It’s really not a ‘release’ in our sense of the word. It’s code
    thrown over the wall.

  192. steven mosher
    Posted Feb 22, 2008 at 11:43 AM | Permalink

    re 180 John G. Yes, I agree that a check of Google Maps is manually feasible. My point
    would be to look at known indicia of UHI:

    1. Building height
    2. Surface composition
    3. Vegetative index
    4. Population

    To that I would add the human cross-check and, yes, surfacestations-type work.

  193. steven mosher
    Posted Feb 22, 2008 at 11:51 AM | Permalink

    RE 185. Hu, go read the TOBS paper. The TOBS effect changes by latitude and by season.

    From memory I think the adjustment can be as high as 1 C, but the model carries a stderr of
    .2 C, again from memory… so have a read.

    On a related matter: if I collect the data from 100 stations, I have 100 independent samples
    for constructing an average and calculating a variance.

    Now, if I use information from 50 of those stations to create a bias correction for the
    other 50, do I still have 100 independent samples?

    It’s not a trick question; it just puzzles me.

  194. Sam Urbinto
    Posted Feb 22, 2008 at 12:13 PM | Permalink

    What bothers me is using stations that stopped reporting in 1995 to adjust stations that keep reporting, using satellite information that’s 13 years old, counting on an urban/rural method that does weird things, and relying on software and methods so old they ran on computers that had to “use floppies as memory”.

    Certainly in this day and age we could get a little more accurate, ya think?

  195. Posted Feb 22, 2008 at 12:18 PM | Permalink

    re:#191

    Thanks Steven.

    Got any more code problems to look at/work on?

  196. Earle Williams
    Posted Feb 22, 2008 at 12:27 PM | Permalink

    re #193

    steven mosher,

    Step back and think about the data model. What is your variance due to? Is this a normal distribution? If so then there is no meaningful way to interpolate the neighboring value, because the data are independent. If they are not independent then you may be able to divine a model to fit that interdependence. What is it? Linear interpolation is the easiest but what model fits the data best?

    If you do have some interdependence between observation points, then how much of the observed variance is due to the temperature distribution and how much is due to the natural variation and instrumental error?

    For myself, the first thing I would do is calculate and plot gridded surfaces for the daily max and min and see what that tells me about the spatial variance. Then look at monthly plots.

    I’ve brought it up before and I see that Hans Erren did too recently (I think it was Hans) and that is the use of kriging to generate a gridded surface to represent your observations, with an associated uncertainty distribution indicative of the spatial variance and undersampling of the data. The upside to that is you have a metric for the increased uncertainty of trying to fill in the holes of missing data.

    By the way WM Briggs has a great post on the topic of data models. Now that I think of it though, you’ve probably already posted over there on that topic. 🙂

  197. Doug
    Posted Feb 22, 2008 at 12:49 PM | Permalink

    As far as I can tell, the input files for the PApars.f module are a series of files entitled fort.31, fort.32, fort.33, fort.34, fort.35 and fort.36. Here is the code that creates the file names, opens the files, and assigns the file I/O handles to the reserved I/O identifiers 31, 32, etc.:

    do in=31,30+inm
      fn='fort.xx'
      write(fn(6:7),'(i2)') in
      open(in,file=fn,form='unformatted')
    end do

    Does anyone know of these fort.xx (31, 32, etc.) files? Until we crack them open we can’t be certain how they are formatted and ultimately, how data is read into the program. The SREAD subroutine reads the data one record at a time from these files and through EQUIVALENCE (very dangerous practice) packs the data in the NAME character string. Until now I’ve assumed the data records looked like the ones listed in http://data.giss.nasa.gov/gistemp/station_data/v2.temperature.inv.txt but now I’m not so sure because I can’t get the NAME string to populate correctly in my mind.

    A few editorial comments. If this code is indicative of the larger body of software at GISS then I am both shocked and saddened. Fortran and other block programming languages are fraught with design limitations, clunky I/O and OS-dependent add-ons or “features” that behave differently on different Unix machines. There are reasons why code like this has long since been abandoned by private industry and the DoD. Without an object-oriented approach there is no encapsulation of data and the methods that operate on them, no inheritance of methods from class to class and no extensibility in the overall design. The use of a three-tiered architecture and open standards (XML, third party tools, etc.) is the way to go. I wonder if I submitted a grant request to modernize the software …
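
    On the fort.xx question: Fortran “unformatted” files are raw binary records, so there is nothing to crack open in a text editor. A stand-alone sketch (not GISS code) of how such a record round-trips:

    program fort_roundtrip
      ! Write one unformatted record, then read it back with a READ
      ! list that mirrors the WRITE list; the layouts must agree.
      integer :: id
      real :: vals(3), back(3)
      vals = (/ 1.0, 2.0, 3.0 /)
      open(31, file='fort.31', form='unformatted')
      write(31) 715270020, vals    ! one record: an id plus data
      rewind(31)
      read(31) id, back            ! must mirror the WRITE list
      close(31, status='delete')
      print *, id, back
    end program fort_roundtrip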

  198. Evan Jones
    Posted Feb 22, 2008 at 12:53 PM | Permalink

    What is the GISS average adjustment for 1900? 1940? For the whole world? Does anyone know?

  199. Larry T
    Posted Feb 22, 2008 at 12:55 PM | Permalink

    UHI has the effect of increasing the temperature of an urban area and also some of its surroundings. Almost all of the UHI effects to be seen worldwide will add a positive bias to more recent temperature readings. A few places (I am thinking of urban areas in the Philippines or Washington State near major catastrophic volcanic eruptions, or New Orleans after Katrina) may have had a negative UHI effect. The effect on temperature readings would be an increase in the slope of the rate of increase in these records. The hinge method employed to modify the data has a tendency to decrease the older temperature records and increase the slope of the temperature record. This is a programming bug of the highest order, when the obvious change to the station data should be a decrease in slope. Anyone who uses these data modifications is definitely “unhinged”.

  200. Bernie
    Posted Feb 22, 2008 at 12:57 PM | Permalink

    Steve
    It does strike me as a “tricky” question. To call the total sample still independent, wouldn’t you have to show that the correction is the same for both groups of 50? (Or establish the bias at a very high level of confidence.) If it is different and you impose the bias correction of one group on the other, then they would no longer be independent.

    Still I will wait for the heavy duty statisticians to chime in.

  201. John Goetz
    Posted Feb 22, 2008 at 1:41 PM | Permalink

    #199 Larry

    Funny you should mention Hawaii. From http://data.giss.nasa.gov/gistemp/sources/gistemp.html:

    After noticing an unusual warming trend in Hawaii, closer investigation
    showed its origin to be in the Lihue record; it had a discontinuity around
    1950 not present in any neighboring station. Based on those data, we added
    0.8C to the part before the discontinuity.

  202. Steve McIntyre
    Posted Feb 22, 2008 at 2:04 PM | Permalink

    All the Canadian code 1 data ends by 1990! Later versions are available at the Canadian historical data website. The old GHCN versions in 1990 are an interesting crosscheck as the Canadians have adjusted early history – in the few cases that I’ve examined, they’ve lowered the temperatures in the first half of the 20th century. I’ll do a post on this.

  203. steven mosher
    Posted Feb 22, 2008 at 2:27 PM | Permalink

    RE 195. Figure out how to compile the stuff. Last time we tried we hit a bunch of
    weird little problems (mostly on IO). We worked through some of them and I think we progressed into STEP3… and then other topics tore people away. Plus it was hard to do remote debugging
    with everybody having a different dev environment. It was fun, if you like sand in your underwear.

  204. Posted Feb 22, 2008 at 2:27 PM | Permalink

    Steve McI might want to snip this one.

    Pure Speculation at this Time:

    All this lowering of temperature in the early 20th century has an uncanny relationship to the figures here. The page is a summary of radiative forcings. In particular the side-by-side figures shown under the first figures (the first figures are labeled Figure 28).

    These figures are summaries of the radiative forcings. The one on the left has the components and the one on the right the Net Forcing. Note that the early 20th century has a Net Forcing that is negative. Note also that that forcing is due entirely to the effects of Stratospheric Aerosols (Annual Mean). Note too, that the detailed shape of the Net Forcing for all time periods is also set by the shape of the Stratospheric Aerosols. And finally note that the Aerosols (Stratospheric and Tropospheric) and the Aerosols Indirect Effect are basically the only thing that moderates the Well-Mixed Greenhouse Gases effect.

    Now, all aerosols are way, way outside my fields of experience and expertise. But I doubt that these things were actually being measured in the late 1800s and early 1900s. It is my understanding, especially using the radiative-equilibrium approach, that Net Negative Forcing decreases the Global Average Temperature Anomaly.

  205. Posted Feb 22, 2008 at 2:36 PM | Permalink

    re: #203

    Did that before and have tried again over the past couple of days.

    No joy.

  206. Posted Feb 22, 2008 at 2:44 PM | Permalink

    re: #181

    Steven, I use BBEdit on a Mac. I can find anything and everything in any and all text files 🙂

  207. steven mosher
    Posted Feb 22, 2008 at 2:51 PM | Permalink

    RE 197. Thanks Doug, I’ll have a look. We concur 100% on the quality of the distribution.
    go have a look at the Python in Step1. I cannot imagine mixing Python with Fortran in one distribution.
    Especially one that is only about 10K LOC. Nothing against Python or Fortran, language is language.
    But for maintainability pick a language. And pick a language your collegues know. I’ll guess of course that the new guy did the Python. He should have learned Fortran or rewritten the whole shebang in Python.

    the .txt file is likely an “output” file and not an input file. I say that with some hesitation.
    If you look at the program it was written at a time when memory was on the small side, so you have
    a lot of reading in, and writing out, intermediate files. Today, you’d just read in all the input files
    intoan object and then apply methods and we would have none of these questions.

    Modernizing the software. The global warming gravy train.

    I think MIL-STD-2167A would be in order for all climate code.

  208. Robert Wood
    Posted Feb 22, 2008 at 5:45 PM | Permalink

    Oh Boy!

    Since Jim is obviously following this closely, may I suggest he use only stations QA’d by Anthony, with no “corrections”.

  209. Robert Wood
    Posted Feb 22, 2008 at 6:07 PM | Permalink

    #149 Anthony,

    I think it depends upon how the compiler for whatever machine the code runs on interprets the .EQ. ‘R’ term when it must compare with two characters.

    Depending upon the implementation, the comparison may be only a comparison of the second byte of the characters selected. Compilers back in those days were not too bright about flagging range and bound errors. I suspect that a comparand of the form ‘<anything>R’ would have compared correctly with ‘R’.

    YMMV
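
    For reference, standard Fortran blank-pads the shorter CHARACTER operand on the right before comparing, so under the standard rules a two-character field holding ‘ R’ can never equal ‘R’. A quick illustration of the padding semantics, sketched in Python:

        # Standard Fortran compares CHARACTER values after padding the
        # shorter operand with trailing blanks, so CHARACTER*2 ' R' vs 'R'
        # is effectively ' R' vs 'R ':
        a = ' R'                # two-character field as read from the file
        b = 'R'.ljust(len(a))   # what a standard compiler compares: 'R '
        print(a == b)           # False under standard semantics
        print(a[1] == 'R')      # True if a compiler compared only byte 2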

  210. Robert Wood
    Posted Feb 22, 2008 at 6:20 PM | Permalink

    #153 “Conard” you must be French.

    As we say au Canada: “Bin, c’est un conard” (“Well, he’s an idiot”).

  211. Sam Urbinto
    Posted Feb 22, 2008 at 6:38 PM | Permalink

    The code.

    If you know it’s doing an implicit ‘&’ by using a space (‘ R’), then, for what your specific compiler wants, either escape the character or replace the space with something appropriate for the machine, like a logical operation that excludes it.

  212. Robert Wood
    Posted Feb 22, 2008 at 7:09 PM | Permalink

    172 PHE

    And adapted to a similar subject: “Paleo-dendrochronology is a field of study where there is undoubtedly a long way to go before the reconstruction of millennial global temperatures in the past will be anything more than suggestive. However, there are a lot more research grants out there waiting to be collected and analysed.”

    There. Corrected. This is the source of the fr##d.

  213. thomas
    Posted Feb 22, 2008 at 7:20 PM | Permalink

    A question to Mr. Watts:

    About the inaccurate long/lat for the stations given by official sources compared to Google or GPS: has anyone asked whether they are all off by a similar amount which varies by latitude? The magnetic pole has moved so much in the last 30 years that even a survey coordinate on a housing plan from 30 years ago is now off and needs correction by surveyors.

  214. Posted Feb 22, 2008 at 7:21 PM | Permalink

    Re: #151 S.M.

    From Imhoff et al. 1997: “The final DMSP/OLS continental U.S. city lights-data product used here consists of a composite image made from 231 orbital swaths gathered between fall 1994 and spring 1995…” The specific states mentioned by Imhoff et al. as having apparently too much urban nightlight were Idaho, Montana, and Wyoming.

    I wonder if part of the problem out West is streetlights shining on winter snow cover.

  215. Anthony Watts
    Posted Feb 22, 2008 at 8:41 PM | Permalink

    RE213, No, I haven’t investigated it that much; it’s more of an annoyance than a survey killer. We almost always find the station. One we haven’t found, Dover, DE, has a typo in the MMS entry for lat/lon, we believe. When we do find stations and a survey is done, we get a GPS fix and put up links from Google Earth, so it’s easy to go back and find it.

    Of course, some, like our recent commenter Eric McFarland, who don’t fully understand the issue of global coordinates, surveying, and magnetic drift, ascribe other motives/issues to it when it’s just simple imprecision.

    Eric, how’s that surveying of a station coming? Find one yet?

  216. steven mosher
    Posted Feb 22, 2008 at 8:51 PM | Permalink

    Re 211. The weird thing is this: I cannot find the source for column 31.
    The nightlights data exists for the whole world, yet in this file only selected
    sites in the US, Canada and Mexico are populated with the variable.

    Where does that happen? How does it happen? Why some sites in Canada but not others?

  217. Anthony Watts
    Posted Feb 22, 2008 at 9:20 PM | Permalink

    RE216 Mosh, I think it’s time for a mid-audit summary of what we know and don’t know, by bullet points; this will help foster an understanding for all who are reading.

    For example, Nightlights list V1.01

    We know
    – Nightlights are used in the USA
    – Some spill over the borders into towns in Canada and Mexico
    – Nightlights aren’t used in ROW, population is
    – Some lights=0 stations get adjusted, Cedarville, CA for example
    – Some lights=0 stations don’t get adjusted, Cheesman Lake, CO for example
    – etc…

    We don’t know
    – Why ROW uses population and USA doesn’t
    – Where the source for column 31 is
    – What environment is needed to compile and run GISS code
    – etc….

    Since you have the most familiarity, perhaps you can use this as a starting list and add to it; Steve Mc can add what he knows, and we can keep a running list going to help keep our known/unknowns straight. We strike through/add as we go along. We update the list with version numbers.

    This GISS code has too much complexity and entropy to keep in one’s head. Keeping a list allows attacking the large problem point by point.

    What do you think? Anyone?

  218. George M
    Posted Feb 22, 2008 at 9:30 PM | Permalink

    thomas and Anthony:
    I used to have to figure out the geographical coordinates of radio site locations back in the pre-GPS days in order to apply for licenses. We used USGS topographic maps of whatever the best scale was available for the location. I have had occasion to go back and compare a few of those to modern GPS data, and I find errors of several hundred yards in some cases. Where I live, for example: if I go to the coordinates derived from a current 15-minute topo map with my GPS, I find myself half a city block away from home. Also, there is some difference between the various map datums which may be used. NAD83 seems to be a common one, but others may be used, further confusing things. I forget whether SurfaceStations specifies which map datum to use or not (it should, Anthony). In any event, sites located more than 15 years ago were likely spotted on topo maps with all their shortcomings.

  219. BarryW
    Posted Feb 22, 2008 at 9:33 PM | Permalink

    Re 217

    I’d reverse the question. Why doesn’t ROW use nightlights and USA does? We’ve got good census data down to the district but how good is the ROW pop data? Is it because lights don’t correlate with population in the ROW? High population with low energy usage doesn’t strike me as a good indicator of UHI.

  220. Posted Feb 22, 2008 at 11:52 PM | Permalink

    Steve Mosher in #193 writes,

    RE 185. Hu. Go read the TOBS paper. The tobs effect changes by latitude and by season.

    From memory I think the adjustment can be as high as 1C, but the model carries a Stderr of .2C, again from memory.. so Have a read.

    You remember well. In #110 of the TOBS thread (see #173), I reported that Fig. 8 of the Karl et al paper (which I had already read) indicated an average shift of about -0.9 dC (averaged across both latitudes and seasons) when Tobs was changed from 5PM to 7AM, and that Karl gave his bias estimates a se of about 0.2 dC. Since about 28% of stations made this switch between 1941 and 1985, this would be an average shift of -.25 dC before adjustment, requiring an average adjustment of +.25 dC to compensate. The actual average TOB adjustment was about +.22 dC during that time (see #113 of the TOBS thread), which is close enough.
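
    For concreteness, the arithmetic in the preceding paragraph can be checked in a few lines of Python (the -0.9 dC shift and 28% switch fraction are the figures quoted above, not re-derived here):

        shift_per_switch = -0.9    # dC, Karl et al. Fig. 8, 5PM -> 7AM
        fraction_switched = 0.28   # share of stations switching, 1941-1985

        network_bias = shift_per_switch * fraction_switched  # about -0.25 dC
        adjustment_needed = -network_bias                    # about +0.25 dC
        print(round(network_bias, 2), round(adjustment_needed, 2))
        # Compare the actual average TOB adjustment of about +0.22 dC.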

    According to NOAA’s MMS (via surfacestations.org), Lampasas read its temperature at 0700 from 1/1/0000 to 6/30/2000, and switched to 0630 from 10/1/2000 to present. I guess it took them 3 months to find a suitable parking lot when they made their move. 0700 to 0630 should increase the probability of double-counting cool nights and hence have a small cooling TOB effect, justifying a small upward adjustment after the move, but nothing like the jump that occurred.

    On a related matter: if I collect the data from 100 stations, I have 100 independent samples for constructing an average and calculating a variance.

    Now, if I use information from 50 of those stations to create a bias correction for the other 50, do I still have 100 independent samples?

    If you “homogenize” 50 randomly selected stations, using data for the other 50, you more or less only have 50 independent stations in the end.

    Of course if you “homogenize” the 25 good stations that are “out of line” with the 75 other bad stations, you get garbage. Or sewage, to be more precise, if you rely on stations like Urbana OH as your norm.

    An interesting special case is that of “zombie” stations that no longer exist yet still churn out adjusted data. Case in point being Delaware OH, which hasn’t had a reading since 1/01, but has had annual average data every year since! See surfacestations.org gallery.

  221. Jaye
    Posted Feb 23, 2008 at 1:01 AM | Permalink

    Modernizing the software. The global warming gravy train.

    Sheesh, that Hansen code was a pile of crap. Shell scripts that sequentially make binaries that process data, that make some more binaries that process some more data, then go to another folder, blah, blah… that kind of stuff is amateurish IN THE EXTREME. Then to put Python in the mix. Also, you can tell that it was written by hordes of grad students: different styles, different capitalizations, etc.

    If one is going to depend on software to do one’s job, one should be marginally competent to write said code. I mean, use a freakin’ string tokenizer fer chris sake. Given about two to four weeks, a decent software engineer could rewrite that junk in C++ or Java in a way that would actually be maintainable, along with a suite of unit tests, a proper make/install system, etc.

  222. Stephen Richards
    Posted Feb 23, 2008 at 3:09 AM | Permalink

    Anthony

    Would the BB be a good place to keep your ‘defects’ list?

  223. John F. Pittman
    Posted Feb 23, 2008 at 8:04 AM | Permalink

    http://www.danhughes.auditblogs.com/ #204 A quick OT comment, except as it relates to Hansen, noted several times at the bottom of your link. I have a graph from googling historical coal consumption. The graph you linked to is indeed suspicious looking, especially the aerosol estimates for the 1930s. The annual global consumption figures show that at 1935 there was a huge drop in coal use compared to 1925 and 1945. Petroleum use was small and has a slightly negative trend around 1935. The problem is that coal use was so much greater than all other energy combined. As well, at this time there were no restrictions on percent sulphur, nor opacity (an indirect measurement of particulate matter (PM)) restrictions. Sulphur and PM are the main components of “aerosols”. For the graph to be right, the effects and half-lives of SO2 and PM would have to be approximately the same. Not only do I wonder about the night lights issue; it was recently posted that the half-life of CO2 was much less than claimed. The forcings you linked appear to follow CO2

  224. steven mosher
    Posted Feb 23, 2008 at 8:10 AM | Permalink

    re 217: lights=0 does not determine whether a station is adjusted; column 31 does.
    If it’s 1 then no adjustment; 2 and 3 get adjusted. See my explanation on your new post.
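
    A minimal sketch of that rule, with a hypothetical field name for the column-31 code:

        def gets_homogeneity_adjustment(col31_code):
            # Per the description above: code 1 means no adjustment;
            # codes 2 and 3 get adjusted. (Field name is illustrative only.)
            return col31_code in (2, 3)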

  225. John F. Pittman
    Posted Feb 23, 2008 at 8:15 AM | Permalink

    continuation (hit the wrong button) … except that the well-mixed greenhouse gases curve is relatively flat in the estimates. This is obviously incorrect per other claims about “well mixed” gases or the effects of “aerosols”. Hansen’s (or Sato’s) estimates (Hansen and Sato are credited at the bottom of your link) for this time period either do not reflect what actually occurred, or there was a large positive forcing that is not in his graphs. The “black carbon” and “aerosol indirect effect” in the graph are laughable when you look at coal consumption in the Great Depression!

  226. steven mosher
    Posted Feb 23, 2008 at 8:16 AM | Permalink

    re 184. There is a complicated procedure for combining the rural stations that surround
    the station to be adjusted. It’s in PApars.f. I will figure it out and explain.

  227. Steve McIntyre
    Posted Feb 23, 2008 at 9:00 AM | Permalink

    Steve Mosh, take a look at column 31 in the Canadian data. Some Canadian stations have a column 31 code; others don’t. There doesn’t seem to be any rhyme or reason to why some do and some don’t.

  228. steven mosher
    Posted Feb 23, 2008 at 9:18 AM | Permalink

    re 227: “adjacent to the USA” is what H2001 claims; that way he can get rural stations within
    500 km or 1000 km of an urban station.

  229. Ric Locke
    Posted Feb 23, 2008 at 10:24 AM | Permalink

    Fascinating. Wish I had time to do forensic Fortran — I used to be good at it.

    Warning: do NOT, repeat NOT, trust Google Earth or Microsoft Live Earth at the detail level. I know some of the people who do data capture for them, and what you see is for all practical purposes a Photoshop exercise — texture mapping image data on questionable surface models.

    Regards,
    Ric

  230. Anthony Watts
    Posted Feb 23, 2008 at 10:59 AM | Permalink

    Re228 Thanks. Mosh, can you make a list as I did in 217 and maintain it then?

    The way we are tracking it now, across multiple posts and threads, is like working in a room with thousands of post-it notes stuck to the wall.

    Such is not conducive to collaborative discovery.

  231. aurbo
    Posted Feb 25, 2008 at 1:04 AM | Permalink

    A few personal notes, anecdotal and otherwise, concerning the general tenor of many of the above posts.

    First, it appears that GISS is using a paleo-operating system with which to store and analyze their data. Their employment of what seems to be a version of Fortran that is neither standard nor mil-spec is absurd. To suggest they can’t afford an upgrade is even more absurd.

    Regarding aerosols: During the period from the late 1930s to the mid 1940s, I used to walk several blocks (and through Central Park as well) on my daily trek to and from school, which was located in the West 90s while I lived in the East 80s. During the colder months of the year I would leave footprints in the fly-ash which covered the sidewalks. The ash cover was produced by the particulates emanating from the nearly universal use of coal for heating the brownstones which lined each side of the cross streets. Another consequence of this ubiquitous ash was that snow on the ground would turn various shades of gray within a day or two of its precipitation. This accelerated the melting process and is one reason why persistent snow cover was relatively uncommon in the Metropolitan area until the use of coal gradually declined and essentially vanished by the late 1950s. [N.B. Is anybody out there old enough to remember that the sponsor of the radio program “The Shadow” was Blue Coal?] The air was a lot more polluted back then than it has been for the past 40 years. For a worst-case scenario, Google “Donora PA smog 1948”.

    Now, before one passes this off as the product of an unreliable memory and hence not easily verifiable: the data not only exists, but is in computer-readable, downloadable form and easy to analyze by anyone interested in doing so. Just FTP the standard synoptic database from NCDC and do daily or monthly counts of the frequency of code 04 (visibility reduced by smoke) and code 05 (haze) in the present-weather group. I spent quite a few years hand-plotting weather maps in my early employment, and on most days under fair skies and a stagnant high-pressure system I was plotting code 04s and 05s. In looking at data from the past 40 years, the presence of smoke and, to a lesser but still significant extent, haze has been greatly reduced. A similar, but less precise, search can be done analyzing horizontal visibilities in the absence of precipitation or fog. These can be easily extracted from the METAR hourly reports.
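
    A sketch of the count being suggested here, assuming the synoptic records have already been parsed into (year, present-weather code) pairs; the NCDC record layout itself varies by era and is not reproduced here:

        from collections import Counter

        def smoke_haze_by_year(records):
            # records: iterable of (year, ww) pairs, where ww is the
            # present-weather code from the synoptic report
            counts = Counter()
            for year, ww in records:
                if ww in (4, 5):   # 04 = smoke, 05 = haze
                    counts[year] += 1
            return counts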

    Finally, I might exclude Southern CA from this analysis, as their smog problem is a product of the micro-climate and large amounts of photo-chemical reactants, which, although improved, is far from completely resolved.

  232. MarkW
    Posted Feb 25, 2008 at 7:20 AM | Permalink

    BarryW,

    The second and third world is a lot less electrified than the US. The ratio of lights to people is a lot lower. As an extreme example, have you seen the nightlights picture of the Korean peninsula? It’s real easy to tell where S. Korea ends and N. Korea starts.

  233. Steven Mosher
    Posted Feb 25, 2008 at 7:41 AM | Permalink

    re 230. Will do; I’ve been working on other stuff.

  234. BarryW
    Posted Feb 25, 2008 at 10:22 AM | Permalink

    Re 232

    Yes, but the amount of energy generated (i.e., heat) is much less per person. I’d have to believe that in equivalently populated cities the UHIE for N. Korea is much less than in S. Korea.

  235. MarkW
    Posted Feb 25, 2008 at 11:44 AM | Permalink

    Heat from electricity is only one aspect of UHI, and not a major one at that. The biggest are the amount of paved surface, and the amount of vegetation removed. Most 2nd and third world cities have very high population densities. This will be a major source of heat, even without electric lights.

  236. Geoff Sherrington
    Posted Feb 29, 2008 at 5:03 AM | Permalink

    TOBS again

    It’s 1.25 am in California now and that makes for easy typing in Australia since most of USA is asleep. And it’s a cool evening as the last day of Feb 2008 ends. The Bureau of Meteorology reported that –

    Victoria has experienced a relatively cool summer with maximum and minimum temperatures typically 1 to 3 degrees below average across the State, the coolest since 1995 in some areas.

    That was a historical snippet for year 2002, just to get you going.

    For year 2008, Feb Tmax averaged 25.8 and Tmin averaged 14.5. The long-term Feb averages, years 1855-2007 are Tmax 25.1 and Tmin 15.9. Look at the difference in Tmin for this month!!

    One swallow does not a summer make, but we are taught that CO2 causes global warming everywhere, relentlessly, increasingly, with settled science…..

    Back to Time of Observation Bias as adjusted for thermometers which record a daily Tmax and a daily Tmin. The TOBS adjustment became an issue because you could do simple sums that showed that if you read the instruments at different times of day, you could get yesterday’s max or min instead of today’s. So I’ve just re-read Karl for the nth time.

    The problem seems to be that the accepted methodology is to adjust the temperature axis, usually graphed as Y. The mistake occurs in time, on the X-axis. So why not adjust time instead of temperature?

    Here is a quick, imperfect attempt at a flowsheet to correct for time.

    Read time of observation. Select the preceding 24 hours. If Tmax and Tmin are both present, (and we assume this for this simple model) one is right. Review the past month. Calculate the time of day when the maximum is most often reached – say 2 pm. If the time of observation is before this, use the immediately prior Tmin as correct and go back to the high before it and record that as the Tmax for the day. (If the time of observation is after the monthly Tmax average time, reverse the labelling and accept the prior Tmax on the day as correct and search for the Tmin in the day preceding it, then accept it as the correct Tmin for that day.)
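
    A rough rendering of that flowsheet in Python, purely as a sketch: it assumes sub-daily readings are available, which (as #238 notes below) is generally not the case for the historical network.

        from datetime import timedelta

        def relabel_extremes(readings, obs_time, typical_max_hour=14):
            # readings: list of (timestamp, temp) pairs covering at least
            # the 24 hours before obs_time
            window = [(t, v) for t, v in readings
                      if obs_time - timedelta(hours=24) <= t <= obs_time]
            tmin_t, tmin = min(window, key=lambda p: p[1])
            tmax_t, tmax = max(window, key=lambda p: p[1])
            if obs_time.hour < typical_max_hour:
                # Observation before the typical daily max: keep Tmin,
                # take Tmax from the stretch before that Tmin.
                prior = [v for t, v in window if t < tmin_t]
                if prior:
                    tmax = max(prior)
            else:
                # Observation after the typical daily max: keep Tmax,
                # take Tmin from the stretch before it.
                prior = [v for t, v in window if t < tmax_t]
                if prior:
                    tmin = min(prior)
            return tmin, tmax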

    The main objection occurs when a station reading has been historically taken close to the daily max or min. This can be the case where (say) 7 am readings are taken and the daily Tmin has or has not been registered. Historically, it is less common to take readings when close to monthly Tmax, say sometime between noon and 4 pm.

    Of course, written station records override these assumptions when available.

    If this method is used, in the majority of cases by far, either Tmax or Tmin will be correct and should not need adjustment. If one of them relates to the previous day, it could be higher or lower, but will usually be much the same if there has not been an extreme change in the critical part of the day.

    Tmean (which I dislike) can then be taken arithmetically for each day (including an occasional wrong value up or down) and can then be averaged to get monthly obs.

    The advantages of this approach are that few temperature changes will be needed and that, when adjustments are made, they will be close to the preceding day, usually rare, and usually cancelling pos with neg over the long term.

    Now shoot me down please.

  237. Geoff Sherrington
    Posted Feb 29, 2008 at 5:29 AM | Permalink

    Lights – some thoughts

    Satellites measure lights as Mosh and others described above. However, the probability that a light detector at whatever frequency precisely captures a town’s lights, completely from N to S and E to W as it sweeps along, is just an accident of geometry on a given pass. If the PM tube sees just the edge of town it will report a small lights value. If it traverses the centre, it will report a larger one. If it makes a number of passes and these are combined, then you have to know the maths of how they are combined and the associated errors. In particular, the positional errors are so large that a combination of, say, 10 images is unlikely to aggregate to more than a blob which has little to do with the true shape of the town – or the location of the climate station.

    How this blob relates to Google maps etc is a further source of error, large enough to cause me severe disinterest in the topic.

    If you too had spent years trying to find radiometric and magnetic anomalies from an aircraft, then going bush to find them on the ground, you’d have a better appreciation of what I mean. And the aircraft typically flew 80-100 meters above terrain, not several hundred km like a satellite does.

    I’d really be giving the whole concept of lights a big miss. It’s got classic signs of being promotional with spin, not hard science.

  238. Posted Feb 29, 2008 at 8:30 AM | Permalink

    Re Geoff Sherrington, #236, I think you are wrongly assuming that we know when the max and min occurred. In fact, for most stations all we know is the time of observation and what the min and max temperatures were since the last reading and resetting of the thermometers or MMTS. It would be trivial to wire a thermistor to a 15-year-old PC to integrate temperature over a 24-hour period to get the true daily mean, but even the “modern” MMTS are read manually once a day for 24-hour max and min, just like the old liquid-in-glass models.

    Even if NWS went to a computer-age technology, we would still be stuck with over a century of historical data to compare the present to. In the TOBS thread, I argued that the best time to measure, in terms of avoiding the double-counting bias and the wild card, would be around 9AM or 9PM. The most popular times have been 5PM or 7AM, but 7AM is too close to the overnight min, and 5PM too close to the daily high. Switching from 5PM to 7AM is going to generate a big drop in mean temperature, which requires an offsetting upward adjustment. Eyeballing Karl’s Figure 8, it looks like about +.9 dC would be justified.

  239. ChrisZ
    Posted Feb 29, 2008 at 10:31 AM | Permalink

    I have been lurking and reading here for quite some time, and the ongoing TOB discussion is the one that puzzles me most – both because I still can’t figure out how such a bias should come to exist at all due to the changes you describe (except maybe a one-off outlier on the very day the observation time is changed, because one Tmin or Tmax is counted twice or not at all), and because it should be ridiculously easy to end the discussion by either showing the non-existence or determining the amount of the effect. Let me explain:

    If I’m not very mistaken, there exist hourly or even by-the-minute temperature data for certain places and periods. Take one month or so of these and hack it into 24-hour slices – first at 0:00, then (with the same source data) at 1:00, 2:00, and so on. To shorten the procedure, start by cutting your 24-hour periods at the two times of day noted as “most critical” above, namely 5PM and 7AM. For each set of 24-hour data slices, find Tmin and Tmax and calculate Tmean. Compare the two sets of Tmax/Tmean/Tmin found. I have a strong feeling the two sets are at worst offset by a day against each other, but certainly NOT by so-and-so-many degrees Celsius (as the one-day offset can obviously give both positive and negative temp differences depending on the slope of the curve at any point, and they will be negligible when looking at whole months, years, or even decades).
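
    That experiment is easy to sketch in Python, assuming an hourly series is in hand as (datetime, temp) pairs. Note that slicing continuous data this way does not reproduce the reset behaviour of a real max/min thermometer; to simulate that, the boundary reading would also have to be carried into the following slice.

        import statistics
        from collections import defaultdict
        from datetime import timedelta

        def minmax_mean(hourly, cut_hour):
            # Slice the series into 24 h periods ending at cut_hour,
            # then average the per-slice (Tmax + Tmin) / 2 values.
            days = defaultdict(list)
            for t, v in hourly:
                key = t.date() if t.hour < cut_hour else t.date() + timedelta(days=1)
                days[key].append(v)
            daily = [(max(v) + min(v)) / 2 for v in days.values() if len(v) == 24]
            return statistics.mean(daily)

        # e.g. bias = minmax_mean(hourly, 7) - minmax_mean(hourly, 17)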

    Back to lurk mode (unless someone points me to the kind of data I am talking about so that I may make the experiment myself…)

  240. Sam Urbinto
    Posted Feb 29, 2008 at 10:32 AM | Permalink

    How about a reading at the top of every two hours at two locations (or 1 location for two consecutive days) starting at midnight:

    Hour  Day 1  Day 2
    12    45     50
    02    50     51
    04    50     52
    06    50     53
    08    55     54
    10    55     55
    12    50     56
    14    50     57
    16    55     58
    18    55     59
    20    60     60
    22    60     45

    So both days have the same (min + max)/2 mean: (45 + 60)/2 = 52.5.

    But what does that really tell you?
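
    Running the two columns through a few lines of Python makes the point concrete: the (min + max)/2 convention gives the same answer for both days, while the true 12-reading means differ.

        day1 = [45, 50, 50, 50, 55, 55, 50, 50, 55, 55, 60, 60]
        day2 = [50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 45]

        for d in (day1, day2):
            print((max(d) + min(d)) / 2, round(sum(d) / len(d), 1))
        # 52.5 52.9 and 52.5 54.1: same min/max mean, different profiles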

One Trackback

  1. […] to GISS, pre-Y2K, post-Y2k and current from 19 Versions and Whadda You Get Climate Audit Quote: Pre-Y2K: As of mid-2007, prior to the identification of the Y2K error, NASA used the […]