The September 2007 Bear Market in NASA Temperature "Pasts"

Since August 1, 2007, NASA has had 3 substantially different online versions of their USHCN station data (1221 stations in total). The third and most recent version was slipped in without any announcement or notice in the last few days – subsequent to their code being placed online on Sept 7, 2007. (I can vouch for this as I completed a scrape of the dset=1 dataset in the early afternoon of Sept 7.)

We’ve been following the progress of the Detroit Lakes MN station and it’s instructive to follow the ups and downs of its history through these spasms. One is used to unpredictability in futures markets (I worked in the copper business in the 1970s and learned their vagaries first hand). But it’s quite unexpected to see similar volatility in the temperature “pasts”.

For example, the Oct 1931 value (GISS dset0 and dset1 – both are equal) for Detroit Lakes began August 2007 at 8.2 deg C; there was a short bull market in August with an increase to 9.1 deg C for a few weeks, but its value was hit by the September bear market and is now only 8.5 deg C. The Nov 1931 temperature went up by 0.8 deg (from -0.9 deg C to -0.1 deg C) in the August bull market, but went back down the full amount of 0.8 deg in the September bear market. December 1931 went up a full 1.0 deg C in the August bull market (from -7.6 deg C to -6.6 deg C) and has held onto its gains much better in the September bear market, falling back only 0.1 deg C to -6.7 deg C.
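The arithmetic above can be tabulated with a short sketch. The values are the ones quoted in the paragraph (the September value for Nov 1931 is inferred from "went back down the full amount"); the function name `moves` is just an illustrative choice.

```python
# Detroit Lakes MN values (deg C) quoted in the text for three online versions:
# start of August 2007, mid-August 2007, and September 2007.
versions = {
    "Oct 1931": (8.2, 9.1, 8.5),
    "Nov 1931": (-0.9, -0.1, -0.9),  # Sept value inferred: fell back the full 0.8
    "Dec 1931": (-7.6, -6.6, -6.7),
}

def moves(before, peak, after):
    """Return (August change, September change, net change), rounded to 0.1 deg C."""
    return (
        round(peak - before, 1),
        round(after - peak, 1),
        round(after - before, 1),
    )

for month, values in versions.items():
    august, september, net = moves(*values)
    print(f"{month}: Aug {august:+.1f}, Sept {september:+.1f}, net {net:+.1f}")
```

On these numbers, only November ends up back where it started; October and December keep part of their August gains.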

All records of the August bull market in Detroit Lakes pasts have been erased from the NASA website, but I managed to complete my downloads in time and am in a position to try to decode exactly what’s been going on.

USHCN Survey Results based on 33% of the network

With 33% of the USHCN weather station network now surveyed, the site quality rating has now been applied; see the USHCN Station Master List file in HTML and XLS format.

The rating system for site quality was borrowed verbatim from the new Climate Reference Network being put into operation by NCDC and NOAA to ensure quality data. Their siting criteria can be found here.

I welcome input on this work in progress. The site rating will now be a running total in the spreadsheet and always available online as new stations are added to the survey. What is important to note is that the majority of stations that have a rating of 4 are MMTS/Nimbus equipped stations, which according to NCDC’s MMS equipment lists, make up 71% of the USHCN network. It appears that cable issues with the electronic sensors have forced them closer to buildings, roads, etc because NOAA COOP managers don’t often have the budget, time, or tools to trench under roads, sidewalks etc to reach the site where Stevenson Screens once stood. While this isn’t always the case, a pattern is emerging.

CRN-rating.gif

For background, see this first: Conference presentation given at CIRES/UCAR on 8/29/07 describing this project and the methods used to assign station site quality ratings, along with examples of many site issues seen thus far.

Click to view the slideshow I presented at UCAR

Immediately after the conference, a senior official at NCDC requested a copy of the above slide show, which I provided to him on CDROM. After receiving it, in a follow-up email he inquired as to distribution rights, which I granted within NCDC and NOAA for the purpose of review. That was last week. Thus far no issues have been raised with the presentation content. Since no issues were raised at the conference or in the two weeks afterwards (two weeks as of today), I have decided to release it publicly.

Note that of the 33% surveyed, only 13% meet the CRN site criteria (ratings of 1 and 2) for an acceptable location to accurately measure long-term climate change free of localized influences.

Unthreaded #20

No discussion of CO2 measurements, thermodynamics, theory of radiation, etc. please – other than to identify interesting references.

Hansen Then and Now

In the “good old days” (August 25, 2007: after they had corrected their Y2K error), I downloaded Hansen’s “combined” version (his dset=1).

Jerry Brennan observed today that Hansen appeared to have already “moved on”, noticing apparent changes in Detroit Lakes and a couple of other sites. Here is a comparison of the Detroit Lakes (combined) as downloaded today, compared to the version downloaded less than 3 weeks ago. As you see, Detroit Lakes became about 0.5 deg C colder in the first part of the 20th century, as compared to the data from a couple of weeks ago.

detroi10.gif

In a few more weeks, maybe Hansen will have 1998 – and perhaps even 2006 – on top again.

Despite these large changes, Hansen, as with the Y2K corrections, did not provide any notice to readers that major changes (not arising through ordinary operations) had been inserted in his records.

It looks like this is the reason for the conundrum observed in my last post. I never thought of checking to see if Hansen had altered early 20th century values for Detroit Lakes MN between August 25 and Sept 10. It’s hard to keep up with NASA adjusters. As noted previously, no wonder Hansen can’t joust with jesters, when he’s so busy adjusting his adjustments.

Here is the same comparison for Boulder CO, another site mentioned by Jerry, showing major changes in Boulder temperature estimates for the 1980s!

detroi11.gif

As a result of revisions made within the last 2 weeks, NASA now believes that the temperature increase in Boulder since the 1980s is about 0.5 deg more than they believed only a couple of weeks ago. Boulder is the home of IPCC Working Group 1, the site of UCAR’s world headquarters, NCAR’s site and home to hundreds, if not thousands of climate scientists. You’d think that they’d have known the temperature in Boulder in the early 1980s to within 0.5 degree. I guess not.

Crossword Puzzle #4: Re-Visiting Y2K

In my opinion, one of the main purposes of compiling the Hansen code is to produce intermediate versions to determine what Hansen really did. Today I’m going to compare the Step 0 version of Detroit Lakes MN (familiar to many readers from the Y2K error) as produced from running Hansen’s code with a current USHCN data set to the corresponding version presently online at NASA GISS. They are materially different. In trying to figure this out, I’ve waded through the relevant code in Step 0 in which Hansen splices USHCN data into GHCN and spent a little time pondering how this particular code – prior to the recent patch – could yield the observed Y2K error. (I might add that, although Hansen thanked me for observing that they needed a patch to correct the error, it’s more precise to say that I observed that the error resulted from using two different versions – my own suggestion would have been to use a consistent data version, rather than to add another adjustment.)

In some earlier speculation, I wondered whether Hansen might not have been using a vintage version of the USHCN data. After looking through the matter some more, I’m 99% certain that Hansen used (and continues to use) a vintage USHCN version ending in 1999 – even though the online USHCN version is current up to late 2006 (more current than the corresponding GISS data.) I’ll summarize the evidence for Hansen’s continued use of a vintage USHCN version, and in the course of this description, improve the description of Hansen’s Step 0 as part of a process enabling the implementation of this step in a more current software environment.

Hansen Code

Technical Comments only.

The Bias Method's Perfect Siberian Storm

I have been spending some time (my wife would say “too much time”) examining how the Hansen Bias Method influences the temperature record. We have already observed that the Hansen method introduces an error in cases where the different versions are merely scribal variations. See http://www.climateaudit.org/?p=2019 and discussion.

The cause of the error has also been pinned down: in the case where a scribal version has only two of three monthly temperature values in a quarter available, Hansen calculates the anomalies of the available two months. It is important to note that the anomalies are the difference between the month’s recorded value and the month’s average value for the period of scribal record. Hansen takes these anomalies, averages them, and then sets the estimate of the “missing” month’s anomaly equal to this average. The “missing” monthly temperature value is then estimated by adding the estimated anomaly to the scribal record’s mean for the month. This occurs even when there is a temperature value available for the missing month in another scribal record. From the two available monthly values and the third, estimated monthly value a quarterly average is calculated, followed by a calculation of the annual average from the quarterly averages. Finally, for the two scribal records that are being combined, Hansen averages the annual averages for the overlap period, and, if there is a difference between the two averages, determines that to be a bias of one version relative to another and adjusts the earlier version downwards (or upwards) by the amount of the bias.
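The steps described above can be sketched in a few lines. This is a minimal illustration of the procedure as described in this post, not Hansen's actual code; the function names, the dictionary layout and the toy numbers are all assumptions made for the example.

```python
from statistics import mean

def estimate_missing_month(quarter, monthly_means):
    """Fill the one missing month of a quarter, as described in the text.

    quarter:       {month: temperature or None}, with exactly one None
    monthly_means: {month: mean for that month over this scribal record}
    """
    available = [m for m, t in quarter.items() if t is not None]
    missing = [m for m, t in quarter.items() if t is None]
    if len(available) != 2 or len(missing) != 1:
        raise ValueError("expected exactly two available months and one missing")
    # Anomaly = recorded value minus the month's mean over the record.
    anomalies = [quarter[m] - monthly_means[m] for m in available]
    # The missing month's anomaly is set equal to the average of the other two.
    estimated_anomaly = mean(anomalies)
    filled = dict(quarter)
    filled[missing[0]] = monthly_means[missing[0]] + estimated_anomaly
    return filled

def overlap_bias(annual_a, annual_b):
    """Mean annual average of version a minus version b over their common years."""
    years = sorted(set(annual_a) & set(annual_b))
    return mean(annual_a[y] for y in years) - mean(annual_b[y] for y in years)

# Toy numbers, made up for illustration only.
monthly_means = {"Dec": -15.0, "Jan": -20.0, "Feb": -18.0}
quarter = {"Dec": None, "Jan": -22.0, "Feb": -20.0}
filled = estimate_missing_month(quarter, monthly_means)
print(filled["Dec"])
```

In the toy case, January and February each run 2 deg below their means, so December is estimated at 2 deg below its mean, whatever value another scribal version may actually hold for that month.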

While the method clearly will corrupt the data set, there doesn’t seem to be any reason why it would introduce a material bias in northern hemisphere or global trends. We’ve observed cases in which the method caused early values to be falsely increased (Gassim) and cases where the method caused early values to be falsely reduced (Praha), and one’s first instinct is that Hansen’s method would not affect any overall numbers. (Of course that was one’s initial impression of the impact of the “Y2K” error on the US network.)

However, that proves not to be the case, because of a “perfect storm” so characteristic of climate errors.

Hansen’s network outside the US has 2 main components: GHCN records, which all too often end in 1990 for non-US stations (USHCN records continue up to date); and 1502 MCDW stations (mainly airports). The MCDW reports started in January 1987 and continue to the present day.

In Siberia, to take an important case under discussion, the overlap between the MCDW record and GHCN record is typically 4 years – from January 1987 to December 1990 or so. Here’s where the next twist in the perfect storm comes in. Instead of calculating annual averages over a calendar year, Hansen calculates them over a “meteorological year” of Dec-Nov. While there may be a good reason for this choice, it has an important interaction with his “Bias Method”.
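To make the Dec-Nov convention concrete, here is a small sketch that maps a calendar month to its meteorological year and quarter. The function name and the season labels other than DJF are illustrative assumptions, not taken from Hansen's code.

```python
# Quarter label for each calendar month under a Dec-Nov meteorological year.
SEASON_OF_MONTH = {
    12: "DJF", 1: "DJF", 2: "DJF",
    3: "MAM", 4: "MAM", 5: "MAM",
    6: "JJA", 7: "JJA", 8: "JJA",
    9: "SON", 10: "SON", 11: "SON",
}

def met_year_and_season(year, month):
    """Return (meteorological year, quarter) for a calendar year and month."""
    # December is counted with the following calendar year.
    met_year = year + 1 if month == 12 else year
    return met_year, SEASON_OF_MONTH[month]

print(met_year_and_season(1986, 12))  # December 1986
print(met_year_and_season(1987, 1))   # January 1987
```

Under this convention December 1986 belongs to the 1987 DJF quarter, so a record that starts in January 1987 begins with an incomplete quarter.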

Even if the two versions are temperature-for-temperature identical in the overlap period, the MCDW series is “missing” the December 1986 value and the 1987 DJF quarter must be “estimated”. Now suppose that Jan-Feb 1987 are “cold” (in anomaly terms) relative to December 1986 (also in anomaly terms). As it happens, this seems to be the case over large parts of Asia (other areas will be examined on another occasion). The variations in Asian anomalies are very large. Let’s say that over large regions of Asia, the Dec 1986 anomaly was 2.5 deg C higher than the Jan-Feb 1987 anomaly. And let’s say that all other values are scribally equal.

Under Hansen’s system of comparing annual anomalies, this difference of 2.5 deg C will enter into the average of 4 years (in effect being divided by 48 months) and then rounded up to a “bias” of 0.1 deg C. Since the MCDW version is “cold” relative to the prior GHCN version, the GHCN version extending to earlier values will be lowered by 0.1 deg C.
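The division by 48 can be worked through step by step. This is only a sketch of the arithmetic in the illustration above; the 2.5 deg C gap and the 4-year overlap come from the text, while rounding to the nearest 0.1 deg C is an assumption about how the "bias" is reported.

```python
# Inputs taken from the illustration in the text.
dec_gap = 2.5        # deg C: Dec 1986 anomaly minus the Jan-Feb 1987 anomaly
overlap_years = 4    # length of the MCDW/GHCN overlap, in meteorological years

# The estimated December is off by dec_gap. It is one of 3 months in DJF,
# DJF is one of 4 quarters in the year, and the year is one of 4 in the overlap.
quarter_error = dec_gap / 3
annual_error = quarter_error / 4
overlap_error = annual_error / overlap_years   # equals dec_gap / 48

# Assumption: the bias is rounded to the nearest 0.1 deg C.
bias = round(overlap_error, 1)

print(f"{overlap_error:.3f} -> {bias}")
```

The unrounded difference is only about 0.05 deg C; it is the rounding that turns it into a full 0.1 deg C step.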

It looks like there may be a domino effect if there is more than one series involved, with the third series extending to (say) the 1930s. Hansen combines the first two series (so that the deduction of 0.1 deg C is included in this interim step.) When the early scribal version is compared to the “merged” version, the early scribal version now appears to be running “warm” relative to the adjusted version by 0.1 deg C. So it “needs” to be lowered as well.

The net effect is to artificially increase the upward slope in the overall temperature trend for most of the stations we have studied. As noted earlier, this process can bias records in the other direction, but stations with the requisite conditions have been hard to come by – Gassim being one of the few.

Hansen's Station Lists

The section of Hansen’s code that we’d been looking at immediately prior to the dump of relatively unannotated code was how Hansen combined scribal versions of different stations in a 2-column case – which looks to contain a material error already discussed without the benefit of source code and which is going to be examined further. We had just started trying to figure out how Hansen dealt with multiple series. This appears to be considered in the Step 1 program comb_records.py, which employs a subroutine get_best in which station versions are ranked according to provenance as follows:

‘MCDW’: 4, ‘USHCN’: 3, ‘SUMOFDAY’: 2, ‘UNKNOWN’: 1
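A ranking like this can be applied with a one-line selection. The sketch below is not the actual get_best subroutine; the function name `pick_best`, the record layout and the tie-breaking behaviour (first record wins) are assumptions made for illustration.

```python
# Provenance ranking as quoted above.
SOURCE_RANK = {"MCDW": 4, "USHCN": 3, "SUMOFDAY": 2, "UNKNOWN": 1}

def pick_best(records):
    """Return the record with the highest-ranked source.

    records: list of (record_id, source) tuples. Sources not in the table
    are treated as UNKNOWN; ties go to the earliest record (an assumption).
    """
    return max(records, key=lambda rec: SOURCE_RANK.get(rec[1], SOURCE_RANK["UNKNOWN"]))

# Hypothetical record ids, for illustration only.
candidates = [("rec0", "USHCN"), ("rec1", "MCDW"), ("rec2", "SUMOFDAY")]
print(pick_best(candidates))
```

With these candidates the MCDW version is picked ahead of the USHCN one, which is the ordering the ranking implies.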

I am unaware of any mention of this ranking procedure in Hansen et al 1999, 2001. Hansen mentions MCDW and U.S. First Order (which seems to be mostly ASOS) as follows:

Second, updates of the GHCN data covering the most recent several years include only three component data sets [Peterson and Vose, 1997]: (1) up to about 1500 of the global MCDW stations that report monthly data over the Global Telecommunications System or mail reports to NCDC, (2) up to about 1200 United States Historical Climatology Network stations, which are mostly rural; (3) up to about 370 U.S. First Order stations, which are mostly airport stations in the United States and U.S. territories in the Pacific Ocean. Third, the update for the final (current) year is based mainly on MCDW stations

We’ve talked about USHCN in the past, but not much about MCDW or First Order (ASOS) networks. Hansen provides three lists that pertain to these three networks (the lists occurring in both Step 0 and Step 1 files):

ushcn.tbl
mcdw.tbl
sumofday.tbl

The first table ushcn.tbl is a concordance with 1221 rows (equal to the number of USHCN stations), linking USHCN identification numbers with GHCN station.inv numbers (carried forward into the GISS station.inv). This is the second such concordance of USHCN and GHCN identification numbers to appear online – an earlier concordance being posted here http://www.climateaudit.org/data/ushcn/details.dat . (I’ve not compared the concordances yet.) So it’s nice to see GISS’ contribution. I checked the 1221 station IDs in ushcn.tbl for inclusion in the 7364 ids in GISS station.inv and all were included.
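An inclusion check of this kind is a simple set-membership test. The sketch below uses made-up ids and assumes the ids have already been read from the two files, since the file layouts are not described here; the function name `check_concordance` is hypothetical.

```python
def check_concordance(table_ids, inventory_ids):
    """Split table_ids into those found in the inventory and those not found."""
    inventory = set(inventory_ids)
    matched = [sid for sid in table_ids if sid in inventory]
    unmatched = [sid for sid in table_ids if sid not in inventory]
    return matched, unmatched

# Made-up ids for illustration only; real ids would come from the
# concordance table and from station.inv.
table_ids = ["A001", "A002", "A003"]
inventory_ids = ["A001", "A003", "A004"]

matched, unmatched = check_concordance(table_ids, inventory_ids)
print(len(matched), "matched;", len(unmatched), "unmatched:", unmatched)
```

The same check, run on each concordance against station.inv, gives the matched and unmatched counts directly.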

MCDW
The next list mcdw.tbl is more problematic. There are 1502 stations in the mcdw.tbl list (which dates from 1998 BTW), which is consistent with the number in Hansen et al 1999. The USHCN list was a concordance of USHCN numbers with GHCN numbers. The MCDW table also appears to be a concordance, but there are a couple of big differences from the USHCN case, where the USHCN ids reconciled to USHCN listings and the GHCN ids reconciled to the GISS station.inv ids. In this case, neither happened.

Although 1301 GHCN ids in the concordance matched GISS station.inv ids, a total of 201 identifications did not – raising a couple of questions: where did these new IDs come from? what is their purpose? how are they used?

Also where do the 9-digit “MCDW” numbers come from? There doesn’t appear to be an online concordance of MCDW ids. There are individual reports at http://www1.ncdc.noaa.gov/pub/data/mcdw/ . In these reports, 5-digit WMO numbers are used; the first 3 digits are country codes (which differ somewhat from other lexicons). There is a 4th digit in the concordance, which is usually 0. Again I don’t know the function. In the USHCN case, there was a list of stations in the network: if someone can identify the provenance of the MCDW numbers in the Hansen concordance, I’d appreciate it.

I’ve spot checked some individual GISS records back to MCDW publications and have been able to trace some individual values back.

The MCDW stations appear to be primarily airports, including many international airports. For the ROW, these constitute the lion’s share of information since 1990.

First Order/ASOS
The other new list is sumofday.tbl, which also appears to be a concordance of GHCN numbers with other identifications. There are 371 rows in this concordance – consistent with the number of First Order stations referred to in Hansen et al 1999. These are hourly stations, most of which are ASOS since the 1990s (see post on the HO-83 thermometer).

Again, about 20 ids in this table do not reconcile with any GISS station.inv numbers. Also I’ve been unable so far to locate a data inventory in which all the station codes on the other side of the concordance can be located.

AGU has specific policies on data provenance. Had Hansen observed these protocols, the exact digital source of the data would be specified – something that hasn’t happened so far, although they’ve indicated an attempt to improve their documentation.

Hansen Frees the Code

Hansen has just released what is said to be the source code for their temperature analysis. The release was announced in a shall-we-say ungracious email to his email distribution list and a link is now present at the NASA webpage.

Hansen says resentfully that they would have liked a “week or two” to make a “simplified version” of the program and that it is this version that “people interested in science” will want, as opposed to the version that actually generated their results.

Reto Ruedy has organized into a single document, as well as practical on a short time scale, the programs that produce our global temperature analysis from publicly available data streams of temperature measurements. These are a combination of subroutines written over the past few decades by Sergej Lebedeff, Jay Glascoe, and Reto. Because the programs include a variety of languages and computer unique functions, Reto would have preferred to have a week or two to combine these into a simpler more transparent structure, but because of a recent flood of demands for the programs, they are being made available as is. People interested in science may want to wait a week or two for a simplified version.

In recent posts, I’ve observed that long rural stations in South America and Africa do not show the pronounced ROW trend (Where’s Waldo?) that is distinct from the U.S. temperature history, and I’ve noted the total lack of long records from Antarctica covering the 1930s. Without mentioning climateaudit.org or myself by name, Hansen addresses the “lack of quality data from South America and Africa, a legitimate concern”, concluding this lack does not “matter” to the results.

Another favorite target of those who would raise doubt about the reality of global warming is the lack of quality data from South America and Africa, a legitimate concern. You will note in our maps of temperature change some blotches in South America and Africa, which are probably due to bad data. Our procedure does not throw out data because it looks unrealistic, as that would be subjective. But what is the global significance of these regions of exceptionally poor data? As shown by Figure 1, omission of South America and Africa has only a tiny effect on the global temperature change. Indeed, the difference that omitting these areas makes is to increase the global temperature change by (an entirely insignificant) 0.01C.

So the United States shows no material change since the 1930s, but this doesn’t matter, South America doesn’t matter, Africa doesn’t matter and Antarctica has no records relevant to the 1930s. Europe and northern Asia would seem to be plausible candidates for locating Waldo. (BTW we are also told that the Medieval Warm Period was a regional phenomenon confined to Europe and northern Asia – go figure.)

On two separate occasions, Hansen, who two weeks ago contrasted royalty with “court jesters” saying that one does not “joust with jesters”, raised the possibility that the outside community is “wondering” why (using the royal “we”) he (a) “bothers to put up with this hassle and the nasty e-mails that it brings” or (b) “subject ourselves to the shenanigans”.

Actually, it wasn’t something that I, for one, was wondering about at all. In my opinion, questions about how he did his calculations are entirely appropriate and he had an obligation to answer the questions – an obligation that would have continued even if he had flounced off at the mere indignity of having to answer a mildly probing question. Look, ordinary people get asked questions all the time and most of them don’t have the luxury of “not bothering with the hassle” or “not subjecting themselves to the shenanigans”. They just answer the questions the best they can and don’t complain. So should Hansen.

Hansen provides some interesting historical context to his studies, observing that his analysis was the first analysis to include Southern Hemisphere results, which supposedly showed that, contrary to the situation in the Northern Hemisphere, there wasn’t cooling from the 1940s to the 1970s:

The basic GISS temperature analysis scheme was defined in the late 1970s by Jim Hansen when a method of estimating global temperature change was needed for comparison with one-dimensional global climate models. Prior temperature analyses, most notably those of Murray Mitchell, covered only 20-90N latitudes. Our rationale was that the number of Southern Hemisphere stations was sufficient for a meaningful estimate of global temperature change, because temperature anomalies and trends are highly correlated over substantial geographical distances. Our first published results (Hansen et al., Climate impact of increasing atmospheric carbon dioxide, Science 213, 957, 1981) showed that, contrary to impressions from northern latitudes, global cooling after 1940 was small, and there was net global warming of about 0.4C between the 1880s and 1970s.

Earlier in the short essay, Hansen said that “omission of South America and Africa has only a tiny effect on the global temperature change”. However, they would surely have an impact on land temperatures in the Southern Hemisphere? And, as the above paragraph shows, the calculation of SH land temperatures and their integration into global temperatures seems to have been a central theme in Hansen’s own opus. If Hansen says that South America and Africa don’t matter to “global” and thus presumably to Southern Hemisphere temperature change, then it makes one wonder all the more: what does matter?

Personally, as I’ve said on many occasions, I have little doubt that the late 20th century was warmer than the 19th century. At present, I’m intrigued by the question as to how we know that it’s warmer now than in the 1930s. It seems plausible to me that it is. But how do we know that it is? And why should any scientist think that answering such a question is a “hassle”?

In my first post on the matter, I suggested that Hansen’s most appropriate response was to make his code available promptly and cordially. Since a somewhat embarrassing error had already been identified, I thought that it would be difficult for NASA to completely stonewall the matter regardless of Hansen’s own wishes in the matter. (I hadn’t started an FOI but was going to do so.)

Had Hansen done so, if he wished, he could then have included an expression of confidence that the rest of the code did not include material defects. Now he’s had to disclose the code anyway and has done so in a rather graceless way.

Waldo in Siberia

We have some firm sightings of Waldo in Siberia, as Warwick Hughes has long told us. There are very remarkable differences between temperature series depending on the site in Siberia. When Gavin Schmidt or James Hansen encounter differences of this order of magnitude between the U.S. and the ROW, they ascribe it to “regional” climate change – a pattern of temperature change in the 20th century which has left temperatures in the U.S. with relatively little change since the 1930s, while there have been much larger increases in the ROW.

There is a remarkable microcosm of this pattern in Siberia, where cities like Irkutsk and Bratsk have experienced sharp increases, while other sites have experienced relatively little change – a pattern no doubt ascribed by Schmidt and Hansen to “regionalization”. Here are a few plots: