OK, What Caused the Problem?

Are you like me and a little puzzled as to exactly how the GHCN-GISS problem happened? GISS blamed their supplier (NOAA GHCN). Unfortunately NOAA’s been stone silent on the matter. I checked the Russian data at meteo.ru and there was nothing wrong with it. Nor is there anything wrong at GHCN-Daily for stations reporting there. So it’s something at GHCN-Monthly, a data set that I’ve been severely critical of in the past, primarily for the indolence of its updating, an indolence that has really reached a level of negligence.

In passing, while I was looking at Resolute data in connection with a question about a mis-step in temporarily losing some northern Canadian data while the Russian patch was being applied, I also noticed what appears to be a prior incident like the one that we just saw – only in reverse (and not a small error either, it was about 14 deg C). I’d also like to remind people that an identical incident with Finnish stations was reported here on Sep 23.

GHCN Non-Updating
Some critics have asserted that I’ve been unfair in criticizing GISS, rather than GHCN. I submit that I’ve been drawing attention to problems at GHCN for a long time now. And last year, we actually had an incident in which NASA GISS apologists made exactly the same fingerpointing excuses that they are making now – that problems were GHCN’s fault and NASA GISS was blameless. Here are some prior posts on the matter.

In May 2007, I invited readers to help the climate community locate certain cities that the climate science community seemed unable to locate. The data set to which the CRU FOI officer had directed us as CRU’s source remarkably referred to “stations with no location”. I thought that CA readers would be intrigued by the idea of “stations with no location” and asked:

Were these pirate weather stations that changed locations to avoid detection by NASA? Were they voices from beyond – perhaps evidence of unknown civilizations? And there were over 420 such stations out of just over 5000 in total. So there were not just a few strangers among us. A number of the mystery stations came from the mysterious civilization known as “Chile”, whose existence has long been suspected.

A few months later, I invited readers to help NASA find the lost city of Wellington NZ, where NASA and GHCN had been unable to obtain records since 1988. I wondered whether the city had been destroyed by Scythians or perhaps Assyrians. Fortunately, one of the survivors made contact with us – indeed, the survivor was employed by a national meteorological service and assured us that records had in fact been kept since contact had been lost.

On another occasion, we pondered why GHCN had been unable to locate Canadian data (for Dawson and many other stations) since 1989 – and why NASA GISS had stood idly by, while nothing was done for 20 years. I asked:

How hard can it be to locate Canadian data? Maybe it’s time to ask the people who do the Consumer Price Index to compile temperature statistics. It’s all just data – maybe professional data people would do a better job than the present people who seem to have trouble getting off their La-Z-Boys.

We visited the same problem in connection with GHCN’s failure to update results from Peru and Bolivia since 1989, while NASA GISS merrily went about adjusting the data trends without bothering to collect up-to-date information readily available on the internet (and even in GHCN-Daily data). In this case, there was a small existing controversy as NASA GISS apologist (and Hansen’s other pit bull, Tamino) asserted stridently (see comments passim) that two sites, Cobija and Rurrenabaque, did not have post-1988 data and then, amusingly, continued to assert this in the face of simple proof that data almost up to the minute could be located on the internet.

Tamino:

There’s no post-1988 data for Cobija or Rurrenabaque

After I showed the existence of post-1988 data, a poster at Tamino’s site asked:

OK now i’m confused. Is there or is there not temp data for Cobija and Rurrenabaque after 1988? (as posted over at CA) Not trying to take any side here just losing faith on what to believe.

Even after I’d produced post-1988 data (and given active links to modern data), Tamino persisted:

[Response: I downloaded both the raw and adjusted datasets from GISS, and there’s no data beyond April 1989. ]

One of his posters persisted:

Dear Tamino,
I know you insist that “[t]here’s no data from Cobija or Rurrenabaque”, But McIntyre has posted the post 1988 temperature data for Cobija and Rurrenabaque at Climate Audit today.
Why the discrepancy?

To which Tamino answered:

[Response: He didn’t get it from GHCN or from NASA. Does it include adjustments for station moves, time-of-observation, instrument changes? Does Anthony Watts have photographs?]

Actually this wasn’t even true. I’d been able to get data from GHCN-Daily. Another reader persisted, asking the questions already raised here, as to why NASA GISS:

1) stopp[ed] using data series in 1988 when a full series exists till today (documented on CA for Cobija, Rurrenabaque).
2) Classif[ied] stations as rural that are in fact urban (documented on CA for Yurimaguas, Moyobamba, Chachapoyas, Lambayeque, Tarapoto, Cajamarca, Tingo Maria) and adjusting them accordingly…

To which Tamino responded with the same fingerpointing argument recently used by Hansen’s other bulldog (Gavin):

[Response: … I asked you “in what way did GISS violate legitimate scientific method?” It appears that it’s not GISS but GHCN which left the post-1989 data out of the combined data supplied to GISS. Maybe there’s even a good reason. Clearly it was not GISS but GHCN which classified urban stations as rural. GISS was sufficiently dissatisfied with the classifications provided by GHCN to devise a whole new method and apply it to the U.S. Adjusting stations by comparing to other stations which have faulty population metadata is most certainly NOT a violation of legitimate scientific METHOD — it’s faulty metadata.

People who are not climate scientists typically have to scratch their heads a little when they see this sort of reasoning, which, as I just noted, is pretty much NASA’s present defence. The Other Dude Did It.

BTW NASA’s use of absurdly faulty population data from GHCN is an important issue in itself that we’ve discussed in the past. Because many of their “rural” locations outside the US are not “rural”, but medium-sized and rapidly-growing towns and even cities, their “adjustment” for UHI outside the U.S. is feckless and, in many cases, leads them to opposite adjustments in cities. This is a large topic in itself.

At their webpage, NOAA GHCN assures us that their quality control is “rigorous”.

Both historical and near-real-time GHCN data undergo rigorous quality assurance reviews.

This representation was endorsed in Hansen et al 1999 (with corresponding language in Hansen et al 2001) as follows:

The GHCN data have undergone extensive quality control, as described by Peterson et al. [1998c].

I guess if you don’t actually update the majority of your data, it reduces the work involved in quality control.

I refer to these past posts as evidence that problems at GHCN have been on our radar screen long before the present incident. Indeed, I hope that access to GHCN procedures will be a positive outcome of the present contretemps.

Finland
CA reader, Andy, commented here on Sept 23, 2008:

BTW, GISS temperature data for the finnish towns like Oulu, Kajaani, Kuusamo etc shows exactly the same temperatures for July and August 2008. First time ever seen!

which was confirmed by Jean S here.

Resolute NWT
Now as promised above, here’s evidence of a prior incident. Because there’s been an amusing mishap with northern Canadian values being absent from the NASA map on Monday, present on Wednesday and absent again on Friday, I took a look at the Canadian source data for Resolute NWT, obtaining monthly averages of -32.2 deg C in March 2008 and -18.5 deg C in April.

url="http://www.climate.weatheroffice.ec.gc.ca/climateData/bulkdata_e.html?timeframe=2&Prov=XX&StationID=1776&Year=2008&Month=10&Day=1&format=csv&type=dly"
test=read.csv(url,skip=23,header=TRUE)
round(tapply(test[,10],test$Month,mean,na.rm=T),2)

# 1 2 3 4 5 6 7 8 9 10 11 12
#-31.9 -35.3 -32.2 -18.5 -7.1 2.2 5.3 2.3 -5.0 -10.8 -20.0 NaN

I downloaded the most recent GHCN v2.mean data, unzipped it and looked at the 2008 values in the GHCN-Monthly data base. I bolded the March 2008 and April 2008 values, which are identical.

loc="d:/temp/v2.mean"
v2=readLines(loc); id <- substr(v2,1,11)
temp=(id=="40371924000"); N=sum(temp)
v2[temp][N]

# "4037192400052008 -317 -351 -322 -322  -73   21   53   23  -50 -108-9999-9999"

The April 2008 value is invalid, being nearly 14 deg C colder than the actual value. I guess an error of 14 deg C is insufficient to engage their “rigorous” quality control.
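For anyone who wants to reproduce the check, here is a short Python sketch (the working scripts above are in R) that parses a v2.mean record into deg C. The fixed-width layout assumed here – an 11-character station id plus a 1-character duplicate number, a 4-character year, then twelve 5-character monthly values in tenths of a degree with -9999 as missing – is my reading of the file, consistent with the record quoted above.

```python
def parse_v2_mean(line):
    """Parse one GHCN v2.mean record: 11-char id, 1-char duplicate,
    4-char year, then twelve 5-char monthly values in tenths of deg C."""
    station = line[0:11]
    year = int(line[12:16])
    vals = []
    for m in range(12):
        raw = int(line[16 + 5 * m : 21 + 5 * m])
        vals.append(None if raw == -9999 else raw / 10.0)
    return station, year, vals

# The Resolute record quoted above; March and April are both -32.2 deg C,
# although the daily data give an April mean of -18.5 deg C.
rec = "4037192400052008 -317 -351 -322 -322  -73   21   53   23  -50 -108-9999-9999"
station, year, vals = parse_v2_mean(rec)
print(station, year, vals[2], vals[3])  # 40371924000 2008 -32.2 -32.2
```

A comparison as simple as checking whether consecutive months are bit-for-bit identical would have flagged this record.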

Also, Jean S had already mentioned an identical incident with Finnish stations about a month before the most recent Hansen incident.

I don’t plan to spend time doing an inventory of incidents – surely NASA and NOAA have sufficient resources to do that. However, this one incident is sufficient to prove that the present incident is not isolated and that the same problem exists elsewhere in the system. I’m perplexed as to how the problem occurs in the first place, given that the error doesn’t occur in original data. I’m sure that we’ll find out in due course.

The bigger issue is, of course, why NOAA and NASA have been unable to update the majority of their network for 20 years.

CRUTEM and HadCRU October 2008

Released today on the promised schedule are CRUTEM3 and HadCRUT3 for October 2008. October 2008 was among the top 8 CRUTEM3 Octobers (0.517 deg C) and among the top 6 HadCRUT3 Octobers (0.440 deg C).

Because our collective eyes right now are fairly attuned to the colors of these grids and how changes in individual stations affect the contours, it’s interesting to take a look at these new contoured maps and compare them both to each other and to the GISS contour map.

First the CRUTEM3 (top), HadCRUT3 (middle), GISS (Nov 13 version – 250 km smoothing rather than 1200 km of the frontpage GISS image).

First a small point. Living in Toronto, I often look first at how the maps represent Toronto since I know what the weather’s been like here. For the most part the land portion of the HadCRUT3 map is identical to the CRUTEM3 map, but not where I live. In CRUTEM world, we experienced a colder than average October (which is how it felt on the ground here), while in HadCRU world we experienced a warmer than average October. One possible and even likely explanation is that HadCRU includes temperatures from the Great Lakes (which weren’t warm for swimming this year.) I didn’t go swimming at our place on Lake Ontario once this year. But maybe October water was less chilly than usual.

More points after you look at the graphics.



A second small point in the light of our watching the ball as stations got added to the GISS map of the Canadian Arctic. The land station contributions to HadCRUT3 look like a slightly later GHCN version than CRUTEM3. HadCRUT3 has a gridcell a bit to the southwest of Ellesmere Island (presumably Resolute) that got added after Nov 7, according to Gavin Schmidt. This gridcell is absent from the CRUTEM3 version. The Nov 13 GISS version lacks both Resolute and Alert, presumably because of an oversight as they put patches on patches.

A third point – the global average in the GISS 250 km version is 0.78 deg C, while it’s 0.61 deg C in the 1200 km version.

The CRUTEM3 version looks a lot like the GISS version. Although these compilations are often described as “independent”, recent events have clarified (if clarification were needed) that these compilations are not “independent”. Both rely almost exclusively on GHCN – GISS adds a few series around the edges.

The reverse engineering of CRUTEM3 looks almost pathetically easy given that we’ve already waded through step 0 of GISS, where they collate different GHCN versions (dset0) into a single station history (dset1.) CRU doesn’t have the bewildering sequence of smoothing operations that Hansen uses at multiple stages (though Hansen, mercifully, doesn’t use Mannian butterworth smoothing).

To my knowledge, unlike GISS, CRU does not make the slightest attempt to adjust for UHI, relying instead on articles like Jones et al 1990 purporting to show that UHI doesn’t “matter”.

We can already emulate GISS step0 – not that it makes any sense, but it provides a benchmark. Here’s all that seems to be necessary to produce a gridded CRUTEM3 series given a dset1 data set. First, create an anomaly-version of the series. I have a simple function anom on hand and this could be done as follows:

dset1.anom=apply(dset1,2,anom)

Then one could make an average of dset1 series within gridcell i as follows, where info is an information dataset in my usual style containing for each station, inter alia, its lat, long and gridcell number (called “cell” here):

for (i in 1:2592) grid[,i]=apply(dset1.anom[,info$cell==i],1,mean,na.rm=T)

This would yield the CRUTEM3 series. My guess as to why they don’t want to show their work is because they probably use hundreds of lines of bad Fortran code to do something that you can do in a couple of lines in a modern language. Anyway, I’ll experiment with this at some point, but this is my hypothesis on all that’s required to emulate CRUTEM3. CRU has been funded by the US DOE; if, like GISS, they are doing nothing other than trivial sums on GHCN data, one feels that the money would be better spent on beefing up QC and data collection at GHCN.
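For what it’s worth, the two R lines above can be sketched end-to-end in Python as follows. The anomaly baseline (each station’s full-record mean, rather than 1961-90 normals) and the 0-based 5x5 deg cell numbering are my own simplifying assumptions, not confirmed details of the CRU recipe.

```python
import numpy as np

def anom(x):
    """Station anomaly: subtract the station's own mean.
    (CRU would use 1961-90 normals; this sketch uses the full-record mean.)"""
    return x - np.nanmean(x)

def gridcell(lat, lon):
    """0-based index of the 5x5 deg cell (72 x 36 = 2592 cells)."""
    i = int((lon + 180) // 5)   # longitude band, 0..71
    j = int((lat + 90) // 5)    # latitude band, 0..35
    return j * 72 + i

def grid_average(dset1, cells, ncell=2592):
    """Average station anomalies within each gridcell, ignoring missing values.
    dset1 is a months x stations array; cells gives each station's cell index."""
    dset1_anom = np.apply_along_axis(anom, 0, dset1)
    grid = np.full((dset1.shape[0], ncell), np.nan)
    for c in range(ncell):
        cols = dset1_anom[:, cells == c]
        if cols.shape[1] > 0:
            grid[:, c] = np.nanmean(cols, axis=1)
    return grid
```

If this hypothesis is right, that is the entirety of the arithmetic; everything else in CRUTEM3 is bookkeeping.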

I downloaded CRUTEM3.nc (today’s version) and checked for gridcells with October anomalies of 5 deg C or higher and then checked to see what stations were in those gridcells. I obtained the following list, all but one in Siberia, the other one being Barrow, Alaska (where there is an extraordinary contrast between nearby stations that deserves comment.)

1710 22223724000 NJAKSIMVOL’ 62.43 60.87
1716 22223921000 IVDEL’ 60.68 60.45
1729 22224817000 ERBOGACEN 61.27 108.02
1720 22224125000 OLENEK 68.50 112.43
1723 22224329000 SELAGONCY 66.25 114.28
1721 22224143000 DZARDZAN 68.73 124.0
1724 22224343000 ZHIGANSK 66.77 123.4
1686 22220069000 OSTROV VIZE 79.5 76.98
1688 22220292000 GMO IM.E.K. F 77.72 104.3
1693 22221432000 OSTROV KOTEL’ 76 137.87
1685 22220046000 GMO IM.E.T. 80.62 58.05
3361 42570026000 BARROW/W. POS 71.3 -156.78

It looks like CRU has been paying attention to the GHCN commotion and has avoided using one of the problem versions. It is however interesting that some of the above stations (Olenek, Erbogacen etc) were problem stations and one would hope that one of the data distributors has actually checked these stations against original daily versions.

Should the Credibility Crunch Move to NOAA?

Some RC commenters are bemoaning the criticism that GISS is currently weathering. For example, Tamino:

Although the error originated NOT with GISS but before numbers even got through their door, I’ve heard no cries for heads to roll at NOAA or NWS — just vicious attacks on GISS.

As so often, Tamino simply makes stories up out of whole cloth. While I do not accept NASA’s blaming of their supplier as an excuse, I’ve clearly stated my hope that the present controversy will spark a long overdue assessment of the many problems at GHCN, some of which have been covered at Climate Audit from time to time. In my most recent post, I observed:

Perhaps this may give a long overdue impetus for a proper examination of GHCN’s failure to properly update readily available station data.

In my opinion, much of the criticism that GISS is presently receiving is because of Gavin Schmidt’s rancorous and ineffective public relations campaign at realclimate on behalf of Hansen, where he’s spent more time making baseless attacks of his own, than in simply chinning up, taking his medicine and properly acknowledging both Watts Up and Climate Audit. Pit bull tactics are seldom effective in circumstances like the present.

We’ve also been criticized for not being “constructive”. Well, eliminating errors is constructive in my opinion. I’d also like to report that over a year ago, I wrote to GHCN asking for a copy of their adjustment code:

I’m interested in experimenting with your Station History Adjustment algorithm and would like to ensure that I can replicate an actual case before thinking about the interesting statistical issues.  Methodological descriptions in academic articles are usually very time-consuming to try to replicate, if indeed they can be replicated at all. Usually it’s a lot faster to look at source code in order to clarify the many little decisions that need to be made in this sort of enterprise. In econometrics, it’s standard practice to archive code at the time of publication of an article – a practice that I’ve (by and large unsuccessfully) tried to encourage in climate science, but which may interest you. Would it be possible to send me the code for the existing and the forthcoming Station History adjustments. I’m interested in both USHCN and GHCN if possible.

To which I received the following reply from a GHCN employee:

You make an interesting point about archiving code, and you might be encouraged to hear that Configuration Management is an increasingly high priority here. Regarding your request — I’m not in a position to distribute any of the code because I have not personally written any homogeneity adjustment software. I also don’t know if there are any “rules” about distributing code, simply because it’s never come up with me before.

I never did receive any code from them. However an examination of GHCN code is clearly warranted and, if NOAA’s resources are as constrained as GISS’ are said to be, one would think that they would unreservedly facilitate third party examination of their code.

Verhojansk
Now there are many puzzles in GHCN adjustments, to say the least, and these adjustments are inhaled into GISS. Verhojansk is in the heart of the Siberian “hot spot”, presently a balmy minus 22 deg C. The graphics below compare, on the left, GISS dset0 in the most recent scribal version to GISS dset2 (showing identity other than small discrepancies at the start of the segment); the right compares GISS dset0 to the GHCN-Daily average.

Over the past 20 years, the GISS version (presumably obtained from GHCN monthly) has risen 1.7 deg C (!) relative to the average taken from GHCN Daily results.


Left – GISS dset2 minus GISS dset0; right – GISS minus GHCN Daily

What causes this? I have no idea. It would be well worth finding out. The last GISS fiasco led to GISS code becoming available. Maybe the present fiasco will lead to GHCN code becoming available.

Memo to Gavin Schmidt

From: He Whose Name You are Not Allowed to Utter

Gavin, you said:

[Response: There are still (at least) four stations that have Oct data in place of september data but that didn’t report september data (Kirensk, Irkutsk, Bratsk, Erbogacen). I expect that the SEP=OCT check that NOAA did, just didn’t catch these. Still, this is embarassing – but will be fixed today. Nobody is ‘indifferent’. – gavin]

As you said elsewhere:

Why anyone would automatically assume something nefarious was going on without even looking at the numbers is a mystery to me.

Why would you assume that Erbogacen, Kirensk, Bratsk and Irkutsk did not report September data? I hope that you didn’t do so “without looking at the numbers”. Just to make things easy for you, here is a script that will download daily GHCN data for you.

download.file("http://data.climateaudit.org/data/giss/giss.info.tab","temp.dat",mode="wb"); load("temp.dat")
source("http://data.climateaudit.org/scripts/station/collation.functions.txt")

id1=c(22224817000, 22230230000, 22230309000, 22230710000)
giss.info[!is.na(match(giss.info$id,id1)),c("id","site")]

# id site
#1729 22224817000 ERBOGACEN
#1777 22230230000 KIRENSK
#1779 22230309000 BRATSK
#1791 22230710000 IRKUTSK

site1=giss.info$site[!is.na(match(giss.info$id,id1))]
id0=paste("RS0000",substr(id1,4,8),sep="")
chron=NULL
for(i in 1:4) chron=ts.union(chron, read.ghcnd(id0[i]))
dimnames(chron)[[2]]=site1
chron

As you see, Erbogacen and Kirensk both reported September data to GHCN (daily). So your statement that these stations didn’t “report” September data is incorrect. Exactly why the September daily data didn’t get incorporated into the GHCN monthly data is one of many climate science mysteries on which we would welcome enlightenment.

The GHCN daily file seems to have lost track of Bratsk and Irkutsk after the year 2000, though GHCN monthly has, for the most part, kept track of these two stations. Obviously Irkutsk and Bratsk both “reported” September data. They are both reasonably large cities whose temperatures can be located on the Internet (Irkutsk, Bratsk). Exactly why NASA GISS (and NOAA GHCN) were unable to locate this readily available data is a mystery that has puzzled us at Climate Audit for a long time, and perhaps you can enlighten us on why this task has seemingly baffled the NASA “professionals”.

In the meantime, perhaps you should withdraw your claim that these stations failed to “report” September data and replace this with a more accurate statement, saying that, for reasons that you (and many others) do not understand, this data was not incorporated in the GHCN monthly file.

Perhaps this may give a long overdue impetus for a proper examination of GHCN’s failure to properly update readily available station data.

Watch the Ball

NASA spokesman Gavin Schmidt announced at realclimate that Hansen et al had fixed the Russian (and other) data, following corrections to the data made by their supplier (NOAA GHCN). Even though errors of over 10 deg C had occurred over the world’s largest land mass, it only reduced GISS October temperature by 0.21 deg C (from 0.86 deg C to 0.65 deg C) and still left a large “hot spot” over Siberia. Schmidt reported at realclimate (which increasingly seems to be NASA’s method of communicating with the public):

The corrected data is up. Met station index = 0.68, Land-ocean index = 0.58, details here. Turns out Siberia was quite warm last month.

Actual temperatures in much of the lurid “hot spot” will average a balmy minus 40 deg C and lower over most of the next few months. Olenek or Verhojansk sound like ideal venues for large-scale gatherings of climate scientists.

The GISS website states that changes were made to incorporate corrected GHCN files (the new file timestamped 12.58 pm today):

2008-11-12: It seems that one of the sources sent September data rather than October data. Corrected GHCN files were created by NOAA. Due to network maintenance, we were only able to download our basic file late today. We redid the analysis – thanks to the many people who noticed and informed us of that problem.

Now look closely at the two figures below and see what else you notice.

   

Left: NASA GISS as of Nov 10, 2008; right – as of Nov 12, 2008.

All of a sudden, a “hot spot” has developed over the Canadian Arctic Islands and the Arctic Ocean north of North America, that wasn’t there on Monday (it was gray on Monday). A smaller hot spot also developed over Australia.

I had downloaded the GHCN file on Monday (and saved it). I downloaded the GHCN file once again and checked for stations that had October values today, but not on Monday. All but two were in Australia with the other two also in the SH.

I haven’t crosschecked the Australian data but at least there’s some new data to support this part of the change. There was no new information from GHCN on the Canadian Arctic Islands. So what accounted for the sudden hot spot in the Canadian Arctic Islands??

Why can Hansen obtain values for October in the Canadian Arctic Islands today when he couldn’t on Monday?

Maybe NASA spokesman Schmidt can explain exactly how.

Update Nov 13, 11.30 am: NASA spokesman Gavin Schmidt, complying with Hansen’s policy not to mention my name, provided the following answer on how Hansen “fixed” the problem:

However, between last friday (when GISTEMP downloaded the first GHCN data) and today (thursday), stations in Australia and northern Canada were reported. People claiming on other websites that Oct data for Resolute, Cambridge Bay and Eureka NWT are not in the latest download, should really check their files again. (To make it easy the station numbers are 403719240005, 403719250005 and 403719170006). Why anyone would automatically assume something nefarious was going on without even looking at the numbers is a mystery to me. None of these people have any biases of their own of course.

If data for the three Canadian sites were added between Friday, Nov 7 and Monday, that would explain matters. Gavin’s accusation that the above question was asked “without even looking at the numbers” is another fabrication by a NASA employee. I reported that I had compared stations in the GHCN download before the issue was public and after it was public. The Canadian data was in both data sets. According to NASA spokesman Schmidt, it appears that NASA used an even earlier version of the GHCN data. That answers the question.

It is also reasonable to inquire as to whether changes in methodology had occurred. In September 2007, after the Y2K problem, Hansen changed their methodology without any notice or announcement with the effect of reversing once again the order of 1934 and 1998. Determining that they had changed data sources from SHAP to FILNET wasn’t easy and a change in sources or method could hardly be precluded in this instance, though NASA has now said that this was not the case.

Gavin Schmidt: "The processing algorithm worked fine."

In the last few days, NASA has been forced to withdraw erroneous October temperature data. The NASA GISS site is down, but NASA spokesman Gavin Schmidt said at their blog outlet that “The processing algorithm worked fine.”

Schmidt blamed the failure on defects in a product from a NASA supplier and expressed irritation that NASA should bear any responsibility for defects attributable to a supplier:

I’m finding this continued tone of mock outrage a little tiresome. The errors are in the file ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/v2/v2.mean.Z, not in the GISTEMP code (and by the way, the GISTEMP effort has nothing to do with me personally). The processing algorithm worked fine.

Although NASA blamed the error on their supplier (GHCN), in previous publications by Hansen et al, NASA had asserted that their supplier carried out “extensive quality control”:

The GHCN data have undergone extensive quality control, as described by Peterson et al. [1998c].

and that NASA (GISS) carried out their own quality control and verification of near real-time data:

Our analysis programs that ingest GHCN data include data quality checks that were developed for our earlier analysis of MCDW data. Retention of our own quality control checks is useful to guard against inadvertent errors in data transfer and processing, verification of any added near-real-time data, and testing of that portion of the GHCN data (specifically the United States Historical Climatology Network data) that was not screened by Peterson et al. [1998c].

Schmidt said that no one at NASA was employed on a full-time basis to carry out quality control for the widely used GISS temperature estimates:

Current staffing from the GISTEMP analysis is about 0.25 FTE on an annualised basis (i’d estimate – it is not a specifically funded GISS activity).

Schmidt said that independent quality control would require a budget increase of about $500,000. NASA supporters called on critics to send personal checks to NASA to help them improve their quality.

At Verhojansk station, which I selected at random from the problem Russian stations, average October 2008 temperature was reported by NASA as 0.0 degrees. This was nearly 8 deg C higher than the previous October record (-7.9 deg). Contrary to the NASA spokesman’s claims, their quality control algorithm did not work “fine”.

What is more worrying is that no one seems to be minding the store. Schmidt says that the entire effort only takes about 1/4 of a man-year annually. (They are pretty busy at conferences, I guess.) CA readers know that the GISTEMP program is a complete mess and needs to be re-written from scratch. Schmidt seems not to even want to bother doing the work at NASA, saying that he’d prefer to hire ice sheet modelers and cloud parameterizers. He called on NOAA to do the job properly:

Those jobs are better done at NOAA who have a specific mandate from Congress to do these things.

On this point, I agree with NASA spokesman Schmidt. If NASA is not going to do the job properly, then it shouldn’t do the job at all. NASA should not be depending on the kindness of strangers for their quality control. Ross McKitrick has long observed that the collection of temperature data is a job sort of like making a Consumer Price Index and it should be done by professionals of the same sort. It doesn’t make any sense for people like James Hansen and Phil Jones to be trying to do this on a part-time basis. As long as it’s being done on such a haphazard basis, there’s really no way to prevent incidents like this one (or last year’s “Y2K” problem.)

Did Napoleon Use Hansen's Temperature Data?

It’s colder in Russia in October than in September, as Napoleon found out to his cost in 1812.

Sitting in the ashes of a ruined city without having received the Russian capitulation, and facing a Russian maneuver forcing him out of Moscow, Napoleon started his long retreat by the middle of October.

Flash forward almost 200 years later. NASA has just reported record warmth in October throughout Russia, with many sites experiencing similar temperatures in October as in September – perhaps the sort of situation that Napoleon had hoped for (not similar as anomalies, but similar in actual temperatures in deg C.)

Actually, many stations didn’t just experience similar absolute monthly temperatures. Many stations had exactly the same monthly temperatures in October as in September. Here are the last three years for the Russian station, Olenek, showing NASA GISS monthly temperatures (in deg C) bolding Sept and Oct 2008. October 2007 had an average temperature of -9 deg C, as compared to 3.1 deg C in Sept 2007. October 2008 had the identical temperature as September 2008.

YEAR JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC
2006 -34.0 -29.9 -23.5 -18.1 1.6 10.6 16.9 11.5 4.4 -14.6 -27.7 -29.1
2007 -27.9 -41.5 -21.6 -4.0 0.1 12.4 13.5 11.3 3.1 -9.0 -24.8 999.9
2008 -30.0 -29.4 -19.6 -13.4 1.3 12.0 13.1 12.1 3.1 3.1 999.9 999.9

This exact match of October 2008 to September 2008 was repeated at many other Russian stations. A CA reader notified me of this phenomenon earlier today and I’ve confirmed for myself that the information is accurate. Based on what he described as a “cursory” look, he sent me the following list of currently “updated” stations that exactly replicate the Sept data: Almaty, Omsk, Salehard, Semipalatinsk, Turuhansk, Tobol’sk, Verhojansk, Viljujsk, Vilnius, Vologda. I can add Hatanga, Suntora, GMO ImEKF. Not all stations were affected – Dzerszan, Ostrov Kotal, Jakutsk and Cokurdah appear to have correct results.
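A scan for this failure mode takes only a few lines. The Python sketch below assumes the same v2.mean fixed-width layout as the Resolute record earlier in this post (16-character header with the year in columns 13-16, then twelve 5-character monthly values in tenths of a degree, -9999 missing); the Olenek record is reconstructed from the table above, and the duplicate-number digit is my guess.

```python
def month_value(line, m):
    """Raw monthly value (tenths of deg C) for month m (0-based) of a v2.mean record."""
    return int(line[16 + 5 * m : 21 + 5 * m])

def sept_equals_oct(line):
    """True if a record has valid, identical September and October values."""
    sep, octv = month_value(line, 8), month_value(line, 9)
    return sep != -9999 and sep == octv

def flag_duplicates(lines, year=2008):
    """Station ids whose September and October values are identical for `year`."""
    return [line[0:11] for line in lines
            if line[12:16] == str(year) and sept_equals_oct(line)]

# Olenek's 2008 record, reconstructed from the table above (duplicate digit assumed):
olenek = "2222412500002008 -300 -294 -196 -134   13  120  131  121   31   31-9999-9999"
print(flag_duplicates([olenek]))  # ['22224125000']
```

Running such a scan over the whole v2.mean file would have produced the inventory of affected stations in seconds – which is rather the point about “rigorous” quality control.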

Let’s consider the opposite situation. Suppose that March temperatures had been inadvertently carried forward into April, yielding a massive cold anomaly in Russia. One feels that Hansen would have been all over the opposite error like a dog on a bone – he would have been his own bulldog.

In any event, we here at Climate Audit are always eager to assist NASA. On earlier occasions, we helped identify the lost city of Wellington, New Zealand, where NASA has been unable to locate climate records for nearly 30 years.

Today, we are able to provide NASA with up-to-date weather reports confirming that October in Russia is colder than September. Verhojansk temperatures are conveniently online here and temperatures are currently a nippy -18 deg C.

You’re welcome, Jim.

Santer Refuses Data Request

On Oct 20, 2008, I sent Santer the following request:

Dear Dr Santer,

Could you please provide me either with the monthly model data (49 series) used for statistical analysis in Santer et al 2008 or a link to a URL. I understand that your version has been collated from PCMDI ; my interest is in a file of the data as you used it (I presume that the monthly data used for statistics is about 1-2 MB). Thank you for your attention, Steve McIntyre

I received an automated response saying that Santer was at a workshop (Chapman Water Vapor) and would be returning on Oct 30. On Nov 7, 2008, not having received a reply, I sent a reminder to Santer:

Could you please reply to the request below, Regards, Steve McIntyre

I once again received an automated reply, this time that he was at a meeting of the Hadley Centre Science Review Group in Exeter and that he would be back in his office on Nov. 10th.

I received the following discourteous reply today:

Dear Mr. McIntyre,

I gather that your intent is to “audit” the findings of our recently-published paper in the International Journal of Climatology (IJoC). You are of course free to do so. I note that both the gridded model and observational datasets used in our IJoC paper are freely available to researchers. You should have no problem in accessing exactly the same model and observational datasets that we employed. You will need to do a little work in order to calculate synthetic Microwave Sounding Unit (MSU) temperatures from climate model atmospheric temperature information. This should not pose any difficulties for you. Algorithms for calculating synthetic MSU temperatures have been published by ourselves and others in the peer-reviewed literature. You will also need to calculate spatially-averaged temperature changes from the gridded model and observational data. Again, that should not be too taxing.

In summary, you have access to all the raw information that you require in order to determine whether the conclusions reached in our IJoC paper are sound or unsound. I see no reason why I should do your work for you, and provide you with derived quantities (zonal means, synthetic MSU temperatures, etc.) which you can easily compute yourself.

I am copying this email to all co-authors of the 2008 Santer et al. IJoC paper, as well as to Professor Glenn McGregor at IJoC.

I gather that you have appointed yourself as an independent arbiter of the appropriate use of statistical tools in climate research. Rather than “auditing” our paper, you should be directing your attention to the 2007 IJoC paper published by David Douglass et al., which contains an egregious statistical error.

Please do not communicate with me in the future.
Ben Santer

I sent the following FOI request today to NOAA:

National Oceanic and Atmospheric Administration
Public Reference Facility (OFA56)
Attn: NOAA FOIA Officer
1315 East West Highway (SSMC3)
Room 10730
Silver Spring, Maryland 20910

Re: Freedom of Information Act Request

Dear NOAA FOIA Officer:
This is a request under the Freedom of Information Act.

Santer et al, Consistency of modelled and observed temperature trends in the tropical troposphere, (Int J Climatology, 2008), of which NOAA employees J. R. Lanzante, S. Solomon, M. Free and T. R. Karl were co-authors, reported on a statistical analysis of the output of 47 runs of climate models that had been collated into monthly time series by Benjamin Santer and associates.

I request that a copy of the following NOAA records be provided to me: (1) any monthly time series of output from any of the 47 climate models sent by Santer and/or other coauthors of Santer et al 2008 to NOAA employees between 2006 and October 2008; (2) any correspondence concerning these monthly time series between Santer and/or other coauthors of Santer et al 2008 and NOAA employees between 2006 and October 2008.

The primary sources for NOAA records are J. R. Lanzante, S. Solomon, M. Free and T. R. Karl.

In order to help determine my status for purposes of determining the applicability of any fees, you should know that I have 5 peer-reviewed publications on paleoclimate; that I was a reviewer for WG1; and that I made an invited presentation in 2006 to the National Research Council Panel on Surface Temperature Reconstructions and two presentations to the Oversight and Investigations Subcommittee of the House Energy and Commerce Committee.

In addition, a previous FOI request was discussed by the NOAA Science Advisory Board’s Data Archiving and Access Requirements Working Group (DAARWG). http://www.joss.ucar.edu/daarwg/may07/presentations/KarL_DAARWG_NOAAArchivepolify-v0514.pdf.

I believe a fee waiver is appropriate since the purpose of the request is academic research, the information exists in digital format and the information should be easily located by the primary sources.

I also include a telephone number (___) at which I can be contacted between 9 am and 7 pm Eastern Daylight Time, if necessary, to discuss any aspect of my request.

Thank you for your consideration of this request.

I ask that the FOI request be processed promptly as NOAA failed to send me a response to the FOI request referred to above, for which Dr Karl apologized as follows: “due to a miscommunication between our office and our headquarters, the response was not submitted to you. I deeply apologize for this oversight, and we have taken measures to ensure this does not happen in the future.”

Stephen McIntyre

I sent the following letter to the editor of the journal:

Dear Dr McGregor,

I am writing to you in your capacity as editor of International Journal of Climatology.

Recently Santer et al published a statistical analysis of monthly time series that they had collated from 47 climate models. I recently requested a copy of this data from Dr Santer and received the attached discourteous refusal.

I was unable to locate any information on the data policies of your journal and would appreciate a copy of such policies.

The form of my request was well within the scope of the data policies of most academic journals and I presume that this is also the case in respect to the policies of your journal. If this is the case, I would also appreciate it if you required the authors to provide the requested collation in the form used for their statistical analysis. While the authors argue that the monthly series could be collated from PCMDI data, my interest lies with the statistical properties of the time series, rather than with the collation of the data.

Regards, Steve McIntyre

A New Caramilk Secret

Jean S, UC and I spent a considerable amount of time a couple of years ago trying to figure out how MBH99 confidence intervals were calculated – see here. I asked the NAS panel to investigate the matter, but they failed to do so. After their report, I asked NAS president Cicerone to merely write to Mann asking him to provide an explanation – Cicerone refused. I asked Gerry North to do so; North agreed, but I never heard anything more about it. So either North didn’t follow through, or Mann refused even a request from Gerry North.

North was supposedly one of the reviewers for Mann et al 2008, which also refers to confidence intervals. One would have hoped that North, already on notice about this issue, would have ensured that Mann et al 2008 clearly explained their calculation of confidence intervals. No such luck.

Figures 2, 3 and S5 all illustrate “95% confidence intervals” by “lightly shaded regions of similar color”. Mann et al observe: “For the CPS (EIV) reconstructions, the instrumental warmth breaches the upper 95% confidence limits of the reconstructions beginning with the decade centered at 1997 (2001)” and in the caption to Figure 3 state “Confidence intervals have been reduced to account for smoothing.”

The “Methods” state:

Uncertainties were estimated from the residual decadal variance during the validation period based (32, 42) on the average validation period r2 (which in this context has the useful property that, unlike RE and CE, it is bounded by 0 and 1 and can therefore be used to define a “fraction” of unresolved variance).

That’s it. There is no further explanation. The SI contains a column entitled “uncertainty” which changes in century-steps. I’ve examined the source code and, once again, it appears to be incomplete. I’ve been unable to locate any code evidencing the calculation of the confidence intervals.
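For what it’s worth, the Methods wording does suggest one possible construction: treat 1 − r² (validation) as the fraction of unresolved variance, scale the decadal variance by it, and take ±2σ as the 95% interval. The sketch below is strictly my conjecture about that reading – it is not code from Mann et al, and the numbers are made up for illustration:

```python
import math


def ci_halfwidth(r2_validation, decadal_variance):
    """Half-width of a 95% CI under one conjectural reading of the
    Methods: unresolved variance = (1 - r2) * decadal variance,
    and the interval is +/- 2 sigma. This reconstructs an
    undocumented calculation and may not match the paper's."""
    unresolved = (1.0 - r2_validation) * decadal_variance
    return 2.0 * math.sqrt(unresolved)


# Made-up illustration: validation r2 = 0.51, decadal variance 0.04 K^2.
hw = ci_halfwidth(0.51, 0.04)
print(round(hw, 3))  # half-width in deg C
```

If something like this were the actual procedure, it would be trivial to state in a sentence or two of the SI – which makes its absence all the more puzzling.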

Another Caramilk secret. I’ll experiment a bit with some of the endless Butterworth smooths and see if anything turns up.

The Rain in Spain

Erroneous geographical locations of precipitation proxies have been a source of mild amusement to CA readers. Long ago, in connection with MBH, we observed that the rain in Maine fell mainly in the Seine – an error that Mann stubbornly refused to correct in Mann et al 2007 – not that any reviewer cared.

A few weeks ago, we observed with mild amusement that, as Eliza Doolittle knew, the rain in Spain fell mainly in the plain – except that in Mann-world, it fell in the plains of Kenya.

Shortly afterwards, Mann corrected the error without acknowledging the source – even though proper acknowledgment of sources is mandatory under Penn State codes of conduct.

I reported the error more as a matter of amusement, largely because of the prior history with the rain in Maine. But here’s something a little more substantial: Mann corrected the error and re-did his calculations. Below is a figure showing the impact of correcting the location of this one proxy on his SH reconstruction – a difference of over half a degree in the 18th century.

From http://www.meteo.psu.edu/~mann/supplements/MultiproxyMeans07/corrections/corrected_eiv_shshfulCRU.pdf

We are sometimes told that the various errors don’t “matter”. But here’s a case where merely changing the location of one proxy from Kenya to its correct location in Spain alters the SH estimate by about 0.5 deg C for an entire century. Oh yes – at the NAS panel hearings, Mann said that he knew the AD1000 temperature to within 0.2 deg C.