Since August 1, 2007, NASA has had 3 substantially different online versions of their USHCN station data (1221 stations in total). The third and most recent version was slipped in without any announcement or notice in the last few days – subsequent to their code being placed online on Sept 7, 2007. (I can vouch for this as I completed a scrape of the dset=1 dataset in the early afternoon of Sept 7.)
We’ve been following the progress of the Detroit Lakes MN station and it’s instructive to follow the ups and downs of its history through these spasms. One is used to unpredictability in futures markets (I worked in the copper business in the 1970s and learned their vagaries first hand). But it’s quite unexpected to see similar volatility in the temperature “pasts”.
For example, the Oct 1931 value (GISS dset0 and dset1 – both are equal) for Detroit Lakes began August 2007 at 8.2 deg C; there was a short bull market in August with an increase to 9.1 deg C for a few weeks, but its value was hit by the September bear market and is now only 8.5 deg C. The Nov 1931 temperature went up by 0.8 deg (from -0.9 deg C to -0.1 deg C) in the August bull market, but went back down the full amount of 0.8 deg in the September bear market. December 1931 went up a full 1.0 deg C in the August bull market (from -7.6 deg C to -6.6 deg C) and has held onto its gains much better in the September bear market, falling back only 0.1 deg C to -6.7 deg C.
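For readers who want to check the arithmetic, here is a minimal Python sketch that tabulates the three snapshots quoted above and the change in each "market". The numbers are the ones in the paragraph; the September value for Nov 1931 (-0.9 deg C) is inferred from the statement that it gave back the full 0.8 deg. The dictionary layout and the function name `changes` are just illustrative choices.

```python
# Detroit Lakes MN, 1931 monthly values (deg C) as quoted in the post, in three
# snapshots: start of August 2007, after the August patch, after the September change.
values = {
    "Oct 1931": (8.2, 9.1, 8.5),
    "Nov 1931": (-0.9, -0.1, -0.9),  # September value inferred from "back down the full 0.8 deg"
    "Dec 1931": (-7.6, -6.6, -6.7),
}


def changes(before, august, september):
    """Return (August change, September change, net change), rounded to 0.1 deg C."""
    rise = round(august - before, 1)
    fall = round(september - august, 1)
    net = round(september - before, 1)
    return rise, fall, net


summary = {month: changes(*snapshots) for month, snapshots in values.items()}
for month, (rise, fall, net) in summary.items():
    print(f"{month}: August {rise:+.1f}, September {fall:+.1f}, net {net:+.1f}")
```

On these figures, October keeps 0.3 deg of its August gain, November keeps none and December keeps 0.9 deg.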
All records of the August bull market in Detroit Lakes pasts have been erased from the NASA website, but I managed to complete my downloads in time and am in a position to try to decode exactly what’s been going on.
First, here is a graphic showing the changes to Detroit Lakes MN in the August “bull market” as NASA moved quickly to correct the “Y2K” error that I had drawn their attention to. Their patch was essentially a step adjustment at 2000, which had the effect of increasing all earlier values by about 0.8 deg C.
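To make the idea of a step adjustment concrete, here is a rough sketch of what such a patch amounts to for this station. This is not NASA's code. The function name `apply_step`, the `(year, month)` keying and the 2001 sample value are all hypothetical, and the 0.8 deg C offset is only the approximate figure mentioned above.

```python
def apply_step(series, break_year=2000, offset=0.8):
    """Add `offset` (deg C) to every value dated before `break_year`.

    `series` maps (year, month) -> monthly mean temperature in deg C.
    Values from `break_year` onward are left untouched.
    """
    return {
        (year, month): round(temp + offset, 1) if year < break_year else temp
        for (year, month), temp in series.items()
    }


# Nov 1931 is the value quoted in the post; the 2001 value is invented for illustration.
sample = {(1931, 11): -0.9, (2001, 11): 1.5}
patched = apply_step(sample)
print(patched)
```

Applied to the Nov 1931 value of -0.9 deg C, an offset of 0.8 gives -0.1 deg C, which matches the August change described earlier, while values from 2000 onward are unchanged.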
Second, here is a similar graphic showing the changes to Detroit Lakes MN in the September “bear market”. As you can see, Hansen has clawed back most of the gains of the 1930s relative to recent years – perhaps leading eventually to a re-discovery of 1998 as the warmest U.S. year of the 20th century.
Aside from other issues – which we shall get to – we have two crossword puzzles here: where did the data come from? In fact, the precise provenance of the NASA USHCN data has been raised in recent posts – most recently here where I posited the use of a vintage USHCN data set. One of the nice things about the climateaudit group is that readers often have good answers. Jerry Brennan suggested that the vintage data at ftp://ftp.ncdc.noaa.gov/pub/data/ushcn/OtherIntermediates/ be consulted. There were two potentially relevant files here: hcn_shap_ac_mean.Z and hcn_mmts_mean_data.Z. I examined these files for Detroit Lakes and compared them to the three NASA versions online over the summer and am pretty much able to trace the machinations back to their source.
1) the vintage USHCN data set hcn_shap_ac_mean.Z, as Jerry Brennan surmised, was almost certainly used in the pre-Y2K version and the Y2K-adjusted version;
2) the September bear market at Detroit Lakes MN was precipitated by an unannounced switch to the USHCN data set hcn_doe_mean.Z.
Here are some detailed comparisons. First here is a comparison of the NASA “pre-Y2K” version against the vintage SHAP_AC version. You can see the step at 2000; values – if they were available for plotting – would continue at the upper step. There are slight monthly differences relating to some NASA procedure doing monthly adjustments, but this graphic shows to a moral certainty that the SHAP_AC version was in use pre-Y2K.
The next figure compares the NASA Sept 7 version to the vintage SHAP_AC version. The monthly perturbation introduced by NASA has increased but the two versions are obviously connected – and you can see that the Y2K patch has eliminated the step from using inconsistent versions.
However, the changes introduced in the September bear market changed the relationship as shown below. So what is the provenance of the new data?
It’s not the vintage MMTS version, a comparison to which is shown below:
But it looks like the current USHCN “adjusted” version plus the step adjustment (which isn’t needed for this series – something that I observed earlier).
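The comparisons above were done graphically. One way to put the same test into numbers is to difference a NASA version against each candidate USHCN file and see which difference is closest to a constant, since a pure step or offset leaves a flat difference series. The sketch below uses invented numbers and hypothetical names (`offset_and_spread`, `best_match`, `candidate_a`, `candidate_b`). It is not the procedure actually used for the figures.

```python
from statistics import mean, pstdev


def offset_and_spread(nasa, candidate):
    """Mean and standard deviation of (nasa - candidate) over shared dates."""
    shared = sorted(set(nasa) & set(candidate))
    diffs = [nasa[key] - candidate[key] for key in shared]
    return mean(diffs), pstdev(diffs)


def best_match(nasa, candidates):
    """Name of the candidate whose difference from `nasa` is closest to constant."""
    return min(
        candidates,
        key=lambda name: offset_and_spread(nasa, candidates[name])[1],
    )


# Invented numbers for illustration only; these are not station data.
nasa = {1931: 5.0, 1932: 4.2, 1933: 5.6, 1934: 6.1}
candidates = {
    "candidate_a": {1931: 4.2, 1932: 3.4, 1933: 4.8, 1934: 5.3},  # constant 0.8 offset
    "candidate_b": {1931: 4.6, 1932: 3.1, 1933: 5.5, 1934: 5.2},  # irregular differences
}
print(best_match(nasa, candidates))
```

Using the spread of the differences rather than their size is deliberate: a source file plus a step adjustment will sit a constant distance from the NASA version, so a large but flat offset is still a match.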
The current USHCN data is located in a file entitled hcn_doe_mean and there is a reference to this file in the source code placed online on Sept 7, 2007. This is a different file than the hcn_shap_ac_mean file that was used prior to Sept 7. Perhaps the change from hcn_shap_ac_mean to hcn_doe_mean is the sort of “simplification” that Hansen had in mind when he said, on the occasion of the code being placed online, that they:
would have preferred to have a week or two to combine these into a simpler more transparent structure, but because of a recent flood of demands for the programs, they are being made available as is. People interested in science may want to wait a week or two for a simplified version.
However, this sort of change should not be introduced in the guise of “simplification”. It’s a substantive change in procedure. Maybe it’s an improvement; maybe it’s not. If Hansen is making changes to “improve” his methodology, users are entitled to know of the changes when they’re introduced, not after the fact through reverse engineering.
I have no information on why Hansen is picking this particular time to make unannounced “improvements” to his methodology. However, it seems like a poor time to be doing so, as many people will undoubtedly question the motives of doing so at this particular time – and particularly without any announcement. Of course, it could be an unintentional “accident”, just as the “Y2K” switch in versions was an “accident” – in which case, the timing of a second accident seems particularly inopportune.