I spent quite a bit of time today ploughing through USHCN station data, doing a lot of grunt work. For example, I spent time making a concordance of the 1221 USHCN identification numbers with GHCN id numbers (they aren’t the same), since I haven’t seen a concordance anywhere. Most of it could be done by matching lat-long neighborhoods and common name starts. I got all but 3 matched through 3 fairly ad hoc rules. It was actually easier than the smaller Jones China concordance because the latitudes and longitudes were all correct, or correct to within a very close rounding error. One of the reasons for doing this is that Hansen et al have archived their lights value by GHCN id #, but not by USHCN id #. I’ve collated a data file for all USHCN stations which also looks up the GHCN metadata: lights, vegetation, topography. They have a lot of codes and metadata, but they don’t appear to have a code for parking lot or for whether an electric light is in the box with the thermometer. The data file is here.
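A matching rule of this kind can be sketched in Python as follows. This is only an illustration, not the actual script: the `Station` fields, the 0.05 degree tolerance and the 4-character name prefix are all assumed values, and the 3 ad hoc rules mentioned above are not reproduced.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Station:
    # Hypothetical record layout; the real USHCN/GHCN inventories have more fields.
    station_id: str
    name: str
    lat: float
    lon: float


def candidate_matches(ushcn, ghcn_stations, tol_deg=0.05, name_chars=4):
    """GHCN stations inside the lat-long box whose names start the same way."""
    prefix = ushcn.name.upper()[:name_chars]
    return [
        g for g in ghcn_stations
        if abs(g.lat - ushcn.lat) <= tol_deg
        and abs(g.lon - ushcn.lon) <= tol_deg
        and g.name.upper().startswith(prefix)
    ]


def build_concordance(ushcn_stations, ghcn_stations, tol_deg=0.05, name_chars=4):
    """Map USHCN id -> GHCN id where exactly one candidate exists.

    Stations with zero or several candidates are returned separately
    so they can be resolved by hand.
    """
    matched, unmatched = {}, []
    for u in ushcn_stations:
        found = candidate_matches(u, ghcn_stations, tol_deg, name_chars)
        if len(found) == 1:
            matched[u.station_id] = found[0].station_id
        else:
            unmatched.append(u.station_id)
    return matched, unmatched
```

Accepting only unique candidates keeps the automatic pass conservative; anything ambiguous falls through to the unmatched list for manual rules.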
Hansen says that, in his temperature calculation, the long-term trends are set by “rural” stations, which Hansen et al define as having lights=0 according to their brightness index. Having collated the lights information with USHCN identifications, I then extracted the California stations with lights=0. Details (excerpted from the larger file above) are here. There were 17 such stations, a couple of which we’ve discussed over the past few days, courtesy of Anthony Watts: BRAWLEY 2SW, CEDARVILLE, CUYAMACA, DEATH VALLEY, ELECTRA PH, FAIRMONT, FORT BRAGG 5N, HAPPY CAMP RS, INDEPENDENCE, LAKE SPAULDING, LEMON COVE, NEEDLES FAA AP, ORLEANS, SUSANVILLE AP, TEJON RANCHO, WILLOWS 6W, YOSEMITE PARK HEADQUARTERS.
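The extraction step amounts to a simple filter on the collated file. A minimal Python sketch, assuming each station has been read into a dict with hypothetical keys `name`, `state` and `lights` (the real column names in the data file may differ):

```python
def rural_station_names(records, state="CA"):
    """Names of stations in one state whose lights value is exactly 0."""
    return sorted(
        r["name"] for r in records
        if r["state"] == state and r["lights"] == 0
    )
```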
Lake Spaulding is one of the 17. It’s the one with the weather station attached to the boat in the parking lot. It turns out that until recently there was a much better location for the station, but it was recently changed to a site that is worthless. One wonders why IPCC has never stated that sites like the old Lake Spaulding are valuable for long-term weather record keeping, and one wonders at the total mismanagement of a system in which Jones and Hansen have failed to oppose such changes.
I’ve collated 7 different versions for each site: three USHCN versions (areal, time-of-obs adjusted, and filnet, which is mainly Karl’s Station History Adjustment), two GISS versions (raw and adjusted) and two GHCN versions (raw and adjusted). By and large the threefold adjusted USHCN version is approximately the GISS raw version. In each case, I plotted out the impact of the various adjustments from USHCN raw to GISS “raw” to GISS adjusted.
A few patterns seem to emerge. First, in the majority of these “best” sites, there is a warm 1930s. The net impact of the several adjustment stages is to reduce the 1930s relative to the end of the 20th century. Probably the largest contributor to this was the USHCN Station History Adjustment, which is done according to a procedure of Karl and Williams 1987. (Karl, T. R., and C. N. Williams, Jr. 1987. An approach to adjusting climatological time series for discontinuous inhomogeneities. Journal of Climate and Applied Meteorology 26:1744-1763. http://ams.allenpress.com/archive/1520-0450/26/12/pdf/i1520-0450-26-12-1744.pdf) Whatever the “good” reason for this adjustment, the main practical effect seems to be the lowering of past high temperatures. It also seemed that GISS was very likely to exclude past high temperatures from its collation, but was like a dog on a bone for past low temperatures. I’ll show one station to give a flavor of what I was doing.
Cedarville
The first plot is a simple spaghetti graph of the 7 different annualized versions, which shows a warm 1930s in this case.

Figure 1. Spaghetti graph of 7 versions of station history (1961-1990 anomaly basis)
The next plot shows 3 stages from USHCN raw to GISS adjusted: from USHCN raw to USHCN filnet (adjusted), from USHCN filnet to GISS raw (these tend to be very close), and from GISS raw to GISS adjusted. In this case, the USHCN adjustments collectively increased the trend by about 1 deg C and the GISS adjustments by another 1 deg C.
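The arithmetic behind a stage comparison like this can be sketched in Python. This is an assumed reconstruction, not the code behind the figures: each version is taken to be a dict of year to annual mean temperature, and the function names are hypothetical. An anomaly is the value minus the 1961-1990 mean, a stage is the year-by-year difference between two versions, and the trend is an ordinary least-squares slope.

```python
def anomalies(series, base=(1961, 1990)):
    """Subtract the mean over the base period (inclusive) from every year."""
    base_vals = [v for y, v in series.items() if base[0] <= y <= base[1]]
    if not base_vals:
        raise ValueError("no data in base period")
    mean = sum(base_vals) / len(base_vals)
    return {y: v - mean for y, v in series.items()}


def stage_difference(before, after):
    """Adjustment applied by one stage: after minus before, on shared years."""
    return {y: after[y] - before[y] for y in sorted(before.keys() & after.keys())}


def linear_trend(series):
    """Ordinary least-squares slope, in deg C per year."""
    years = sorted(series)
    n = len(years)
    if n < 2:
        raise ValueError("need at least two years")
    ybar = sum(years) / n
    vbar = sum(series[y] for y in years) / n
    sxx = sum((y - ybar) ** 2 for y in years)
    sxy = sum((y - ybar) * (series[y] - vbar) for y in years)
    return sxy / sxx
```

Multiplying the slope of a stage difference by the length of the record gives the change in trend that the stage contributes over the whole period, which is the sense in which an adjustment can add "about 1 deg C".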

The next figure disentangles the USHCN adjustments by type. The time-of-observation adjustment makes little difference, but the Station History Adjustment makes a large difference.

USHCN station history information is described here, while the station histories are at http://cdiac.ornl.gov/ftp/ushcn_monthly/station_history . This file is extremely hard to read as it uses lots of indicators in Fortran punch card format.
begin_date  dist_prev_loc  conf_dist_move  dir_move  elev_chg  hgt_chg  CRS  DT  MN  MX  MMTS
05 01 1894  NA             0               999       NA        NA       1    0   1   1   0
02 01 1914  0              0               0         0         0        1    0   1   1   0
10 01 1937  0              0               0         0         0        1    0   1   1   0
04 28 1938  0              0               0         0         0        1    0   1   1   0
03 01 1949  0              0               0         0         0        1    0   1   1   0
04 01 1950  0              0               NNE       0         -2       1    0   1   1   0
04 20 1957  0.1            0               S         -3        2        1    0   1   1   0
02 12 1959  0.2            0               N         0         0        1    0   1   1   0
03 20 1960  0              0               NW        -2        0        1    0   1   1   0
07 24 1969  0.2            0               NNW       0         0        1    0   1   1   0
08 16 1974  0              0               0         0         0        1    0   1   1   0
09 12 1975  0              0               E         0         0        1    0   1   1   0
11 02 1977  0              0               W         0         0        1    0   1   1   0
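Rows in the layout shown above are easy enough to read by machine. A minimal Python sketch, assuming each row is 13 whitespace-separated tokens (month, day and year of begin_date, then the ten remaining columns in header order). It parses only this extracted layout, not the original fixed-width station history file, and it leaves dir_move as raw text because the meaning of codes such as 0 and 999 is not interpreted here.

```python
from datetime import date

FLAG_COLUMNS = ("CRS", "DT", "MN", "MX", "MMTS")


def _number_or_none(token):
    """Convert a numeric token to float; 'NA' becomes None."""
    return None if token == "NA" else float(token)


def parse_history_row(line):
    """Parse one station-history row of the extracted table into a dict."""
    tokens = line.split()
    if len(tokens) != 13:
        raise ValueError(f"expected 13 tokens, got {len(tokens)}")
    month, day, year = (int(t) for t in tokens[:3])
    row = {
        "begin_date": date(year, month, day),
        "dist_prev_loc": _number_or_none(tokens[3]),
        "conf_dist_move": int(tokens[4]),
        "dir_move": tokens[5],  # kept as raw text, e.g. "NNE", "0", "999"
        "elev_chg": _number_or_none(tokens[6]),
        "hgt_chg": _number_or_none(tokens[7]),
    }
    # The last five tokens are the 0/1 indicator columns, in header order.
    for name, token in zip(FLAG_COLUMNS, tokens[8:]):
        row[name] = int(token)
    return row
```

For example, `parse_history_row("04 20 1957 0.1 0 S -3 2 1 0 1 1 0")` gives a begin_date of 20 April 1957 with dir_move "S".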