Hansen “Step 1″ which was the source of the August crossword puzzles does not occur in a crossword puzzle form in the USHCN stations that we’ve been discussing this week. The “Hansen bias” in combining scribal versions, noticed by John Goetz, does not occur in USHCN stations for two separate reasons: (1) in the USHCN stations that I’ve examined, there is only one dset=0 version and the dset=1 version is equal to the dset=0 version and the “Hansen bias” issue noticed in the Russian stations doesn’t come into play; (2) to be material, the “Hansen bias” requires a very short overlap between the historical data and the MCDW data. For the Russian stations, this is typically only 4 years so that the December 1986 anomaly problem comes into play.
Jean S, as so often, was the first to figure out how the Hansen bias is implemented in multiple versions. He sent me his Matlab code about 2 weeks ago. Since then Hansen has archived his Step 1 code. I’ve done it my own way in R and am compatible with Jean S’ results up to an annoying rounding difference, that should be resolvable.
Here are notes on the process.
Hansen’s description is as follows:
Two records are combined as shown in Figure 2, if they have a period of overlap. The mean difference or bias between the two records during their period of overlap (dT) is used to adjust one record before the two are averaged, leading to identification of this way for combining records as the “bias” method (HL87) or, alternatively, as the “reference station” method [Peterson et al., 1998b].
The adjustment is useful even with records for nominally the same location, as indicated by the latitude and longitude, because they may differ in the height or surroundings of the thermometer, in their method of calculating daily mean temperature, or in other ways that influence monthly mean temperature. Although the two records to be combined are shown as being distinct in Figure 2, in the majority of cases the overlapping portions of the two records are identical, representing the same measurements that have made their way into more than one data set.
A third record for the same location, if it exists, is then combined with the mean of the first two records in the same way, with all records present for a given year contributing equally to the mean temperature for that year (HL87). This process is continued until all stations with overlap at a given location are employed. If there are additional stations without overlap, these are also combined, without adjustment, provided that the gap between records is no more than 10 years and the mean temperatures for the nearest five year periods of the two records differ by less than one standard deviation. Stations with larger gaps are treated as separate records.
My own code uses the following functions:
You can download it with a source instruction. Note that the function read.giss is handy for downloading GISS data in a usable form.
This function calculates the “Hansen bias” between two time series by comparing the annual averages. It is calculated by “Hansen summing” in an overlap period. Two formats are allowed for: the default method=”hansen” uses the table layout used in GISTEMP: year, 12 months, 4 quarters, annual; method = “ts” will compare two annual time series. This function has been tested against our crossword puzzle examples.
You can test this function as follows (watch for quotation marks out of WordPress, which you may have to replace):
This function will calculate quarters and annual averages Hansen-style from monthly data with missing values. Monthly averages are calculated for all months and also quarterly averages (“normal”); note that this is over available dates not a reference period which will cause results to migrate over time. Quarters are DJF, MAM, JJA, SON which causes problems when a scribal version starts in January as we’ve oberved.
If there are 2 or 3 months in a quarter, the quarterly anomaly is calculated as the average anomaly in available months plus the quarterly normal. If there are 3 or 4 quarters, the annual aanomaly is calculated the same way as the quarterly anomaly. The quarterly and annual values are then calculated by re-centering based on the quarterly and monthly averages calculated previously.
Here is a test of this function:
In this case, all values are within 0.05, but don’t match exactly. There’s a loose end in the rounding somewhere, that I need to attend to. I think that Jean S’ rounding is exact.
This is a little function that calculates the most recent value in a time series; used in ordering
This is a little function that calculates the length of overlap between two ANNUAL time series.
This is the function that emulates Hansen’s calculation of dset=1 from dset=0. Its only necessary value is id0, which here is the he 11-digit GISS id for a station: I find it best to use characters. I’ve included a check and patch if you input numeric values, but haven’t tested the patch. The GISS id is 11 digits as follows:
CCCWWWWWIII – 3 -digit country; 5-digit WMO; 3-digit local
There is an option for automatic download of current values from GISS or you can set download0=FALSE and pick up values that you already have for station0 and station1 for testing purposes – as long as these are in Hansen 18-column format.
A variety of objects returned for diagnostic purposes
$station0 the input dset0 as monthly ts
$annual0 -input dset0 as annual
$station1 – input dset0 annual as ts
$annual1 – input dset1 annual as ts
$adjusted -Hansen adjusted monthly
$adjusted_annual – Hansen adjusted annual
$reference – closing Hansen quarterly as ts
$delta – adjustments for each version
Here’s a quick example for Gassim used for the hansen_bias example above.
# 0.15 0.00
Here’s how to do another example.
# -0.34 -0.16 0.00
If you want to access the related objects e.g. if you want to plot the dset=1 annual, you can do so quickly.
The first series selected is typically the MCDW series (See discussion http://www.climateaudit.org/?p=2034) using Hansen’s “get_best” routine. As others have observed, this is inconsistent with the published methodology and had to be decoded.